Last Thursday I finally earned my Ph.D. in Physics. My thesis, written under my advisor Nestor Caticha, is titled "Learning on Hidden Markov Models"; in it we studied the performance of learning algorithms for HMMs, a kind of machine learning model that is a special case of a wider class called graphical models. Machine learning is an alternative name for, or can be considered a particular area of, artificial intelligence. The difference between the two terms is as diffuse as you want it to be, but technically the former is preferred.

It may seem strange that a physics thesis is about machine learning, but statistical physics is an area with many interdisciplinary applications. It studies systems composed of a large number of individual interacting units. It has already given many important results when applied to the study of perceptrons, simplified models of artificial neural networks, and our hope in this work was that it could give interesting results for HMMs too. When I have prepared a suitable English version of my thesis and our papers have been submitted, I will post links to them here.
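To make the object of study concrete, an HMM consists of a hidden Markov chain of states, each of which emits an observation with some probability. The sketch below shows the textbook forward algorithm for computing the likelihood of an observation sequence (not the learning algorithms analysed in the thesis); all probabilities here are illustrative made-up numbers.

```python
# A minimal sketch of a hidden Markov model and its forward algorithm.
# Hidden states follow a Markov chain (transition matrix A); each state
# emits an observation according to B. Matrices below are illustrative only.

def forward(obs, pi, A, B):
    """Likelihood of the observation sequence `obs` under the HMM (pi, A, B)."""
    n_states = len(pi)
    # alpha[s] = probability of the observations so far, ending in state s
    alpha = [pi[s] * B[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        alpha = [sum(alpha[r] * A[r][s] for r in range(n_states)) * B[s][o]
                 for s in range(n_states)]
    return sum(alpha)

pi = [0.6, 0.4]                      # initial state distribution
A = [[0.7, 0.3], [0.4, 0.6]]         # state transition probabilities
B = [[0.9, 0.1], [0.2, 0.8]]         # emission probabilities per state
print(forward([0, 1, 0], pi, A, B))  # probability of observing 0, 1, 0
```

Learning an HMM means adjusting pi, A and B from observed sequences alone; how well that works is the kind of question the thesis asks.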

But coming back to the main point, what exactly does physics have to do with machine learning? Well, one of the first insights of machine learning appeared when McCulloch and Pitts introduced a simplified mathematical model of a neuron. The model was inspired by the real neuron in the brain: it was composed of "synapses", through which the neuron received inputs in the form of numerical values, and a "body", mathematically a function that processed the inputs and turned them into an output numerical value that was transmitted to other units through an output synapse. The model appeared in the paper: McCulloch, W. and Pitts, W. (1943). "A logical calculus of the ideas immanent in nervous activity". Bulletin of Mathematical Biophysics, 5:115-133.
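The whole unit fits in a few lines of code. Here is a minimal sketch, where the weights and threshold are my own illustrative choices, not values from the paper:

```python
# A McCulloch-Pitts unit: inputs arriving through the "synapses" are
# weighted, summed in the "body", and compared with a threshold to produce
# the binary output sent down the output synapse.

def mp_neuron(inputs, weights, threshold):
    """Return 1 if the weighted sum of inputs reaches the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# With weights (1, 1) and threshold 2 the unit computes a logical AND:
print(mp_neuron((1, 1), (1, 1), 2))  # -> 1, fires when both inputs are active
print(mp_neuron((1, 0), (1, 1), 2))  # -> 0
```

Despite its simplicity, wiring many such units together already gives surprising computational power, as the next paragraph describes.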

Linking a great number of these units together through their synapses, one can construct networks as complicated as we want. It can be shown that these networks can store memories and infer answers to questions. These artificial neural networks (ANNs) can have complex or simplified architectures, the simplest one being the so-called (Rosenblatt) perceptron. Well, our brain is a natural neural network with about 10^{11} neurons. It's a huge number! So huge that for practical purposes we can treat it as infinite. This is where physics enters; to be more specific, statistical physics.

Statistical physics studies systems with a great number of interacting individual units and tries to predict the typical behavior of the system as a whole. It makes the connection between Newtonian mechanics and thermodynamics. In Newtonian mechanics a system is described by the position and velocity of each particle, but in thermodynamics it is described by "bulk" macroscopic properties such as temperature, pressure and volume. Thermodynamics is recovered from mechanics when we analyse the equations for a large number of particles and take averages; mathematically, this limit, widely known as the thermodynamic limit, is attained only when the size of the system (the number of individual components) goes to infinity. This approach helped us understand interesting properties of matter, such as the famous phase transitions (like water boiling or ice melting).
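A tiny numerical experiment shows why the thermodynamic limit makes bulk properties sharp: the average of a quantity over N independent units fluctuates less and less as N grows. This is a pure illustration of the averaging idea, not a computation from the thesis.

```python
# Fluctuations of a per-unit average shrink as the number of units grows,
# roughly as 1/sqrt(N) -- this is why macroscopic properties look sharp.
import random

def fluctuation(n_units, n_samples=2000, seed=0):
    """Std. deviation of the per-unit average of n_units random 'energies'."""
    rng = random.Random(seed)
    means = [sum(rng.random() for _ in range(n_units)) / n_units
             for _ in range(n_samples)]
    mu = sum(means) / n_samples
    return (sum((m - mu) ** 2 for m in means) / n_samples) ** 0.5

for n in (10, 100, 1000):
    print(n, fluctuation(n))  # the spread decreases as n increases
```

In the limit of infinitely many units the fluctuations vanish entirely, and only the typical (average) behavior survives.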

It turns out that there is a way to map neural networks onto already studied physical systems, such that when you apply the methods of statistical physics to them and take the thermodynamic limit, you can calculate their properties! This was done for perceptrons and worked pretty well! Later, other methods of statistical physics were used with the same success; one of the most celebrated, and most controversial for mathematicians, is a mathematical trick named the Replica Trick. But I've already written too much and will leave those matters for another post.
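For the curious, here is the standard identity behind the trick: instead of averaging the logarithm of the partition function $Z$ over the disorder (which is hard), one averages integer powers of $Z$ and then continues analytically to $n \to 0$:

```latex
\langle \ln Z \rangle \;=\; \lim_{n \to 0} \frac{\langle Z^n \rangle - 1}{n}
```

The identity itself is elementary (since $Z^n = e^{n \ln Z} \approx 1 + n \ln Z$ for small $n$); what makes mathematicians uneasy is the analytic continuation from integer $n$ to the limit $n \to 0$.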


**Picture taken from:** http://www.nada.kth.se/~asa/Game/BigIdeas/ai.html
What is your next move, now that you have earned your PhD? I am curious because I would also like to pursue studies in theoretical physics. Another question I have in mind is whether the study of physics is for elites and geniuses only. I am good at math and physics, but I don't consider myself either of the above. Your answer will be greatly appreciated. Jason.

I plan to continue research in both areas I like: Artificial Intelligence and Fundamental Physics. I've applied for a research position in the UK and I'm waiting for the answer.

Physics is not only for geniuses. Geniuses appear in every area or discipline, but a big part of the work is done by those who like what they do and work hard.

If you really want to be a theoretical physicist, go on and do not worry about prejudices. If you need some help, do not hesitate to send me an email personally. I'll be happy to help.