Physics in Machine Learning, Deep Learning, and AI.
Let’s see if the following scenario sounds familiar or relatable:
You are a current student who has realized that you need to learn about Machine Learning (ML), Deep Learning (DL), and AI in general. Maybe you are a mid-career professional looking to pivot who has realized the same thing. Or maybe you have to learn something about these things because you got “the memo” that you will be working with a newly hired AI expert!
Chances are that, irrespective of the exact situation, we all end up following a similar plan of action: look at free resources on the internet, YouTube videos, lecture notes from famous courses, maybe a bootcamp or two, maybe a paid course if you can afford one. After all that, you feel reasonably secure that you know when to apply which method, and why different subject areas like computer vision or healthcare require methods suited to those areas. You may even attempt to read the original research papers on the methods you intend to use. That’s when you start realizing that the entire structure is heavily steeped in math!
You had a feeling about it, but you pushed it aside to be dealt with later. Now you start to notice how many people have been talking about getting a foundation in math in order to use ML/DL/AI more effectively and interpret the results properly. So you dust off the old books on calculus, linear algebra, and probability and statistics, and repeat the whole cycle mentioned above, but this time for math. How much math you need depends on your individual case. Unless you are a researcher or an engineer tweaking these systems at the fundamental level, you only need an intuitive grasp of the concepts and of how they become the foundation of ever more complicated architectures.
Finally, you feel you are set and have a reasonable foundation to build upon and go deeper. As soon as you do that, you encounter technical terms, jargon, and concepts that were never explained in any material you studied or any math you learned! You come across terms like ensemble, temperature, entropy, partitions, free energy, attractors, stability, equilibrium, non-equilibrium, diffusion, manifolds, groups, symmetries, eigenspaces, dynamics, variational, renormalization, etc. You may even hear someone mention the Ising model, quantization, or pretty much anything with “quantum” as a prefix.
Now you despair that you also need to know physics!
It does feel like a bottomless pit, but there is a good reason to know a bit of physics. The terms I mentioned above, and many more, sit at the foundation of this whole field, starting with neural networks. Most of them are concepts borrowed from physics and applied to data science, ML, DL, and AI. Starting with the “neurons” in neural networks, a whole bunch of methods, architectures, and protocols are new applications of statistical physics. Occasionally, well-established methods from astronomy and particle physics have also been modified and adapted.
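To make a few of these borrowed terms concrete, here is a minimal sketch in Python. The energy values and the function names are made up for illustration and are not taken from any particular library. It computes the Boltzmann distribution from statistical physics, where a state with energy E gets probability proportional to exp(-E / T). The softmax with a temperature used in ML has the same mathematical form, with the scores (logits) playing the role of negative energies.

```python
import math

def boltzmann_probabilities(energies, temperature):
    """Probability of each state: p_i = exp(-E_i / T) / Z."""
    # Shift by the lowest energy for numerical stability; the shift
    # cancels in the ratio, so the probabilities are unchanged.
    lowest = min(energies)
    weights = [math.exp(-(e - lowest) / temperature) for e in energies]
    # Sum of the shifted weights: the partition function Z up to a
    # constant factor that cancels when we normalize.
    partition_function = sum(weights)
    return [w / partition_function for w in weights]

def entropy(probabilities):
    """Shannon/Gibbs entropy, -sum(p * ln p), in natural units."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)

# Hypothetical energies for three states (illustrative values only).
energies = [0.0, 1.0, 2.0]

cold = boltzmann_probabilities(energies, temperature=0.1)
hot = boltzmann_probabilities(energies, temperature=100.0)

print("cold:", cold, "entropy:", entropy(cold))
print("hot: ", hot, "entropy:", entropy(hot))
```

At low temperature almost all of the probability sits on the lowest-energy state and the entropy is close to zero. At high temperature the three states become nearly equally likely and the entropy approaches its maximum, ln 3. The same knob is what makes a model's output more or less "random" when its sampling temperature is changed.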
It is not so surprising, then, that the 2024 Nobel Prize in Physics was awarded to John Hopfield and Geoffrey Hinton for foundational work on machine learning with artificial neural networks. It may also be clearer now why notable figures like Elon Musk and Jensen Huang have been advocating the study of physics.
I have been thinking for a while now about the need to explain the physics that operates behind the scenes. In subsequent posts, I aim to explain each and every physics concept that is used in AI. I am not attempting to convert anyone to studying physics endlessly or becoming an expert at it. My goal for this blog is to give you an intuitive understanding and sufficient insight so that if someone tries to “blind you with physics”, you can hold your own!
Let me briefly explain why I think I can do this. First, I love teaching and am passionate about it. The majority of my teaching has been in math and physics. I am trained as a soft matter physicist, but at various points in my life I have worked in astrophysics, nonlinear dynamics and chaos, pattern formation, Bose-Einstein condensation, the physics of granular systems, complex fluids, and the locomotion of microorganisms. That basically covers the entire range of length scales, from the cosmic to the quantum and everything in between.
I transitioned to a teaching career thirteen years ago. I am a teaching-track math professor at NYU's Courant Institute. I have taught math to an incredibly diverse group of undergraduates, and abstract math concepts to every possible major on campus. I enjoy finding ways to explain ideas to and engage students who are usually apprehensive, and sometimes terrified, of learning math. I plan on bringing the same passion here and, to quote Einstein, “Make everything as simple as possible, but not simpler.”
I genuinely hope you enjoy my offering and maybe even get some practical benefit out of it. I know I am going to thoroughly enjoy this new creative outlet.