You guys know Henry from MinutePhysics, right? Well, he and I just made a video on a certain quantum mechanical topic – “Bell’s inequalities”. It’s a really mind-warping topic that not enough people know about, and even though it’s a quantum thing, it’s based on some surprisingly simple math, and you should definitely check it out.
For this video, we have in mind those viewers who actually want to learn some quantum mechanics more deeply. And obviously, it’s a huge topic, nowhere near the scope of a single video. But the question we asked was what topic could we present that’s not meant to be some eye-catching piece of quantum weirdness, but which actually lays down some useful foundations for anyone who, you know, wants to learn this field. What topic would set the right intuitions for someone before they dove into, say, the Feynman lectures. Well, a natural place to start, where quantum mechanics itself started, is light.
Specifically, if you want to learn quantum, you have to have an understanding of waves, and how they’re described mathematically. And what we’d like to build to here is the relationship between the energy in a purely classical wave and the probabilities that govern quantum behavior. In fact, we’ll actually spend most of the time talking through the pre-quantum understanding of light, since that sets up a lot of the relevant wave mechanics.
The thing is, a lot of ideas from quantum mechanics, like describing states as superpositions with various amplitudes and phases, come up in the context of classical waves in a way that doesn’t involve any of the quantum weirdness people might be familiar with. This also helps to appreciate what’s actually different in quantum mechanics, namely, certain restrictions on how much energy these waves can have, how they behave when measured, and quantum entanglement, though we won’t cover entanglement in this video. So, we’ll start with the late 1800s understanding of light as waves in the electromagnetic field. Here, let’s break that down a bit. The electric field is a vector field, and that means every point in space has some arrow attached to it, indicating the direction and strength of the field. Now, the physical meaning of those arrows is that if you have some charged particle in space, there’s going to be a force on that particle in the direction of the arrow, and it’s proportional to the length of the arrow and the specific charge of the particle. Likewise, the magnetic field is another vector field, where now the physical meaning of each arrow is that when a charged particle is moving through that space, there’s going to be a force perpendicular to both its direction of motion and to the direction of the magnetic field, and the strength of that force is proportional to the charge of the particle, its velocity, and the length of the magnetic field arrow. For example, a wire with a current of moving charges next to a magnet is either pushed or pulled by that magnetic field.
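To make those two force rules concrete, here is a minimal Python sketch. It assumes the standard Lorentz force law, F = q(E + v × B), which combines the electric and magnetic descriptions above into one formula. The function names and the sample field values are made up for illustration and are not from the video.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples (x, y, z)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def lorentz_force(q, E, v, B):
    """Force on a charge q moving with velocity v through fields E and B.

    Assumes the standard Lorentz force law F = q * (E + v x B), SI units.
    """
    v_cross_b = cross(v, B)
    return tuple(q * (E[i] + v_cross_b[i]) for i in range(3))

# Illustrative values only.
q = 2.0                  # charge
E = (0.0, 0.0, 0.0)      # no electric field, to isolate the magnetic part
v = (3.0, 0.0, 0.0)      # particle moving along x
B = (0.0, 0.5, 0.0)      # magnetic field along y

F = lorentz_force(q, E, v, B)
print(F)  # the force points along z, perpendicular to both v and B
```

Doubling the charge, the speed, or the length of the magnetic field arrow doubles the force, which matches the proportionalities described above.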
A kind of culmination of the 19th century physics understanding of how these two fields work is Maxwell’s equations, which among other things describe how each of these fields can cause a change to the other. Specifically, what Maxwell’s equations tell us is that when the electric field arrows seem to be forming a loop around some region, the magnetic field will be changing inside that region, perpendicular to the plane of the loop. And symmetrically, such a loop in the magnetic field corresponds to a change in the electric field within it, perpendicular to the plane of the loop. Now the specifics of how exactly these equations work are really beautiful, and worth a full video on their own. But all you need to know for now is that one natural consequence of this mutual interplay, where changes to one field cause changes to the other in its neighboring regions, is that you get these propagating waves, where the electric and magnetic fields are oscillating perpendicular to each other, and perpendicular to the direction of propagation. When you hear the term “electromagnetic radiation”, which refers to things like radio waves and visible light, this is what it’s talking about: propagating waves in both the electric and magnetic fields. Of course, it’s now almost mainstream to know of light as electromagnetic radiation, but it’s neat to think about just how surprising this was in Maxwell’s time, that these fields that have to do with forces on charged particles and magnets not only have something to do with light, but that what light is is a propagating wave as these two fields dance with each other, causing this mutual oscillation of increasing and decreasing field strength. With this as a visual, let’s take a moment to lay down the math used to describe waves.
It’ll still be purely classical, but ideas that are core to quantum mechanics, like superposition, amplitudes, phases, all of these come up in this context, and I would argue with a clearer motivation for what they actually mean. Take this wave and think of it as directed straight out of the screen towards your face.
And let’s go ahead and ignore the magnetic field right now, just looking at how the electric field oscillates. And also, we’re only going to focus on one of these vectors oscillating in the plane of the screen, which we’ll think of as the xy plane. If it oscillates horizontally like this, we say that the light is “horizontally polarized”, so the y component of this electric field is zero at all times. And we might write the x component as something like cos(2πft), where f represents some frequency, and t is time. So, if f was 1, for example, that means it takes exactly 1 second for this cosine function to go through a full cycle. For a lower frequency, that would mean it takes more time for the cosine to go through its full cycle. As the value t increases, the inside of this cosine function increases more slowly. Also, we’re going to include another term in here – φ, called the “phase shift”, which tells us where this vector is in its cycle at time t=0. You’ll see why that matters in just a moment.
Now by default, cosine only oscillates between -1 and 1, so let’s put another term in front, A, that gives us the amplitude of this wave. One more thing, just to make things look a little more like they often do in quantum mechanics, instead of writing it as a column vector like this, I’m going to separate it out into two different components using these symbols called “kets”. This ket here indicates a unit vector in the horizontal direction, and this ket over here represents a unit vector in the vertical direction. If the light is vertically polarized, meaning the electric field is wiggling purely in the up-and-down direction, its equation might look like this, where the horizontal component is now zero, and the vertical component is a cosine with some frequency, amplitude, and a phase shift. Now if you have two distinct waves, two ways of wiggling through space over time that solve Maxwell’s equations, then adding both of these together gives another valid wave. That is, at each point in time, add these two vectors tip-to-tail to get a new vector. Doing this at all points in space and all points in time gives a new valid solution to Maxwell’s equations, at least in a vacuum. This is because Maxwell’s equations in a vacuum are what’s called “linear equations”. They’re essentially a combination of derivatives acting on the electric and magnetic fields to give zero. So if one field F_1 satisfies this equation and another field F_2 satisfies it, then their sum F_1 plus F_2 also satisfies it, since derivatives are linear. So the sum of two or more solutions to Maxwell’s equations is also a solution to Maxwell’s equations. This new wave is called a “superposition” of the first two. And here, superposition essentially just means sum, or, in some contexts, weighted sum, since if you include some kind of amplitude and phase shift in each of these components, it can still be called a superposition of the two original vectors.
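As a rough illustration of the pieces so far, the sketch below writes each component as A·cos(2πft + φ) and adds a horizontal and a vertical wave tip-to-tail. The function names and default values are hypothetical, chosen only for this example. It looks at a single point in space and ignores the magnetic field, just as the discussion above does.

```python
import math

def component(amplitude, freq, phase, t):
    """One oscillating component: A * cos(2*pi*f*t + phi)."""
    return amplitude * math.cos(2 * math.pi * freq * t + phase)

def horizontal_wave(t, amplitude=1.0, freq=1.0, phase=0.0):
    """Electric field (x, y) of horizontally polarized light at time t."""
    return (component(amplitude, freq, phase, t), 0.0)

def vertical_wave(t, amplitude=1.0, freq=1.0, phase=0.0):
    """Electric field (x, y) of vertically polarized light at time t."""
    return (0.0, component(amplitude, freq, phase, t))

def superpose(e1, e2):
    """Add two field vectors tip-to-tail."""
    return (e1[0] + e2[0], e1[1] + e2[1])

# Sample one full cycle (f = 1, so one second) at eight instants.
times = [k / 8 for k in range(8)]
samples = [superpose(horizontal_wave(t), vertical_wave(t)) for t in times]

for t, (x, y) in zip(times, samples):
    print(f"t={t:.3f}  x={x:+.3f}  y={y:+.3f}")
```

With equal amplitudes and equal phases, the x and y values agree at every instant, so the summed vector only ever moves back and forth along the diagonal.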
Now, right now the resulting superposition is a wave wiggling in the diagonal direction. But if the horizontal and vertical components were out of phase with each other, which might happen if you increase the phase shift in one of them, their sum might instead trace out some sort of ellipse. And in the case where the phases are exactly 90° out of sync with each other, and the amplitudes are both equal, this is what we call “circularly polarized light”. This, by the way, is why it’s important to keep track not just of the amplitude in each direction, but also of the phase – it affects the way that two waves add together. That’s also an important idea that carries over to quantum, and underlies some of the things that look confusing at first. And here’s another important idea. We’re describing waves by adding together the horizontal and vertical components, but we could also choose to describe everything with respect to different directions.
I mean, you could describe waves as some superposition of the diagonal and the anti-diagonal directions. In that case, vertically polarized light would actually be a superposition of these two diagonal wiggling directions, at least when both are in phase with each other, and they have the same magnitude. Now, the choice of which directions you write things in terms of is called a “basis”. And which basis is nicest to work with? Well, that typically depends on what you’re actually doing with the light. For example, if you have a polarizing filter, like the ones in a pair of polarized sunglasses, the way these work is by absorbing the energy from electromagnetic oscillations in some particular direction. A vertically oriented polarizer, for example, would absorb all of the energy from these waves along the horizontal direction, at least classically that’s how you might think about it. So, if you’re analyzing light and it’s passing through a filter like this, it’s nice to describe it with respect to the horizontal and vertical directions. That way, what you can say is that whatever light passes through the filter is just the vertical component of the original wave. But if you had a filter oriented, say, diagonally, well, then it would be convenient to describe things as a superposition of that diagonal direction and its perpendicular anti-diagonal direction. These ideas will carry over almost word-for-word to the quantum case. Quantum states, much like this wiggling direction of our wave, are described as a superposition of multiple base states, where you have many choices for what base states to use.
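The change of basis described above can be sketched in a few lines of Python. This is only an illustration: the names are hypothetical, and the sign chosen for the anti-diagonal direction is an assumption (the opposite sign works equally well and just flips the sign of that component).

```python
import math

S = 1 / math.sqrt(2)
DIAGONAL = (S, S)          # unit vector at 45 degrees
ANTI_DIAGONAL = (-S, S)    # unit vector at 135 degrees (assumed convention)

def dot(a, b):
    """Dot product of two 2-vectors."""
    return a[0] * b[0] + a[1] * b[1]

def to_diagonal_basis(vec):
    """Components of vec along the diagonal and anti-diagonal directions."""
    return (dot(vec, DIAGONAL), dot(vec, ANTI_DIAGONAL))

def from_diagonal_basis(d, a):
    """Rebuild the (x, y) vector from its diagonal-basis components."""
    return (d * DIAGONAL[0] + a * ANTI_DIAGONAL[0],
            d * DIAGONAL[1] + a * ANTI_DIAGONAL[1])

vertical = (0.0, 1.0)
d, a = to_diagonal_basis(vertical)
print(d, a)                       # equal components: same magnitude, in phase
print(from_diagonal_basis(d, a))  # back to (roughly) the vertical vector
```

Under this convention, a horizontal vector gives components of equal size but opposite sign, which is the same pair of diagonal directions wiggling half a cycle out of phase.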
And just like with classical waves, the components of such a superposition will have both an amplitude and a phase of some kind. And by the way, for those of you who do read more into quantum mechanics, you’ll find that these components are actually given using a single complex number rather than a cosine expression like this one. One way to think of this is that complex numbers are just a very convenient and natural mathematical way to encode an amplitude and a phase with a single value. That can make things a little confusing, because it’s hard to visualize a pair of complex numbers, which is what would describe a superposition of two base states. But you can think about the use of complex numbers throughout quantum mechanics as a result of its underlying wavy nature and its need to encapsulate the amplitude and the phase for each direction. Okay, just one quick point before getting into the quantum. Look at one of these waves, and focus just on the electric field portion like we were before. Classically, we think about the energy of a wave like this as being proportional to the square of its amplitude. And I want you to notice how well this lines up with the Pythagorean theorem. If you were to describe this wave as a superposition of a horizontal component with amplitude A_x and a vertical component with amplitude A_y, then its energy density is proportional to (A_x)^2 plus (A_y)^2. And you can think of this in two different ways: either it’s because you’re adding up the energies of each component in the superposition, or it’s just that you’re figuring out the new amplitude using the Pythagorean theorem, and taking the square. Isn’t that nice? In the classical understanding of light, you should be able to dial this energy up and down continuously however you want by changing the amplitude of the wave. But what physicists started to notice in the late 19th and early 20th centuries was that this energy actually seems to come in discrete amounts. 
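Here is one way to see how a single complex number can carry both an amplitude and a phase, and how the sum-of-squares energy rule falls out of it. It is a sketch under assumptions: the helper names are invented for this example, the encoding used is the common A·e^(iφ) convention, and "energy" here means energy up to an unspecified constant factor.

```python
import cmath
import math

def phasor(amplitude, phase):
    """Pack an amplitude and a phase into one complex number A * e^(i*phi)."""
    return cmath.rect(amplitude, phase)

def field_at(z, freq, t):
    """Recover the real oscillation A * cos(2*pi*f*t + phi) from the phasor z."""
    return (z * cmath.exp(2j * math.pi * freq * t)).real

def relative_energy(zx, zy):
    """Classical energy up to a constant factor: A_x^2 + A_y^2."""
    return abs(zx) ** 2 + abs(zy) ** 2

# Illustrative wave: amplitudes 0.6 and 0.8, vertical part shifted by 90 degrees.
zx = phasor(0.6, 0.0)
zy = phasor(0.8, math.pi / 2)

print(abs(zx), cmath.phase(zx))   # amplitude and phase of the horizontal part
print(abs(zy), cmath.phase(zy))   # amplitude and phase of the vertical part
print(relative_energy(zx, zy))    # 0.36 + 0.64, up to floating-point rounding
```

The absolute value of each complex number is the amplitude and its angle is the phase, so squaring the absolute values and adding them is the same Pythagorean sum as before.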
Specifically, the energy of one of these electromagnetic waves always seems to come as an integer multiple of a specific constant times the frequency of that wave. We now call this constant “Planck’s constant”, commonly denoting it with the letter h. Physically, what this means is that whenever this wave trades its energy with something else, like an electron, the amount of energy it trades off is always an integer multiple of h times its frequency. Importantly, this means there is some minimal non-zero energy level for waves of a given frequency – hf. If you have an electromagnetic wave with this frequency and energy, you cannot make it smaller without eliminating it entirely. That feels weird when the conception of a wave is a nice continuously oscillating vector field, but that’s not how the universe works as late 19th and early 20th century experiments started to expose. In fact, I’ve done a video about this, called “the origin of quantum mechanics”. However, it’s worth noting that this phenomenon is actually common in waves when they’re constrained in certain ways, like in pipes or instrument strings, and it’s called “harmonics”. What’s weird is that electromagnetic waves do this in free space, even when they’re not constrained. And what do we call an electromagnetic wave with this minimal possible energy? A photon! But like I said, the math used to describe classical electromagnetic waves carries over to describing a photon.
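For a sense of scale, here is a small sketch of the energy rule E = n·h·f. The value of h is the exact SI value of Planck’s constant; the frequency is just a rough figure for green visible light, and the function names are hypothetical.

```python
PLANCK_H = 6.62607015e-34  # Planck's constant in joule-seconds (exact SI value)

def photon_energy(freq_hz):
    """Energy of a single photon of the given frequency: h * f."""
    return PLANCK_H * freq_hz

def allowed_energies(freq_hz, max_photons):
    """Energies a wave of this frequency can trade: n * h * f for n = 0..max."""
    return [n * photon_energy(freq_hz) for n in range(max_photons + 1)]

green = 5.5e14  # rough frequency of green light in hertz (illustrative)
one_photon = photon_energy(green)
levels = allowed_energies(green, 3)

print(one_photon)   # a few times 1e-19 joules
print(levels)       # zero, one, two, and three photons' worth of energy
```

Nothing between zero and one photon’s worth of energy appears in the list, which is the sense in which a single photon of a given frequency cannot be made any dimmer.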
It might have, say, a 45° diagonal polarization, which can be described as a superposition of a purely horizontal state and a purely vertical state, where each one of these components has some amplitude and phase. And with a different choice of basis, that same state might be described as a superposition of two other directions. All of this is stuff that you would see if you started reading more into quantum mechanics, but this superposition has a different interpretation than before, and it has to. Let’s say you were thinking of this diagonally polarized photon kind of classically, and you said it has an amplitude of 1 unit for some appropriate unit system. Well, that would make the hypothetical amplitudes of its horizontal and vertical components each 1/√2. And like Henry said, the energy of a photon is this special constant h times its frequency. And because in a classical setting, energy is proportional to the square of the amplitude of this wave, it’s tempting to think of half of the energy as being in the horizontal component, and half of it as being in the vertical component. But waves of this frequency cannot have half the energy of a photon. I mean, the whole novelty of quantum here is that energy comes in these discrete indivisible chunks. So these components with an imagined amplitude of 1/√2 could not exist in isolation, and you might wonder what exactly they mean. Well, let’s get experimental about it. If you were to take a vertically oriented polarizing filter, and shoot this diagonally polarized photon right at it, what do you think would happen? Classically, the way you’d interpret the superposition is that the half of its energy in the horizontal direction would be absorbed, but because energy comes in these discrete photon packets, it either has to pass through with all of its energy, or get absorbed entirely.
And if you actually did this experiment, about half the time the photon goes through entirely, and about half the time it gets absorbed entirely, and it appears to be random whether a given photon passes through or not. If it does pass through, forcing it to make a decision like this actually changes it so that its polarization is oriented along the filter’s direction. This is analogous to the classic Schrödinger’s cat setup: you have something that’s in a superposition of two states, but once you make a measurement of that superposition, forcing it to interact with an observer in a way where each of those two states would behave differently, then from the perspective of that observer, this superposition collapses to be entirely in one state or entirely in another: dead or alive, horizontal or vertical. One pretty neat way to see this in action, which Henry and I talked about in the other video, is to take several polarized sunglasses or some other form of polarizing filters and start by holding two of them between you and some light source. If you rotate them to be 90° off from each other, the light source is blacked out completely (or at least with perfect filters it would be), because all of the photons passing through that first one are polarized vertically, so they actually have a 0% chance of passing a filter oriented horizontally. But if you insert a third filter oriented at a 45° angle between the two, it actually lets more light through! And what’s going on here is that 50% of the photons passing that vertical filter will also pass through the diagonal filter. And once they do, they’re going to be changed to have a purely diagonal polarization. And then once they’re in that state, they have a 50/50 chance of passing through the filter oriented at 90°. So even though 0% of the photons passing through the first would pass through that last if nothing was in between, by introducing another filter, 25% of them now pass through all three.
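The arithmetic of the three-filter experiment can be checked with a short sketch. It assumes that the chance of passing a filter is the squared cosine of the angle between the photon’s polarization and the filter, which reproduces the 0% and 50% figures quoted above, and that a photon which passes leaves polarized along the filter. The function names are hypothetical, and the filters are treated as ideal.

```python
import math

def pass_probability(photon_angle_deg, filter_angle_deg):
    """Chance that a photon polarized at one angle passes a filter at another.

    Assumes the probability is cos^2 of the angle between them.
    """
    delta = math.radians(photon_angle_deg - filter_angle_deg)
    return math.cos(delta) ** 2

def fraction_through(filter_angles_deg, start_angle_deg):
    """Fraction of photons surviving a sequence of ideal filters.

    Each photon that passes a filter comes out polarized along that filter.
    """
    fraction = 1.0
    angle = start_angle_deg
    for filter_angle in filter_angles_deg:
        fraction *= pass_probability(angle, filter_angle)
        angle = filter_angle
    return fraction

# Start with photons that already passed a vertical (90 degree) filter.
two_filters = fraction_through([0], start_angle_deg=90)        # horizontal only
three_filters = fraction_through([45, 0], start_angle_deg=90)  # diagonal first

print(two_filters)    # essentially zero
print(three_filters)  # one quarter, up to floating-point rounding
```

The only difference between the two calls is the extra 45° entry in the list, yet it takes the surviving fraction from essentially nothing to a quarter.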
Now that’s something that you could not explain unless that middle filter forces the photons to change their states. And that experiment, by the way, becomes all the weirder when you dig into the specific probabilities for angles between 0 and 45°, and that’s actually what we talked about in the other video. For example, one specific value we focus on there is the probability that a photon whose polarization is 22.5° off the direction of a filter is going to end up passing through that filter. Again, it’s helpful to think of this wave as having an amplitude of 1, and then you think of the horizontal component as having amplitude sin(22.5°), which is around 0.38; and the vertical component would have an amplitude cos(22.5°), which is around 0.92. Classically, you might think of its horizontal component as having energy proportional to 0.38^2, which is around 0.15. Likewise, you might think of the vertical component as having an energy proportional to 0.92^2, which comes out to be around 0.85. And like we said before, classically, this would mean, if you pass it through a vertical filter, 15% of its energy is absorbed in the horizontal direction, but because the energy of light comes in these discrete quanta that cannot be subdivided, instead, what you observe is that 85% of the time the photon passes through entirely, and 15% of the time it gets completely blocked.
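The numbers for the 22.5° case can be reproduced directly, along with a quick simulation of the all-or-nothing behavior. This is a sketch with hypothetical names: it treats each photon as an independent random draw, and the seed and photon count are arbitrary choices.

```python
import math
import random

angle = math.radians(22.5)
horizontal_amp = math.sin(angle)   # about 0.38
vertical_amp = math.cos(angle)     # about 0.92

p_absorb = horizontal_amp ** 2     # about 0.15
p_pass = vertical_amp ** 2         # about 0.85

def send_photons(count, pass_chance, rng):
    """Send photons one at a time; each passes entirely or is absorbed entirely."""
    return sum(1 for _ in range(count) if rng.random() < pass_chance)

rng = random.Random(0)             # fixed seed so the run is repeatable
total = 100_000
passed = send_photons(total, p_pass, rng)

print(round(p_pass, 3), round(p_absorb, 3))
print(passed / total)              # close to p_pass, but not exactly equal
```

No individual photon ever delivers 85% of its energy; the 85% only shows up as a frequency once many photons have been counted.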
Now I want to emphasize that the wave equations don’t change. The photon is still described as a superposition of two oscillating components, each with some phase and amplitude, and these are often encoded using a single complex number. The difference is that classically the squares of the amplitudes of each component tell you the amount of that wave’s energy in each direction; but with quantized light, at this minimal non-zero energy level, the squares of those amplitudes tell you the probabilities that a given photon is going to be found to have all of its energy in one direction or not. Also, these components could still have some kind of phase difference. Just like with classical waves, photons can be circularly polarized, and there exist polarizing filters that only let through photons that are polarized circularly, say in the clockwise direction. Or rather, they let through all photons probabilistically, where the probabilities are determined by describing each one of those photons as a superposition of the clockwise and counterclockwise states, and then the square of the amplitude of the clockwise component gives you the desired probability. Photons are, of course, just one quantum phenomenon, one where we initially understood it as a wave thanks to Maxwell’s equations, and then as individual particles or quanta – hence, the name “quantum mechanics”. But as many of you well know, there’s a flipside to this, where many things that were understood to come in discrete little packets, like electrons, were discovered to be governed by similar wavy quantum mechanics. In cases way more general than this one photon polarization example, quantum mechanical states are described as some superposition of multiple base states, and the superposition depends on what basis you choose. Each component in this superposition is given with an amplitude and a phase, often encoded as a single complex number.
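To see how the phase enters the probabilities, the sketch below describes a photon by a pair of complex numbers (horizontal, vertical) and rewrites it in a circular basis. The particular vectors used for the two circular states are an assumed convention; which of them counts as “clockwise” depends on the direction you view the wave from. All names here are hypothetical.

```python
import math

S = 1 / math.sqrt(2)

# Assumed convention: the vertical part is a quarter cycle behind or ahead.
CLOCKWISE = (S, -1j * S)
COUNTERCLOCKWISE = (S, 1j * S)

def amplitude(basis_state, state):
    """Component of state along basis_state (complex inner product)."""
    return sum(b.conjugate() * s for b, s in zip(basis_state, state))

def probability(basis_state, state):
    """Chance of finding the photon in basis_state: squared magnitude."""
    return abs(amplitude(basis_state, state)) ** 2

diagonal = (S, S)   # linearly polarized at 45 degrees, both parts in phase

print(probability(CLOCKWISE, diagonal))          # one half
print(probability(COUNTERCLOCKWISE, diagonal))   # one half
print(probability(CLOCKWISE, CLOCKWISE))         # certain to pass
print(probability(COUNTERCLOCKWISE, CLOCKWISE))  # certain to be blocked
```

The diagonal state and the clockwise state have the same horizontal and vertical amplitudes, so only the relative phase separates a 50% chance from a 100% chance.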
And the need for this phase arises from the wave nature of these objects. As with the photon example, the choice of how to measure these objects can determine a set of base states, where the probability of measuring a particle to be in one of these base states is proportional to the square of the amplitude of the corresponding number. It’s funny to think, though, that if the wavy nature of electrons and other particles was discovered first, we might instead refer to the whole subject as “harmonic mechanics”, or something like that, since the weirdness there is not that waves come in discrete units, but that particles are governed by wave equations. This video was supported in part by Brilliant. And as viewers of this channel know, what I like about Brilliant is that they’re a great complement to passively watching educational videos.
All of you here want to learn more math, or physics, or the math that prepares you for physics. And the only way to actually learn this stuff is to actively grapple with puzzles and problem solving. Brilliant offers many really well curated sequences of problems that help you to master all sorts of technical subjects. You all like physics clearly, so I think that you would enjoy their courses on “Classical Mechanics” and “Gravitational Physics”. And honestly, “Group Theory” would give you a really good foundation. But there are many other great courses too, especially in math. If you go to brilliant.org/3b1b, that one lets them know that you came from here, and also the first 200 people that go to that link are going to get 20% off the annual Brilliant premium subscription. That’s the subscription I’ve been using, and it’s actually really fun to have a bank of these puzzles and problems. But of course, for those of you who want some more passive viewing, don’t forget that Henry and I just put out a video on Bell’s inequalities over on MinutePhysics. If for some reason you haven’t been following MinutePhysics these days, (and I don’t know why you wouldn’t have been) the videos there have been really top-notch. So definitely take a moment to poke around the rest of his channel.