## Einstein on Brownian Motion

In 1827, botanist Robert Brown observed through his microscope that pollen particles in resting water exhibited motion which he was unable to explain. This motion came to be known as Brownian motion, which now also has a precise mathematical definition as a stochastic (i.e. random) process.

Fast forward to 1905: the scientific community had not yet accepted, as we do today, that matter consists of atoms. Though analysis based on atoms and molecules had proven useful in some applications, many physicists regarded atoms as a hypothetical tool rather than a physical reality. Alternative theories postulated that the structure of the world consisted, for example, of different forms of energy and energy transformations. Albert Einstein's 1905 paper "On the movement of particles suspended in stationary liquids required by the molecular-kinetic theory of heat" (kind of a mouthful, but hey, the guy was German, so what did you expect?) explained Brownian motion as arising from the random bombardment of the pollen particles by moving water molecules.

Einstein was able to derive formulas for the mean square displacement of a suspended particle as well as Avogadro's number, and when Jean Perrin experimentally verified these results in 1908, the remaining skeptics accepted this as evidence of the physical reality of atoms.

In this post, I will walk you through Einstein's pivotal analysis of Brownian motion.

Note: nothing in this post is new- it's all copied from various sources, including Einstein's paper, Robert E. Kennedy's book on the same (as well as other Einstein papers), and other sources from around the internet. I found it very difficult to find any single source that provided a satisfactory explanation of every step in Einstein's paper, so I hope this post, which has taken me quite a while, achieves that and is clear enough to follow easily without the need to scour the internet for additional information. If not, please post any questions in the comments section!

### The setup

We have a container with volume $V$ filled with liquid (the solvent- let's just call it water going forward) and a solute (e.g. sugar) dissolved therein. The container is divided into two sections by a wall through which the water can flow freely, but which is impermeable to the sugar, and the latter is confined to the walled-off section of the container, which has volume $V^*$. The volume of the remainder of the container is then $V-V^*$.

Since the sugar molecules cannot pass through the wall, they bounce back when they hit it and thus exert a pressure on the wall (the water passes right through the wall and thus does not exert a pressure on it). Van 't Hoff's law states that if the number of moles of sugar in the volume $V^*$ is $z$, then this pressure $\Pi$, known as the osmotic pressure, is given by $\Pi V^* = zRT$, where $R$ is the gas constant and $T$ is the temperature in Kelvin. This is the same as the ideal gas law, which is not totally unexpected since the pressure is due to the sugar molecules bouncing around in the volume $V^*$ the same way a gas in a closed container would.

If, instead of sugar, some larger particles are suspended (but not dissolved, the difference lying solely in the particle size) in the water, according to Einstein, the osmotic pressure on the wall should be given by the same formula as that for solutions above; in other words, the osmotic pressure only depends on the number (or more precisely, the concentration, or number per unit volume) of dissolved/suspended particles. The next section shows his derivation based on the "molecular-kinetic theory of heat."

### The osmotic pressure due to suspended particles

In the last post, I showed you Einstein's derivation of a formula for the entropy of $n$ particles in the water bouncing around due to random molecular motion and collisions: $$S = \frac{\bar{E}}{T} + k \ln \left[ \int e^{\frac{-E}{kT}}dp_1 \, dp_2 \, ... \, dp_{3n} \, dq_1 \, dq_2 \, ... \, dq_{3n} \right] \tag{1}$$ Here, $\bar E$ is the total energy of the $n$ particles, $T$ is the absolute temperature in Kelvin, $k$ is the Boltzmann constant (equal to the gas constant $R$ divided by Avogadro's number $N$), and the $p_i$'s and $q_i$'s are the state variables of the system of $n$ particles, representing the 3 components of momentum and position of each particle, hence the indices' running up to $3n$. The integral is taken over the different configurations of the system (i.e. different possible values of position and momentum of the particles). The $E$ inside the integral is the energy of the system as a function of the state variables.

The free energy (Helmholtz free energy) of the system is defined as $F = \bar{E} - TS \ \ \text{(2)}$. If you read the Wikipedia article on Helmholtz free energy, you can see the derivation of the formula $dF = -S \, dT - P \, dV$ via basic calculus (in particular, the product rule for differentiation). From this formula, it's clear that $-\dfrac{\partial F}{\partial V} = P$ when the volume changes by an infinitesimal amount at constant temperature (so that $dT = 0$). We'll use this relation later on.

Plugging in equation (1) for $S$ in equation (2), we obtain \begin{align} F &= -kT \ln \left[ \int e^{\frac{-E}{kT}}dp_1 \, dp_2 \, ... \, dp_{3n} \, dq_1 \, dq_2 \, ... \, dq_{3n} \right] \\[3mm] &= -kT \ln B \tag{3} \end{align} The key insight that makes the nearly-impossible calculation of the integral $B$ unnecessary is that $B$ can be shown to have the form $B = J(V^{*})^n \ \ \text{(4)}$, where $J$ is some function that does not depend on $V^*$. I'll go through the details of that derivation at the end of this post so as not to deviate too far from the more interesting line of reasoning.

Combining equations (3) and (4), we recover van 't Hoff's law for the osmotic pressure $\Pi$: \begin{align} \Pi &= -\dfrac{\partial F}{\partial V^*} \\[3mm] &= \dfrac{\partial}{\partial V^*} kT \ln [ J(V^{*})^n ] \\[3mm] &= \dfrac{\partial}{\partial V^*} kT \, [\ln J + n \ln V^{*} ] \\[3mm] &= kT\dfrac{n}{V^*} \\[3mm] &\Downarrow \\[3mm] \Pi V^* &= (Nk)T\dfrac{n}{N} \\[3mm] \Pi V^* &= zRT \end{align} In the last line, we used the fact that $k = \dfrac{R}{N}$ and denote the number of moles of particles by $z$. We have shown that van 't Hoff's law is a consequence of the molecular-kinetic theory. This law can equivalently be expressed as $\Pi = kT \nu \tag{$\spadesuit$}$ where $\nu = \dfrac{n}{V^*}$ is the concentration of particles in the volume $V^*$.
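
As a quick sanity check on the algebra above, here is a minimal Python sketch that differentiates $F = -kT \ln [J (V^*)^n]$ numerically and compares the result with $\Pi = nkT/V^*$. The particle number, volume, and temperature are made-up illustrative values, and $J$ is simply set to 1 (it drops out of the derivative anyway); none of these numbers come from Einstein's paper.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def free_energy(volume, n, temperature, j=1.0):
    """F = -kT ln(J V^n), expanded as -kT (ln J + n ln V) to avoid overflow."""
    return -K_B * temperature * (math.log(j) + n * math.log(volume))


def osmotic_pressure_numeric(volume, n, temperature, rel_step=1e-6):
    """Central-difference estimate of Pi = -dF/dV* at fixed temperature."""
    h = volume * rel_step
    f_plus = free_energy(volume + h, n, temperature)
    f_minus = free_energy(volume - h, n, temperature)
    return -(f_plus - f_minus) / (2 * h)


def osmotic_pressure_exact(volume, n, temperature):
    """Closed form Pi = n k T / V*, i.e. van 't Hoff's law."""
    return n * K_B * temperature / volume


# Illustrative (assumed) values: 1e15 particles in 1 cm^3 at about 20 C.
n_particles = 1.0e15
volume = 1.0e-6       # m^3
temperature = 293.15  # K

print(osmotic_pressure_numeric(volume, n_particles, temperature))
print(osmotic_pressure_exact(volume, n_particles, temperature))
```

Both lines should print a pressure of about 4 Pa, and changing `j` in `free_energy` leaves the numerical derivative unchanged, which is the whole point of equation (4).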

### The diffusion coefficient

In deriving a formula for the diffusion coefficient (i.e. the constant in the differential equation governing the diffusion of the particles over time; if you aren't familiar with the diffusion equation, you'll see it presently), Einstein introduced a fictitious force and used the fact that the free energy is minimized when the system is in dynamic equilibrium. I couldn't fully understand this argument (if you do, please click here and answer my questions!), so instead, I will show a slightly more direct derivation by letting the force due to osmotic pressure be balanced by friction in the fluid.

Suppose again we have $n$ particles in the volume $V^*$, which has a cross-sectional area $A$ perpendicular to the $x$-axis, and that the concentration of the particles $\nu (x)$ may vary with $x$. Consider the volume $\Delta V^*$ between $x$ and $x+ \Delta x$, which contains $N_1$ particles. We have: $$\frac{\Pi (x) - \Pi (x+ \Delta x)}{\Delta x} = \frac{F(x) - F(x+ \Delta x)}{A \Delta x} = \frac{N_{1} F_{\Pi}}{A \Delta x}$$ where $\Pi$ is the osmotic pressure. $F(x)$ is the force on the cross-sectional area at $x$ (positive $F$ would be to the right), and $F_{\Pi}$ is the average force on a particle in the region $\Delta V^*$. Taking the limit as $\Delta x \rightarrow 0$ gives: $$\frac{\partial \Pi}{\partial x} = - \nu F_{\Pi} \tag{5}$$ Note that both sides are functions of $x$ and time, but I am leaving the $x$'s and $t$'s out to simplify the notation a bit. The $\frac{N_1}{A \Delta x} = \frac{N_1}{\Delta V^*}$ became $\nu$ in the limit since that was the number of particles per unit volume.

The force $F_{\Pi}$ acts on the particles and "competes" with the force of friction in the fluid. The drag force due to friction is proportional to the particle velocity since a faster-moving particle hits more molecules that slow it down; if the particles are spheres, the drag force is given by Stokes' Law: $$F_d = -6 \pi \mu R_{p} v \tag{6}$$ where $\mu$ is the viscosity coefficient of the fluid (measurable by simple experiments), $R_p$ is the particle radius, and $v$ is the particle velocity. This lower-case $\pi$ is the usual 3.14. The negative sign reflects the fact that the drag force points in the opposite direction of the velocity. I'm not going to present a derivation of Stokes' Law since it would be very long, so let's take that as a given and plug on.

Before combining all this, we need one more tidbit. Taking  the partial derivative with respect to $x$ on both sides of equation $(\spadesuit)$ from above, we obtain: $$\frac{\partial \Pi}{\partial x} = kT \frac{\partial \nu}{\partial x}$$ In dynamic equilibrium, $F_{\Pi} + F_{d} = 0$, so combining (5) and (6) with the above gives: \begin{align} - \frac{1}{\nu} \frac{\partial \Pi}{\partial x} &= 6 \pi \mu R_{p} v \\[3mm] \implies -kT \frac{\partial \nu}{\partial x} &= 6 \pi \mu R_{p} \nu v \tag{7} \end{align} Now $\nu v$, the density times the velocity, is the flux of particles (at position $x$ and time $t$), i.e. the number of particles passing through a unit area perpendicular to the $x$-axis, per second. Notice that the units are $\dfrac{\text{particles}}{\text{m}^3} \times \dfrac{\text{m}}{\text{s}} = \dfrac{\text{particles}}{\text{m}^2 \ \text{s}}$, as one would expect of such a quantity.

Fick's First Law states that this flux equals $-D \frac{\partial \nu}{\partial x}$, where $D$ is the diffusion coefficient referred to above. The negative sign means that if the particle density is higher on the left, particles tend to diffuse to the right. $D$ has units of $\frac{\text{m}^2}{\text{s}}$ and sets the rate at which the mean squared displacement of the diffusing particles grows with time (the last section shows that $\left< x^2 \right> = 2Dt$). This Stack Exchange thread has a detailed explanation of the physical interpretation of the diffusion coefficient, and this Wikipedia article presents a quick and simple derivation of Fick's First Law.

Combining Fick's First Law with equation (7), we see that $$D = \frac{kT}{6 \pi \mu R_{p}} = \frac{RT}{N} \frac{1}{6 \pi \mu R_{p}} \tag{8}$$
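
To get a feel for the size of $D$ in equation (8), here is a small Python sketch that plugs in numbers. The temperature, viscosity, and particle radius are assumptions chosen for illustration (roughly: a micron-sized sphere in room-temperature water), not values taken from the derivation above.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def stokes_einstein_diffusion(temperature, viscosity, radius):
    """Equation (8): D = kT / (6 pi mu R_p), in m^2/s for SI inputs."""
    return K_B * temperature / (6 * math.pi * viscosity * radius)


# Assumed illustrative values.
T = 293.15    # temperature in K (about 20 C)
MU = 1.0e-3   # viscosity of water in Pa*s (approximate)
R_P = 0.5e-6  # particle radius in m (1 micron diameter)

D = stokes_einstein_diffusion(T, MU, R_P)
print(f"D = {D:.3e} m^2/s")
```

With these inputs $D$ comes out on the order of $10^{-13} \ \text{m}^2/\text{s}$. Since $D \propto 1/R_p$, doubling the radius halves the diffusion coefficient.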

### Root mean squared displacement

Assume the $n$ particles' movements are independent and introduce a time interval $\tau$. This time interval is much shorter than the time intervals of observation but long enough that we can consider the movements of a single particle in consecutive time intervals of length $\tau$ to be independent.

Since the movements of the $n$ particles are independent of each other, we can think of them as $n$ different stochastic processes. In other words, we can consider them $n$ separate observations of results of the same random experiment. In a time interval $\tau$, the $x$-coordinate of the position of any given particle will change by some amount $\Delta$. If a particle moves to the right, $\Delta > 0$, and if it moves to the left, $\Delta < 0$. The values of $\Delta$ follow some probability distribution $\phi (\Delta)$ so that after a time period $\tau$ elapses, the proportion of particles $\frac{dn}{n}$ which have experienced a displacement in the $x$ direction between $\Delta$ and $\Delta + d \Delta$ satisfies the equation $$\frac{dn}{n} = \phi(\Delta) \, d \Delta$$ There are a few conditions that the probability distribution $\phi$ should satisfy. First, all probabilities should add up to 1, so $$\int \limits_{-\infty}^{\infty}{\phi(\Delta) \, d \Delta} = 1$$ Also, since $\tau$ is small, $\phi( \Delta)$ should be zero except for very small absolute values of $\Delta$. Finally, a particle should be equally likely to have been displaced to the left or right, so $\phi$ should be an even function, i.e. $$\phi(\Delta) = \phi( - \Delta)$$ If we denote the particle density by $\nu = f(x,t)$, then by the definition of $\phi$, we have: $$f(x,t + \tau) = \int \limits_{-\infty}^{\infty}{f(x+ \Delta , t) \phi(\Delta) \, d \Delta} \tag{9}$$ This equation states that the number of particles at position $x$ at time $t + \tau$ is the sum (integral) of the numbers of particles that were at positions $x + \Delta$ at time $t$ and were displaced by $-\Delta$ in the time $\tau$ (recall that $\phi(-\Delta) = \phi(\Delta)$), summed over the possible values of $\Delta$. 
Since $\phi(\Delta)$ is assumed to be non-zero only for very small values of $\Delta$, extending the integral out from $-\infty$ to $\infty$ doesn't change the value, but it will be useful below since it will allow us to use well known results for Gaussian integrals.

Now we can expand $f$ into a Taylor series on both sides of equation (9). On the left, we have \begin{align} f(x, t + \tau) &= f(x,t) + \tau \frac{\partial f}{\partial t} + \frac{1}{2} \tau^{2} \frac{\partial^{2} f}{\partial t^2} + ... \\[3mm] &\approx f(x,t) + \tau \frac{\partial f}{\partial t} \end{align} On the right side, we need to expand out to second order (you'll see why in a second)- note that it is justified to do the Taylor expansion inside the integral since we are expanding around $\Delta = 0$, and only small values of $\Delta$ (i.e. those for which the Taylor expansion is accurate) contribute anything to the integral: \begin{align} f(x,t) + \tau \frac{\partial f}{\partial t} &\approx \int \limits_{-\infty}^{\infty}{\left[ f(x,t) + \Delta \frac{\partial f}{\partial x} + \frac{1}{2} \Delta^{2} \frac{\partial^{2} f}{\partial x^2} \right] \phi(\Delta) \, d \Delta} \\[3mm] &= f(x,t) \int \limits_{-\infty}^{\infty}{\phi(\Delta) \, d \Delta} + \frac{\partial f}{\partial x} \int \limits_{-\infty}^{\infty}{\Delta \phi(\Delta) \, d \Delta} + \frac{1}{2} \frac{\partial^{2} f}{\partial x^2} \int \limits_{-\infty}^{\infty}{\Delta^{2} \phi(\Delta) \, d \Delta} \end{align} The first integral on the right-hand side is equal to 1, and the second is zero since $\phi(\Delta)$ is an even function (so $\Delta \phi(\Delta)$ is an odd function). So the above simplifies to $$\frac{\partial f}{\partial t} = \frac{1}{2 \tau} \frac{\partial^{2} f}{\partial x^2} \int \limits_{-\infty}^{\infty}{\Delta^{2} \phi(\Delta) \, d \Delta}$$ For any choice of the distribution $\phi$, the quantity $\frac{1}{2 \tau} \int_{-\infty}^{\infty}{\Delta^{2} \phi(\Delta) \, d \Delta}$ will be a constant, which we call $D$. 
It follows that the particle density $f$ satisfies the well known diffusion equation (or Fick's Second Law): $$\frac{\partial f}{\partial t} = D \frac{\partial^{2} f}{\partial x^2} \tag{10}$$ This $D$ is the same one referred to in Fick's First Law from above (in fact, the same Wikipedia article shows a derivation of the diffusion equation from Fick's First Law). This means that the formula derived above for the diffusion coefficient is valid here, and equation (10) completely describes the evolution of the particle density.
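
The definition $D = \frac{1}{2 \tau} \int \Delta^{2} \phi(\Delta) \, d \Delta$ is easy to try out on a concrete step distribution. The sketch below assumes a uniform $\phi$ on $[-a, a]$, which is my choice for illustration only (Einstein leaves $\phi$ unspecified), and checks the three properties used above: normalization, vanishing first moment, and a second moment that fixes $D$. The values of `A` and `TAU` are hypothetical.

```python
import math

A = 1.0e-8    # assumed maximum step size in m
TAU = 1.0e-4  # assumed time interval tau in s


def phi_uniform(delta):
    """Even step distribution: uniform on [-A, A], zero elsewhere."""
    return 1.0 / (2 * A) if abs(delta) <= A else 0.0


def moment(phi, order, half_width, samples=20_000):
    """Midpoint-rule estimate of the integral of delta**order * phi(delta)."""
    step = 2 * half_width / samples
    total = 0.0
    for i in range(samples):
        delta = -half_width + (i + 0.5) * step
        total += delta**order * phi(delta)
    return total * step


norm = moment(phi_uniform, 0, A)    # should be close to 1
mean = moment(phi_uniform, 1, A)    # should be close to 0, since phi is even
second = moment(phi_uniform, 2, A)  # analytically A**2 / 3 for this phi

D = second / (2 * TAU)
print(norm, mean, D, A**2 / (6 * TAU))
```

For the uniform distribution the analytic answer is $D = a^2 / (6 \tau)$, which the numerical value should reproduce closely. Any other even, normalized $\phi$ could be swapped in for `phi_uniform`.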

If we consider a separate coordinate system for each particle whose origin is at the particle's position at $t=0$, then equation (10) still holds, except now $f(x,t)$, instead of describing the number of particles at position $x$ at time $t$, would describe the number of particles experiencing a displacement of $x$ from their respective initial positions in an elapsed time $t$. Given the conditions $f(x,0)=0$ for $x \neq 0$ and $\int \limits_{-\infty}^{\infty}{f(x,t) \, dx} = n$, the differential equation (10) is that of diffusion from a point, and its solution is: $$f(x,t) = \frac{n}{\sqrt{4 \pi D t}} \exp \left(\frac{-x^2}{4Dt} \right)$$ Since $f(x,t)$ represents a number of particles, $\frac{1}{n} f(x,t)$ is the probability density for a given particle to experience a displacement $x$ in time $t$. Indeed, using the Gaussian integral formula $\int_{-\infty}^{\infty}{e^{-ax^2}dx} = \sqrt{\frac{\pi}{a}}$ with $a=\frac{1}{4Dt}$, we see that $\int_{-\infty}^{\infty}{\frac{1}{n}f(x,t) \, dx}=1$. It follows that the mean value of the squared displacement is: \begin{align} \left< x^2 \right> &= \int \limits_{-\infty}^{\infty}{x^{2} \frac{1}{n} f(x,t) \, dx} \\ &= \frac{1}{\sqrt{4 \pi Dt}}\int \limits_{-\infty}^{\infty}{x^{2} \exp \left( \frac{-x^2}{4Dt} \right) \, dx} \\[3mm] &= 2Dt \end{align} where the last step utilized the Gaussian integral formula $\int_{-\infty}^{\infty}{x^{2} e^{-ax^2} \, dx} = \frac{1}{2} \sqrt{\frac{\pi}{a^3}}$, again with $a=\frac{1}{4Dt}$. Finally, we see that the root mean squared displacement, a measure of the average distance a particle is expected to travel in time $t$, is $\sqrt{\left< x^2 \right>} = \sqrt{2Dt}$, which is proportional to the square root of $t$.
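
The relation $\left< x^2 \right> = 2Dt$ can also be checked with a small Monte Carlo experiment. The sketch below is only an illustration under stated assumptions: each particle takes independent Gaussian steps with variance $2 D \tau$ (one convenient choice of an even $\phi$), and the values of `D`, `TAU`, the step count, and the particle count are made up. It uses only Python's standard library.

```python
import math
import random


def simulate_msd(diffusion, tau, n_steps, n_particles, seed=0):
    """Mean squared displacement of independent 1D random walkers.

    Each step is drawn from a zero-mean Gaussian with variance
    2 * diffusion * tau, so that <Delta^2> / (2 tau) equals `diffusion`.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(2 * diffusion * tau)
    total_sq = 0.0
    for _ in range(n_particles):
        x = 0.0
        for _ in range(n_steps):
            x += rng.gauss(0.0, sigma)
        total_sq += x * x
    return total_sq / n_particles


# Assumed illustrative values.
D = 4.3e-13    # diffusion coefficient in m^2/s
TAU = 1.0e-3   # time per step in s
N_STEPS = 200  # total elapsed time t = N_STEPS * TAU

msd = simulate_msd(D, TAU, N_STEPS, n_particles=5000)
expected = 2 * D * TAU * N_STEPS  # 2 D t

print(f"simulated <x^2> = {msd:.3e} m^2")
print(f"2 D t           = {expected:.3e} m^2")
print(f"rms displacement = {math.sqrt(msd):.3e} m")
```

With 5000 walkers the simulated value should land within a few percent of $2Dt$. Quadrupling `N_STEPS` should roughly double the rms displacement, reflecting the $\sqrt{t}$ scaling.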

This result was verified by experiment not long after Einstein published his paper, and this is often considered the defining result that convinced the remaining skeptics at the time that atoms existed.

#### Appendix: derivation of $B = J (V^*)^n$
Consider the $n$ particles at positions $(x_1 , y_1, z_1 ), ... , (x_n , y_n, z_n )$, each surrounded by an infinitesimal parallelepiped region $dx_i \, dy_i \, dz_i$ which is contained in $V^*$. Consider the same $n$ particles, but now at different positions $(x_i ', y_i ', z_i ')$ and surrounded by parallelepiped regions of the same size as before (thus no label change necessary on those). Writing $$dB = J \, dx_1 \, dy_1 \, ... \, dz_n$$ and $$dB' = J' \, dx_1 \, dy_1 \, ... \, dz_n$$ we see that $\dfrac{dB}{dB'} = \dfrac{J}{J'}$.
The probability that the system is in the first configuration is $dB$ divided by the integral over all configurations, $B$, and likewise, the probability the system is in the second configuration is $\frac{dB'}{B}$. Since the particles move independently and the parallelepiped regions are the same size, the two probabilities must be equal, and so $$\frac{dB}{B} = \frac{dB'}{B} \implies dB = dB' \implies J = J'$$ This shows that $J$ is independent of the particle positions and of the volume $V^*$, and so \begin{align} B = \int dB &= \int \limits_{V^*}{J \, dx_1 \, dy_1 \, ... \, dz_n} \\[3mm] &= J \int \limits_{V^*}{dx_1 \, dy_1 \, ... \, dz_n} \\[3mm] &= J (V^*)^n \ \tag*{\square} \end{align}