Intuitive Understanding of the Clausius Definition of Entropy

This post came from my answer to a Physics Stack Exchange question. In classical thermodynamics, we define the change in a system’s entropy when it absorbs heat $\delta Q$ at temperature $T$ to be $\delta S = \frac{\delta Q}{T}$. Why?

The important thing here is that in classical (Clausius’s) thermodynamics there was found to be a new function of state, i.e. a quantity which depends only on a system’s macroscopic state and not on the path the system takes in reaching that state. This finding of a new function of state allows a new (as of Clausius’s time) definition of temperature. So the equation $\delta S = \frac{\delta Q}{T}$ is actually a definition of temperature: temperature is that quantity which $\delta Q$ must be divided by to make it into an exact differential (an infinitesimal change of a function of state).

It all goes back to the very first formulations of the second law of thermodynamics by Carnot and Clausius, to wit, that it is impossible to build a perpetual motion machine of the second kind or “heat can never pass spontaneously from a colder to a warmer body”, and to the implications of this law for the efficiencies of heat engines. A perpetual motion machine of the second kind is one whose state undergoes a cycle in state space and, on returning to its beginning state, has pumped heat from a colder to a hotter body without any input of work.

The Wikipedia page on Temperature under the heading “Second law of Thermodynamics” gives a reasonable summary of these ideas; “The Laws of Thermodynamics” (Chapter 44) in the first volume of the Feynman Lectures on Physics is a much fuller exposition.

It all comes down to the efficiencies of reversible heat engines, which, in Carnot’s conception, either (i) draw heat from a hotter (“higher temperature”, not yet well defined) reservoir and dump some of it to another, cooler (“lower temperature”, not yet well defined) reservoir whilst outputting the difference as useful work or (ii) work in the inverse way, taking in mechanical work to pump heat from the cooler to the hotter body. A “reservoir” here is a body that is so big that any amount of heat added to or taken from it does not appreciably change its macrostate.

Consider a thought experiment in which the work output of one reversible heat engine, taking heat from the hot reservoir and dumping it to the cold reservoir, is used to drive another reversible engine pumping heat in the opposite direction. After a little work with this idea, it readily follows that the efficiencies of the two reversible heat engines must be the same. Otherwise, if one efficiency were greater than the other, we could run the less efficient engine backwards as the heat pump, drive it with the more efficient engine, and so pump heat from the cold to the hot reservoir with no net input of work, violating the Carnot / Clausius statement of the second law. So we now have Carnot’s theorem that:

The efficiencies of all reversible heat engines working between the same two reservoirs are the same and depend only on those reservoirs, not on the internal workings of the heat engines.
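The bookkeeping behind this argument can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `couple_engines` and the sample efficiencies are made up for the example. One engine runs forwards, all of its work drives a second reversible engine run backwards as a heat pump, and the function reports the net heat delivered to the hot reservoir.

```python
def couple_engines(eta_engine, eta_pump, q_hot_in=1.0):
    """Drive a reversed engine (heat pump) with the work of a forward engine.

    eta_engine: efficiency (work out / heat drawn from the hot reservoir)
                of the engine run forwards.
    eta_pump:   efficiency the second engine would have if run forwards.
                Being reversible, when run backwards it delivers one unit
                of heat to the hot reservoir per eta_pump units of work.
    Returns the net heat delivered to the hot reservoir. By energy
    conservation the same amount is removed from the cold reservoir,
    and no net work enters or leaves the combined machine.
    """
    work = eta_engine * q_hot_in        # work produced by the forward engine
    q_hot_returned = work / eta_pump    # heat the pump delivers to the hot side
    return q_hot_returned - q_hot_in


# Hypothetical efficiencies, chosen only to show the three cases.
print(couple_engines(0.4, 0.4))  # equal efficiencies
print(couple_engines(0.5, 0.4))  # forward engine more efficient than the pump
print(couple_engines(0.4, 0.5))  # forward engine less efficient than the pump
```

A positive result means heat has passed from the cold to the hot reservoir with no work input, which the second law forbids. Since either reversible engine may play either role, the only consistent possibility is that the two efficiencies are equal.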

Once you understand this, you have a way of comparing different reservoirs from the point of view of ideal heat engines. Namely, we take a particular reservoir as a standard and call its temperature unity, by definition. Then, if we run a reversible heat engine between a hotter reservoir and this one, and $t$ units of heat are taken from the hotter one for each unit of heat dumped to our standard reservoir (thus producing $t-1$ units of work), then we shall call the temperature of the hotter one $t$ units, by definition. Likewise, if we run a reversible heat engine between our standard reservoir and a colder one and we find that our reservoir delivers $t$ units of heat to the engine for every unit dumped to the colder reservoir, then the colder one is by definition at a temperature of $\frac{1}{t}$ units. In general, the heats exchanged by a reversible heat engine working between reservoirs of temperatures $T_1$ and $T_2$ ($T_1>T_2$) defined in this way (i.e. heat $Q_1$ is drawn from reservoir 1 and heat $Q_2$ is dumped into reservoir 2, thus producing work $Q_1 - Q_2$) are always in the same proportion, given by:

$$\frac{Q_1}{T_1} = \frac{Q_2}{T_2}$$
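As a small worked illustration of this temperature scale, here is a Python sketch. The helper names and the heat ratios (3 and 2) are hypothetical, chosen only for the example. It assigns temperatures to a hotter and a colder reservoir from heat ratios measured against the standard one, then chains the two engines through the standard reservoir and checks that the combined engine obeys the same relation.

```python
T_STANDARD = 1.0  # the standard reservoir has unit temperature by definition


def hotter_temperature(heat_drawn_per_unit_dumped):
    """Temperature of a reservoir hotter than the standard one.

    A reversible engine draws this many units of heat from the hotter
    reservoir for each unit it dumps into the standard reservoir.
    """
    return T_STANDARD * heat_drawn_per_unit_dumped


def colder_temperature(heat_drawn_per_unit_dumped):
    """Temperature of a reservoir colder than the standard one.

    A reversible engine draws this many units of heat from the standard
    reservoir for each unit it dumps into the colder reservoir.
    """
    return T_STANDARD / heat_drawn_per_unit_dumped


# Hypothetical measurements: 3 units drawn per unit dumped on the hot side,
# 2 units drawn per unit dumped on the cold side.
t_hot = hotter_temperature(3.0)
t_cold = colder_temperature(2.0)

# Chain the two engines so the standard reservoir gains no net heat.
q_cold = 1.0                # heat finally dumped into the cold reservoir
q_standard = 2.0 * q_cold   # heat passed through the standard reservoir
q_hot = 3.0 * q_standard    # heat drawn from the hot reservoir
work = q_hot - q_cold       # total work delivered by both engines
efficiency = work / q_hot

print(q_hot / t_hot, q_cold / t_cold)      # the two ratios Q/T
print(efficiency, 1.0 - t_cold / t_hot)    # efficiency computed two ways
```

The two printed ratios agree, and the efficiency of the chained engine matches $1 - T_2/T_1$, so the scale is self-consistent in this example.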

From this definition, it then follows that $\frac{\delta Q}{T}$ is an exact differential because $\int_a^b \frac{\delta Q}{T}$ between states $a$ and $b$ in state space must be independent of the (reversible) path taken (otherwise one can violate the second law as formulated by Clausius / Carnot). This last statement is not altogether obvious: you’ll have to look at Feynman for details. So we have this new function of state, “entropy”, defined to increase by the exact differential $\mathrm{d} S = \delta Q / T$ when a system reversibly absorbs heat $\delta Q$.
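Path independence can be checked numerically in a simple special case. The Python sketch below is an assumption-laden illustration and not the general proof: it takes a monatomic ideal gas with $nR = 1$, uses the first law in the form $\delta Q = C_V\,\mathrm{d}T + (nRT/V)\,\mathrm{d}V$ for reversible changes, and integrates along three different routes between the same two states in the $(V, T)$ plane. All names and the chosen end states are made up for the example.

```python
import math

N_R = 1.0        # n*R, assumed units
C_V = 1.5 * N_R  # constant-volume heat capacity of a monatomic ideal gas


def integrate_path(points, steps=2000):
    """Integrate dQ and dQ/T along straight segments in the (V, T) plane.

    points: list of (volume, temperature) corner states of the path.
    Uses the midpoint rule with dQ = C_V dT + (N_R T / V) dV.
    Returns (total heat absorbed, integral of dQ/T).
    """
    q_total = 0.0
    s_total = 0.0
    for (v0, t0), (v1, t1) in zip(points, points[1:]):
        dv = (v1 - v0) / steps
        dt = (t1 - t0) / steps
        for i in range(steps):
            v = v0 + (i + 0.5) * dv  # midpoint of the small step
            t = t0 + (i + 0.5) * dt
            dq = C_V * dt + N_R * t / v * dv
            q_total += dq
            s_total += dq / t
    return q_total, s_total


start, end = (1.0, 300.0), (2.0, 600.0)
path_a = [start, (2.0, 300.0), end]  # expand isothermally, then heat
path_b = [start, (1.0, 600.0), end]  # heat first, then expand isothermally
path_c = [start, end]                # straight line in the (V, T) plane

for name, path in (("A", path_a), ("B", path_b), ("C", path_c)):
    q, s = integrate_path(path)
    print(name, q, s)
```

The heat absorbed differs from path to path, but the integral of $\delta Q / T$ comes out the same on all three, as it must if it is the change of a function of state.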

So the entropy expression is the way it is owing to the way we define the thermodynamic temperature, which definition is in turn justified by Carnot’s theorem. Note that this is partially convention: we could have defined the ratio of two temperatures as the square of the ratio of heats above instead of simply the ratio of heats, and then we would need to define $\mathrm{d} S = \delta Q /\sqrt{T}$. Clearly it makes life simpler to define temperature as we have above.

What happens in practice is the following. Now that we have a definition of the ratio of temperatures in terms of the efficiency $\eta$ of the reversible heat engine running between reservoirs of these temperatures:

$$\frac{T_2}{T_1} = 1-\eta$$

one defines a “standard” unit temperature (e.g. as something like that of the triple point of water), then the full temperature definition follows. This definition can be shown to be equivalent to the definition of temperature for a system:

$$T^{-1} = k\,\partial_U S$$

i.e. the inverse temperature (sometimes quaintly called the “perk”) is how much a given system “thermalizes” (increases its entropy) in response to the adding of heat to its internal energy $U$ (how much the system rouses or “perks up”). The Boltzmann constant depends on how one defines one’s unit temperature: in natural (Planck) units, unit temperature is defined so that $k = 1$.

With our knowledge of statistical mechanics, thermodynamic entropy can be defined in terms of marginal information and is tightly linked to the information-theoretic Shannon entropy. The two are equal for a thermalized system of perfectly uncorrelated (statistically independent) constituents: the special case envisaged by Boltzmann’s Stosszahlansatz (molecular chaos, although Boltzmann’s own word means “collision number hypothesis”). So the equation $\delta S = \frac{\delta Q}{T}$ becomes even more a definition of temperature in terms of more basic ideas of energy and information. Note, however, that the thermodynamic entropy calculated from marginal distributions, $-N \sum_j p_j \log p_j$, is not in general equal to the Shannon entropy in these units: one in general has to take into account correlations between particles, which lessen the entropy (because particle states partially foretell other particle states). See, for a good explanation of these ideas:

E. T. Jaynes, “Gibbs vs Boltzmann Entropies”, Am. J. Phys. 33, number 5, pp 391-398, 1965

as well as many other of his works in this field.
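The effect of correlations can be seen in a toy example. The Python sketch below uses two two-state “particles” with made-up joint distributions (they are not taken from Jaynes’s paper) and compares the Shannon entropy of the joint distribution with the sum of the entropies of the two marginal distributions.

```python
import math


def shannon_entropy(probabilities):
    """Shannon entropy in nats; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)


def marginal(joint, index):
    """Marginal distribution of one particle from a joint distribution.

    joint maps (state of particle 0, state of particle 1) to a probability.
    """
    out = {}
    for states, p in joint.items():
        out[states[index]] = out.get(states[index], 0.0) + p
    return list(out.values())


def entropies(joint):
    """Return (joint entropy, sum of the two marginal entropies)."""
    joint_entropy = shannon_entropy(joint.values())
    marginal_sum = sum(shannon_entropy(marginal(joint, i)) for i in range(2))
    return joint_entropy, marginal_sum


# Hypothetical joint distributions with identical marginals (0.5, 0.5).
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
correlated = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

print(entropies(independent))
print(entropies(correlated))
```

For the independent case the two numbers agree. For the correlated case the joint entropy is smaller than the sum of the marginal entropies, because knowing one particle’s state partly foretells the other’s.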

Now temperature is sometimes said to be proportional to the average energy of a system’s thermalized constituent particles. This is true for ideal gases, but it is not the general definition. To see this, let’s apply the definition $T^{-1} = k\,\partial_U S$ to a thermalized system of quantum harmonic oscillators: suppose they are at distinguishable positions. At thermodynamic equilibrium, the Boltzmann distribution for the ladder number (number of photons / phonons in a given oscillator) is:

$$p(n) = \left(e^{\beta\,  \hbar\,\omega }-1\right) e^{-\beta\,\hbar\, \omega \, (n+1)}$$

The mean oscillator energy is then:

$$\left<E\right> = \frac{\hbar\,\omega}{2}\,\coth\left(\frac{1}{2}\,\beta\,\hbar\,\omega \right)$$

The Shannon entropy (per oscillator) is then:

$$S = -\sum\limits_{n = 0}^\infty p(n) \log p(n) = \frac{\beta\,\hbar\,\omega\,e^{\beta  \,\hbar\,\omega}}{e^{\beta\,\hbar\,\omega}-1} - \log \left(e^{\beta\,\hbar\,\omega}-1\right)$$

so the thermodynamic temperature, in units where $k = 1$, is then given by (noting that the only way we change this system’s energy is by varying $\beta$):

$$T^{-1} = \partial_{\left<E\right>} S = \frac{\mathrm{d}_\beta S}{\mathrm{d}_\beta \left<E\right>} = \beta$$

but this temperature is not equal to the mean particle energy at very low temperatures; the mean particle energy is:

$$\begin{array}{lcl}\left<E\right> &=& \frac{1}{\beta}+\frac{1}{12}\,\beta\,\hbar^2\omega^2-\frac{1}{720}\,\beta^3 \,\hbar^4\,\omega^4+\frac{\beta^5\,\hbar^6\,\omega^6}{30240}+O\left(\beta^7\right) \\
&=&  T+\frac{1}{12}\,T^{-1}\,\hbar^2\,\omega^2-\frac{1}{720}\,T^{-3}\,\hbar ^4\, \omega^4+\frac{T^{-5}\,\hbar^6\,\omega^6}{30240}+O\left(T^{-7}\right)\end{array}$$

so you can see that the definition of temperature as the mean particle energy is recovered for $T \gg \hbar\omega$, the photon energy.
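The oscillator results above can be checked numerically. The following Python sketch assumes units in which $\hbar\omega = 1$ and $k = 1$; the function names are made up for the example. It compares the closed-form entropy with a direct sum over $p(n)$, estimates $\partial_{\left<E\right>} S$ by finite differences in $\beta$, and compares the mean energy with both $T$ and the four-term series.

```python
import math

HBAR_OMEGA = 1.0  # assumption: units with hbar*omega = 1 and k = 1


def mean_energy(beta):
    """Closed-form mean energy, (hbar*omega/2) * coth(beta*hbar*omega/2)."""
    return 0.5 * HBAR_OMEGA / math.tanh(0.5 * beta * HBAR_OMEGA)


def entropy(beta):
    """Closed-form Shannon entropy per oscillator, in nats."""
    x = beta * HBAR_OMEGA
    return x * math.exp(x) / math.expm1(x) - math.log(math.expm1(x))


def entropy_by_sum(beta, n_max=2000):
    """Direct evaluation of -sum p(n) log p(n), truncated after n_max terms."""
    x = beta * HBAR_OMEGA
    total = 0.0
    for n in range(n_max):
        log_p = math.log(math.expm1(x)) - x * (n + 1)
        total -= math.exp(log_p) * log_p
    return total


def inverse_temperature(beta, h=1e-5):
    """Central-difference estimate of dS/d<E>, both varied through beta."""
    d_s = entropy(beta + h) - entropy(beta - h)
    d_e = mean_energy(beta + h) - mean_energy(beta - h)
    return d_s / d_e


def energy_series(temperature):
    """The four-term high-temperature expansion of the mean energy."""
    w, t = HBAR_OMEGA, temperature
    return t + w**2 / (12 * t) - w**4 / (720 * t**3) + w**6 / (30240 * t**5)


for beta in (0.1, 1.0, 5.0):
    t = 1.0 / beta
    print("beta =", beta)
    print("  entropy closed form vs sum:", entropy(beta), entropy_by_sum(beta))
    print("  dS/d<E>:", inverse_temperature(beta))
    print("  <E>, T, series:", mean_energy(beta), t, energy_series(t))
```

In this sketch the finite-difference slope reproduces $\beta$ at every temperature tried, whereas the mean energy approaches $T$ only when $T \gg \hbar\omega$. At low temperature it tends to the zero-point value $\hbar\omega/2$ instead, and the truncated series is no longer a good approximation there.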