While attending an event in Syracuse, New York, I got to talking with an older chemical engineer who had once worked with my dad at Bristol-Myers Laboratories. I shared that I was writing a book on thermodynamics, and we spoke about it for a while. As we wrapped up, he looked at me and said, “You know, I never understood entropy.” I’m sure it wasn’t the first time that sentence had been spoken.
What is it about entropy that creates such a stumbling block to learning thermodynamics? More importantly, how can this stumbling block be removed? That’s the question in front of me and likely other educators right now. It’s a challenge that I am taking on, as you’ll see in my upcoming posts.
But before I start taking on that challenge, I must first better understand entropy myself. To this end, there’s a certain feature of entropy that I’ve never completely understood, a feature that has been one of my own stumbling blocks. In this post, I share this situation with you in the form of a Riddle me this question. Let me give you some context first, and then I’ll get to my question.
Entropy is a property of state that we can’t measure, feel, or viscerally understand
Entropy (S) is a property of state that quantifies the most probable distribution of a system of particles over location and velocity (momentum) for given fixed properties of the system, such as volume (V) and energy (U). One could consider the following a valid statement: S = f(V,U). Unfortunately, we can’t directly measure or otherwise sense this property; we must calculate it instead. Thus, our understanding of entropy must come from an abstract thought process rather than from gut feel. No wonder it’s so challenging to understand.
Given this, let’s start a discussion about the two commonly used approaches to understanding entropy. The first approach involves probability. The second approach involves the famous equation, dS = δQrev/T, and the corresponding fact that absolute entropy can be calculated by integrating this equation from absolute zero to a given temperature, since the entropy of a pure crystalline substance is zero at absolute zero. The fascinating thing about these two seemingly different approaches is that they are fundamentally connected.
Entropy and probability
Have you ever put a drop of dye into a glass of water and watched what happens? The small drop spreads out and eventually disperses throughout the water, resulting in a nice, light, uniform shade of color. There’s no internal force making this happen. Instead, the uniform spread results from the random motion of the dye molecules in the water, no direction of motion being statistically more probable than any other direction.
To generalize this example, randomization in nature leads to events that are likely or “probable” to occur. The uniform spread of dye molecules through water and the uniform spread of gas molecules through a room are both highly probable. The 50:50 distribution of heads and tails after flipping a coin many times is highly probable. This is how randomization, probability, and large numbers work.
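As a small illustration of the “large numbers” part (my own minimal sketch, not part of the original discussion), you can simulate fair coin flips and watch the heads fraction lock in ever more tightly around 50% as the number of flips grows:

```python
import random

# Simulate fair coin flips and report how far the heads fraction strays from 0.5.
# As the number of flips grows, the deviation shrinks: the 50:50 split dominates,
# not because anything pushes the coin there, but because vastly more flip
# sequences are consistent with it.
for n_flips in (10, 1_000, 100_000, 10_000_000):
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    fraction = heads / n_flips
    print(f"{n_flips:>10,} flips: heads fraction = {fraction:.4f} "
          f"(deviation from 0.5 = {abs(fraction - 0.5):.4f})")
```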
Taking a conceptual leap here, the absence of energy gradients in a system, such as those involving mechanical (pressure), thermal (temperature), and chemical (chemical potential) forms, is highly probable; the presence of such energy gradients is highly improbable, for they don’t naturally occur. When they are made to occur, such as when you drop a hot metal object into a cold lake, the temperature gradient between the two dissipates over time. You won’t see the gradient become larger over time; the metal object won’t become hotter. The probability-based mathematics involved with statistical mechanics shows why this is so.
Taking another, larger, conceptual leap, assume you have a system comprising a very large number of helium atoms for which the total energy (U) is simply the total number of atoms times their average kinetic energy. If you put all of these atoms into a single container of volume V and energy U, then, absent any external energy field like gravity, the atoms will spread themselves out to achieve a uniform density and a Maxwell-Boltzmann distribution of energies. (See illustration detail at right, from the larger illustration at the end of this post.)

While scope limitations prevent me from diving into the mathematics behind these distributions, suffice it to say that while the presence of the moving atoms inside the system may seem very chaotic and random (if you could actually see them), the reality is the opposite. The distributions of the atoms by location and velocity (momentum) have beautiful structures, ones defined as being the most probable.
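To make that claim a little more concrete, here is a toy simulation (my own sketch, not Boltzmann’s method) in which identical atoms repeatedly share energy through random pairwise collisions. Nothing but randomization is assumed, yet the energy histogram settles into the exponentially decaying Boltzmann form, the same e^(−E/kT) factor that shapes the Maxwell-Boltzmann distribution:

```python
import random

# Toy model (a sketch of randomization at work, not a real molecular simulation):
# every "atom" starts with the same energy, then random pairs repeatedly pool and
# re-split their combined energy at random. Total energy is conserved and no force
# or favoritism is assumed, yet the energy histogram relaxes toward the
# exponentially decaying Boltzmann form.
random.seed(1)
N = 100_000
energies = [1.0] * N                  # everyone starts identical
for _ in range(10 * N):               # many random pairwise "collisions"
    i, j = random.randrange(N), random.randrange(N)
    total = energies[i] + energies[j]
    split = random.random()
    energies[i], energies[j] = split * total, (1.0 - split) * total

# Crude text histogram: counts per unit-energy bucket fall off roughly exponentially.
buckets = [0] * 8
for e in energies:
    buckets[min(int(e), 7)] += 1
for k, count in enumerate(buckets):
    print(f"energy {k}-{k + 1}: {count:6d} atoms  {'#' * (count // 2000)}")
```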
It was Ludwig Boltzmann who developed the mathematics that revealed these distributions. His mathematics involved no assumption of an acting force or of any kind of favoritism, only of probability, the probability of nature’s natural, random tendencies. And it was also Boltzmann who used these mathematics to explain the second approach to understanding entropy.
Entropy and δQ/T
Many of us learned about entropy by the following equation that Rudolf Clausius discovered deep in his theoretical analysis of Sadi Carnot’s heat engine:
dS = δQrev/T [1]
An infinitesimal change in entropy (dS) equals the infinitesimal amount of energy added to the system by reversible thermal energy exchange with another system (δQrev) divided by the absolute temperature (T). This equation enabled Clausius to complete the first (differential) fundamental equation of state. He started with the 1st Law of Thermodynamics
dU = δQ – δW
and made two substitutions, one with equation [1] above and the other by assuming a fluid system:
dU = TdS – PdV
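Spelled out, the two substitutions look like this (just a restatement of Clausius’s steps in the notation used above, with the reversible work of a fluid taken as expansion against pressure P):

dU = δQrev – δWrev
δQrev = TdS     (equation [1])
δWrev = PdV     (reversible expansion of a fluid)
dU = TdS – PdV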
The arrival of [1], together with the later realization that the entropy of a substance in its pure crystalline form equals zero at absolute zero, enabled calculation of absolute entropy at any temperature. Using heat capacity and phase-change data, [1] can be mathematically integrated from absolute zero to yield absolute entropy. In this way, entropy quantifies the total amount of thermal energy (Q), adjusted by division by T, required to construct the system of moving atoms and molecules, including all forms of motion such as translation, vibration, and rotation, at a given temperature and pressure. The integration also accounts for phase change and volume expansion [the heat capacity involved is Cp and thus allows for variable volume].
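In symbols, the integration looks something like the following for a substance that melts at Tfus and boils at Tvap on its way up to temperature T (a schematic layout on my part, assuming a single solid phase with no solid–solid transitions):

S(T) = ∫0→Tfus [Cp(solid)/T] dT + ΔHfus/Tfus
       + ∫Tfus→Tvap [Cp(liquid)/T] dT + ΔHvap/Tvap
       + ∫Tvap→T [Cp(gas)/T] dT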
One can thus think of entropy as the quantification of what I call the “structural energy” of a system. It’s the energy required to create the structure of moving atoms and molecules. But why is this concept connected with probability?
Boltzmann’s probabilistic reasoning explains why dS = δQ/T
The equality shown in [1] has always interested me. I searched for a physical explanation of why the equation itself actually works, but couldn’t find an answer until I read one of Boltzmann’s papers (here). His explanation made sense to me.
Two variables, location and velocity (momentum), play a critical role in Boltzmann’s mathematics. Let’s leave location aside for this discussion and focus on momentum, and more specifically, energy. Building on the illustration to the right, also taken from the larger illustration at the end of this post, Boltzmann proposed an infinite series of hypothetical buckets, each characterized by an infinitesimal range of energy, and then played the game of asking how many different ways the atoms could be placed into the buckets while satisfying the fixed total energy constraint (total energy = sum over buckets of the number of atoms in a bucket times that bucket’s energy). After much math, he discovered that by far the most commonly occurring placements or arrangements of the atoms aligned with the Maxwell-Boltzmann distribution. While all arrangements, no matter how improbable, were possible, only the most probable arrangement dominated. No other arrangement even came close. In the example at right, you can see how the Maxwell-Boltzmann distribution evolves with 7 balls (atoms). When you add 10^23 additional atoms plus a large number of buckets, the Maxwell-Boltzmann distribution locks in as the only answer. It clearly dominates based on pure probability.

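The 7-ball example can be checked by brute force. Here is a short enumeration (my own sketch of the counting exercise, not Boltzmann’s notation), assuming, as in Boltzmann’s 1877 worked example, 7 atoms sharing 7 indivisible units of energy. It lists every possible assignment and groups the outcomes by their occupation numbers, that is, how many atoms sit in the 0-unit bucket, the 1-unit bucket, and so on:

```python
from collections import Counter
from itertools import product

# Toy version of Boltzmann's counting game: 7 atoms share 7 units of energy, each
# atom holding 0..7 units. Every specific assignment of energies to the (labeled)
# atoms is one equally likely "way"; grouping the ways by occupation numbers shows
# one arrangement dominating, already leaning toward the exponential shape that
# becomes the Maxwell-Boltzmann distribution as the atom count climbs toward 10^23.
N_ATOMS, TOTAL_E = 7, 7

ways = Counter()
for assignment in product(range(TOTAL_E + 1), repeat=N_ATOMS):
    if sum(assignment) == TOTAL_E:
        occupation = tuple(assignment.count(e) for e in range(TOTAL_E + 1))
        ways[occupation] += 1

print(f"total ways = {sum(ways.values())}, distinct arrangements = {len(ways)}")
for occupation, w in ways.most_common(3):
    print(f"W = {w:4d}  atoms per energy bucket 0..7: {occupation}")
```

Of the 1,716 equally likely assignments, the single occupation pattern (3, 2, 1, 1, 0, 0, 0, 0) accounts for 420 of them, already sketching the falling-off shape that hardens into the Maxwell-Boltzmann distribution at realistic atom counts.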
Let’s turn our attention to the number of buckets involved in Boltzmann’s mathematics, and more specifically, consider how many of the buckets are actually accessible. At absolute zero, there’s only one accessible energy bucket and thus one way to arrange the atoms. [Note: The equation linking entropy to the number of arrangements (W) is the famed S = k ln(W). When W = 1, S = 0.] As temperature increases, the atoms start bubbling up into the higher energy buckets, meaning that the number of accessible buckets increases. And as temperature approaches infinity, so too does the number of accessible buckets. In this way, temperature defines the number of energy buckets accessible for the distribution of atoms.
Now consider that the value of δQ quantifies an incremental amount of energy added to the system (from thermal energy exchange), and then take this to the next step: δQ also quantifies an incremental number of accessible buckets. So δQ/T quantifies, in a way, a fractional increase in accessible buckets and thus the infinitesimal increase in entropy. In his publication, Boltzmann demonstrated this connection between his probability-based mathematics and [1], which had been derived from classical thermodynamics. This is rather fascinating, the connection between these two very different concepts of entropy. And perhaps this is also why entropy is confusing. Entropy is related to the amount of thermal energy entering a system, and it’s also related to the number of different ways that the system can be constructed. It’s not obvious how the two are connected. They are, as shown by Boltzmann. It’s just not obvious.
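For what it’s worth, here is a compressed version of the standard statistical bookkeeping behind that connection (my paraphrase, not Boltzmann’s own notation), for the case of fixed volume, where the bucket energies εi stay put and any added energy enters as heat:

ln W ≈ –Σ ni ln(ni/N)          (Stirling’s approximation; N atoms, ni atoms in bucket i)
d(ln W) = –Σ ln(ni/N) dni       (the atom total is fixed, so Σ dni = 0)
ni/N ∝ exp(–εi/kT)              (the most probable, Maxwell-Boltzmann arrangement)
d(ln W) = (1/kT) Σ εi dni = δQ/kT
dS = k d(ln W) = δQ/T

Temperature appears exactly where the accessible-buckets picture says it should: it sets how steeply the exponential cuts off the higher-energy buckets.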
But then why does dS exactly equal zero for adiabatic reversible work?
Based on the above discussion, I can see why dS equals δQ/T from a physics and statistical mechanics viewpoint. And I can further understand why this works regardless of the substance involved; temperature alone governs the number of energy buckets accessible for the distribution of atoms and molecules. The nature of the substance plays no role in [1].
But what doesn’t make sense to me is the other important feature of entropy that we learn in the university: entropy experiences no change (none!) during adiabatic reversible work. Consider that S = f(U,V). During reversible adiabatic expansion, such as occurs in a work-generating turbine, energy (U) decreases, which decreases entropy, while volume (V) increases, which increases entropy. This process is labeled isentropic, meaning that entropy remains constant, which means that the two changes in entropy exactly cancel each other. Not almost. Exactly. That’s what is meant by saying dS = 0 during reversible adiabatic expansion. I just don’t understand how the physics involved results in such an exact trade-off between energy and volume. Is there something fundamental that I’m missing here? Is there a clean proof that this must be so?
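To be clear, the classical bookkeeping does show the two terms cancelling explicitly. Writing S = f(U,V) and rearranging the fundamental equation dU = TdS – PdV from above (so this is only a rearrangement of the classical relations, not the statistical explanation I’m after):

dS = (1/T) dU + (P/T) dV
dU = –PdV        (adiabatic and reversible: the only energy change is the expansion work)
dS = (1/T)(–PdV) + (P/T) dV = 0

The entropy decrease from losing energy, –(P/T)dV, is the exact mirror image of the entropy increase from gaining volume, +(P/T)dV, because every bit of lost energy left as PdV work. But since dU = TdS – PdV was itself built on [1], this restates the classical argument rather than answering the deeper “why.”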
In my reading of Rudolf Clausius’s works, I never saw him state that reversible adiabatic expansion has no impact on entropy. His focus was primarily on the steps in Carnot’s ideal heat engine cycle involved in the transfer of heat into (Qin) and out of (Qout) the working substance. He sought to understand how heat is transformed into work (W), and it was through this search that he learned that Qin/Thot = Qout/Tcold and also that W = Qin – Qout. These discoveries led to his defining the maximum efficiency for the continuous transformation of heat into work [Wmax/Qin = (Thot – Tcold)/Thot] and also to his discovery of the new state function, entropy (S), and its main feature, dS = δQ/T. But in this, I did not find any discussion around the proof that dS = 0 for reversible adiabatic expansion. I did see discussion that because δQ is zero during reversible adiabatic expansion, dS also is zero. However, this, to me, is not proof. What if the changes in entropy for the two adiabatic volume changes in Carnot’s cycle are both non-zero but cancel each other?
This then is my question to you.
Have you ever read a physics-based explanation of why entropy experiences zero change during reversible adiabatic expansion? I welcome any information you have on this topic, hopefully in documented form. Thank you in advance.
My journey continues.
The figure below is from Block by Block – The Historical and Theoretical Foundations of Thermodynamics.

References
Sharp, Kim, and Franz Matschinsky. 2015. “Translation of Ludwig Boltzmann’s Paper ‘On the Relationship between the Second Fundamental Theorem of the Mechanical Theory of Heat and Probability Calculations Regarding the Conditions for Thermal Equilibrium’ Sitzungberichte Der Kaiserlichen Akademie Der Wissenschaften. Mathematisch-Naturwissen Classe. Abt. II, LXXVI 1877, Pp 373-435 (Wien. Ber. 1877, 76:373-435). Reprinted in Wiss. Abhandlungen, Vol. II, Reprint 42, p. 164-223, Barth, Leipzig, 1909.” Entropy 17 (4): 1971–2009.
Cercignani, Carlo. 2006. Ludwig Boltzmann: The Man Who Trusted Atoms. Oxford: Oxford Univ. Press.