
Chapter 5 – Entropy and the Second Law of Thermodynamics

In the previous chapters, we developed a picture of matter built from atoms that move, interact, and exchange energy. From this microscopic behavior emerged the First Law of Thermodynamics, which tells us that energy is conserved, along with the mass and energy balances that allow us to track that conservation in real systems.

But conservation alone does not determine what actually happens.

Many processes that are allowed under the First Law are, for all practical purposes, never observed. For example, a system of particles in a given volume could, in principle, rearrange itself so that all particles occupy one-half of the volume. The First Law allows it. Nothing is violated.

But it doesn’t happen.

Not because it is impossible—but because it is overwhelmingly improbable.

This raises a fundamental question:

Why do systems consistently evolve toward some states and not others?

This chapter answers that question by introducing entropy and the Second Law of Thermodynamics, which together describe how a vast number of colliding atoms naturally organize into highly predictable macroscopic structures characteristic of thermodynamic equilibrium.

Picture yourself as an atom. What do you see?

Put a single atom into an isolated box and, in the ideal classical world, it moves forever—bouncing off the walls like a billiard ball. Its energy remains constant, and since that energy is purely kinetic, its speed does not change—only its direction.

Add a second atom. Now collisions occur, but nothing fundamental changes. Energy is conserved, and in principle, we can still predict the motion of both atoms indefinitely.

Add a third, then a trillion. The same laws apply. Every collision still follows Newton. There is no new physics—only an overwhelming increase in complexity. What we could calculate for a few atoms becomes effectively impossible for many.

And yet, something remarkable happens.

Instead of chaos, we observe structure. The atoms spread uniformly throughout the volume, and their velocities settle into a smooth, predictable distribution. The system reaches equilibrium: stable at the macroscopic level, even as microscopic motion continues without end.

Where does this structure come from?

Not from a new force. It comes from probability acting through collisions.

When atoms collide, energy is conserved: if one gains, the other loses. Moreover, it is typically the slower atom that gains and the faster atom that loses, meaning that fast atoms typically don’t get faster and slow atoms typically don’t get slower. Statistically, the faster an atom moves relative to the average, the more of its neighbors are slower than it is, and the more likely its next collision will pull it back toward the average. The same holds in reverse: a slower-than-average atom is more likely to encounter faster atoms and be pulled back up toward the average. In the world of statistics, this “pull” is termed “regression toward the mean,” and it grows stronger the further an atom’s velocity strays from the mean. This statistical regression toward the mean smooths out the distribution, producing the familiar bell-shaped curve for velocity [1].
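
This mechanism is easy to watch in a toy simulation. The sketch below is my own illustration rather than anything from the text: it assumes Python with NumPy, equal-mass particles in two dimensions, and made-up parameters. Each “collision” picks a random pair and rotates their relative velocity by a random angle, which conserves momentum and kinetic energy exactly, just as an elastic collision does. Even starting from the highly artificial condition that every particle has the same speed, the x-component of velocity relaxes toward a bell-shaped distribution centered on zero.

    import numpy as np

    rng = np.random.default_rng(0)

    # N equal-mass particles in 2D, all starting with the same speed but
    # random directions, deliberately far from a Gaussian distribution.
    N = 10_000
    speed0 = 1.0
    angles = rng.uniform(0.0, 2.0 * np.pi, N)
    v = speed0 * np.column_stack((np.cos(angles), np.sin(angles)))  # shape (N, 2)

    # Collide random pairs: in the center-of-mass frame, rotate the relative
    # velocity by a random angle. Momentum and kinetic energy of the pair are
    # both conserved, exactly as in an elastic collision.
    for _ in range(20 * N):
        i, j = rng.choice(N, size=2, replace=False)
        v_cm = 0.5 * (v[i] + v[j])
        v_rel = v[i] - v[j]
        theta = rng.uniform(0.0, 2.0 * np.pi)
        c, s = np.cos(theta), np.sin(theta)
        v_rel = np.array([c * v_rel[0] - s * v_rel[1],
                          s * v_rel[0] + c * v_rel[1]])
        v[i] = v_cm + 0.5 * v_rel
        v[j] = v_cm - 0.5 * v_rel

    # The x-component of velocity now follows a smooth bell-shaped curve
    # centered on zero, even though no individual collision "aimed" for it.
    vx = v[:, 0]
    print("mean of vx:", vx.mean())   # close to 0
    print("std  of vx:", vx.std())    # close to speed0 / sqrt(2)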

Furthermore, atoms move freely through the available space. Given enough time, they explore the entire volume without preference, leading to a uniform spatial distribution. You see this happen when a drop of dye is put into a glass of water; it slowly disperses throughout the entire glass over time, not because there’s any driving force, but rather because of probability.

Out of all the possible ways to arrange a trillion atoms, the overwhelming majority correspond to these smooth, uniform distributions. Highly uneven arrangements—such as all atoms crowding into one half of the box, or a few atoms carrying all the energy—are not impossible, but they are so improbable that they are never observed.
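
To put a rough number on “overwhelmingly improbable” (a back-of-the-envelope estimate, not a figure from the text): the chance that N independently wandering atoms are all found in the left half of the box at a given instant is (1/2)^N. For a mere 1,000 atoms that is already about 10^-301; for a trillion atoms the exponent of the probability itself runs to twelve digits. “Never observed” is, for all practical purposes, exact.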

What appears as order emerging from chaos is simply the statistical consequence of an enormous number of collisions, each governed by the same deterministic laws.

Probability and the Second Law

The Second Law of Thermodynamics really comes down to a surprisingly simple statement:

Nature moves toward its most probable state.

There is no external force driving this behavior. The atoms continue to move and collide exactly as before. What changes is how we interpret the outcome of a vast number of such interactions.

Consider a system defined by U, V, and N. These constraints fix the total energy, the available volume, and the number of atoms. Within those constraints, there are an astronomically large number of ways to arrange the atoms in position and velocity.

Each specific arrangement is called a microstate.

Now group together all microstates that produce the same overall distribution of atoms—same spread in space, same distribution of velocities. Each such grouping is a macrostate. Some macrostates correspond to highly uneven distributions—atoms crowded into one region or energy concentrated in a few particles. Others correspond to smooth, uniform distributions.

Each macrostate contains many microstates—but not equally many.

The mathematics behind counting them is complex, but the result is simple and profound: one macrostate overwhelmingly dominates all the others. It contains almost all of the possible microstates. It is the most probable macrostate and represents the most probable distribution. And that macrostate corresponds to what we observe:

Atoms spread uniformly throughout the volume and follow a smooth, Gaussian distribution of velocity [1].
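
That dominance can be checked directly in a toy model. The sketch below is an illustration of mine (plain Python, standard library only); it uses the simplest possible stand-in for position, treating each of 1,000 atoms as being in either the left or the right half of the box. The macrostate “k atoms on the left” contains C(N, k) microstates, and arrangements within a few percent of an even split turn out to account for essentially all of them.

    from math import comb

    N = 1_000                # toy system: each atom is in the left or right half
    total = 2 ** N           # total number of equally likely left/right microstates

    # Fraction of all microstates lying within 5% of an even split.
    near_even = sum(comb(N, k) for k in range(int(0.45 * N), int(0.55 * N) + 1))
    print(f"fraction with 45-55% of atoms on the left: {near_even / total:.4f}")

    # Compare with the single most lopsided macrostate: all atoms on the left.
    print(f"fraction with every atom on the left:      {1 / total:.3e}")

With a trillion atoms instead of a thousand, the same counting concentrates essentially every microstate into an even narrower band around the uniform distribution, since relative fluctuations shrink as 1/√N.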

Entropy as a property of the system

And now, finally, we introduce the connection between these statistical results and entropy. It turns out that the number of microstates (W) in the most-probable macrostate quantifies the entropy of the system by the following famous equation:

S = kB ln W

where kB is Boltzmann’s constant.

This equation provides the bridge between the microscopic and macroscopic worlds. It tells us that entropy is not an abstract or arbitrary construct—it is directly tied to the number of ways the atoms in a system can be arranged while satisfying the constraints imposed on that system.

Once U, V, and N are specified, the number of accessible microstates W is determined, and so is the entropy.

In this sense, entropy is a state property, just like internal energy, volume, and pressure.
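
A quick worked example of this equation (a standard result, not worked in the text above): suppose a gas initially confined to one half of a box is allowed, at fixed U and N, to spread through the whole box. Each atom now has twice as many positions available, so W increases by a factor of 2^N and

ΔS = kB ln(Wfinal / Winitial) = kB ln(2^N) = N kB ln 2

For one mole of atoms this is R ln 2, about 5.8 J/K. A macroscopic, measurable quantity falls directly out of counting arrangements.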

The Direction of Natural Processes

A system naturally evolves toward the most probable macrostate, i.e., the one with the greatest number of accessible microstates. When the system reaches this state and occupies the macrostate with the largest W, it is said to be equilibrated, and entropy becomes a valid property of that system. Until then, assigning an entropy to it is not valid. Unlike U, V, and N, which apply to both non-equilibrated and equilibrated systems, entropy applies to equilibrated systems only.

Importantly, nothing in this evolutionary process violates the deterministic laws governing individual atoms. Every collision still follows Newton’s laws. Energy is still conserved. The directionality we observe at the macroscopic level emerges not from a new force, but from the overwhelming statistical dominance of certain arrangements over others.

Entropy and Heat

The statistical definition of entropy provides physical meaning. But classical thermodynamics, as a practical science, requires something more: a way to calculate entropy changes from measurable quantities. That connection was established by Clausius.

Through his analysis of heat engines, Clausius showed that for a reversible process,

dS = δQrev / T

and from this discovered a new property of matter – entropy, S. The concept of “reversibility” will be discussed later.

This result is remarkable. For any system—regardless of its composition!—the change in entropy is determined by the reversible heat transferred into the system divided by the temperature at which the transfer occurs.
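
A concrete illustration (the numbers are standard handbook values rather than anything derived here): melting ice slowly at atmospheric pressure is very nearly reversible and occurs at 273.15 K, absorbing about 6,010 J of heat per mole. The entropy change of the ice is therefore

ΔS = Qrev / T ≈ 6,010 J/mol ÷ 273.15 K ≈ 22 J/(mol·K)

obtained without any reference to what the individual water molecules are doing.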

This equation does not replace the statistical definition of entropy; it emerges from it. Boltzmann later showed that temperature governs the total number of energy states accessible to the atoms, while δQrev produces an incremental increase in that number. Dividing δQrev by T therefore compares the increase in accessible energy states caused by the added thermal energy with the number of accessible energy states already present.

Clausius’ equation is thus the macroscopic expression of Boltzmann’s microscopic reality.

Absolute Entropy and the Third Law

While energy (U) is not an absolute property and is always used in the context of change (dU), entropy is different. Clausius’ relation above defined entropy in terms of change, but a later discovery, based on Nernst’s heat theorem, provided the means to assign entropy an absolute value.

This discovery, known as the Third Law of Thermodynamics, states that the entropy of a perfect crystal of a pure substance approaches zero as temperature approaches absolute zero. A value of zero for entropy means that there is only one way to put the system together, i.e., only one microstate, which is consistent with all motion ceasing at absolute zero. With this reference, absolute entropy can be determined by integrating heat capacity data:

S(T) = ∫₀ᵀ (Cp / T) dT + (phase transition contributions)

In practice, entropy is not measured directly. It is inferred from calorimetric measurements—heat capacities and phase changes—combined with this absolute reference state.
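
As a sketch of how that inference works in practice (illustrative only: the heat-capacity values below are invented for a hypothetical solid, and a real determination would also extrapolate below the lowest measured temperature and add a term ΔHtransition / Ttransition for each phase change), the absolute entropy is simply the numerical area under Cp/T:

    import numpy as np

    # Invented calorimetric data for a hypothetical solid: T in K, Cp in J/(mol*K).
    T = np.array([10.0, 25.0, 50.0, 100.0, 150.0, 200.0, 250.0, 298.15])
    Cp = np.array([0.4, 3.1, 12.0, 24.5, 30.2, 33.8, 36.1, 37.5])

    # Third-Law reference: S = 0 at 0 K for a perfect crystal, so the absolute
    # entropy at 298.15 K is the area under Cp/T from 0 K up (the small 0-10 K
    # tail is neglected in this sketch). Trapezoidal integration:
    integrand = Cp / T
    S_abs = float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(T)))

    print(f"S(298.15 K) ≈ {S_abs:.1f} J/(mol*K)")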

While this approach may seem far removed from the statistical definition, it is entirely consistent with it. As thermal energy is added to the system, temperature increases, the number of accessible energy states increases, and the number of microscopic arrangements that are compatible with that number of energy states also increases.

Entropy and the interaction of systems

Consider two systems, each initially at equilibrium. Bring them into contact.

Each has its own values of U, V, N, and S. The moment interaction is allowed (through, say, a flexible membrane, a conducting wall, or a porous barrier), the combined system is no longer in equilibrium and immediately begins evolving toward a new equilibrium state. While the combined-system values of U, V, and N are simply the sums of the values for the two systems, the combined-system value of S is not.

At this new equilibrium, viewed from a statistical perspective, the combined system has access to a larger set of microstates (W) than the two systems had separately, and so the combined-system entropy (S = kB ln W) is greater than the sum of the entropies of the two initial systems:

Scombined ≥ ΣSparts

This is a direct consequence of the statistical definition of entropy.
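
A minimal numerical check of this inequality (my own example, assuming two identical blocks of constant heat capacity that exchange heat only):

    from math import log

    C = 10.0            # J/K, assumed heat capacity of each block
    T1, T2 = 300.0, 400.0
    Tf = (T1 + T2) / 2  # final common temperature, from energy conservation

    # Entropy change of each block, from integrating dS = C dT / T.
    dS_cold = C * log(Tf / T1)   # the cooler block warms: positive
    dS_hot = C * log(Tf / T2)    # the hotter block cools: negative, but smaller in size
    print(f"total entropy change = {dS_cold + dS_hot:+.3f} J/K")  # about +0.21 J/K

The total comes out positive for any T1 ≠ T2: the equilibrated combined system always has more entropy than the two separate systems did.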

Energy gradients do not exist in the most probable state

Note, in the above example of two different systems coming into contact, that the initial bimodal distribution of energy (in temperature, pressure, or chemical potential) is built from perfectly viable microstates. There is no constraint forbidding it; every one of those microstates meets the U, V, N requirements.

However, statistical mathematics shows that the most probable state is the one with no energy gradients. As you add any kind of energy gradient, such as one that would result in a bimodal distribution, the number of viable microstates decreases significantly. Gradients do not exist in the most probable state.

The presence of any type of energy gradient in a system means that equilibrium has not been achieved.

Entropy and the Direction of Change

As energy gradients dissipate, the system moves toward a more probable state, one defined by entropy. This is Clausius’ statement of the Second Law:

The entropy of an isolated system increases to a maximum.

We can now reinterpret familiar observations: heat flows from hot to cold, never the reverse; a gas expands to fill whatever volume it is given; a drop of dye disperses throughout a glass of water and never re-concentrates.

Each reflects the same principle:

As energy gradients dissipate, systems evolve from less probable states to more probable ones.

Gradients in temperature, pressure, and chemical potential represent constraints on how energy and matter are distributed. When those constraints are removed, the system moves toward a more uniform—and therefore more probable—distribution.

This is the physical content of the Second Law.

The Role of Entropy in Thermodynamics

With entropy now defined and interpreted, we can return to the First Law and complete the framework of classical thermodynamics.

From earlier, we have:

dU = δQ – δW

For reversible processes involving simple compressible systems, the work term can be written as

δW = PdV

and heat transfer can be expressed in terms of entropy as

δQ = TdS

Substituting, we obtain:

dU = TdS – PdV

This equation expresses changes in internal energy entirely in terms of state properties and quantifies the First Law of Thermodynamics based on energy and its conservation. In an isolated system, for which both heat and work are zero, energy does not change (dU = 0).

This is a critical step.

Heat and work quantify changes in energy by two different means, but neither is a property of the system itself; in this equation, the changes they cause have been re-expressed entirely in terms of state properties. This transformation is what enables the systematic development of thermodynamics, including the analysis of phase equilibria and chemical reactions.
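
As a short application (using the ideal-gas relations dU = Cv dT and PV = nRT, which are not derived in this chapter), dividing dU = TdS – PdV through by T gives

dS = Cv dT / T + nR dV / V

so that ΔS = Cv ln(T2/T1) + nR ln(V2/V1). For an isothermal doubling of the volume this yields ΔS = nR ln 2, the same answer the microstate count gave earlier: the macroscopic and statistical routes to entropy agree.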

A Brief Historical Perspective

Historically, these ideas did not emerge in this order.

Clausius introduced entropy through the study of heat engines long before Boltzmann’s statistical interpretation revealed the microscopic meaning of the concept.

This historical sequence—macroscopic definition first, microscopic explanation later—has influenced how thermodynamics is traditionally taught. Students often encounter entropy as a formal construct before understanding its physical basis.

In this book, we reverse that order.

We begin with atoms, motion, and energy, and use those foundations to build toward entropy as a natural consequence of the behavior of large systems.

Why the Second Law was needed

The First Law does not prevent heat from flowing from cold to hot; it does not prevent two bodies’ temperatures from diverging upon contact. The Second Law was needed to show how overwhelmingly improbable such behavior is, and how overwhelmingly probable the opposite is: upon contact, temperatures move toward each other.
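
A quick check using the Clausius relation from earlier: if a small quantity of heat Q (small enough that the temperatures barely change) leaves a body at Thot and enters one at Tcold, the total entropy change is

ΔS = Q / Tcold – Q / Thot

which is positive whenever Thot is greater than Tcold. Running the flow the other way would make ΔS negative, i.e., it would carry the isolated pair toward a less probable macrostate, which is why it is never observed.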

Summary

We can now summarize the key ideas of this chapter:

- Systems evolve toward their most probable macrostate, the one containing the overwhelming majority of microstates consistent with U, V, and N.
- Entropy quantifies that count: S = kB ln W, making entropy a state property of equilibrated systems.
- For reversible processes, entropy changes can be calculated from measurable quantities: dS = δQrev / T.
- The Third Law provides an absolute reference point: the entropy of a perfect crystal approaches zero as temperature approaches absolute zero.
- Combining the First and Second Laws gives dU = TdS – PdV, expressing energy changes entirely in terms of state properties.

Together, these laws form the foundation of thermodynamics.


Looking Ahead

We have established that systems evolve toward the most probable distribution of energy and matter. The next step is to understand the structure of that distribution in more detail.

In particular, we will examine how energy is distributed among the available states of a system and how this leads to the Boltzmann distribution, which provides a quantitative description of equilibrium at the microscopic level.

Notes

[1] Technically, the Gaussian distribution describes the population distribution of one component of velocity, such as vx, as opposed to speed, which is √(vx² + vy² + vz²), or energy. Because these properties are all tied together, you’ll often see population distributions based on one or the other or both. When I speak of a Gaussian velocity distribution, I am referring to vx, which has both positive and negative values centered around zero for a non-moving system.