Systematically extending classical nucleation theory

The foundation for any discussion of first-order phse transitions is Classical Nucleation Theory(CNT). CNT, developed in the first half of the twentieth century, is based on a number of heuristically plausible assumtptions and the majority of theoretical work on nucleation is devoted to refining or extending these ideas. Ideally, one would like to derive CNT from a more fundamental description of nucleation so that its extension, development and refinement could be developed systematically. In this paper, such a development is described based on a previously established (Lutsko, JCP 136:034509, 2012 ) connection between Classical Nucleation Theory and fluctuating hydrodynamics. Here, this connection is described without the need for artificial assumtions such as spherical symmetry. The results are illustrated by application to CNT with moving clusters (a long-standing problem in the literature) and the constructrion of CNT for ellipsoidal clusters.


I. INTRODUCTION
The process of the nucleation of first order phase transitions is of importance across the range of scientific disciplines from chemistry and physics to biology and materials science. From both the theoretical and the experimental perspectives, its most challenging feature is that it is an intrinsically multiscale problem. Small clusters of new phase forming in a background of mother phase are thermodynamically unstable if they are below the size of the critical cluster. They can only form and grow by a series of thermal fluctuations and the formation of a critical cluster is consequently a rare event. To observe nucleation under conditions of interest in many applications requires macroscopic volumes of material and times scale as long as hours or days even though the outcome -the critical cluster -is itself a microscopic object with a size of micrometers or even nanometers. The fact that the growing cluster, viewed as a subsystem of the total volume, is by definition not in an equilibrium state makes the problem even more challenging. As a result, nucleation remains an area of intense research by experimentalists, theorists and simulators alike.
Despite -or, perhaps, because of -this complexity, the primary theoretical description of nucleation has long been a collection of heuristic ideas known collectively as Classical Nucleation Theory (CNT) [1]. The basic idea of CNT is that clusters of new phase grow (or shrink) due to the attachment (or detachment) of single monomeric growth units from (or to) the mother phase. The rate of capture is calculated by treating the cluster as being a quasi-static object that acts as a sink for mass and energy and then calculating the rates of flow of mass and/or energy by whatever means is appropriate to a given problem -e.g. by means of hydrodyamics for nucleation in solution. To separate the rates of attachment and detachment of the monomers for a cluster of N growth units, a detailed balance condition is invoked [1] to demand that the ratio be proportional to exp (−β∆F (N, N + 1)) where β is the inverse temperature and ∆F (N, N + 1) = F (N + 1) − F (N ) is the free energy difference between clusters of size N and N + 1. The free energy is approximated by a capillary model: the cluster is assumed to be spherical with a sharp interface between its interior and the mother phase. The free energy is the sum of a bulk term scaling as the volume and a surface term scaling as the area of the cluster.
While CNT is the basis for a large part of the work on nucleation, it is clearly a crude approximation and, indeed, has internal inconsistencies. For example, in the capillary model for the free energy the density of the mother phase outside the cluster is assumed to be constant but in the transport calculation used to obtain the attachment rate, the density has a non-uniform profile [1]. This and many other reasons have inspired a lot of work aimed at improving CNT [2]. One target is the calculation of the free energy of the cluster: it is relatively easy to improve the capillary model by e.g. allowing the surface tension or the density inside the cluster to depend on the cluster size [3]. More generally, classical Density Functional Theory [4,5] provides very accurate, microscopic models that can be used both to directly calculate the properties and free energy of critical clusters as well as providing a basis for the derivation of more coarse-grained descriptions up to and including the capillary model [5,6]. In this sense, the problem of the free energy may be considered to be solved.
The dynamics of CNT are more problematic. Ideally, one would like, in analogy to the free energy, to have a more fundamental description from which CNT could be derived. This would then presumeably allow for the systematic improvement of the description in the same was as the DFT free energy serves as a basis for more coarse-grained models. Recently, such a synthesis has been proposed in which fluctuating hydrodynamics is used as a starting point [7,8]. The DFT free energy is introduced as a means of calculating the pressure thus addressing both the free energy calculation and the dynamics at the same time and the resulting theory termed Mesoscopic Nucleation Theory (MeNT). It has been shown that CNT can be derived from this starting point by introducing appropriate approximations and many other consequences of the theory have been developed. Most particularly, it has been shown to give a rich and quite non-classical description of nucleation, even for the simplest application of liquid-vapor nucleation [9].
One problem with the work to date on MeNT has been that it relies heavily on the assumption of spherical symmetry. As such, the main promise -of providing a basis for nontrivial extensions of CNT -has so far been unfulfilled. The goal of the present paper is to show how such extensions may be systematically investigated. As in previous work on MeNT, attention will be restricted to diffusion-limited nucleation in the over-damped limit which has the enormous advantage that the full hydrodynamic description (involving density, momentum and energy fields) reduces to a contracted descritpion formulated entirely in terms of the density. Furthermore, diffusion-limited nucleation is of considerable practical importance being applicable to the nucleation of macromolecules in solution and to colloidal systems. As in the previous development of MeNT, the key step for coarse graining will be the introduction of parameterized density profiles. Based on the experience with spherically-symmetric clusters, it is argued that the Fokker-Planck equation must be covariant and that this fixes its structure once a metric is specified for the space of coarse-graining parameters. The metric is, in turn, directly determined from the full over-damped fluctuating hydrodynamics starting point thus completing the theory. This theoretical development is the subject of Section II of this paper. The third Section details applications to three cases: first, the spherically symmetric results are re-derived using the new approach, second a version of CNT is developed that allows for displacement of the center of mass and finally the theory is developed for ellipsoidal clusters haveing three independent degrees of freedom. The paper concludes with perspectives for further developments.

II. THEORY
A. The dynamical model We take as the starting point the equation for the evolution of the local density, ρ t (r), as derived from fluctuating hydrodynamics in the over-damped limit and using the DFT expression for the pressure, where D is the coefficient for tracer diffusion for the colloids, F [ρ t ] is the so-called Helmholtz free energy functional coming from DFT and ξ t (r) is the (three-dimensional) white noise with correlation ξ a t (r) ξ b t ′ (r ′ ) = δ ab δ (t − t ′ ) δ (r − r ′ ). A discussion of the derivation of this equation from fluctuating hydrodynamics can be found in Ref. ([8]). We take the point of view that this equation as written is really a short-hand for a difference equation obtained by discretizing in space and time. Using a standard discretization scheme based on centered-differences, Eq.(1) turns out to be Ito-Stratonovich equivalent (ISE) [8]. The autocorrelation of the forces is an operator, where ∇ ′ is the Laplacian for r ′ and both Laplacians act on everything to their right. This can be simplified by considering its action on a test function, where we assume that all functions of interest are nonzero only within a volume V (the system volume) and we denote the surface of this volume as ∂V . Assuming that the surface term can be neglected, the autocorrelation is the same operator as acts on the free energy gradient in the original SDE thus demonstrating the existence of a fluctuationdissipation relation (FDR) which in turn immediately implies that in equilibrium, the probability to observe a given density configuration, ρ (r), is proportional to exp (−βF [ρ]) as one would expect.

B. Transition probabilities and the geometry of density space
The SDE describes the evolution of the density field ρ t (r) and in this language, nucleation consists of a transition from (the neighborhood of) an initial field ρ (i) (r) describing the mother phase to a (the neighborhood of) a final state ρ (f ) (r) describing the new phase. In the simplest case of liquid-vapor nucleation, the initial state would be a vapor for which the average density is a constant so ρ (i) (r) = ρ v where the vapor density ρ v is determined by the thermodynamic conditions. The final state is a liquid for which the average density is also constant so ρ (f ) (r) = ρ l with the liquid density ρ l again being determined by the thermodynamics. More complicated states, such as crystals, are of course also possible. One way to characterize the transition is by specifying the nucleation pathway which is a sequence of density fields starting with ρ (i) (r) and ending with ρ (f ) (r). The sequence can be parameterized by some continuous index as ρ λ (r) for, say, 0 ≤ λ ≤ 1 with ρ 0 (r) = ρ (i) (r) and ρ 1 (r) = ρ (f ) (r). Using generalizations of the Onsager-Machlup formalism [10], the probability to make the transition from the given initial state to the given final state can be formulated as a path integral over the probability to observe any given pathway with the latter being given by an expression of the form where explicit, exact expressions can be given for the Lagrangian functional L [11]; in the weak noise limit, (for which the amplitude of the noise is small compared to the deterministic term), the dominant contribution is This exact result allows one to ask for the most likely path (MLP) from the intial to the final state: namely, the path that maximizes the transition probability which can be formulated as an Euler-Lagrange equation.
In the weak noise approximation, it is straightforward to demonstrate that the FDR implies that the MLP passes through the critical cluster, defined as the saddle point state ρ (c) (r) for which and that the energy barrier for nucleation is precisely ∆F = F ρ (c) −F ρ (i) [8]. Thus, it would seem that the MLP is a good candidate for a mathematically precise characterization of the "nucleation pathway". In general, determining the MLP requires solving the Euler-Lagrange equation which is second order in time, however in the case of barrier crossing, there is a simpler alternative: the MLP can be constructed by solving the first order equation starting at the critical state, ρ (c) (r), and perturbing infinitesimally in the direction of the unstable (generalized) eigenvalue of the Hessian giving two paths: one leading back to the initial state and one leading to the final state. The union of these two paths is the MLP in the weak-noise approximation [8].
An important point in the present context is that the path is a geometric object -time does not enter into the determination of the MLP. The construction just described can be understood as gradient decent on the potential energy surface F [ρ] in density space with a metric giving the distance between two infinitesimally close densities, ρ (r) and ρ (r) + dρ (r), as and the length of a path in density space is then The operator (∇ · ρ (r) ∇) −1 is to be interpreted in the obvious way: for example the quantity φ (r) = (∇ · ρ (r) ∇) −1 dρ (r) is determined by solving ∇ · ρ (r) ∇φ (r) = dρ (r) (10) so that, e.g., or, replaceing the factor of dρ (r), where the second term on the right is a surface integral evaluated on the boundaries enclosing the system. Since ds 2 must be non-negative and since the first term on the right is obviously nonegative, we can ensure non-negativity a sufficient condition is that either φ (r) = 0 or ∇φ (r) = 0 on the surface. In fact, the inverse of a differential operator only has meaning if the corresponding boundary conditions are supplied: otherwise, there is no unique potential φ (r) making te problem ill-defined. The boundary conditions follow from the physics of the original SDE, in the present case Eq.(1). This was derived using fluctuating hydrodynamics and, in the case of a finite system, the most natural requirement is that the total mass of the system be conserved. This means that the deviations in the density, dρ (r), must conserve the total mass, i.e. that Integrating Eq.(10) then gives the no-flux condition Further development depends on the details of the physical problem. In the case of hard walls, we would restrict attention to the class of functions satisfying the no-flux condition locally, ∇φ (r) · dS = 0 for all points on ∂V . This automatically ensures that ds 2 is non-negative. It also completes the argument for the fluctuation-dissipation relation by eliminating the last term in Eq.(3). For periodic boundaries, anything leaving via one wall re-enters via another so that the no-flux condition is global but the periodicity itself provides the boundary condition. Here, I will always consider hard walls, conserved mass and the space of functions satisfying the local no-flux condition. Finally, we remark that the geometric interpretation given here is not restricted to the weak-noise regime. In fact, using the language of differential geometry, the strong-noise Lagrangian is fully covariant and the MLP determined by it is independent of any choice of parameterization of density space as discussed in classic papers by Graham [11].

C. Order Parameters
When determining the MLP, we must minize the transition probability with respect to the density pathway. This can be done in an exact sense as described above but we could also imagine a simpler, more restricted procedure whereby we represent the density field by some parameterized form, where f is some fixed functional form that depends on a set of N order parameters, x α , for α = 1, ..., N . The distance between such a density distribution and one with slightly different parameters, ρ (r; x + dx), follows from the general expression where the metric in parameter space is Defining φ α (r; x) as this can also be written as The first term on the right displays the expected symmetry of the metric with respect to the parameters: the vanishing of the surface term follows from the no-flux condition. Given the parameterization, the best approximation to the MLP would come from minimizing the Lagrangian evaluated for such paths, namely (in the weak-noise regime) which, upon expanding and using the definitions above and the functional chain rule, can be written as It is not possible to further simplify without more information. One case amenable to analysis is that of a complete parameterization. An example would be to represent the density as an expansion in a complete set of basis functions, v i (r) , as Another would be, in the discretized version of the problem If the set of parameters is complete, then there would have to be a completeness relations of the form ∂x β dr = δ α β and using this, one easily shows (see Appendix A ) that the weak-noise Lagrangian for the MLP with parmaterized paths is which has the same structure as the original continuum case, Eq.(5).
D. Coarse-grained dynamics: the dynamics of parameterized paths Given that there is an induced metric in the space of order-parameters, it is natural to ask if this can be viewed as arising directly from a stochastic description so that one could speak of an order-parameter dynamics. If so, and given the natural physical requirement of the existence of an equilibrium-like state (at least for certain circumstances), one expects that any such description would reduce in the weak-noise limit to where the equilibrium state is ensured by an FDE, g αβ = a q α a q β a .Note that the indices for the noise (written using latin letters) are different in nature than the ones for the order-parameters (greek letters): in fact, there is no requirement that the number of noise terms be the same as the number of order-parameters. In general, the existence of the FDE only demands that the number of noise terms be sufficient that the matrix q has enough degrees of freedom to satisfy the FDE. Thus the noise-indices are simply indexes do not refer to geometric coordinates. Such a weak-noise dynamics is also consistent with the parameterized Lagrangian derived above.
The question we pose here is whether any further justification can be given for this equation and if, so, can we say anything about the strong-noise regime? There is probably no unique answer except in the special case of a complete parameterization. In this case, a straightforward derivation, given in Appendix B results in the Ito SDE where q α a (x t ) q β a (x t ) = g αβ (x t ) and the new contribution to the deterministic driving force is while the corresponding Fokker-Plank equation is The new contribution to the driving force, A α (x t ), vanishes in the case of a single variable and it is tempting to neglect it in general. For example, without it, the equilibrium distribution is easily seen to be P eq (x) = N (det g (x t )) 1/2 exp (−βF x) (where N is a normalization constant) whereas with it, we cannot explicitly determine the equilibrium distribution. Fortunately, in the following, we will mostly be concerned with the weak noise limit, for which analytic results are possible, and in which this term does not appear and these equations reduce to the simple SDE proposed intuitively in Eq.(25).

E. Spherically Symmetric systems
If all quantities are spherically symmetric and if the system volume is a sphere of radius R V , then it is straitforward to show that the inverse operator needed for the metric and consistent with the boundary conditions is where the integration constant A is arbitrary. It is then easy to evaluated the metric with the result where the cumulative mass up to radius r is Integration by parts gives The anomolous flux is calculated using These results reproduce those previously derived using spherical symmetry from the beginning [7,8].

F. Nucleation rates
The most important question from a practical point of view is the nucleation rate. This is related to the mean first passage time for barrier crossing. For the one dimensional case, there is an exact expression for this quantity, however, in the general case no such result exists. The standard result [12][13][14][15] valid in the weak noise limit is, in our language, where F (c) αβ is the Hessian of the free energy evaluated at the critical cluster x c , λ − the (sole) negative eigenvalue of g αβ (x c )F (c) βγ and N is the number of order parameters. The critical cluster is determined as usual by ∂F (x)/∂x α | xc = 0. The integral on the right is a measure of the occupation of the metastable basin and the domain of integration is restricted to this region.

III. CLASSICAL NUCLEATION THEORY AND GENERALIZATIONS
A. Generalized Classical Nucleation Theory

The density in CNT
We take "CNT" to be the following elements: a sharp interface between the cluster and bath, uniform density inside and outside the cluster and the capillary model for the free energy. Mathematically, the sharp interface means that there is an indicator function, χ (r), which is equal to one inside the cluster and zero outside so that, with the second element of constant densities, we have that χ (r; x)) . (36)

The CNT metric
The equation to be solved for the potential is so that we postulate a solution of the form Now, in general, because of its discontinuous nature, the Laplacian of the indicator function will be proportional to a Dirac delta function. In fact, if the surface is described by an equation of the form ψ (r; x) = 0 and the interior of the cluster by ψ (r; x) > 0 then and ∇χ (r; x) = (∇ψ (r; x)) δ (ψ (r; x)) .
As a trivial example, for a sphere of radius R we have ψ (r; x) = R−r. Hence, substituting the ansatz for the potential into the Poisson equation and equating coefficients of the delta function and its derivatives gives with the boundary conditions on the surface of the cluster and the global no-flux boundary condition.

The CNT free energy and critical cluster
Finally, the capillary model generalizes to where S (x) and V (x) are the surface and volume of the cluster respectively, γ is the surface tension between the two phases and f (ρ) is the Helmholtz free energy of the homogeneous bulk system. In general, the initial, final and critical states will satisfy Since we are restricting attention to profiles that conserve mass, there is a constraint and

B. Classical Nucleation Theory
We recover CNT by (a) demanding spherical symmetry; (b) taking the only parameter to be the radius of the cluster, R; (c) assuming the surface tension is independent of the radius; and (d) assuming the cluster radius is small compared to the system size. With these approximations, all quantities depend only on the cluster radius, R, and if the system is confined to a spherical volume with total radius R V , then the exterior density is where M is the total mass and ρ = M/V (R V ) is the average density. We calculate the metric directly from the closed expression, Eq.(32) and find that The interior density, ρ 0 , is, in CNT, taken to be the bulk equilibrium density and therefore independent of the cluster radius. The stationarity condition, Eq.(47) then determines the critical radius and the corresponding barrier is Note that the ambient gas plays the role of a resevoir fixing the chemical potential at µ = f ′ (ρ). For arbitrary radii, the excess free energy can be written in the compact form We can do something similar for the undersaturated fluid and in general, if we define where the lower (plus) sign is for the under-saturated solution and the upper (minus) sign for the super-saturated solution. In the case of the undersaturated solution, the stationary distribution is then where N is a normalization factor. This is the CNT expression for the distribution of clusters in the undersaturated solution.
Finally, the (weak-noise limit of the ) Fokker-Planck equation is This describes P t (R), the probability for a given cluster to have radius R, but it can easily be related to C(N ), the concentration of clusters containing N moleculeso [16], which then satisfies a similar equation (again, keeping only terms appropriate to the weak-noise limit), For a weak solution, ρ 0 ≫ ρ, one then has that In CNT, the same result is derived for the case of diffusion-limited homogeneous nucleation in which case Eq.(57) is recognized as the Zeldovich equation with the "attachment rate" f N = Dg −1 N N (N ) for which the expression given here agrees with that derived in CNT (see e.g. Ref. [1], Eq. 10.18) except that the latter includes a heuristic coefficient (the "sticking probability") inserted by hand. Thus, the theory recovers the well-known results of CNT in this limit.

C. Generalization: CNT with moving clusters
The CNT model can be generalized by allowing the clusters to move. In this case, we begin with so that the parameters are the radius, R, and the location of the center of the cluster, ∆ . Solving the Poisson equation and neglecting finite size terms (e.g. of order R/R T and ∆/R T ), one finds and all off diagonal terms vanish. Since the free energy is independent of ∆ , the center of mass just undergoes brownian motion with diffusion constant D 3

D. Generalization: Ellipsoidal Clusters
The surface of an ellipsoid with axes aligned along the Cartesian directions is specified by three parameters, a 1 , a 2 and a 3 as It includes the oblate spheroid (a 1 = a 2 > a 3 ), the prolate spheroid (a 1 > a 2 = a 3 ) and the sphere (a 1 = a 2 = a 3 ) as obvious special cases. Somewhat more convenient parameters are the average radius and the eccentricities defined (assuming a 1 > a 2 , a 3 ) respectively as R = (a 1 a 2 a 3 ) 1/3 (63) so that the radius is the overall measure of size of the ellipsoid and the eccentricities are measures of the shape. Indeed, the volume and surface area are where E (φ, k) and F (φ, k) are incomplete elliptic integrals of the first and second kind. For small eccentricities, the area can be expanded to get The capillary model for the cluster is and for the free energy, The only other element needed is the metric. This is obtained by a straightforward but somewhat involved calculation. The details and exact results are given in the SI. Here, I will only give the result for two limits. First, in the weak solution appoximation (ρ v /ρ l ≪ 1), the leading order contributions to the metric are and all other elements are zero. The determinant is To lowest order in the eccentricities, the equilibrium distribution will be The critical cluster occurs at and with the critical radius and energy barrier calculated for a spherical cluster. The excess free energy can be written as Thus fluctuations in the eccecentricites are strongly damped suggesting an alternative expansion wherein the eccentricities are treated as small parameters and the densities are unconstrained. The lowest order results in this case are and, The marginal distribution for the radius is which differs from that one would get by freezing out the eccentricities, The unstable eigenvalue of gH is

IV. CONCLUSIONS
Classical nucleation theory has long provided not only a mathematical model for nucleation allowing one to estimate nucleation rates and other physically interesting quantities, but also the language used to discuss nucleation: concepts such as the critical nucleus, the competition between surface tension and bulk free energy differences, etc. However, it is also severly limited in applicability due to the underlying assumptions of spherical clusters that are large compared to the growth units, slow growth and many others. As attention is increasingly drawn to problems that violate those assumptions -i.e. nanoscale processes, multistep nucleation -the obvious alternative is to turn to a more fundamental description such as kinetic theory or fluctuating hydrodynamics but the price paid is to lose contact with the familiar phenomenology and language. The goal of this paper has been to develop a bridge between these two levels of description that allows for the development of post-CNT models that are nevertheless grounded in the more microscopic approaches.
The structure of such a model, stochastic models with a deterministic driving force based on free energy gradients and a fluctuating force with amplitude determined by a fluctuation-dissipation relation -could be guessed and is enough for the weak-noise limit. The requirement of covariance supplies additional information needed to generalize beyond weak noise. What is missing, and what cannot be guessed, are the kinitic coefficients -what has been called here, the metric -which govern the kinetics of the process. In the domain of CNT, the kinetics are usually thought to be of secondary importance since the exponential dependence of the nucleation rate on the free energy barrier dominates practical calculations. However, outside this domain the barriers become smaller and kinetics becomes more important. The main contribution of this work has been to clarify how the metric should be determined based on the parameterization of the density and how to then construct a self-consistent model.
The general framework has been illustrated first by recovering previous results for spherically-symmetric systems, which include CNT when reduced to a single order parameter. Then, two novel generalizations were developedone for moving clusters and the second for ellipsoidal clusters. The latter in particular could be used as a basis for incorporating the effects of shear on nucleation. In any case, the goal here was not to delve too much into applications but, rather, to illustrate how such models can be developed with minimal heuristic input.
So, K αβ = g αβ as claimed. Inserting this result into Eq.(23) gives for the weak-noise Lagrangian of the paramaterized paths

Appendix B: Deriving the SDE
The SDE for the density is and we use the Stratanovich interpretation (recall that this model is actually Ito-Statonovich equivalent). If the density can be represented as Multiplying through by the appropriate operator gives Integrating over the spatial coordinates yields where we have identified the metric in the space of parameters, and the parameterized free energy and the noise amplitude is Writing this as we note that the equivalent Fokker-Planck equation is (with an implied integration over the spatial coordinate). This can be written in anti-Ito form as Note that in this case, the former can be written as showing that if A α is neglected, then a stationary solution is The Fokker-Planck equation is equivalent to the anti-Ito sde where as usual q α a (x t ) q β a (x t ) = g αβ (x t ) .The Ito Fokker-Planck equation and sde are, respectively, and