On the derivation of a Nonlinear Generalized Langevin Equation

We recast the Zwanzig’s derivation of a nonlinear generalized Langevin equation (GLE) for a heavy particle interacting with a heat bath in a more general framework. We show that it is necessary to readjust the Zwanzig’s definitions of the kernel matrix and noise vector in the GLE in order to recover the correct definition of fluctuation-dissipation theorem and to be able performing consistently the continuum limit. As shown by Zwanzig, the nonlinear feature of the resulting GLE is due to the nonlinear dependence of the equilibrium map by the heavy particle variables. Such an equilibrium map represents the global equilibrium configuration of the heat bath particles for a fixed (instantaneous) configuration of the system. Following the same derivation of the GLE, we show that a deeper investigation of the equilibrium map, considered in the Zwanzig’s Hamiltonian, is necessary. Moreover, we discuss how to get an equilibrium map given a general interaction potential. Finally, we provide a renormalization procedure which allows to divide the dependence of the equilibrium map by coupling coefficient from the dependence by the system variables yielding a more rigorous mathematical structure of the nonlinear GLE.


State-of-art
The generalized Langevin equation (GLE) is an important mathematical tool for investigating a wide class of both quantum and classical diffusive phenomena [1] and it is an extension of the Langevin equation [2]. The major advantage of GLE-based approaches is the reduction of the number of degrees of freedom which describe the system-environment interactions. For example, in case of a particle moving in a given environment, instead of solving the Newton's equations of motions for the system and for each particles of the environment, one can just focus on the velocity vector, U, of the diffusing particle, that is, the GLE reads: where γ(t − τ) is the friction function and, finally, F(t) is a zero mean, stochastic, Gaussian colored noise and they are related through the fluctuation-dissipation relation (FDR) [3]: where k B is the Boltzmann constant and T the temperature. Despite the drastic reduction of the degrees of freedom, the GLE has shown to provide a reliable description of the physical diffusive processes. In fact, it has been employed in many fields of research, for instance, in Biophysics, for studying anomalous diffusion of proteins in lipid membrane [4][5][6][7], in Hydrodynamics, for studying how probe particles diffuse in viscoelastic materials [8][9][10][11][12], in Quantum Mechanics, for explaining the dissipation in quantum system on a more fundamental and general level [13][14][15][16][17] as well as in case of specific system such as the Caldeira-Leggett model [18,19] and, in Chemistry, as a diffusion model for the chemical reactions [20,21]. Nevertheless, the accuracy of the physical description provided by the GLE strictly depends on how one models the parameters and the timedependence of the friction function. Its parametrization, in fact, is an open field of research which involves both

Introduction to the problem
In this section, we revisit and formalize the theoretical derivation of a GLE introduced by Zwanzig [31] putting in evidence what are the main inconsistencies and giving an insight about how to overcome them. The physical system that Zwanzig considered is composed by a subsystem that we call heavy particle described by the variable X ≔ (Q, P) where Q is the generalized coordinate and P the conjugated momentum and a heat bath which is a set of N particles whose state is described by the collective variables Y ≔ (q, p) with the same meaning as before.
The whole Hamiltonian of this system is given by Zwanzig is where the second term represents the interaction Hamiltonian since it involves the variables of both heavy particle and heat bath through the a(X) vector. In fact, in order to switch off the interactions, we have to impose somehow a(X) = 0 for each X ≠ 0. Finally, the matrix K is a symmetric constant matrix. The equation of motion for the heat bath and heavy particle variables are, respectively: where A and B are two constant antisymmetric matrix, i.e.: such that the ranks of A and B are, in general, different and where W(x) = ∇ X a T (X), V(X) = A∇ X H s (X).
The heat-bath equation in (4) can be formally solved and we have: The time-derivative of the a(X) vector can be rewritten in terms of the W matrix, i.e.
By replacing equation (5) into the first equation in (4) and by making use of equation (6), one gets and, introducing the canonical partition function where k B and T, respectively, are the Boltzmann constant and the temperature whereas and Tr denotes the trace of a matrix.

Criticism
We now explain our main criticism to the Zwanzig's derivation showing that the definitions (8) and (9) provided by Zwanzig yield a FDR which is inconsistent with the standard definition [3]. The main reason of this inconsistency can be better explained by means of the following argument. In writing the Hamiltonian for two interacting systems, all the information about the nature of the interaction is contained in the potential function which gives a fully microscopic and detailed description of system. Conversely, in the (generalized) Langevin equation, one just takes into account a coarse-grained representation of these interactions encoded into two terms: the stochastic noise vector and the friction matrix or function. The role of the FDR is somehow to balance the energy dissipated by the friction with that supplied by the random noise. Consequently, the friction and noise term have to contain the information about the strength of the interactions. For example, in the easiest case of a standard Langevin equation , the second term on the right-hand side can be interpreted as a convolution between a kernel function ζ(t − τ) ∼ γδ(t − τ) (δ is the Dirac delta distribution) and the velocity vector of the heavy particle, U(t). In this case, it is also evident that the strength of the interaction is represented by the friction constant γ which is a coarse-grained description of the heavy particle-environment interactions. This feature must hold also in case of a GLE, that is, we expect that, in the Zwanzig's derivation, all the microscopic information contained in the potential function are condensed within the friction matrix L.
By inspection, we immediately realize that L and F, respectively, in equation (8) and (9) do not show this feature. In fact, (up to two constants k B and T) L depends on B and K. Now, B is a constant antisymmetric matrix whereas K is the matrix associated to the oscillation frequencies of the heat bath particles when they do not interact with the heavy one 1 .
Therefore, the FDR formulated by Zwanzig in (11) does not provide any balance. Based on this observation, we now explicitly show that, introducing the same Hamiltonian considered by Zwanzig, that is: where we recall that capital letters represent the heavy particle variables. By comparing (12) and (3), we note that the first two terms in the Hamiltonian above correspond to H s (X) whereas the terms in the square brackets correspond to H b (X, Y). Therefore, the Hamiltonian (12) can be deduced from that in equation (3) making explicit in the latter the position and momentum variables, i.e.: H P p Q q H P Q p a P Q K p a P Q q a P Q K q a P Q p a P Q K q a P Q , , , , Now, by imposing K pp = 1 and a p (P, Q) = 0, K pq = K qp = 0 and setting K diag , , , , equation (13) reduces to (12). We note that i 2 { } w is the set of free oscillation frequencies of the heat bath particles whereas i i N 1 { } g = is the set of coupling coefficients which determine the strength of the heat bath-heavy particle interactions. Thus, a specific interaction can be switched off just imposing γ i → 0 which implies a Q 0 q i ( ) = . By specializing definitions (8) and (9) to this example, a cumbersome but straightforward calculation leads to the noise term whereas for the kernel matrix, we have As already stressed, the friction matrix (15) does not depend on the coupling coefficients, γ i and, thereby, the FDR is ill-defined as we will see in a moment. In fact, by plugging equations (14) and (15) into the definition (11) for the FDR, one gets: It is evident that the friction function on the last right-hand side, that is: { } w = , therefore, the obtained FDR (16) does not recover the usual definition.
Finally, we stress that, adopting the same example, Zwanzig writes a different friction function (which is also the correct one) and it reads (see equation (23) in [31]): This suggests that he maybe absorbed, ad hoc, the γ i -coefficients within ζ.
Based on that, we conclude that the definition (18) should be adjusted in order to obtain the correct ζ function without any ad hoc-adjustment and this is the principal aim of the paper.

Notation
In this section, we formalize the derivation introduced in the previous section.
We consider a Hamiltonian function describing a classical physical system composed by a heat bath (P) interacting with a heavy particle (B). The heat bath consists of N particles with mass m = 1 smaller than the heavier particle mass M.
We assume that the system is embedded in the three dimensional physical space 3  so that the phase space, Therefore, the heat bath conjugated coordinates are:  can be split into three terms: where, as usual, we have: where the potential function V : 3    describes any external force which just acts on the heavy particle and it could have a generic dependence on X.
The interaction term is given by: Thus, the interactions are described by the so-called interaction matrix which is a linear application acting on the heat bath degrees of freedom. Note that each K ij is a three dimensional matrix since also the spatial degrees of freedom are included.
The meaning of the interaction described by H INT is to assume a harmonic approximation around the equilibrium positions Q * ≔ a(X) of the heat bath system, where a: provides the equilibrium configuration of all the heat bath particles for a given X 3  Î .
The system of Hamilton's equations, then, reads: ´and the superscript T denotes the transpose operation.
The system of equations above can be reduced to a GLE for the system position vector X(t). To do so, one first focuses on the heat bath equations (the last two equations) and, by just having a linear dependence on the Qcoordinates, one interprets the factor Ka(X) as an external force. Hence, one exploits the Laplace transform method to obtain a formal solution Q S (t) for the heat bath position vector. Actually, the Hamiltonian (21) presents a translational invariance since it only depends on Q − a(X). This feature reflects on the formal heat bath solution and, as we will see later, we obtain an explicit mathematical expression of the heat bath solution directly in terms of Q S (t) − a(X(t)). Then, the solution can be plugged into the equation of motion of the heavy particle and the resulting equation explicitly takes the form of a GLE.
We will explicitly employ such a procedure in the next section after having studied more deeply the properties of the equilibrium map a(X) which plays a fundamental role in the definition of the kernel matrix and noise vector.
We conclude this section, mentioning that the procedure above described, called elimination of the heat bath degrees of freedom, has been already introduced by Kac et al [30] first and by Zwanzig [31] later. However, Kac et al just considered linear system-heat bath interactions that, in our notation, translates in defining the a map to be a i (X) ≔ X and absorbing all the parameters representing the interaction strength within the K matrix. Zwanzig extended their system assuming a generic dependency of the a map on the heavy particle position vector X and since it could be nonlinear, the elimination procedure led to a nonlinear GLE.

Formal derivation of a nonlinear GLE
In this section, inspired by the work of Zwanzig, we will derive a nonlinear GLE and define the kernel matrix and noise term in such a way to be able (i) to recover the standard definition of FDR, (ii) to define a continuum limit without any ulterior adjustments.
In the section 2.1, we show that, by choosing a suitable potential function, one can always recast the starting Hamiltonian into an Hamiltonian of the same form given in equation (3). In section 2.2, we show that, an appropriate manipulation of the equilibrium map allows to naturally define the correct kernel matrix and noise vector.

General setting: study of the equilibrium map
Let us first clarify the role and the physical meaning of the map a : First of all, we note that each element of the set a X { ( )} = plays the role of an equilibrium configuration of the respective heat bath particle for any fixed X 3  Î . In fact, one can demonstrate [33] that such equilibrium map can be always derived once that a generic interaction potential is given.
In practice, given any Hamiltonian function of the form  (20) and Q X , ( ) V is a rather generic interaction term, one can define the global equilibrium map a(X) ≔ {a 1 (X), L ,a N (X)}, imposing that any component of this map realizes the following requirement In other words, equation (23) provides an implicit relation between a and X. Therefore, the Taylor expansion (harmonic approximation) of the potential with respect to the equilibrium map a reads: is, by construction, a constant matrix (in q) which just contains the oscillation frequencies around the equilibrium positions associated to each heat bath particle. The first term in the expansion (24) could, in general, depends on X and affects the equations of motion, however, if the dependence of the potential function, V, on the Xand q-coordinates of the form ∥X − q i ∥, such a factor is X and Q-independent and hence it could be dropped as we will notice in the next example. On the contrary, the second one can be identified with the interaction potential introduced in equation (21). Depending on the mathematical structure of the initial potential V, the resulting equilibrium map a(X) can depend on X in a general fashion.

Examples of a possible equilibrium map.
Let us consider a Lennard-Jones-like interaction potential of the form where the parameters , i i  s h Î have the dimension of an energy and a length, respectively; lastly, i  Î  are dimensionless parameters. Note that we do not consider any mutual interaction between the heat bath particles. The requirement (23), namely: is the equilibrium configuration of the heat bath, yields This set of relations can be geometrically interpreted imaging that any particle, let us say, the i-th one, lies on a point of the sphere R 2 i  with radius R i = 2 1/6 σ i . The reason why any point on the sphere is equivalent to each other is due to the fact we do not have any mutual interaction between the heat bath particles.
Therefore, every equilibrium position vector * q i can be related to the heavy particle vector X, introducing any unit vector n(X) (  n X 1 In general, it could depend on X, for example, such as n(X) = X/∥X∥. Thus, the equilibrium map may read: can be dropped since constant and the potential function ca be approximated as follows Q X q a X K q a X , 1 2 , Note that the factor in the K ii matrix of equation (27) has the physical dimension of a frequency, therefore, we define a new set of frequencies which allows us to rewrite the initial set of parameters (η i , ò i , σ i ) in terms of a new one: (ω i , η i , ò i ); in particular, the equilibrium map a i in equation (26) now reads: We note that if X  Î then we have n(X) = X/|X| and this implies that n X sgn X ( ) ( ) = . Therefore, during the system dynamics, that is, along the equation of motion, the function X = X(t) gives rise to a discontinuous equilibrium map since n X sgn X t ( ) ( ( )) = can change sign discontinuously in time passing from +1 to −1 and viceversa. Hence, the application to a one-dimensional case requires that the potential function is not of the form V = V(|X − ò i q i |).
By going back to the three-dimensional case, we observe that the interaction matrix K does not depend on the parameters η i and ò i describing the interactions.
We will show that, in order to coherently define a continuum limit of the kernel function appearing in a GLE obtained through the elimination procedure, we have to normalize the equilibrium map as we show in the upcoming section.

Generalization of the equilibrium map
In order to make our derivation as general as possible, we assume to have n heavy particle position-dependent functions, X f I Essentially, the main trick is to separate, in the equilibrium map, the dependence by X from the dependence by the parameters η i,I , so that the f I -functions do not depend on such parameters. Although each function f I could have a generic X dependence, without loss of generality, we assume that only the gradient of the first function f 1 equals the identity matrix in 3  . These assumptions can be formalized as follows The specific case provided in equation (26) can be obtained from the definition (31) posing n = 2 and f 1 (X) ≔ X and f 2 (X) ≔ X/∥X∥. Therefore, in this setting, the Hamiltonian function of the system is:

Alternative derivation
The system of Hamilton's equations of motion (22) based on the Hamiltonian (33) written as a system of second order differential equations reads: We note that equation (35) is linear in the heat bath-variables Q, this allows us to apply the Laplace transform method [24][25][26]. Hence, the Laplace transform of equation (35) is In order to recover the common structure of a GLE, we note that the mathematical structure of the equation obtained above allows to define the kernel matrix and the noise term as follows: We wish to stress a significant difference between the current definitions of noise vector and kernel matrix in equation (44) and the previous ones in equation (8) and (9). In the latter, the noise and kernel terms do not contain any information about the strength of the interactions represented by the coefficients γ i,I and all the parameters γ i,I are contained in the equilibrium map. In the former case, the parameters η i,I have been completely absorbed into the noise term and kernel matrix. Therefore, we can recast equation (43) into a nonlinear GLE: An important step forward in the investigation of stochastic behaviors arising from interacting Hamiltonian systems would be made if we could solve equation (45) which is a nonlinear stochastic differential equation. Two different approaches could be employed. From one side, one can attempt a semi-analytical approach exploiting the same method introduced in [34]. The authors showed that, even though we have a nonlinear differential equation, but we explicit know the time-dependence of the functions entering the equation 2 , we can still apply the Laplace transform method to equation (45). However, this approach cannot always provide an exact solution but a series representation of the solution is possible. So far, this approach has been applied just for investigating nonlinear differential equations which are not stochastic but it could be extended to address stochastic differential equation. A more detailed study will be done in a future paper. Thus, by exploiting equation (52), the average of the tensor product above becomes: where we used equation (40) and knowing that K t cos 1 2 ( )and K t sin 1 2 ( )commute with any polynomial of K. We obtained in equation (54), the fluctuation-dissipation relation; in fact, by comparing the operator on the right-hand side above with equation (44), we explicitly have: As proof of consistence and show-case example, let us finally apply such a derivation to the Hamiltonian system in equation  (15) in the Zwanzig's result [31]. However, now, the continuum limit can be computed. We conclude this section discussing the importance of including all the parameters representing the interactions (in this case η i ) directly in the definitions of the kernel matrix and noise terms In particular, this can be appreciated as soon as one computes the continuum limit of the friction matrix, which consists in assuming that the set of heat bath frequencies d becomes a continuum set  Ì S together with a distribution function g(ω) of frequencies which depends on the specific interactions we considered.
In this limit, one replaces the sum over the frequencies with an integral [31]: { } h = is any set of coefficients and η(ω) is the associated frequency-dependent function in the continuum limit. Now, if the kernel function is defined as in equation (17), i.e., without including the parameters γ i , the continuum limit, which reads: