Unified Description of Classical and Quantum Behaviours in a Variational Principle

We give a pedagogical introduction to the stochastic variational method and show that this generalized variational principle describes classical and quantum mechanics in a unified way.


Introduction
The variational approach plays a fundamental conceptual role in elucidating the structure of classical mechanics, clarifying the origin of dynamics and the relation between symmetries and conservation laws. In classical mechanics, the function to be optimized is characterized by the Lagrangian, defined as $T - V$, with $T$ and $V$ being the kinetic and potential terms, respectively.
We can still formulate a variational principle in quantum mechanics, but the Lagrangian no longer has the form $T - V$; instead it is given by $\psi^{*}(i\hbar\partial_t - \hat{H})\psi$, where $\hat{H}$ is the Hamiltonian operator and $\psi$ is the wave function. Therefore, at first glance, there seems to be no clear or direct correspondence between classical and quantum mechanics from the variational point of view. Such a correspondence does exist, however. If we extend the idea of variation to stochastic variables, the variational principle describes classical and quantum behaviors in a unified way.
This method is called the stochastic variational method (SVM) and was first proposed by Yasue [1,2,3,4,5] to reformulate Nelson's stochastic quantization [6,7]. This framework is, however, based on special techniques of stochastic calculus, which are not familiar to many physicists. In this paper, we give a pedagogical and self-contained introduction to SVM, showing the unified description of classical and quantum mechanics. For another review, see, for example, Ref. [8].

Variational method for stochastic variables
Because of the page limit, we cannot explain all aspects of stochastic calculus in detail. See, for example, Ref. [9] for standard techniques that are not explained here.

Forward and Backward SDEs
In the variational principle for stochastic variables, a particle trajectory is no longer smooth and is in general given by a zig-zag path. As a consequence, the evolution of a particle trajectory is defined by the following stochastic differential equation (SDE),
$$ d\mathbf{r}(t) = \mathbf{u}(\mathbf{r}(t), t)\,dt + \sqrt{2\nu}\,d\mathbf{W}_t \qquad (dt > 0). \tag{1} $$
In this paper, a difference $dA(t)$ is always defined by $A(t + dt) - A(t)$, independently of the sign of $dt$. The last term in Eq. (1) is the origin of the zig-zag motion and is called the noise term. The parameter $\nu$ characterizes the strength of this noise term. One can easily see that $\mathbf{u}(\mathbf{r}(t), t)$ reduces to the usual classical definition of the particle velocity in the limit of vanishing $\nu$. The properties of $\mathbf{W}_t$ depend on the stochastic properties of the noise term. In the present paper, we assume that $\mathbf{W}_t$ is the Wiener process, which is characterized by the following correlation properties,
$$ E[d\mathbf{W}_t] = 0, \tag{2} $$
$$ E[dW^i_t\,dW^j_t] = |dt|\,\delta^{ij} \qquad (i, j = x, y, z), \tag{3} $$
where $E[\ ]$ indicates the average over stochastic events. It is clear from the above properties that $d\mathbf{W}_t$ behaves as the so-called Gaussian white noise. Such an SDE (Langevin equation) has been used in statistical physics to discuss, for example, thermalization. That is essentially an irreversible process, and there one exclusively discusses the time evolution from a given initial condition. However, in the formulation of a variational method, we should fix not only an initial condition but also a final condition. If we consider a backward process in time, $dt < 0$, it should describe a stochastic process from the final condition to the initial condition.
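The forward process can be sketched numerically. The following is a minimal one-dimensional Euler-Maruyama illustration, assuming the forward SDE has the standard form $dr = u\,dt + \sqrt{2\nu}\,dW_t$ with $E[dW_t^2] = dt$. The function names, parameter values and sample sizes are hypothetical choices made only for this example. For $u = 0$ the variance of $r(t)$ should be close to $2\nu t$, and for $\nu = 0$ the trajectory is the smooth classical one.

```python
import math
import random

def simulate_forward_sde(u, nu, r0, t_final, n_steps, rng):
    """Euler-Maruyama integration of dr = u(r, t) dt + sqrt(2 nu) dW in one dimension."""
    dt = t_final / n_steps
    r, t = r0, 0.0
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment: mean 0, variance dt
        r += u(r, t) * dt + math.sqrt(2.0 * nu) * dW
        t += dt
    return r

def sample_variance(values):
    """Unbiased sample variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

rng = random.Random(0)   # fixed seed so the run is reproducible
nu, t_final = 0.5, 1.0   # illustrative values, not taken from the text

# Pure noise (u = 0): the endpoint variance should be close to 2 * nu * t_final.
finals = [simulate_forward_sde(lambda r, t: 0.0, nu, 0.0, t_final, 50, rng)
          for _ in range(4000)]
variance = sample_variance(finals)

# Vanishing noise (nu = 0): a constant drift gives the classical straight line.
classical_end = simulate_forward_sde(lambda r, t: 2.0, 0.0, 0.0, t_final, 50, rng)
```

Only the Python standard library is used, so that the sketch runs without extra dependencies.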
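Then what is the time-reversed process corresponding to Eq. (1)? To discuss this, let us define the probability distribution as
$$ \rho(\mathbf{x}, t) = \int d^3 r_i\, \rho_I(\mathbf{r}_i)\, E[\delta^{(3)}(\mathbf{x} - \mathbf{r}(t))], $$
where $\mathbf{r}(t)$ is the solution of Eq. (1) and $\rho_I(\mathbf{r}_i)$ is the initial particle distribution with $\mathbf{r}(t_i) = \mathbf{r}_i$ at an initial time $t_i$. As is well known, its evolution is given by the Fokker-Planck equation,
$$ \partial_t \rho = -\nabla\cdot(\mathbf{u}\rho) + \nu\nabla^2\rho. \tag{6} $$
If the probability distribution evolves from $\rho_I(\mathbf{r})$ to $\rho_F(\mathbf{r}) \equiv \rho(\mathbf{r}(t_f), t_f)$ at a final time $t_f$ following Eq. (6), the corresponding time-reversed process should describe the evolution from $\rho_F$ to $\rho_I$. Suppose that this process is described by
$$ d\mathbf{r}(t) = \tilde{\mathbf{u}}(\mathbf{r}(t), t)\,dt + \sqrt{2\nu}\,d\tilde{\mathbf{W}}_t, \tag{7} $$
where, it should be noted, $dt < 0$. Then, differently from classical dynamics, in general
$$ \tilde{\mathbf{u}}(\mathbf{r}, t)\,dt \neq -\mathbf{u}(\mathbf{r}, t)\,|dt|. \tag{8} $$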
To understand the reason, let us consider the case $\mathbf{u} = 0$, where Eq. (6) becomes a simple diffusion equation, so that the corresponding time-reversed process should describe an accumulation process (the opposite of diffusion). However, if the equality $\tilde{\mathbf{u}}\,dt = -\mathbf{u}\,|dt|$ held, as in classical dynamics, then $\tilde{\mathbf{u}}$ would vanish as well, Eq. (7) would again describe pure diffusion, and it could not describe the accumulation process.
To obtain the precise relation between $\mathbf{u}$ and $\tilde{\mathbf{u}}$, we derive the Fokker-Planck equation from Eq. (7):
$$ \partial_t \rho = -\nabla\cdot(\tilde{\mathbf{u}}\rho) - \nu\nabla^2\rho. \tag{9} $$
For the two Fokker-Planck equations (6) and (9) to be equivalent, we find the following condition,
$$ \mathbf{u} = \tilde{\mathbf{u}} + 2\nu\nabla\ln\rho. \tag{10} $$
This is the consistency condition for Eq. (7) to be the time-reversed process of Eq. (1). In fact, for the diffusion case where $\mathbf{u} = 0$, the consistency condition gives
$$ \tilde{\mathbf{u}} = -2\nu\nabla\ln\rho. \tag{11} $$
Substituting this, one can see that Eq. (7) indeed describes the accumulation process. Interestingly, this consistency condition can also be derived from Bayes' theorem. See Ref. [16].
For later convenience, let us introduce the mean velocity,
$$ \mathbf{v} = \frac{\mathbf{u} + \tilde{\mathbf{u}}}{2}. \tag{12} $$
Once this quantity is determined, one can easily find $\mathbf{u}$ and $\tilde{\mathbf{u}}$ by using the consistency condition. This mean velocity is parallel to the flow of the particle probability distribution. In fact, the above two Fokker-Planck equations reduce to the following simple equation,
$$ \partial_t \rho = -\nabla\cdot(\rho\mathbf{v}). \tag{13} $$
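As a numerical illustration of the consistency condition and the continuity equation, consider free diffusion in one dimension. The sketch below assumes the consistency condition has the form $u = \tilde{u} + 2\nu\,\partial_x \ln\rho$ and takes $u = 0$ with the heat kernel as $\rho$. The function names, step sizes and test point are hypothetical. Under these assumptions the backward drift is $\tilde{u} = x/t$, and $\partial_t\rho + \partial_x(\rho v)$ with $v = \tilde{u}/2$ should vanish up to finite-difference error.

```python
import math

def rho(x, t, nu):
    """1D heat kernel: solves d_t rho = nu d_x^2 rho for a point source at x = 0."""
    return math.exp(-x * x / (4.0 * nu * t)) / math.sqrt(4.0 * math.pi * nu * t)

def backward_drift(x, t, nu, h=1e-5):
    """u_tilde = u - 2 nu d_x ln(rho) with u = 0, via a central difference."""
    dlnrho = (math.log(rho(x + h, t, nu)) - math.log(rho(x - h, t, nu))) / (2.0 * h)
    return -2.0 * nu * dlnrho

def continuity_residual(x, t, nu, h=1e-4):
    """d_t rho + d_x(rho v) with mean velocity v = (u + u_tilde) / 2 and u = 0."""
    def flux(y):
        return rho(y, t, nu) * 0.5 * backward_drift(y, t, nu)
    drho_dt = (rho(x, t + h, nu) - rho(x, t - h, nu)) / (2.0 * h)
    dflux_dx = (flux(x + h) - flux(x - h)) / (2.0 * h)
    return drho_dt + dflux_dx

# Illustrative evaluation point (not taken from the text).
drift_value = backward_drift(0.7, 1.3, 0.4)
residual = continuity_residual(0.7, 1.3, 0.4)
```

The backward drift points away from the origin, so a trajectory followed with $dt < 0$ moves toward the origin. This is the accumulation behaviour discussed above.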

Velocities in SVM and partial integration formula
The action which we will optimize is the time integral of the Lagrangian, which depends on the particle velocity. When the trajectory is described by stochastic variables, however, the definition corresponding to the velocity is not trivial.
As is well known, the time derivative of a trajectory described by an SDE, for example Eq. (1), is not well defined in the limit $|dt| \to 0$. This can be seen from the fact that $d\mathbf{W}_t$ has a size proportional to $\sqrt{|dt|}$ by Eq. (3), and thus $d\mathbf{r}/dt \sim d\mathbf{W}_t/dt \sim 1/\sqrt{|dt|}$.
However, it is known that there are two possible definitions with a well-defined limit $dt \to 0$, proposed by Nelson [6,7]. One is the mean forward derivative,
$$ Df(t) = \lim_{dt\to 0+} E\left[\left.\frac{f(t + dt) - f(t)}{dt}\,\right|\,\mathcal{P}_t\right], $$
and the other is the mean backward derivative,
$$ \tilde{D}f(t) = \lim_{dt\to 0-} E\left[\left.\frac{f(t + dt) - f(t)}{dt}\,\right|\,\mathcal{F}_t\right]. $$
These expectations are conditional averages, where $\mathcal{P}_t$ ($\mathcal{F}_t$) indicates that the values of $\mathbf{r}(t')$ are fixed for $t' \le t$ ($t' \ge t$). For the Wiener process, one can easily find $D\mathbf{W}_t = 0$, but, in general, $\tilde{D}\mathbf{W}_t \neq 0$. When $\mathbf{r}(t)$ is described by the forward and backward SDEs defined above, we find
$$ D\mathbf{r}(t) = \mathbf{u}(\mathbf{r}(t), t), \qquad \tilde{D}\mathbf{r}(t) = \tilde{\mathbf{u}}(\mathbf{r}(t), t). $$
As a matter of fact, it is impossible to control the behavior of each trajectory completely because of the random noise. What we can adjust is, at best, only the trend of the stochastic motion. The mean forward derivative defined above represents the most probable velocity forward in time when a particle is located at $\mathbf{r}(t)$, and the mean backward derivative represents that backward in time. Therefore, what we should obtain by the variational procedure is the form of $\mathbf{u}$ (or equivalently $\tilde{\mathbf{u}}$), and it is natural to express the velocities appearing in the function to be optimized by these quantities.
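The role of the conditional average can be illustrated with a small Monte Carlo sketch. It assumes a one-dimensional forward SDE of the form $dr = u\,dt + \sqrt{2\nu}\,dW_t$ and approximates the mean forward derivative at a fixed position by averaging one-step difference quotients. The drift $u(r, t) = -r$, the parameter values and the function name are hypothetical. Each single quotient is of order $1/\sqrt{dt}$, but the average approaches $u$.

```python
import math
import random

def mean_forward_derivative(u, nu, x, t, dt, n_samples, rng):
    """Average of (r(t+dt) - r(t)) / dt over one-step samples started at r(t) = x."""
    total = 0.0
    for _ in range(n_samples):
        dW = rng.gauss(0.0, math.sqrt(dt))           # Wiener increment
        dr = u(x, t) * dt + math.sqrt(2.0 * nu) * dW  # one forward step
        total += dr / dt
    return total / n_samples

def drift(r, t):
    """Hypothetical drift used only for this illustration."""
    return -r

rng = random.Random(1)
nu, x0, dt = 0.5, 1.0, 0.01

# A single difference quotient fluctuates with standard deviation sqrt(2 nu / dt) = 10.
single_quotient_std = math.sqrt(2.0 * nu / dt)

# The conditional average removes the noise and leaves the drift u(x0) = -1.
estimate = mean_forward_derivative(drift, nu, x0, 0.0, dt, 200000, rng)
```

With a finite sample the estimate carries a statistical error of order $\sqrt{2\nu/dt}/\sqrt{N}$, which is why a large number of samples is used here.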
Because of the two different time derivatives, the partial integration formula for stochastic variables is modified as
$$ \int_a^b dt\, E\left[(DX(t))\,Y(t) + X(t)\,\tilde{D}Y(t)\right] = E\left[X(b)Y(b) - X(a)Y(a)\right]. $$
The derivation is given in Appendix A. One should notice that when the time derivative $D$ moves from the left variable $X(t)$ to the right variable $Y(t)$, it is replaced by $\tilde{D}$.

Variation of stochastic action
As an example of SVM, let us consider the optimization of the one-particle Lagrangian,
$$ L = \frac{m}{2}\dot{\mathbf{r}}^2 - V(\mathbf{r}), \tag{17} $$
where $m$ is the mass of the particle and $V$ is a potential. As is well known, Newton's equation of motion is obtained when the usual variational method is applied to this Lagrangian. To implement the stochastic variation, we need to express each term by the corresponding stochastic quantities. Due to the existence of the two possible definitions of the time derivative, the most general quadratic form of the kinetic energy is given by [10]
$$ \frac{m}{2}\dot{\mathbf{r}}^2 \to \frac{m}{2}\left[B_+\left\{A_+(D\mathbf{r})^2 + A_-(\tilde{D}\mathbf{r})^2\right\} + B_-\,(D\mathbf{r})\cdot(\tilde{D}\mathbf{r})\right], \tag{18} $$
where $A_\pm = 1/2 \pm \alpha_1$ and $B_\pm = 1/2 \pm \alpha_2$, with $\alpha_1$ and $\alpha_2$ being arbitrary real constants. Note that the right-hand side reduces to the left-hand side in the limit $\nu \to 0$, independently of the values of $\alpha_i$ ($i = 1, 2$). If $\alpha_1 \neq 0$, the optimized dynamics violates time-reversal symmetry and we obtain, for example, the Navier-Stokes-Fourier equation [15]. In the present discussion, however, we focus on dynamics with time-reversal symmetry and choose $(\alpha_1, \alpha_2) = (0, 1/2)$. Then the stochastic action corresponding to Eq. (17) is given by
$$ I[\mathbf{r}] = \int_{t_i}^{t_f} dt\, E\left[\frac{m}{2}\,\frac{(D\mathbf{r}(t))^2 + (\tilde{D}\mathbf{r}(t))^2}{2} - V(\mathbf{r}(t))\right]. \tag{19} $$
It is known that there are several definitions of products of stochastic variables, for example, the Ito definition, the Stratonovich definition and so on. However, no such ambiguity arises for products such as $(D\mathbf{r}(t))^2$ in this formulation. See Appendix B for details.
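The variation of the stochastic variable is introduced as
$$ \mathbf{r}(t) \to \mathbf{r}'(t) = \mathbf{r}(t) + \mathbf{f}(\mathbf{r}(t), t). \tag{20} $$
Here $\mathbf{f}(\mathbf{x}, t)$ is an arbitrary infinitesimal function satisfying $\mathbf{f}(\mathbf{x}, t_i) = \mathbf{f}(\mathbf{x}, t_f) = 0$. Then, for example, the variation of the term containing $(D\mathbf{r})^2$ is calculated as
$$ \delta\int_{t_i}^{t_f} dt\, E\left[(D\mathbf{r})^2\right] = 2\int_{t_i}^{t_f} dt\, E\left[(D\mathbf{r})\cdot(D\mathbf{f})\right] = -2\int_{t_i}^{t_f} dt\, E\left[\mathbf{f}\cdot\tilde{D}\mathbf{u}\right]. \tag{21} $$
Here we have first used the definition of the mean forward derivative, and then the stochastic partial integration formula, whose boundary term vanishes because of the condition on $\mathbf{f}$. The potential part does not contain any time derivative, and its variation is the same as in the classical variational method. The result of the variation is then
$$ \delta I = -\int_{t_i}^{t_f} dt\, E\left[\left\{\frac{m}{2}\left(\tilde{D}\mathbf{u} + D\tilde{\mathbf{u}}\right) + \nabla V\right\}\cdot\mathbf{f}(\mathbf{r}(t), t)\right]. \tag{22} $$
It is clear from the definition of the mean derivatives that $\mathbf{r}(t)$ in $\tilde{D}\mathbf{u}$ is described by the backward SDE. Then, substituting the definition of $\tilde{D}$ and applying Ito's lemma (Appendix C), we obtain
$$ \tilde{D}\mathbf{u}(\mathbf{r}(t), t) = \left(\partial_t + \tilde{\mathbf{u}}(\mathbf{r}(t), t)\cdot\nabla - \nu\nabla^2\right)\mathbf{u}(\mathbf{r}(t), t). \tag{23} $$
Similarly, $\mathbf{r}(t)$ in $D\tilde{\mathbf{u}}$ is given by the forward SDE, leading to
$$ D\tilde{\mathbf{u}}(\mathbf{r}(t), t) = \left(\partial_t + \mathbf{u}(\mathbf{r}(t), t)\cdot\nabla + \nu\nabla^2\right)\tilde{\mathbf{u}}(\mathbf{r}(t), t). \tag{24} $$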
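Note that the last noise term in Ito's lemma disappears in the above expressions because of the conditional average included in the definition of the mean derivatives. In fact,
$$ E\left[d\mathbf{W}_t \,|\, \mathcal{P}_t\right] = 0, \qquad E\left[d\tilde{\mathbf{W}}_t \,|\, \mathcal{F}_t\right] = 0. \tag{25} $$
In the variational principle for stochastic variables, we require that $\delta I$ vanishes for 1) any choice of $\mathbf{f}(\mathbf{x}, t)$ and 2) any distribution of the stochastic variable $\mathbf{r}(t)$. To satisfy these requirements, $\mathbf{u}$ (or equivalently $\tilde{\mathbf{u}}$) should be the solution of
$$ \left[\frac{m}{2}\left(\tilde{D}\mathbf{u} + D\tilde{\mathbf{u}}\right) + \nabla V\right]_{\mathbf{r}(t) = \mathbf{x}} = 0. \tag{26} $$
Substituting Eqs. (23) and (24), we obtain
$$ \left(\partial_t + \mathbf{v}\cdot\nabla\right)\mathbf{v} = -\frac{1}{m}\nabla V + 2\nu^2\,\nabla\left(\rho^{-1/2}\nabla^2\rho^{1/2}\right), \tag{27} $$
where the mean velocity $\mathbf{v}$ is defined by Eq. (12). It is worth mentioning that Eq. (26) is formally expressed as
$$ \left[\tilde{D}\frac{\partial L}{\partial(D\mathbf{r})} + D\frac{\partial L}{\partial(\tilde{D}\mathbf{r})} - \frac{\partial L}{\partial\mathbf{r}}\right]_{\mathbf{r}(t) = \mathbf{x}} = 0. \tag{28} $$
Note that the stochastic variable $\mathbf{r}(t)$ is replaced by the position parameter $\mathbf{x}$ in the above only after all mean derivatives have been applied. This is nothing but the stochastic generalization of the Euler-Lagrange equation.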
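The algebra between the stationarity condition and the equation for the mean velocity can be made explicit. The sketch below assumes the decomposition $\mathbf{u} = \mathbf{v} + \mathbf{w}$, $\tilde{\mathbf{u}} = \mathbf{v} - \mathbf{w}$ with $\mathbf{w} \equiv \nu\nabla\ln\rho$, which follows from the consistency condition $\mathbf{u} = \tilde{\mathbf{u}} + 2\nu\nabla\ln\rho$ and the definition of the mean velocity. The symbol $\mathbf{w}$ is shorthand introduced here only for this calculation.

```latex
\begin{align*}
\tilde{D}\mathbf{u} + D\tilde{\mathbf{u}}
 &= \left(\partial_t + \tilde{\mathbf{u}}\cdot\nabla - \nu\nabla^2\right)(\mathbf{v} + \mathbf{w})
  + \left(\partial_t + \mathbf{u}\cdot\nabla + \nu\nabla^2\right)(\mathbf{v} - \mathbf{w}) \\
 &= 2\,\partial_t\mathbf{v} + 2\,(\mathbf{v}\cdot\nabla)\mathbf{v}
  - 2\,(\mathbf{w}\cdot\nabla)\mathbf{w} - 2\nu\nabla^2\mathbf{w}, \\[4pt]
(\mathbf{w}\cdot\nabla)\mathbf{w} + \nu\nabla^2\mathbf{w}
 &= \nabla\left(\tfrac{1}{2}\mathbf{w}^2 + \nu\,\nabla\cdot\mathbf{w}\right)
  = \nabla\left(\tfrac{\nu^2}{2}\,|\nabla\ln\rho|^2 + \nu^2\,\nabla^2\ln\rho\right)
  = 2\nu^2\,\nabla\left(\rho^{-1/2}\nabla^2\rho^{1/2}\right).
\end{align*}
```

The second line uses the fact that $\mathbf{w}$ is a gradient, so that $(\mathbf{w}\cdot\nabla)\mathbf{w} = \nabla(\mathbf{w}^2/2)$ and $\nabla^2\mathbf{w} = \nabla(\nabla\cdot\mathbf{w})$. Inserting the result into $\frac{m}{2}(\tilde{D}\mathbf{u} + D\tilde{\mathbf{u}}) + \nabla V = 0$ and dividing by $m$ gives the equation for $\mathbf{v}$ with the $2\nu^2\nabla(\rho^{-1/2}\nabla^2\rho^{1/2})$ term.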

Schrödinger equation
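The two equations (13) and (27) determine the optimized dynamics of the action given by Eq. (19). These coupled equations can, however, be cast into a more familiar form. Let us introduce the following complex function,
$$ \psi(\mathbf{x}, t) = \sqrt{\rho(\mathbf{x}, t)}\; e^{i\theta(\mathbf{x}, t)}, $$
where the phase is defined by $\mathbf{v}(\mathbf{x}, t) = 2\nu\nabla\theta(\mathbf{x}, t)$.
Then, from Eqs. (13) and (27), the evolution equation of this quantity is given by
$$ i\partial_t\psi(\mathbf{x}, t) = \left[-\nu\nabla^2 + \frac{1}{2\nu m}V(\mathbf{x})\right]\psi(\mathbf{x}, t). $$
When we choose $\nu = \hbar/(2m)$, this reduces to the Schrödinger equation,
$$ i\hbar\,\partial_t\psi(\mathbf{x}, t) = \left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{x})\right]\psi(\mathbf{x}, t), $$
and $\psi(\mathbf{x}, t)$ is identified with the wave function. Furthermore, one can easily see that $|\psi(\mathbf{x}, t)|^2 = \rho(\mathbf{x}, t)$ gives the probability density distribution, without introducing any additional quantum mechanical interpretation. In short, the procedure described above can be regarded as an alternative quantization scheme.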
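A simple consistency check is the stationary state of a one-dimensional harmonic oscillator. The sketch below assumes that the mean velocity obeys $(\partial_t + v\,\partial_x)v = -\partial_x V/m + 2\nu^2\,\partial_x(\rho^{-1/2}\partial_x^2\rho^{1/2})$ and takes $V = m\omega^2x^2/2$. The frequency $\omega$ is a parameter introduced only for this example. For $v = 0$ the classical force must be balanced by the $\nu$-dependent term.

```latex
\begin{align*}
\rho(x) &\propto \exp\left(-\frac{\omega x^2}{2\nu}\right), \qquad
\rho^{1/2} \propto \exp\left(-\frac{a x^2}{2}\right), \quad a \equiv \frac{\omega}{2\nu}, \\
\rho^{-1/2}\,\partial_x^2\rho^{1/2} &= a^2 x^2 - a, \\
2\nu^2\,\partial_x\left(\rho^{-1/2}\,\partial_x^2\rho^{1/2}\right) &= 4\nu^2 a^2 x = \omega^2 x
 = \frac{1}{m}\,\partial_x V .
\end{align*}
```

With $\nu = \hbar/(2m)$ this density is $\rho \propto \exp(-m\omega x^2/\hbar)$, the familiar ground-state density of the quantum harmonic oscillator.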

Stochastic Noether theorem
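In the SVM quantization scheme, physical operators are defined through Noether's theorem for the stochastic action [11]. Let us consider the spatial translation by an arbitrary constant vector $\mathbf{A}$, and require the invariance of the stochastic action under this transformation. Then we find
$$ 0 = \delta I = \int_{t_i}^{t_f} dt\, E\left[\frac{\partial L}{\partial\mathbf{r}}\right]\cdot\mathbf{A} = \int_{t_i}^{t_f} dt\, \frac{d}{dt}E\left[\frac{m}{2}\left(D\mathbf{r} + \tilde{D}\mathbf{r}\right)\right]\cdot\mathbf{A}. $$
Here we have used the stochastic Euler-Lagrange equation (28) and
$$ \frac{d}{dt}E[X] = E[DX] = E[\tilde{D}X], $$
which is obtained from the stochastic partial integration formula. Because $\mathbf{A}$ is an arbitrary constant, one can deduce that the quantity $mE[(D\mathbf{r} + \tilde{D}\mathbf{r})/2]$ is conserved. Using the solution obtained by the stochastic variation, this quantity is expressed as
$$ mE\left[\frac{D\mathbf{r} + \tilde{D}\mathbf{r}}{2}\right] = m\int d^3x\, \rho(\mathbf{x}, t)\,\mathbf{v}(\mathbf{x}, t) = \int d^3x\, \psi^{*}(\mathbf{x}, t)\,(-i\hbar\nabla)\,\psi(\mathbf{x}, t). $$
Here the conserved quantity is averaged over the initial particle distribution. This is the well-known expression for the momentum expectation value in quantum mechanics, and $-i\hbar\nabla$ is identified with the momentum operator. Similarly, we can obtain the conservation laws of energy, angular momentum and charge from the stochastic Noether theorem.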
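The identification of the conserved quantity with the quantum mechanical momentum expectation value can be checked directly. The following sketch assumes $\psi = \sqrt{\rho}\,e^{i\theta}$ with $\mathbf{v} = 2\nu\nabla\theta$ and $\nu = \hbar/(2m)$, and that $\rho$ vanishes at spatial infinity.

```latex
\begin{align*}
\int d^3x\; \psi^{*}(-i\hbar\nabla)\psi
 &= \int d^3x \left[\hbar\,\rho\,\nabla\theta - \frac{i\hbar}{2}\,\nabla\rho\right] \\
 &= \frac{\hbar}{2\nu}\int d^3x\; \rho\,\mathbf{v}
  \;=\; m\int d^3x\; \rho\,\mathbf{v}.
\end{align*}
```

The term proportional to $\nabla\rho$ is a total derivative and integrates to zero under the stated boundary condition, so only the phase gradient contributes.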

Canonical equation
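Although the canonical formulation of SVM has not yet been established, we can still formally write down the stochastic canonical equations [12]. Let us introduce quantities corresponding to the momenta as
$$ \mathbf{p} = 2\frac{\partial L}{\partial(D\mathbf{r})}, \qquad \tilde{\mathbf{p}} = 2\frac{\partial L}{\partial(\tilde{D}\mathbf{r})}. $$
Then the stochastic Hamiltonian can be introduced through the Legendre transform,
$$ H(\mathbf{r}, \mathbf{p}, \tilde{\mathbf{p}}) = \frac{1}{2}\left(\mathbf{p}\cdot D\mathbf{r} + \tilde{\mathbf{p}}\cdot\tilde{D}\mathbf{r}\right) - L(\mathbf{r}, D\mathbf{r}, \tilde{D}\mathbf{r}). \tag{37} $$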
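The canonical equations are obtained by requiring the invariance of the stochastic Hamiltonian under the following infinitesimal transformations,
$$ \mathbf{r} \to \mathbf{r} + \boldsymbol{\eta}, \qquad \mathbf{p} \to \mathbf{p} + \boldsymbol{\zeta}, \qquad \tilde{\mathbf{p}} \to \tilde{\mathbf{p}} + \tilde{\boldsymbol{\zeta}}, $$
where $\boldsymbol{\eta}$, $\boldsymbol{\zeta}$ and $\tilde{\boldsymbol{\zeta}}$ are infinitesimal constants. Substituting these into both sides of Eq. (37) and keeping terms up to first order in these infinitesimal variables, we find the following relations:
$$ \frac{\partial H}{\partial\mathbf{r}} = -\frac{\partial L}{\partial\mathbf{r}}, \qquad \frac{\partial H}{\partial\mathbf{p}} = \frac{1}{2}D\mathbf{r}, \qquad \frac{\partial H}{\partial\tilde{\mathbf{p}}} = \frac{1}{2}\tilde{D}\mathbf{r}. $$
Finally, using the relations above and the definitions of the momenta, the stochastic Euler-Lagrange equation is re-expressed as
$$ \frac{1}{2}\left(\tilde{D}\mathbf{p} + D\tilde{\mathbf{p}}\right) = -\frac{\partial H}{\partial\mathbf{r}}. $$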

Concluding remarks
We have discussed the application of SVM to the quantization of classical particle systems. When we apply the stochastic variation to the action (19) assuming Eqs. (1) and (7), we obtain the Schrödinger equation. The result of the variation depends on the assumed form of Eqs. (1) and (7). If we take the limit $\nu \to 0$ in them, the stochastic variation of the same action leads to Newton's equation of motion. That is, the usual variational method is a special case of SVM, and both classical and quantum mechanics are described in the framework of this more general variational method. It is also possible to apply SVM to quantize the Klein-Gordon field [13] and the abelian gauge field [14]. For related work on quantized fields and random fields, see Ref. [17].
The framework of SVM itself can be regarded as more general than a method of quantization. In fact, it is possible to derive the Navier-Stokes-Fourier equation by applying SVM to the action that leads to the Euler equation when the usual classical method of variation is applied [15]. It is interesting to note that the Gross-Pitaevskii equation can also be obtained in the framework of SVM [15].
There are various proposals for non-conventional quantization schemes. One of them is the so-called stochastic quantization proposed by Parisi and Wu [18,19]. As in SVM, the effect of quantum fluctuations is taken into account through an SDE in this method, but the philosophy of quantization seems to be completely different. For example, a fictitious time variable is introduced in stochastic quantization. That is, to quantize a 3 + 1 dimensional system, we need to consider 3 + 1 + 1 dimensions, and the SDE describes the evolution in this fictitious time. Moreover, what is calculated in this approach is propagators, while the Schrödinger equation and physical operators are obtained in SVM. For other quantization schemes, see, for example, Refs. [16,20,21,22,23,24].

Acknowledgments
This work is financially supported by CNPq. KT is supported by the Brazilian Ministry of Science, Technology and Innovation (MCTI-Brazil), and Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), project 550026/2011-8.

Appendix A. Stochastic partial integration formula
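The time variable is discretized as
$$ t_j = a + j\,\frac{b - a}{n}, \qquad j = 0, 1, 2, \cdots, n. \tag{A.1} $$
Then, with $dt = (b - a)/n$ and the notation $X_j \equiv X(t_j)$ etc., we can show the following:
$$ E\left[X_nY_n - X_0Y_0\right] = \sum_{j=0}^{n-1} E\left[(X_{j+1} - X_j)\,Y_j + X_{j+1}\,(Y_{j+1} - Y_j)\right] $$
$$ = \sum_{j=0}^{n-1} dt\, E\left[\frac{X_{j+1} - X_j}{dt}\,Y_j\right] + \sum_{j=1}^{n} dt\, E\left[X_j\,\frac{Y_j - Y_{j-1}}{dt}\right] \;\xrightarrow{\;n\to\infty\;}\; \int_a^b dt\, E\left[(DX(t))\,Y(t) + X(t)\,\tilde{D}Y(t)\right]. $$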
This is called the stochastic generalization of the partial integration formula.

Appendix B. Ito definition or Stratonovich definition?
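For ordinary (non-stochastic) variables, the integral of a function $f(x)$ is defined by
$$ \int_{x_a}^{x_b} f(x)\,dx = \lim_{N\to\infty}\sum_{i=0}^{N-1} f(x_i)\,dx_i, $$
where $dx_i = x_{i+1} - x_i$ and $x_i = x_a + \sum_{j=0}^{i-1} dx_j$. Here we have used $x_N = x_b$. The right-hand side can also be re-expressed as
$$ \lim_{N\to\infty}\sum_{i=0}^{N-1} f\left(\frac{x_{i+1} + x_i}{2}\right)dx_i $$
for a general smooth function $f(x)$. However, these two definitions give different results when $x$ is a stochastic variable. Let us denote the Stieltjes integral for the Wiener process as $\int_0^t W_s\,dW_s$. Then, corresponding to the argument above, we can define this integral in two different ways. One is the Ito definition,
$$ \int_0^t W_s\,dW_s = \lim_{N\to\infty}\sum_{i=0}^{N-1} W_{s_i}\left(W_{s_{i+1}} - W_{s_i}\right), $$
and the other is the Stratonovich definition,
$$ \int_0^t W_s\circ dW_s = \lim_{N\to\infty}\sum_{i=0}^{N-1} \frac{W_{s_{i+1}} + W_{s_i}}{2}\left(W_{s_{i+1}} - W_{s_i}\right). $$
These two definitions are known to give results that differ by $t/2$. Therefore, we must specify the definition of the product for a quantity like $f(W_t)\,dW_t$.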
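The difference between the two definitions can be seen in a short simulation. The sketch below assumes a discretized Wiener path with independent Gaussian increments of variance $dt$, and evaluates the integrand at the left endpoint (Ito) or as the average of the two endpoints (Stratonovich). The function name, step count and seed are hypothetical choices. The difference between the two sums equals half the sum of squared increments, which approaches $t/2$.

```python
import math
import random

def ito_and_stratonovich(t_final, n_steps, rng):
    """Return the Ito sum, the Stratonovich sum and W(t_final) for one sampled path."""
    dt = t_final / n_steps
    w = 0.0
    ito = 0.0
    strat = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment: mean 0, variance dt
        ito += w * dw                        # integrand at the left endpoint W_i
        strat += (w + 0.5 * dw) * dw         # integrand (W_i + W_{i+1}) / 2
        w += dw
    return ito, strat, w

rng = random.Random(2)
t_final = 1.0
ito, strat, w_final = ito_and_stratonovich(t_final, 20000, rng)

# The Stratonovich sum telescopes to W(t)^2 / 2, as in ordinary calculus.
# The Ito sum is smaller by approximately t / 2.
difference = strat - ito
```

The Stratonovich sum telescopes exactly for every path, while the Ito sum differs from it by a term that converges to $t/2$ only as the step size goes to zero.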
In the stochastic Lagrangian, we have terms such as $D\mathbf{r}(t)\cdot D\mathbf{r}(t)$. This is very similar to the quantity discussed above, and thus one might insist that we need to specify one of the definitions for this product. However, $D\mathbf{r}(t)$ is a conditional expectation value, and it does not contain any $d\mathbf{W}_t$ dependence. Therefore, we do not need to introduce a special definition for products of this type. This is also one of the advantages of introducing the mean derivatives.