Machine-learning Kohn-Sham potential from dynamics in time-dependent Kohn-Sham systems

The construction of a better exchange-correlation potential in time-dependent density functional theory (TDDFT) can improve the accuracy of TDDFT calculations and provide more accurate predictions of the properties of many-electron systems. Here, we propose a machine learning method to develop the energy functional and the Kohn-Sham potential of a time-dependent Kohn-Sham system is proposed. The method is based on the dynamics of the Kohn-Sham system and does not require any data on the exact Kohn-Sham potential for training the model. We demonstrate the results of our method with a 1D harmonic oscillator example and a 1D two-electron example. We show that the machine-learned Kohn-Sham potential matches the exact Kohn-Sham potential in the absence of memory effect. Our method can still capture the dynamics of the Kohn-Sham system in the presence of memory effects. The machine learning method developed in this article provides insight into making better approximations of the energy functional and the Kohn-Sham potential in the time-dependent Kohn-Sham system.


Introduction
A major challenge in TDDFT is the development of high-accuracy approximations for the Kohn-Sham potential v KS (r, t) in the time-dependent Kohn-Sham (TDKS) system.The Kohn-Sham potential is an effective potential that allows us to equivalently convert the interacting electron system to a fictitious non-interacting system known as the Kohn-Sham system.As a tradeoff of this conversion, it is extremely hard to derive the explicit expression of the Kohn-Sham potential, which, in principle, determines the property of the electron system.As a result, significant research effort has been devoted to developing better approximations for the Kohn-Sham potential.
The Runge-Gross theorem [1] guarantees that the Kohn-Sham potential is uniquely determined (up to a purely time-dependent function) by the time-dependent electronic density n(r, t) for a given initial many-body state Ψ 0 = Ψ(r, t 0 ) of the interacting system and initial Kohn-Sham state Φ 0 = Φ(r, t 0 ) of the non-interacting Kohn-Sham system.The associated Kohn-Sham potential is often written as a sum of three terms v KS [n, Ψ 0 , Φ 0 ](r, t) = v ext (r, t) + v H [n](r, t) + v xc [n, Ψ 0 , Φ 0 ](r, t), where the first term is the external potential, the second term is the Hartree potential describing the classical electron-electron interaction.The first two terms are numerically easy and can be calculated explicitly given the electronic density.The last term is the exchange-correlation (xc) potential.The xc potential includes all the complex and non-trivial effects whose exact form is unknown.Hence, the primary challenge in determining an accurate approximation of the Kohn-Sham potential is finding an effective approximation for the exchange-correlation potential.This task is challenging due to the complexity of electron interactions in many-electron systems.
To enhance the accuracy of TDDFT calculations and provide more precise predictions of the properties of many-electron systems, various approximation methods for the xc-potential have been developed.Commonly used methods include local density approximation (LDA), generalized gradient approximation (GGA), and metageneralized gradient approximation (meta-GGA) [2][3][4], etc.Besides these traditional explicit constructions, some semi-empirical approaches were developed as well [5].Recent works have provided approximation methods via machine learning techniques [6,7].In Ref. [7], the authors investigated a two-electron system, where the electron densities and the exact xc-potential at different times were calculated explicitly from the exact form of the time-dependent two-electron wavefunction.A machine learning model that maps the electron density to the xc-potential was trained under the supervised learning framework, using the pre-calculated density-xc-potential pair as the dataset.Such a method only works in small systems because the computational cost of calculating the many-body wavefunction grows exponentially with the system size.Therefore, it is natural to ask whether we can develop a machine learning-based method for approximating the energy functional, and hence the Kohn-Sham potential and the xcpotential, without feeding the exact xc-potential to the machine learning model for training, since the xc-potential is uniquely determined by the electron density.
A similar question in classical mechanics has been extensively investigated by researchers from the artificial intelligence community.Several types of neural networks have been developed to learn the potential from classical trajectories.The well-trained model can be further utilized to predict the trajectories that are not contained in the training set.Most of the neural networks in learning classical dynamics can be classified into two categories, the Hamiltonian neural network (HNN) [8][9][10][11][12] and the Lagrangian neural network (LNN) [13].HNN incorporates Hamilton's equations into the neural network architecture, while LNN uses ideas from the Lagrangian equation.Both methods have shown promising results for learning classical mechanics, but they have not yet been extended to learn the more challenging Kohn-Sham potential.
Our work generalizes the idea of HNN to the quantum chemistry regime.We propose a machine learning method for studying the Kohn-Sham system under adiabatic approximation, and we provide a positive answer to the question we posed earlier about the feasibility of using machine learning to approximate the energy functional without using any information about the exact xc-potential.In section 2, we establish the connection between the Kohn-Sham equations and Hamilton's equations.The connection lies as the cornerstone for the construction of our neural network.In section 3, we present our results with a harmonic oscillator example and a two-electron example.Conclusions and summaries are offered in section 4.

Kohn-Sham equations and Hamilton's equations
Before discussing the Kohn-Sham equations, it is first necessary to understand the connection between the time-dependent Schrödinger equation in quantum mechanics and Hamilton's equations in classical mechanics.The time-dependent Schrödinger equation governs how a quantum system evolves over time, while Hamilton's equations determine the dynamical behavior of a classical mechanical system in terms of its position, momentum, and energy.
The time-dependent Kohn-Sham equations are a form of the time-dependent Schrödinger equation, describing non-interacting particles.In TDDFT, the requirement for the Kohn-Sham equations is that the density produced by the non-interacting system should be the same as the density of the original interacting system [1,14].

Schrödinger equation and Hamilton's equations
The Schrödinger equation reads (in atomic units throughout the article): Given state |Ψ(t) , we can rewrite it as a linear combination of a set of basis states {|k k=1,2,... } (which leads to the set of basis functions in terms of position s k (r) = r|k } k=1,2,... ), expressed as |Ψ(t) = k c k (t) |k .This expression allows us to rewrite the Schrödinger equation into the form of a vector differential equation: where c(t) = [c 1 (t), . . ., c n (t), . ..]T is the coefficients vector, H is the matrix representation of the Hamiltonian Ĥ with the matrix elements being H ij = i| Ĥ|j .To uncover the relationship between the Schrödinger equation and Hamilton's equations, we define an energy functional: where c † (t) is the complex conjugate of c(t).To recover Hamilton's equations in the form of real-valued differential equations, we can separate the coefficients vector into a real part and an imaginary part: where q(t) = √ 2 Re c(t) and p(t) = √ 2 Im c(t).This transformation in quantum mechanics was first proposed in Ref. [15] and later investigated in Ref. [16][17][18], allowing us to express the complex-valued energy functional H[c(t)] to real-valued functional H[q(t), p(t)].Thus we have the following Hamilton's equations with respect to the real and imaginary part of the coefficients: ∂H[q(t), p(t)] ∂q(t) = − ṗ(t).
A more compact equivalent expression in terms of complex variables is given by: This equation connects the Schrödinger equation and Hamilton's equation.In the next section, we will see how Kohn-Sham equations are connected to Hamilton's equations with a similar treatment.

Kohn-Sham equations and Hamilton's equation
To derive similar equations in the time-dependent Kohn-Sham system, we need to go a bit beyond the Schrödinger equation in the above section and design a different energy functional expression.To simplify things, we assume the adiabatic approximation in the whole article [19][20][21], which is commonly used in TDDFT calculation.The adiabatic approximation means the exchange-correlation potential depends only on the instantaneous electron density n(t) ‡.There are two reasons for using the adiabatic approximation.One is that the energy functional under adiabatic approximation is a functional of the instantaneous electron density, as a consequence, the xc-potential can be calculated from the xc-energy functional by v xc [n(t)](r) = δExc[n(t)] δn(t) .The other consideration is that it is more convenient to construct a neural network that maps the instantaneous electron density to the energy functional.
Although the adiabatic approximation is widely used in TDDFT calculation, it often underestimates the complexity of the xc-potential.The exact xc-potential at time t is typically dependent on the electron density at all previous times, leading to the socalled memory effect [22][23][24].It is worth noting that such a complicated dependence on the history of the electron density is neglected by the adiabatic approximation.Thus, for a system where the memory effect is not negligible, our machine-learning method may not provide the exact Kohn-Sham potential.In the results part of this article, we will demonstrate this with examples.
With the adiabatic approximation, we can now discuss the relationship between the Kohn-Sham equations and Hamilton's equations.Note that the Kohn-Sham equations in TDDFT are time-dependent Schrödinger equations describing the non-interacting particles: where is the Hamiltonian in the Kohn-Sham system, m labels the m-th Kohn-Sham orbital.The corresponding energy functional is given by: where are the Hartree energy functional, external energy functional, and xc-energy functional, respectively.
Similar to the discussion in section 2.1, it is possible to rewrite the Kohn-Sham equations into vector differential equations in terms of the coefficients of the Kohn-Sham orbitals given a set of basis funtions {s k (r) = r|k }: where . .]T is the coefficient vector of the m-th Kohn-Sham orbital, and the matrix element of H KS [n(t)] is given by i| ĤKS [n(t)]|j .Since the electron density can be expressed in terms of the coefficient vector ] and H KS [n(t)], respectively.Thus the total energy functional defined in the Kohn-Sham system is consistent with the energy functional defined in Eq. 3 Similarly, we can show that Hamilton's equations hold in the Kohn-Sham system under the adiabatic approximation as well.The details of the proof are shown in Appendix B.
An equivalent real-valued Hamilton's equation can be obtained by applying the transformation given in Eq. 4: Therefore, Eq. 6 and 7 can be generalized to the time-dependent Kohn-Sham system.In Fig. 1, we show the workflow for converting the quantum equations to classical Hamilton's equations in both the Schrödinger equation and the Kohn-Sham equations.The Hamilton's equations derived in the Kohn-Sham system are the cornerstone of our machine-learning method.We will discuss the details in the next section.

Machine learning energy functional
From this part to the end, the discussion is limited to the 1-dimensional system.For simplicity, we use H[c(t)] and H[q(t), p(t)] instead of H KS [{c m (t)} m=1,2... ] and H KS [{q m (t), p m (t)}] for the following discussion, because as shown in Eq. 11, all Kohn-Sham orbitals follow the same differential equation.In this section, we will demonstrate how we incorporate the equations derived above into the neural networks.
Before discussing the technical details, we need to prepare the dataset.We then pick n basis functions.In our work, the sinc discrete variable representation (DVR) basis is used [17,25], where ∆x is the grid's spacing.Thus for any training given initial condition c(t = 0) = c 0 , c(t) = [c 1 (t), c 2 (t), . . ., c n (t)] defines a trajectory γ(t) in C n , or equivalently R 2n .In the rest of the discussion, we will use the real-valued Hamiltonian since the real-valued functions are better handled in contemporary machine learning frameworks, e.g., PyTorch [26], TensorFlow [27], etc.Given M different initial conditions, we can obtain M different trajectories γ (1) (t), . . ., γ (M ) (t).We can then perform sampling on the trajectories over N different timestamps.The total M × N sampling points and their time-derivatives {(γ (1) The Hamiltonian H[q, p] defines a real-valued function on R 2n , which determines one trajectory given an initial condition.Thus we can use a neural network ansatz to parameterize the Hamiltonian that controls the system's dynamics and train it with the dataset we obtained above.The trained Hamiltonian can be used to predict the future dynamics of the system.This idea is illustrated in Fig. 2.An important property of this framework is that the total energy is invariant on a given trajectory γ(t) because The structure of our neural network is shown in Fig. 3.The input layer of the neural network has 2n entries, the first n entries are for the scaled real part of the coefficients at time t, and the last n entries are for the scaled imaginary part of the coefficients.There are two hidden layers.There is one scalar as the output layer representing the energy functional of the system in our neural network.Let y (i) be the variable vector of the i-th layer, the i-th and the i + 1-th layer are connected by, where σ(•) is the non-linear activation function, W (i) and b (i) are trainable weights and biases of the i-th layer.In the numerical experiments we carried out, σ(x) = tanh x and σ(x) = softplus(x) were used for the activation functions.
Figure 3: The structure of our neural network.q i = √ 2Re c i is the scaled real part of the i-th entry of the coefficient vector, p i = √ 2Im c i is the scaled imaginary part of the i-th entry of the coefficient vector.The scaled real and imaginary parts of the coefficients at time t are fed into the neural network as our inputs.There are two hidden layers.Neurons are connected by non-linear activation functions (in the numerical experiments we carried out, softplus and tanh were chosen).The output layer is the energy functional of the Kohn-Sham system.The loss function relies on the gradient (shown in the last layer in the figure) of the energy functional, we compute it with auto differentiation functionality in Pytorch.
The loss function is given by: where q i = √ 2Re c i and p i = √ 2Im c i are the scaled real and imaginary parts of the i-th entry of the coefficient vector c, respectively.qi and ṗi are their time derivatives.
∂H ∂q i and ∂H ∂p i are from the gradient of the energy functional calculated using the autodifferentiation functionality in modern machine learning frameworks.In our work, we chose Pytorch.The Hamilton's-equation-inspired loss function was originally proposed in Ref. [28] and Ref. [8] for studying classical motions.We generalize it to the quantum regime for investigating potential inversion problems in Schrödinger equations and Kohn-sham equations.The neural network is optimized using the adaptive moment estimation method (Adam) [29].By minimizing the loss function, we can get the energy functional of the Kohn-Sham system parametrized by the neural network.
We need to take an additional step to calculate the Kohn-Sham potential from the energy functional in the system.Since the sinc DVR is used in this article, the corresponding matrix element of the potential term V ij KS is given by: where T ij is the matrix element of the kinetic energy term whose value is [17,25] The details of the proof of Eq. 17 are shown in Appendix B.

Harmonic oscillator
The first example we choose is the 1-dimensional harmonic oscillator.With this example, we will demonstrate that the system dynamics can be successfully reproduced by the abovementioned machine learning method.The potential is time-independent v(x) = 1 2 ω 2 x 2 , where ω = 1.The eigenstates and their time derivatives of the Harmonic oscillator have analytical expressions, where H n (x) is the n-th Hermite polynomial.We obtain the dataset from the lowest M eigenstates.For each eigenstate in the dataset, we chose 150 evenly spaced grid points in the range x ∈ [−6, 6].In the time domain, 2000 timestamps are evenly sampled in t ∈ [0, 4π] with a time step length ∆t = 6.3 × 10 −3 (a.u.).Thus, the input dimension of the neural network is 300 (twice the number of grid points), and the dimension of each hidden layer is 400.The number of eigenstates is set to be M = 15 in this numerical experiment, which means we have sampled over 15 different trajectories in this dataset, each of which contains 2000 sampling points.
To verify the trained neural network produces the correct dynamics of the system, we randomly pick an initial state that is not in the training set and let it propagate under the given Hamiltonian.In the example given below, the initial state is a linear combination of the lowest four eigenstates |ψ(x, . Then we compare the machine-learned time-dependent density to the exact density.To be precise, at time t, we can compute the gradient of the energy functional ∂H ∂q i and ∂H ∂p i using auto-differentiation, which can be used to propagate the coefficients to the next timestamp t + dt with the Runge-Kutta (R-K) method [30].In this sense, our neural network can be viewed as a differential equation solver.The entire pipeline of propagating the initial state with our neural network is summarized in Fig. 4. We compare the time evolution of the electron density propagated using the machine-learned energy functional and the exact result calculated from analytical expression in Fig. 5.The machine-learned density (black dashed line) exhibits a similar trend to the exact result at any point in the entire position space at early timestamps, e.g., t = 1.0 (a.u.).The error accumulates with time.As shown in Fig. 5, the error grows larger at t = 4.0 (a.u.), but still captures the general feature of the exact density.In the third plot of Fig. 5, the results are completely different at a longer time t = 15.0 (a.u.).
We use mean square error (MSE) to quantify the error in the density at each timestamp.Given the exact electron density n(x, t) and the machine-learned density n M L (x, t), the MSE at time t between n(x, t) and n M L (x, t) is given by 2 , where N g is the number of grid points in the space domain.The resulting MSE at different times t and its logarithmic plot are shown in Fig. 6.It is clear from the figure that the MSE grows in time.We notice the error in this example shows a periodic pattern; this is due to the periodicity of the system.To avoid the error from going too big, one can set a threshold and calibrate the time evolution of the trajectory when the error is larger than a predefined threshold.
As shown in Eq. 14, the energy functional on a given trajectory shouldn't be changed over time.Specifically, in our harmonic oscillator test, if we evaluate the machinelearned energy functional on the trajectory of an eigenstate φ n (x, t), the output should  be the eigenvalue n + 1 2 (up to a constant shift).This phenomenon is observed in Fig. 7a, where each red stages from bottom to top correspond to the trajectories of the first to the 15th eigenstates, respectively.As shown in the figure, each stage is flat, indicating that the energy associated with the trajectory remains constant over time.Another interesting fact is that the difference between consecutive stages is 1, which is exactly the energy level spacing of the Harmonic oscillator.These are non-trivial effects because energy conservation law has never been explicitly involved in the whole machine learning workflow.However, the neural network can discover the pattern from data automatically.
We calculate the machine-learned potential using Eq. 17 and show the machinelearned potential (black dots) versus the exact potential (red line) in Fig. 7b.The machine-learned potential captures the structure of the exact quadratic potential around the center of the position space, but it starts to deviate from the actual potential near the boundary of the pre-selected finite region.One possible reason for this phenomenon is that our training set only covers a fraction of the entire Hilbert space, which means (i) the training set covers only a finite range of the infinite position space, (ii) the training set includes only the lowest M eigenstates of the infinite many eigenstates.We show in Appendix C that having more eigenstates in the training set results in more accurate machine-learned potential.

Two electron test
We revisit the two-electron system previously studied in [7,31,32], and use the machine learning method developed in our work to study the Kohn-Sham potential in the system.With this example, we demonstrate that our machine-learning method can be extended to study realistic electronic systems as well.The many-body Hamiltonian governing the dynamics of the two-electron system is given by: where . The spatial part of the initial two-electron wavefunction is Ψ( , where φ H (x) is the ground state hydrogen wavefunction, φ W P (x) = (2α/π) with α = 0.1, x 0 = 10.0, p = −1.5.So we have one Gaussian wave packet centered at x 0 = 10.0 moving leftward.As the system evolves, the Gaussian wave packet collides with the static Hydrogen wavefunction and then passes through the Hydrogen wave function at longer times.
The two-electron wavefunction Ψ(x 1 , x 2 , t) can be computed numerically.Based on the numerically exact solution of the two-electron wavefunction, we can calculate the electronic density n(x, t) = 2 dx 2 Ψ ⋆ (x, x 2 , t)Ψ(x, x 2 , t) and the density current j(x, t) = 2Im dx 2 Ψ ⋆ (x, x 2 , t)∂ x Ψ(x, x 2 , t).For a spin-singlet state, the spatial parts of the two Kohn-Sham orbitals are computed as dx ′ , such that the corresponding Kohn-Sham potential is a real function [24,33].Coefficients of the Kohn-Sham orbital are then obtained as our dataset.
Similar to the discussion in the harmonic oscillator example, the system is discretized evenly in the range x ∈ [−24.78,21.97] with 200 grid points.The propagation starts from t = 0 to t = 15.0 (a.u.) with time step length ∆t = 5 × 10 −4 (a.u.).The input dimension of the neural network for the two-electron model is thus 400, and the dimension of each hidden layer is 400.
We show the time evolution of the density generated from the neural network in Fig. 8.The time evolution of the machine-learned density (black dashed line) captures the feature of the exact evolution (red solid line).As mentioned in the previous section, the error accumulates over steps, so we calibrate the initial condition every 500 timestamps.We notice that the machine-learned density overlaps well with the exact density at t = 5.0 (a.u.) and t = 13.0 (a.u.), but shows a small deviation at t = 8.5 (a.u.) when the collision between the two wave packets starts to happen.The choice of calibration every 500 timestamps is to ensure the difference between the machine-learned density and the exact density is small even when the collision happens, which can be increased to a larger number (e.g., 3000) in the absence of collision.As a comparison, we show the results start at t = 4.0 (a.u.) with no calibration in Fig. 9.The error is negligible at t = 5.0 (a.u.) (1000 steps).It starts to grow at t = 6.0 (a.u.) (2000 steps) and blows up at t = 7.0 (a.u.) (3000 steps).Therefore, calibrations to the initial condition are necessary to prevent the error from accumulating.We show the MSE between the machine-learned electron density and the exact electron density at different times in Fig. 10.Fig. 10a and Fig. 10b are the MSEs in linear and logarithmic scales with calibration every 500 steps, respectively.Both show a spike at t ≈ 8.5 (a.u.), which is consistent with our abovementioned observations: the error is larger at around t = 8.5 (a.u.) when the collision between the two wave packets starts to happen.Fig. 10c and Fig. 10d are MSEs in linear and logarithmic scales without calibrations.We observe that the MSE changes dramatically after t ≈ 6.5 (a.u.) in Fig. 10c.The relationship between MSE and time becomes evident in the logarithmic scale plot.As shown in Fig. 10d, the MSE grows exponentially with time.
Although the neural network reproduces the correct dynamics of the system, we need to mention that the machine-learned xc potential is not the same as the actual one.We show the comparison in Fig. 11.One note here is that we don't compare the xc potential directly.Instead, we compare the exchange-correlation part of the xc potential.The machine-learned xc potential (black dots) does not capture the features of the actual xc (red solid line) potential.Similar phenomena were observed in Ref. [34], where the dynamics of the same system were reproduced correctly by adjoint learning, but they mentioned that the machine-learned xc potential did not correspond to the actual one.The adiabatic local density approximation (ALDA) xc potential (blue dash line) is also shown for comparison.Due to the lack of memory effect, the ALDA xc potential does not capture the exact dynamics of the two-electron system.
In our machine learning method, we attempted to learn the correct dynamics under adiabatic approximation.So, the machine-learned xc potential is expected to be different from both the exact xc potential and the ALDA xc potential but to retain the features of both potentials.The results shown in Fig. 11 can show some evidence.In Fig. 11a and Fig. 11b, the machine-learned xc potential peaks at x = −10, which is in alignment with the ALDA xc potential.The trend of machine-learned xc potential is similar to the exact xc potential in the region of x > 0. The difference between the exact xc potential and the machine-learned potential, as well as the difference between the machine-learned and the ALDA potential, are shown in Figs.11d,11e and 11f.harmonic oscillator model and a two-electron soft Coulomb model.In both examples, the exact dynamics of the systems could be well reproduced from the machine learning method.The machine-learned potential in the harmonic oscillator test captures the general feature of the actual quadratic form potential, but it shows a discrepancy from the actual one in the two-electron test.We have analyzed the possible reasons, and the memory effect is the major source of error.The memory effect requires considering the densities of previous timestamps.To overcome this difficulty, we believe the neural networks capable of handling time series (e.g., RNN, LSTM) are promising [35,36].

Figure 1 :
Figure 1: Workflow for converting quantum equations into classical Hamilton's equation.The left blue block is the workflow for Schrödinger equation.The right red block is the workflow for Kohn-Sham equations.

Figure 2 :
Figure 2: Illustration of the Machine learning Hamiltonian framework.The sampling points over M trajectories are used for training a Hamiltonian modeled by a neural network, which will be used to predict the dynamics of the system.

Figure 4 :
Figure 4: Pipeline of propagating the initial state with our neural network

Figure 5 :
Figure 5: The machine learned density and the exact density at different timestamps.The black dashed line is the machine-learned density, the red solid line is the exact density.The three panels correspond to the results of timestamp t = 1.0 (a.u.) (left), t = 4.0 (a.u.) (middle), and t = 15.0 (a.u.) (right).

Figure 6 :
Figure 6: (a) MSE in density accumulates with time in linear scales.(b) MSE in density accumulates with time in logarithmic scales.The periodic pattern in local maximum and local minimum due to the periodicity of the system.

Figure 7 :
Figure 7: (a) is the figure of energy level.We manually shift the ground state energy to 0 for comparison.The neural network generates the correct level spacing for the harmonic oscillator test.(b) the machine learned potential (black dots) versus the actual potential (red line).

Figure 8 :
Figure8: The machine learned density and the exact density at different timestamps in the two-electron test with a calibration every 500 steps.The black dashed line is the machine-learned density, the red solid line is the exact density.The three panels correspond to the results of timestamp t = 5.0 (a.u.) (left), t = 8.5 (a.u.) (middle), and t = 13.0 (a.u.) (right).

Figure 9 :
Figure9: The machine learned density and the exact density at different timestamps in the two-electron test without calibrations.The black dashed line is the machine-learned density, the red solid line is the exact density.The three panels correspond to the results of timestamp t = 5.0 (a.u.) (left), t = 6.0 (a.u.) (middle), and t = 7.0 (a.u.) (right).

Figure 11 :
Figure 11: (a) -(c): The machine-learned xc potential, the exact xc potential, and the ALDA xc potential at different timestamps in the two-electron test.The black dashed line is the machine-learned density, the red solid line is the exact density, and the blue dashed line is the ALDA potential.(d) -(f): The red dotted line is the difference between the machine-learned xc potential and the exact xc potential.The blue dashed line is the difference between the machine-learned xc potential and the ALDA xc potential.The three columns correspond to the result at t = 5.0 (a.u.) (left), t = 8.5 (a.u.) (middle), and t = 13.0 (a.u.) (right).

i|
JY and JDW were supported by the U.S. Department of Energy, Office of Science, and Office of Advanced Scientific Computing Research, under the Quantum ComputingAppendix B. Kohn-Sham equations and Hamilton's equationWe prove Hamilton's equations in Kohn-Sham system in this part.The Kohn-Sham equations are given by,i ∂ ∂t |φ m (t) = ĤKS [n(t)] |φ m (t) , m = 1, 2, . . ., (B.1)where N is the number of electrons, and n(t) = N m=1 |φ m (t)| 2 .In section 2.2, we have shown the Kohn-Sham equation can be written as a differential equation in terms of the vector coefficients c m , given the basis functions {s i (r) = r|i }:i ċm = H KS [c]c m , (B.2)where the matrix element at i-th row j-th column is given by:(H KS [c]) ij = i| Ts [n(t)] + VH [n(t)] + Vext [n(t)] + Vxc [n(t)]|j .(B.3)To bridge Kohn-Sham equations and Hamilton's equation, we need to find an energy functionalH KS [c], such that ∂H KS [c] ∂c † m = H KS [c]c m .Under the adiabatic approximation, the energy functional corresponding to the system isH KS [n] = T S [n] + E ext [n] + E H [n] + E xc [n], where n(c; r) = m,i,j c * im c jm s * i (r)s j (r)is the electron density mentioned in the main text.Therefore, the four terms in the energy functional expansion can be calculated below:(i) Kinetic energy:T s [n] =m i,j i| Ts |j c jm c * im (B.4) ∂T s [n] ∂c * im = j i| Ts |j c jm (B.5) (ii) External energy: E ext [n] = m i,j i| Vext |j c jm c * im Vext |j c jm (B.7) (iii) Hartree energy: ∂E H [n] ∂c * im = dr dr ′ δE H [n] δn(c; r ′ ) δ(r − r ′ ) ∂n(c; r) ∂c * im = j dr v H (r)c jm s * i (r)s j (r) = j i| VH |j c jm (B.8) (iv) Exchange-correlation energy: ∂E xc [n] ∂c * im = dr dr ′ δE xc [n] δn(c; r ′ ) δ(r − r ′ ) ∂n(c; r) ∂c * im = j dr v xc (r)c jm s * i (r)s j (r) = j i| Vxc |j c jm (B.9)