Modelling non-Markovian quantum processes with recurrent neural networks

Quantum systems interacting with an unknown environment are notoriously difficult to model, especially in the presence of non-Markovian and non-perturbative effects. Here we introduce a neural-network-based approach, which has the mathematical simplicity of the Gorini–Kossakowski–Sudarshan–Lindblad master equation, but is able to model non-Markovian effects in different regimes. This is achieved by using recurrent neural networks (RNNs) to define Lindblad operators that can keep track of memory effects. Building upon this framework, we also introduce a neural network architecture that is able to reproduce the entire quantum evolution, given an initial state. As an application we study how to train these models for quantum process tomography, showing that RNNs are accurate over different times and regimes.


I. INTRODUCTION
Traditionally, in the physical sciences, the solution to mathematical problems for which no analytic solution is available involves modelling methods leveraging a combination of approximation techniques (such as perturbation theory or semiclassical approaches) and the use of symmetries to reduce the complexity of the problem. Recently, with the surge in popularity of machine learning techniques [1,2], data-driven approaches, which instead rely on computational techniques that exploit statistical correlations, have started to find applications in different fields, ranging from chaos theory [3] to high-energy physics [4], eventually showing many applications and new perspectives even in the quantum domain [5,6]. For instance, in quantum many-body physics, where numerical techniques are particularly challenged by the exponential scaling of the wavefunction, learning methods have been used in a number of different domains. In particular, artificial neural networks (a class of learning methods inspired by the functioning of biological neural networks) have been utilized for ground state estimation [7], quantum state tomography [8], classification of phases of matter [9], entanglement estimation [10], and to identify phase transitions [11]. Although the current theoretical understanding of the effectiveness of these models is limited, a growing body of literature has established connections between neural networks and standard frameworks commonly used to analyze quantum systems, such as renormalization groups [12], tensor networks [13][14][15], and the complexity theoretic tools developed in quantum computing [16]. Moreover, classical optimization techniques borrowed from supervised machine learning have been employed to optimize the dynamics of many-body systems [17][18][19][20][21] and parametric quantum circuits [22][23][24][25][26][27][28].
Open quantum systems [29,30] present further challenges. Here, any modelling effort must take into account that the system interacts with the surrounding environment. These effects, which are usually unknown, are treated in a phenomenological way and can significantly increase the complexity of the model. This approach contrasts with what is typically done in the neural network community, where overparametrized models learn from data, rather than describing an environment with complex functional dependencies that can be hard to derive from phenomenological observations. Data-driven approaches are responsible for the success of deep networks in fields like natural language processing and image recognition [2], where huge amounts of data are readily available.
Here we investigate the ability of neural networks to model the non-Markovian evolution of open quantum systems. We focus on Recurrent Neural Networks (RNNs), a type of artificial neural network specifically designed to model dynamical systems with possibly long-range temporal correlations. In the quantum setting, RNNs have been previously employed for quantum control [31,32]. We consider two main applications. In the first one we define a master equation which has the mathematical simplicity of the Gorini-Kossakowski-Sudarshan-Lindblad master equation [33,34], but which is able to model non-Markovian effects via the memory cells included in RNNs. The resulting RNN architecture shares some similarity with collisional models [35][36][37][38][39][40] or ϵ-machines [41], where explicit memory effects are introduced by using ancillary quantum systems. Nonetheless, our approach has the advantage that it can be easily modified by choosing different RNN architectures that better reproduce the expected properties in the temporal sequence. Moreover, it provides a convenient mathematical way to define the so-called memory kernel [29], such that the resulting master equation has a completely positive solution. The reconstruction of the memory kernel from quantum state sequences, namely from quantum process tomography, has been the subject of different studies in recent years [42][43][44][45][46][47]. In most of the approaches considered in the literature to reconstruct the memory kernel, one normally makes some assumptions on the microscopic model of the environment, and then fits the free parameters given the experimental data. However, the details of the microscopic model are difficult to know, and many uncontrollable assumptions have to be made. Our main result is to show that RNNs can learn memory effects in data sequences, without making any assumption on the underlying physical model. Indeed, even the Hamiltonian of the system can be learnt during this process.
Developing on the same formalism, we define and train a RNN architecture that reproduces the entire physical evolution given an initial quantum state, without introducing master equations. As a relevant application of this technique, we consider quantum process tomography and show that RNNs are able to model the physical evolution of quantum systems over different regimes.

arXiv:1808.01374v1 [quant-ph] 3 Aug 2018
The paper is structured as follows. In Sec. II we present the relevant technical background. Specifically, in Sec. II A we introduce non-Markovian quantum processes and common master equation techniques to describe them, while in Sec. II B we introduce the RNN architecture employed in this paper. The main ideas are presented in Sec. III, where we introduce RNN-based master equations (Sec. III A), RNN-based quantum processes (Sec. III B), and applications for process tomography (Sec. III C). Numerical experiments are presented in Sec. IV and conclusions are drawn in Sec. V.

II. BACKGROUND

A. Non-Markovian processes
Quantum systems, even the purest ones, are inevitably in contact with an environment. Because of this normally unknown interaction, quantum evolution deviates from the predictions of the Schrödinger equation, and different extensions have been proposed [29,48] to model such open quantum systems. When the interactions inside the environment happen on timescales much shorter than the internal timescales of the system, a Markovian approximation is usually appropriate, and the evolution can be modeled via the Gorini-Kossakowski-Sudarshan-Lindblad (GKSL) master equation [33,34]

dρ(t)/dt = −L[ρ(t)],   L[ρ] = i[H, ρ] + ½ Σ_µ ( {L_µ† L_µ, ρ} − 2 L_µ ρ L_µ† ),   (1)

where H is the Hamiltonian, which models the noise-free case, and the L_µ are called Lindblad operators. The superoperator L is called the Liouvillian. The mathematical properties of the above equation are well understood [30]. For any choice of H and L_µ the solution of the master equation, E_t = e^{−Lt}, defines a completely positive trace-preserving linear map, and thus is a mathematically well-posed mapping from states to states. With properly chosen Lindblad operators, the above master equation models the most general Markovian interaction with an environment [33,34]. However, the Markovian approximation is not accurate in many situations, for instance when the interactions inside the environment have strengths comparable to the interactions inside the system [49,50]. In that case the master equation has to be modified to take into account non-Markovian effects. One of the first and most accurate descriptions of non-Markovian evolution is the Nakajima-Zwanzig (NZ) master equation [29]

dρ(t)/dt = ∫₀ᵗ K^NZ_{t−s}[ρ(s)] ds + I(t),   (2)

where K^NZ_t is a superoperator, the so-called memory kernel, which describes the interaction with the environment, while I(t) is due to the initial correlations between system and environment. If the system and environment are initially uncorrelated, then I(t) = 0 for all t. It is clear that the above equation describes non-Markovian processes, because the state at time t + dt depends not only on ρ(t) but also on the states
ρ(s) for s < t. The Nakajima-Zwanzig equation is at the basis of powerful Green function methods to study the spectral properties of the system [50][51][52][53], since the convolution disappears in the frequency domain. On the other hand, in the time domain the above equation is not easy to solve numerically. To avoid this problem, a different but equally accurate master equation has been proposed, the so-called time-convolutionless (TCL) master equation [29], which reads

dρ(t)/dt = K^TCL_t[ρ(t)].   (3)

The main formal difference between Eq. (3) and Eq. (2) is that the whole history of states is fed into the NZ master equation, while in the TCL case the master equation explicitly depends on ρ(t) only, and all non-Markovian effects are included in the memory kernel. As such, the non-Markovian nature of the above equation is less obvious, but it is known that both the NZ and TCL master equations can describe the same process. Indeed, there are formal mappings between K^NZ and K^TCL such that both Eq. (3) and Eq. (2) produce the same physical evolution [29,54].
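As a sanity check on the structure of the GKSL dynamics discussed above, the following numpy sketch integrates a qubit master equation with a simple explicit-Euler scheme. The Hamiltonian, Lindblad operator and parameters are illustrative choices (a decaying qubit), not taken from the experiments later in the paper; trace preservation holds exactly for each Euler step, while positivity holds to good approximation for small time steps.

```python
import numpy as np

def gksl_rhs(rho, H, lindblads):
    """Right-hand side of the GKSL master equation:
    drho/dt = -i[H, rho] + sum_mu (L rho L^+ - {L^+ L, rho} / 2)."""
    out = -1j * (H @ rho - rho @ H)
    for L in lindblads:
        LdL = L.conj().T @ L
        out += L @ rho @ L.conj().T - 0.5 * (LdL @ rho + rho @ LdL)
    return out

# Hypothetical illustration: Hamiltonian (omega/2) sigma_z and a single
# Lindblad operator sqrt(gamma) sigma_minus (spontaneous decay).
omega, gamma, dt = 1.0, 0.5, 1e-3
sz = np.diag([1.0, -1.0]).astype(complex)
sm = np.array([[0, 0], [1, 0]], dtype=complex)
H, Ls = 0.5 * omega * sz, [np.sqrt(gamma) * sm]

rho = 0.5 * np.array([[1, 1], [1, 1]], dtype=complex)  # |+><+|
for _ in range(1000):
    rho = rho + dt * gksl_rhs(rho, H, Ls)  # explicit Euler step
```

After the evolution the excited population has decayed while the trace stays equal to one, as expected for a trace-preserving generator.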
Although TCL master equations are relatively easy to solve numerically, the main problem is that the interaction with the environment is normally unknown. In other terms, while H is typically well characterized experimentally, the memory kernel depends on environmental quantities such as the temperature, but also on the spectral properties of the environment, which are normally unknown. Non-linear spectroscopy can be used to find a model of the spectral density [51], but this still does not completely characterize the memory kernel without introducing further assumptions. The most commonly employed approximation is the assumption of a weak coupling between system and environment, such that the memory kernel can be formally obtained using perturbation theory. However, there are cases where these approximations are not justified, e.g. in quantum biology [49], where the strength of the interaction with the environment is comparable with the internal interactions inside the system. Motivated by these considerations, we study a new ansatz to model non-Markovian quantum processes, based on the use of RNNs.

B. Recurrent Neural Networks
RNNs are a class of neural networks designed to model data sequences like time series. Unlike feedforward neural networks, which provide just a functional approximation of the input-output relationships in the data, RNNs can model temporal dependencies arising in the dataset. To understand their functioning, it is helpful to compare them with more standard feedforward networks. In feedforward neural networks the input data s⁰ propagates through many intermediate (hidden) layers before reaching the final output layer. Here "propagate" means that, step by step, the state s^{ℓ+1} of the (ℓ+1)-th layer is updated, given the state of the ℓ-th layer, as s^{ℓ+1} = f(W_ℓ s^ℓ + w_ℓ), where W_ℓ is a weight matrix, w_ℓ a weight vector and f is some non-linear function. The state at the final layer of the network (output layer) depends on all the weight matrices and vectors. Training is performed by updating those weights such that the neural network learns some desired input-output relationship hidden in the data.
In the case of temporal data, each input has also an explicit time dependence, s⁰_t. Although, in principle, one could still use a giant feedforward network with these data, this is rarely the optimal choice, because the number of free parameters quickly increases with the number of time steps. RNNs solve this issue with a more advanced architecture which is tailored for temporal data. In RNNs, the update rule for the hidden layers at time t does not only depend on the state s^ℓ_t, but also on the previous state of the unit. In other terms, the update rule is s^{ℓ+1}_t = f(W s^ℓ_t + w, s^{ℓ+1}_{t−1}), where s⁰_t is the input temporal sequence. The free parameters W and w do not depend on t, and memory of the past is taken into account by the function f, which compresses and saves relevant information about previous sequences into memory cells. This architecture allows RNNs to learn temporal sequences using a relatively small number of parameters, even when the temporal data has long-range memory effects. The mapping between s^ℓ_t and s^{ℓ+1}_t defines a RNN cell.
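A minimal numpy sketch of this recurrent update (a plain tanh cell rather than the gated cells used later); all sizes and weight values are arbitrary illustrative choices. Note that the same weights are reused at every time step, which is what keeps the parameter count independent of the sequence length.

```python
import numpy as np

def rnn_cell(x_t, s_prev, W_in, W_rec, b):
    """Single recurrent update: the new hidden state depends on the
    current input x_t and on the previous hidden state s_prev."""
    return np.tanh(W_in @ x_t + W_rec @ s_prev + b)

rng = np.random.default_rng(0)
n_in, n_hid = 3, 5
W_in  = 0.1 * rng.normal(size=(n_hid, n_in))
W_rec = 0.1 * rng.normal(size=(n_hid, n_hid))
b     = np.zeros(n_hid)

# Unroll over a short input sequence: the final hidden state depends on
# the whole input history, not just the last input.
s = np.zeros(n_hid)
for t in range(10):
    s = rnn_cell(rng.normal(size=n_in), s, W_in, W_rec, b)
```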
In this work we use a variant of RNN cells called the Gated Recurrent Unit (GRU), shown in Fig. 1 and discussed in Appendix A. GRUs use a gating mechanism that allows them to better model long-term dependencies than simpler RNNs [55]. GRUs are based on a type of RNN cell called Long Short-Term Memory (LSTM), but can be more efficient for comparable performance [56,57]. GRU and LSTM cells are commonly used and achieve state-of-the-art performance for sequence modelling across multiple domains, including machine translation, image captioning and forecasting [58]. The GRU state s_t is a linear interpolation of the previous state s_{t−1} and a candidate state s̃_t, which depends on the auxiliary input x_t. The input x^j_t for a depth-j cell at time t is the state from the cell in the previous layer, s^{j−1}_t. GRU cells can be stacked to form a deep GRU network. More details can be found in Appendix A.
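For concreteness, here is a standard GRU cell in numpy; the exact variant used in the paper is specified in its Appendix A, so treat this as the common textbook formulation with illustrative weights.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_cell(x_t, s_prev, p):
    """Standard GRU update: z is the update gate, r the reset gate and
    s_tilde the candidate state; the new state is a linear interpolation
    between the previous state and the candidate."""
    z = sigmoid(p["Wz"] @ x_t + p["Uz"] @ s_prev + p["bz"])
    r = sigmoid(p["Wr"] @ x_t + p["Ur"] @ s_prev + p["br"])
    s_tilde = np.tanh(p["Wc"] @ x_t + p["Uc"] @ (r * s_prev) + p["bc"])
    return (1 - z) * s_prev + z * s_tilde

rng = np.random.default_rng(0)
n_in, n_hid = 4, 8
p = {k: 0.1 * rng.normal(size=(n_hid, n_in)) for k in ("Wz", "Wr", "Wc")}
p.update({k: 0.1 * rng.normal(size=(n_hid, n_hid)) for k in ("Uz", "Ur", "Uc")})
p.update({k: np.zeros(n_hid) for k in ("bz", "br", "bc")})

# Run the cell over a random input sequence.
s = np.zeros(n_hid)
for t in range(20):
    s = gru_cell(rng.normal(size=n_in), s, p)
```

Because the update gate interpolates between the old state and the candidate, the cell can keep components of its state nearly unchanged over many steps, which is what supports long-term memory.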

III. MAIN IDEA
In this section we present the three main contributions of this paper, which all leverage the modelling capabilities of RNNs to describe the dynamics of open quantum systems. First, we describe a non-Markovian master equation. Second, we use RNNs to predict the time evolution of a quantum state under a non-Markovian quantum process. Third, we show how these two techniques can be utilized to perform quantum process tomography.

A. RNN Quantum Master Equation
We postulate a quantum evolution similar to Eq. (1), namely

dρ(t)/dt = L_{≤t}[ρ(t)],   (4)

where the notation ≤ t refers to a superoperator that not only depends on t, but also on the entire history before time t, as in the TCL master equation (3). A convenient choice is then that of the GKSL form

L_{≤t}[ρ] = −i[H + H^LS_{≤t}, ρ] + Σ_µ ( L^µ_{≤t} ρ L^{µ†}_{≤t} − ½ {L^{µ†}_{≤t} L^µ_{≤t}, ρ} ),   (5)

where H^LS_{≤t} is a "Lamb-shift" term, namely a correction to the Hamiltonian induced by the environment, while the L^µ_{≤t} are Lindblad operators. The reason for this choice is that for small enough ∆_t the time evolution is simply given by ρ(t + ∆_t) ≈ e^{∆_t L_{≤t}}[ρ(t)] and, since L_{≤t} is in the GKSL form, e^{∆_t L_{≤t}} is a completely positive trace-preserving quantum channel, mapping states to states. If H^LS and the L^µ are simply time-dependent functions, namely they depend only on t and not on previous times, then the above master equation is always Markovian [54]. The main idea of this work is to use a RNN to define each Lindblad operator L^µ_{≤t} and the correction Hamiltonian H^LS_{≤t_j}, see Fig. 2(a). In order to ensure the Hermiticity of the H^LS_{≤t} operator, we construct it as H^LS_{≤t} = A(t_j) + A(t_j)†, where A(t_j) is the output of the network and A(t_j)† its conjugate transpose. Since in RNNs the predicted output at time t depends on the entire history at previous times, this parametrization is expected to accurately reproduce genuinely non-Markovian effects, even with possible long-range dynamical correlations. The master equation then resembles a TCL non-Markovian master equation (3), but where the complicated memory superoperator is expressed via a simpler GKSL form with RNNs. We call Eq. (5) the Quantum Recurrent Neural (QRN) master equation. Similarly, we call the operators L^µ_{≤t} recurrent Lindblad operators. A schematic of the resulting neural network is shown in Fig. 2(a).
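The two post-processing steps above can be sketched in numpy, with fixed random matrices standing in for the RNN outputs: Hermitize the Lamb-shift term, assemble a GKSL-form Liouvillian as a superoperator, and exponentiate it to obtain a trace-preserving step. The matrix exponential below is a plain Taylor series, adequate for the small generator used here; all operators are illustrative stand-ins, not learned quantities.

```python
import numpy as np

def expm_taylor(M, terms=60):
    """Matrix exponential via truncated Taylor series (fine for small ||M||)."""
    E = np.eye(M.shape[0], dtype=complex)
    T = np.eye(M.shape[0], dtype=complex)
    for k in range(1, terms):
        T = T @ M / k
        E = E + T
    return E

rng = np.random.default_rng(3)
d = 2
# Stand-ins for the RNN outputs at one time step (hypothetical values).
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
H_LS = A + A.conj().T                                        # Hermitian by construction
L1 = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))  # recurrent Lindblad op

# GKSL-form Liouvillian as a superoperator acting on row-major vec(rho):
# vec(A X B) = (A kron B^T) vec(X).
I = np.eye(d)
LdL = L1.conj().T @ L1
Lsuper = (-1j * (np.kron(H_LS, I) - np.kron(I, H_LS.T))
          + np.kron(L1, L1.conj())
          - 0.5 * (np.kron(LdL, I) + np.kron(I, LdL.T)))

dt = 0.05
channel = expm_taylor(dt * Lsuper)   # e^{dt L}: trace preserving, completely positive

rho = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)
rho_next = (channel @ rho.reshape(-1)).reshape(d, d)
```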

B. RNN Quantum Processes
For any initial time t₀, the mapping E(t, t₀) between the initial state ρ(t₀) and the state at time t, namely ρ(t) = E(t, t₀)[ρ(t₀)], defines a completely positive map, assuming no initial correlation with the environment. For any intermediate time t₀ < τ < t the mapping E(τ, t₀) is always completely positive. When also E(t, τ) := E(t, t₀)E(τ, t₀)^{−1} is completely positive, the mapping is called divisible [59]. Divisibility is another way of characterising Markovianity, as quantum processes obtained from the GKSL master equation are always divisible, with E(t′, t) = T exp( ∫_t^{t′} L_s ds ), where T denotes time ordering. In the previous section we have introduced the QRN master equation and shown that it can model non-Markovian effects, even when the evolution between intermediate steps is completely positive. This is possible because the RNN keeps a compressed record of the previous evolution. Using the formalism of the previous section we can indeed write

E(t_j, t₀) = E^RNN_{≤t_{j−1}}(t_j, t_{j−1}) · · · E^RNN_{≤t₀}(t₁, t₀),   (7)

where E^RNN_{≤t_j}(t_{j+1}, t_j) = e^{∆_t L_{≤t_j}}. Moreover, each E^RNN_{≤t_j}(t_{j+1}, t_j) is completely positive, being the operator exponential of a Liouvillian, but depends on the previous evolution via the recurrent Lindblad operators. As such, the total map (7) is not divisible.
Based on this analogy, we can now drop the master equation and define a non-Markovian process as ρ(t_{j+1}) = E^RNN_{≤t_j}(t_{j+1}, t_j)[ρ(t_j)], where E^RNN_{≤t_j}(t_{j+1}, t_j) is completely positive, but depends on the entire history before t_j. Each E^RNN_{≤t_j}(t_{j+1}, t_j), being completely positive, outputs a valid quantum state ρ(t_{j+1}) at intermediate times and, being a RNN, then updates its internal memory. Complete positivity can be ensured by using the Kraus decomposition or the environmental representation.
For better comparison with the QRN master equation, in this work we use the simpler strategy shown in Fig. 2(b), where we use the output A(t_j) of the network to define a density operator via ρ(t_{j+1}) = A(t_j)A(t_j)† / Tr[A(t_j)A(t_j)†]. This ensures that the states ρ(t_j) are valid density operators throughout the entire evolution.
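This normalisation step is easy to verify numerically; the sketch below uses an arbitrary complex matrix in place of the network output A(t_j).

```python
import numpy as np

rng = np.random.default_rng(7)
d = 2
# Stand-in for the raw network output A(t_j): any complex matrix works.
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

# Post-processing: rho = A A^dagger / Tr[A A^dagger] is automatically
# Hermitian, positive semidefinite and unit-trace, i.e. a valid state.
M = A @ A.conj().T
rho = M / np.trace(M).real
```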

C. Application: Quantum Process Tomography
In this work we apply our QRN master equation and RNN quantum processes to quantum process tomography. We consider the following simple setup. We assume that the quantum system can be initialized in different initial states ρ_α(0), where α = 1, . . . , N_I, for some number N_I. For each initialization, we assume that it is possible to perform full quantum state tomography at some time steps 0 ≤ t_j ≤ T, for j = 1, . . . , N_T and some N_T. After this procedure we are able to reconstruct the time evolutions ρ_α(t_j) for different times and initializations.
Here we consider uniformly separated times, where t_j = j∆_T and ∆_T = T/N_T, though it is straightforward to generalize this procedure to non-uniform sequences. Each state tomography requires O(d²) measurements, where d is the dimension of the system's Hilbert space, so the total cost of reconstructing these sequences is O(d² N_T N_I). Once these sequences are obtained, we use them to train the neural networks.
We consider two cases. In the first one we train a RNN to learn the quantum state evolution, as shown in Fig. 2(b). Here we assume no knowledge about the system's Hamiltonian or its interaction with the environment. Training is then performed by minimising the cost function

J = Σ_{α,j} ‖ ρ_α(t_j) − ρ̂_α(t_j) ‖,   (10)

where the ρ_α(t_j) are the training data, the ρ̂_α(t_j) are the states output by the RNN, and ‖·‖ is any operator norm.
In the second application we aim at reconstructing H^LS_{≤t} and the recurrent Lindblad operators L^µ_{≤t} entering the QRN master equation, where, on the other hand, we assume that the Hamiltonian H is known. The latter assumption can always be relaxed, as the Hamiltonian evolution can be fully included into the correction Hamiltonian H^LS, which is learnt from the data. In the QRN master equation, the operators H^LS_{≤t} and L^µ_{≤t} are obtained from the output of the RNN, as shown in Fig. 2(a). To train the RNNs to predict H^LS and L^µ given a starting state ρ_α(t = 0), we propose the use of the differential equation (4) to define a cost function.
To explain the idea, let us first consider the opposite scenario, where the differential equation (4), with all of its operators, is already known, while the states ρ(t_j) are not. In this common case, the states ρ(t_j) at different times are evaluated by numerical integration of the master equation. The latter can be obtained with an n-th order Runge-Kutta integrator [60] which, in general, can be formally written as

ρ(t_{j+1}) = ρ(t_j) + ∆_T L^{RK,n}_{≤t_j}[ρ(t_j)],   (11)

where L^{RK,n}_{≤t} is the n-th order Runge-Kutta integration step, which can be explicitly obtained for any n (see e.g. Ref. [60]). For instance, to first order L^{RK,1}_{≤t} is simply L_{≤t}. To summarize, when H^LS_{≤t} and L^µ_{≤t} are known, we can use a Runge-Kutta integrator to obtain the time evolution ρ(t_j).
We now consider the opposite problem, namely where many time sequences ρ_α(t_j) are already known, while the operators H^LS_{≤t} and L^µ_{≤t} in the QRN master equation are not. This is like assuming that the solutions of a differential equation are known and, from them, we want to reconstruct the differential equation itself. Based on the analogy with numerical integration via Runge-Kutta, we propose to use the following cost function:

J = Σ_{α,j} ‖ ρ_α(t_{j+1}) − ρ_α(t_j) − ∆_T L^{RK,n}_{≤t_j}[ρ_α(t_j)] ‖²_F,   (12)

where ‖·‖_F is the Frobenius norm. The intuitive idea behind the above cost function is to make the measured data sequences as close as possible to those coming from the numerical solution of a master equation. Minimizing the cost function is then equivalent to finding the best QRN master equation compatible with the measured sequences ρ_α(t_j). It is expected that a higher-order integrator (large n) performs better, especially for larger ∆_T, but requires heavier numerical computations.
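A first-order (n = 1) numpy sketch of this training signal, with a fixed, history-independent generator standing in for the RNN outputs: the trajectory is generated with the same Euler rule, so the true generator yields (numerically) zero cost while a perturbed one does not. All operators and parameters are illustrative.

```python
import numpy as np

def gksl_rhs(rho, H, Ls):
    """GKSL right-hand side, standing in for the first-order step L^{RK,1}."""
    d = -1j * (H @ rho - rho @ H)
    for L in Ls:
        LdL = L.conj().T @ L
        d += L @ rho @ L.conj().T - 0.5 * (LdL @ rho + rho @ LdL)
    return d

def cost_rk1(states, H, Ls, dt):
    """First-order version of the cost: compare each measured state with the
    one-step prediction obtained from the previous state."""
    J = 0.0
    for j in range(len(states) - 1):
        pred = states[j] + dt * gksl_rhs(states[j], H, Ls)
        J += np.linalg.norm(states[j + 1] - pred) ** 2  # squared Frobenius norm
    return J

# Hypothetical "measured" data: a trajectory generated with known H and L.
sz = np.diag([1.0, -1.0]).astype(complex)
sm = np.array([[0, 0], [1, 0]], dtype=complex)
H, Ls, dt = 0.5 * sz, [0.3 * sm], 0.01
rho = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)
states = [rho]
for _ in range(50):
    rho = rho + dt * gksl_rhs(rho, H, Ls)
    states.append(rho)

# The true generator reproduces the data, while a wrong generator does not:
# minimising J over the generator selects the right operators.
J_true = cost_rk1(states, H, Ls, dt)
J_wrong = cost_rk1(states, 2.0 * H, [0.8 * sm], dt)
```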

IV. NUMERICAL EXPERIMENTS

A. Learning quantum state sequences
We mimic experimental data by numerically generating sequences of states, which are then used to train the neural network. To generate the training data, we consider a simple yet important model of spontaneous decay of a two-level system [48,61], described by the master equation

dρ(t)/dt = −i (ω/2) [σ_z, ρ(t)] + γ(t) ( σ₋ ρ(t) σ₊ − ½ {σ₊ σ₋, ρ(t)} ),   (13)

where {x, y} = xy + yx, the σ_α for α = x, y, z are the Pauli matrices, σ_± = (σ_x ± iσ_y)/2, and ω is the Rabi frequency of oscillations around the z axis. The parameter γ(t) is the decay rate

γ(t) = 2γ₀λ sinh(ηt/2) / ( η cosh(ηt/2) + λ sinh(ηt/2) ),   (14)

with η = √(λ² − 2γ₀λ). When λ > 2γ₀ the function γ(t) is always positive, so Eq. (13) takes the GKSL form (1) and, as such, defines a Markovian evolution. On the other hand, for λ < 2γ₀ the function γ(t) can be negative and the dynamics displays non-Markovian effects [48,62,63]. We use the above model to obtain training data for the RNN architecture shown in Fig. 2(b). The training sequences ρ_α(t_j), for α = 1, . . . , N_I, have been obtained by first choosing a random initial state ρ_α(0) and then solving the master equation (13) to get the states ρ_α(t_j) at subsequent times, up to a maximal time t_max. These data sequences were then used to train a GRU neural network, by minimising the cost function (10). After training, we test the accuracy of the neural network by generating a new sequence of states ρ_β(t_j), and the RNN prediction ρ̂_β(t_j), for β = 1, . . . , N_P. As before, ρ_β(t_j) is obtained by selecting a random initial state ρ_β(0) and solving Eq. (13), possibly for longer times than t_max. On the other hand, the predicted evolution ρ̂_β(t_j) is obtained by feeding the initial state ρ_β(0) to the RNN to get the entire temporal sequence. All the experiments run with the following parameters: ω = 1 for Eq. (13), λ = 2 and γ₀ = 0.5 for Eq. (14), and a discretisation interval ∆t = 0.01. The simulations run over 3000 training examples and predictions were tested over 1000 test examples; all the evolutions run for a time of t_max = 0.7. The two evolutions are compared with the trace distance D(ρ, ρ̂) = ½ Tr|ρ − ρ̂|. In particular, we study the average trace distance T(ρ(t), ρ̂(t)) = (1/N_P) Σ_β D(ρ_β(t), ρ̂_β(t)) as a function of time.
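The trace distance used for this comparison can be computed from the eigenvalues of the Hermitian difference ρ − ρ̂; a small sketch with textbook test cases:

```python
import numpy as np

def trace_distance(rho, sigma):
    """D(rho, sigma) = Tr|rho - sigma| / 2, from the eigenvalues of the
    Hermitian difference rho - sigma."""
    return 0.5 * np.abs(np.linalg.eigvalsh(rho - sigma)).sum()

# Orthogonal pure states are at distance 1; a qubit pure state and the
# maximally mixed state are at distance 1/2.
p0 = np.diag([1.0, 0.0]).astype(complex)
p1 = np.diag([0.0, 1.0]).astype(complex)
mix = np.eye(2, dtype=complex) / 2
```

The average trace distance T is then the mean of D over the N_P test pairs at each time step.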
In Fig. 3 we show the results of this first numerical experiment (EXP1), where we can see that, once trained, the RNN is able to predict the evolution of a known starting state up to the maximal training time t_max. On the other hand, and as expected, the accuracy of the RNN prediction rapidly deteriorates for t > t_max.

B. Learning the master equation: a simple case
In Fig. 4 we show the results of a second numerical experiment (EXP2), obtained with the same training set as the first experiment (EXP1), discussed in Fig. 3. While EXP1 uses a RNN to model the entire quantum evolution, EXP2 uses the RNN to define a QRN master equation, following Sec. III C. Training is then performed by minimising the QRN cost function (12), where for simplicity we assume a first-order Runge-Kutta integrator (n = 1) and a single recurrent Lindblad operator (µ ≡ 1). In EXP2 we have chosen a Markovian regime, so that the entire evolution can be modeled via a GKSL master equation (1). In this regime, the RNN Lindblad operators can be approximated by a simple time-dependent function, so the minimisation of the cost function (12) is equivalent to learning standard Lindblad operators. In Fig. 4 we compare the predicted recurrent Lindblad operator L^{µ≡1} as a function of time with the real one defined in Eq. (13); the figure plots the time evolution of every entry of the matrix, with real and imaginary parts shown separately. These experiments run with ω = 1 for Eq. (13), λ = 2 and γ₀ = 0.5 for Eq. (14), and a discretisation interval ∆t = 0.01; the simulations run over 1500 training examples and predictions were tested over 2500 test examples, with all the evolutions running up to t_max = 0.7 and the squared Frobenius norm used as distance between matrices. In the system under study all entries of the Lindblad operator are zero (hence the flat lines in the figure) apart from one. As we can see, the prediction is remarkably accurate at all times, even beyond the training time t > t_max. The error in the prediction is shown in Fig. 5, by plotting the cost function (12) for different times. EXP2 shows that the RNN was able to learn the evolution of the Lindblad operator L_t from t = 0.1 to t_max = 0.7. In addition, the RNN was able to predict L_t for t > t_max, beyond the last time step used during training.

C. Learning the master equation: non-Markovian case
In this section we focus on the non-Markovian regime, where there are non-trivial memory effects that the RNN has to learn and reproduce. As in Sec. IV A, we consider state sequences generated numerically by solving a master equation. However, unlike our previous treatment, here we consider a more complicated non-Markovian model of the environment, which includes back-scattering effects. Back-scattering can refer, for instance, to a photon emitted into the environment that comes back at later times. As such, the information transferred to the environment is not completely lost. To model these non-Markovian effects we consider two qubits evolving with the following master equation

dρ₁₂(t)/dt = −i[H₁₂, ρ₁₂(t)] + Σ_{i=1,2} γ_i(t) ( σ₋^{(i)} ρ₁₂(t) σ₊^{(i)} − ½ {σ₊^{(i)} σ₋^{(i)}, ρ₁₂(t)} ),   (15)

where γ_i(t) has the same functional form as Eq. (14) (but we denote the parameters γ₀ and λ with a superscript, γ₀^{(i)} and λ^{(i)}, referring to the qubit index), and the two-qubit Hamiltonian H₁₂ of Eq. (16) depends on the qubit frequency ω and on the coupling constants c₁ = 0.3242, c₂ = 0.6723, and c₃ = 0.1353. We numerically solve Eq. (15) to get the data sequences ρ^α₁₂(t_j), where α = 1, . . . , N_I indexes the different solutions obtained with different initial states. From these two-qubit solutions we then define the state sequences as ρ_α(t_j) = Tr₂[ρ^α₁₂(t_j)]. In other terms, qubit 1 is the principal system, while qubit 2 is an ancillary system. Because of the coherent interaction H₁₂ between qubit 1 and qubit 2, this approach mimics the back-action of the environment onto the system.
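Tracing out the ancillary qubit can be implemented with a reshape of the two-qubit density matrix into a rank-4 tensor; a short sketch, checked on a Bell state, for which the reduced state is maximally mixed:

```python
import numpy as np

def ptrace2(rho12, d1=2, d2=2):
    """Reduced state of subsystem 1: rho1 = Tr_2[rho12]."""
    return np.trace(rho12.reshape(d1, d2, d1, d2), axis1=1, axis2=3)

# Check on a Bell state (|00> + |11>) / sqrt(2).
psi = np.zeros(4, dtype=complex)
psi[0] = psi[3] = 1 / np.sqrt(2)
rho12 = np.outer(psi, psi.conj())
rho1 = ptrace2(rho12)
```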
We divide our numerical results into different experiments. In EXP3, shown in Fig. 6, we train a RNN to fully reproduce the entire quantum evolution, following the discussion of Sec. III B. We see that, in spite of the non-Markovian effects, the resulting error is comparable to that of the Markovian case (EXP1) shown in Fig. 3.
In Fig. 7 we use the same training data as EXP3 to run a new experiment (EXP4), where we train a QRN master equation by minimising the cost function (12). We consider two cases: in the first one the RNN outputs a single recurrent Lindblad operator L^{µ≡1}; in the second one, the RNN models both a recurrent Lindblad operator L^{µ≡1} and the renormalized Hamiltonian, namely the Lamb-shift term H^LS. We note that, with a single recurrent Lindblad operator, the error is slightly larger than that of the Markovian case, shown in Fig. 4. However, the resulting error is very low when we also include the correction Hamiltonian H^LS. Comparing Fig. 7 with the Markovian case, Fig. 4, one can see that the error grows faster in the non-Markovian regime after the training time t_max. These experiments run with γ₀^{(i)} = 0.2 for Eq. (15), ω = 1 for Eq. (16), and a discretisation interval of ∆t = 0.01, with the squared Frobenius norm used as distance between matrices. Overall, in Fig. 7 we see that the RNN which learned both the recurrent Lindblad operator and the Lamb-shift Hamiltonian performed best, and was better able to predict the memory kernel at times greater than those seen during training.
Finally, in Fig. 8 we present a different numerical experiment (EXP5). In EXP5 the training set is composed of data sequences where the qubit frequency ω in Eq. (16) is not fixed, but rather uniformly sampled from 0.5 to 1.5. The sampled frequency is used as an extra input to the RNN. This corresponds to the experimentally relevant case where the qubit frequency can be externally tuned to a known value. Uncertainty about this frequency can be estimated from the correction Hamiltonian H^LS, which is learned from the data. These experiments run with γ₀^{(i)} = 0.2 for Eq. (15), ω uniformly sampled in the interval [0.5, 1.5] for Eq. (16), and a discretisation interval of ∆t = 0.01; the simulations run over 3000 training examples and predictions were tested over 1000 test examples, with all the evolutions running up to t_max = 0.7. In Fig. 8 we see that the error in EXP5 is remarkably low. Based on the success of this numerical experiment, we propose the following general strategy to introduce prior knowledge about the system. We can define a RNN where the known properties of the system, e.g. its Hamiltonian, are added as an extra input. Training is performed using datasets of quantum state sequences and their respective Hamiltonians, where the latter are sampled from the space of experimentally relevant Hamiltonians. The remarkable accuracy shown in Fig. 8 suggests that this procedure forces the RNN to better explore the manifold of quantum state sequences, producing a more accurate and robust prediction.

V. CONCLUSIONS AND PERSPECTIVES
We have studied quantum state evolution using RNNs.We have shown that even when the system is interacting with a complicated surrounding environment, RNNs offer an accurate and robust tool for modeling and reconstructing the quantum evolution, both in the Markovian and non-Markovian regime, and even when there are back-scattering effects from the environment.
We have introduced two approaches for modelling open quantum systems with RNNs.In the first one, a deep RNN is trained to learn the entire quantum evolution, namely to reproduce the time sequence ρ(t j ) given an initial state ρ(0).In the second approach, we use a deep RNN to define a non-Markovian master equation where the memory kernel takes a convenient mathematical form, namely that of the GKSL equation.In our master equation the non-Markovian memory effects are taken into account by the structure of the RNN cells.The observed success of our approaches stems from the ability of RNNs to learn temporal sequences, where the future depends on the entire past.
Many extensions of our work are possible. Many-body systems with exponentially large Hilbert spaces could be considered by using RNNs which output a compressed representation of the state, such as tensor networks [64,65], restricted Boltzmann machines [8] or variational autoencoders [16]. Although RNNs proved to be a powerful tool for modelling open quantum systems, the machine learning literature offers a number of other methods, such as Kalman filters or models with Gaussian process transitions, that appear particularly promising. For example, Gaussian Process State Space Models [66] allow one to model prior information on the system and return Bayesian estimates of the uncertainties. Both these features are desirable in a quantum context: prior knowledge on the system, such as the form of the noise-free Hamiltonian, may be used to further reduce the complexity of the model, while approximate values for the uncertainties might enable better control of experimental inaccuracies.
Finally, in more physical terms, it would be interesting to study what happens when only partial information about the system is available. An experimentally relevant case is when the initial state ρ(0) is known, but one has access to a limited set of expectation values ⟨A_k⟩_{ρ(t_j)}, where the observables A_k are not enough to tomographically reconstruct the states ρ(t_j). This possibility may be considered, using our framework, by introducing a cost function between expectation values, rather than between density operators. A further challenge is then to model the disturbance of the measurement onto the system, namely the wave function collapse. This can be done using the process tensor formalism [44], which provides an avenue for generalising our approach in the presence of feedback. It would be interesting to study the performance of RNNs in modelling these experimentally relevant quantum evolutions.

FIG. 1. Network diagram of the GRU cell. The output s_t and input s_{t−1} represent the state at times t and t − 1, respectively, while x_t is an auxiliary input that depends on the previous times, before t − 1. Rectangles represent neural network layers. Circles represent entrywise operations. Bifurcations represent copy operations and joined lines represent concatenation. Details are presented in Appendix A.

FIG. 2. RNN architectures for open quantum systems: schematic of a one-to-many deep RNN used to approximate the master equation and to model the non-Markovian dynamics. Both networks take as input ρ(t₀) (green cell) and comprise two GRU layers and a fully connected (FC) layer (blue cells). The yellow cells show the initial network output A(t_j) before the post-processing that makes H^LS_{t_j} Hermitian and ρ(t_{j+1}) a valid density operator. Panel a. shows the output for the QRN master equation, i.e. the predicted values of H^LS_t and L^µ_t at intervals ∆t (red cells). Panel b. shows the output for a RNN modelling the time evolution of a quantum state undergoing a non-Markovian quantum process, i.e. the predicted value of ρ(t) at intervals ∆t (red cells).

FIG. 4. (EXP2) Learning the Lindblad operator describing the Markovian evolution of a two-level system: the learned and known entries of L^{µ≡1} are compared.

FIG. 3. (EXP1) Trace distance between true states ρ_β(t_j) and predicted states ρ̂_β(t_j) as a function of time. The time t_max is the maximum time used during training.

FIG. 6. (EXP3) Trace distance D(ρ_β(t), ρ̂_β(t)) between true states ρ_β(t_j) and predicted states ρ̂_β(t_j) as a function of time. The time t_max is the maximum time used during training. The simulations run over 3000 training examples and predictions were tested over 1000 test examples; all the evolutions run for a time of t_max = 0.7.

FIG. 8. (EXP5) Learning the master equation with different ω values: average value of the cost function J on unseen examples as a function of time, for a learned memory kernel with both the Lindblad operators L^µ_{≤t} (µ = 2) and the Lamb-shift Hamiltonian H^LS_{≤t}. t_max specifies the maximum time in the range of times used for training.