Provably unbounded memory advantage in stochastic simulation using quantum mechanics

Simulating the stochastic evolution of real quantities on a digital computer requires a trade-off between the precision to which these quantities are approximated, and the memory required to store them. The statistical accuracy of the simulation is thus generally limited by the internal memory available to the simulator. Here, using tools from computational mechanics, we show that quantum processors with a fixed finite memory can simulate stochastic processes of real variables to arbitrarily high precision. This demonstrates a provable, unbounded memory advantage that a quantum simulator can exhibit over its best possible classical counterpart.


Cyclic random walks
Consider a small bead located on a circular ring of circumference 1 (as per figure 1). Its position can always be described by some real number $y \in [0, 1)$. At each discrete time $t \in \mathbb{Z}$, the bead's position is stochastically perturbed. This perturbation is described by a real random variable $X$ that is governed by a continuous probability density function $P(X)$, such that $Y_{t+1} = \mathrm{frac}(Y_t + X)$, where $Y_t$ represents the random variable that governs the location of the bead at time $t$, and $\mathrm{frac}(y) \in [0, 1)$ denotes the fractional part of $y$, such that positions differing only by whole rotations around the ring are equivalent. We refer to $P(X)$ as the shift function, and assume the process is stationary, in the sense that $P(X)$ has no explicit dependence on $t$, and rotationally symmetric, such that $X$ has no dependence on the current value of $Y_t$. This same formalism describes a diverse range of systems undergoing cyclic random walks, such as the azimuthal motion of gas molecules diffusing in an annular tube, or the position of a single electron travelling through an electric circuit with constant resistance.
We capture the dynamics of $Y$ formally using the framework for describing stochastic processes. In general, a stochastic process $\mathcal{P}$ is characterised by a bi-infinite sequence of random variables $\{Y_t\}$ that governs its value at each discrete time $t \in \mathbb{Z}$. For convenience, we often segregate past and future values, such that $\overleftarrow{Y}$ and $\overrightarrow{Y}$ respectively govern the values in the past and future with respect to time $t = 0$. The cyclic random walk above is then entirely captured by the joint probability distribution $P(\overleftarrow{Y}, \overrightarrow{Y})$, such that for any instance of the process with past values $\overleftarrow{y}$, future values $\overrightarrow{y}$ will be observed with probability $P(\overrightarrow{Y} = \overrightarrow{y} \mid \overleftarrow{Y} = \overleftarrow{y})$.

Here, we consider the simulation of the above process to ever increasing precision. We adopt a natural technique of discretizing a continuous process, by introducing a family of stochastic processes $\{\mathcal{P}_n\}$ that describe discrete approximations of this process, where in each the position of the bead is represented to $n$ bits of precision by an $n$-digit binary number. This is done by limiting $y$ to a discrete set of $N = 2^n$ equally spaced values, $y_j = j/N$ (for $j = 0$ to $N - 1$). At each time-step, the probability that a bead in discrete location $y_j$ transitions to $y_k$ is given by the probability $p_{jk}$ that a bead initially at $y_j$ will transition to any value of $y$ whose $n$-bit binary representation is $y_k$. That is,

$$p_{jk} = \int_{\mathcal{I}_k} P\bigl(\mathrm{frac}(y - y_j)\bigr)\,\mathrm{d}y,$$

where $\mathcal{I}_k := [y_k, y_k + 1/N)$ represents the interval on the ring that is 'rounded to' $y_k$. This results in a Markovian stochastic process that emits a symbol from the finite alphabet $\{y_k\}$ at each time-step, whose dynamics are governed by the stochastic matrix with elements $p_{jk}$. As $n \to \infty$, the statistics of $\mathcal{P}_n$ approach those of $\mathcal{P}$, at the potential cost of tracking more information (footnote 8).
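As a concrete illustration, this discretization can be sketched numerically. The following Python snippet is our own construction (the name `transition_matrix` and the midpoint-rule integration are illustrative choices, not from the original): it builds the stochastic matrix $p_{jk}$ for a Gaussian shift function wrapped onto the ring.

```python
import numpy as np

def transition_matrix(shift_pdf, n, samples_per_bin=64):
    """Discretize a cyclic random walk to n bits of precision.

    Returns the N x N stochastic matrix whose entry [j, k] is the
    probability that a bead at y_j = j/N lands anywhere in the interval
    of width 1/N that is rounded to y_k. shift_pdf is the density P(X)
    on the unit ring; the integral over each destination interval is
    approximated by the midpoint rule with samples_per_bin samples.
    """
    N = 2 ** n
    p = np.zeros((N, N))
    for k in range(N):
        # midpoint samples of the destination interval [k/N, (k+1)/N)
        y = (k + (np.arange(samples_per_bin) + 0.5) / samples_per_bin) / N
        for j in range(N):
            x = np.mod(y - j / N, 1.0)         # required shift, modulo 1
            p[j, k] = shift_pdf(x).mean() / N  # midpoint-rule integral
    return p

# Example: a Gaussian shift function wrapped onto the ring (sigma << 1)
sigma, mu = 0.1, 0.0
def wrapped_gauss(x):
    # sum a few periodic images so the density is correct near 0 and 1
    return sum(np.exp(-(x - mu - m) ** 2 / (2 * sigma ** 2))
               for m in (-1, 0, 1)) / (np.sqrt(2 * np.pi) * sigma)

p = transition_matrix(wrapped_gauss, n=5)
print(np.allclose(p.sum(axis=1), 1.0))  # rows sum to 1: a stochastic matrix
```

Each row sums to one because it integrates the (wrapped) shift function over the whole ring, so the result is a valid stochastic matrix.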

Classical simulation costs scale with precision
We can formally describe simulators using the tools of computational mechanics [10-13]. A simulator of a process is a device whose future output behaviour, conditioned on any particular past, is statistically indistinguishable from the process itself. Specifically, let the state of the simulator at each time be $s_t$, such that in the subsequent time-step it can output $y_{t+1}$ and transition to state $s_{t+1}$. For this device to be a statistically faithful simulator of a process $P(\overleftarrow{Y}, \overrightarrow{Y})$, we require that:

1. For each specific past $\overleftarrow{y}$ at each time $t$, we can deterministically configure the device using a function $f$ into some state $s = f(\overleftarrow{y})$, such that it will produce future outputs $\overrightarrow{y}$ with probability $P(\overrightarrow{Y} = \overrightarrow{y} \mid \overleftarrow{Y} = \overleftarrow{y})$.
2. If a simulator is in state $s_t = f(\overleftarrow{y})$ at time $t$, and outputs $y_t$ in the subsequent time-step, its internal state must then transition to $s_{t+1} = f(\overleftarrow{y}\,y_t)$.

The first condition ensures the simulator can be initialised to simulate the desired conditional future statistics; the second that a correctly initialised simulator continues to exhibit statistically correct behaviour at every time-step. The memory cost of the simulator corresponds to the storage requirements of this internal state. This cost is bounded from below by the information entropy of the random variable $S := f(\overleftarrow{Y})$. In the asymptotic limit of many independent identically distributed copies of the simulator, this bound is tight, as the ensemble of states may be compressed (such as by Shannon's noiseless encoding theorem [21], or Schumacher compression [22, 23]). Physically, a simulator can be viewed as a communication channel in time: it represents the exact object Alice must give to Bob at each time-step that captures sufficient past information for Bob to replicate the process's conditional future behaviour. $f$ is known as the encoding function, which describes how the past is encoded within the channel.

Figure 1. Cyclic random walk. At each time step, the system stochastically hops from state $y_t \in [0, 1)$ to $y_{t+1} = \mathrm{frac}(y_t + x)$. As $x$ is chosen according to the real random variable $X$, the current value of the system is itself described by a sequence of real random variables.

Footnote 8: An alternative discretization is to calculate the transition probabilities by assuming the initial value of $y_t$ is uniformly distributed in $\mathcal{I}_j$. This yields asymptotically identical statistics as $N \to \infty$, and does not change the results of this article.
The memory cost of the provably optimal classical simulator, known as the statistical complexity $C_\mu$, is extensively studied in complexity science [10]. This value captures the absolute minimum memory any classical simulator of a process must store, and is thus a prominent quantifier of a process's structure and complexity (footnote 9) (e.g. [13-20]). Such an optimal simulator can be explicitly constructed, and corresponds to the simulator that stores in its internal memory the causal states of the process [10, 11]: defined by an encoding function $f$ such that $f(\overleftarrow{y}) = f(\overleftarrow{y}')$ if and only if $P(\overrightarrow{Y} \mid \overleftarrow{Y} = \overleftarrow{y}) = P(\overrightarrow{Y} \mid \overleftarrow{Y} = \overleftarrow{y}')$ (i.e. the conditional futures of $\overleftarrow{y}$ and $\overleftarrow{y}'$ coincide).
In our cyclic random walks, each $\mathcal{P}_n$ is a first-order Markov process: the statistics of future outcomes depend only on the most recent value of $Y_t$. When this example is discretized, the causal states are thus typically in one-to-one correspondence with the $2^n$ discrete values that $Y$ can take (footnote 10). That is, $\mathcal{P}_n$ has $2^n$ causal states, labelled $\{s_j\}$, where $s_j$ corresponds to the set of pasts ending in $Y_0 = y_j$. When the simulator has been running for a sufficiently long time, the probability distribution over the internal memory converges on its steady state, $P(S = s_i) = 1/N$ for each $i$, in which all causal states occur with equal probability. Thus, the classical statistical complexity $C_\mu = \log_2 N = n$ scales linearly with the precision.
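This linear scaling can be checked directly. A minimal sketch (the circulant construction via `np.roll` is our own illustration, standing in for any rotationally symmetric transition matrix): the uniform distribution is stationary, so the entropy of the causal-state ensemble is exactly $n$ bits.

```python
import numpy as np

n = 8
N = 2 ** n
rng = np.random.default_rng(0)
kernel = rng.random(N)
kernel /= kernel.sum()                  # transition probabilities out of site 0
# rotational symmetry: row j is the kernel cyclically shifted by j
p = np.stack([np.roll(kernel, j) for j in range(N)])

pi = np.full(N, 1.0 / N)                # uniform distribution over causal states
print(np.allclose(pi @ p, pi))          # -> True: uniform is the steady state

C_mu = -np.sum(pi * np.log2(pi))        # classical memory cost in bits
print(C_mu)  # -> 8.0
```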

Quantum simulators are memory-efficient
It has recently been shown that quantum processors have the capability to simulate stochastic processes with less memory than is classically possible [25-29]. Here, we construct an explicit quantum simulator for the cyclic random walk. Instead of storing each causal state $s_j$ directly, our quantum simulator stores a corresponding quantum state

$$|S_j\rangle = \sum_k \sqrt{p_{jk}}\,|k\rangle,$$

where $\{|k\rangle\}$ forms an orthonormal basis. The stationary state of the quantum simulator is then given by the quantum ensemble state $\rho = \frac{1}{N}\sum_j |S_j\rangle\langle S_j|$ (as all quantum states occur with equal probability). Thus the memory required to store these states is given by the von Neumann entropy $H_Q = -\mathrm{Tr}(\rho\log\rho) = -\sum_k \lambda_k \log \lambda_k$, where $\lambda_k$ are the eigenvalues of $\rho$. The key improvement here is that the $\{|S_j\rangle\}$ are not in general mutually orthogonal, and thus $H_Q$ is generally less than $C_\mu$. Nevertheless, a quantum circuit (outlined in figure 2, with details in the appendix) acting on these quantum states will produce statistically identical outputs to the classical simulator.

Footnote 9: The statistical complexity is distinct from algorithmic information (Kolmogorov-Chaitin complexity). Statistical complexity is, as the name would imply, intrinsically statistical, concerned with the replication of the statistical behaviour of a process; whereas algorithmic information relates to the compressibility of an exact string [24].

Footnote 10: There are exceptions, such as when $P(X)$ is uniform and the system jumps to a completely random point at each time-step; here there is only one causal state for all $N$, because the current position no longer affects the future outcomes at all.
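To make this concrete, one can build $\rho$ for a discretized walk and compare its von Neumann entropy with the classical cost $C_\mu = n$. The following is a sketch under our own choices (the wrapped-Gaussian kernel and the helper name `quantum_memory_cost` are illustrative):

```python
import numpy as np

def quantum_memory_cost(p):
    """Von Neumann entropy (in bits) of rho = (1/N) sum_j |S_j><S_j|,
    where |S_j> = sum_k sqrt(p_jk) |k> (real amplitudes)."""
    N = p.shape[0]
    S = np.sqrt(p)                # row j holds the amplitudes of |S_j>
    rho = S.T @ S / N             # ensemble state of the memory
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-15]        # discard numerical zeros
    return -np.sum(lam * np.log2(lam))

n, sigma = 6, 0.1
N = 2 ** n
x = np.arange(N) / N
kern = sum(np.exp(-(x - m) ** 2 / (2 * sigma ** 2)) for m in (-1, 0, 1))
kern /= kern.sum()                               # wrapped-Gaussian p_{0k}
p = np.stack([np.roll(kern, j) for j in range(N)])

H_Q = quantum_memory_cost(p)
print(H_Q < n)  # -> True: well below the classical cost of n bits
```

The overlap between neighbouring $|S_j\rangle$ is what drives the entropy below $n$ bits.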
The von Neumann entropy of a quantum state is equal to the Shannon entropy of the outcome statistics of a projective measurement on that state, minimised over all choices of projective measurement.This minimisation corresponds to a measurement in the basis in which the stateʼs density matrix is diagonal.A classical probability distribution maps onto a mixed quantum state, diagonal in a fixed basis.As such, the stationary state of the classical simulator can be assigned a quantum state, whose von Neumann entropy is exactly that distributionʼs Shannon entropy.This allows us to compare the entropic cost of the classical and quantum machines' memories on an equal footing.

Unbounded advantage of quantum memory
We now come to the main claim of our paper: there are stochastic processes that can be simulated to infinite precision using a finite amount of quantum memory.
Explicitly, we show that for certain cyclic processes, the quantum ensemble state's eigenvalues satisfy $-\sum_k \lambda_k \log \lambda_k \leq \Omega$ for some finite value $\Omega$. Our result relies on first observing that the eigenvalues $\lambda_k$ can be directly related to the transition probabilities $\{p_{jk}\}$ via the relation

$$\lambda_k = \bigl|\mathcal{F}\bigl(\sqrt{p_{j0}}\bigr)_k\bigr|^2,$$

where $\mathcal{F}$ denotes the discrete Fourier transform (DFT), $\mathcal{F}(x_j)_k = \frac{1}{\sqrt{N}}\sum_j x_j \exp(-2\pi\mathrm{i}jk/N)$. (The proof relies on invoking the cyclic symmetry of the process, and hence of the transition probabilities, and is explicitly derived in the appendix.) The spread of $p_{j0}$ (as a function of $j$) is an indicator of how quickly a particle diffuses in the random walk. Thus, the Fourier-like relation between $p_{j0}$ and $\lambda_k$ indicates an inverse relationship between the amount of diffusion in the cyclic process and the spread of eigenvalues. The greater the variance of $X$, the more quickly a particle diffuses, and the smaller the spread of $\lambda_k$, resulting in a reduced quantum memory requirement. We now show that for some natural examples, this reduction is sufficiently large that $H_Q$ remains bounded for all $n$ (as illustrated in figure 3).
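This relation is easy to verify numerically for a randomly chosen circulant transition matrix (our own sketch): diagonalizing $\rho$ directly and taking the unitary DFT of the square-rooted kernel give the same spectrum.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 64
kern = rng.random(N)
kern /= kern.sum()                                  # transition kernel out of site 0
p = np.stack([np.roll(kern, j) for j in range(N)])  # cyclic symmetry: circulant

S = np.sqrt(p)
rho = S.T @ S / N                                   # (1/N) sum_j |S_j><S_j|
direct = np.sort(np.linalg.eigvalsh(rho))

# eigenvalues as the squared magnitude of the unitary DFT of the
# square-rooted kernel
via_dft = np.sort(np.abs(np.fft.fft(np.sqrt(kern)) / np.sqrt(N)) ** 2)
print(np.allclose(direct, via_dft))  # -> True
```

The FFT route costs $O(N \log N)$ rather than the $O(N^3)$ of direct diagonalization, which is what makes high-precision spectra tractable.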
Example 1 (Gaussian noise). A cyclic process rotating at a constant rate subject to Gaussian noise has a shift function given by a Gaussian distribution $G_{\mu,\sigma}(x) = \frac{1}{\sqrt{2\pi}\sigma}\exp\bigl(-\frac{(x-\mu)^2}{2\sigma^2}\bigr)$ about mean $\mu$ with standard deviation $\sigma$. Here, $\mu$ characterises the average velocity (in terms of the variable's mean displacement per time-step), and $\sigma$ the size of the fluctuations. When $\mu = 0$, this process corresponds to Gaussian diffusion. For our analysis, we take $\sigma \ll 1$ and thus ignore fluctuations where the particle travels more than a complete loop around the ring in a single time-step (a value of $\sigma = 0.1$ ensures that such events are less likely than one part in a million). As can be seen in figures 3(a) and (c), as the desired precision increases, the memory cost of simulating this process quickly converges onto a constant determined by the fluctuation strength $\sigma$; ultimately, infinite-precision simulation is possible using only a finite quantum memory. This behaviour may be understood analytically by seeing that for large $N$, the eigenvalues associated with the quantum simulator's internal memory are also given by samples from a Gaussian distribution: $\lambda_k \approx 2\sqrt{2\pi}\,\sigma\exp(-8\pi^2\sigma^2 k^2)$ for $k = -N/2, \ldots, N/2 - 1$, where for convenience we have cyclically offset the labels of the eigenvalues' indices by $N$ (proof in appendix). This demonstrates that increasing $\sigma$ tightens the spread of eigenvalues, and thus reduces the memory requirement for the quantum simulator.

Figure 2. Quantum circuit for the simulator. The internal memory holds a quantum state that encodes the past. At $t = 0$, an ancillary system, initialised in state $|S_0\rangle$, is fed into the simulator. A controlled unitary is then enacted such that $U: |j\rangle|S_0\rangle \mapsto |j\rangle|S_j\rangle$. The state of the ancillary system and memory are then coherently swapped, and the ancillary system is emitted as output. Measurement of the ancillary system then correctly samples $Y_1$. Iteration of this procedure generates output behaviour statistically identical to that of the original process.
In the appendix, we prove that as the precision $n = \log_2 N$ increases, the sum $\lim_{N\to\infty} -\sum_k \lambda_k \log \lambda_k$ converges on a finite value, bounded (in bits) by an expression whose leading term for small $\sigma$ scales as $\log_2 \sigma^{-1}$. Thus, for any fixed $0 < \sigma \ll 1$, the Gaussian random walk may be simulated to arbitrarily high precision using a quantum simulator of bounded entropy. Moreover, this also implies an unbounded divergence between the classical statistical complexity and the quantum statistical complexity $C_Q$ [26, 30], which is upper bounded by $H_Q$.
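The convergence claimed here can be observed numerically: the quantum cost computed via the DFT relation saturates as $n$ grows, while the classical cost is $n$ itself. A sketch under our own choices (the function name and the truncation of the wrapped Gaussian to a few periodic images are illustrative):

```python
import numpy as np

def quantum_cost_gaussian(n, sigma=0.1, mu=0.0):
    """Quantum memory cost (bits) of the n-bit Gaussian cyclic walk,
    computed from the DFT of the square-rooted transition kernel."""
    N = 2 ** n
    x = np.arange(N) / N
    kern = sum(np.exp(-(x - mu - m) ** 2 / (2 * sigma ** 2))
               for m in (-2, -1, 0, 1, 2))          # wrap onto the ring
    kern /= kern.sum()
    lam = np.abs(np.fft.fft(np.sqrt(kern)) / np.sqrt(N)) ** 2
    lam = lam[lam > 1e-18]                           # drop numerical zeros
    return -np.sum(lam * np.log2(lam))

costs = [quantum_cost_gaussian(n) for n in range(4, 12)]
print([round(c, 3) for c in costs])
# the sequence plateaus: increasing the precision n stops costing memory
```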
Example 2 (Uniform white noise). In the second example, we consider a particle that is perturbed by uniformly distributed noise. At each time-step, the particle can move anywhere in the range $\mu \pm \Delta$ from its current position with uniform probability, where $\Delta < 1/2$. Again, $\mu$ characterises the average velocity, and here $\Delta$ the size of the fluctuations. The associated shift function is a top-hat function that has a uniform value of $\frac{1}{2\Delta}$ on $[\mu - \Delta, \mu + \Delta]$ and 0 everywhere else.
The entropy of the quantum simulator, $H_Q$, is plotted for various precisions in figures 3(b) and (d). We see that for any fixed $\Delta > 0$, the quantum memory required by our simulator converges to a bounded value. As in the Gaussian scenario, the quantum simulator can replicate a classical simulation to any given precision with finite entropy. In the appendix, we prove this analytically. We show that as $N \to \infty$, the entropy remains finite, bounded above by a constant plus a term scaling as $\log_2 \Delta^{-1}$; numerically, the constant term evaluates to 3.067 bits. In particular, for large $N$, the eigenvalues of the relevant ensemble are given by $\lambda_k \approx 2\Delta\,\mathrm{sinc}^2(2\Delta k)$ for $k = -N/2, \ldots, N/2 - 1$, where $\mathrm{sinc}(x) = \sin(\pi x)/(\pi x)$ is the normalised sinc function. Larger values of $\Delta$ result in a smaller spread of eigenvalues, and hence in a smaller $H_Q$. For any given $\Delta > 0$, the entropy is finite in the limit $N \to \infty$. This establishes a second natural example where the quantum simulator can demonstrate an unbounded memory advantage over its best possible classical counterpart.
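The sinc-squared form can be checked against the exact discrete spectrum (our own numerical sketch; the tolerance is loose because the asymptotic form ignores $O(1/N)$ edge effects):

```python
import numpy as np

n, Delta = 10, 0.2
N = 2 ** n
x = np.arange(N) / N
d = np.minimum(x, 1.0 - x)            # distance from 0 around the ring
kern = (d <= Delta).astype(float)     # top-hat shift function, mu = 0
kern /= kern.sum()

lam = np.abs(np.fft.fft(np.sqrt(kern)) / np.sqrt(N)) ** 2
k = np.arange(20)                     # low frequencies (FFT order: 0, 1, 2, ...)
pred = 2 * Delta * np.sinc(2 * Delta * k) ** 2   # asymptotic eigenvalues
print(np.allclose(lam[:20], pred, atol=0.01))    # -> True
```

Note that `np.sinc` already uses the normalised convention $\sin(\pi x)/(\pi x)$, matching the text.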

The origin of quantum advantage
The source of classical inefficiency can be understood by considering dynamics on causal states. Consider two instances of $\mathcal{P}_n$: one where $Y_0 = y_j$, and the other where $Y_0 = y_{j+1}$. As their conditional future statistics differ, a classical simulator must be configured differently for each instance (corresponding to being initialised in one of two different causal states, $s_j$ or $s_{j+1}$). Nevertheless, there is finite probability that at the next time-step, both instances of the process emit the same output (up to precision $n$). Should this happen, we would not be able to use the current state of the machine to determine the causal state it was in at the previous time-step. That is, there is some probability that the distinction between $s_j$ and $s_{j+1}$ will never be reflected in the future statistics of the process, a phenomenon known as crypticity [27, 31]. As $n$ increases, this occurs with greater likelihood (tending to unit probability as $n \to \infty$), and thus proportionally more information is wasted. Ultimately, in the limit of high precision, a vanishingly small proportion of the information stored in the classical memory is pertinent to the statistical behaviour of the process's future.
Quantum simulators compensate for this waste by mapping these causal states to non-orthogonal quantum states. The quantum states (equation (4)) associated with neighbouring causal states, $|S_j\rangle$ and $|S_{j+1}\rangle$, also become increasingly similar with increasing $n$, resulting in progressively greater savings. Consider the Gaussian scenario, where $H_Q$ is bounded by equation (6). For small $\sigma$, the memory cost scales as $\log_2 \sigma^{-1}$, such that halving the standard deviation of the fluctuations at each time-step adds one bit to the memory cost of the quantum simulator. The standard deviation of the shift function sets an effective length scale over which the system must be simulated classically. The future statistical behaviour of two systems initially prepared at points separated by more than one standard deviation is typically distinguishable, and so these points must be stored as nearly orthogonal quantum states at some memory cost. On the other hand, when two points are initially closer than the standard deviation scale, the probability that they could be distinguished by their future behaviour diminishes, and they may be represented by increasingly overlapping quantum states. In this regime, a fixed finite memory can accommodate any desired precision.
We gain further insight into the origins of the quantum advantage by considering the cases where it does not appear: $\sigma = 0$ and $\Delta = 0$. In both these cases, the shift function is a Dirac delta distribution. As such, no matter how high the precision, by observing the future outputs it will always be possible to distinguish whether the system came from some site $s_j$ or its neighbour $s_{j+1}$; the dynamics of the system are wholly reversible. If $s_j$ always transitions to $s_k$ and $s_{j+1}$ always to $s_{k+1}$, being able to distinguish between these two sites is crucial to produce the correct statistical behaviour, even as the precision increases. As such, the quantum simulator cannot tolerate overlap between the states $|S_j\rangle$ and $|S_{j+1}\rangle$, and must store them orthogonally (allowing them to be distinguished). In this scenario, the quantum simulator cannot demonstrate any advantage in memory cost over its classical analogue.

Discussion and outlook
In this article, we presented a task in which quantum mechanics has an unbounded memory advantage over the most memory-efficient classical alternative: the simulation of a classical cyclic stochastic process.We found that the classical simulator has a memory requirement that scales linearly with the precision required, while the quantum simulatorʼs requirement may be bounded by a finite value, even at arbitrarily high fixed precision.This establishes a rare scenario where the scaling advantage of quantum processing can be provably established.
This finding leads to a number of natural open questions, the first being of generality. Certainly, the examples presented are sufficiently simple that such divergences are unlikely to be merely a mathematical oddity. The unbounded quantum advantage relies on $\{\mathcal{P}_n\}$ having two properties: (a) the number of causal states grows with $n$, and (b) the conditional future statistics $P(\overrightarrow{Y} \mid S = s_i)$ of different causal states converge sufficiently quickly with $n$. If these conditions can be formalised, we may be able to establish similar divergences in much more general scenarios, such as the simulation of non-Markovian or non-cyclic processes. Beyond the von Neumann entropy, it would be interesting to ask whether similar scaling can be found for other metrics of memory cost, such as the dimension: namely, whether there is an encoding that allows for simulation to arbitrary precision using a Hilbert space of bounded dimension. Meanwhile, the inefficiency of classical simulators has been shown to directly result in unavoidable increased heat dissipation [32-34]. This hints that quantum processing may allow significant energetic savings for stochastic simulation, especially for systems that become increasingly difficult to simulate as they scale in size.
On a foundational level, the statistical complexity is often regarded as a fundamental measure of a process's intrinsic structure, the rationale being that it quantifies the minimal amount of information about a process's history that must be recorded to allow for predictions about that process's future behaviour. The measure has been applied to understand structure within diverse complex settings: from the dynamics of neurons [14] and the stock market [18], to quantifying self-organisation [15], among other examples [16, 17, 19, 20]. The discovery of more efficient quantum models has led to the idea that the complexity of a system depends on what sort of information we use to observe it [26, 30]. In this context, our results establish a family of processes that can look ever more complex classically, but remain simple quantum-mechanically. It would be fascinating to see if divergences between quantum and classical complexities can be found in existing studies, such as the examples above. Could it be that these systems appear complex classically, but look much simpler when viewed through the lens of quantum theory?

Appendix
Classical costs from computational mechanics

We here present some minimal details from the mathematical framework of computational mechanics [10-13] to substantiate the claim that the classical simulator's minimal memory cost is equal to the precision $\log N$. In computational mechanics, the evolution of a dynamical property (over domain $\mathcal{Y}$) is characterised by a discrete-time stochastic process $\mathcal{P}$, written as a bi-infinite sequence of random variables $\overleftrightarrow{Y} = \overleftarrow{Y}^t\,\overrightarrow{Y}^t$, where $\overleftarrow{Y}^t = \ldots Y_{t-2}Y_{t-1}Y_t$ is the infinite string of random variables occurring before (and including) time $t$. For stationary processes (such as the time-independent cyclic random walks described in this article), this distribution has no explicit time dependence, so we omit the superscript $t$.
A faithful simulator of process $\mathcal{P}$ is a machine (or programme) that, having been initialised in accordance with the observation of past $\overleftarrow{y}^t$, then generates a series of outputs $\overrightarrow{y}^t$ according to the distribution $P(\overrightarrow{Y} \mid S = f(\overleftarrow{y}^t))$, where $S = f(\overleftarrow{Y})$ is the random variable describing the internal state of the simulator (formed by applying the function $f$ to each variate of $\overleftarrow{Y}$). Moreover, once initialised into state $s_t$, when the simulator outputs $y_t$ in the subsequent time-step, its internal state must then transition to the state $s_{t+1} = f(\overleftarrow{y}\,y_t)$ (where $\overleftarrow{y}\,y_t$ indicates the concatenation of $y_t$ onto the end of the string $\overleftarrow{y}$).
The memory cost of such a simulator is given by the information entropy of $S$, $H(S) = -\sum_s P(S = s)\log P(S = s)$. The function $f$ that minimises this classically corresponds to identifying the causal state of a particular past [10, 11]. The causal states are unique for any given process, and so their entropy $H(S)$ is a property of the process itself, known as its statistical complexity $C_\mu$, capturing the intuition that a more complex process requires more memory to simulate.
For Markovian processes, such as those discussed in this article, the number of causal states required is equal to the number of unique rows in the stochastic matrix describing the evolution. When these rows are generated by the discretization of a continuous process into $N$ divisions, such as when they are derived from the cyclic walk's shift function $P(X)$, the number of states will be equal to $N$, except for very specific (e.g. pathologically fractal) choices of $P(X)$ and $N$. Since by symmetry the probability of the simulator being in any particular state is equal, the classical memory cost of a simulator hence scales with the number of sites as $\log N$, or linearly with the precision $n = \log_2 N$.

Details of the quantum circuit in figure 2
Let us consider figure 2 in more depth (see also [25]). The circuit consists of one persistent internal memory state, and an 'output tape': a line of quantum states, which are fed into the system one at a time. Suppose each state on the output tape is initialised into some arbitrary state $|\phi\rangle$. For any two quantum states $|x\rangle$ and $|y\rangle$ in the same Hilbert space, it is always possible to construct a unitary transformation $V$ such that $V|x\rangle = |y\rangle$: a rotation in the two-dimensional subspace spanned by $|x\rangle$ and $|y\rangle$ that acts as the identity on the orthogonal complement. (Note: the orthogonality of $\{|j\rangle\}$ allows us to use the above construction pairwise, once for each $j$, to build the required controlled operation.) For a Markovian process discretized such that the stochastic matrix with elements $p_{jk}$ describes its evolution, the above prescription supplies the unitary operation required for our quantum simulator when we set each $|S_j\rangle = \sum_k \sqrt{p_{jk}}\,|k\rangle$, as per equation (4) (the states $\{|k\rangle\}$ and $\{|j\rangle\}$ are in the same basis). We may now evaluate the action of a single time-step (grey dashed box within figure 2). Here, the joint Hilbert space corresponds to that of the internal memory together with the output tape. Starting from memory state $|S_j\rangle$ and a blank tape state, the circuit produces the entangled state $\sum_k \sqrt{p_{jk}}\,|S_k\rangle|k\rangle$ at the end of the grey box. The tape system is then ejected from the simulator. If one were to measure this state in the $\{|k\rangle\}$ basis, one projects onto state $|k\rangle$ with probability $p_{jk}$, and hence the output statistics of this measurement match those of the process being simulated. Moreover, due to the entanglement, we know that when $|k\rangle$ is measured, the internal memory must be in state $|S_k\rangle$, which is exactly the quantum state that would have been prepared had we mapped the output onto a classical causal state and then prepared $|S_k\rangle$ directly. Hence, the quantum circuit in figure 2 can function as a discretized simulator for a Markovian process.
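The consistency of this construction can be sanity-checked numerically (our own sketch, with an arbitrary circulant matrix): the map $|S_j\rangle|0\rangle \mapsto \sum_k \sqrt{p_{jk}}\,|S_k\rangle|k\rangle$ preserves all inner products, so a unitary with this action exists, and measuring the tape register reproduces the transition probabilities.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 16
kern = rng.random(N)
kern /= kern.sum()
p = np.stack([np.roll(kern, j) for j in range(N)])
S = np.sqrt(p)                        # S[j] = amplitudes of |S_j>

def step_state(j):
    """Joint memory (x) tape amplitudes after one time step,
    sum_k sqrt(p_jk) |S_k>|k>, starting from memory |S_j>."""
    psi = np.zeros((N, N))
    for k in range(N):
        psi[:, k] = np.sqrt(p[j, k]) * S[k]
    return psi

# a unitary realizing the step exists iff inner products are preserved
G_in = S @ S.T                        # <S_j|S_l> (blank tapes contribute 1)
G_out = np.array([[np.sum(step_state(j) * step_state(l)) for l in range(N)]
                  for j in range(N)])
print(np.allclose(G_in, G_out))       # -> True

# measuring the tape register in the {|k>} basis samples p_{jk}
j = 3
probs = np.sum(step_state(j) ** 2, axis=0)
print(np.allclose(probs, p[j]))       # -> True
```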
However, it is very important to note that there is no need whatsoever to measure the output tape in the $\{|k\rangle\}$ basis for the quantum simulator to continue functioning. If it suits one's purpose to store the output states in quantum memory (e.g. to perform further quantum information processing on the output data), then the quantum simulator still functions correctly. In this mode of operation, the measurements can be omitted from figure 2, and after $M$ steps the simulator will have produced an entangled state of the memory and the $M$ tape systems, in which the memory component $|S(S_t, y_{i_1}\ldots y_{i_M})\rangle$ paired with each output string $y_{i_1}\ldots y_{i_M}$ is the quantum state that would have been prepared if the system, originally in causal state $S_t$, had outputted the string $y_{i_1}\ldots y_{i_M}$ and a new causal state had been directly set according to this output sequence. Measuring the string of output tape subsystems thus still ensures that the internal memory state collapses into the correct causal state $|S_{t+M}\rangle$, conditional on the string observed. In the first mode of operation (as drawn in figure 2), only one ancillary quantum system is required, as it can be reset and re-used between time-steps (the output tape carries away classical information only). In the second mode, the quantum output explicitly fulfils the role of the ancillary system, and a fresh ancillary system (provided by the 'blank' output tape set to some fixed choice of pure quantum state) is inserted at each time-step. In both modes, the ancillary system does not need to persist between time-steps in order for the simulator to continue producing statistically correct outputs. As such, in both cases, it is the von Neumann entropy $-\mathrm{Tr}(\rho\log\rho)$ of the first subsystem, which remains within the simulator at all times, that we consider to be the internal memory cost.

Derivation of discrete eigenspectrum
The quantum machine state corresponding to the system being in classical state $\alpha$ is given as $|S_\alpha\rangle = \sum_j \sqrt{p_{\alpha j}}\,|j\rangle$. By the rotational symmetry of the walk, $p_{\alpha j} = p_{0(j-\alpha)}$, with indices taken modulo $N$ (that is, the transition probabilities depend only on differences between indices). It hence follows that the Gram matrix $g_{\alpha\beta} = \langle S_\alpha | S_\beta\rangle$ associated with $\rho$ is circulant [35]. Since all rows can be derived by cyclic permutation of the top row, we shall drop one index and write the top row as $g_\alpha = g_{0\alpha}$. The eigenvalues of the Gram matrix are given by $\sum_\alpha g_\alpha \exp(-2\pi\mathrm{i}\alpha k/N)$, which can immediately be recognised as the DFT of $\{g_\alpha\}$. Moreover, the inner product $g_\alpha = \langle S_0|S_\alpha\rangle = \sum_j \sqrt{p_{0j}\,p_{\alpha j}}$ has the form of a convolution $(\sqrt{p} * \sqrt{q})_\alpha$, where we have rewritten $p_{\alpha j}$ as $q_{0(j-\alpha)}$ such that $q$ is the $N$-periodic extension of the reflection of $p$: $q_{0j} = p_{0(N-j)}$. We may then apply the circular convolution theorem to find the eigenvalues of $g$, and therefore of $\rho$ (noting that $\rho = \frac{1}{N}\sum_\alpha |S_\alpha\rangle\langle S_\alpha|$ shares its nonzero spectrum with $g/N$):

$$\lambda_k = \Bigl|\frac{1}{\sqrt{N}}\sum_j \sqrt{p_{0j}}\,\exp(-2\pi\mathrm{i}jk/N)\Bigr|^2.$$

These eigenvalues can hence be found efficiently by numerical algorithms, such as the fast Fourier transform.
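Each step of this derivation can be verified numerically for a randomly chosen cyclic kernel (our own sketch): the Gram matrix comes out circulant, and its DFT matches the squared magnitude of the DFT of $\sqrt{p_{0j}}$, as the circular convolution theorem asserts.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 32
kern = rng.random(N)
kern /= kern.sum()                                           # p_{0j}
S = np.stack([np.roll(np.sqrt(kern), a) for a in range(N)])  # |S_alpha>

G = S @ S.T                                          # Gram matrix <S_a|S_b>
# circulant: every row is a cyclic shift of the top row
print(all(np.allclose(G[a], np.roll(G[0], a)) for a in range(N)))  # -> True

# eigenvalues of G: the DFT of the top row ...
eig_dft = np.sort(np.fft.fft(G[0]).real)
# ... which, by the circular convolution theorem, equals |DFT(sqrt(p_{0j}))|^2
eig_conv = np.sort(np.abs(np.fft.fft(np.sqrt(kern))) ** 2)
print(np.allclose(eig_dft, eig_conv))                # -> True
```

Dividing these values by $N$ gives the spectrum of $\rho$ itself.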
Example (Dirac-delta shift function). Let the shift function be $P(x) = \delta(x - \mu)$. It can be seen that all $p_{j0} = 0$ except for the one at index $j'$ that incorporates the delta peak, where $p_{j'0} = 1$, and so $\lambda_k = \bigl|\frac{1}{\sqrt{N}}\exp(-2\pi\mathrm{i}j'k/N)\bigr|^2 = \frac{1}{N}$ for every $k$.

Sampling Fourier transforms
It will be useful to show an auxiliary relationship between discrete and continuous Fourier transforms. Let $g(x)$ be a function over the range $x \in [0, 1]$ that is sampled at $N$ equally spaced points, with values $g_n = g(n/N)$. The continuous Fourier transform of the sampled function, when evaluated at integer $k$, is exactly the DFT of the samples $\{g_n\}$, which we write as $\{\lambda_k\}$. If $g$ is periodic, it is always possible to offset the position of the sample window of $g$ by some integer number of samples without changing the values of $g$'s DFT. For the functions we consider in this article, it is more convenient to centre the sample window about zero. Moreover, once the sample window has been set, the values of $g(x)$ outside this window cannot affect $\lambda_k$, since they do not feature in the sum. Thus, instead of considering sampling $g(x)$ across a finite window, we can consider an infinite delta train sampled at the same intervals, applied to a function $g_{\mathrm{once}}(x)$, where $g_{\mathrm{once}}(x) = g(x)$ inside the range of the sample window and $g_{\mathrm{once}}(x) = 0$ outside this range. Applying the convolution theorem, the periodic sampling of $g(x)$ causes the Fourier transform to be periodic with period $N$ (a phenomenon known as aliasing), such that $\lambda_k = \lambda_{k+N}$; the convolution with a delta train effectively makes $\lambda_k$ a periodic sum of $\mathcal{F}(g_{\mathrm{once}}(x))$. This periodicity allows us the freedom to choose a convenient range of $k$; in this article, we will typically use $k = -N/2, \ldots, N/2 - 1$. If $\mathcal{F}(g_{\mathrm{once}})$ is negligible outside the chosen range, then we can approximate $\lambda_k \approx \mathcal{F}(g_{\mathrm{once}})(k)$.

Footnote 11: This works by constructing a fictitious purification of $\rho$.

Asymptotic limit of eigenvalues
For large $N$, we can derive an expression for $\lambda_k$ in terms of the probability density function $P(x)$. We substitute $p_{0\alpha}$ with $\frac{1}{N}P(\alpha/N)$, which for Riemann-integrable $P(x)$ is an arbitrarily good approximation in the limit $N \to \infty$. Similarly, we may substitute $p_{j\alpha}$ with $\frac{1}{N}\mathring{P}((\alpha - j)/N)$, where $\mathring{P}(x)$ denotes the 1-periodic extension of $P(x)$. Taking the limit of the Riemann sum for a product of two functions, we then see that $g_j \to g(y) = \int \sqrt{P(x)\,\mathring{P}(x - y)}\,\mathrm{d}x$, where $y = j/N$. Moreover, since $P$ only has support in $[0, 1)$, we can rewrite the integral limits as $-\infty$ to $\infty$. Thus, by treating $g_j$ as samples from a function $g(y)$ for large $N$, and applying the sampling relation shown in equation (10), the eigenvalues $\{\lambda_k\}$ are given by $\lambda_k \approx \mathcal{F}(g_{\mathrm{once}})(k)$ for $k = -N/2, \ldots, N/2 - 1$, where $g_{\mathrm{once}}(y) = g(y)$ over an (arbitrary) single period of $g(y)$ and takes the value zero elsewhere; equivalently, since $g$ is an autocorrelation of $\sqrt{P}$, $\lambda_k \approx \bigl|\int \sqrt{P(x)}\,\mathrm{e}^{-2\pi\mathrm{i}kx}\,\mathrm{d}x\bigr|^2$. Due to the periodic summation, it can be seen also that $\lambda_k = \lambda_{k+N}$, and so we are also free to choose the most convenient range for $k$, which will typically be centred on zero. Provided $\mathcal{F}(g_{\mathrm{once}})$ is negligible outside the chosen range, the approximation is reasonable. This assumption amounts to taking enough samples of $g(x)$ to admit a faithful reconstruction of $g(x)$ under the Nyquist-Shannon theorem [36]. This holds true for the examples we shall now consider, where we will ultimately take large values of $N$.
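For a Gaussian shift function this asymptotic recipe gives eigenvalues sampled from $\bigl|\int\sqrt{P(x)}\,\mathrm{e}^{-2\pi\mathrm{i}kx}\,\mathrm{d}x\bigr|^2 = 2\sqrt{2\pi}\,\sigma\,\mathrm{e}^{-8\pi^2\sigma^2 k^2}$, which we can check against the exact discrete spectrum (our own numerical sketch; $\sigma$ is chosen small enough that wrap-around tails are negligible):

```python
import numpy as np

n, sigma = 10, 0.05
N = 2 ** n
x = np.arange(N) / N
kern = sum(np.exp(-(x - m) ** 2 / (2 * sigma ** 2)) for m in (-1, 0, 1))
kern /= kern.sum()                     # wrapped-Gaussian kernel, mu = 0

lam = np.abs(np.fft.fft(np.sqrt(kern)) / np.sqrt(N)) ** 2
k = np.fft.fftfreq(N, d=1.0 / N)       # integer frequencies 0, 1, ..., -1
pred = (2 * np.sqrt(2 * np.pi) * sigma
        * np.exp(-8 * np.pi ** 2 * sigma ** 2 * k ** 2))
print(np.allclose(lam, pred, atol=1e-5))  # -> True
```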
Example 1 (Gaussian noise). Suppose the shift function of the particle is given by a Gaussian distribution $G_{\mu,\sigma}(x)$ about mean $\mu$ with standard deviation $\sigma \ll 1$, such that we can ignore the probability of the particle looping around the ring.

Derivation of eigenvalues
We can express $\sqrt{G_{\mu,\sigma}(x)}$ as a rescaled Gaussian: $\sqrt{G_{\mu,\sigma}(x)} = (2\pi\sigma^2)^{-1/4}\exp\bigl(-\frac{(x-\mu)^2}{4\sigma^2}\bigr)$, i.e. a Gaussian profile of width $\sqrt{2}\,\sigma$. We also note that its Fourier transform is again Gaussian: $\mathcal{F}\bigl(\sqrt{G_{\mu,\sigma}}\bigr)(k) = (2\pi\sigma^2)^{-1/4}\,2\sigma\sqrt{\pi}\,\mathrm{e}^{-2\pi\mathrm{i}\mu k}\exp(-4\pi^2\sigma^2 k^2)$. Taking the squared modulus then provides an analytic solution for equation (14) for Gaussian shift functions: $\lambda_k \approx 2\sqrt{2\pi}\,\sigma\exp(-8\pi^2\sigma^2 k^2)$. Hence, we see that choosing a Gaussian shift function with standard deviation $\sigma \ll 1$ corresponds to a spectrum of eigenvalues with standard deviation $\frac{1}{4\pi\sigma}$.

Upper bound on quantum memory cost
We now demonstrate that the entropy of such a spectrum is bounded. Writing $H_Q = \sum_k c(k)$ with $c(k) = -\lambda_k \ln \lambda_k$, the stationary points of $c(k)$ occur at $k=0$ and at the pair of points where $\lambda_k = \mathrm e^{-1}$. When $\lambda_0 \leq \mathrm e^{-1}$, these last two solutions disappear, and since we are in the regime of $\sigma \ll 1$, this condition is satisfied. Hence, for small $\sigma$, $c(k)$ monotonically decreases from its maximum value at $k=0$ for both positive and negative $k$. This allows us to apply the Maclaurin–Cauchy integral bound (see e.g. [37]), which holds for any monotonically decreasing region $[m,\infty)$ of a function $c(k)$ (here, $m=0$):

$$ \sum_{k=1}^{\infty} c(k) \;\leq\; \int_0^{\infty} c(k)\,\mathrm dk. $$

Evaluating this integral for the spectrum in equation (20), we find that the positive-$k$ tail contributes at most $\tfrac14\ln(2\pi\mathrm e\,s^2)$ nats, with $s = 1/(4\pi\sigma)$. To obtain a bound on $H_Q$, we double the above since $c(k)$ is even, add the $k=0$ term $c(0)$, and multiply by $1/\ln 2$ to convert from nats to bits (equivalently, change the base from $\ln$ to $\log_2$). In terms of the shift function's standard deviation $\sigma$, this gives our result

$$ H_Q \;\leq\; \log_2\frac{1}{\sigma} \;-\; \frac{1}{2}\log_2\frac{8\pi}{\mathrm e} \;-\; 2\sigma\sqrt{2\pi}\,\log_2\!\left(2\sigma\sqrt{2\pi}\right). $$

In the limit of small $\sigma$, the leading term of the entropy thus scales with $-\log_2\sigma$, such that halving the standard deviation adds one bit to the maximum required quantum memory cost.

Example 2 (Uniform white noise). The normalised top-hat (rectangular) shift function allowing for jumps of up to $\Delta$ around a constant displacement $\mu$ is written

$$ S(x;\mu,\Delta) = \begin{cases} \dfrac{1}{2\Delta}, & |x-\mu| \leq \Delta,\\[4pt] 0, & \text{otherwise.} \end{cases} $$
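The one-bit-per-halving behaviour can be checked directly. A minimal sketch, assuming the Gaussian eigenvalue spectrum $\lambda_k = G(k; 0, 1/(4\pi\sigma))$ derived above (the function name, cutoff and $\sigma$ values are ours):

```python
import numpy as np

# Quantum memory cost H_Q = -sum_k lambda_k log2(lambda_k) for a Gaussian
# eigenvalue spectrum of standard deviation 1/(4*pi*sigma); halving sigma
# should add approximately one bit.

def quantum_cost_bits(sigma, kmax=5000):
    s = 1.0 / (4 * np.pi * sigma)
    k = np.arange(-kmax, kmax + 1)
    lam = np.exp(-k**2 / (2 * s**2)) / (s * np.sqrt(2 * np.pi))
    lam = lam[lam > 1e-300]     # drop underflowed tail terms before the log
    lam /= lam.sum()            # guard the normalisation of the spectrum
    return float(-(lam * np.log2(lam)).sum())

h1 = quantum_cost_bits(0.02)
h2 = quantum_cost_bits(0.01)    # sigma halved: expect roughly one more bit
```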

Derivation of eigenvalues
Taking the square root of this function alters its normalisation, but not its shape: $\sqrt{S(x;\mu,\Delta)} = (2\Delta)^{-1/2}$ for $|x-\mu| \leq \Delta$ and zero otherwise. Its Fourier transform is a sinc function whose envelope tends to 0 for large $k$, so we can approximate the values of $\lambda_k$ for large $N$ using equation (14), to find the eigenspectrum

$$ \lambda_k = 2\Delta\,\operatorname{sinc}^2(2\pi k\Delta), \qquad \operatorname{sinc}(x) := \frac{\sin x}{x}. $$

Upper bound on quantum memory cost

Through the careful deployment of mildly intimidating algebra, we can also derive an upper bound on the entropy cost of simulating the square shift function. The outline of the proof is as follows. To bound $\sum_k c(k)$, where $c(k) = -\lambda_k \ln \lambda_k$, we first construct a monotonically decreasing function $\delta(k)$ that satisfies $c(k) \leq \delta(k)$ at every $k$, and then show that $\sum_k \delta(k)$ is bounded from above. This sum will hence also upper-bound $\sum_k c(k)$. As with the Gaussian example, for algebraic convenience, we will use natural logarithms and only consider the region of positive $k$. In the final stage, we will convert from nats to bits, and use the evenness of $c(k)$ to arrive at the full bound.
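Before the algebra, the $\operatorname{sinc}^2$ eigenspectrum itself can be sanity-checked numerically. A sketch of our own construction (note that `np.sinc` uses the normalised convention $\sin(\pi x)/(\pi x)$, and that the effective half-width is measured from the discrete rectangle so edge bins are handled consistently):

```python
import numpy as np

# Top-hat shift function on a ring of N sites: compare the circulant
# Gram-matrix eigenvalues with 2*Delta*sinc^2(2*pi*k*Delta).

N, mu = 1024, 0.5
x = np.arange(N) / N
d = (x - mu + 0.5) % 1.0 - 0.5
p = np.where(np.abs(d) <= 0.1, 1.0, 0.0)   # top-hat of half-width ~0.1
p /= p.sum()

g_row = np.array([np.sqrt(p * np.roll(p, j)).sum() for j in range(N)])
lam = np.real(np.fft.fft(g_row)) / N

# Effective half-width of the *discrete* rectangle (counts the edge bins).
delta = p[p > 0].size / (2 * N)
k = np.arange(-20, 21)
# np.sinc(2*k*delta) == sin(2*pi*k*delta) / (2*pi*k*delta)
predicted = 2 * delta * np.sinc(2 * k * delta) ** 2
numeric = lam[k % N]
```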
Explicitly, we write

$$ c(k) = -\lambda_k \ln \lambda_k = -2\Delta\,\operatorname{sinc}^2(x)\,\ln\!\left[2\Delta\,\operatorname{sinc}^2(x)\right], $$

where we have made the substitution $x = 2\pi k\Delta$.
In the region $x > 0$, we can expand $\operatorname{sinc}^2(x) \leq \min(1, 1/x^2)$, yielding a bounding function $f(x)$. The function $-y\ln y$ has a maximum value of $\mathrm e^{-1}$, attained at $y = \mathrm e^{-1}$. However, as we plan to ultimately apply the Maclaurin–Cauchy integral convergence test, it is only convenient to use this upper bound in the region of $x$ where $f(x)$ monotonically decreases. We identify this region by setting the derivative of $f(x)$ to zero. However, once again consider $c(x)$: since it has the form of $-y\ln y$, it follows that in any region, $c(x) \leq \mathrm e^{-1}$.
Since the two bounding regimes must be joined at an integer value of $k$, we write the split point as $\lceil k_{\mathrm{split}}\rceil$, where $\lceil\,\cdot\,\rceil$ represents the lowest integer above (or including) its argument. This rounding is necessary since $k_{\mathrm{split}}$ is in general not an integer.

Figure 2 .
Figure 2. Circuit for a memory-efficient quantum simulator. The above circuit samples $P(\overrightarrow Y \mid \overleftarrow y)$ when supplied with the appropriate quantum state $|S_t\rangle$ that encodes the past. At $t=0$, an ancillary system, initialised in state $|S_0\rangle$, is fed into the simulator. A controlled unitary is then enacted, such that $U\colon |j\rangle|S_0\rangle \mapsto |j\rangle|S_j\rangle$.

Figure 3 .
Figure 3. Bounded quantum memory costs for unbounded precision. The memory required to simulate a cyclic random walk is plotted against the precision $N$ for the Gaussian and top-hat shift functions. In both examples, the quantum simulator has an unbounded memory advantage: the classical cost scales as $\log N$, whilst the quantum cost converges upon a constant value. The more strongly the shift function $P(X)$ diffuses the bead, the lower the limiting quantum memory requirement.
$|x_i'\rangle$ are states orthogonal to each other and to $|x\rangle$, and $|y_i'\rangle$ are states orthogonal to each other and to $|y\rangle$. Thus, in the joint Hilbert space $\mathcal H_N \otimes \mathcal H_N$ of two quantum systems of dimension $N$, it is possible to build a 'controlled' unitary operation $U$ containing the elements $|j\rangle\langle j| \otimes U_j$, which prepares an arbitrary (generally non-orthogonal) set of states $\{|S_j\rangle\}_{j=0}^{N-1}$.
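One concrete way to realise such a controlled unitary, offered as a sketch of our own rather than the paper's explicit circuit, is to complete each memory state $|S_j\rangle$ to a unitary block (here via a QR decomposition) and assemble $U = \sum_j |j\rangle\langle j| \otimes U_j$, so that $U|j\rangle|0\rangle = |j\rangle|S_j\rangle$:

```python
import numpy as np

def complete_to_unitary(state):
    """Return a unitary whose first column is `state` (assumed normalised)."""
    dim = len(state)
    a = np.eye(dim, dtype=complex)
    a[:, 0] = state
    q, _ = np.linalg.qr(a)
    # QR fixes the first column only up to a phase; rotate it back onto state.
    return q * np.vdot(q[:, 0], state)

N = 4
rng = np.random.default_rng(0)
p = rng.random((N, N))
p /= p.sum(axis=1, keepdims=True)     # stochastic matrix p_{jk}
states = np.sqrt(p)                   # |S_j> = sum_k sqrt(p_{jk}) |k>

# Block-diagonal controlled unitary: U = sum_j |j><j| (x) U_j.
blocks = [complete_to_unitary(states[j]) for j in range(N)]
U = np.zeros((N * N, N * N), dtype=complex)
for j in range(N):
    U[j * N:(j + 1) * N, j * N:(j + 1) * N] = blocks[j]
```

Only the first column of each block is constrained, which is exactly why the construction works for non-orthogonal $\{|S_j\rangle\}$: the ancilla always enters in the fixed state $|0\rangle$.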

Provided the stochastic matrix $\{p_{ba}\}$ is simply connected, the quantum machine will reach a stationary state $\rho$. Rather than directly calculating the entropy of $\rho$, we can instead evaluate the entropy of the associated Gram matrix $g$, whose elements $g_{ab}$ are given by the overlaps $\langle S_a | S_b\rangle$ weighted by the stationary probabilities. The circular symmetry of the cyclic random walk ensures that the discretized transition probabilities satisfy $p_{jk} = p_{j+1,k+1}$ (indices modulo $N$), so that this Gram matrix is circulant.

When the memory states are mutually orthogonal, the Gram matrix has eigenvalues $\lambda_k = 1/N$ for all $k$. Thus, the von Neumann entropy of the simulator's memory is $\log N$. Example (Uniform shift function). Consider the uniform shift function $P(x) = 1$ for $x \in [0,1)$. Here all the memory states coincide, and the Fourier coefficients of the Gram row vanish except at $k=0$. As such, we find that the eigenvalue $\lambda_0 = 1$ while $\lambda_k = 0$ for all other $k$, and the entropy of the Gram matrix is zero, for all values of $N$.
This function is independent of the constant displacement $\mu$. Indeed, non-zero $\mu$ only results in perfectly cancelling terms $\mathrm e^{2\pi\mathrm i k\mu}$ and $\mathrm e^{-2\pi\mathrm i k\mu}$ in the Fourier transform. Basic Fourier analysis tells us that $S(x;\Delta)$ transforms into a normalised sinc function, and the triangle function $g(y)$ (the autocorrelation of the top-hat) into the square of this.

We then upper bound $c(x)$ in the region of $x$ where the bound decreases monotonically, and note that $c(x) \leq \mathrm e^{-1}$ for all $x \geq 0$. At this point, it is convenient to express this again in terms of $k$, making the substitution $k = x/(2\pi\Delta)$; the resulting split point $k_{\mathrm{split}}$ is in general not an integer. To upper bound $c(k)$ at all points, we must round up this split between the regions of $k$, since $\mathrm e^{-1}$ upper bounds all $f(k)$. (I.e. being slightly too inclusive in the first region will result in a slightly higher value of $\delta(k)$ for the first $k$ satisfying $k \geq k_{\mathrm{split}}$.) Having derived our monotonically decreasing function $\delta(k)$, we are now in a position to show that its sum is bounded. Since for an upper bound it is fine if a term is counted twice, we evaluate the two regions separately. Firstly, we bound each of the $\lceil k_{\mathrm{split}}\rceil$ terms of the first region by $\mathrm e^{-1}$. Secondly, using the Maclaurin–Cauchy integral test (see e.g. [37]), we bound the tail sum by an integral; the second line follows by substituting $\delta(\lceil k_{\mathrm{split}}\rceil)$ with the maximum value of $\delta(k)$, and by failing to round up the lower bound of the integral (thus including an extra contribution of at most one additional term). Finally, we double the above ($c(k)$ is even, and equation (37) bounds only the region $[0,\infty)$), and we convert from nats to bits (by including a factor of $1/\ln 2$).

Here each random variable $Y_t$ governs the value $y_t \in [0,1)$ of the dynamical property at time $t$. The statistical behaviour of a process may be represented in a causal manner by writing it as the conditional probability distribution $P(\overrightarrow Y \mid \overleftarrow Y)$, with pasts grouped by the equivalence relationship $\overleftarrow y \sim \overleftarrow y{}'$ if and only if $P(\overrightarrow Y \mid \overleftarrow y) = P(\overrightarrow Y \mid \overleftarrow y{}')$.

The choice of initial state $|S_0\rangle$ is arbitrary; any $|\phi\rangle$ could be made into $|S_0\rangle$ by acting on it first with a unitary gate taking $|\phi\rangle$ to $|S_0\rangle$.