Can shallow quantum circuits scramble local noise into global white noise?

Shallow quantum circuits are believed to be the most promising candidates for achieving early practical quantum advantage—this has motivated the development of a broad range of error mitigation techniques whose performance generally improves when the quantum state is well approximated by a global depolarising (white) noise model. While it has been crucial for demonstrating quantum supremacy that random circuits scramble local noise into global white noise—a property that has been proved rigorously—we investigate to what degree practical shallow quantum circuits scramble local noise into global white noise. We define two key metrics as (a) density matrix eigenvalue uniformity and (b) commutator norm that quantifies stability of the dominant eigenvector. While the former determines the distance from white noise, the latter determines the performance of purification based error mitigation. We derive analytical approximate bounds on their scaling and find in most cases they nicely match numerical results. On the other hand, we simulate a broad class of practical quantum circuits and find that white noise is in certain cases a bad approximation posing significant limitations on the performance of some of the simpler error mitigation schemes. On a positive note, we find in all cases that the commutator norm is sufficiently small guaranteeing a very good performance of purification-based error mitigation. Lastly, we identify techniques that may decrease both metrics, such as increasing the dimensionality of the dynamical Lie algebra by gate insertions or randomised compiling.


I. INTRODUCTION
Current generations of quantum hardware can already significantly outperform classical computers in random sampling tasks [1,2], and future hardware developments will hopefully enable powerful applications in quantum machine learning [3], fundamental physics [4,5] and in developing novel drugs and materials [6][7][8][9]. The scale and precision of the technology today is, however, still below what is required for fully fault-tolerant quantum computation. Due to noise accumulation in the noisy intermediate-scale quantum (NISQ) era [10], one is limited to shallow-depth quantum circuits, which has led to the development of a broad range of hybrid quantum-classical protocols and quantum machine learning algorithms [11][12][13].
The aim in this paradigm is to prevent excessive error buildup by using a parameterised, shallow-depth quantum circuit and then to perform a series of repeated measurements in order to extract expected values. These expected values are then post-processed on a classical computer in order to update the parameters of the circuit, e.g., as part of a training procedure. A major challenge is the potential need for an excessive number of circuit repetitions which, however, can be significantly suppressed by the use of advanced training algorithms [14][15][16] or via classical-shadows-based protocols [17][18][19]. As such, the primary limitation of near-term applications is the damaging effect of gate noise on the estimated expected values, which can only be reduced by advanced error mitigation techniques [12,20].
Error mitigation comprises a broad collection of diverse techniques that generally aim to estimate precise expected values by suppressing the effect of hardware imperfections [12,20]. Due to the diversity of techniques and the significant differences in their range of applicability, the need for performance metrics was recently emphasised [20]. This motivates the present work to characterise noise in typical practical circuits, e.g., in quantum simulation or in quantum machine learning, and to define two key metrics that determine the performance of a broad class of error mitigation techniques: (a) eigenvalue uniformity as a measure of closeness to global depolarising (white) noise and (b) the norm of the commutator between the ideal and noisy quantum states. While (b) determines the performance of purification-based error mitigation techniques [21,22], (a) implies a good performance of all error mitigation techniques.
Our primary motivation is that gate errors in complex quantum circuits are scrambled into global white noise [1,23]. This property has been proved for random circuits by establishing exponentially decreasing error bounds; surprisingly, in our numerical simulations we find that in many practical scenarios the same bounds apply relatively well. In particular, we find that both our metrics, (a) the distance from global depolarising noise and (b) the commutator norm, are approximated by the function in Eq. (1), where ν is the number of gates in the quantum circuit, ξ is the expected number of errors in the entire circuit and α is a constant. As such, if one keeps the error rate small, ξ ≪ 1, but increases the number of gates in a circuit, then both (a) and (b) are expected to decrease. This is a highly desirable property in practice: white noise does not introduce a bias to the expected-value measurement but only a trivial, global scaling, as we detail in the rest of this introduction.
In the present work we simulate a broad range of quantum circuits often used in practice and identify scenarios where this approximation holds well, namely when gate parameters and circuit structures are sufficiently random. We also identify strategies that improve scrambling local gate noise into global white noise, such as inserting additional gates into a circuit to increase the dimensionality of its Lie algebra [24]. In most cases, however, we conclude that white noise is not necessarily a good approximation due to the large prefactor α in Eq. (1). Thus the performance of some error mitigation techniques that rely on a global depolarising noise assumption is limited. On the other hand, we find that in all cases the commutator norm, our other key metric, is smaller by at least 1-2 orders of magnitude, guaranteeing a very good performance of purification-based error mitigation techniques.
Our work is structured as follows. In the rest of this introduction we briefly review global depolarising noise and how it can be exploited in error mitigation, and then briefly review purification-based error mitigation techniques and their performance. In Section II we introduce theoretical notions and finally in Section III we present our simulation results.

A. Global depolarisation and error mitigation
In the NISQ era, we do not have comprehensive solutions to error correction, which has led the field to develop error mitigation techniques. These techniques aim to extract expected values ⟨O⟩_ideal := tr[Oρ_id] of observables, typically some Hamiltonian of interest, with respect to an ideal noiseless quantum state ρ_id.
A very simple error model, the global depolarising noise channel, has very often been considered as a relatively good approximation to complex quantum circuits. For qubit states, the channel mixes the ideal, noise-free state with the maximally mixed state Id/d of dimension d = 2^N as
$$\rho_{\mathrm{wn}} := \eta\, \rho_{\mathrm{id}} + (1-\eta)\, \mathrm{Id}/d. \tag{2}$$
Here η ≈ F is a probability that approximates the fidelity as F = η + (1 − η)/d. The white noise channel has been commonly used in the literature for modelling errors in near-term quantum computers [25] and, in particular, it has been shown to be a very good approximation to noise in random circuits [1,23]. White noise is extremely convenient as it lets the user extract, after rescaling by η, the ideal expected value of any traceless Hermitian observable O via
$$\mathrm{tr}[O \rho_{\mathrm{wn}}]/\eta = \mathrm{tr}[O \rho_{\mathrm{id}}] = \langle O \rangle_{\mathrm{ideal}}. \tag{3}$$
Of course, for small fidelities η ≪ 1 the expected value tr[Oρ_wn] requires significantly increased sampling to suppress shot noise. In this model, the scaling factor η is a global property and can be estimated experimentally, e.g., via randomised measurements [25], via extrapolation [26] or via learning-based techniques [27].
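As a minimal numerical sketch of the rescaling property (not code from this work), the following NumPy snippet builds a white-noise state for a random pure state and checks that dividing by η recovers the ideal expected value of a traceless observable. The system size, the random seed and the value of η are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 3                       # number of qubits (illustrative choice)
d = 2 ** N

# Random pure "ideal" state rho_id = |psi><psi|
psi = rng.normal(size=d) + 1j * rng.normal(size=d)
psi /= np.linalg.norm(psi)
rho_id = np.outer(psi, psi.conj())

# Random traceless Hermitian observable O
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
O = (A + A.conj().T) / 2
O -= np.trace(O).real / d * np.eye(d)

eta = 0.6                   # hypothetical value of the scaling factor
rho_wn = eta * rho_id + (1 - eta) * np.eye(d) / d

ideal = np.trace(O @ rho_id).real            # tr[O rho_id]
rescaled = np.trace(O @ rho_wn).real / eta   # tr[O rho_wn] / eta
fidelity = (psi.conj() @ rho_wn @ psi).real  # <psi|rho_wn|psi>

print(ideal, rescaled, fidelity)
```

The maximally mixed component contributes nothing to the expected value because O is traceless, which is why the simple rescaling is exact for this model.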
Global depolarisation, however, may not be a sufficiently accurate model to capture more subtle effects of gate noise, and thus rescaling an experimentally estimated expected value yields a biased estimate of the ideal one, with bias ⟨O⟩_bias := tr[Oρ]/η − ⟨O⟩_ideal. The bias ⟨O⟩_bias is not a global property, i.e., it is specific to each observable, and suppressing it requires the use of more advanced error mitigation techniques.
Intuitively, one expects the bias to be small for quantum states that are well approximated by a global depolarising model, ρ ≈ ρ_wn, and, indeed, we find a general upper bound on the bias in terms of the trace distance between ρ and ρ_wn, Eq. (4). Here ‖O‖_∞ is the operator norm, i.e., the largest absolute eigenvalue of the traceless O; refer to ref. [28] for a proof. As such, a small trace distance guarantees a small bias and thus indirectly determines the performance of all error mitigation techniques, as well as of further protocols [19,29].
In this work, we characterise how closely noisy quantum states ρ in practical applications approach white noise states ρ_wn and consider various types of variational quantum circuits that are typical for NISQ applications. When the above trace distance is small, it guarantees a small bias in expected values, which allows us to nearly trivially mitigate the effect of gate noise, i.e., via a simple rescaling.
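The relation between bias and distance from white noise can be checked numerically. The sketch below assumes the standard Hölder-type inequality |tr[O(ρ − ρ_wn)]| ≤ ‖O‖_∞ ‖ρ − ρ_wn‖_1, with ‖·‖_1 taken as the sum of singular values; the constants may differ from the convention used for the trace distance in the bound referred to above. The noisy state, η and the observable are randomly generated illustrations.

```python
import numpy as np

def random_density_matrix(d, rng):
    """Random full-rank density matrix G G^dagger / tr[G G^dagger]."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def trace_norm(M):
    """Schatten 1-norm: sum of singular values."""
    return np.linalg.svd(M, compute_uv=False).sum()

rng = np.random.default_rng(1)
d = 8
psi = rng.normal(size=d) + 1j * rng.normal(size=d)
psi /= np.linalg.norm(psi)
rho_id = np.outer(psi, psi.conj())

eta = 0.7                                   # hypothetical scaling factor
rho_err = random_density_matrix(d, rng)     # illustrative non-white error state
rho = eta * rho_id + (1 - eta) * rho_err
rho_wn = eta * rho_id + (1 - eta) * np.eye(d) / d

# Random traceless Hermitian observable
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
O = (A + A.conj().T) / 2
O -= np.trace(O).real / d * np.eye(d)

bias = np.trace(O @ rho).real / eta - np.trace(O @ rho_id).real
op_norm = np.abs(np.linalg.eigvalsh(O)).max()
bound = op_norm * trace_norm(rho - rho_wn) / eta

print(abs(bias), bound)
```

The bound follows because tr[Oρ_wn] = η tr[Oρ_id] for traceless O, so the bias equals tr[O(ρ − ρ_wn)]/η.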

B. Purification-based error mitigation and the commutator norm
Another core metric we will consider is the commutator norm between the ideal and noisy quantum states, E_C := ‖[ρ_id, ρ]‖_1, which determines the performance of purification-based error mitigation techniques [28]. A small commutator norm has significant practical implications as it guarantees that one can accurately determine expected values using the ESD/VD [21,22] error mitigation techniques. In particular, independently preparing n copies of the noisy quantum state and applying a derangement circuit to entangle the copies allows one to estimate the expected value
$$\frac{\mathrm{tr}[\rho^n O]}{\mathrm{tr}[\rho^n]} = \langle O \rangle_{\mathrm{ideal}} + E_{\mathrm{ESD}}.$$
The approach is very NISQ-friendly [30,31] and its approximation error E_ESD approaches a noise floor exponentially fast as we increase the number of copies n [21]. This noise floor is determined generally by the commutator norm E_C, but in the most typical applications of preparing eigenstates the noise floor is quadratically smaller, E_C^2 [28].
Note that this commutator can vanish even if the quantum state is very far from a white noise state; in fact, it generally vanishes when ρ_id approximates an eigenvector of ρ. When a state is close to the white noise approximation, a small commutator norm is guaranteed. However, we demonstrate that the latter is a much less stringent condition and a much better approximation in practice than the former: in all instances we find that the commutator norm is significantly smaller than the trace distance from white noise states.
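To illustrate how purification suppresses the error contribution, the following sketch evaluates tr[ρ^n O]/tr[ρ^n] classically for a synthetic density matrix. It is an idealised assumption of this example that the ideal state is exactly the dominant eigenvector of ρ, so the commutator norm vanishes and there is no noise floor; the spectrum, dimension and observable are hypothetical, and no derangement circuit is simulated.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 8

# Hypothetical spectrum: one dominant eigenvalue plus equal small ones.
lam = np.array([0.6] + [0.4 / (d - 1)] * (d - 1))

# Random eigenbasis from the QR decomposition of a complex Gaussian matrix.
G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
V, _ = np.linalg.qr(G)
rho = (V * lam) @ V.conj().T        # rho = sum_k lam_k |v_k><v_k|

psi_id = V[:, 0]                    # ideal state = dominant eigenvector (assumption)
rho_id = np.outer(psi_id, psi_id.conj())

# Random Hermitian observable normalised to unit operator norm
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
O = (A + A.conj().T) / 2
O /= np.abs(np.linalg.eigvalsh(O)).max()

ideal = (psi_id.conj() @ O @ psi_id).real

def esd_estimate(rho, O, n):
    """Classically evaluate tr[rho^n O] / tr[rho^n]."""
    rho_n = np.linalg.matrix_power(rho, n)
    return (np.trace(rho_n @ O) / np.trace(rho_n)).real

errors = [abs(esd_estimate(rho, O, n) - ideal) for n in range(1, 7)]

# Commutator norm E_C = || [rho_id, rho] ||_1 (sum of singular values)
comm = rho_id @ rho - rho @ rho_id
E_C = np.linalg.svd(comm, compute_uv=False).sum()

print(errors, E_C)
```

In this commuting case the error is controlled by powers of the ratio between the non-dominant and dominant eigenvalues, which is the exponential suppression in n described above.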

II. THEORETICAL BACKGROUND
In this section we introduce the main theoretical notions and recapitulate the most relevant results from the literature.

A. General properties of noisy quantum states
Recall that any quantum state of dimension d can be represented via its density matrix ρ, which generally admits the spectral decomposition
$$\rho = \sum_{k=1}^{d} \lambda_k |\psi_k\rangle\langle\psi_k|,$$
where we focus on N-qubit systems of dimension d = 2^N. Here λ_k are non-negative eigenvalues and |ψ_k⟩ are eigenvectors. Since Σ_i λ_i = 1, the spectrum λ is also interpreted as a probability distribution.
If ρ is prepared by a perfect, noise-free unitary circuit, only one eigenvalue is different from zero and the corresponding eigenvector is the ideal quantum state |ψ_id⟩. In contrast, an imperfect circuit prepares a ρ that has more than one non-zero eigenvalue and is thus a probabilistic mixture of the pure quantum states |ψ_k⟩, e.g., due to interactions with a surrounding environment. In fact, noisy quantum circuits that we typically encounter in practice produce a quite particular structure of the eigenvalue distribution: one dominant component that approximates the ideal quantum state, |ψ_1⟩ ≈ |ψ_id⟩, mixed with an exponentially growing number of "error" eigenvectors that have small eigenvalues. White noise is the limiting case where non-dominant eigenvalues are exponentially small ∝ 1/d and |ψ_1⟩ ≈ |ψ_id⟩.
The quality of the noisy quantum state is then defined by the probability of the ideal quantum state, i.e., the fidelity F := ⟨ψ_id|ρ|ψ_id⟩. We show in Appendix A that for any quantum state the fidelity approaches the dominant eigenvalue λ_1, where we compute the error term analytically in terms of the commutator norm E_C = ‖[ρ_id, ρ]‖_1 from Section I B. This property is completely general and applies to any density matrix.
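The quantities in this subsection are straightforward to compute for small systems. The sketch below, with a randomly generated error state and a hypothetical mixing weight η as illustrative assumptions, extracts the dominant eigenvalue λ_1, the fidelity F and the commutator norm E_C. It only relies on the general fact that λ_1 is the maximum of ⟨φ|ρ|φ⟩ over unit vectors, so F ≤ λ_1; it does not reproduce the analytical error term from Appendix A.

```python
import numpy as np

rng = np.random.default_rng(3)
d = 8

# Ideal pure state
psi_id = rng.normal(size=d) + 1j * rng.normal(size=d)
psi_id /= np.linalg.norm(psi_id)
rho_id = np.outer(psi_id, psi_id.conj())

# Illustrative error state and mixing weight (both hypothetical)
G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho_err = G @ G.conj().T
rho_err /= np.trace(rho_err).real
eta = 0.8
rho = eta * rho_id + (1 - eta) * rho_err

# eigh returns eigenvalues in ascending order
evals, evecs = np.linalg.eigh(rho)
lam1 = evals[-1]
psi1 = evecs[:, -1]

fidelity = (psi_id.conj() @ rho @ psi_id).real
overlap = abs(psi1.conj() @ psi_id) ** 2      # |<psi_1|psi_id>|^2

comm = rho_id @ rho - rho @ rho_id
E_C = np.linalg.svd(comm, compute_uv=False).sum()

print(lam1, fidelity, overlap, E_C)
```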

B. Practically motivated noise models
Most typical noise models used in practice, such as local depolarising or dephasing noise, admit the following probabilistic interpretation: a noisy gate operation Φ_k(ρ) can be interpreted as a mixture of the noise-free operation U_k, which happens with probability 1 − ε, and an error contribution as
$$\Phi_k(\rho) = (1-\epsilon)\, U_k \rho U_k^\dagger + \epsilon\, \Phi_{\mathrm{err}}(U_k \rho U_k^\dagger). \tag{7}$$
Here U_k is the k-th ideal quantum gate and the completely positive trace-preserving (CPTP) map Φ_err happens with probability ε and accounts for all error events during the execution of a gate. A quantum circuit is then a composition of a series of ν such quantum gates, which prepares the convex combination
$$\rho = \eta\, \rho_{\mathrm{id}} + (1-\eta)\, \rho_{\mathrm{err}}. \tag{8}$$
Here ρ_id := |ψ_id⟩⟨ψ_id| is the ideal noise-free state, ρ_err is an error density matrix and η = (1 − ε)^ν is the probability that none of the gates have undergone errors. This probability approximates [23,28] the fidelity F := ⟨ψ_id|ρ|ψ_id⟩ given the noise model in Eq. (7) as
$$F = \eta + (1-\eta) E_F \approx e^{-\xi} + (1 - e^{-\xi}) E_F. \tag{9}$$
Here we approximate (1 − ξ/ν)^ν = e^{−ξ} + O(ξ^2/ν) for small ε and large ν, where ξ := εν is the circuit error rate, i.e., the expected number of errors in a circuit. In practice the approximation error E_F = ⟨ψ_id|ρ_err|ψ_id⟩ is typically small, and in the limiting case of white noise it decreases exponentially as E_F = 1/d due to ρ_err = Id/d. Assuming sufficiently deep, complex circuits, ref. [28] obtained an approximate bound for the commutator between the ideal and noisy quantum states, Eq. (10). This bound confirms that as we increase the number of quantum gates ν in a circuit while keeping the circuit error rate ξ constant, the commutator norm decreases as ∝ 1/√ν [28]. Furthermore, this function closely resembles Eq. (1), which it is a central aim of this work to explore.
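The approximation η = (1 − ξ/ν)^ν ≈ e^{−ξ} at fixed circuit error rate can be checked directly. The sketch below uses illustrative values of ξ and ν and compares the gap e^{−ξ} − η with its leading-order term e^{−ξ} ξ^2/(2ν), which follows from expanding ν ln(1 − ξ/ν).

```python
import numpy as np

xi = 0.5                                   # circuit error rate (illustrative)
nus = np.array([10, 100, 1000, 10000])     # number of gates
eps = xi / nus                             # per-gate error probability

eta = (1 - eps) ** nus                     # probability that no gate fails
gap = np.exp(-xi) - eta                    # deviation from e^{-xi}
leading = np.exp(-xi) * xi ** 2 / (2 * nus)  # leading-order estimate of the gap

for nu, g, l in zip(nus, gap, leading):
    print(f"nu={nu:6d}  gap={g:.3e}  leading-order={l:.3e}")
```

The gap shrinks in inverse proportion to ν, so for the deep-circuit, low-error regime discussed here η and e^{−ξ} can be used interchangeably.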

C. White noise in random circuits
Random circuits have enabled quantum supremacy experiments using noisy quantum computers for two primary reasons: (a) the outputs of these circuits are hard to simulate classically and (b) they render local noise into global white noise [1], hence introducing only a trivial bias to the ideal probability distribution similarly as in Section I A.
Ref. [23] considered random circuits consisting of s two-qubit gates, each of which undergoes two single-qubit (depolarising) errors, each with probability ε̃ (assuming single-qubit gates are noiseless). We can relate this to our model by identifying the local noise after each two-qubit gate with the error event in Eq. (7), which fixes the probability ε in terms of ε̃. Ref. [23] then established the fidelity F̃ of the quantum state, which one obtains from a noisy cross-entropy score. This coincides with our approximation from Eq. (9) up to an additive error in the exponent which, however, diminishes for low gate error rates. In the following we will thus assume F̃ ≡ F.

FIG. 1. Simulating families of 10-qubit Strong Entangling Layer (SEL) ansatz circuits [32] at random gate parameters for an increasing number ν of gates and per-gate depolarising error rates ε. (a) The uniformity measure W(ν) of the error eigenvalues of the density matrix from Eq. (12) closely matches the theoretical model (dashed lines) for random circuits and confirms that increasing the number of gates in random circuits scrambles local noise into global white noise. (b) The commutator norm C(ν) from Eq. (14) is significantly smaller in absolute value and decreases with a larger polynomial degree (steeper slope of the dashed lines) than the uniformity measure; this suggests that the dominant eigenvector of the density matrix ρ approximately commutes with ρ even when noise is not well described by white noise. The ε → 0 simulations were approximated using ε = 10^−8 (ε = 10^−7) when calculating W (C).
Measuring these noisy states in the standard measurement basis {|j⟩}_{j=1}^{d} produces a noisy probability distribution p̃_noisy(j) = ⟨j|ρ|j⟩. Ref. [23] established that this probability distribution rapidly approaches the white noise approximation p̃_wn = F p_id + (1 − F) p_unif. In particular, the total variation distance (via the l_1 norm) between the two probability distributions is upper bounded as in Eq. (11). This expression is formally identical to the bound on the commutator norm in Eq. (10). Indeed, if the noise in the quantum state approaches a white noise approximation, the commutator norm must also vanish in the same order. On the other hand, the reverse is not necessarily true: Eq. (11) is a stronger condition than Eq. (10), as the latter only guarantees that the dominant eigenvector approaches the ideal state, |ψ_1⟩ ≈ |ψ_id⟩, but does not imply anything about the eigenvalue distribution of ρ or ρ_err.

III. NUMERICAL SIMULATIONS

A. Target metrics
In the NISQ era comprehensive error correction will not be feasible, and thus hope is primarily based on variational quantum algorithms [11-13, 33, 34]. In this paradigm a shallow, parametrised quantum circuit is used to prepare a parametrised quantum state that aims to approximate the solution to a given problem, typically the ground state of a problem Hamiltonian. Due to its shallow depth the ansatz circuit is believed to be error robust, and its tractable parametrisation allows one to explore the Hilbert space near the solution. On the other hand, such circuits are structurally quite different from random quantum circuits, and the question was already raised in ref. [23] whether error bounds on the white noise approximation extend to these shallow quantum circuits.
We simulate such quantum circuits under the effect of local depolarising noise; note that a broad class of local coherent and incoherent error models can effectively be transformed into local depolarising noise using, e.g., twirling techniques or randomised compiling [35][36][37][38]. We analyse the resulting noisy density matrix ρ by calculating the following two quantities. First, we quantify 'closeness' to a white noise state from Eq. (2) by computing the uniformity measure W, Eq. (12), as the l_1-distance between the uniform distribution and the non-dominant eigenvalues of the output state. It only depends on spectral properties of the quantum state and can thus be computed straightforwardly. We show in Statement 2 that W is proportional to the trace distance from a white noise quantum state, Eq. (13), up to a bounded error E_W. The uniformity measure W thus determines the bias in estimating any traceless expected value as discussed in Section I A. Second, we calculate the commutator norm E_C from Section I B relative to 1 − λ_1 as
$$C := \frac{E_C}{1-\lambda_1} = \frac{\lVert [\rho_{\mathrm{id}}, \rho] \rVert_1}{1-\lambda_1}, \tag{14}$$
which we relate to the commutator norm between the "error part" of the state ρ_err and the ideal quantum state ρ_id in Lemma 1. In the following, we will refer to C as the commutator norm, and recall that it determines the ultimate performance of purification-based error mitigation as discussed in Section I B.
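A minimal sketch of how the two target metrics could be evaluated from a density matrix is given below. The normalisation of W is an assumption of this example: the non-dominant eigenvalues are renormalised by 1 − λ_1 and compared with the uniform distribution 1/(d − 1) in plain l_1 distance, which may differ by a constant factor from the definition used in this work. The function names and test states are hypothetical.

```python
import numpy as np

def uniformity_measure(rho):
    """l1 distance between renormalised non-dominant eigenvalues and the
    uniform distribution (normalisation is an assumption of this sketch)."""
    lam = np.sort(np.linalg.eigvalsh(rho))[::-1]   # descending order
    d = lam.size
    err = lam[1:] / (1.0 - lam[0])
    return np.abs(err - 1.0 / (d - 1)).sum()

def commutator_norm(rho_id, rho):
    """C = || [rho_id, rho] ||_1 / (1 - lambda_1)."""
    lam1 = np.linalg.eigvalsh(rho)[-1]
    comm = rho_id @ rho - rho @ rho_id
    return np.linalg.svd(comm, compute_uv=False).sum() / (1.0 - lam1)

d, eta = 8, 0.7
rho_id = np.zeros((d, d))
rho_id[0, 0] = 1.0                      # ideal state |0><0|

# Case 1: exact white noise
rho_white = eta * rho_id + (1 - eta) * np.eye(d) / d

# Case 2: all error weight on a single orthogonal state |1><1|
rho_single = eta * rho_id.copy()
rho_single[1, 1] = 1 - eta

W_white = uniformity_measure(rho_white)
W_single = uniformity_measure(rho_single)
C_white = commutator_norm(rho_id, rho_white)
C_single = commutator_norm(rho_id, rho_single)

print(W_white, W_single, C_white, C_single)
```

The second test state is far from white noise yet commutes with the ideal state, which is a simple instance of the commutator norm being the less stringent of the two conditions.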

B. Random states via Strong Entangling ansätze
We first consider a Strong Entangling Layers (SEL) ansatz: it is built of alternating layers of parametrised single-qubit rotations followed by a series of nearest-neighbour CNOT gates as illustrated in Fig. 5, and we assume local depolarising noise with probability ε. We simulate random quantum circuits by randomly generating parameters |θ_k| ≤ 2π; note that these circuits are not necessarily Haar-random distributed and thus results in Section II C do not necessarily apply.
We simulate 10-qubit circuits and in Fig. 1 (a) we plot the eigenvalue uniformity W(ν) while in Fig. 1 (b) we plot the commutator norm C(ν) for an increasing number ν of quantum gates; all datapoints are averages over ten random seeds. These results recover surprisingly well the expected behaviour of random quantum circuits: for small error rates ε → 0 both quantities W(ν) and C(ν) can be approximated by the function from Eq. (1), as we now discuss.
In Section II C we stated bounds of ref. [23] on the distance between p̃_noisy and p̃_wn. Based on the assumption that these bounds also apply to the probability distributions p_noisy = ⟨ψ_k|ρ|ψ_k⟩ and p_wn := ⟨ψ_k|ρ_wn|ψ_k⟩, we derive in Statement 4 an approximate bound on the eigenvalue uniformity. Furthermore, by combining Eq. (14) and the bound in Eq. (10) we find that the commutator norm C is similarly bounded by the same function. On the other hand, Fig. 1 (b) suggests that the commutator norm decreases with a larger polynomial degree, and thus we approximate both W(ν) and C(ν) using the function in Eq. (15), where we fit the two parameters α and β to our simulated dataset. The second equality in Eq. (15) is an expansion for small circuit error rates ξ as detailed in Appendix A 2 a. Indeed, for small ε → 0 (blue circles in Fig. 1) we observe a nearly linear function in the log-log plot and thus recover the theoretical bounds remarkably well, with the polynomial power approaching β → 1/2.
Furthermore, comparing Fig. 1 (b, blue circles) and Fig. 1 (a, blue circles) suggests that the commutator norm has both a significantly smaller absolute value (smaller α) and decreases at a faster polynomial rate (larger β) than the uniformity measure. In fact, the commutator norm is more than two orders of magnitude smaller than the uniformity measure, which suggests that even when ρ_err is not approximated well by a white noise state it nevertheless almost commutes with the ideal pure state ρ_id.
We finally consider how the absolute factor α depends on the number of qubits: we perform simulations at a small error rate ε → 0 and fit our model function αν^{−β} to extract α(N) for an increasing number of qubits. The results are plotted in Fig. 7 (e) and suggest that the prefactor α(N) initially grows slowly but then saturates; note that a polylogarithmic depth is sufficient to reach anticoncentration [23].
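In the small-error regime, a power law of the form αν^{−β} appears as a straight line in a log-log plot, so the two parameters can be extracted by linear regression. The sketch below demonstrates this on synthetic data; the values of α and β, the noise level and the range of ν are hypothetical and are not results of this work, and the fitting procedure actually used here may differ.

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic data following alpha * nu^(-beta) with 1% multiplicative noise
alpha_true, beta_true = 0.3, 0.5          # hypothetical values
nu = np.logspace(1, 4, 20)                # number of gates
y = alpha_true * nu ** (-beta_true) * (1 + 0.01 * rng.normal(size=nu.size))

# Linear fit in log-log space: log y = log(alpha) - beta * log(nu)
slope, intercept = np.polyfit(np.log(nu), np.log(y), 1)
beta_fit = -slope
alpha_fit = np.exp(intercept)

print(f"alpha = {alpha_fit:.3f}, beta = {beta_fit:.3f}")
```

A fitted β close to 1/2 would correspond to the ν^{−1/2} scaling of the random-circuit bounds, whereas a larger β indicates faster decay.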

C. Hamiltonian Variational Ansatz
Theoretical results guarantee that the SEL ansatz initialised at random parameters approaches unitary 2-designs with increasing depth, thereby reproducing properties of random quantum circuits [39,40]. It is thus not surprising that the model introduced in Section II C gives a remarkably good agreement between the SEL ansatz (dots in Fig. 1) and genuine random circuits (fits shown as lines in Fig. 1).
Here we consider the Hamiltonian Variational Ansatz (HVA) [41,42] at more practical parameter settings. The HVA has the advantage that we can efficiently obtain parameters that approximate the ground state of a problem Hamiltonian increasingly well as we increase the ansatz depth; we will refer to these as VQE parameters. We also want to compare this circuit against random circuits and thus also simulate the HVA such that every gate receives a random parameter, as detailed in Appendix B 1.
While the VQE parameter settings capture the relevant behaviour in practice as one approaches a solution, the random parameters are more relevant, e.g., at the early stages of a VQE parameter optimisation. Furthermore, as the circuit is composed entirely of Pauli terms in the problem Hamiltonian, the dimensionality of its dynamical Lie algebra is entirely determined by the problem Hamiltonian, in contrast to the exponentially growing algebra of the SEL ansatz [24].

FIG. 2. XXX Hamiltonian: same simulations as in Fig. 1 but using 10-qubit HVA quantum circuits constructed for the XXX spin problem Hamiltonian. (a, c) At randomly chosen circuit parameters of the HVA we find the same conclusions as for random circuits in Fig. 1. (b) When the HVA circuit approximates the ground state of the Hamiltonian (VQE parameters) we find that the noise in the circuit is not approximated well by white noise, i.e., the uniformity measure W(ν) is large and does not decrease as we increase ν. (d) On the other hand, the commutator norm C(ν) is significantly smaller than W(ν), confirming that the ideal quantum state approximately commutes with the noisy one. The ε → 0 simulations were approximated using ε = 10^−8 (ε = 10^−7) when calculating W (C).

D. Heisenberg XXX spin model
We first consider a VQE problem of finding the ground state of the 1-dimensional XXX spin-chain model. We construct the HVA ansatz from Section III C for this problem Hamiltonian as a sum H_XXX = H_0 + H_1. The Pauli operators XX, YY and ZZ determine couplings between nearest-neighbour spins in a 1-D chain and we choose them to be of unit strength. Furthermore, Z_k are local on-site interactions with strengths |Δ_k| ≤ 1 that were generated uniformly at random such that the Hamiltonian has a non-trivial ground state.
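For concreteness, a Hamiltonian of this type can be assembled as a dense matrix for a few qubits. The sketch below assumes open boundary conditions and the form Σ_k (X_k X_{k+1} + Y_k Y_{k+1} + Z_k Z_{k+1}) + Σ_k Δ_k Z_k; the boundary conditions, the helper names `op_on` and `xxx_hamiltonian`, and the chain length are assumptions of this example rather than details taken from this work.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def op_on(site_ops, n):
    """Tensor product with site_ops[k] on qubit k and identity elsewhere."""
    return reduce(np.kron, [site_ops.get(k, I2) for k in range(n)])

def xxx_hamiltonian(n, deltas):
    """Unit-strength XX+YY+ZZ couplings on an open chain plus on-site Z fields."""
    H = np.zeros((2 ** n, 2 ** n), dtype=complex)
    for k in range(n - 1):                 # nearest-neighbour couplings
        for P in (X, Y, Z):
            H += op_on({k: P, k + 1: P}, n)
    for k in range(n):                     # random on-site fields
        H += deltas[k] * op_on({k: Z}, n)
    return H

rng = np.random.default_rng(6)
n = 4                                      # small chain for illustration
deltas = rng.uniform(-1.0, 1.0, size=n)    # |Delta_k| <= 1
H = xxx_hamiltonian(n, deltas)

# Total magnetisation sum_k Z_k, which this Hamiltonian conserves
Z_total = sum(op_on({k: Z}, n) for k in range(n))
ground_energy = np.linalg.eigvalsh(H)[0]
print(ground_energy)
```

Dense matrices of this kind are only practical for small chains; the 10-qubit simulations discussed here would still fit, with matrices of dimension 1024.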
First, we simulate the HVA ansatz for N = 10 qubits with randomly generated circuit parameters |θ_k| ≤ 2π and plot results for an increasing number of quantum gates in Fig. 2 (a, c). We find a similar behaviour for the eigenvalue uniformity W(ν) as with random SEL circuits in Fig. 1 (a) and obtain a reasonably good fit for ε → 0 using our model function from Eq. (15). The commutator norm in Fig. 2 (c) is again significantly smaller in magnitude than the uniformity measure and decreases faster, with a higher polynomial order, similarly to the random SEL ansatz in Fig. 1 (b).
Second, in Fig. 2 (b, d) we simulate the ansatz at the VQE parameters that approximate the ground state. Since the ansatz parameters become very small as one approaches an adiabatic evolution, it is not surprising that the output density matrix is not well approximated by a white noise state: the uniformity measure is very large in Fig. 2 (b). The commutator norm in Fig. 2 (d) is again significantly smaller than W(ν) and, although it appears to grow slowly with ν, it appears to decrease for ν → ∞. This agrees with observations of ref. [28] that the circuits need not be random for the commutator to be sufficiently small in practice. Furthermore, in Fig. 7 (a, b) we investigate the dependence on N and find that the prefactor α grows slowly and appears to saturate for N ≥ 10.

E. TFI
In the next example we consider the transverse-field Ising (TFI) model H_TFI = H_0 + H_1 using constant on-site interactions h_i = 1 and randomly generated coupling strengths |J_i| ≤ 1. We first simulate the HVA ansatz with random variational parameters in Fig. 3 (a, c). While at small error rates ε → 0 Fig. 3 (a, blue) can be fitted well with our polynomial approximation from Eq. (15), we observe that the eigenvalue uniformity W(ν) in Fig. 3 (a, blue) decreases with a small polynomial degree. Indeed, as the HVA ansatz is specific to a particular Hamiltonian, its dynamical Lie algebra may have a low dimensionality [24], resulting in a limited ability to scramble local noise into white noise; this explains why in Fig. 3 (a) the uniformity measure decreases more slowly, i.e., with a smaller polynomial order, than for random circuits. For this reason, we additionally simulate in Fig. 6 the TFI-HVA ansatz with additional R_z gates in each layer, whose generator is not contained in the problem Hamiltonian. The increased dimensionality of the dynamical Lie algebra indeed improves scrambling, as the white noise approximation is clearly better in Fig. 6; note, however, that the increased dimensionality may also lead to exponential inefficiencies in training the circuit [24].
In stark contrast to the case of the uniformity measure W(ν), we find that the commutator norm in Fig. 3 (c, blue) decreases substantially for an increasing ν despite the low dimensionality of the Lie algebra. This nicely demonstrates that a small commutator norm is a much more relaxed condition than white noise, as the latter requires that the noise is fully scrambled in the entirety of the exponentially large Hilbert space. Finally, we simulate the TFI circuits at VQE parameters and find qualitatively the same behaviour as in the case of the XXX problem.

F. Quantum Chemistry: LiH
FIG. 4. (a, c) At randomly chosen circuit parameters both W(ν) and C(ν) decrease as expected for random circuits due to our randomised compiling strategy [43,44]. (b) At the VQE parameters white noise is an increasingly bad approximation, i.e., the uniformity measure W(ν) increases as we increase ν. (d) The commutator norm C(ν) is smaller than W(ν) in absolute value by 2 orders of magnitude. The ε → 0 simulations were approximated using ε = 10^−8 (ε = 10^−7) when calculating W (C).

We consider a 6-qubit Lithium Hydride (LiH) Hamiltonian in the Jordan-Wigner encoding as a linear combination of non-local Pauli strings P_k ∈ {Id, X, Y, Z}^{⊗N} as
$$H = \sum_{k=1}^{r_h} h_k P_k. \tag{17}$$
We construct the HVA ansatz by splitting this Hamiltonian into two parts, with H_0 being composed of the diagonal Pauli terms in Eq. (17) while H_1 is composed of non-diagonal Pauli strings. Such chemical Hamiltonians typically have a very large number of terms, r_h ≫ 1, but a significant fraction only have small weights h_k; thus the HVA would have a large number of gates with only very small rotation angles. For these reasons we construct a more efficient circuit whose basic building blocks are constructed using sparse compilation techniques [43,44]: each single layer in the HVA ansatz consists of gates that correspond to 100 randomly selected terms of the Hamiltonian, with sampling probabilities p_k ∝ |h_k| proportional to the Pauli coefficients. This approach has the added benefit that it makes the circuit structure random, as opposed to the fixed structures in Section III D and in Section III E.
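The importance sampling of Hamiltonian terms described above can be sketched in a few lines. The coefficients below are randomly generated placeholders rather than actual LiH data, and sampling with replacement is an assumption of this example; only the rule p_k ∝ |h_k| and the choice of 100 terms per layer are taken from the text.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical Pauli coefficients h_k with many small weights (not LiH data)
r_h = 500
h = rng.normal(size=r_h) * rng.exponential(size=r_h) ** 3

# Sampling probabilities proportional to |h_k|
p = np.abs(h) / np.abs(h).sum()

# One ansatz layer: indices of 100 sampled Hamiltonian terms
terms_per_layer = 100
layer = rng.choice(r_h, size=terms_per_layer, replace=True, p=p)

# Fraction of distinct terms that actually appear in this layer
distinct_fraction = np.unique(layer).size / r_h
print(layer[:10], distinct_fraction)
```

Terms with small weights are rarely selected, which avoids spending gates on very small rotation angles, and drawing a fresh sample for every layer randomises the circuit structure.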
Results shown in Fig. 4 (a, c) agree with our findings from the previous sections: at randomly chosen circuit parameters the uniformity measure decreases according to Eq. (15); the commutator norm similarly decreases but with a higher polynomial order, while its absolute value is smaller by at least an order of magnitude. In contrast, Fig. 4 (b) suggests that the errors are not well approximated by white noise, with a large and non-decreasing W(ν) ≈ 0.5. Furthermore, Fig. 4 (d) again confirms that although white noise is not a good approximation, the commutator norm is small in absolute value, i.e., ≈ 10^−3 in the practically relevant region. This guarantees a very good performance of the ESD/VD error mitigation techniques, sufficient for nearly all practical purposes.

IV. DISCUSSION
Random quantum circuits-instrumental for demonstrating quantum advantage-are known to scramble local gate noise into global white noise for sufficiently long circuit depths [1]: general bounds have been proved on the approximation error which decrease as ν −1/2 as we increase the number ν of gates in the random circuit [23].
In this work we consider shallow-depth, variational quantum circuits that are typical in practical applications of near-term quantum computers and answer the question: can variational quantum circuits scramble local gate noise into global depolarising noise?While the answer to this question is relevant for the fundamental understanding of noise processes in near-term quantum devices, it has significant implications in practice: the degree to which local noise is scrambled into white noise determines the performance of a broad class of error mitigation techniques that are of key importance to achieving value with near-term devices [20].As such, we derive two simple metrics that bound performance guarantees: first, the uniformity measure W characterises the performance of error mitigation techniques that assume global depolarising (white) noise [25]; second, the norm C of the commutator between the ideal and noisy quantum states determines the performance of purification-based error mitigation techniques [21,22] via bounds of ref. [28].
We perform a comprehensive set of numerical experi-ments to simulate typical applications of near-term quantum computers and analyse characteristics of noise based on the aforementioned two metrics.In all experiments in which we randomly initialise parameters of the variational circuits we semiquantitatively find the same conclusions.First, both metrics, the eigenvalue uniformity W and the commutator norm C are well described by our polynomial approximation from Eq. ( 15) for small gate error rates.Second, this confirms that, similarly to genuine random circuits, local errors get scrambled into global white noise with a polynomially decreasing approximation error as we increase the number of gates.Third, the commutator C decreases at a higher polynomial rate and has a significantly, by 1-2 orders of magnitude, smaller absolute value in the practically relevant region than the eigenvalue uniformity W .This confirms that purification based techniques are expected to have a superior performance compared to error mitigation techniques that, e.g., assume a global depolarising noise.
We then investigate the practically more relevant case when the ansatz circuits are initialised near the ground state of a problem Hamiltonian; in all cases we reach semiquantitatively the same conclusions. First, the errors do not get scrambled into white noise and the approximation errors are large, thus effectively prohibiting or at least significantly limiting the use of error mitigation techniques that assume global depolarising noise. Second, the commutator norm is quite small in absolute value, i.e., ≈ $10^{-2}$ to $10^{-4}$ in the practically relevant region; since the ansatz circuit prepares the ground state, the square of the commutator norm determines the performance of ESD/VD, and thus for all applications we simulated we expect a very good performance of the ESD/VD approach. Third, we identify strategies to improve scrambling of local noise into global white noise as we increase circuit depth: we find that inserting additional gates into an HVA that are otherwise not contained in the problem Hamiltonian increases the dimensionality of the dynamic Lie algebra and thus leads to a reduction of both metrics. We also find that applying randomised compiling to these non-random, practical circuits reduces both metrics.
While purification-based techniques [21,22] have been shown to perform well on specific examples, the present systematic analysis of circuit noise puts these results into perspective and demonstrates the following. First, the superior performance of the ESD/VD technique is not necessarily due to randomness in the quantum circuits, albeit in deep and random circuits its performance is further improved. Second, while some error mitigation techniques perform well on quantum circuits well described by white noise [25][26][27], we identify various practical scenarios where a limited performance is expected.
The present work advances our understanding of the nature of noise in near-term quantum computers and helps make progress towards achieving value with noisy quantum machines in practical applications. As such, the results of the present work will be instrumental for identifying design principles that lead to robust, error-tolerant quantum circuits in practical applications.
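Recall that any quantum state can be transformed into a non-negative arrowhead matrix following Statement 1 from [28], i.e., a matrix $\tilde{\rho}$ whose first diagonal entry is the fidelity $F$, whose remaining diagonal entries are $D_k$, whose first row and first column contain the entries $C_k$, and whose other entries are zero, with $F, C_k, D_k \geq 0$ and $k \in \{2, 3, \dots, d\}$. The eigenvalues $\lambda$ of such an arrowhead matrix are the roots of the following secular equation [28,47]

$$\lambda - F = \sum_{k=2}^{d} \frac{C_k^2}{\lambda - D_k}.$$

With this we compute the deviation between the dominant eigenvalue $\lambda_1$ and the fidelity $F$ as in Eq. (A2), where we have used that $D_k \leq \lambda_1$ and that all summands are non-negative as $D_k, C_k, \lambda_1 \geq 0$, and in the second inequality we have used the series of matrix norm inequalities established in [28]. We have also introduced the abbreviation $\|[\rho_{id}, \rho]\|$ given that all p-norms of the matrix $[\rho_{id}, \rho]$ are equivalent up to a constant factor. In particular, any p-norm of the commutator can be computed as $\|[\rho_{id}, \rho]\|_p = 2^{1/p} \sqrt{\mathrm{Var}[\rho]}$, where we used the quantum mechanical variance $\mathrm{Var}[\rho] := \langle \psi_{id} | \rho^2 | \psi_{id} \rangle - F^2$ as established in [28]. Furthermore, in the second inequality in Eq. (A2) we have used that max […]. By denoting the commutator norm as $E_C$, we can thus finally conclude that $\lambda_1 - F \in O(E_C)$ as stated in Eq. (6).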
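The relation between the commutator norm and the variance can be checked numerically. The following is a minimal sketch, not the authors' code: the random test state, the mixing weight 0.8, and all function names are illustrative assumptions. It compares the Schatten p-norm of $[\rho_{id}, \rho]$ with $2^{1/p}\sqrt{\mathrm{Var}[\rho]}$ and prints the gap $\lambda_1 - F$.

```python
import numpy as np

def random_density_matrix(d, rng):
    """Random full-rank density matrix (illustrative sampling, an assumption)."""
    a = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real

def schatten_norm(m, p):
    """Schatten p-norm computed from the singular values of m."""
    s = np.linalg.svd(m, compute_uv=False)
    return float(np.sum(s ** p) ** (1.0 / p))

def commutator_norm_and_variance(rho, psi, p):
    """Return (||[rho_id, rho]||_p, 2^(1/p) * sqrt(Var[rho])) for rho_id = |psi><psi|."""
    rho_id = np.outer(psi, psi.conj())
    comm = rho_id @ rho - rho @ rho_id
    fidelity = np.vdot(psi, rho @ psi).real
    variance = np.vdot(psi, rho @ rho @ psi).real - fidelity ** 2
    return schatten_norm(comm, p), 2 ** (1.0 / p) * np.sqrt(max(variance, 0.0))

rng = np.random.default_rng(0)
d = 8
psi = rng.normal(size=d) + 1j * rng.normal(size=d)
psi /= np.linalg.norm(psi)
# Hypothetical noisy state: the ideal state mixed with a random error state.
rho = 0.8 * np.outer(psi, psi.conj()) + 0.2 * random_density_matrix(d, rng)

for p in (1, 2, 4):
    lhs, rhs = commutator_norm_and_variance(rho, psi, p)
    print(f"p={p}: commutator norm={lhs:.6e}, 2^(1/p)*sqrt(Var)={rhs:.6e}")

fidelity = np.vdot(psi, rho @ psi).real
lambda_1 = np.linalg.eigvalsh(rho)[-1]  # eigvalsh returns ascending eigenvalues
print("lambda_1 - F =", lambda_1 - fidelity)
```

The two printed columns agree because the commutator has rank two with both singular values equal to $\sqrt{\mathrm{Var}[\rho]}$; the gap $\lambda_1 - F$ is non-negative by the variational principle.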

Trace distance from white noise states
In this section we evaluate analytically the trace distance of any quantum state $\rho$ from the corresponding white noise state in Eq. (2) in terms of a distance between probability distributions.
Statement 1. We can approximate the white noise state in Eq. (2) in terms of the dominant eigenvalue $\lambda_1$ and the dominant eigenvector $|\psi_1\rangle$ of the quantum state as in Eq. (A4), up to an approximation error $E_w$ that is bounded via Eq. (A6).
Proof. We start by approximating the weight $\eta$ in Eq. (2) as $\eta \approx F \approx \lambda_1$ via Eq. (9), we approximate the dominant eigenvalue using Eq. (6), and we then collect the approximation errors. We now use results in [28] for bounding the distance between the ideal and noisy quantum states in terms of $E_C$, where $E_C$ is the commutator norm from Eq. (6). We thus establish the approximation in Eq. (A4), where we collect all approximation errors in $E_w$ as stated in Eq. (A6).

Statement 2. We define the eigenvalue uniformity as $W := \tfrac{1}{2} \| p_{err} - p_{unif} \|_1$ via the non-dominant eigenvalues of the density matrix, $p_{err} := (\lambda_2, \lambda_3, \dots, \lambda_d)/(1 - \lambda_1)$. This metric is related to the trace distance from a white noise state (as in Eq. (4)) up to the approximation error $E_w$ that is stated in Statement 1.
Proof. We substitute the approximation of $\rho_{wn}$ from Eq. (A4), including the error term $E_w$, and then use the spectral decomposition of $\rho$ to obtain the trace distance as a chain of equations. In the second equation we analytically evaluated the trace distance, and in the third equation we rewrote the result in terms of $p_{err}$, which is our "error probability" distribution $p_{err} := (\lambda_2, \lambda_3, \dots, \lambda_d)/(1 - \lambda_1)$.
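The eigenvalue uniformity of Statement 2 is straightforward to evaluate from a density matrix. The sketch below is illustrative only: the function name and the two test states are hypothetical, and it assumes that $p_{unif}$ is the uniform distribution over the $d - 1$ non-dominant eigenvalues.

```python
import numpy as np

def eigenvalue_uniformity(rho):
    """W = 1/2 * ||p_err - p_unif||_1 from the non-dominant eigenvalues of rho.

    Assumption: p_unif is uniform over the d - 1 non-dominant eigenvalues.
    The state must not be pure, otherwise 1 - lambda_1 vanishes.
    """
    eigvals = np.sort(np.linalg.eigvalsh(rho))[::-1]  # descending, lambda_1 first
    lambda_1, rest = eigvals[0], eigvals[1:]
    p_err = rest / (1.0 - lambda_1)  # sums to one since the eigenvalues sum to one
    p_unif = np.full_like(p_err, 1.0 / p_err.size)
    return 0.5 * float(np.sum(np.abs(p_err - p_unif)))

d, eta = 4, 0.9
# White noise state written in its eigenbasis: all error eigenvalues are equal.
white = np.diag([eta + (1 - eta) / d] + [(1 - eta) / d] * (d - 1))
# State whose entire error weight sits on a single eigenvector.
peaked = np.diag([0.9, 0.1, 0.0, 0.0])

w_white = eigenvalue_uniformity(white)    # vanishes up to rounding
w_peaked = eigenvalue_uniformity(peaked)  # 2/3 for d = 4
print(w_white, w_peaked)
```

The two test states bracket the behaviour of interest: a perfectly scrambled error spectrum gives $W = 0$, whereas an error concentrated on one eigenvector gives a value close to one as $d$ grows.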
Statement 3. Alternatively to Statement 2, if a quantum state admits the decomposition in Eq. (8) then we can state the trace distance without approximation. The resulting expression is directly analogous to the uniformity measure of the non-dominant eigenvalues of $\rho$ in Statement 2; however, it quantifies the uniformity of the probability distribution $p_\mu$ formed by the eigenvalues of the error density matrix $\rho_{err}$.
Proof. Let us assume the decomposition in Eq. (8). We find the result via a direct calculation, where we have used the spectral resolution of the error density matrix and then analytically evaluated the trace distance. Given that $\rho_{err}$ is a positive-semidefinite matrix with unit trace, its eigenvalues $\mu_k$ form a probability distribution that we denote as $p_\mu$.
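As a sanity check of the direct calculation in Statement 3, the following sketch assumes that Eq. (2) has the form $\rho_{wn} = \eta \rho_{id} + (1 - \eta)\,\mathrm{Id}/d$ and that Eq. (8) has the form $\rho = \eta \rho_{id} + (1 - \eta) \rho_{err}$. Both forms, the $\tfrac{1}{2}$ normalisation of the trace distance, and all variable names are assumptions made for illustration. Under them $\rho - \rho_{wn} = (1 - \eta)(\rho_{err} - \mathrm{Id}/d)$, so the trace distance should equal $(1 - \eta) \cdot \tfrac{1}{2} \sum_k |\mu_k - 1/d|$.

```python
import numpy as np

def trace_distance(a, b):
    """Trace distance 1/2 * ||a - b||_1 for Hermitian a and b (assumed convention)."""
    return 0.5 * float(np.sum(np.abs(np.linalg.eigvalsh(a - b))))

rng = np.random.default_rng(1)
d, eta = 8, 0.7

# Ideal pure state |psi><psi|.
psi = rng.normal(size=d) + 1j * rng.normal(size=d)
psi /= np.linalg.norm(psi)
rho_id = np.outer(psi, psi.conj())

# Random error density matrix (hypothetical test data).
g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho_err = g @ g.conj().T
rho_err /= np.trace(rho_err).real

# Assumed forms of Eq. (8) and Eq. (2).
rho = eta * rho_id + (1 - eta) * rho_err
rho_wn = eta * rho_id + (1 - eta) * np.eye(d) / d

# Uniformity of the error eigenvalue distribution p_mu.
mu = np.linalg.eigvalsh(rho_err)
uniformity = 0.5 * float(np.sum(np.abs(mu - 1.0 / d)))

lhs = trace_distance(rho, rho_wn)
rhs = (1 - eta) * uniformity
print(lhs, rhs)
```

The agreement is exact up to rounding and does not depend on how $\rho_{err}$ relates to $\rho_{id}$, because the ideal-state term cancels in the difference.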

Upper bounding the uniformity measure
In this section we upper bound the uniformity measure based on the number of gates and error rates in a quantum circuit.
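Statement 4. We adopt the bounds of [23] in Eq. (11) for the distance between probability distributions measured in the standard basis, $\tfrac{1}{2} \| p_{noisy} - p_{wn} \|_1$, and assume the same bounds approximately apply to any measurement basis. Then, it follows that the uniformity measure from Statement 2 is approximately bounded by the same bounds as

$$W = O\!\left( \frac{e^{-\xi}\, \xi / \sqrt{\nu}}{1 - e^{-\xi}} \right) + O\!\left( \frac{E_w}{1 - \lambda_1} \right),$$

where the approximation error $E_w$ is stated in Statement 1.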
Proof. Let us consider measurements performed in the eigenbasis of the density matrix, which yield the eigenvalues as measurement probabilities, $p_{noisy} = (\lambda_1, \lambda_2, \dots, \lambda_d)$. Measuring the white noise state in the same basis yields an approximation of the corresponding probabilities $p_{wn}$ using the error term from Eq. (A4). We then compute the distance of the above two measurement probability distributions. Assuming that the bound on the measurement probabilities $\tfrac{1}{2} \| p_{noisy} - p_{wn} \|_1$ from Eq. (11) approximately holds for any measurement basis, we can bound the eigenvalue uniformity as claimed in Statement 4. In the last equation we introduced the approximation of $F$ from Eq. (9) as well as the approximate dominant eigenvalue from Eq. (6).
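To get a feeling for the leading term of the bound in Statement 4, $e^{-\xi} \xi / \sqrt{\nu}$ divided by $1 - e^{-\xi}$, it can be evaluated numerically. The sketch below drops all constants hidden in the $O(\cdot)$ notation and ignores the $E_w$ term; the function names are hypothetical. It uses the identity $\xi e^{-\xi} / (1 - e^{-\xi}) = \xi / (e^{\xi} - 1)$, whose Taylor series begins with $1 - \xi/2$.

```python
import numpy as np

def uniformity_bound(xi, nu):
    """Leading term xi*exp(-xi) / (sqrt(nu) * (1 - exp(-xi))) for xi > 0.

    Rewritten as xi / (expm1(xi) * sqrt(nu)); expm1 stays accurate for small xi.
    Constants hidden in the O(.) notation are dropped (illustrative only).
    """
    xi = np.asarray(xi, dtype=float)
    return xi / (np.expm1(xi) * np.sqrt(nu))

def small_xi_expansion(xi, nu):
    """First-order expansion (1 - xi/2) / sqrt(nu), valid for small xi."""
    return (1.0 - np.asarray(xi, dtype=float) / 2.0) / np.sqrt(nu)

nu = 100
for xi in (1e-3, 1e-1, 1.0):
    print(xi, float(uniformity_bound(xi, nu)), float(small_xi_expansion(xi, nu)))
```

For small $\xi$ the leading term therefore approaches $1/\sqrt{\nu}$, i.e., it decays polynomially in the number of gates, while for larger $\xi$ the first-order expansion visibly departs from the full expression.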

a. Expanding the upper bound
We now expand the upper bound from Statement 4 for small $\xi$. More specifically, we consider the parametrised fit function from Eq. (15).

In Fig. 6 we repeated the same simulation as in Fig. 3 (a), i.e., using an HVA ansatz for the TFI spin model at random circuit parameters, but we appended to each layer a series of parametrised Rz gates on each qubit. This guarantees that the dynamic Lie algebra generated by the Pauli terms of the TFI problem in Eq. (16) is expanded by the inclusion of Pauli Z operators. Increasing the circuit depth of the HVA ansatz thus leads to a faster increase of the dimensionality of the Lie algebra, which demonstrably leads to a faster scrambling of local noise into global white noise, e.g., a steeper slope of the $\epsilon \to 0$ fit in Fig. 6 than in Fig. 3.
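The effect of the added Rz gates on the Lie algebra can be illustrated on a small example. The sketch below is not the authors' code and rests on stated assumptions: it takes the TFI Hamiltonian to consist of nearest-neighbour $Z_i Z_{i+1}$ couplings on an open chain plus transverse-field $X_i$ terms, uses each Pauli term as a separate generator, and computes the dimension of the generated Lie algebra by repeatedly taking commutators and keeping linearly independent results. All names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def embed(ops, n):
    """Tensor product over n qubits; `ops` maps a site index to a single-qubit operator."""
    out = np.eye(1, dtype=complex)
    for site in range(n):
        out = np.kron(out, ops.get(site, I2))
    return out

def lie_closure_dim(generators, tol=1e-9):
    """Dimension of the Lie algebra generated by i*H for the Hermitian generators H."""
    d = generators[0].shape[0]
    basis = []  # orthonormal vectors (Frobenius inner product) spanning the algebra

    def try_add(mat):
        # Modified Gram-Schmidt: keep only the component outside the current span.
        # Complex independence suffices: for anti-Hermitian matrices the real and
        # complex span dimensions coincide.
        vec = mat.reshape(-1).astype(complex)
        for b in basis:
            vec = vec - np.vdot(b, vec) * b
        norm = np.linalg.norm(vec)
        if norm > tol:
            basis.append(vec / norm)

    for h in generators:
        try_add(1j * h)

    done = 0
    while done < len(basis):
        a = basis[done].reshape(d, d)
        # Elements appended during this pass are paired with `a` when their turn comes.
        for k in range(len(basis)):
            b = basis[k].reshape(d, d)
            try_add(a @ b - b @ a)
        done += 1
    return len(basis)

n = 3  # small chain so that the 4**n - 1 dimensional operator space stays tiny
zz_terms = [embed({i: Z, i + 1: Z}, n) for i in range(n - 1)]
x_terms = [embed({i: X}, n) for i in range(n)]
z_terms = [embed({i: Z}, n) for i in range(n)]  # generators of the added Rz gates

dim_hva = lie_closure_dim(zz_terms + x_terms)
dim_hva_rz = lie_closure_dim(zz_terms + x_terms + z_terms)
print("DLA dimension without Rz:", dim_hva, "with Rz:", dim_hva_rz)
```

Under these assumptions the algebra must grow when the single-qubit Z generators are added: every assumed TFI generator commutes with the parity operator $X \otimes X \otimes X$, whereas a single $Z_i$ anticommutes with it and therefore cannot lie in the original algebra.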

Scaling with the number of qubits
In Fig. 7 we simulate the same circuits as in Figs. 1 to 4 at error rates $\epsilon \to 0$ and plot the fit parameter $\alpha$, which is the prefactor in Eq. (15), for an increasing number of qubits. The results appear to show an asymptotically non-increasing trend, consistent with the theoretical expectations of [23] for random circuits, whereby $\alpha$ is bounded by a constant in terms of the number of qubits.
FIG. 1. Simulating families of 10-qubit strong entangling layer (SEL) ansatz circuits [32] at random gate parameters for an increasing number $\nu$ of gates and per-gate depolarising error rates $\epsilon$. (a) The uniformity measure $W(\nu)$ of the error eigenvalues of the density matrix from Eq. (12) closely matches the theoretical model (dashed lines) for random circuits and confirms that increasing the number of gates in random circuits scrambles local noise into global white noise. (b) The commutator norm $C(\nu)$ from Eq. (14) is significantly smaller in absolute value and decreases with a larger polynomial degree (steeper slope of the dashed lines) than the uniformity measure; this suggests that the dominant eigenvector of the density matrix $\rho$ approximately commutes with $\rho$ even when noise is not well described by white noise. The $\epsilon \to 0$ simulations were approximated using $\epsilon = 10^{-8}$ ($\epsilon = 10^{-7}$) when calculating $W$ ($C$).

FIG. 3. TFI: same simulations as in Fig. 1 but using 10-qubit HVA quantum circuits constructed for the TFI spin problem Hamiltonian. (a, c) At randomly chosen circuit parameters $W(\nu)$ decreases more slowly, with a smaller polynomial order, than for random circuits; see text and the simulations with added layers of Rz gates in Fig. 6. (b) At the VQE parameters white noise is again not a good approximation, i.e., the uniformity measure $W(\nu)$ is large and does not decrease as we increase $\nu$. (d) The commutator norm $C(\nu)$ is smaller than $W(\nu)$ in absolute value by an order of magnitude. The $\epsilon \to 0$ simulations were approximated using $\epsilon = 10^{-8}$ ($\epsilon = 10^{-7}$) when calculating $W$ ($C$).

FIG. 4. LiH: same simulations as in Fig. 1 but using 6-qubit HVA quantum circuits constructed for a LiH molecular Hamiltonian. (a, c) At randomly chosen circuit parameters both $W(\nu)$ and $C(\nu)$ decrease as expected for random circuits due to our randomised compiling strategy [43,44]. (b) At the VQE parameters white noise is an increasingly bad approximation, i.e., the uniformity measure $W(\nu)$ increases as we increase $\nu$. (d) The commutator norm $C(\nu)$ is smaller than $W(\nu)$ in absolute value by 2 orders of magnitude. The $\epsilon \to 0$ simulations were approximated using $\epsilon = 10^{-8}$ ($\epsilon = 10^{-7}$) when calculating $W$ ($C$).

$\tfrac{1}{2} \| p_{noisy} - p_{wn} \|_1$ and assume the same bounds approximately apply to any measurement basis. Then, it follows that the uniformity measure from Statement 2 is approximately bounded by the same bounds as $W = O\!\left( \frac{e^{-\xi}\, \xi / \sqrt{\nu}}{1 - e^{-\xi}} \right) + O\!\left( \frac{E_w}{1 - \lambda_1} \right)$

FIG. 6. (left) TFI-HVA ansatz: same simulations as in Fig. 3 (a) but with added parametrised Rz gates after each layer. The additional gates increase the dimensionality of the dynamic Lie algebra, which leads to a faster scrambling of local gate noise into white noise, e.g., the $\epsilon \to 0$ curve is steeper than in Fig. 3 (a). See Appendix B for more details. (right) The dependence on the number of qubits shows a very similar trend as without the Rz gates, i.e., compare to Fig. 7 (c).

2. Inserting additional gates to the TFI ansatz

We obtain the above matrix by applying a suitable unitary transformation $\tilde{\rho} := U \rho U^\dagger$ such that $|\tilde{\psi}_{id}\rangle := U |\psi_{id}\rangle = (1, 0, \dots, 0)$ while $F, C_k, D_k \geq 0$ with $k \in \{2, 3, \dots, d\}$ and with $d$ denoting the dimension, and all other matrix entries are zero. Given the above arrowhead representation of a quantum state, one can analytically compute eigenvalues of the density matrix as roots of the secular equation [28,47].