Near-Term Distributed Quantum Computation using Mean-Field Corrections and Auxiliary Qubits

Distributed quantum computation is often proposed to increase the scalability of quantum hardware, as it reduces cooperative noise and requisite connectivity by sharing quantum information between distant quantum devices. However, such exchange of quantum information itself poses unique engineering challenges, requiring high gate fidelity and costly non-local operations. To mitigate this, we propose near-term distributed quantum computing, focusing on approximate approaches that involve limited information transfer and conservative entanglement production. We first devise an approximate distributed computing scheme for the time evolution of quantum systems split across any combination of classical and quantum devices. Our procedure harnesses mean-field corrections and auxiliary qubits to link two or more devices classically, optimally encoding the auxiliary qubits to both minimize short-time evolution error and extend the approximate scheme's performance to longer evolution times. We then expand the scheme to include limited quantum information transfer through selective qubit shuffling or teleportation, broadening our method's applicability and boosting its performance. Finally, we build upon these concepts to produce an approximate circuit-cutting technique for the fragmented pre-training of variational quantum algorithms. To characterize our technique, we introduce a non-linear perturbation theory that discerns the critical role of our mean-field corrections in optimization and may be suitable for analyzing other non-linear quantum techniques. This fragmented pre-training is remarkably successful, reducing algorithmic error by orders of magnitude while requiring fewer iterations.


Introduction
One prospective trajectory for quantum information hardware is distributed quantum computing [1][2][3], the quantum analog of the celebrated classical field [4][5][6][7].Distributed quantum computing seeks to eliminate the need for large, monolithic quantum computers, which suffer from cooperative noise [8,9].Instead, large-scale problems will be split among many smaller quantum computers that are in communication with each other via a quantum interconnect, a standardized form of quantum communication between remote quantum computing platforms [10,11].
While the benefits of distributed quantum computing are abundant, many obstacles complicate its realization.For instance, due to the no-cloning theorem [12], extensive quantum entanglement would be a required component of quantum interconnects in order to enable non-local operations such as quantum teleportation [1,2,9,13].Moreover, faulttolerant quantum computing would be needed to compute and transmit quantum information between distributed simulators reliably [11,14].Finally, long coherence times or relatively local topology would be necessary to manage the time delays associated with communication between remote locations [9,15].
Nevertheless, the promise of scalability continues to inspire research in various facets of distributed quantum computing.Researchers have characterized the compilation of quantum circuits into cohesive network instructions [16] and devised a language to communicate such instructions more efficiently than conventional circuit diagrams [17].Likewise, much work has been done to develop the non-local operations integral to distributed quantum computing, which have been supported with experimental realizations [13,[18][19][20].Other studies have developed algorithms tailored to quantum distributed architectures, including Shor's algorithm, quantum sensing, and combinatorial optimization [21][22][23][24], while additional research has focused on the quantum advantage provided by quantum distributed computing [25][26][27][28].Still other research has addressed how to approach distributed algorithm design [24,29], the effect of noise in distributed quantum computing [30], architecture selection and scalability [14,[31][32][33], and resource allocation [34][35][36], particularly to optimize teleportation cost [37][38][39].
Although the interest in its theoretical application continues to grow, a wide gap remains between much distributed quantum computing research and its physical implementation.Research along a different vein has instead concentrated on applications that are realizable using near-term hardware, stretching the limit of noisy quantum simulators' utility.Although not distributed in the sense of the works discussed above (which assume that the distributed hardware forms a quantum network), these approaches involve small groups of qubits simulated in parallel or in sequence to address a larger problem.Entanglement forging is one such approach [40], which relies on shifting computation to classical post-processing in order to assemble information from two smaller circuits, thereby halving the maximum circuit size required for the calculation.The quantum tensor network approach uses the framework of tensor networks to identify weakly entangled subgroups and parallelize quantum simulation [41].Similarly, Quantum Multi-Programming (QMP) takes advantage of the increasing size of available quantum simulators to execute multiple shallow quantum circuits concurrently [42,43].
In order to bridge the distributed quantum computing paradigm with the capabilities of near and moderate-term hardware, in this manuscript, we design two procedures that approximately link distributed simulators while remaining amenable to small-scale, noisy devices.Our schemes of fragmented quantum simulation explore what problems can be Interactions J αβ ij are represented by lines between qubits; one label is included for clarity.Interactions that span the two fragments form the interface I. 1b: The case where the two fragments are linked via a classical channel.Mean-field measurements ⟨ Ŝ(j) β ⟩ are exchanged classically.One auxiliary qubit a 1,2 is included in each fragment's simulation, interacting with the fragment qubits according to a target qubit in the opposite fragment (identified in the figure by a blue / green circle).1c: The case where the two fragments are linked via a quantum channel.While mean-field measurements ⟨ Ŝ(j) β ⟩ are still exchanged classically, the auxiliary qubits are physically shared between fragments using some form of quantum communication.
addressed without full information transfer between hardware.First, focusing on the task of time evolution, we partition a system of qubits into subgroups (referred to as fragments) that are treated separately.We harness mean-field measurements to inform mean-field corrections [44] that link the distinct fragments.These simulations could be executed in parallel on a single simulator (as in QMP [42,43]), outsourced to different simulators (as in distributed computing [1,2,9,13]), or even simulated using a mixture of classical and quantum resources (as in heterogeneous computing [45,46]).We further make use of a limited number of auxiliary qubits to mimic the presence of the qubits located on distant simulators.
In our first approach to distributed time evolution, we rely on classical communication to transmit partial state information between distant simulators through measurements, omitting a quantum link between devices.Transmitting incomplete information reduces the generally exponential number of measurements required to relay complete information of a quantum state via a classical channel.For locally interacting systems, the classical fragmentation scheme closely approximates quantities local to each fragment -including the fidelity of the fragment -for timescales up to several 1/J, where J weights the system's interactions.We present a second scheme that is supplemented by limited quantum information transfer, consequently composing an interface of classical and partial quantum information transfer that approximately connects quantum simulators.We show numerically that the limited use of quantum communication significantly extends the scheme's performance to longer evolution times, even for long-range interacting systems.As non-local operations become more available, this technique could be employed in moderate-term distributed applications before a fully connected quantum network is achievable.
Using the same fragmentation framework, we devise a fragmented pre-training approach for variational quantum algorithms, focusing on the variational quantum eigensolver algorithm (VQE) [47].The pre-training can be performed classically or using resource-limited hardware, as only portions of the full circuit are considered.For classical MaxCut problem graphs, the pre-training method reduces energy error by various orders of magnitude on average, and requires over an order of magnitude fewer circuit preparations.For transverse field Ising-like models [48,49] outside of the classical domain, our pre-training scheme maintains a significant advantage in the regime of a small transverse field h.
The remainder of the paper is organized as follows.In Section 2, we first present a fragmented approach to quantum simulation that only involves the classical transfer of partial state information.We further consider an alternate scheme for the case of linking quantum simulators with reduced quantum information transfer through selective qubit shuttling [50] or teleportation [51,52], in addition to classical information transfer.In Section 3, the performance of each scheme is evaluated for the time evolution of quantum Ising-like spin Hamiltonians [53], which are amenable to quantum simulation using trapped ions and Rydberg platforms [54,55].Finally, in Section 4 we expand the scheme to apply to the optimization of quantum circuits.The use of our fragmentation scheme to assist VQE is evaluated in Section 5 [47].The role of mean-field corrections in the optimization through the lens of perturbation theory [56] is explored in Sections 5.2.2 and 5.2.3.In Section 5.2.3, we introduce a non-linear perturbation theory to study mean-field corrected Hamiltonians, analytically formalizing the success of our pre-training approach.

Fragmented Quantum Simulation
In our method of fragmented quantum simulation, we divide a system of N qubits into two or more sub-systems, here referred to as fragments (see Fig. 1a).Each fragment contains some number of qubits N f < N , such that f N f = N .The fragments are treated separately, but it is possible to approximate the presence of a fragment's environment, that is, the qubits outside of a given fragment, through corrective fields and interactions [57].We devise meanfield corrections (described in detail in Section 2.1) [44], which are informed by measurements of a fragment's environment, to actively adjust the state of a fragment.Corrective interactions are mediated by the inclusion of auxiliary qubits within each fragment's simulation, such that f N f +a > N , where N f +a = N f + N a and N a represents the number of auxiliary qubits included in fragment f .Each auxiliary qubit mimics the behavior of one environment qubit, which we refer to as the target qubit for that auxiliary.Each auxiliary qubit interacts with the fragment's qubits according to the same interaction terms as the corresponding target qubit, as prescribed by the original Hamiltonian, enabling entanglement to grow beyond the N f fragment qubits.
Fig. 1b provides an overview of our classically-linked fragmentation scheme, and a detailed diagram is provided in Fig. A1.We define a fragment's interface I to be the collection of interactions existing in the original Hamiltonian that act between fragment qubits and environment qubits.The combination of auxiliary qubits and mean-field corrections collectively mimics the action of the interface on the fragment.The growth and faithfulness of the entanglement within a fragment will be limited by the number of auxiliary qubits included -an unavoidable limitation of the scheme -but the effects of this limitation can be mitigated through judicious fragmentation of the system.Firstly, to mitigate fragmentation error (that is, the error produced by the omission of some system interactions and the resultant reduction of Hilbert space), one can choose to divide the system qubits such that the qubits interacting most influentially with each other are confined to a single fragment.Secondly, it is possible to make an informed choice of target qubit for each auxiliary.This is explored further in Section 2.2.

Mean-Field Corrections
Consider the class of spin models: Here, Ŝ(i) α and Ŝ(j) β are spin-1/2 spin operators acting on sites i and j, where α, β ∈ {x, y, z}.The coefficient J α,β ij gives the strength and sign of the interaction.For concreteness and without loss of generality, we have selected transverse fields h i to point along the x-axis.The Hamiltonian acting strictly within some sub-system f will neglect any operators acting outside of f , yielding the bare Hamiltonian that acts within a fragment f when no corrections are included.
Clearly, the simple exclusion of interactions that span the interface between f and its environment (i.e., the fragmented evolution of f under H (f ) ) will, in general, poorly approximate the evolution of the sub-system under the full Hamiltonian.The fragment qubits will behave as a closed system without external interactions.Although generally these interactions cannot be exactly simulated without modeling all of the system's spins on a single fragment, we introduce a mean-field to partially capture the action of each missing interaction.Mean-field methods have frequently been used to simplify the simulation and study of quantum systems, and statistical physics [44,58,59].Here, the strength and sign of the introduced meanfield correction is informed by the measurement of the corresponding environment spin, while the correction's axis is determined by that of the corresponding interaction's spin operator that would act within fragment f .The resulting mean-field corrected Hamiltonian is given by: H The strength and direction of the mean-fields appearing in H (f ) M F should be updated regularly to reflect the current state of the environment spins.Physically, this requires regular meanfield measurements of the fragments.Evolution must therefore be reset to the initial state in order to proceed by one time step dt, with each new mean-field measurement being stored to progress the evolution.The process of incrementing the time evolution by one time step per simulation is commonly implemented in order to track the time dynamics of an observable [60], resulting in a complexity that scales as O(N 2 t ) in the number of time steps N t .

Auxiliary Target Spin Selection
For nearest-neighbor spin models (e.g., the transverse field Ising model [49]), the selection of a target spin for each auxiliary is somewhat trivial, as at most two qubits interact with a fragmented section of the chain.The choice of auxiliary qubit encoding may be unclear for more general systems.Here, we present a method for auxiliary target qubit selection that yields, on average, the optimal auxiliary qubit encoding.Specifically, we consider how auxiliary target selection affects the simulation error to the first non-vanishing order in dt.This simulation error arises from the omission of interactions forming the interface of some particular fragment f and the remaining environment spins E. The full derivation of the leading error is provided in Appendix A.2; here, we sketch the derivation and build on the result.
The fidelity between a system evolved using our fragmented procedure with that of the full system can be expressed as: The unitary operator U (t) = exp(−iHt) evolves the system exactly under the full Hamiltonian H, while U (f ) I t) evolves the system under a fragmented Hamiltonian that neglects interactions crossing the interface of fragment f .In Appendix A.2, this expression is expanded for short times t to understand how the evolution error ϵ(t) = 1 − F (t) depends on the strength of the neglected interactions.Through the use of Taylor expansion and the Baker-Campbell-Hausdorff (BCH) formula [61], we arrive at the first non-vanishing correction to the fidelity: where var(O) is the quantum variance of operator O.The error ϵ(t) = 1 − F (t) is thus given by var(H − H (f ) I )t 2 for short times t.The form of the short-time error provides a simple rule for choosing the target auxiliary qubits for fragment f to minimize error; namely, select the environment qubit(s) whose interactions contribute most significantly to the variance var(H − H (f ) I ).This choice will minimize the short-time error of evolving the state by the fragmented Hamiltonian, which will lead to higher fidelity performance, on average (see Section 3.3).Moreover, if the auxiliary selection is updated sufficiently often, the selection becomes exact as the short-time error dominates from the time of one auxiliary encoding to the next.

Practical Implementation of the Optimal Auxiliary Encoding
Although the final form of the short-time evolution error provides insight into optimal auxiliary selection, the procedure for estimating a particular qubit's contribution to the error within the distributed framework is less straightforward.For a general spin Hamiltonian, this variance is given by: Estimating the full variance of Eq. ( 6) requires 4-point correlation measurements.If the distributed simulators are linked solely via classical channels, correlation measurements are only accessible when all relevant qubits are local to a single fragment.This implies that two auxiliary qubits -one targeting j and one targeting j ′ -must already be placed within the fragment in order to access the required 4-point correlator measurements.For N E environment qubits, there are O(N 2 E ) combinations, requiring O(N 2 E ) copies of the system in order to estimate all required 4-point correlators, undermining (although not necessarily precluding) the motivations for fragmented quantum simulation with such a technique.
The correlator calculation simplifies significantly when the variance is calculated with respect to a known product state, but a new issue arises: for many spin model Hamiltonians, the variance will vanish for certain initial product states.In fact, for the case of the transverse field Ising model [49], this quantity vanishes for all computational basis states, providing no insight into the proper auxiliary choice.
We propose a two-part solution that addresses these issues.First, we propose a proxy v(a) that estimates the contribution of one potential auxiliary a to the variance: Inserting the two Dirac delta functions δ j,a δ j ′ ,a eliminates the cross-terms in Eq. 6 that depend on multiple environment qubits.Thus, a single auxiliary is required to estimate v(a), and in total O(N E ) partitions are required to acquire v(a) for all potential auxiliary targets.In addition to requiring fewer measurements, this proxy focuses on a's contribution to the variance while neglecting the cross-terms that involve contributions from other potential auxiliary qubits.Secondly, to avoid scenarios where the variance vanishes for initial product states, we suggest first evolving the system for one time step dt for a particular choice of a before estimating v(a).Although this procedure is more involved than calculating v(a) for the initial product state directly, the overhead remains linear in the number of potential auxiliary qubits.

Simulators Linked via Classical Information
We first focus on the scheme free of quantum information transfer, where the auxiliary qubits are selected at the beginning of the simulation and fixed to target a single environment qubit throughout the evolution.We refer the reader to Fig. 1 and Appendix A.1 for an in-depth look at how a system is fragmented for time evolution.As a representative example, consider Figure 2: Nearest neighbor TFIM with constant J = 1.0.To produce 2a, N = 12 qubits are split into two fragments, and the fidelity between the fragment qubits' state and the exactly evolved system is plotted for various numbers of auxiliary qubits, with (dashed lines) and without (solid lines) mean-field corrections.The fidelity is averaged over non-zero h values ranging between ±1.Performance progressively increases with increasing N a and the addition of mean-field corrections.2b displays the local expectation of Ŝz and Ŝx for a system of N = 12 qubits for the specific case of h = 1.0, with the corner label indicating site index.Here, we fragment the system into four fragments, each containing three qubits, and contrast the case of no communication (in red) to that of including N a = 2 auxiliary qubits and mean-field corrections, which match the exact expectation values for longer simulation times.
the transverse field Ising model (TFIM) [48,49] with a uniform transverse field: This system has been studied in depth to better understand the physics of quantum phase transitions [62][63][64].We consider the evolution of a quantum system initialized in the computational basis state |0⟩ under H T F IM , implementing exact unitary evolution numerically using PennyLane [65] with mean-field measurements updated every dt = 0.1/J.Fig. 2 displays the scheme's performance for a 12-qubit model with nearest neighbor interactions of J = 1.0.The results presented in Fig. 2a are averaged over non-zero transverse fields h ranging from ±1, while Fig. 2b features the specific case of h = 1.0.In Fig. 2a, the system is split into two fragments, each simulating six of the system qubits and some number of auxiliary qubits.The average of the quantity F f is plotted, which we define as the fidelity between the reduced density operator of the system qubits within the fragment (tracing out any auxiliary qubits a) and the reduced density of the same system qubits for the exact evolution of the full system (tracing out all environment qubits forming E).We use the generalization of fidelity for density matrices [66] to enable the focused evaluation of the fragment sub-system: For a short evolution time, the scheme captures the correct state of the system qubits within the fragment.This time can be extended by the inclusion of additional auxiliary qubits.In Fig. 2b, we consider a specific instance of the TFIM with J = 1.0 and h = 1.0.To test the scheme, we split the system into smaller partitions with N f = 3.When we make use of two auxiliary qubits and mean-field corrections, the local expectation values exhibit little error for several units of Jt, as expected from the fidelity results.
In Fig. 3, we examine the scaling performance for the nearest neighbor model, increasing the number of fragments simulated with increasing N (keeping N f constant) with N a = 2.In the left panel, the first fragment is considered.This fragment contains the boundary of the chain, and consequently, the fragment's interface consists of only one missing interaction.The right panel considers the second fragment, which is on the interior of the chain for N > 6 and consequently neglects two interactions, leading to reduced performance.This is manifested in the reduced fragment fidelity going from N = 6 to N = 9 for fragment 2.However, there is no such visible drop going from N = 9 to N = 12 due to the monogamy of entanglement [67] -that is, although the number of qubits in the system grows, the qubits that are most strongly entangled with each other remain local to one fragment, and thus the amount of lost information shrinks as N is further increased.We therefore expect our classical scheme to scale well with N for systems that are locally interacting, and to serve as a strong approximation for moderate evolution times.
Figure 3: Scaling performance of the classical scheme for the nearest neighbor TFIM with J = 1.0, averaged over h values ranging from ±1.The number of system qubits within a fragment N f is kept constant as N is increased, with N a fixed to be zero (no communication, in black/gray) or two (in blue).The left panel plots ⟨F f ⟩ for the first fragment (which includes the boundary qubit and thus only involves one interaction crossing the interface), while the right panel plots ⟨F f ⟩ for the second fragment (which, for N = 9 and N = 12, is an interior fragment with two interactions crossing the interface).

The Addition of Quantum Information Transfer
Next, we examine the case of selective quantum information transfer between quantum simulators, applicable when non-local operations are available, even if only in a limited capacity.In this case, the fragmentation scheme can be modified to include limited quantum information transfer (a quantum channel [9]).The role of the auxiliary qubits shifts from being bystanders confined to a single fragment to qubits that are physically shared between simulators through selective non-local interactions, accomplished through qubit shuttling [50] or teleportation [51,52] (see Fig. 1c).If the simulations are being executed in parallel on a single quantum simulator [42,43], this would only require a few additional SWAP gates to include a limited number of cross-simulation interactions.In addition to providing more complete information transfer, a quantum channel further enables the active correction of auxiliary encoding as the system evolves.The selected number of auxiliary qubits places a limit on the number of qubits that are physically teleported / shuttled to a fragment; however, which environment qubits play this role can be changed from one time step to the next depending on which potential auxiliary qubit(s) have the largest contribution to the most recent estimate of the short-time error.When quantum channels and synchronized measurements are available, all correlation measurements are accessible.The quantity v(a) can thus be estimated for any a at any time.As the potential auxiliary qubits' contribution to the variance shift, new auxiliary qubits can be selected -that is, we can make a new selection for which qubit(s) physically interact with a fragment native to a different simulator.If the time steps are sufficiently small such that the first non-vanishing order in the error dominates, then this becomes optimal even for long simulation times.
To evaluate this scheme numerically, we abstract away the details of information transport; this topic has been investigated by other research in the context of distributed time evolution [68].In our actively updated simulation, the quantity v(a) is calculated for each potential auxiliary qubit at each time step to determine its contribution to the shorttime error.This requires the estimation of correlators between each potential auxiliary and each fragment system qubit (requiring O(N f N E N 2 αβ ) per fragment, where N αβ is the number of αβ interaction types), but if there is only one kind of interaction (as is the case for the TFIM and other Ising-like models, with α = β = z), all relevant correlators can be estimated from a set of full system snapshot measurements.The largest contributors are selected to be auxiliaries -numerically, this amounts to keeping the interactions between these qubits and the fragment qubits, while zeroing the J αβ ij coefficients of all other environment-fragment interactions (see Appendix A.3).Any zeroed interactions can be approximately included via mean-field corrections.At the next time step, the selection of zeroed interactions might change due to a change in the selected auxiliaries for each fragment, as dictated by the short-time error.For reference, the independent case (no information transfer) is included in red.The results are averaged over 100 Ising-like Hamiltonians with constant h = 1.0 and randomly generated graphs J ij (see Section 3.2).The N qubits are split into groups such that N f = 3, with an additional N a = 2 auxiliary qubits employed in the simulation.
Fig. 4 compares this scheme (labeled "Q.Channel" to indicate the addition of quantum information transfer) to the previous scheme in Section 3.1, which involves only classical information transfer ("C.Channel").The graph plots the fragment fidelity F f averaged over 100 transverse field Ising-like models with h = 1.0 and randomly generated graphs J ij .Each edge ij exists with probability 0.5, and edge weights J ij are sampled from a Gaussian distribution with mean µ = 0.0 and width σ = 1.0.Furthermore, we randomly select a computational basis state to initialize the fragmented system.Although both schemes outperform the case of no information transfer (in red), the complicated long-range nature of the Hamiltonians considered challenges the previous scheme, which only employs classical information transfer.In contrast, the quantum scheme preserves a large fragment fidelity, even at late simulation times.

Short-Time Error Auxiliary Selection
The benefit of using short-time error to inform auxiliary selection can be isolated by evaluating the performance of each auxiliary choice independently.Consider a system of N = 12 qubits, fragmented into two groups of N f = 6.This leaves six environment qubits from the perspective of each fragment that could be targeted by an auxiliary qubit.Selecting two auxiliary qubits (N a = 2), we rank the six potential choices for target auxiliary encoding according to the size of v(a).In Fig. 5, the six target encoding choices are divided into three groups of two based on v(a), and each option is explored for randomly generated transverse field Isinglike Hamiltonians with h = 1.0, as considered in the previous section.A total of 100 such Hamiltonians are generated and simulated; the averaged results are presented in Fig. 5, where v 0 corresponds to encoding the two environment qubits with the largest value for v(a).On the left, the results are plotted for the case of classical information transfer.Any separation between the fidelity curves corresponding to different auxiliary choices indicates that the v(a) metric meaningfully separates the potential auxiliary choices according to fidelity performance.The fact that the ordering corresponds to the ranked choice is evidence that using short-time error to select auxiliary encoding propagates to better performance at later times.In red, we consider random auxiliary encoding.The random performance roughly converges to the middle-ranked choice v 1 and can be thought of as the performance averaged over auxiliary encoding.In the center, the results are plotted for the case of additional quantum information transfer, without actively updating the auxiliary encoding.The results qualitatively match those of the classical case, with slightly better performance overall, consistent with Fig. 4. In the right panel, we consider the quantum channel with actively updated auxiliary encoding.In this case, v 0 (v 2 ) corresponds to selecting the two auxiliary targets with the largest (smallest) values for v(a) at each decision.The performance of v 0 marginally increases with the introduction of active updates, while the performance of v 2 marginally decreases.However, the random performance increases most markedly.Here, the rapid shuffling of auxiliary qubit encoding allows the fragments to quickly share information, leading to performance comparable to the optimal variance choice, v 0 .The random, actively updated case has the added advantage of being measurement-efficient as it forgoes any variance estimation, but the highly frequent change of auxiliary encoding may lead to an overhead in qubit routing / swapping in order to be realized.
Finally, we note that in the averaged results presented in Fig. 5, the mean-field corrected simulation (plotted with a dashed line) outperforms the corresponding simulation that fully neglects these interface interactions for every case considered.Appendix B investigates the use of mean-field corrections to reduce simulation error through a numerical study.

Fragmented Quantum Circuits
We now investigate the use of fragmentation in quantum circuit evolution.Consider the fragmentation of a parameterized quantum circuit (PQC) of size N into multiple smaller PQCs.To fragment a circuit, multi-qubit unitaries that act on qubits outside the N f +a qubits devoted to a single sub-system's PQC are neglected.Although this resembles the first step of circuit cutting techniques [69], no data processing is required to reconstruct the cut gates; they are simply ignored.Crucially, some auxiliary qubits are included in each sub-system PQC, such that the full set of sub-system PQCs overlap with one another and f N f +a > N (see Fig. 6).The collection of fragmented circuits can be optimized alone prior to optimizing the full circuit as a new approach to pre-training, commonly employed to boost variational quantum algorithms [70][71][72][73][74][75][76].Pre-training generally uses classical resources and can greatly increase the accuracy of a variational algorithm's solution, which is crucial for many applications such as reaching chemical accuracy for quantum chemistry problems [77][78][79].Our pre-training approach is motivated by the fact that the parameter solutions of the smaller circuits are expected to be smoothly connected to the parameter solutions of the full quantum circuit, as explored by [80].We constrain the pre-training to use small circuits that are cheap to simulate classically.Furthermore, employing smaller circuits limits entanglement growth, which has been shown to improve training and avoid barren plateaus [80][81][82][83][84].

Application 2: Fragment-Initialized VQE
Our method of fragmenting a quantum circuit can be applied to classically pre-train quantum circuit parameters for the variational quantum eigensolver (VQE) [47].For this application, a PQC of size N is divided into smaller PQCs, each having size N f +a < N .To optimize each sub-system PQC, the mean-field-corrected Hamiltonian given in Eq. ( 3) is minimized.
In addition to facilitating the study of quantum systems and statistical physics, mean-field methods have been introduced for data analysis and loss function modification [85][86][87][88].In our pre-training technique, employing mean-field terms serves to link the optimization of the separate circuits by their current mean-field measurements.Overlapping parameters (that is, parameters shared by two fragmented PQCs) are initialized for one PQC using the most recent values from the other, further uniting the separate circuit optimizations.The meanfield measurements are updated regularly, and optimization halts when the steady state (up to some set precision) is reached for all parameters -those shared and those unique to one PQC -or the maximum number of iterations is reached.The algorithm is outlined in Algorithm 1.
Figure 6: Diagram depicting how a circuit can be fragmented into a number of smaller circuits with overlapping registers, analogous to the inclusion of auxiliary qubits.The N = 6 qubits are partitioned into three groups of two (with q 1 , q 2 addressed by the top PQC, q 3 , q 4 addressed by the middle PQC, and q 5 , q 6 addressed by the bottom PQC).Two additional auxiliary registers are included in each small PQC, such that some of the parameterized two-qubit gates appear in multiple PQCs.Gates that address qubits beyond the scope of one PQC are neglected by that particular circuit.
Algorithm 1 Fragment pre-training with mean-field corrections.(Randomly) initialize {θ i } for the brickwork section of the full PQC.Divide {θ i } into a set {θ f,i } for each fragment f .Initialize ⟨ Ŝ(j) β ⟩(0) = 0. repeat for f in system do θ f,i=a (k) ← θ i=a (k) for auxiliary spins a in f .

Details of Ansatz
We focus on pre-training brickwork circuits with a limited number of layers to constrain entanglement growth between fragments.Although a circuit ansatz with high complexity is often necessary for interesting VQE applications in order to provide enough expressivity to reach the ground state [89,90], fragmentation-based pre-training is still beneficial through the use of a layer-wise approach [91].If a shallow brickwork circuit is placed ahead of a more expressive PQC ansatz, the brickwork layers can first be optimized using the fragmented approach.These layers serve to bring the state of the system to have some ground state overlap.The full circuit VQE can then be performed, initializing the leading brickwork layers of the circuit with the pre-trained parameter values and initializing the remaining gates of the ansatz to approximately act as identity -specifically, we choose to randomly initialize these parameters to be small values bounded by ±ε (with ε = 10 −5 for our results), to balance maintaining the optimized action of the initial layers after pre-training while avoiding training issues associated with a true identity initialization [70,92].The overall circuit layout is outlined in Fig. 7.

Performance and Analysis of Fragmented Pre-Training
To evaluate our pre-training method, we focus on random Ising-like models.In Section 5.2.1, we present the numerical performance of the scheme for the classical case of zero transverse field (h = 0).Having established the advantage of the approach, in Section 5.2.2 we derive its success as stemming from the mean-field corrective terms included in the loss function, which shift the global minimum of the collective fragmented circuit to coincide with that of the full optimization problem.Finally, in Section 5.2.3, we use perturbation theory and numerical simulation to demonstrate that our approach remains beneficial for |h| > 0, in the regime of a weak transverse field.
Figure 7: The circuit ansatz is built from L s layers l(θ) with linear entangling gates, which are amenable to fragmentation.These are followed by a set of L C layers l(ϕ) with an all-to-all entangling architecture.Only the brickwork layers parameterized by θ i are pre-trained using the fragmented scheme, while the layers parameterized by ϕ j are employed only in the final training process.

VQE Results for MaxCut
We first benchmark the scheme using randomly generated classical Ising Hamiltonians, where the all-to-all J ij interactions are sampled from a Gaussian distribution (mean µ = 0.0, width σ = 1.0) and the transverse field h is fixed to be zero.These models can be mapped to MaxCut problems with randomly generated graphs [84].For the circuit ansatz, a fixed number of brickwork layers is used (L s = 4) to keep this portion of the circuit shallow, while the all-to-all entangling portion of the circuit is made up of L c = N layers.The parameterized single qubit rotations within each layer are selected to be one rotation about x followed by one rotation about y, and the entangled gates are selected to be controlled z (CZ) rotations.All simulations are performed numerically using PennyLane [65].Lastly, note that parameter convergence (evaluated every 100 iterations) is used as the stopping criterion for both the fragmented circuit and full circuit optimization, with a maximum of 5000 iterations permitted.
To assess the performance of pre-training using circuit fragmentation, the same circuit is optimized using random initial values (referred to as "vanilla VQE").Fig. 8 provides a caseby-case comparison between fragment-initialized VQE and vanilla VQE for 500 such models, for circuits of up to 15 qubits.In the top panels, the final percent error ϵ = (E − E 0 )/|E 0 | (where E 0 is the true ground state energy) is plotted for both approaches, along with the geometric mean of the results.The geometric mean of the fragment-initialized final error lies roughly three orders of magnitude below that of the vanilla VQE, with this gap growing even larger with increasing system size.For the larger system sizes, the vanilla VQE struggles to find a solution having ϵ < 10 −2 , while the fragment-initialized approach reaches ϵ ∼ 10 −7 for the same problem Hamiltonian.Moreover, using the same stopping criterion, the fragmentinitialized VQE reaches this solution in fewer iterations (N iter ), decreasing the average number by nearly an order of magnitude, as illustrated by the bottom panels of Fig. 8.After successful pre-training, the parameters of the stitched-together circuit produce a loss that is already in the neighborhood of the minimum, so fewer iterations are required to reach convergence.For this simulation, we employ a batched optimization of T different fragmented circuits performed in parallel.See Appendix D.1 for a description of this approach.The final percent error ϵ is provided in the top panel, while the required number of iterations N iter to reach convergence is provided in the bottom panel.For the fragment initialized case, these metrics refer to the full circuit training that occurs after pre-training.Fragment pre-training reduces the geometric mean of ϵ by orders of magnitude, even as the system size increases.Likewise, the mean number of required iterations is reduced by nearly an order of magnitude.

Solving MaxCut with Mean-Field Terms
Our modification of fragmented loss functions to replace missing (that is, inaccessible) interactions with mean-field terms is critical to the success of pre-training.We here demonstrate that when there is no transverse field (as is the case for Ising-like Hamiltonians that map to classical graph problems), mean-field replacement of interactions results in a ground state and ground state energy that coincide with that of the exact Hamiltonian.This can be shown using a simple logical argument.First, it is wellestablished that the ground state of a classical Ising Hamiltonian will be a computational basis state -indeed, this is why the ground state can be mapped to the solution of a classical problem.We denote the ground state by |x * ⟩.The ground state energy is simply a sum of the expected values of weighted ZZ interactions, taken with respect to the computational basis state |x * ⟩: Notice that for any computational basis state |x⟩, the value of the expectation of a ZZ interaction exactly equals the value of the product of the expectation of the individual Z operators; that is, ⟨x| z ⟩, the resultant energy is unchanged: The above is a central reason for the success of our fragmented training: for Ising-like models with zero transverse field, optimizing a Hamiltonian with mean-field corrections will solve the original problem mapped to the full Hamiltonian.Two potential error sources can arise: 1) the state produced by stitching the optimized circuits together can differ from the output of the individual circuits, and 2) the fragmented optimization may have limited success, e.g., by landing in a local minimum or stalling in a barren plateau.A balance should be struck between these complications: the first error source can be mitigated by considering larger fragments with a larger number of auxiliary qubits or possibly by limiting the number of inter-fragment unitaries, as done in [93], while the second can be mitigated by considering smaller fragments with fewer circuit parameters.

Mean-Field Terms as First-Order Perturbation Corrections
We now use perturbation theory to elucidate our technique of replacing multi-qubit interactions with mean fields when h ̸ = 0.In the previous section, it is established that the ground state and ground state energy of an Ising-like Hamiltonian with zero transverse fields remain unchanged when one or more of the interactions are replaced by the corresponding mean-field approximation term.Following a similar argument, one can further establish that the computational basis states are stationary states of the mean-field corrected Hamiltonian H M F (|ψ⟩), and therefore H M F (|ψ⟩) and the unaltered Hamiltonian H share the same spectrum and set of eigenstates (although this term is used loosely for H M F (|ψ⟩), as the dependence on |ψ⟩ causes the stationary Schrödinger equation to deviate from a linear eigenvalue problem).
In this section, we consider adding a small transverse field to the classical Ising-like model, propelling the problem into the quantum domain.The first-order corrections to the ground state |x * ⟩ and ground state energy E g are computed using perturbation theory.The case of the mean-field corrected Hamiltonian H M F (|ψ⟩) is treated with a version of perturbation theory modified to accommodate mean-field terms, and notably, the same first-order corrections to |x * ⟩ and E g are recovered.For a full derivation, please refer to Appendix C.
Adding a transverse field, the unaltered Hamiltonian containing all interactions is given by: where H 0 contains the intra-fragment interactions: H I contains the inter-fragment interactions: V contains the perturbing transverse field: and λ is a perturbation parameter.We remind the reader that the inter-fragment interactions H I are those that will be replaced by mean-field corrections.
In contrast, the mean-field corrected Hamiltonian denoted H M F is given by: where the form of the Hamiltonian now depends on the state of the system due to the meanfield corrections: Before any corrections can be computed, it is imperative to establish the correct zeroth order energies and eigenstates for each Hamiltonian.Following perturbation theory, the zeroth order eigenstates of H and H M F generally equal those of the unperturbed counterparts (that is, taking h = 0); these coincide with the set of computational basis states {|x⟩} -including the unperturbed ground state, |x * ⟩.However, there are degeneracies in the unperturbed Hamiltonians, and thus, degenerate perturbation theory is required.
When the unperturbed spectrum contains degeneracies, the proper linear combinations of the unperturbed eigenstates forming the degenerate subspace must be determined; these are the states that the perturbed eigenstates approach as h → 0. The unperturbed Isinglike model possesses Z 2 symmetry.Practically, this means that for each eigenstate |x⟩, the "flipped" eigenstate |x⟩ := i X i |x i ⟩ is degenerate.For the unaltered Ising-like model H, the proper zeroth order eigenstates for the degenerate subspace containing the ground state are given by |± The transverse field will break the ground state degeneracy of H, and the positive superposition |+ x * ⟩ is preferred by the ground state.
Shifting attention to the mean-field corrected Hamiltonian H M F , the stationary Schrödinger equation is no longer linear in |ψ⟩, and the linearity that characterizes quantum mechanics no longer applies.The notion of finding proper linear combinations is not an appropriate procedure due to the problem's nonlinearity.In particular, superpositions of degenerate eigenstates can yield different energies for H M F and thus effectively exist outside the degenerate subspace.
To illustrate this, consider a single mean-field factor, ⟨ Ŝ(i) z ⟩, such as those within H M F .While the expectation value of this quantity with respect to a computational basis state |x * ⟩ yields evaluating the same term with respect to |+ x * ⟩ leads to the term vanishing as Notably for the ground state of the unperturbed Hamiltonian |x * ⟩, this means that the pure computational basis states |x * ⟩, |x * ⟩ are energetically preferred over any linear combination of them.Thus, for H M F , the computational basis states remain the proper zeroth order eigenstates with a perturbative transverse field.
After establishing the zeroth order eigenstates and eigenenergies (|k (0) ⟩ and E k , respectively) of conventional Hamiltonians such as Eq. 12, perturbation theory proceeds by expanding |k⟩ and E in λ in the stationary Schrödinger equation and equating orders of λ: Following this procedure for H and carefully treating the degeneracy, the first-order energy correction E k vanishes and the first-order eigenstate correction takes the form: where D k represents the degenerate subspace that |k (0) ⟩ occupies.
To derive the analogous correction to H M F , we employ a modified approach to perturbation theory that can accommodate the nonlinearity of the stationary Schrödinger equation.In particular, the expanded form of |k M F ⟩ is explicitly inserted into the statedependant terms of H M F prior to equating orders of λ to compute the corrections.Following this procedure, the first order energy correction E k,M F again vanishes, and the first order eigenstate correction takes on an identical form to that of H: There is one crucial difference between Eq. ( 21) and Eq. ( 22): the zeroth order eigenstates |m (0) ⟩ and |m x * ⟩ to be F = 0.5 rather than perfect unity (see Appendix C.3).Nonetheless, half overlap provides significant information about the true ground state for pre-training.In Fig. 9, we examine the mean performance of the pre-training scheme as a function of the transverse field strength h.The case of neglecting mean-field terms during pre-training is also considered to highlight the vital role these corrections play at small values of h.When h = 0, the model is classical, and the global fragmented Hamiltonian with meanfield corrections shares the same ground state and ground state energy of the full model, as discussed in 5.2.2.This leads to remarkable pre-training performance, even as N is increased.The fragment initialization that neglects mean-field terms is not guaranteed to share the ground state of the full model -in fact, the two are likely to be orthogonal.The pre-training will still feature low entanglement, which likely explains why the mean-field free initialization scheme outperforms random initialization, but overall, the average error exceeds that of the mean-field corrected case by orders of magnitude, particularly as N is increased.When a small transverse field is added to the model, the half overlap between the fragmented and full Hamiltonian ground states provided by including mean-field corrections leads to error orders of magnitude smaller than the other approaches.Only as h is further increased -entering the regime where first-order perturbation theory is inadequate to describe the ground state -do the approaches begin to perform comparably, with increasing error and required iterations.

Conclusion
We have presented two near-term approaches to the distributed Hamiltonian evolution of a quantum system and a pre-training technique for variational quantum circuits.Our time evolution schemes are built upon the idea that the relative importance of interactions spanning a sub-system and its environment can be ascertained using the principle of minimizing the short-time evolution error, which is derived to be proportional to the quantum variance of the difference between the full and fragmented Hamiltonians.The first scheme employs only classical information transfer in the form of mean-field measurements to update mean-field corrections, as well as a limited number of auxiliary qubits anchored to each fragment, enabling limited entanglement growth.Although our method is lossy, metrics local to the system qubits addressed by a single fragment can closely mimic the true values from exact evolution, including the gold standard comparison of state fidelity.Moreover, this scheme is flexible, as it is amenable to any mixture of classical and quantum hardware and can process the fragments in series or parallel.In our second scheme, the information stored by qubits designated to be auxiliaries is physically shared between fragments, either through qubit shuttling or quantum teleportation.This approach is appropriate when quantum hardware is available and limited quantum communication is feasible.If desired, the choice of which qubits act as auxiliaries can be updated from one time step to the next, as dictated by the minimum error rule, to extend the performance of the approximate scheme.
Finally, we examine how our fragmented simulation scheme can be modified to apply to quantum circuits.Here, a single circuit is fractured into several smaller overlapping circuits, which are more manageable (requiring lower connectivity and less prone to suffer from noise and barren plateaus) and, if sufficiently small, even classically treatable.We devise a scheme that employs fragmented circuits to pre-train the parameters of the full PQC.Crucially, the use of overlapping registers coupled with the mean-field corrective terms in the loss function links the optimization of the individual circuits.The inclusion of mean-field corrections shifts the solution of the collective circuit optimization to have a large overlap with the solution of the full problem.We demonstrate that the pre-training scheme reduces the final percent error by orders of magnitude as well as the number of iterations required when compared to randomly initialized full circuit optimizations of VQE.Although the scheme's performance is particularly strong for classical Ising Hamiltonians, we develop a non-linear perturbation theory to analytically show that the mean-field terms included in optimization act as firstorder perturbation corrections when a small transverse field added, extending the success of the scheme into the quantum realm.
This manuscript motivates and facilitates numerous future research directions.Although we emphasized limited quantum information transfer, subsequent studies might explore how the number of auxiliary qubits and the frequency of re-encoding affect distributed simulation, or devise the details of physically implementing a limited quantum channel.Likewise, higher-order moments beyond mean-field terms may be explored as higher-order corrections.Moreover, rather than using auxiliary qubits to target specific environment qubits, a method of mapping salient environment states to auxiliary qubits (as employed by some classical fragmentation methods such as DMET [57]) may further improve the method, although the measurement-efficiency of such a technique may prove challenging in a quantum setting.Regarding fragment pre-training for variational algorithms, future works might develop efficient circuit fragmentations that are tailored to specific problem Hamiltonians and/or symmetries, rather than our more general, batched approach.Finally, alternative partitioning schemes might be considered to enable pre-training of non-brickwork circuits.
This work represents a pivotal stepping stone on the path to large-scale distributed quantum computing.In the near term, our distributed computing method with classical channels can be implemented by a single small simulator in sequence, or by a collection of small simulators that are either quantum or classical in nature.This permits the simulation of large system quantum dynamics without the noise and connectivity concerns of a large-scale quantum device [8], allowing experimentalists to address challenging problems in quantum chemistry and condensed matter physics [94,95].As non-local operations on quantum hardware improve, our proposal for limited quantum information transfer can be implemented, enabling cross-simulator measurements and higher accuracy.Lastly, our fragmented pretraining method can reduce the error of large-scale variational quantum algorithms by orders of magnitude while reducing the number of training epochs.Such improvements are vital to this field, which seeks to address problems ranging from drug discovery to NP-hard optimization on quantum hardware despite persistent training difficulties [47,77,[96][97][98].

Acknowledgements
This work was done during A.M.G.'s internship at NVIDIA.A.M.G. acknowledges support from the National Science Foundation (NSF) through the Graduate Research Fellowships Program, as well as support through the Theodore H. Ashford Fellowships in the Sciences.At CalTech, A.A. is supported in part by the Bren-endowed chair.S.F.Y. thanks the AFOSR and the NSF (through the CUA PFC and QSense QLCI) for funding.

Appendix A. Fragmentation for Time Evolution Appendix A.1. A Toy Model
For added clarification, here we consider a toy model of N = 6 qubits fragmented into two groups such that N f = 3 (see Fig. A1, a more explicitly labeled version of Fig. 1b).One auxiliary qubit is included for each fragment (N a = 1).The auxiliary qubit of Fragment 1, a 1 , targets qubit 5 in Fragment 2. Similarly, the auxiliary qubit of Fragment 2, a 2 , targets qubit 1 in Fragment 1.In the figure, the interactions mediated by the auxiliary qubits are explicitly labeled to illustrate how each auxiliary qubit plays the role of one qubit from the environment.Which environment qubit is selected for this role will depend on the quantum variance var(H −H (f ) I ) (see Section 2.2).The mean-field corrections applied to a 1 are included in the figure to account for the missing interactions with qubits 4 and 6 that the target qubit (qubit 5) would participate in if all qubits were present.Although only one such arrow appears in the figure, these corrections exist for each qubit participating in fewer interactions than it would in the full system, regardless of whether the qubit is categorized as a fragment qubit or an auxiliary targeting an environment qubit.β acting on qubits i and j, weighted by J α,β ij .The interactions highlighted in red are omitted due to fragmentation; these form the "interface" of the two fragments.An auxiliary qubit a interacts with the qubits within a fragment, according to the prescribed interactions of the circled target qubit in the opposite fragment.While, for visual clarity, this diagram only displays the mean-field corrections applied to auxiliary a 1 , in reality, such corrections are applied to each qubit that participates in one or more interactions beyond the fragment boundary.
we implement the union of connectivity matrices ∪ f J αβ,(f ) ij for each αβ interaction type, defined by: Any interactions still zeroed in the union ∪ f J αβ,(f ) ij are approximately included via mean-field corrections.If the auxiliary encoding is updated, the union ∪ f J αβ,(f ) ij will change due to the change in selected auxiliaries {a f } of each fragment.
The results are plotted in light green in Fig. B1.The scattered data falls within a closed area, with V − V M F dropping below zero for concurrence 0 < C(|ψ⟩) < 1.The difference V − V M F is also bounded between −0.25 and 1, reaching its maximum positive value for product states (where C(|ψ⟩)).
To understand the features of a state that extremizes V − V M F , we probe the boundaries of the envelope of possible states by examining parameterized two-qubit states.The specific parameterized states discussed in the following paragraphs are meant to be representative of the form of the states that lie on the edge of the closed area in Fig. B1, but generally, many possible two-qubit states will produce identical values of (C(|ψ⟩), V − V M F ).
First, we consider a parameterized state |ψ(α)⟩ that smoothly approaches the Bell state When α = 0, the resulting state |11⟩ is a product state with zero concurrence.Furthermore, |11⟩ is computational basis state with V = V M F = 0.As α approaches 1/2, the concurrence increases, reaching a maximum value.The variance difference, on the other hand, decreases to take on negative values -see the left panel of Fig. B2.Thus, it is states with structured entanglement such as |ψ(α)⟩ which will not benefit from mean-field corrections.Plotting V − V M F versus the concurrence of |ψ(α)⟩ in Fig. B1, we see that this set of states lies on the lower boundary of the envelope of possible values.Secondly, we consider the two-qubit state produced by rotating qubit 1 of the state |00⟩ about x by an angle θ: |ψ(θ)⟩ := cos (θ/2)|00⟩ − i sin (θ/2)|10⟩. (B.5) The resulting state will always be a product state with zero concurrence; however, the variance difference will depend on θ.Sweeping across θ (see the middle panel of Fig. B2), we see that V − V M F grows as θ is increased.A state of this kind lies on the left boundary of the envelope (see Fig. B1).Finally, we consider a state that lies on the top boundary of the envelope, with positive values for both the variance difference and the concurrence.We parameterize this by interpolating between a product state with maximum V − V M F and a state with maximum concurrence while avoiding a Bell state, which is known to lie on the lower boundary: The right panel of Fig. B2 reveals that V − V M F and C(|ψ(β)⟩) remain non-negative as β is varied.Again, plotting V − V M F versus the concurrence of |ψ(β)⟩ in Fig. B1 reveals that such a state indeed lies at the top boundary of the envelope.
In the time evolution applications considered in the main text, all initial states are computational basis states.It is thus unlikely that the state evolves into one with large entanglement across the interface with the particular structure required to produce a negative value of V − V M F .This is why, on average, including mean-field corrections enhances the performance of the scheme.More generally, we expect that in practical situations where fragmentation is employed, it is unlikely that highly structured entanglement will form across a fragment interface through evolution.Therefore, it is reasonable to expect that mean-field corrections will still improve performance in general applications.The unperturbed Hamiltonian is classical, with computational basis states for eigenstates.It also possesses Z 2 symmetry, such that each eigenstate |x⟩ and its "flipped" version |x⟩ are degenerate, including the ground state |x * ⟩.It is necessary to find the correct linear combinations of the degenerate states to properly calculate the higher order corrections.Typically, this is accomplished by diagonalizing the perturbing Hamiltonian V in the subspace of degenerate eigenstates, which yields the proper zeroth order eigenstates as well as the first order energy corrections, {E k }.The particular V of this problem -a global transverse field -vanishes in the subspace of |x⟩, |x⟩, requiring the first order energy corrections {E (1) k } to vanish but providing no insight into the correct zeroth order eigenstates.
To calculate the first order eigenstate corrections, the correction is split into one within the degenerate space D k and one in the space outside D k , which we define as Dk .For the latter correction, we have: where P Dk is the projects out the degenerate subspace.Following [56], the correction within the degenerate subspace is given by: k − E The first order energy difference in the denominator of the first fraction of this expression is singular, because the degeneracy within D k has not been lifted by the first order energy correction.However, this apparent issue provides us with the correct zeroth order eigenstates required to cancel the singular denominator: {|k (0) ⟩} must be selected to diagonalize the object W := V P Dk (E k − H 0 − H I ) −1 P Dk V within the subspace of degenerate states, such that the quantity ⟨k ′(0) |W |k (0) ⟩ vanishes for all k ′ ̸ = k within D k .With this choice for {|k (0) ⟩}, the first order eigenstate correction within the degenerate space vanishes, and the full first order eigenstate correction takes the form: M F ⟩ and E (0) k,M F .However, the modified Schrödinger equation written above is no longer an eigenvalue problem; the dependence on |k M F ⟩ is non-linear.In the spirit of conventional perturbation theory, we will proceed in the usual manner, taking special care to include the non-linearity.Expanding |k M F ⟩ and E k,M F in orders of λ: Figure C1: Ising-like model of N = 6 with all-to-all J ij sampled from a Gaussian distribution centered at zero with width 1.0.Sweeping across transverse field strength h, the ground state |ψ g ⟩ is computed using exact diagonalization of H. Fidelity is plotted between |ψ g ⟩ and the minimum energy state of H (f ) (light green), the minimum energy state of H (f ) M F (dark teal), ground state first order perturbation theory for H (black), and ground state first order perturbation theory for H (f ) M F (red). at most N f = 3 and two auxiliary registers, producing fragmented circuits of size N f +a = 5 or smaller, which can comfortably be optimized using classical resources.A number T = 10 of such partitioned circuits are generated for the same problem Hamiltonian and optimized in parallel according to Algorithm 1.After their initial optimization, the loss associated with each of the T sets of pre-trained parameters is estimated for the full circuit.This requires T additional loss measurements, a modest overhead when the value T is kept small relative to the required number of iterations.The set of pre-trained parameters producing the smallest loss is selected as the initial starting point for the full circuit optimization.
It is worth mentioning that for large problem Hamiltonians treated using quantum resources, the batched optimization of fragmented circuits can be done classically in parallel prior to using the quantum hardware.Only after pre-training would it be necessary to use the quantum hardware in order to estimate the loss of the pre-trained parameters.The role of batch size T in optimization success is explored numerically in Appendix D.1.

Figure 1 :
Figure 1: 1a: Diagram illustrating how a 6-qubit system can be split into two fragments.Interactions J αβ ij are represented by lines between qubits; one label is included for clarity.Interactions that span the two fragments form the interface I. 1b: The case where the two fragments are linked via a classical channel.Mean-field measurements ⟨ Ŝ(j)β ⟩ are exchanged classically.One auxiliary qubit a 1,2 is included in each fragment's simulation, interacting with the fragment qubits according to a target qubit in the opposite fragment (identified in the figure by a blue / green circle).1c: The case where the two fragments are linked via a quantum channel.While mean-field measurements ⟨ Ŝ(j) β ⟩ are still exchanged classically, the auxiliary qubits are physically shared between fragments using some form of quantum communication.

Figure 4 :
Figure 4: Comparison between scheme involving only classical information transfer (dark teal) to that involving limited quantum and classical information transfer (light green).For reference, the independent case (no information transfer) is included in red.The results are averaged over 100 Ising-like Hamiltonians with constant h = 1.0 and randomly generated graphs J ij (see Section 3.2).The N qubits are split into groups such that N f = 3, with an additional N a = 2 auxiliary qubits employed in the simulation.

Figure 5 :
Figure5: The effect of auxiliary selection on simulation performance.The curves are the averaged results for 100 different transverse field Ising-like Hamiltonians with h = 1.0 and the randomly generated graphs J ij described in Section 3.2).Here, N = 12 with N f = 6 and N a = 2.The auxiliary target choices are ranked according to the size of v(a), such that the two environment qubits with the largest v(a) are used in simulation v 0 , the two with the smallest v(a) are used in simulation v 2 , and the remaining two auxiliary choices are used in simulation v 1 .Additionally, in red, we consider the case of randomly selecting two auxiliary target qubits with no variance calculation.In the left panel (classical communication) and center panel (quantum communication), the selection is made after one time step, and the choice remains fixed throughout evolution.In the right panel (quantum communication), the selection is re-evaluated at each time step.

Figure 8 :
Figure 8: Comparison between fragment pre-trained VQE and vanilla VQE for 500 different J ij matrices (graphs).The full PQC is split into fragments with N a = 2 and at most N f = 3 during pre-training.A total of T = 10 different partitionings are considered, and the best pre-trained solution is used to initialize the final optimization.The final percent error ϵ is provided in the top panel, while the required number of iterations N iter to reach convergence is provided in the bottom panel.For the fragment initialized case, these metrics refer to the full circuit training that occurs after pre-training.Fragment pre-training reduces the geometric mean of ϵ by orders of magnitude, even as the system size increases.Likewise, the mean number of required iterations is reduced by nearly an order of magnitude.

Figure 9 :
Figure 9: Comparison between fragment pre-trained VQE (with and without mean-field corrections) and vanilla VQE performance as a function of the transverse field strength h.Each point is the mean (geometric or arithmetic) performance of 500 different J ij matrices (graphs).The full PQC is split into fragments with N a = 2 and at most N f = 3 during pretraining.A total of T = 10 different partitionings are considered, and the best pre-trained solution is used to initialize the final optimization.The mean final percent error ϵ is provided in the top panel, while the mean required number of iterations N iter to reach convergence is provided in the bottom panel.

1 Figure
Figure A1: A schematic illustrating how a system of N = 6 qubits can be split into two fragments.Lines connecting the qubits (illustrated as atoms) indicate general spin interactions Ŝ(i) α Ŝ(j)β acting on qubits i and j, weighted by J α,β ij .The interactions highlighted in red are omitted due to fragmentation; these form the "interface" of the two fragments.An auxiliary qubit a interacts with the qubits within a fragment, according to the prescribed interactions of the circled target qubit in the opposite fragment.While, for visual clarity, this diagram only displays the mean-field corrections applied to auxiliary a 1 , in reality, such corrections are applied to each qubit that participates in one or more interactions beyond the fragment boundary.

Figure B2 :
Figure B2: Variance difference V −V M F (left axis) and concurrence C(|ψ⟩) (right axis) plotted versus a state parameter for specific parameterized two-qubit states, specified by the subplot title.The parameterized states are defined in Eqs.(B.4) -(B.6).

|k ( 1
that ⟨x|W |x⟩ = ⟨x|W |x⟩ = ⟨x|W |x⟩ = ⟨x|W |x⟩ using the Z 2 symmetry of the unperturbed Hamiltonian.Thus, the correct zeroth order eigenstates are the symmetric and antisymmetric superpositions of |x⟩ and |x⟩, given by|± x ⟩ = (1/ √ 2)(|x⟩ ± |x⟩).Appendix C.2. Perturbing the Mean-Field Corrected HamiltonianThroughout this derivation, we will take advantage of the particular form of H I,M F (|ψ⟩) to make simplifications.Consider a modified Schrödinger equation that takes into account the state-dependence of H M F (|ψ⟩):(H 0 + H I,M F (|k M F ⟩) + λV )|k M F ⟩ = E k,M F |k M F ⟩ (C.11)In conventional perturbation theory, the eigenstates and eigenvalues (|k M F ⟩ and E k,M F , respectively) are expanded about their unperturbed counterparts, |k