Coupling of nitrogen vacancy centers in nanodiamonds by means of phonons

Realising controlled quantum dynamics via the magnetic interactions between colour centers in diamond remains a challenge despite recent demonstrations for nanometer separated pairs. Here we propose to use the intrinsic acoustical phonons in diamond as a data bus for accomplishing this task. We show that for nanodiamonds the electron-phonon coupling can take significant values that together with mode frequencies in the THz range, can serve as a resource for conditional gate operations. Based on these results we analyze how to use this phonon-induced interaction for constructing quantum gates among the electron-spin triplet ground states, introducing the phonon dependence via Raman transitions. Combined with decoupling pulses this offers the possibility for creating entangled states within nanodiamonds on the scale of several tens of nanometers, a promising prerequisite for quantum sensing applications.


Introduction
Tremendous progress in understanding and manipulating the nitrogen vacancy (NV) center in diamond throughout the last decade revealed its promising capabilities for quantum information and sensing applications. The basic foundations of coherent manipulation, high fidelity polarization and optical readout [1] paved the way for using the NV center as a fully controllable quantum bit capable of operating at room temperature with extraordinary long coherence times that may reach the millisecond range [2]. Whereas the coherent coupling and entanglement to nuclear spins of nitrogen [3,4] and carbon-13 [5,6] has been demonstrated in numerous experiments, bringing different nitrogen vacancy centers to interaction remains challenging and has been demonstrated only recently [7,8]. One approach for coupling distinct NV-centers makes use of their dipolar interactions [7,9,10], which is limited by the strong distance dependence of the coupling and therefore has been demonstrated only for very closely separated pairs. Another method consists of interconnecting the NV center solid state spin qubits with photons [8], that has lead to extensive research in the design of cavities and photon couplings [11,12,13,14]. In contrast to that, the coupling to phonons is much less studied. Whereas this mechanism serves as the prominent data bus for conditional quantum operations in the trapped ion approach to quantum computing [15] and has been proposed to allow even for a strong coupling regime in phonon cavity structures in silicon [16], intrinsic phonon coupling is assumed to be inaccessible in macroscopic diamonds at room temperature. However, the coupling to magnetized nanomechanical oscillators as AFM cantilevers was successfully performed, allowing for the sensing of the vibrational mode [17,18] and even for the coherent manipulation of the NV center electron spin state [19], that might provide the basic ingredient for future phonon mediated quantum networks [20]. Here we will show that a significant intrinsic phonon coupling can be expected in nanodiamonds at low temperatures. Those can be fabricated down to 4 nm in size, additionally being capable of hosting NV centers [21]. We study the coupling strength to long wavelength (low frequency) acoustical modes and analyze the possibility to exploit these global modes for entanglement operations by creating a Raman-induced phonon coupling within the ground state electron triplet states of the NV center.

Coupling of the nitrogen vacancy center to phonons
The localization of the NV-center electronic states to the vacancy defect itself permits their description as superpositions of molecular orbitals, each associated with a dangling bond orbital of the atoms involved in the defect center [22,23]. That is, according to the C 3v symmetry of the defect center, four electronic states M el = {|a 1 , |a 2 , |e x , |e y } can be constructed out of the dangling bond orbitals M db = {|σ 1 C , |σ 2 C , |σ 3 C , |σ N } by linear combinations, with the index C referring to carbon and N to the nitrogen related bonds (see figure 1 (a)). , |e x , |e y , occupied with the six electrons (or equivalently two holes) of the NV center. (iii) Combining the electronic and spin wavefunction leads to the NV center energy level structure with spin singlets and triplets. Ground states correspond to two holes in the |e x/y state as illustrated in (ii) and excited states to one hole in |a 2 and |e x/y . Strain and therefore phonons cause displacements of the dangling orbitals (i), and consequently modify the energy levels in (ii) and (iii). The influence of the strain δ x = −ζ xx on the excited state fine structure is illustrated in the inset. (b) Phonon coupling coefficient η (6) and frequency ν of the lowest acoustical mode vs size (diameter) of a spherical nanodiamond.
In this framework the well known ground and excited state energy level structure of the NV − -center follows by associating the six electrons involved with the available electronic states M el , taking the additional spin properties into account. Equivalently, and more simply, this can be described by two holes relative to a complete filling [22]. Calculating the coupling to phonons can be decomposed into two steps by noting that the long wavelength acoustical phonons considered here will introduce a periodic strain to the NV center. Thus in a first step we will discuss the effect of strain to the energy level structure and in a second step link the vibrational phonon mode to the strain property. For the former case we will follow the discussion presented in [22] (see also a similar discussion in [23]). The electron-nuclei Coulomb coupling can be described by an interaction of the form with i and j describing a summation over all possible M db configurations and r ij = r i − r j with r i/j the corresponding atom position vectors. Noting that the coupling coefficient depends on the relative distance of the atoms involved, it is obvious that stress related displacements u i will influence the energy level structure and for small displacements, as expected by phonon effects, the strain perturbation on the level of dangling bonds can be described by with δh ij = ∇ r h ij δ u = ∇ r h ij e r the first order correction following from an expansion of h ij . Herein the tensor e µν = ∂u µ /(∂r ν ) is related to the strain tensor by µν = 1/2(e µν + e νµ ). Considering a specific ground state to excited state triplet transition this results in the following Hamiltonian for the NV-center energy levels: wherein δ 1 = −ζ( xx + yy ) and δ 4 = ζ8 β 2 e zz with [22] ζ 610 T Hz and β accounting for the difference of the nitrogen coupling compared to the carbon related ones. |g can be any of the triplet ground states, that exhibit an equal shift under the influence of strain, whereas for the excited states we find |e ξ = ±ζ 4 2 xy + [ xx − yy ] 2 for the states |E x and |E y , respectively, and ξ = 0 otherwise. For more details on the derivation of (3) we refer to Appendix A.
Vibrational modes will introduce a time-dependent periodic strain to the system, that exhibits a simple description for long wavelength acoustical modes with wavevectors near the Brillouin-zone. In that case, and assuming periodic boundary conditions, the discrete Bloch-type wavefunction, describing the displacement of the crystal lattice positions, can be described by a continuous displacement field in space u( r) [24] replacing the discrete lattice positions n by r n → r and the displacement for a specific mode α is thus obtained by [25] with k the wavevector, e (α) the mode eigenvector, ν(k) the angular frequency and a α , a † α the phonon creators and annihilators of the mode, respectively, and M the mass of the nanodiamond. For the linear acoustical modes near the zone center ν(k) = c · k with c = 1.2 · 10 4 m/s the speed of sound in diamond and k = 2 π/l out of applying periodic boundary conditions, with l the diamond length in the corresponding mode direction. Combining equations (3) and (4) allows for the calculation of the energy shift associated with the phonon coupling and exhibits the typical form of a deformation potential coupling describing local lattice compression and dilation as expected for the long wavelength acoustical phonon case. As an illustration let us explicitly give the electron-phonon coupling energy shift Hamiltonian for the specific case of a mode α with e (α) k pointing in x-direction and choosing |e = |A 2 as the excited state, favourable in the sense that it is not coupled to the singlet state via spin-orbit coupling [22,26]. In that case the electron-phonon coupling takes the simple form with the coupling coefficient that takes the role of the well-known Lamb-Dicke parameter. Note that the periodic boundary condition treatment is strictly valid only for the case of an infinite crystal; for a finite nanodiamond crystal, shape-dependent confinement effects alter the boundary conditions, therefore leading to a modification of the specific phonon mode spectrum. We will discuss the NV-center phonon coupling in that case for the example of a free elastic sphere in Appendix B. Although a change in the exact microscopic behaviour is observed in that case, the coupling properties of the long-wavelength radial breathing mode agree fairly well with the periodic boundary treatment. Moreover the scaling properties remain unchanged for both descriptions.
Adding a laser coupling with Rabi frequency Ω and frequency ω L to couple one of the ground states to the excited state |e and applying the canonical Schrieffer-Wolff transformation U = exp(iη (a † + a)|e e|) beside restricting the discussion to a single mode, leads to the following Hamiltonian in the rotating wave approximation [27,28] (a discussion on how to include the mode relaxation and the finite excited state lifetime can be found in Appendix F) Herein σ z refers to the Pauli operator of the corresponding transition andω 0 = ω 0 + η 2 ν with ω 0 the transition frequency and the second part following from the Schrieffer-Wolff transformation. Note also that the phonon coupling mechanism is assumed to be completely originated from the electron-phonon coupling Hamiltonian of the corresponding transition, whereas the photon recoil based coupling mechanism has been neglected as it is orders of magnitude lower in macroscopic systems [27]. The coupling parameter η (6) for different nanodiamond sizes is shown in figure 1 (b) taking typical values of the order of 10 −2 − 10 −3 and decreasing with the nanodiamond radius ∝ R −1 for a spherical diamond of radius R. For general diamond shapes η ∝ l/V with l the length in the mode direction and V the diamond volume, leading e.g. to a η ∝ 1/ √ R scaling for a two-dimensional structure of area ∝ R 2 . The mode frequencies ν for the lowest energy mode, for a spherical diamond also representing the frequency difference among neighbouring modes, is shown in figure 1 (b) taking typical values in the THzrange. These high mode frequencies are advantageous in the sense that neighbouring modes are well separated and additionally thermal occupation probabilities are low. The decreasing magnitude of both the coupling strength and the mode frequency limits the diamond size to several tens of nanometers. Hamiltonian (7) can be expanded in orders of the small coupling parameter η with the zeroth phonon independent order providing the carrier and the first order the blue (∝ a † |e g|) and red sideband (∝ a|e g|) transitions. These phonon sidebands should be experimentally observable in low temperature emission spectra [29]; however no such experimental investigations with nanodiamonds < 5 nm have been performed to date.  Herein Ω 2 = κ 1 ν, Ω 1 = η Ω 2 and κ 2 defined as in the text. Continuous, dashed and dashed dotted lines correspond to κ 2 = 0.1, 0.05, 0.35, whereas blue, red and green corresponds to κ 1 = 0.01, 0.05, 0.1, respectively. Horizontal lines (with triangle endpoints) denote the corresponding Γ eff value that imposes a limit on the maximal nanodiamond size shown for κ 2 = 0.05 and κ 2 = 0.1 by the vertical yellow lines. The area for which the ratio Ω gate > Γ eff is coloured with decreasing saturation for increasing κ 2 . Note that this ratio is independent of the choice of κ 1 for a given nanodiamond size.

Phonon mediated gate interaction
The phonon dependent sideband transitions offer the possibility of correlating the NV center state with the global vibrational one. This can be used as a source for gate operations [30,31,32] between different NV centers within the same nanodiamond. In one of its simplest versions a Mølmer-Sørensen type gate [30,33] can be directly implemented on the ground-excited state transition, that has the advantage of being insensitive to the actual phonon state, therefore allowing gate operations even for thermal initial phonon states and circumventing the need for vibrational mode cooling. However such a concept suffers from the relatively short lifetime of the excited state limited by both optical decay to the ground state as well as non-radiative spin-orbit relaxations to singlet states. Moreover, in order to avoid off-resonant excitations to several excited states, the driving field strength has to be limited to values smaller than the typical energy gaps ∼ 4 GHz, therefore limiting the maximal gate speed.
Here we provide an alternative concept that allows to perform the gate within the triplet ground state manifold. A Raman transition via the excited state provides the required phonon coupling, that otherwise would not exist according to the absence of a difference in the electron-phonon coupling among the ground state triplet states in (3).
For this setup to work, a Λ-transition between ground and excited states is required, that for the NV center exists between the |g +1 , |g −1 electron-spin triplet ground states and either of the excited states |A 1 , |A 2 , |E 1 , |E 2 , the latter forming equal superpositions of the m s = ±1 electron spin projections [22]. These Λ-transitions have already been successfully implemented and analyzed in experiments [34,26], being accessible in the low temperature (< 10 K) and strain limit. Using circular polarized light allows to drive spin-selective transitions between those states that might be advantageous for tuning the detunings and couplings. Additionally the Raman-transition within the Λscheme has to be carefully tuned to a single sideband, that is either the red or blue one, as otherwise the phonon dependence of the effective ground state transition is cancelled by interference of those two paths. For this task the high phonon frequencies of nanodiamonds are advantageous as they allow the single sideband addressing without stringent conditions on the Rabi-frequency of the coupling field. The gate interaction itself follows from a two-step process: First, the Raman transition provides a phonon dependent coupling within the ground state manifold. Second, this phonon dependent coupling can be used to implement a gate between two NV-centers by off-resonantly exciting the phonon state similar to the direct Mølmer-Sørensen approach. That way the excited state relaxation is suppressed by the off-resonance of the Raman transition, leading to an improved ratio between gate and relaxation time compared to a direct gate implementation on the ground-excited state transition as will be discussed below. Moreover, contributions of different excited states simply add up, therefore allowing for a straightforward integration of this effect into the formalism and not leading to an excitation of several states as might happen in the direct implementation.

Setup and first effective form
The setup for this gate is illustrated in figure 2 (a) where the dashed transitions are present only in the double-path (dp) setup, that leads to a complete σ x ⊗ σ x -type coupling in the ground state manifold, whereas they are absent in the single-path (sp) one resulting in a gate in the reduced manifold The corresponding Hamiltonian in a frame rotating with the laser frequency for NV center k can be written as with Ω 1 η k Ω 2 . Herein the first and last contribution form the phonon-dependent Raman transition consisting of a carrier and blue sideband excitation, respectively, provided that 1 Ω 1 and 2 Ω 2 (the same can be achieved with a red sideband interaction as well). The second contribution describes the unavoidable carrier excitation associated with the sideband term and is not required for the gate interaction itself, however cannot be neglected either. Higher order terms in η k have been omitted. For the double-path setup H dp k = H sp k + H sp k | |g +1 ↔|g −1 with the second contribution corresponding to the first one by replacing |g +1 by |g −1 and vice versa, i.e. both of the couplings Ω 1 and Ω 2 are present simultaneously on both transitions. Describing the offresonance ratio between driving fields and detuning by κ 1 1 (κ 1 = Ω 1 / 1 ηΩ 2 / 2 ) and noting that k ν, the optimal choice of parameters is given by Ω 2 ∼ κ 1 ν and Eliminating the off-resonant excited state results in the effective Hamiltonians H I,dp eff,k = δ dp and δ dp = δ µ=2 and δ sp = 1/2 δ µ=1 with k Ω 2 2 / 2 where the last term accounts for the coupling induced shift of the mode frequency with χ = 1/4 for the double and χ = 1/8 for the single-path configuration. The Pauli operators σ x , σ z and σ + are defined in the ground state manifold {|g +1 , |g −1 } andn denotes the phonon number operator. The terms involving phonon excitations correspond to paths involving both the sideband and carrier transition whereas noncombined paths lead to single transitions or AC Stark shifts not associated with an effective phonon (de-) excitation, respectively. Note that the last contribution in δ µ originates from the Ω 2 carrier interaction and can be significantly larger by a factor of 1/η than the preceding terms. Interestingly, taking into account the coupling to several excited states, contributions arising from |A 1 and |A 2 (as well as |E 1 and |E 2 ), both equal m s = +1 and m s = −1 superpositions, have opposite signs and therefore subtract. This limits the maximal amplitude of the uncorrelated flip contributions arising from carrier transitions to κ 2 1 ∆ with ∆ ∼ 4 GHz the excited states energy splitting, that can be considerably lower than expected from the coupling to a single excited state.

Second effective form and gate Hamiltonian
In a second stage we consider Hamiltonian (9a, 9b) for two NV centers (k=1,2) and choose ∆ = 1/κ 2Ω with κ 2 1 denoting the off-resonance of the corresponding transition. That way the phonon transition is only virtually excited which allows to obtain a second effective form corresponding to the final gate Hamiltonian. Herein the effective gate frequency is defined as Ω gate =Ω 1Ω2 /∆ ,δ dp k = δ dp For the double-path scheme this corresponds to a σ x ⊗ σ x -type gate rotating the states within the two-qubit manifolds . Additionally there exist uncorrelated single qubit flips between the ground state levels withδ dp k ∼ κ 1 ηΩ 2 that are by an order 1/κ 2 larger than the gate term itself (see figure 2 (b)). However both terms commute what allows to remove the single qubit flips by a simultaneous echo-π-pulse in σ z or σ y on both NV centers, leaving the gate interaction unchanged but adding a negative sign to the uncorrelated single-flip contributions. Interestingly, choosing σ y for that task offers the possibility to decouple the ground state system from decoherence processes as well, therefore extending the coherence time significantly. To conclude, a pure gate interaction can be achieved by adding any periodic pulsed decoupling sequence in σ y acting on an inter-pulse timescale larger than the one required for the effective Hamiltonian form to be valid, i.e. ∆t (∆ ) −1 ∼ (ηΩ 2 ) −1 . For a two-qubit π/2rotation this gate interaction including the echo-refocusing is illustrated in figure 3 (a). The single-path scheme behaves in a similar way, with the gate interaction restricted to a rotation in the M 1 manifold, more challenging with respect to the initial state initialization on the nanometer scale of adjacent NV centers that requires individual addressing. For the case of equal σ z -contributions, i.e. identical configurations on both of the NV centers, the gate and single qubit contributions commute again and the latter can be removed by an echo pulse and combined with decoupling sequences in σ x and σ y , analogue to the two-path situation. Uncorrelated transitions do not occur as they do not form sideband independent paths. Note that dependent on the parameters, the AC Stark shift contributions might be an obstacle in adjusting the off-resonance for the second effective gate Hamiltonian form. Therefore, choosing smaller Rabi frequencies (Ω 2 ∼ η κν) might be advantageous to suppress the predominant influence of the carrier term.

Comparison to a direct gate implementation
At this point it is interesting to compare the relevant timescales of this Raman-induced scheme to a direct gate implementation on the ground-excited state transition. For the direct gate implementation the conditional gate operation is directly implemented between one of the ground and the dissipative excited state, i.e. the Raman transition step is omitted. In the Raman-induced implementation the gate frequency ∝ κ 1 κ 2 (ηΩ 2 ) and has to be compared to the effective decay rate Γ eff = κ 2 1 Γ with Γ 15 M Hz the excited state decay rate [34] suppressed by the probability κ 2 1 of actually populating the excited level. Ω 2 is limited by the off-resonance to the carrier transition as described above. The corresponding ratio follows as Ω gate /Γ eff ∼ (κ 2 /κ 1 ) (ηΩ 2 /Γ). In contrast to that a direct gate implementation leads to a gate frequency Ω e↔g gate ∝ κ 1 η Ω with the same Ω limitations, that in this setup has to be related to the bare decay rate Γ, therefore Ω e↔g gate /Γ = κ 1 ηΩ/Γ. Note that this is by a factor κ 1 worse than for the Raman-induced gate scheme.

Time-conditioned gate
To improve this ratio larger κ 2 values are advantageous, corresponding to smaller offresonances with respect to the intermediate sideband states. Interestingly the condition κ 2 1 can be significantly relaxed for the double-path setup, noting that in this case the time evolution following from Hamiltonians of the form (9a) can be exactly integrated [33] as will be shown in Appendix C. An important prerequisite in that regime consists of compensating the phonon number dependent terms appearing in (9a) (the 'η 2 -terms') to ensure the commutativity of the gate relevant term (theΩ k -term) with the contributions that do not involve an effective phonon (de-) excitation (the δ k -contribution). Such a compensation can be achieved by the green compensation couplings illustrated in figure 2 (a) and also leads to significant improvements for the gate in the perturbative regime (10a), (10b) in cases when it is implemented not deep within the κ 2 1 limit. Replacing the condition ∆ 1 by the gate time condition t gate = m 2π/∆ (m ∈ N), the resulting evolution can still be described by Hamiltonian (10a), even this does not hold for intermediate time-steps. That way κ 2 = θ/(2π m) with θ the gate rotation angle taking the value κ 2 = 1/(2 √ 2) for creating a maximally entangled state (θ = π/2) with m = 2, providing that the phonon population refocuses before the intermediate echo pulse is applied (see figure 3 (b)). A similar gate that allows one to perform the gate in the non-perturbative regime despite maintaining its independence on the phonon state can be constructed out of the singlepath configuration by adding a continuous microwave driving within the ground state triplet manifold [31] and we will discuss that idea in Appendix D. Here it should be noted that the lifetime in nanocrystals is increased compared to the bulk counterpart due to the strong change in the refractive index and dielectric screening effects [21,35] with typical values [36,37] Γ ND 1/2 Γ = 7.5 M Hz. Therefore longer excited state lifetimes have to be expected what was not taken into account in figure 2 (c) as the exact magnitude is very sensitive to the substrate environment [35]. Provided that the Rabi frequencies can take the same values as in the bulk counterpart, a decay rate halved in value would double the ratio Ω gate /Γ ef f , therefore making the conditional gate operator less prone to spontaneous decay for a given diamond size. For a spherical diamond this would increase the maximal radius by a factor of √ 2.

Size limitation
The upper figures show the population of the states |g +1 , g +1 (blue), |g −1 , g −1 (red) and |g −1 , g +1 , |g +1 , g −1 (black). Dashed lines illustrate the interaction including the uncorrelated flips that are refocused by an echo pulse, whereas continuous lines focus on the pure gate interaction. The lower plots show the phonon population during the gate interaction that can be significant for the time conditioned (close-resonant) gate.

Influence of dipolar couplings
Up to now we neglected the influence of dipolar couplings, that is the optical dipolar coupling [38] on the ground-excited state transition as well as the magnetic equivalent [9,10] within the ground state manifold. Whereas this is a good approximation for specific configurations, e.g. an angle of 54.7 • of the axis connecting two NV centers to an equally oriented symmetry axis for which the dipolar couplings are exactly zero, for other configurations they can be of the same magnitude as the gate interaction itself. These couplings take values of j opt 2π · 52.4 M Hz (optical ground excited state coupling) contributing asj opt ∼ κ 2 1 j opt due to the excited state off-resonance, and j mag 2π · 104 kHz (magnetic ground state coupling) for an NV center distance of r = 10 nm. That is the effect of the dipolar interactions cannot be neglected in the general case. A detailed discussion about how to include the effect of dipolar couplings in the formalism is presented in Appendix E. As a result this requires to replace the detunings by k → k − j opt /2 and the gate frequency for the both-path setup by Ω bp gate → Ω bp gate − 2j opt − j mag /2 and Ω bp gate → Ω bp gate + j mag /2 for the M 1and M 2 -interaction, respectively, as well as Ω gate → Ω gate − 4j opt for the singlepath setup. Hereinj opt ∼ κ 2 1 j opt . Therefore, as long as j opt ν together with identical coupling configurations on both NV centers, both coupling mechanisms can be combined by taking into account the modified detuning configurations and adjusting k correspondingly.

Experimental implementation
In here we give a brief outline on how we believe the phonon coupling schemes could be realized in experiments. As mentioned earlier the scheme relies on the possibility of individually resolving single excited states, which requires to work in the low temperature regime (<10 K). A first step naturally consists in the characterization of the mode frequencies, that is, the measurement of an absorption or emission spectrum. As the typical THz phonon frequency range for nanodiamonds is well above the natural linewidth broadening (∼ 15 MHz), we expect a clear significance of phonon sidebands in the spectrum, that allows to determine the relevant frequencies along with a first estimation of the phonon coupling by fitting the data to an appropriate model (see e.g. [29]). Beside nanodiamonds on a substrate, recent experiments have demonstrated the levitation of nanodiamonds in optical dipole traps [39], which is promising in that the phonon mode spectrum can be expected to resemble more closely the one of a free particle as outlined in Appendix B. Initialization of the electron-spin state in the low temperature regime can be performed by optical pumping using resonant excitation techniques [40]. Alternatively, the Lambda scheme, which is also used for the Raman transition in the gate proposals of the previous sections, allows for the initial state preparation in a coherent population trapping configuration [26]. Readout can be carried out by resonant excitation (projective measurements) on a specific groundexcited state transition [40] after mapping the state from the {|+1 , |−1 } gate-manifold by means of a microwave π-pulse into {| ± 1 , |0 }, such as to obtain selective state specific transitions for the read-out process. Here we would like to point out that the individual read-out is challenging on the nanometer scale; however global fluorescence correlation measurements allow for a clear distinction between entangled and mixed states and while full quantum state tomography could be performed analyzing the different fluorescence levels obtained in global measurements [7], the measurement of a small number of observables sufficies to obtain very tight quantitative bounds on entanglement [41] and fidelities [42]. Nevertheless, by choosing a configuration of two NV-centers with a distinct symmetry axis orientation, provides, combined with a weak magnetic field, the possibility for individual microwave addressing within the spin-triplet ground state manifold. This also allows, beside the spectrum analysis, for a detailed analysis of the dipolar coupling using double electron-electron resonance techniques (DEER) [7]. Here it seems favorable to choose an axes configuration with a dipolar coupling as small as possible in order to make the phonon induced mechanism the dominant one. The gate scheme itself as described in the previous sections relies on purely global laser interactions that do not require individual addressability with the echo π-pulse implemented by either global microwave or carrier Raman laser couplings, respectively.

Summary
In summary we analyzed the coupling of nitrogen vacancy centers to long wavelength acoustical phonons, a mechanism capable of mediating gate interactions between NV centers for nanodiamond sizes of several tens of nanometers. Exploiting the existence of a Λ-scheme in the low temperature and strain limit, fully noise decoupled two qubit gates can be constructed within the ground state manifold, even in the presence of dipolar couplings. This might be interesting for the creation of entangled states but also for manipulating the phonon mode itself, that is the control of the motional degrees of freedom, e.g. the cooling of vibrational modes. Moreover, the realization of entangled states in nanodiamonds could have crucial application for future sensing protocols.
Author's note: While finalizing this manuscript we became aware of a similar investigation [43] studying phonon induced spin-spin interactions in diamond nanobeams.

Acknowledgements:
We thank M. Aspelmeyer for discussions that contributed to the conception of this project and A. Imamoglu for discussions at early stages of this project. This work is supported by the Alexander von Humboldt Foundation, the BMBF Verbundprojekt QuOReP (FK 01BQ1012), a GIF project and the EU Integrating Project SIQS and the EU STREP project EQuaM.
Appendix A. Calculation of the electron-phonon coupling [22,23] The effect of strain induced by phonons, i.e. the electron-phonon coupling, can be calculated by analyzing the effect of a change in the intra-atom distance on the Coulomb coupling interaction equation (1). Assuming that this displacement is small, a realistic assumption by restricting the analysis to the long wavelength acoustical modes or more precise to the one defined by k l = n 2 π with n = 1, for which the wavelength is given by the length scale of the diamond crystal, it suffices to expand the coefficients h ij to first order in the atom displacements u i out of the equilibrium positions r 0 i (i.e. r 0 i → r i = r 0 i + u i ). With the additional spherical symmetry assumption that the energy will just depend on the absolute value of the relative two atom displacement, that is where in the second line we introduced the displacement tensor e µν = ∂u µ /∂r ν and the derivation is evaluated at the equilibrium position of the atoms. Inserting the explicit expressions for the positions r 0 i as defined in figure 1 (a) and using the expressions of the electronic states M el in terms of the dangling bonds orbitals M db , the strain Hamiltonian (2) can be rewritten as δV M el = −2 ζ e xx |e x e x | − 2 ζ e yy |e y e y | − 8 β 2 ζ e zz |a 2 a 2 | − ζ (e xy + e yx ) (|e x e y | + h.c.) .
Herein we defined with the index C referring to the coupling for two carbon atoms and the difference for the carbon-nitrogen case is accounted for by the factor β. The quantity q denotes the next neighbour distance in the diamond lattice and is equal to q = 3/8 ( r 0 i − r 0 j ). Moreover couplings between |a 2 and |e x,y levels have been neglected, justified by the large energy separation. Those would correspond to strain induced transitions between the ground and excited states of the NV center and consequently do not play a significant role. Note also that the energy level |a 1 has been neglected as it is delocalized in the valence band and does not contribute to the properties of the NV center energy level structure.
Out of equation (A.3) the impact of strain on the NV center energy levels can be calculated by using the expression of the energy levels in terms of the electronic states M el as provided e.g. in [22]. For the 'two hole' description the strain perturbation Hamiltonian takes the form with 1 el the identity on M el and 1 spin the one on the spin degrees of freedom. Projecting Hamiltonian (A.5) on the NV center energy level states finally leads to H gs el−phon = 2 δ 1 (|g 0 g 0 | + |g +1 g +1 | + |g −1 g −1 |) (A. 6) for the ground state electron spin triplet states and for the excited state triplet states in the basis In the case of phonons the displacement can be expressed in terms of Bloch type wavefunctions, such that for a specific mode α, and noting that in the framework of classical elasticity theory applicable in the long wavelength limit the microscopic structure can be replaced by a continuous displacement field ( r i → r), the displacement takes the form with M the total mass of the system, e (α) the eigenvector of mode α, k the wavevector and ν(k) the angular frequency and a α , a † α the mode annihilator and creator operators, respectively. With the definition of e µν = ∂u µ /∂r ν and using that k r 1 this results in equation (4) in the main text, that together with the electron-phonon coupling Hamiltonians (A.6) and (A.7) complete the analysis of the phonon influence on the NV energy level structure.

Appendix B. Phonon-Coupling in the Elastic Sphere Model
In here we will reconsider the NV-phonon coupling (deformation potential coupling) for the special case of a sphere subject to stress-free boundary conditions. Confinement effects in such finite systems lead to a modification of the phonon modes compared to the periodic boundary condition analysis provided earlier in section 2. This modification depends on the shape and size, or more precise on the boundary conditions of the particle under consideration.
The acoustical vibrations u of a homogeneous, free spherical elastic body in the framework of continuous elasticity theory can be described by and has been first studied by Lamb [44]. Herein λ and µ are the Lame's constants that describe the material-dependent elastic properties and are related to the transverse and longitudinal speed of sound in diamond by v t = µ/ρ = 1.283 · 10 4 (m/s) and v l = λ + 2µ/ρ = 1.831 · 10 4 (m/s), respectively, with ρ = 3.512 g/cm 3 the mass density [45]. This continuous elastic body model has been successfully applied to describe the phonon properties of nanoparticles as validated by numerous experiments, and forms a good description as long a the particle size is not too small, that is, as long as the phonon wavelength is much larger than the interatomic distance to allow for the homogeneous continuum description.
The equation of motion (B.1) can be solved by introducing a scalar φ ∼ ψ lm (hr, Ω) and vector potential A ∼ r ψ lm (kr, Ω) =ê r r ψ lm (kr, Ω) with ψ lm (kr, Ω) = j l (kr) Y lm (Ω), wherein j l (kr) and Y lm (Ω) are the l-th order spherical Bessel function and the (l,m)spherical harmonics, respectively [46,47]. The corresponding displacement modes follow as derivatives of those potentials, with the scalar potential describing compressive (longitudinal) and the vector potential shear (transverse) waves, and can be classified into torsional u l,m,n tor and spheroidal modes u l,m,n sph (see figure B1 (b)). Herein the orbital quantum number l and its z-component m with |m| ≤ l characterize the mode's spherical symmetry whereas n refers to the specific overtone. Imposing stress-free boundary conditions results in an eigenvalue equation that allows to determine the k− and hvalues along with the mode frequencies.
The torsional modes are characterized by a purely transversal character without radial displacement, that is the sphere volume remains unchanged under these vibrations with N tor l,m,n describing a normalization constant. The eigenvalue equation in that case takes the form with the definition χ = k R and R being the sphere radius. Note that this equation is independent of the particle's elastic constants and is therefore of universal character. In contrast to that, spheroidal modes exhibit a mixed longitudinal and transversal character and are described by with the coefficients p l and q l following out of (χ = k R, ξ = h R)
The mode eigenvalues χ calculated that way for diamond are depicted in figure B1 (a) together with the corresponding frequency ν for a diamond of 10 nm in diameter, the latter following from the relation Note that these modes are degenerate in m for a perfect spherical symmetry. Out of (B.6) it follows that the frequency scales as the inverse of the radius, that is it exhibits the same scaling as the one obtained from the periodic boundary condition calculation. Moreover we show in figure B1 (c) that the magnitude for the lowest breathing mode (l=0) is in good agreement with the lowest mode obtained in the periodic approach, that will turn out to form a promising mode for the NV-center coupling. We will now proceed by calculating the coupling (deformation potential coupling) to these elastic sphere modes. In a second quantized form the displacement takes the form [48] u(r, Ω) = and H tor el−phon 0 for the torsional modes, with h l,n = (v t /v l ) (χ l,n /R). This is in accordance with the dominant contribution expected for the general case of a deformation potential coupling [48]. From (B.8) it follows that only the longitudinal displacement contributes significantly to the NV-center phonon-coupling mechanism and therefore the coupling to torsional modes can be neglected. It should be noted that H el−phon ∝ N sph l,m,n h l,n / √ ω lmn ∼ R −2 scales quadratically with the inverse particle radius and therefore the coupling factor η as defined via (5) scales as η ∼ R −1 , thus exhibiting the same scaling as already obtained in section 2. We verified that scaling behavior in figure B1 (c) for the breathing mode (l=m=0) and the lowest frequency mode (l=2,m=0,±1). The breathing mode coupling constant η matches fairly well the expectation calculated by using periodic boundary conditions. Here one should take into account as well the radial position dependence that is illustrated in more detail in figure B1 (d): Only the l = 0 modes have a non-vanishing coupling around the particle center (r=0) whereas modes with l = 0 exhibit an increasing region of vanishing coupling for increasing l around the particle center (for a fixed overtone number n). Combined with the fact that NV-center near the surface are less stable and additionally are more prone to decoherence, the l = 0 breathing modes can be considered as the promising modes to obtain phonon-coupling, in particular the low frequency n = 0 mode that shows the most uniform coupling achievable throughout the possible NV-center positions within the crystal. However one should keep in mind that the coupling to other modes is, dependent on the specific position of the NV center within the diamond, not necessarily weaker than the coupling to l = 0 and in figure B1 (c) this behaviour just arises from the fact that for the l = 0 mode the NV-center is assumed to be at the center (the position of maximal coupling) whereas for the l = 2 modes the reasonable assumption r = R/2 has been chosen, that does not correspond to the maximal coupling position which in fact would be given by r = R. As a general behaviour higher overtones, also known as inner modes, are accompanied with a decreasing phonon coupling layer around the surface (as can e.g. be seen in figure B1 (d)left and a similar behaviour would be observed for l = 0 modes); these modes depend only weakly on the specific boundary conditions.
In summary, the general scaling and magnitude of the phonon coupling obtained previously in section 2 by assuming periodic boundary conditions ('infinite crystal approximation') is in good agreement with the results obtained by assuming a confined finite spherical particle. This has indeed to be expected as the scaling properties arise from universal dimensionality arguments as e.g. the mode normalization factor, thus allowing a rather simple estimation of the coupling properties and strength even for different 'shapes' (dimensionality) in the periodic boundary model. However the exact microscopic spatial coupling, e.g. the spatial distribution of the coupling parameter, depends on the explicit mode properties for which the particle confinement and shape become crucial and in fact the low frequency modes (n=0 modes) are most sensitive to a change in these surface properties.

Appendix C. Exact integration of the gate Hamiltonian
The time evolution following out of Hamiltonian (9a) can be integrated exactly for commuting state operators (e.g. for the double-path gate setup), what allows to overcome the κ 2 1 limit by identifying appropriate time conditions. We will consider the generic form Using the properties of the displacement operator D(α) = exp(α a † − α * a), the time evolution of (C.1) follows as [33] Noting that the phonon dependence only appears in the displacement operator, it can be eliminated by choosing the gate time t gate such that ∆ t gate = m · 2π with m ∈ Z in which case α(t gate ) = 0. Implying that condition is fulfilled, the total evolution corresponds exactly to the one out of (10a) (up to local contributions of the first order effective Hamiltonian and global phases), i.e.
Appendix D. Microwave assisted gate A gate interaction with similar properties as the double-path gate discussed in the main text can be constructed out of the single-path setup combined with a continuous microwave driving of the states |g 0 ↔ |g +1 and |g 0 ↔ |g −1 (see figure D1 (a)). Mainly this allows to perform the gate in the non-perturbative regime implying a time condition to assure the independence of the phonon state. As a side effect such a driving also decouples the gate from ground state decoherence.  Considering the single path setup alone, the gate term in (9b) includes the operators In that case a closed phase space trajectory, that is a refocusing to the initial phonon state at a specific time, is prevented by the rotation around two orthogonal axes (σ x and σ y ) [31]. In a more formal way the non-commutativity of σ x and σ y prevents the exact integration of the gate Hamiltonian as described in Appendix C. Now adding a continuous microwave driving such that the σ y contribution is suppressed, removes those difficulties, and additionally leads to a (σ x ⊗ σ x )-type gate similar to the doublepath gate proposal that rotates states both within M 1 and M 2 . One possibility to achieve that task would be to continuously drive the states |g +1 ↔ |g −1 [31,32], what however is not really practicable for our setup. Therefore we will incorporate the full ground state triplet and show that a driving of the form |g 0 ↔ |g ±1 will be suitable for this task as well. In the following we will refer to the qubit operators within {|g +1 , |g −1 } by the indices 'pm' and denote the Pauli spin-1/2 operators in that manifold as {σ pm x , σ pm y , σ pm z , 1 pm } whereas we will denote the spin-1 operators in the ground-state triplet manifold as {S x , S y , S z , 1}.
The Hamiltonian of the total system including the microwave driving H M W is given by where we assumed that the η 2 -terms have been successfully compensated to avoid the existence of phonon-number dependent terms (see green laser coupling in figure D1 (a)). This describes the microwave driving, the global ac-Stark shift of the |g ±1 states with respect to the |g 0 state, the relative shift of the |g ±1 states and the gate relevant term, respectively. Note that the global H 1pm term can be neglected if the microwave frequency is already tuned to the ac-Stark shifted states.
In an interaction picture with respect to the continuous microwave driving and assuming that Ω M W {Ω, δ sp , δ 1 }, the following substitutions can be made by neglecting fast rotating terms ('rotating-wave approximation') Therefore, in that frame, the Hamiltonian takes the form

(D.5)
That way the σ y -contribution is suppressed, all contributions commute, ideally the state |g 0 is never populated in the interaction frame and importantly the operators appearing in the gate Hamiltonian part (the second term in (D.5)) commute what allows to integrate the time evolution exactly as described in Appendix C (corresponding tô O = 1/2 (3/4 σ pm x − 1/4 1 pm ) in equation (C.3)). Note that, as already discussed in the double-path setup, the uncorrelated single flip interactions can be removed by a single echo π-pulse in σ pm y or σ pm z . Therefore, including an echo-pulse the effective time evolution is exactly given by if the time condition ∆ t gate = n (2π) (n ∈ N) is fulfilled. That is, for the optimal choice with respect to the gate time n = 2 (n = 1 cannot be realized due to the required echo pulse) this leads to κ 2 =Ω/∆ = (4/3) 2 θ/π for performing a two qubit rotation Ω gate · t gate = θ. Here it is interesting to note that the (maximal) ratio between the gate speed and the effective excited state decay rate is independent of the absolute values of the laser Rabi frequency (independent of κ 1 = Ω 2 /ν = Ω 1 / 1 as defined in the main text) and therefore the condition Ω M W Ω does not alter the maximal nanodiamond size limitations as depicted in figure 2. However the absolute magnitude of the gate speed (∼ κ 2 1 ) decreases with a decreasing microwave field, such that other limitations as the T 2 -time of the ground state triplet states can replace the effective excited state decay rate as the limiting quantity.
Note also that the continuous microwave driving decouples the gate interaction from ground state decoherence, an effect that can be modelled as a fluctuating energy shift H decoh = δ(t)/2 S z and is suppressed according to equation (D.4), such that the limiting quantity will be given merely by T 1 for a strong driving (for a more complete discussion of the decoupling method we refer to [9]). As a final remark, the interaction frame with respect to j=1,2 H (j) M W can be implemented using the pulse sequence (see figure D1 (c)) with H given by (D.2) and U int = exp(−i π S e ) and S e ∈ {σ pm x , S y , S z } (noting that interpretation as a coupling to the dressed states of the optical dipolar interaction (see figure E1). Additionally direct off-resonantly suppressed dipolar interaction terms of the orderj opt ∼ j opt κ 2 1 appear in the Hamiltonian together with the magnetic equivalents. Interestingly, for the case of equal couplings on both NV centers, only couplings to the |+ dressed state occur (except for the state independent phonon terms described by Ω a ), explaining why the detuning k + (j opt /2) is not a relevant quantity in the effective form (E.4). This is originated in the fact that |− couplings cancel by interference due to the different sign of both paths as illustrated in figure E1. However note that this is only valid for identical coupling configurations on both NV centers, otherwise replacements of the form (here for k=1, upper indices refer to k) 1 Ω 1 Ω 1 Ω are required in δ dp k ,Ω k and Ω a , taking into account couplings to both dressed states with the first part describing the dipolar independent and the second one the dipolar induced contributions. The terms inj opt always involve terms of both k = 1 and k = 2 such that the replacements are obvious in that case.
In the absence of the magnetic dipolar term in (E.4) all terms commute and therefore the optical dipolar contribution just adds to the phonon induced gate interaction that can be derived from the first two terms as in (10a). A small modification of this behaviour arises from the magnetic dipolar contributions, that do not commute with the σ x -terms in (E.4). However due to j mag (δ dp ,Ω) the magnetic dipolar term contributes to (E.4) in the secular approximation as that commutes with all other contributions. Thus, the second effective Hamiltonian form analogue to (10a) is given by wherein the single flip-contributions (the first term) can again be removed by an echo pulse and all contributions commute. The σ 1 z σ 2 z -contributions just correspond to a global phase in the M 1 and M 2 -manifolds such that in total the gate frequency is given by Ω gate = −Ω 2 /∆ + 2j opt + j mag /2 for a M 1 rotation, and Ω gate = −Ω 2 /∆ − j mag /2 for a M 2 -rotation, respectively.
A similar analysis can be carried out for the single-path setup leading to the same replacements of the detunings by → − (j opt /2) in the uncoupled form (9b) and direct dipolar contributions as in (E.4) with In total the gate interaction arises from replacing Ω gate by Ω gate − 4j opt in (10b) beside the detuning replacements.
|− (+1) = 1 √ 2 (|g +1 , e − |e, g +1 ) ) and < 0 (red detuned). Blue lines denote couplings that are independent of the dipolar interaction, whereas red dashed ones describe dipolarly induced processes. Upper indices denote the corresponding NV center and lower ones refer to a specific coupling type. Paths related to the |− (+1) state cancel by interference due to the negative sign if Ω

Appendix F. Mode decay and excited state decay
In here we will briefly comment on how the spontaneous decay and the coupling among vibrational modes can be incorporated into the gate formalism. Both effects can be described using a master equation approach. For the spontaneous excited state decay this takes the form ∂ρ ∂t sp. dec = Γ 2 (2σ − ρσ + − {ρ, σ + σ − }) (F.1) with σ + = |e ( g +1 | + g −1 |) and σ − = (σ + ) † , Γ the excited state decay rate, {a, b} = ab + ba denoting the anti-commutator andσ ± = σ ± e ±i η(a † +a) . The phonon mode relaxation can be modelled as [27] ∂ρ ∂t moderelax. = ν Q (n th +1) a ρ a † − 1 2 a † a, ρ + ν Q n th a † ρ a − 1 2 a a † , ρ (F. 2) with ν the mode frequency as defined in the main text, Q the corresponding mode quality factor and n th the thermal phonon population. Both of those equations hold on the level of the original Hamiltonian after the Schrieffer-Wolff transformation, i.e. in the same frame as Hamiltonian (8).
The elimination of the excited state manifold, i.e. calculating an effective Hamiltonian as in (9a, 9b), corresponds to a unitary transformation T [54] such that H I eff = P g T H T † P g with P g the projector on the ground state manifold and H(t) the original Hamiltonian as e.g. given by (8). Up to second order the transformation is given by [54,55] (for the case of a Hamiltonian that is purely off-diagonal, e.g. only includes terms that couple the ground to the excited state) with H(t ) the ground-excited state coupling Hamiltonian (H sp k (8) for the single path setup or H dp k for the double-path equivalent), and the effective Hamiltonian follows as and a † eff,I = (a eff,I ) † . Therefore the evolution in the first effective frame is correctly described by replacing the operators appearing in the master equations (F.1) and (F.2) with the effective ones as defined in (F.5,F.6). Additionally one has to account for the ground state dephasing by either a stochastic approach or a master equation approach in the Markovian limit [9], unchanged by the unitary transformation for the first effective form, as T is diagonal by construction within the ground state manifold.