Pure down-conversion photons through sub-coherence length domain engineering

Photonic quantum technology relies on ef ﬁ cient sources of coherent single photons, the ideal carriers of quantum information. Heralded single photons from parametric down-conversion can approximate on-demand single photons to a desired degree, with high spectral purities achieved through group-velocity matching and tailored crystal nonlinearities. Here we propose crystal-nonlinearity-engineering techniques with sub-coherence-length domains. We ﬁ rst introduce a combination of two existing methods: a deterministic approach with coherence-length domains and probabilistic domain-width annealing. We then show how the same deterministic domain-ﬂ ip approach can be implemented with sub-coherence-length domains. Both of these complementary techniques create highly pure photons, outperforming previous methods, in particular for short nonlinear crystals matched to femtosecond lasers.


INTRODUCTION
The generation of pure single photons on demand is a key requirement for photonic quantum technologies. Single-photon sources can be approximated using either nonlinear parametric processes or single quantum emitters such as quantum dots [1][2][3]. The most widely used technique for generating single photons is parametric down-conversion (PDC) in nonlinear optical crystals [4][5][6][7]. In a PDC process, a pump laser is sent through a crystal with large χ (2) nonlinearity, where χ (2) is the second order nonlinear electric susceptibility. A pump photon has a small probability of being annihilated while creating pairs of "down-converted" single photons under conservation of energy and momentum. Due to the spontaneous nature of photon-pair creation, the complete generated state is described as a superposition of a large vacuum component, the desired photon pairs, as well as typically undesired higher-order photon-pair events [8,9].
PDC can be used as a heralded source of single photons, in which one photon of a pair is sacrificed in detection to herald the presence of a photon in the other mode. The heralded photon generally emerges in a spectrally mixed state due to strong (anti)correlation in the joint spectrum of PDC photon pairs resulting from the conservation of energy and momentum. To create pure photons, we therefore need to eliminate correlations without introducing mixture in other degrees of freedom.
The easiest way to reduce spectral correlations is to employ narrowband spectral filters. This approach, however, severely compromises the heralding efficiency (i.e. the probability of detecting a photon knowing that the other photon of the pair has been detected) of the source even with ideal filters [10] and decreases the absolute flux of the heralded photons. At high pump powers, narrowband filtering can also introduce mixing in the photon-number degree of freedom [11].
A lossless method to remove frequency correlations is to use group-velocity matching (GVM) [7,[12][13][14]. Starting with a set of desired pump and down-conversion wavelengths, one can find phasematching conditions in certain crystals that allow the inverse group velocity of the pump laser in the nonlinear crystal to either match one of the PDC photon's inverse group velocities or to match the average of the two PDC photons' inverse group velocities. This erases the timing information that otherwise arises between the pump and PDC photons which in consequence leads to more separable joint spectra and high purity photons. GVM can be achieved both in bulk crystals and in periodically poled crystals, where the crystal's axes are flipped every coherence length (i.e. the distance over which the phase between the pump and the PDC photons changes by π) to fulfil the quasi-phase matching condition [15], resulting in a periodical configuration of domains with alternating orientation.
where "p" labels the pump field and "s" and "i" label the two down-converted "signal" and "idler" photons respectively, and T is the time-ordering operator [25][26][27][28]. We call φ(∆k(ω s , ω i )) the phase matching function and define it below. The constant O is related to the transverse modes of the photons [24], and B = χ (2) 0 (hω 0p ω 0s ω 0i / 0 π 3 c 3 n p n s n i ) 1/2 /2, where ω 0,j (with j = p, i, s) are the central frequencies of the fields, n j are the corresponding refractive indices and χ (2) 0 is the crystal's second order nonlinearity. There exist a number of related, but slightly different, definitions of the phase matching function in the literature.
We define it as: where ∆k(ω s , is the phase mismatch, which depends on the material dispersion, and g(z) = χ (2) is the normalised nonlinearity along the crystal which will be effectively "tailored" via domain engineering. We note that g(z) is a real function. When defined as above, the PMF for two consecutive domains is the sum of the PMFs for the individual domains 1 (this definition has units of length, while in other popular definitions, the PMF is dimensionless).
The joint spectrum of the down-converted light is given by the product of the PMF and the pump envelope function. We therefore define the joint spectral amplitude as f (ω s , ω i ) ≡ φ(∆k(ω s , ω i ))α(ω s + ω i ). For the purposes of heralding single photons, we can restrict our analysis to the two-photon state given by the first-order term in the Dyson series expansion of Eq. (1). The spectral purity of the heralded single-photon state is defined as P The purity is therefore limited by correlations in the joint spectral amplitude. We thus seek a separable joint spectral amplitude, that is, a function of the form f (ω s , ω i ) = q(ω s )r(ω i ). For a symmetric joint spectral amplitude, the standard approach is to first choose group velocities that make φ(∆k(ω s , ω i )) intersect α(ω s + ω i ) at right angles (when plotted as a function of ω i and ω s ), then match the widths of the pump envelope and phase matching functions [29]. This removes most of the correlations, but those that are due to the shape of the PMF remain. Spectral filtering can be used to remove the final correlations, but very high spectral purities can only be achieved at the expense of the heralding efficiency. To achieve high purities and high heralding efficiencies, the phase matching function must be suitably engineered.

ENGINEERING THE PHASE MATCHING FUNCTION
In a periodically-poled crystal, g(z) alternates between +1 and −1 every coherence length: for a large number of periods g(z) can be approximated as constant and the PMF is consequently proportional to the sinc function.
Aperiodic poling is also possible. Although such techniques still constrain g(z) to values of +1 and −1, the non-trivial structure makes it possible to shape the effective nonlinearity of the crystal and consequently to customise the PMF. In particular, non-trivial poling can make the PMF approximate a Gaussian function, which is necessary to make f (ω s , ω i ) separable.
Domain engineering methods have long been studied in nonlinear optics, for example to compress or shape pulses in second harmonic generation [22,23,[30][31][32][33]. These techniques have only recently been adapted for single-photon generation. In this context, existing methods for non-trivial poling fall into two categories. Those that vary the domain widths of a predefined poling configuration [17], and those that keep the domain widths equal, but vary their relative orientations [16,18,19]. All methods in the latter category have thus far 1 Take for example a block of two domains. If the normalized nonlinearity function for the ith domain (i = 1, 2) is g i (z), then the normalized nonlinearity function for both domains is g 1,2 (z) = g 1 (z) + g 2 (z). According to our definition, the PMF for both domains is then φ 12 (∆k) = +∞ −∞ g 12 (z) e i∆kz dz = +∞ −∞ g 1 (z) e i∆kz dz + +∞ −∞ g 2 (z) e i∆kz dz = φ 1 (∆k) + φ 2 (∆k), where φ i (∆k) is the PMF for the ith domain. Take on the other hand another popular definition for the PMF, φ other (∆k) = L −1 +∞ −∞ g(z) e i∆kz dz. The function φ other (∆k) is dimensionless, but in this case φ other used domains equal in width to the crystal's coherence length, where the coherence length c is defined in terms of the phase mismatch ∆k(ω s,0 , ω i,0 ) at the central PDC frequencies as follows: c = π/∆k(ω s,0 , ω i,0 ). In the following, we describe two methods for engineering g(z). Both use the method recently introduced by Tambasco [19] as a starting point, but deviate from the method by allowing domain widths smaller than the crystal's coherence length. This move toward subcoherence length structures allows much greater accuracy in tailoring the phase matching function.

Deterministic domain engineering with fixed domain widths
We begin by summarising the method introduced by Tambasco et al. [19]. In this method the width of each domain in the grating is fixed, and the nonlinearity profile is shaped by choosing the orientation of successive "poling blocks" along the crystal, starting with the crystal's input face (the term "poling block" refers to two consecutive domains of fixed length w = c , where c is the coherence length and w is the domain width). The decision to flip (or not flip) a given block is determined by which option gives a closer approximation to the target signal and idler joint field at that point in the crystal. The normalised field inside the crystal at position z is defined as where we have omitted the amplitude prefactor [34]. The goal is to design a crystal with a g(z) that gives the desired PMF according to Eq. (2). We therefore define a g target (z), which in turn defines a target field amplitude It is sufficient to match the the amplitude at a single value of ∆k. At ∆k = π/ c , the propagation of the pump for a distance equal to one domain width changes the real part of the generated field amplitude by a factor ∆A = ±2w/π, where the sign depends on the overall previous domain configuration in the crystal. The imaginary part of the field is always equal to zero at domain boundaries. The poling pattern can be chosen so that the generated field amplitude A(z, π/ c ) along the crystal fits as well as possible the target field amplitude A target (z, π/ c ). For simplicity of notation we will refer to a domain pointing up by "up" and to a domain down by "down".
The poling algorithm proposed by Tambasco et al. involves three different building blocks which are chosen to track the target field amplitude: an up-up block leaves the average field amplitude unchanged, an up-down block increases the average field amplitude by a factor 2∆A, while down-up decrease the average field amplitude by 2∆A. These three possible configurations are graphically represented in Fig. 1 (a).
Tambasco et al. framed their algorithm in terms of blocks to ensure that the inverted regions were of equal width: this is not a necessary requirement-one might just as well choose each individual domain's orientation. We therefore consider the "domain by domain" error function which quantifies the difference between the generated field amplitude at a certain position z and the target field amplitude at z + w (i.e. after one domain). Whether to use a domain up or a domain down is determined by the following conditions: • if e(z + w) ≥ 0 and A(z, π/ c ) ≥ A(z − w, π/ c ) (i.e. in the previous domain the field amplitude was increasing), flip the domain orientation with respect to the previous domain: in this way the amplitude field will continue to increase; • if e(z + w) ≥ 0 and A(z, π/ c ) ≤ A(z − w, π/ c ) (i.e. in the previous domain the field amplitude was decreasing), keep the same orientation of the previous domain: in this way the amplitude field will start increasing; • if e(z + w) < 0 and A(z, π/ c ) ≥ A(z − w, π/ c ), keep the same orientation of the previous domain: in this way the amplitude field will start decreasing; • if e(z + w) < 0 and A(z, π/ c ) ≤ A(z − w, π/ c ), flip the domain orientation with respect to the previous domain: in this way the amplitude field will continue to decrease.
With this technique, it is possible to generate a wide range of PMF shapes, as long as the corresponding field amplitude does not vary too quickly [19]. The computational runtime of this algorithm is minimal (generally of the order of a few seconds).
(a) (c) As discussed above, the case of a Gaussian modulation proportional to g(z) = exp(−(z − L/2) 2 /2σ 2 ) cos((π/ c )z), where L is the length of the crystal, is the most interesting for engineering a pure heralded photon source. Ignoring the quickly oscillating terms, the resulting target field amplitude is where L is the length of the crystal, σ is the width of the Gaussian nonlinearity profile and c is a scale factor. As suggested in [19], a value of σ ≈ L/4 allows to reach high single-photon purity without significant degradation in the source brightness. Choosing the prefactor c = 2/πσ guarantees the optimal PMF: a higher value would compromise the purity because the target amplitude's gradient is too high to be efficiently traced by the actual field while a lower value would reduce the effective nonlinearity of the crystal. Notice that the target field amplitude in Eq. (7) is purely imaginary, while the method described above allows only the real part of the field amplitude to be customised. In principle, it is therefore impossible to use this technique to approximate Eq. (7), but in practice, we can approximate Eq. (7) up to a phase factor of i. This results in a π/2 phase shift in the exponent of Eq. (1), but when considering just the two-photon term, it can be regarded as an irrelevant global phase. Fig. 1 shows a comparison between the algorithm in [19] and our modified algorithm described above. The modified algorithm yields a closer approximation to A target . Consequently, the PMF will be closer to the target and, in the case of the Gaussian shaping, it will lead to higher purities.

Domain width annealing
The previous methods consider the case of a poled crystal with constant domain width, which allows to easily approach the problem. However, there is no evidence to suggest that a fixed-domain structure leads to an optimal result, and it is reasonable to ask if it is possible to improve the PMF shape by slightly varying the width of each domain. To this aim we use an adapted version of the simulated annealing algorithm introduced by Reid et al. in [22,23]. Annealing algorithms are commonly exploited for finding a global minimum of a given function dependent on multiple parameters [35,36] by slightly perturbing the system from a suitable starting point, calculating the relative cost function (commonly called energy in analogy with the internal energy in a physical annealing process) and accepting the change with some probability-the higher the energy the lower is the probability of accepting the new configuration. These kinds of algorithms are probabilistic and may require several runs to get an optimal result. Given a target PMF φ target (∆k), we first find the initial configuration of domain orientations by means of the method described above. Adjacent domains with the same orientation are grouped together into bigger blocks. Secondly, we define a starting temperature T and a temperature step ∆T for the annealing algorithm: we found that a value of T {0.1; 10} and ∆T = T /100000 lead to slightly better results. The width of each block is then perturbed by up to 1% and the relative PMF is computed. The perturbation value 1% is empirically determined: too small values would lead to slow convergence of the algorithm while if the perturbation is too big the algorithm is unstable and doesn't converge to a correct result. It is then possible to find the system energy defined as If this energy is smaller than the minimum energy recorded so far the new domain widths are recorded as the best configuration, and they are accepted with a probability of exp (−E/T ) (even if they are not the best configuration). Finally, the temperature is decreased by ∆T and the algorithm is repeated until the energy becomes smaller than a chosen threshold or the temperature reaches 0. A more detailed description of the algorithm is provided in Appendix A. This algorithm is computationally demanding and may require a few hours for converging to a solution for a crystal having a large number of domains.
One may think to directly use the annealing procedure without using a pre-processed initial configuration as a seed: however, our annealing algorithm doesn't allow the flipping of domain orientations, and thus a near-optimal result cannot be obtained without a suitable starting configuration. Moreover, it's reasonable to think that the method discussed above provides a good initial domain-orientation because we know that the generated amplitude has to fit the ideal case of Eq. (7).

Domain engineering with arbitrary small sub-coherence length domains
Until now we have considered domain-widths equal to the coherence length with small variations of this configuration. In this section we will see that pushing the algorithm beyond this constraint by allowing a finer discretisation of the domain-structure leads to an even better approximation of the target function.
According to Eq. (5), we consider the field amplitude at the end of m-th domain: where w is the domain width and s n = ±1 is the orientation of the n-th domain. Note that A m ({s n } m n=1 , π/ c ) depends on the orientations of all domains that come before it (but not those that come after).
For domain-widths equal to the coherence length, the imaginary part of the field is always zero at the domain boundaries, and therefore it always oscillates about zero (see Fig. 1). If the domain-widths differ from the coherence length, however, the phase might get flipped at a place where the imaginary part is non-zero, providing control over both the real and imaginary parts of the amplitude. With this modification, it is now possible to approximate complex field amplitudes. In particular, the field in Eq. (7) can be approximated up to the correct phase. But to maintain consistency, we will continue to work with the real target function used in the previous section.
To account for the complex nature of the field amplitude, we define a cost function that we want to minimize at each domain: Since the target function has a zero imaginary part, the algorithm with the new cost function will force the imaginary component close to zero.
The algorithm can be summarised as follows. First define a domain width w, the coherence length c , and the number of domains N . Then, starting from a crystal having just one domain, compute the cost function e up for the case where a domain up is added and the cost function e down when a domain down is added. Next, compare the two cost functions: if e up < e down keep the configuration where the up domain was added, otherwise keep the configuration where the down domain was added. Repeat for each subsequent domain. A more detailed description of the algorithm is provided in Appendix A. The running time of this algorithm can vary from a several seconds to several minutes depending on the length of the crystal and on the domain-widths.

Comparison of the algorithms
In this section we benchmark our domain engineering techniques in a realistic scenario. We consider a short potassium titanyl phosphate (KTP) crystal because it can be matched with femtosecond lasers [29], and its optimal GVM condition (pump wavelength of 791 nm and signal/idler wavelengths of 1582 nm) produces photons compatible with telecom technologies.
We tailor a short KTP crystal (∼ 2 mm) with a Gaussian spectral response using three different domain engineering algorithms: the method by Tambasco et al. [19] and our two new methods. The comparison results are presented in Fig. 2. In this regime, the single photon purity increases from a value of 85.4% for a standard periodically poled crystal to 97.3% for the Tambasco algorithm, while a 99.0% photon purity is observed for the width annealing and a value of 99.4% is achieved for arbitrary small sub-coherence domains. A practical limitation for sub-coherence length structures is that poling domains cannot be smaller than a couple of microns. The precision to which such domains can be grown is however very high. These practical aspects and potential tradeoffs coming from uncertainties in domain wall placement and diffusion will be addressed in future work. Here we limited our simulations to domain widths of one tenth of the coherence length. Phase matching functions (first column) in KTP crystals and corresponding joint spectrum and field amplitude along the crystal (second and third columns) in type-II PDC for a short crystal pumped with a Gaussian pulse centred at 791 nm. The joint spectrum and the amplitude along are normalised with respect to φ0 and A0, which are the maximum of the PMF and the maximum amplitude of the periodically poled case, respectively. (a) Standard ppKTP crystal having a sinc-shaped PMF. (b) Crystal poling generated using Tambasco fixed domain width algorithm. (c) Crystal poling generated by domain width annealing (d) Crystal poling generated using arbitrary small sub-coherence domains. The domain width equal to a tenth of the coherence length. This method achieves single photon purities for short crystals (99.4%) that is very close to the purity of long crystals (99.5%). The Sellmeier equations and the temperature-dependent dispersion relation used in the simulations are given in [37][38][39].
The degree of separability of the JSA has been estimated through the Schmidt method by using the singular value decomposition [29,40]. It's important to notice that this method strictly depends on the spectral range under examination: too small ranges lead to inaccurate results, too wide ranges are computationally intractable. It also depends on the discretisation of the JSA: too course a discretisation make the purity artificially high. An empirically-found reasonable choice for these parameters is a range of about eight times the PMF spectral width, and a JSA discretised as a 100 by 100 matrix.
In Fig. 3 we compare the single-photon purity for different methods-our two new methods, and those by Dixon et al. [17] and Tambasco et al. [19]-as a function of crystal length (for σ = L/4). Note that we do not compare with the Dosseva et al. method [18] as this method uses simulated annealing to solve a problem to which the Tambasco et al. method [19] finds the optimal solution. In other words, the purity given by [18] will be bounded from above by the purity given by [19] for any set of parameters. For all algorithms, the purity is highest for long crystals. The method proposed by Dixon et al. [17] has an almost constant value of 97.9% over the range we considered. The purity for all other methods saturates at the same value of 99.5%. Note that what limits the purity here is the choice of σ, not the algorithms themselves. This value can be increased up to 99.9% by taking σ = L/5 at the expense of a lower overall nonlinearity.
The difference between the algorithms is more drastic for short crystals. While the arbitrary-small domains method always achieves an almost optimal result, the annealing algorithm provides less separable JSAs than the algorithm proposed by Dixon et al. [17] for crystals having less than ∼ 60 domains: this might be due to a sub-optimal initial condition for the annealing process which makes the algorithm converge to a local minimum.
The unmodified Tambasco et al. algorithm [19] is the only one that does not yield purities that monotonically increase with crystal length. For crystals of length 80 c and 100 c , the purity drops when compared with slightly shorter crystals. This can be explained by considering that the two-domain-block structure of the algorithm doesn't allow enough flexibility to track the target amplitude closely due to the rough discretisation when the total number of domains is small. Numerical artefacts in the Fourier reconstruction prevented a reliable value for the purity for a 50 c long crystal to be calculated.  It is finally worth mentioning that the symmetrisation of our algorithms is a straightforward procedure whenever the target phase matching function is symmetric (if it is not the case, then it is not possible to have the same spectral response in both directions): either the annealing and the domain engineering with arbitrary small domains can be applied to the first half of the crystal and reproduced specularly on the second half obtaining a symmetric crystal. This could be necessary in experiments which involve a double pump configuration, where a symmetric crystal may be required for having spectrally indistinguishable photons in both directions.

CONCLUSION
Scalable, fault-tolerant quantum photonics will require photons of the highest purities created with near-unity heralding efficiencies. Spectral filtering severely compromises heralding efficiencies [10], increasing the number of multiplexed PDC sources required for approximating nearly-deterministic single-photon sources [9,41,42]. At high pump powers narrowband filtering can also introduce undesirable mixing in other degrees of freedom [11].
Domain engineering is therefore crucial for photonic architectures relying on parametric downconversion. The two sub-coherence length methods we introduced are fast, can readily be implemented commercially, and create higher-purity photons than previous algorithms in particular for short nonlinear crystals matched to femtosecond pump lasers [43].
It will be a challenge to apply our methods to integrated photonics, where photons are typically created via four-wave mixing. Group-velocity matched four-wave mixing photon sources have been demonstrated but the observed photon purity was again limited by spurious frequency correlations caused by the sinc-shaped biphoton amplitudes [44,45]. Techniques for Gaussian apodization in four-wave-mixing have been suggested [46] but they require manipulation of the effective material dispersion which complicates simultaneous group-velocity-matching. An ongoing effort aims to introduce techniques which offer a free parameter similar to periodic poling in this platform to which we hope to map our domain-engineering algorithm.
We expect that domain engineering with arbitrarily small domains will have a range of interesting applications in 'classical' nonlinear optics for which domain engineering was originally designed [22,23,[30][31][32][33], in particular because we can now keep track of both real and imaginary target functions which might offer advantages for phase-sensitive applications.