Sub-shot-noise interferometry from measurements of the one-body density

We derive the asymptotic maximum-likelihood phase estimation uncertainty for any interferometric protocol where the positions of the probe particles are measured to infer the phase, but where correlations between the particles are not accessible. First, we apply our formula to the estimation of the phase acquired in the Mach–Zehnder interferometer and recover the well-known momentum formula for the phase sensitivity. Then, we apply our results to interferometers with two spatially separated modes, which could be implemented with a Bose–Einstein condensate trapped in a double-well potential. We show that in a simple protocol which estimates the phase from an interference pattern, a sub-shot-noise phase uncertainty of up to $\Delta\theta \propto N^{-2/3}$ can be achieved. One important property of this estimation protocol is that its sensitivity does not depend on the value of the phase θ, contrary to the sensitivity given by the momentum formula for the Mach–Zehnder transformation. Finally, we study the experimental implementation of the above protocol in detail, by numerically simulating the full statistics as well as by considering the main sources of detection noise, and argue that the shot-noise limit could be surpassed with current technology.


A paradigmatic example and benchmark for every interferometer is the Mach-Zehnder interferometer (MZI), where two 'beams' propagating along separated paths accumulate a relative phase θ (to be estimated) and are subsequently recombined through a beam splitter. A similar protocol that uses internal states instead of separate paths is known as Ramsey spectroscopy [21]. It has recently been realized with a Bose-Einstein condensate (BEC) by employing two hyperfine states of the atoms [9,10]. The beam splitter was mimicked by coupling the two modes with a micro-wave pulse for a precisely chosen amount of time. Atomic interferometers with spatially separated modes, which could, for instance, be used to measure forces decaying with the distance, are more challenging in implementation owing to the difficulty in performing the beam-splitter transformation [22,23].
In this paper, we consider a simple 'double-slit' interferometer in which two beams are recombined through the free expansion of two initially localized clouds, with the output signal obtained by measuring the positions of the particles. After the preparation of a suitably entangled input state, sub-shot-noise (SSN) interferometry thus requires refined particle detection at the output. Some new techniques of atom-position measurement, such as the microchannel plate [24], the tapered fiber [25], the light-sheet method [26] or techniques involving atomic fluorescence from the lattice [27], give hope of almost 100% efficient single-atom detection in the near future. Such a tool could, in principle, even give access to atom-atom correlations at all orders.
It has been shown [28,29] that the measurement of the $N$th-order correlation function is the best possible estimation strategy for inferring the phase between two interfering BEC wave packets, and allows one to reach the Heisenberg limit of phase uncertainty. However, even with small BECs, the measurement of the $N$th-order correlation function would require substantial experimental effort, since a huge configurational space must be probed with a sufficient signal-to-noise ratio.
The difficulty of measuring high-order correlation functions is the motivation for this work. We show that by measuring the positions of particles the phase can be estimated with an SSN phase uncertainty using the simple one-body density. We consider two possible detection scenarios: the output signal consists of (i) the positions of single atoms or (ii) the number of atoms per pixel, which corresponds to the commonly employed least-squares estimation from the fit to the average density [30]. For both cases, we compute the general asymptotic phase uncertainty of a maximum-likelihood phase estimation scheme using the one-body density only.
We verify that in an MZI situation, estimating the phase from the one-body density is equivalent to estimation from the average population imbalance between the arms of the interferometer, and recover the known result that the sensitivity can saturate the Heisenberg limit.
We then apply this estimation protocol to our case of interest, namely two interfering BEC wave packets. We provide an analytical expression for the phase uncertainty which shows SSN scaling with phase-squeezed states in the input. Contrary to the MZI, the phase uncertainty does not depend on the phase θ, which could be an important advantage. We then analyze the full statistics of the phase estimation by numerically simulating experiments with a realistic number of particles. We find that already small statistical samples are sufficient to saturate the analytical asymptotic prediction for the phase uncertainty. Finally, we discuss the main sources of noise affecting the interferometric precision, which we expect to concern the atom detection stage. We argue that, even when the effect of imperfect detection is included, the shot-noise limit (SNL) can still be surpassed and that the amount of squeezing necessary could be achieved in a realistic double-well setup.
The paper is organized as follows. In section 2 we derive the general expression for the uncertainty of the phase estimated from the one-body density. We consider two possible detection scenarios. We assume either having access to positions of single particles or a coarse-grained measurement due to a limited detection capability. In section 3, we apply the above results to the case of the MZI. In section 4 we turn to our case study where the interferometer consists of a simple phase imprint between the modes followed by the ballistic expansion of the mode functions. We also show how the phase uncertainty for the phase estimation from the density of the interference pattern is influenced by characteristic sources of noise. The details of analytical calculations are presented in the appendices.

Figure 1. Schematic representation of the atom-position measurement, where the detectors (boxes) turn yellow (numbers 1, 3 and 7) when they are hit by atoms (spheres). Some detectors, however, remain gray (such as detector no. 8), so the detection efficiency drops, and sometimes a neighboring box turns yellow (no. 5 instead of no. 4), which limits the spatial resolution.

Sensitivity of the one-body density estimator
An interferometric estimation protocol can be divided into: (i) the interferometer transformation, (ii) the measurement at the output and (iii) the phase inference through an estimator. In (i), the interferometer imprints a relative phase θ on the N-particle input state $|\psi_{\rm in}\rangle$. Such a transformation can be represented by a unitary evolution operator $\hat U(\theta) = e^{i\theta\hat h}$, where $\hat h$ is linear (i.e. it can be written as a sum of operators acting on each particle separately) and does not depend on θ. In the Heisenberg picture, the field operator evolves as $\hat\Psi(x|\theta) \equiv \hat U^\dagger(\theta)\hat\Psi(x)\hat U(\theta)$ and $|\psi_{\rm in}\rangle$ is unchanged. In the following, we will specify our arguments for the case of particle position measurements at the output, although the results of section 2.1 are valid in general. We will thus consider next that in stage (ii), upon leaving the interferometer, the positions of the particles are detected. As observed above, due to the difficulty of obtaining high-order correlation functions, often only the lowest, namely the density, can be precisely measured. This density, when normalized, gives the probability density of measuring a single particle at position x given θ,

$$p_1(x|\theta) = \frac{1}{N}\langle\hat\Psi^\dagger(x|\theta)\hat\Psi(x|\theta)\rangle. \qquad (1)$$

The average value is calculated for the input state $|\psi_{\rm in}\rangle$. Next, we assume having no access to the correlations, so the phase is inferred only using $p_1(x|\theta)$ and the measurement outcomes obtained in stage (ii).

Single-atom detection
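When in (ii) single-atom detection is performed, see figure 1, each experiment gives the positions of the N atoms, $\vec x_N^{(i)} = (x_1^{(i)}, \ldots, x_N^{(i)})$, with $i = 1, \ldots, m$ labelling the experiments. The likelihood function built from the one-body density is

$$L(\varphi) = \prod_{i=1}^{m}\prod_{k=1}^{N} p_1(x_k^{(i)}|\varphi). \qquad (2)$$

The prescription of the maximum likelihood estimator (MLE) is to infer the phase $\theta^{(m)}_{\rm ML}$ in stage (iii) as the value of φ that maximizes equation (2). As demonstrated in appendix A, this MLE is consistent, i.e. $\theta^{(m)}_{\rm ML} \to \theta$ for $m \to \infty$. There, we also obtain the uncertainty of the estimator,

$$\Delta^2\theta^{(m)}_{\rm ML} = \frac{1}{m}\,\frac{1}{N F_1}\left[1 + (N-1)\frac{C}{F_1}\right], \qquad (3)$$

where C depends on the two-body probability density of detecting one particle at $x_1$ and the other at $x_2$,

$$p_2(x_1,x_2|\theta) = \frac{1}{N(N-1)}\langle\hat\Psi^\dagger(x_1|\theta)\hat\Psi^\dagger(x_2|\theta)\hat\Psi(x_2|\theta)\hat\Psi(x_1|\theta)\rangle, \qquad (4)$$

and reads

$$C = \int dx_1\,dx_2\; p_2(x_1,x_2|\theta)\,\partial_\theta\ln p_1(x_1|\theta)\,\partial_\theta\ln p_1(x_2|\theta). \qquad (5)$$

Furthermore, $F_1$ is the Fisher information calculated with the probability $p_1$,

$$F_1 = \int dx\,\frac{1}{p_1(x|\theta)}\left(\frac{\partial p_1(x|\theta)}{\partial\theta}\right)^2. \qquad (6)$$

Equation (3) is the first important result of this paper. Although the phase is estimated from the single-body probability, its uncertainty depends on both $p_1$ and $p_2$. If C = 0 or is neglected as in [28], equation (3) provides a shot-noise-limited phase uncertainty, since $F_1 \le 1$ [28]. However, C can assume negative values, allowing for SSN phase uncertainty, as will be demonstrated below. We also underline the generality of the above result, which is valid for any quantum state, where the parameter θ is estimated from the one-body density. In analogy, if two-body correlations can be measured in an experiment, four-body correlations would enter the corresponding expression for the asymptotic phase uncertainty.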
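The maximization of a likelihood built only from the one-body density can be sketched numerically. The example below is a minimal illustration, not the authors' simulation: it assumes a toy density $p_1(x|\theta) = 1 + \nu\cos(\kappa x + \theta)$ on the unit interval and draws the particles independently, so C = 0 and only the shot-noise-limited case is covered. The names `p1`, `sample_positions` and `mle_phase`, the grid search and all parameter values are hypothetical choices.

```python
import numpy as np

def p1(x, phi, nu=0.9, kappa=2.0 * np.pi):
    """Toy one-body density on [0, 1): flat envelope, fringes of visibility nu."""
    return 1.0 + nu * np.cos(kappa * x + phi)

def sample_positions(theta, size, rng, nu=0.9):
    """Rejection-sample `size` independent positions from p1(x|theta)."""
    kept = np.empty(0)
    while kept.size < size:
        x = rng.random(2 * size)
        accept = rng.random(2 * size) * (1.0 + nu) < p1(x, theta, nu)
        kept = np.concatenate([kept, x[accept]])
    return kept[:size]

def mle_phase(positions, nu=0.9, grid=2001):
    """Return the trial phase maximizing sum_k ln p1(x_k|phi) on a grid."""
    phis = np.linspace(-np.pi, np.pi, grid)
    loglik = np.log(p1(positions[None, :], phis[:, None], nu)).sum(axis=1)
    return phis[np.argmax(loglik)]

rng = np.random.default_rng(0)
theta_true, n_atoms, m_shots = 0.3, 100, 10
# Independent particles: all m*N positions can be pooled into one sample.
data = sample_positions(theta_true, n_atoms * m_shots, rng)
theta_ml = mle_phase(data)
print(theta_ml)
```

For correlated input states the positions within one shot would have to be drawn from the full N-body probability, while the estimator itself would stay unchanged.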

Multiple atom detection
It is also important to consider the possibility that the detectors cannot resolve the positions of individual particles. In this case, we need to use the coarse-grained density and assume that in stage (ii) of the $i$th experiment the number of atoms $n_k^{(i)}$ in each of the $k = 1, \ldots, n_{\rm bin}$ bins is measured (the bin size $\Delta x$ must be small to precisely sample the density variations). The measurement is repeated m times and the phase is estimated from a least-squares fit of the one-body density to the accumulated data. As discussed in detail in [30], the phase uncertainty of such a fit is equivalent to the phase uncertainty of the MLE with the likelihood function

$$L_{\rm fit}(\varphi) = \prod_{k=1}^{n_{\rm bin}}\frac{1}{\sqrt{2\pi\,\Delta^2 n_k/m}}\exp\left[-\frac{(\bar n_k - \langle n_k\rangle)^2}{2\,\Delta^2 n_k/m}\right],$$

i.e. a product of Gaussians for the shot-averaged occupations $\bar n_k = \frac{1}{m}\sum_{i=1}^{m} n_k^{(i)}$, with a mean $\langle n_k\rangle = \Delta x\,N\,p_1(x_k|\varphi)$ and Poissonian fluctuations, $\Delta^2 n_k = \langle n_k\rangle$ [28]. For the phase uncertainty of this MLE, we obtain equation (7) in a way similar to section 2.1. In the continuous limit $\Delta x \to 0$, formulae (3) and (7) coincide.
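A binned version of the estimator can be sketched along the same lines. The example below is an illustrative assumption rather than the procedure of [30]: it uses a toy density $1 + \nu\cos(\kappa x + \varphi)$ on the unit interval, generates uncorrelated Poissonian counts per bin, and maximizes a Gaussian log-likelihood whose mean and variance are both set by the expected counts. The function names and parameter values are hypothetical.

```python
import numpy as np

def mean_counts(phi, edges, n_atoms, nu=0.9, kappa=2.0 * np.pi):
    """Expected atoms per bin, Delta_x * N * p1(x_k|phi), at the bin centers."""
    centers = 0.5 * (edges[:-1] + edges[1:])
    return np.diff(edges) * n_atoms * (1.0 + nu * np.cos(kappa * centers + phi))

def log_likelihood(phi, avg_counts, edges, n_atoms, m_shots):
    """Gaussian log-likelihood with mean <n_k> and variance <n_k>/m per bin."""
    mu = mean_counts(phi, edges, n_atoms)
    return np.sum(-0.5 * m_shots * (avg_counts - mu) ** 2 / mu - 0.5 * np.log(mu))

def fit_phase(avg_counts, edges, n_atoms, m_shots, grid=2001):
    """Grid search for the phase that maximizes the fit likelihood."""
    phis = np.linspace(-np.pi, np.pi, grid)
    ll = [log_likelihood(p, avg_counts, edges, n_atoms, m_shots) for p in phis]
    return phis[int(np.argmax(ll))]

rng = np.random.default_rng(1)
theta_true, n_atoms, m_shots = -0.5, 100, 10
edges = np.linspace(0.0, 1.0, 51)                 # 50 bins of equal width
# Assumption: independent Poissonian counts in every bin and every shot.
shots = rng.poisson(mean_counts(theta_true, edges, n_atoms), size=(m_shots, 50))
avg_counts = shots.mean(axis=0)                   # shot-averaged occupations
theta_fit = fit_phase(avg_counts, edges, n_atoms, m_shots)
print(theta_fit)
```

Because the counts are generated without inter-bin correlations, this sketch again probes only the shot-noise-limited regime.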

Estimation from the one-body density with the Mach-Zehnder interferometer
As a benchmark for the phase estimation protocol introduced in section 2, we first consider the case when in stage (i) the system acquires the phase θ in an MZI. The interferometric sequence of the MZI consists of three steps. First, the initial two-mode state |ψ in passes a beam splitter. Then, a relative phase θ is imprinted between the modes. In the final stage of the interferometer, another beam splitter acts on the state.
It can be easily shown that the evolution operator of the whole MZI sequence is $\hat U(\theta) = e^{-i\theta\hat J_y}$. It is convenient to switch to the Heisenberg picture. The initial field operator reads $\hat\Psi(x) = \psi_a(x)\hat a + \psi_b(x)\hat b$, where $\hat a$/$\hat b$ annihilates a particle from mode a/b and the corresponding spatial mode function is $\psi_{a/b}(x)$. When passing the MZI, this operator is transformed into $\hat\Psi(x|\theta) = \hat U^\dagger(\theta)\hat\Psi(x)\hat U(\theta)$. We use this result to calculate the one-body (cf equation (1)) and two-body (cf equation (4)) probability densities that enter equation (3), assuming that both the mode functions are point-like and trapped in separate arms of the interferometer, i.e. $\psi_a(x)\psi_b(x) = 0$ for all x. Using an initial N-particle two-mode state $|\psi_{\rm in}\rangle = \sum_j c_j\,|j, N-j\rangle$, where $|c_j|^2$ is the probability of having j atoms in mode a and (N − j) in b, we obtain the phase uncertainty of equation (9). The details of the derivation are presented in appendix B.

The above expression coincides with the phase uncertainty for the estimation of the phase θ from the average population imbalance between the two arms of the MZI. This coincidence can be explained as follows. When atoms are trapped in two separate arms of the interferometer, their positions can be directly translated to the number of particles in each arm, without any loss of information. It is then not surprising that the estimators based on the average population imbalance and the average density are equivalent. Note that for the particular case of θ = 0, we obtain

$$\Delta^2\theta^{(m)}_{\rm ML} = \frac{1}{m}\,\frac{\xi_n^2}{N}, \qquad (10)$$

where $\xi_n^2 = N\,\Delta^2\hat J_z/\langle\hat J_x\rangle^2$ is the spin-squeezing parameter [21,31] related to number squeezing of the initial state. Expression (10) can provide up to the Heisenberg scaling of the phase uncertainty, once the interferometer is fed with a strongly number-squeezed state [32].
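The equivalence with the population-imbalance estimator can be explored with the standard error-propagation formula, $\Delta^2\theta = \Delta^2\hat J_z^{\rm out}/(\partial_\theta\langle\hat J_z^{\rm out}\rangle)^2$. The sketch below is a minimal single-shot (m = 1) illustration; it assumes the usual convention $[\hat J_y, \hat J_z] = i\hat J_x$, for which the output operator is $\hat J_z\cos\theta - \hat J_x\sin\theta$. The helper names `spin_ops`, `expect` and `imbalance_uncertainty` are hypothetical.

```python
import numpy as np

def spin_ops(n):
    """J_x, J_y, J_z for n bosons in two modes, in the J_z eigenbasis."""
    j = n / 2.0
    mz = np.arange(-j, j + 1.0)                        # J_z eigenvalues
    ladder = np.sqrt(j * (j + 1) - mz[:-1] * (mz[:-1] + 1))
    jp = np.diag(ladder, k=-1)                         # raising operator J_+
    return 0.5 * (jp + jp.T), -0.5j * (jp - jp.T), np.diag(mz)

def expect(psi, op):
    """Expectation value <psi|op|psi> of a Hermitian operator."""
    return np.real(np.vdot(psi, op @ psi))

def imbalance_uncertainty(psi, theta, n):
    """Single-shot error propagation for the output J_z after exp(-i theta J_y)."""
    jx, _, jz = spin_ops(n)
    jz_out = np.cos(theta) * jz - np.sin(theta) * jx     # rotated J_z
    slope_op = -np.sin(theta) * jz - np.cos(theta) * jx  # d(jz_out)/d(theta)
    variance = expect(psi, jz_out @ jz_out) - expect(psi, jz_out) ** 2
    return variance / expect(psi, slope_op) ** 2

n = 20
jx, _, _ = spin_ops(n)
coherent = np.linalg.eigh(jx)[1][:, -1]    # spin-coherent state along +x
# For a coherent state N * Delta^2(theta) stays at the shot-noise value 1.
print(n * imbalance_uncertainty(coherent, 0.4, n))
```

Replacing `coherent` by a number-squeezed state would lower the result at θ = 0, in line with equation (10).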

Estimation from the one-body density with the interference pattern
We now turn our attention to the case of our interest, to which we devote the rest of this paper. We consider that the whole phase acquisition sequence consists of two steps. First is the phase imprint performed in the absence of two-body interactions, which in the two-mode picture is represented by the unitary operator $e^{-i\theta\hat J_z}$ and gives

$$\hat\Psi(x|\theta) = \psi_a(x)\,\hat a\,e^{-i\theta/2} + \psi_b(x)\,\hat b\,e^{i\theta/2}.$$

Then, the trap is opened. Since two-body collisions are assumed to be absent, the two wave packets freely spread and interfere. Atom interactions can strongly influence the expansion at an early stage, when the density of the clouds is high [33,34]. We assume that initially $\psi_{a/b}(x)$ have identical shape, but are centered around $\pm x_0$. After a long expansion time (in the far field), the field operator reads

$$\hat\Psi(x|\theta) \propto \tilde\psi\!\left(\frac{x}{\tilde\sigma^2}\right)\left[\hat a\,e^{-\frac{i}{2}(\kappa x+\theta)} + \hat b\,e^{\frac{i}{2}(\kappa x+\theta)}\right], \qquad (11)$$

where $\tilde\sigma = \sqrt{\hbar t/\mu}$ (μ is the atomic mass and t is the expansion time), $\kappa = 2x_0/\tilde\sigma^2$ and $\tilde\psi$ is the Fourier transform of the wave packets at t = 0, the same for $\psi_{a/b}$ [28,29]. Note that we have dropped the common factor $e^{ix^2/(2\tilde\sigma^2)}$. Some aspects of this simple interferometric sequence, which does not require the implementation of a beam splitter, have been discussed in [29].
The field operator $\hat\Psi(x|\theta)$ gives $p_1(x_1|\theta)$ and $p_2(x_1,x_2|\theta)$ presented in appendix C, which are put into equation (3). The integrals are performed analytically assuming that the interference pattern consists of many fringes, giving

$$\Delta^2\theta^{(m)}_{\rm ML} = \frac{1}{m}\,\frac{1}{N}\left[\xi_\phi^2 + \frac{\sqrt{1-\nu^2}}{\nu^2}\right], \qquad (12)$$

where $\xi_\phi^2 = N\,\Delta^2\hat J_y/\langle\hat J_x\rangle^2$ is the spin-squeezing parameter [35] related to phase squeezing of the initial state. Also, we have introduced $\nu = \frac{2}{N}\langle\hat J_x\rangle$, i.e. the visibility of the interference fringes; see equation (C.1) and below for details. One important property of equation (12) is its independence of the actual value of the phase θ. This can be understood as follows.
Probabilities (1) and (4) depend on θ in the same manner, i.e. via a sine or cosine function, see equations (C.1) and (C.2). Since the integration in (5) and (6) runs over the whole space, the shift of the trigonometric functions by a common phase θ is irrelevant, because within the envelope $\tilde\psi(x/\tilde\sigma^2)$ there are many interference fringes. On the other hand, the sensitivity of the MZI (9) depends on θ and has some optimal 'working points' because the wave packets $\psi_{a/b}(x)$ do not add up to a wide envelope and therefore the argument for θ independence valid for the interference pattern does not apply here. Therefore, the θ independence of equation (12) might be an advantage of this estimation protocol with respect to the momentum formula of the MZI, valid when the phase is estimated from the one-body density.
Note that the phase uncertainty of the MZI (equation (10)) would closely resemble the above result if the second term of equation (12) were absent. In the former case, $\Delta^2\theta^{(m)}_{\rm ML}$ benefits from the number squeezing because the estimator is equivalent to the average population imbalance between the arms of the interferometer; therefore a state with reduced population imbalance fluctuations decreases the uncertainty $\Delta^2\theta^{(m)}_{\rm ML}$. Analogously, in the latter case, the decreasing fluctuations of the relative phase between the two modes would improve the estimation precision.
However, the situation is complicated by the presence of the second term in equation (12). Namely, when $\xi_\phi$ drops, so does the fringe visibility ν, and hence the amount of information about the phase θ contained in the one-body density degrades. As a consequence, the precision of the estimation from $p_1(x|\theta)$ deteriorates.
It is now important to check whether, due to this interplay between the improvement from the phase squeezing and the deterioration from the loss of visibility, equation (12) can give SSN phase uncertainty at all. To this end, we calculate $\Delta^2\theta^{(m)}_{\rm ML}$ with phase-squeezed states for N = 100, which we generate by computing the ground state of the two-mode Hamiltonian for negative values of the interaction to tunnelling ratio $UN/J$. For a detailed study on the preparation of phase-squeezed states, including experimental sources of noise, see [35]. For every value of $UN/J$, we find $|\psi_{\rm in}\rangle$ and calculate $\xi_\phi$ and ν. These values are substituted into equation (12) and the resulting $\Delta^2\theta^{(m)}_{\rm ML}$ is plotted in figure 2 as a function of $\xi_\phi$. Clearly, the uncertainty drops below the SNL. We note the presence of an optimal point, where the gain from the spin squeezing is balanced by the loss of visibility. The figure also shows that the phase uncertainty (12) does not saturate the bound set by the quantum Fisher information (QFI), $\Delta^2\theta_{\rm QFI} \equiv 1/(m F_Q)$, where $F_Q$ denotes the QFI.

To support these analytical results for the asymptotics, we also study the full statistics of the protocol by simulating a phase-estimation experiment with N = 100 particles. We generate the input state with a desired amount of phase squeezing and evaluate the full N-body probability $p_N(\vec x_N|\theta)$, with which we draw a single realization yielding the N positions. We repeat the experiment m times and obtain one value of the phase $\theta^{(m)}_{\rm ML}$ using the MLE with equation (2). This cycle is performed $n_{\rm rep}$ times and the variance of the estimator is calculated on the resulting ensemble. The results, plotted (empty circles) in figure 2 for four values of $\xi_\phi$, m = 10 and $n_{\rm rep} = 4000$, agree with the theoretical value calculated with equation (12).
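The generation of phase-squeezed states can be sketched as follows. This is a minimal illustration under two loudly stated assumptions that are not taken from the text: (a) the two-mode Hamiltonian is written as $\hat H/J = -2\hat J_x + \frac{UN/J}{N}\hat J_z^2$, and (b) the scaled uncertainty has the form $N m\,\Delta^2\theta = \xi_\phi^2 + \sqrt{1-\nu^2}/\nu^2$, which reproduces the optimum $2N^{-4/3}$ quoted in the text. The function names and the chosen ratio are hypothetical.

```python
import numpy as np

def spin_ops(n):
    """J_x, J_y, J_z for n bosons in two modes, in the J_z eigenbasis."""
    j = n / 2.0
    mz = np.arange(-j, j + 1.0)
    ladder = np.sqrt(j * (j + 1) - mz[:-1] * (mz[:-1] + 1))
    jp = np.diag(ladder, k=-1)                          # raising operator J_+
    return 0.5 * (jp + jp.T), -0.5j * (jp - jp.T), np.diag(mz)

def phase_squeezing(n, ratio):
    """xi_phi^2 and visibility for the ground state of -2 J_x + (ratio/n) J_z^2."""
    jx, jy, jz = spin_ops(n)
    ham = -2.0 * jx + (ratio / n) * (jz @ jz)           # assumed form, units of J
    psi = np.linalg.eigh(ham)[1][:, 0]                  # lowest eigenvector
    mean_jx = np.real(np.vdot(psi, jx @ psi))
    mean_jy = np.real(np.vdot(psi, jy @ psi))
    var_jy = np.real(np.vdot(psi, jy @ jy @ psi)) - mean_jy ** 2
    return n * var_jy / mean_jx ** 2, 2.0 * mean_jx / n

def scaled_uncertainty(xi2, nu):
    """N * m * Delta^2(theta) under the assumed form xi^2 + sqrt(1 - nu^2) / nu^2."""
    return xi2 + np.sqrt(max(0.0, 1.0 - nu ** 2)) / nu ** 2

n = 100
xi2, nu = phase_squeezing(n, ratio=-1.0)    # attractive interaction, U N / J = -1
print(xi2, nu, scaled_uncertainty(xi2, nu))
```

A value of `scaled_uncertainty` below 1 corresponds to an uncertainty below the SNL; scanning `ratio` over negative values would trace out a curve of the kind described for figure 2, within the assumptions above.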
An important piece of information in view of an experimental implementation is that although formally the MLE saturates (12) when m → ∞, in practice m = 10 is sufficient for reaching the bound, as shown in the upper panel of figure 3. The lower panel of figure 3 shows instead the average value of the estimated phase plus uncertainties as a function of m. Here we have chosen the true value of θ to be 0; thus the figure shows that the estimator is unbiased.

Our next step is to find the best scaling of equation (12) with N. In order to obtain an analytical estimate, we model $|\psi_{\rm in}\rangle$ with a Gaussian, see appendix D, and find that at the optimal point $m\,\Delta^2\theta^{(m),{\rm opt}}_{\rm ML} = 2N^{-4/3}$. This prediction is compared with numerical results, where for every N we evaluate the phase uncertainty (12) at the optimal state. As shown in the inset of figure 4, the agreement between the numerics and the Gaussian approximation is very good. Also, we numerically obtain the scaling of the QFI at the optimal state, $m\,\Delta^2\theta^{\rm opt}_{\rm QFI} = N^{-4/3}$, which differs from $m\,\Delta^2\theta^{(m),{\rm opt}}_{\rm ML}$ just by a factor of 2. Next, we discuss some characteristic experimental imperfections that can spoil the phase uncertainty (12).

Impact of detection imperfections
Single-particle detection, which is the basis of the estimation scheme discussed in section 2.1, is affected by two dominant sources of noise: limited efficiency and finite spatial resolution, see figure 1. The former is incorporated by letting the index k run from 1 to n < N in equation (2). In effect, n replaces N in equation (3). The finite spatial resolution modifies instead both C and $F_1$. We implement it by convolving the probabilities $p_1(x_1|\theta)$ and $p_2(x_1,x_2|\theta)$ with $p(x_1|x_1') = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-(x_1-x_1')^2/(2\sigma^2)}$ (and analogously for $x_2$), a Gaussian probability of detecting an atom at $x_1$ given its true position $x_1'$. The convolutions are calculated analytically and yield the phase uncertainty $\Delta^2\tilde\theta^{(m)}_{\rm ML}$ of equation (13), where $\kappa = 2x_0/\tilde\sigma^2$ and $\tilde\sigma = \sqrt{\hbar t/\mu}$ were defined below equation (11). Above, we assumed that N, n ≫ 1 and η = n/N, and the tilde denotes the phase uncertainty in the presence of errors. Note that for η = 100% and σ = 0 we recover equation (12). In figure 4 we plot equation (13) shown in the inset. Also note that even in the presence of the noise, the phase uncertainty (13) does not depend on θ.
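The mechanism by which finite resolution degrades the signal can be checked numerically. The sketch below is not equation (13); it only illustrates, for an assumed flat-envelope pattern $1 + \nu\cos(\kappa x + \theta)$, the known result that a Gaussian blur of width σ reduces the fringe visibility to $\nu\,e^{-\kappa^2\sigma^2/2}$. The function names and the test values are hypothetical.

```python
import numpy as np

def blurred_visibility(nu, kappa, sigma):
    """Visibility of 1 + nu*cos(kappa*x + theta) after a Gaussian blur of width sigma."""
    return nu * np.exp(-0.5 * (kappa * sigma) ** 2)

def blur_numerically(nu, kappa, sigma, theta=0.3, x0=0.7, half_width=8.0, n=20001):
    """Evaluate the convolution integral at the point x0 by direct summation."""
    y = np.linspace(x0 - half_width * sigma, x0 + half_width * sigma, n)
    kernel = np.exp(-0.5 * ((x0 - y) / sigma) ** 2) / np.sqrt(2.0 * np.pi * sigma ** 2)
    pattern = 1.0 + nu * np.cos(kappa * y + theta)
    return np.sum(kernel * pattern) * (y[1] - y[0])

nu, kappa, sigma, theta, x0 = 0.9, 5.0, 0.2, 0.3, 0.7
analytic = 1.0 + blurred_visibility(nu, kappa, sigma) * np.cos(kappa * x0 + theta)
numeric = blur_numerically(nu, kappa, sigma, theta, x0)
print(analytic, numeric)
```

The blur leaves the fringe spacing and the phase of the pattern untouched, which is consistent with the statement that the uncertainty remains independent of θ.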
The two main sources of detection noise affecting the least-squares fit estimation protocol defined in section 2.2, not shown in figure 1, are imperfect atom counting and the finite bin size. To model the former, we assume that the number of atoms in each bin is measured with some uncertainty, and convolve the probabilities entering the fit likelihood function with a Gaussian error distribution for recording $n_k'$ given the true value $n_k$, $p_{\rm err}(n_k'|n_k) = \frac{\sqrt{m}}{\sqrt{2\pi\sigma_{\rm err}^2}}\,e^{-(n_k'-n_k)^2/(2\sigma_{\rm err}^2/m)}$. As a result, the variance is increased by $\sigma_{\rm err}^2/m$. Very promising detection techniques based on the detection of fluorescence photons rely upon detection of on average α fluorescence photons per atom, and this number fluctuates at the shot-noise level. Error propagation gives $\sigma_{\rm err}^2 = \frac{1}{\alpha}\langle n_k\rangle$ and the phase uncertainty (7) is modified accordingly, giving the uncertainty $\Delta^2\tilde\theta^{(m)}_{\rm ML}$ of equation (14). Using N = 100 and the optimal state denoted by the vertical line in figure 2, for the bin size $\Delta x = 0.2/\kappa$ an SSN phase uncertainty is preserved for $\alpha \gtrsim 2.2$, a condition well satisfied by the light-sheet technique, which can give $\alpha \simeq 10$ photons per atom [26].

Conclusions
We have derived an expression for the phase estimation uncertainty for a generic situation where the phase is inferred from the positions of probe particles, when only the one-body density is known. In a Mach-Zehnder-type interferometer, the sensitivity of this protocol coincides with the well-known error propagation formula and we recover the known Heisenberg-limited phase uncertainty, $\Delta\theta \propto N^{-1}$. We then considered the simplest 'double-slit' interferometer based on spatially interfering wave packets suggested in [29], whose phase uncertainty still scales at best as $\Delta\theta \propto N^{-2/3}$, limited by the loss of fringe visibility. Nevertheless, the phase uncertainty for the interference pattern (12) has a major advantage over the MZI (9). Namely, it performs equally well for any value of θ, while (9) can reach very high values around θ = π/2 and θ = 3π/2. The interferometric protocol employing the interference pattern could be implemented with a BEC trapped in a double-well potential. After imprinting the phase and switching the trap off, the two clouds would expand and interfere. The atoms could then be detected using, for instance, the light-sheet method [26], based on the fluorescence measurement of photons scattered by the atoms crossing a laser beam. Another possible scheme relies upon letting the atoms fall onto an optical lattice. If the interference pattern is dilute so that there is not more than one atom per site, their positions could be detected by a fluorescence measurement [27] with ultra-high efficiency and resolution. Recent quantum interferometry experiments [9,10,12] indicate that very important effects limiting the phase uncertainty are the detection imperfections, which we believe to have been realistically taken into account in our proposal. Another relevant constraint on the precision of a double-well interferometer comes from the noise present in the interferometric sequence (i). A recent theoretical work [35] shows that the amount of squeezing at the optimal point ($\xi_\phi = 0.44$ for N = 100 particles) could be reached with a double-well BEC using a refocusing method even in the presence of the latter source of noise.
where F is the Fisher information to be defined below. The original proof was given in [37]. A recent formulation can be found in [38], and a more accessible albeit less rigorous version is given in [39]. The proof consists of two steps. Firstly, we show that the estimator is consistent, i.e. for m → ∞, the probability that $\theta^{(m)}_{\rm ML} \neq \theta$ goes to 0, which means that the estimator approaches the true value of the phase shift asymptotically. This demonstration follows [38]. Secondly, we adapt the simplified proof of [39] (see also [37]) that a consistent ML estimator is also efficient, which means that it saturates the Cramér-Rao bound, equation (A.1), for m ≫ 1. In particular, the proof shows that for large m, $\theta^{(m)}_{\rm ML}$ is distributed with a Gaussian distribution with the variance of equation (3) and mean θ; hence it is also unbiased.
Before we start, a remark is in order. It might seem surprising or even wrong to use the likelihood function of equation (2). The reason is that we use only the single-particle probabilities to estimate the phase, even though in a single shot of the experiment, N particles will be correlated in general. Hence, the probability that in a single shot the second particle arrives at a position $x_2$ generally depends on the position $x_1$ where the first particle was detected. In traditional ML estimation, one would therefore define the likelihood function as

$$\prod_{i=1}^{m} p_N(\vec x_N^{(i)}|\varphi), \qquad (A.2)$$

with the N-particle conditional probability density $p_N(\vec x_N^{(i)}|\varphi)$ instead, and define the estimator as the maximum of this function. As mentioned above, it can be shown [37][38][39] that this estimator is consistent, unbiased and that it saturates the Cramér-Rao bound with the Fisher information

$$F = \int d\vec x_N\,\frac{1}{p_N(\vec x_N|\theta)}\left(\frac{\partial p_N(\vec x_N|\theta)}{\partial\theta}\right)^2. \qquad (A.3)$$

This bound cannot be overcome by any other estimator using the results of the measurement governed by the probability density $p_N(\vec x_N|\theta)$ or by any of its reductions. However, there are no restrictions on how an estimator can be defined; there will only be differences in the performance of different estimators. We define the estimator based on the single-particle probability density only. As the proof below shows, this estimator is consistent, unbiased and has the variance of equation (3). The variance is bounded from below by (but is generally larger than) the ultimate limit from equation (A.1) with the Fisher information from equation (A.3). However, the estimator has the advantage that it is accessible experimentally and also allows for SSN phase estimation, as shown in the main paper.
Consistency. We recall the definition of $\theta^{(m)}_{\rm ML}$, which is the value of the parameter φ that maximizes L(φ) from equation (2) of the main paper. Equivalently, it maximizes

$$f^{(m)}(\varphi) = \frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{N}\ln p_1(x_k^{(i)}|\varphi), \qquad (A.4)$$

where N is the number of particles and m is the number of independent repetitions of the experiment.
The events $\vec x_N$ are distributed with the conditional probability density $p_N(\vec x_N|\theta)$, where θ is the true value of the phase shift, and $p_1(x_k|\theta)$ is obtained from $p_N(\vec x_N|\theta)$ by integrating over all $x_{j\neq k}$. We assume identifiability, i.e. that $p_1(x|\theta) = p_1(x|\theta')$ for all x is equivalent to θ = θ'. Consistency is then proved by showing that $f(\varphi) = \lim_{m\to\infty} f^{(m)}(\varphi)$ has a maximum at φ = θ as follows:

$$f(\varphi) - f(\theta) = N\int dx\; p_1(x|\theta)\,\ln\frac{p_1(x|\varphi)}{p_1(x|\theta)} \le N\int dx\; p_1(x|\theta)\left[\frac{p_1(x|\varphi)}{p_1(x|\theta)} - 1\right] = 0, \qquad (A.5)$$

where we have used $\ln(y) \le y - 1$. The equality is obtained iff y = 1. Hence the inequality (A.5) is saturated iff $p_1(x|\varphi) = p_1(x|\theta)$ for all x. It follows that φ = θ by the identifiability assumption. Hence $\theta^{(m)}_{\rm ML} \to \theta$ for m → ∞.
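Efficiency. We expand the first derivative of ln L, with L from equation (2) of the main paper, around θ,

$$\sum_{i,k}\partial_\varphi\ln p_1(x_k^{(i)}|\varphi) \simeq \sum_{i,k}\partial_\theta\ln p_1(x_k^{(i)}|\theta) + (\varphi-\theta)\sum_{i,k}\partial_\theta^2\ln p_1(x_k^{(i)}|\theta). \qquad (A.6)$$

We now set $\varphi = \theta^{(m)}_{\rm ML}$. Since this phase maximizes L, the left-hand side vanishes and we obtain

$$\theta^{(m)}_{\rm ML} - \theta = -\frac{\sum_{i,k}\partial_\theta\ln p_1(x_k^{(i)}|\theta)}{\sum_{i,k}\partial_\theta^2\ln p_1(x_k^{(i)}|\theta)}. \qquad (A.7)$$

The consistency of the estimator ensures that we can neglect terms of higher order in $\theta^{(m)}_{\rm ML} - \theta$ provided that m is large enough.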

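In order to investigate how $\theta^{(m)}_{\rm ML} - \theta$ is distributed, we start by computing the average of the denominator,

$$\left\langle\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{N}\partial_\theta^2\ln p_1(x_k^{(i)}|\theta)\right\rangle = -N F_1. \qquad (A.8)$$

The coefficient N results from the indistinguishability of the particles and $F_1$ is the Fisher information calculated with the single-particle probability density,

$$F_1 = \int dx\,\frac{1}{p_1(x|\theta)}\left(\frac{\partial p_1(x|\theta)}{\partial\theta}\right)^2.$$

Coming back to equation (A.7), we obtain

$$\theta^{(m)}_{\rm ML} - \theta = \frac{1}{m}\sum_{i=1}^{m}\frac{1}{N F_1}\sum_{k=1}^{N}\partial_\theta\ln p_1(x_k^{(i)}|\theta).$$

Hence the difference $\theta^{(m)}_{\rm ML} - \theta$ is the average of m random variables, which in the central limit are distributed with a Gaussian probability. With a calculation similar to that of equation (A.8), one obtains that the average value vanishes, which means that the MLE is unbiased. The variance of the distribution in the central limit is

$$\Delta^2\theta^{(m)}_{\rm ML} = \frac{1}{m}\,\frac{1}{N F_1}\left[1 + (N-1)\frac{C}{F_1}\right],$$

with C given by the integral of $p_2(x_1,x_2|\theta)\,\partial_\theta\ln p_1(x_1|\theta)\,\partial_\theta\ln p_1(x_2|\theta)$ over $x_1$ and $x_2$. Therefore, the Fisher information from equation (A.1) is

$$F = \frac{N F_1}{1 + (N-1)\,C/F_1}.$$

Hence the correlations enter via the two-particle correlation function $p_2$ even though in the definition of the estimator only the single-body density $p_1$ is used.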

A.2. Multiple-atom detection
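The multiple-atom detection relies upon dividing the space into $n_{\rm bin}$ bins, each of size $\Delta x$. In every bin, the number of atoms is measured and this result is averaged over m realizations, giving

$$\bar n_k = \frac{1}{m}\sum_{i=1}^{m} n_k^{(i)}. \qquad (A.14)$$

According to the central limit theorem, for large m the probability of detecting $\bar n_k$ is Gaussian,

$$p(\bar n_k|\varphi) = \frac{1}{\sqrt{2\pi\,\Delta^2 n_k/m}}\exp\left[-\frac{(\bar n_k - \langle n_k\rangle)^2}{2\,\Delta^2 n_k/m}\right].$$

Here, $\langle n_k\rangle = \lim_{m\to\infty}\bar n_k$ and $\Delta^2 n_k$ are the associated fluctuations and both depend on the value of φ. We construct the likelihood function as the product of these Gaussians over all bins, $L_{\rm fit}(\varphi) = \prod_{k=1}^{n_{\rm bin}} p(\bar n_k|\varphi)$, and for the following calculations we use the more compact notation employed in the main text. As for single-atom detection, the deviation of the estimator from θ is expressed as a ratio of derivatives of $\ln L_{\rm fit}$, equation (A.17). We start with the calculation of the denominator of equation (A.17) in the large m limit. According to (A.18), both $\langle n_k\rangle$ and $\Delta^2 n_k$ in this denominator are functions of θ. In analogy to equation (A.8), for m → ∞ the denominator is replaced with its average value. Upon averaging, the second term vanishes and the third term is proportional to $\Delta^2 n_k/m$, which is negligible in the limit of large m. The average of the square of the numerator of equation (A.17) reads as

$$m^2\sum_{k,l=1}^{n_{\rm bin}}\frac{\partial_\theta\langle n_k\rangle\,\partial_\theta\langle n_l\rangle}{\Delta^2 n_k\,\Delta^2 n_l}\big\langle(\bar n_k-\langle n_k\rangle)(\bar n_l-\langle n_l\rangle)\big\rangle - 2m^2\sum_{k,l=1}^{n_{\rm bin}}\frac{\partial_\theta\langle n_k\rangle}{\Delta^2 n_k}\,\partial_\theta\!\left(\frac{1}{2\Delta^2 n_l}\right)\big\langle(\bar n_k-\langle n_k\rangle)(\bar n_l-\langle n_l\rangle)^2\big\rangle + m^2\sum_{k,l=1}^{n_{\rm bin}}\partial_\theta\!\left(\frac{1}{2\Delta^2 n_k}\right)\partial_\theta\!\left(\frac{1}{2\Delta^2 n_l}\right)\big\langle(\bar n_k-\langle n_k\rangle)^2(\bar n_l-\langle n_l\rangle)^2\big\rangle. \qquad (A.21)$$

The second term vanishes and the last, after the average is calculated in the central limit, becomes m independent and thus can be dropped when compared to the first term.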
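Let us now separate the sum over k, l into the k = l and k ≠ l parts. The first one simply gives $m\sum_{k}(\partial_\theta\langle n_k\rangle)^2/\Delta^2 n_k$. The non-diagonal part k ≠ l depends on the two-site correlations $\sigma^2_{kl} = m\,\langle(\bar n_k - \langle n_k\rangle)(\bar n_l - \langle n_l\rangle)\rangle$. To justify equation (5) of the main text, we note that in the small bin-size limit the atom-number fluctuations are Poissonian, i.e. $\Delta^2 n_k = p_1(x_k|\theta)\,\Delta x = \langle n_k\rangle$. Therefore, the diagonal part reduces to $m\sum_{k}(\partial_\theta\langle n_k\rangle)^2/\langle n_k\rangle$. The last step is to calculate the cross-correlation term $\sigma^2_{kl}$ from the two-body probability $p_2$, using a relation between $\langle n_k n_l\rangle$ and $p_2(x_k, x_l|\theta)$ that is true for k ≠ l.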

Appendix B. Derivation of the phase uncertainty for the Mach-Zehnder interferometer
When the two wave packets are fully separated, so that $\psi_a(x)\psi_b(x) = 0$ for all x, the one-body probability (cf equation (1)) is expressed through the fringe visibility $\nu = \frac{2}{N}\langle\hat J_x\rangle$. This probability gives $F_1$. Next, we calculate the second-order probability (cf equation (4)) and insert it together with $p_1(x|\theta)$ into the definition of C. Combining the resulting expressions for $F_1$ and C as in equation (3), we obtain equation (9).

Appendix D. Gaussian scaling
To find the best possible scaling of the phase uncertainty (12) with N, we model $|\psi_{\rm in}\rangle$ with a Gaussian state, obtained by applying the operator $e^{-i\frac{\pi}{2}\hat J_x}$ to a Gaussian number-squeezed state. This operator represents a beam splitter which transforms the number-squeezed into the phase-squeezed state. For $\xi_\phi = 1$ the resulting state is spin coherent, and by decreasing $\xi_\phi$ we increase the amount of phase squeezing.
This state is used to calculate the expectation values $\langle\hat J_x\rangle$ and $\langle\hat J_y^2\rangle$ entering equation (12). For N ≫ 1 the summation over j can be approximated with an integral. This way we obtain analytical expressions which are then substituted into equation (12). Taking $\xi_\phi^2 = N^{-\beta}$, where 0 ≤ β ≤ 1, we obtain a sum of two terms with different power-law dependence on N. The phase uncertainty is optimal when these two terms are equal; otherwise one of them would dominate at large N. This condition gives $\beta_{\rm opt} = \frac{1}{3}$ and $m\,\Delta^2\theta^{(m),{\rm opt}}_{\rm ML} = 2N^{-4/3}$.
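The balancing argument can be written out as a short worked calculation. The power laws in the first line are an assumption of this sketch (a squeezing term $N^{-(1+\beta)}$ and a visibility term $N^{-(3-\beta)/2}$); they are chosen because they reproduce the quoted values of $\beta_{\rm opt}$ and of the optimal uncertainty, and are not copied from the original equations.

```latex
\begin{align}
  m\,\Delta^2\theta^{(m)}_{\rm ML} &\simeq N^{-(1+\beta)} + N^{-(3-\beta)/2}
    && \text{(assumed large-$N$ form)} \\
  1 + \beta_{\rm opt} &= \frac{3-\beta_{\rm opt}}{2}
    \;\;\Longrightarrow\;\; \beta_{\rm opt} = \frac{1}{3}
    && \text{(equal exponents)} \\
  m\,\Delta^2\theta^{(m),{\rm opt}}_{\rm ML} &\simeq 2N^{-4/3}
    \;\;\Longrightarrow\;\; \Delta\theta \propto N^{-2/3}
    && \text{(both terms equal $N^{-4/3}$)}
\end{align}
```

For N = 100 this gives $\xi_\phi = N^{-1/6} \approx 0.46$, close to the numerically optimal value $\xi_\phi = 0.44$ quoted in the conclusions.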