Scanning quantum correlation microscopy with few emitters

Optical superresolution microscopy is an important field in which nonlinear optical processes or prior information are used to defeat the classical diffraction limit of light. Quantum correlation microscopy uses photon arrival statistics from single photon emitters to aid in the determination of properties including the number of emitters and their relative brightness. Here we model quantum correlation microscopy in the few emitter regime, i.e. three to four single photon emitters below the diffraction limit. We use the Akaike Information Criterion to determine the number of emitters, and we vary the relative contributions of intensity and quantum correlation information to determine the combination that provides optimal imaging. Our results show diffraction unlimited performance and a change in localisation scaling behaviour that depends on emitter closeness.


I. INTRODUCTION
The quest to gain a greater understanding of biological systems and systems at the atomic scale has motivated the discovery of new microscopy techniques to overcome the diffraction limit, such as STED and STORM [1-7].
Although optical superresolution techniques like STED are in principle diffraction unlimited, in practice, the achievable resolution in biological systems is limited by damage induced by the optical beams: phototoxicity [8]. This limits the utility of superresolution, especially in live cell imaging, with limitations on either resolution, imaging time, or both [6,7]. As discussed by Hemmer and Zapata [4], the optical power limitations apply across all superresolution techniques, although tradeoffs to achieve particular goals are usually possible. This motivates the exploration of techniques that use information that is often ignored, and in that context we have chosen to explore the quantum statistics of light emitted in microscopy.
The use of quantum correlations in superresolution microscopy was proposed in one of the first superresolution schemes [9]. Such quantum techniques typically employ the Hanbury Brown and Twiss experiment (HBT) [10] in place of more conventional classical photon collection. HBT uses single-photon detectors and cross-correlates the received signals to provide information about the number of single-photon emitters in the field of view. The information obtained via HBT measurements can then be combined with classical microscopy results to improve their resolution [11,12].
A general treatment of the role that antibunching plays in microscopy can be found in [13], which also clearly shows the achievable improvement in resolution. These results were demonstrated experimentally in [14]. Inspired by this work, Gatto Monticone et al. [15] demonstrated how superresolution could be achieved when using confocal measurements and HBT on a cluster of classically unresolvable fluorescent color centers in diamond. These treatments used HBT (and sometimes higher order) measurements to effectively reduce the width of the point spread function (PSF), where the confocal PSF is raised to the power of k, where k is the order of the correlation used in imaging. This gives rise to an effective standard deviation for the higher order PSF of σ/√k, where σ is the standard deviation of the standard (classical) PSF. Several other recent advances in utilising quantum correlation microscopy include works that use various different methodologies to achieve superresolution for thermal optical sources [16-18]. Other recent works include superresolution imaging using third and fourth order correlations [19], and quantum imaging of remote bodies [20]. A review of several quantum imaging techniques can be found in Ref. [21].
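The σ/√k narrowing follows directly from raising a Gaussian PSF to the k-th power:

```latex
\left[\exp\left(-\frac{r^2}{2\sigma^2}\right)\right]^{k}
  = \exp\left(-\frac{k\,r^2}{2\sigma^2}\right)
  = \exp\left(-\frac{r^2}{2\left(\sigma/\sqrt{k}\right)^{2}}\right),
```

i.e. the k-th order effective PSF is again a Gaussian, with standard deviation σ/√k.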
One surprising result from quantum correlation microscopy is that the information obtained from the HBT experiment is qualitatively different from that obtained using conventional intensity measurements. Worboys et al. demonstrated that the HBT signal for two emitters reveals the relative brightness of those emitters [22]. This insight enabled the development of a new protocol, quantum trilateration, where two single-photon emitters with unknown relative brightness can be localised to arbitrary precision on the basis of data from three measurement locations. Such a protocol would be impossible using conventional intensity measurements alone as the number of free parameters (five) exceeds the number of measurements (three). By contrast, HBT and intensity measurements combined provide intrinsic brightness and relative brightness. This leads to six different measurement results, enabling the localisation of the two emitters. These results generalise to three dimensions [23].
Here, we show that the scaling of our approach with respect to time converges to 1/√t after a certain turning point. Our observations show that this turning point is dependent on the minimum spacing of the emitters. This behaviour is shown for several three- and four-emitter cases of increasing complexity, such as increased background levels and unequal emitter brightness. We present a simple heuristic for obtaining the turning point location by interpolating to find the intersection of the scaling fits before and after the turning point. Our results show that the ratio of intensity to correlation information is of relatively little significance in cases with no background, provided that both information sources are used. In cases with background, there is a slight preference for higher contributions of correlation information. Finally, we demonstrate how the Akaike Information Criterion (AIC) [24] can be used to constrain the number of emitters, with the accuracy of the estimated number of emitters improving as measurement time increases. Additionally, we observe that the measurement time required for accurate emitter number determination decreases at higher background levels.
This manuscript is organised as follows. We first describe the HBT setup and demonstrate how to generate the expected outputs using the second order correlation function. We show the time scaling laws of our approach for several random configurations of three or four emitters, as well as how different weightings of intensity and correlation information affect these results. We then show the evolution of AIC results with increasing measurement time. We repeat this analysis for emitter configurations of equal brightness, equal brightness with uniform background, and unequal brightness with and without background.

II. METHOD

A. Simulating second-order quantum correlations
The second-order correlation function, measured via the Hanbury Brown and Twiss experiment, provides information about the probability of multi-photon emission from a particular field of view. Figure 1 (a) shows a schematic of the HBT setup that we are analysing. This two-detector setup measures the coincidence rate of photons for a system comprised of several single photon emitters. The position of each emitter E_i is x_i = (x_i, y_i). Photons are collected from the sample via the microscope objective, which is modelled via the microscope point spread function (PSF), and the signal is split to two detectors via a beamsplitter. Detector signals can be analysed by summing, which retrieves the conventional intensity measurement, or in coincidence, which provides the HBT signal (second order correlation). Note that because the signals are electronic, both sets of data can be collected simultaneously.
To perform imaging of a fluorescent sample with an unknown number of emitters, we envisage scanning the detector across the field of view, as in [15], although widefield approaches are possible [13], especially given developments in single photon avalanche diode arrays [25,26].
For N single photon emitters, the second order correlation function measured by the HBT setup is [22]

g^(2)(0) = 2 Σ_{i<j} P_i P_j / (Σ_k P_k)^2,   (1)

where P_i and P_j are the detection probabilities of photons from the respective emitters, i and j. The detection probability is found from the product of the excitation probability, emission probability, PSF, all optical losses in the microscope, and any differences between the emitters (for example different orientations of the emitters [27]). Note that although the second order correlation function is often written as a function of the time delay between detections, τ, for our purposes we are only concerned with the correlations at zero time delay, i.e., when both detectors fire simultaneously. This corresponds to the HBT signal at delay time zero, as shown in Figure 1.
As can be seen from Eqn. 1, the second order correlation provides information about the number of emitters, and provides maximum number discrimination when there are two emitters in the field of view. However, as shown by Li et al. [28], higher order correlations provide more information about the number of emitters as the number of emitters increases. Nevertheless, we use the second order correlation function as it is the most convenient and accessible experimental setup.
The general form for the second order correlation function in Eqn. 1 provides one mechanism for treating background, whether arising from background fluorescence, detector dark counts, or both. Here, we model the background as arising from a large number of single photon emitters, where the probability that each will emit in any given time is low, but the product of this probability with the number of emitters in a diffraction limited spot is not negligible. We further assume that the probability of emission/detection from each of these emitters is equal and that the density of emitters across the entire sample is constant. We set the detection probability from each background emitter as P_bg ≪ P_i, where i here refers only to the emitters we wish to characterise (i.e. the bright, non-background emitters). To determine the total number of emitters, we should integrate the background density over the point spread function; however, in the limit that the number of background emitters is large, N → ∞ with N P_bg ≲ min(P_i), the second order correlation becomes [22,27]

g^(2)(0) = [2 Σ_{i<j} P_i P_j + 2 N P_bg Σ_i P_i + (N P_bg)^2] / (Σ_i P_i + N P_bg)^2.   (2)

[Figure 1: (a) Schematic of the HBT setup. (b) Example g^(2) output obtained by correlating photon counts. The probability of all possible two-photon detection events at detectors D1 and D2 for a given delay time is normalised by uncorrelated detection events (τ = ±∞). The example shown is for two equal brightness emitters, which gives g^(2)(τ = 0) = 0.5.]

For simplicity we assume a Gaussian PSF, which is a good approximation to the more correct Airy function [29]. Hence the detection probability for the emitters is modelled via

P_i(x, y) = P_{i,0} exp(−[(x − x_i)^2 + (y − y_i)^2] / (2σ^2)),   (3)

where (x_i, y_i) is the location of emitter i and (x, y) is the center of the microscope PSF. We have also introduced P_{i,0}, which we term the intrinsic brightness of emitter i, defined as the probability of detecting a photon from the emitter when it is at the maximum of the PSF. The standard deviation of the Gaussian PSF is related to the wavelength λ and the numerical aperture NA of the microscope (σ ∝ λ/NA). To avoid details of wavelength and numerical aperture, we normalise all of our distances and resolutions by σ.
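As an illustrative sketch (function and variable names here are our own), the zero-delay correlation of Eqn. 1, its background form in Eqn. 2, and the Gaussian PSF can be evaluated numerically for an arbitrary emitter configuration:

```python
import numpy as np

def psf(x, y, xi, yi, p0, sigma=1.0):
    """Gaussian PSF: detection probability for an emitter at (xi, yi)
    when the collection spot is centred at (x, y). p0 is the intrinsic
    brightness P_i,0 and distances are in units of sigma."""
    return p0 * np.exp(-((x - xi)**2 + (y - yi)**2) / (2.0 * sigma**2))

def g2_zero(x, y, emitters, n_pbg=0.0, sigma=1.0):
    """g2(0) at scan position (x, y) for a list of (xi, yi, p0) single
    photon emitters, with an optional lumped background rate
    n_pbg = N*P_bg (our shorthand), following Eqns. 1 and 2."""
    P = np.array([psf(x, y, xi, yi, p0, sigma) for xi, yi, p0 in emitters])
    s = P.sum()
    pair_sum = (s**2 - (P**2).sum()) / 2.0   # sum over pairs i < j of Pi*Pj
    num = 2.0 * pair_sum + 2.0 * n_pbg * s + n_pbg**2
    return num / (s + n_pbg) ** 2

# n co-located equal brightness emitters give g2(0) = 1 - 1/n
for n in (2, 3, 4):
    print(n, g2_zero(0.0, 0.0, [(0.0, 0.0, 0.1)] * n))
```

For two equal emitters this reproduces g^(2)(0) = 0.5 as in Figure 1, and the 0.67 and 0.75 values quoted later for three and four equal brightness emitters.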
Figure 2 (a) shows a simulation of a particular four emitter configuration in the long time limit, where each emitter has the same intrinsic brightness. In the long time limit, the total number of received counts at location (x, y) is proportional to Σ_i P_i(x, y). We imagine the detector system being scanned across our sample, in this case in a 9 × 9 grid. The color axis shows the intensity (total detected photon counts) normalised by the maximum received number of counts. In Figure 2 (b)-(e), we compare the ground truth case in Figure 2 (a) to several potential fits with different numbers of emitters. Based on the percentage difference of the fitting residual sum of squares (RSS), which varies within 1 percent in the area containing the emitters and 5 percent near the bounds of the field of view, we show that on the basis of intensity alone there are multiple acceptable fits. This is a consequence of the well-known diffraction limit.
We are mostly concerned with quantum imaging as a function of time, so as to best determine how the resolution varies in practical scenarios. We therefore simulate the counts obtained using intensity and coincidence (HBT) measurements via the MATLAB function poissrnd [30]:

c_i = poissrnd(P_i t), c_bg = poissrnd(N P_bg t), c_{i,j} = poissrnd(P_i P_j t), c_{i,bg} = poissrnd(P_i N P_bg t),   (5)

where t is the total detection time (not the HBT correlation time shown in Fig. 1), c_i is the number of counts from emitter i, c_bg is the number of counts from the background emitters, c_{i,j} is the number of coincidences from emitters i and j for i ≠ j, c_{i,bg} is the number of coincidences between emitter i and the background, and we assume c_{i,i} = 0. Because of the large number of background emitters, we cannot assume that the probability of coincidences solely from the background, c_{bg,bg}, is negligible.
For a system of N emitters, with background and time dependence [27], Eq. 2 becomes

g^(2)(0) = t [2 Σ_{i<j} c_{i,j} + 2 Σ_i c_{i,bg} + c_{bg,bg}] / (Σ_i c_i + c_bg)^2.   (6)

The inclusion of background coincidences with the i, j emitters will lead to an increase in the value of g^(2)(0) in locations where typically few coincidence events would be detected, due to the distance between emitters or low measurement times. Background counts correlating with other background counts must also be considered, and will lead to g^(2)(0) approaching 1 in areas where the c_i are small.
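These draws translate directly to NumPy (a sketch: NumPy's Poisson generator in place of MATLAB's poissrnd, with n_pbg as our shorthand for the lumped background rate N P_bg, and the count-based g^(2)(0) assembled following Eqn. 6):

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_g2(P, n_pbg, t, rng=rng):
    """Draw Poisson counts for singles and coincidences (Eqns. 5) and
    assemble the count-based g2(0) of Eqn. 6. P holds the per-emitter
    detection probabilities at the current scan position."""
    P = np.asarray(P, dtype=float)
    c_i = rng.poisson(P * t)                 # singles from each emitter
    c_bg = rng.poisson(n_pbg * t)            # singles from background
    iu, ju = np.triu_indices(len(P), k=1)    # unordered pairs i < j
    c_ij = rng.poisson(P[iu] * P[ju] * t)    # emitter-emitter coincidences
    c_ibg = rng.poisson(P * n_pbg * t)       # emitter-background coincidences
    c_bgbg = rng.poisson(n_pbg**2 * t)       # background-background
    num = 2 * c_ij.sum() + 2 * c_ibg.sum() + c_bgbg
    return t * num / (c_i.sum() + c_bg) ** 2

# two equal brightness emitters, no background: g2(0) -> 0.5 at long times
print(simulate_g2([0.1, 0.1], 0.0, 1e7))
```

At short times the estimate is noisy, reflecting the Poisson statistics discussed below; the long-time value converges to the analytic g^(2)(0).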

B. Calculation of effective PSF and relative weighting of intensity and correlation information
Using the time dependent second order correlation function, we are now able to generate the expected g^(2)(0) for a given collection time. To obtain robust statistics for the expected uncertainty in localisation, we perform 200 independent Monte-Carlo trials for each collection time.
Estimated emitter locations are obtained by minimising the least-squares difference between our model and the intensity and correlation data. From Eqns. 5, the individual photon counts, c_i and c_bg, provide the intensity data, I, while the coincident counts c_{i,j}, c_{i,bg}, and c_{bg,bg} provide the correlation data, C. The fit minimises

RSS = RSS_I + RSS_C,   (7)

where RSS refers to the residual sum of squares measure of the difference between the estimates and the Monte Carlo data.
Following Worboys et al. [22], we determine an estimate for the uncertainty by constructing an effective point spread function. This is achieved by finding the centroid of the data and constructing a polygon that contains 39.5% of the points, i.e., one standard deviation of the points in two dimensions. The area of this polygon is then used to determine the diameter of an equivalent circular region. This diameter is the effective width, w_eff, of the localisation precision, which is analogous to the more usual resolution width in microscopy. This approach also limits the influence that outliers, caused by local minima in the RSS minimisation process, have on the polygon size.
To determine the superresolution factor (improvement over the diffraction limit), γ, we take the confocal width, 2σ, which is obtained from the experimental configuration, and divide it by the average of w_eff over the emitters.
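A simplified sketch of this procedure follows; we use a circular quantile region about the centroid in place of the polygon construction described above, and all names are our own. γ is taken as the confocal width divided by w_eff, so larger γ means greater improvement:

```python
import numpy as np

def effective_width(points):
    """Effective localisation width w_eff from Monte-Carlo position
    estimates: find the centroid, take the radius enclosing 39.5% of
    the points (one 2D standard deviation), and return the diameter of
    the circle with that area. Quantile-based, so robust to outliers
    produced by local minima in the RSS minimisation."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    r = np.linalg.norm(pts - centroid, axis=1)
    r395 = np.quantile(r, 0.395)        # radius containing 39.5% of points
    area = np.pi * r395 ** 2
    return 2.0 * np.sqrt(area / np.pi)  # diameter of the equal-area circle

def superresolution_factor(w_eff_per_emitter, sigma):
    """gamma: confocal width 2*sigma divided by the mean w_eff over the
    emitters; gamma >> 1 indicates resolution well below the
    diffraction limit."""
    return (2.0 * sigma) / np.mean(w_eff_per_emitter)
```

For points scattered with an isotropic 2D Gaussian of standard deviation s, this returns w_eff ≈ 2s, matching the one-standard-deviation interpretation.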
An alternative to estimating the PSF achieved with quantum correlations in imaging applications is given in Ref. [15]. There, expressions for higher order correlation functions are provided, including coefficients that vary with order. g^(2)(0) has a coefficient of 1, meaning that the approach used in Ref. [15] weights intensity and g^(2)(0) information equally. To investigate whether the relative weighting between intensity and g^(2)(0) affects w_eff, we introduce a parameter, α, which is added to Eqn. 7 to adjust the relative weighting.
We explore w_eff as a function of α to determine the extent to which g^(2)(0) influences the localisation precision, where α = 0.5 corresponds to the result in Ref. [15]. As will be seen in the Monte Carlo results below, provided α ≠ 0 and α ≠ 1 (the cases where we fit using only intensity or only g^(2)(0) data, respectively), w_eff has only weak dependence on α, as is expected for a minimisation process.
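A minimal sketch of this α-weighted objective; the (1 − α, α) split between the intensity and correlation residuals is our reading of the weighting, and the toy model and all names here are hypothetical:

```python
import numpy as np

def weighted_rss(params, I_data, C_data, alpha, model):
    """alpha-weighted least-squares objective combining intensity (I)
    and correlation (C) residuals. alpha = 0 fits intensity only,
    alpha = 1 fits g2(0) only, alpha = 0.5 weights them equally."""
    I_fit, C_fit = model(params)            # hypothetical forward model
    rss_I = np.sum((I_data - I_fit) ** 2)   # intensity residuals
    rss_C = np.sum((C_data - C_fit) ** 2)   # g2(0) residuals
    return (1 - alpha) * rss_I + alpha * rss_C

# toy demonstration with a constant-valued "model"
toy_model = lambda p: (np.full(3, p[0]), np.full(3, p[1]))
I = np.ones(3)
C = np.full(3, 0.5)
print(weighted_rss([0.0, 0.0], I, C, 0.0, toy_model))  # 3.0: intensity-only
```

In practice `model` would evaluate Eqns. 3 and 6 at every scan position for a candidate set of emitter positions and brightnesses, and a standard optimiser would minimise `weighted_rss` over `params`.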

C. Akaike Information Criterion for estimating emitter number
The Akaike Information Criterion (AIC) is a mathematical tool for determining goodness of fit [31]. The likelihood functions of several models are compared to obtain a score, with a lower score signifying a greater likelihood of fit given the data. The scores are penalised based on the number of fitted parameters, which mitigates overfitting. As we are using least-squares estimation in our fitting algorithm, we use a form of the AIC equation that employs the residuals of the fitting in the likelihood term [32]:

ν = n ln(σ̂^2) + 2k,   (8)

where ν is the AIC score, n is the sample size, k is the number of fitted parameters, and σ̂^2 is the reduced chi-square statistic, defined as RSS/n.
Here, the only relevant metric of the AIC score is the difference in AIC scores between models. Therefore, to compare models for their goodness of fit, we use the score of the best model, ν_min, and the scores of the other candidate models, ν_i, to calculate their relative goodness of fit [32]:

L_AIC = exp[(ν_min − ν_i)/2],   (9)

where L_AIC is the relative goodness of fit. This results in the best model having L_AIC = 1 (when ν_i = ν_min). Models with L_AIC values close to zero have been determined by the algorithm to not accurately fit the data.
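The least-squares AIC score and the relative-likelihood comparison described above can be sketched as follows (a sketch; the example RSS and parameter counts are made up):

```python
import numpy as np

def aic_score(rss, n, k):
    """Least-squares form of the AIC: nu = n*ln(RSS/n) + 2k, where
    RSS/n plays the role of the likelihood term."""
    return n * np.log(rss / n) + 2 * k

def relative_likelihoods(scores):
    """Relative goodness of fit L_AIC = exp((nu_min - nu_i)/2); the
    best model scores exactly 1, poor models approach 0."""
    scores = np.asarray(scores, dtype=float)
    return np.exp((scores.min() - scores) / 2.0)

# compare hypothetical 2-, 3- and 4-emitter fits (RSS values invented)
n = 81  # e.g. a 9 x 9 measurement grid
scores = [aic_score(rss, n, k) for rss, k in [(5.0, 7), (1.0, 10), (0.9, 13)]]
print(relative_likelihoods(scores))
```

Note how the 2k penalty lets a simpler model win when an extra emitter only marginally reduces the residuals, which is the overfitting protection used throughout the results below.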

III. STUDY OF EQUAL BRIGHTNESS EMITTER CONFIGURATIONS WITHOUT BACKGROUND
In practical situations, the number of emitters will usually not be known a priori. For this reason we have chosen to study several cases of localisation of few emitters, including three and four emitters. In this section we concentrate on equal brightness emitters with no background. The first case we study is a contrived case of four emitters in a line, which allows us to examine the two scaling regimes, short time and long time. We next consider three- and four-emitter configurations with the emitters spaced closely with respect to the diffraction limit.
Equal emitter brightness in this context means equal intrinsic brightness.This does not mean that each emitter will give rise to the same number of detected photons, because each emitter is located in a different position relative to the collection point spread functions.Instead this means that the maximum number of collected photons from each emitter would be the same if the emitters were located at the same positions relative to the collection point spread functions.

A. Four emitters in a line
One of the primary challenges of a localisation technique is how it handles points that are close with respect to the diffraction limit.We expect that as emitter spacing decreases, the ability to resolve emitters will worsen.
To study the effects of emitter spacing in our approach, we start with a specific case of four emitters in a line, with an order of magnitude difference in spacing between two of the emitters. We calculate the improvement in w_eff over time.
The configuration is shown in Figure 3 (a), with w_eff as a function of measurement time shown in Figure 3 (b). From this configuration, we can clearly see two distinct turning points for each w_eff line, which we term 'knees', where the gradient of w_eff changes with time. This relationship between w_eff and time can be expressed as [22]

w_eff(t) = 10^{c_k} t^{m_k},

where t is time, c_k is the y-intercept on the logarithmic scale, and m_k is the gradient of w_eff. c_k and m_k are determined by fitting w_eff linearly between knees, with k indicating the knee prior to which the data is fitted (in order from left to right). As shown in [22], c_k is dependent on the number of measurement locations and the emitter configuration. As we are only interested in scaling laws, we focus on the gradient, m_k.
By observing where knees appear in our w_eff slopes, we find that knees occur when w_eff becomes less than the spacing between the two emitter sets, 0.318σ, and between the two closest emitters, 0.0212σ. We expect that if the spacing between the emitter sets and the two far emitters were greater, we would see an additional knee there. From these results, we see a clear link between emitter spacing and scaling behaviour. We can characterise the scaling of w_eff based on the time at which the turning point occurs, which we denote t_knee.

[Figure 3: (a) Four emitters in a line. Emitters are categorised by the distance to the closest other emitter, creating two sets of emitters with spacings one order of magnitude apart. (b) w_eff as a function of measurement time. The gradients m_1, m_2, and m_3 correspond to three linear fits to each emitter's w_eff over successive time ranges.]

A simple means to obtain t_knee is to interpolate between the w_eff before and after t_knee, i.e.,

w_eff(t) = min(10^{c_1} t^{m_1}, 10^{c_2} t^{m_2}).   (13)

Experimentally, we would expect the emitters to be more evenly distributed across the field of view. This would result in only one knee being easily discernible, and hence only two w_eff regimes, w_eff,1 and w_eff,2.
For simplicity, we will only consider such cases in this paper moving forward. Thus, we will obtain two gradients, m_1 and m_2, and a single t_knee, which we expect to occur where w_eff becomes less than the minimum spacing of the emitters, d_min. As stated, we obtain t_knee using Eqn. 13.
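The knee-location heuristic of Eqn. 13 reduces to solving for the intersection of the two power-law fits; a short sketch (the fit parameters in the example are hypothetical):

```python
import numpy as np

def t_knee(c1, m1, c2, m2):
    """Intersection of the two log-log linear fits
    w_eff = 10**c1 * t**m1 and w_eff = 10**c2 * t**m2:
    setting them equal gives t = 10**((c2 - c1) / (m1 - m2))."""
    return 10.0 ** ((c2 - c1) / (m1 - m2))

def w_eff_model(t, c1, m1, c2, m2):
    """Piecewise heuristic of Eqn. 13: the smaller of the two power laws."""
    return np.minimum(10.0**c1 * t**m1, 10.0**c2 * t**m2)

# example with hypothetical fit parameters: shallow m1, then 1/sqrt(t)
print(t_knee(0.0, -0.05, 2.0, -0.5))  # ~ 2.8e4
```

Before t_knee the shallow branch (m_1) dominates; after it, w_eff follows the steeper 1/√t branch, matching the behaviour described above.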

B. Three emitters of equal brightness with no background
As we are interested in the localisation of more than two emitters, the simplest case we consider is a configuration of three equal brightness, unresolved emitters.
Figure 4 (b) shows w_eff as a function of measurement time. We see two distinct gradients in the w_eff line, which we identify as m_1 and m_2, as described in Section III A. The w_eff gradient begins with m_1 = −0.05 ± 0.03 until t_knee = 6.7 × 10^4 ± 7 × 10^3 P_{i,0}t, where it transitions to m_2 = −0.500 ± 0.007. This transition occurs approximately where w_eff becomes less than d_min, which is 0.250σ for this configuration, in keeping with the intuition built up in Section III A.
To explore how intensity and g^(2)(0) combine to provide higher resolution, in Figure 4 (c) we show w_eff as the weighting parameter α from the residual sum of squares (Eqn. 7) is varied. In the limit α = 0 the fit uses only intensity information for emitter localisation, and α = 1 corresponds to only g^(2)(0) information. We observe that any value of α that is not 0 or 1 results in better localisation than using only one information source. The variation in w_eff is relatively minor, resulting in a relatively flat plateau for 0 < α < 1, where random fluctuations in the algorithm's localisation and the exact emitter configuration play a role in determining the flatness. The flatness of w_eff suggests that the exact contribution weighting matters little as long as both intensity and correlation information are used. Simply using a value of α = 0.5, so that equal weightings are used, would be appropriate. This is analogous to the approach used in Ref. [15], where equal contributions are used. In the following cases, we will provide the value of α that minimises w_eff (i.e., the lowest w_eff in the α plateau).
Figure 4 (d) shows the AIC goodness of fit given the data as a function of time. We see from the results that the model with L_AIC = 1 (indicating the best estimation) changes depending on time. At low times, when the data is particularly noisy, L_AIC = 1 for the model containing two emitters. For this case, at approximately 10^3 P_{i,0}t, the g^(2)(0) counts have increased such that the data is less noisy and we consistently see the three emitter model as the optimal model, consistent with the ground truth. However, due to fluctuations caused by the Poisson statistics of the emissions, the optimal model occasionally changes between the three and four emitter models. This fluctuation in optimal models stops as we reach high measurement times, around 10^10 P_{i,0}t.

C. Four emitters of equal brightness with no background
We now introduce an extra emitter into the problem to understand how the problem grows with increasing complexity. Figure 5 (a) shows the intensity map of four equal brightness emitters on a field with no background, as would be obtained after an infinitely long measurement time, overlaid with a series of contours corresponding to g^(2)(0). In Figure 5 (c), we show the calculated w_eff with varied weighting parameter, α. With the weighting chosen to minimise w_eff, we obtain superresolution factors of γ = 8 ± 4, γ = 54 ± 4, and γ = 2.2 × 10^4 ± 400, each at α = 0.05, for times 100 P_{i,0}t, 5.456 × 10^6 P_{i,0}t, and 10^12 P_{i,0}t, respectively.
Figure 5 (d) shows the AIC goodness of fit given the data as a function of time. As in the previous case, we see that the model with L_AIC = 1 changes depending on time, with models containing fewer emitters achieving better fits at low times. For this configuration, once we reach measurement times of approximately 10^3 P_{i,0}t we begin to consistently see the four emitter model as the optimal model, but we still see some variation in the optimal fit until 5.456 × 10^6 P_{i,0}t.

[Figure 5: (c) Average of all emitter w_eff achieved with different intensity to g^(2)(0) weighting ratios, α. As in Figure 4, w_eff performs best for values of α between 0 and 1. (d) AIC goodness of fit for models with differing numbers of emitters; at times approximately corresponding to t_knee, L_AIC = 1 consistently for the four emitter model.]

D. Effective PSF from intensity only, and relationship between scaling and minimum emitter spacing
Figure 6 shows a comparison of the resolution scaling with respect to measurement time for g^(2)(0)-with-intensity fitting (a) compared with intensity-only fitting (b), for six cases. The cases studied are summarised in Table II, and are identified based on their minimum emitter spacing relative to other configurations with the same number of emitters. We show m_1 and m_2 for all cases, and compare the g^(2)(0)-with-intensity and intensity-only cases; the latter need not scale as approximately 1/√t, as we do not expect intensity-only fitting to be diffraction unlimited.
We see in each of our cases that there is an improvement in emitter localisation when using g^(2)(0) and intensity compared to using intensity alone. After a sufficiently long time (t_knee), we achieve 1/√t scaling of w_eff for all cases when using g^(2)(0). When using intensity only, this is not achieved in the long time limit for the four emitter 'mid' and 'close' cases. While the t_knee for the intensity-only w_eff appears comparable to the g^(2)(0)-with-intensity case at times, there is a larger relative uncertainty in t_knee for the intensity-only cases due to uncertainty in the fits. Generally, t_knee for intensity-only fitting is higher than for g^(2)(0)-with-intensity fitting.
We see in all cases, with the exception of the already resolved three emitter 'far' case, that the data forms a knee where the w_eff gradient converges to 1/√t. As we have seen in the previous equal brightness cases, this knee corresponds to the point where w_eff becomes less than d_min. At this point, w_eff is sufficiently small that each emitter's w_eff is distinguishable, meaning the emitters are localised and we observe the optimal 1/√t scaling of w_eff. In the low time region prior to the knee, the improvement of w_eff with time is slow, as noisy intensity data and limited g^(2)(0) counts lead to overlapping w_eff, creating ambiguity in both emitter locations and numbers.
In summary, we observe that t_knee is lower for cases with a larger d_min, which is in keeping with the heuristic that the 1/√t localisation scaling takes over when the emitters are independently resolved. The relationship is seen in Table II and Figure 6, where t_knee increases as d_min decreases.
When comparing cases in Figure 6 (a), we see that w_eff at a given time can be larger for four emitter cases than for three emitter cases of similar, and sometimes smaller, d_min. We expect that for some configurations, localisation of four emitters will not be as good as in comparable three emitter cases, as there is a loss of g^(2)(0) sensitivity as the number of emitters increases, as studied in [28].
TABLE II. Summary of configurations shown in Figure 6. Configurations are categorised as 'far', 'mid', or 'close' based on their minimum emitter spacing relative to other configurations with the same number of emitters, N. Results obtained using g^(2)(0)-with-intensity (I & C) and intensity-only (I) fitting are compared. Results are in order of the emitters' minimum spacing, d_min. For cases where the interpolated t_knee is < 0, the first data point where w_eff < d_min is used and marked with *. If w_eff is < d_min from the first data point, or m_1 cannot be obtained accurately due to too few data points, N/A is used for the corresponding data.

IV. EMITTER CONFIGURATIONS WITH CONSTANT BACKGROUND

Background signals are present in all practical systems at some level. These can be due to effects such as surface impurities or even dark count rates in detectors. As mentioned above, we treat all sources of background signal equivalently. For simplicity we will always assume that the background signal is constant across the field of view. This approximation may not always be correct, for example in Ref. [33], where the fabrication process led to an increase in background in the vicinity of the emitters. Nevertheless, this approximation will assist in building insight into the role background plays in imaging configurations.
One important result from Eqns. 2 and 6 is that background-emitter correlations lead to HBT signals that would not otherwise be present in the no-background case. How this changes the characteristics of the g^(2)(0) contours from those observed without background can be seen in Fig. 7, where we compare the same configuration with backgrounds of 0 and P_bg = 0.2 P_{i,0}. These additional background-emitter correlations lead to an improvement in the α = 1 results for w_eff, as will be shown below.

A. Three emitters of equal brightness with constant background
We begin our background analysis with three equal brightness emitters in a field with a background of 0.2 P_{i,0}.
The g^(2)(0) value at any location is determined by the combined g^(2)(0) counts from emitter-emitter, emitter-background, and background-background coincidence events, as in Eqn. 2. Hence, in the zero background case for three emitters, we observe the expected value of 0.67 for g^(2)(0) when measuring equidistant from three equal brightness emitters. Background increases the value of g^(2)(0) due to the added photon counts, and hence coincidences, from the background emitters. Additionally, the inclusion of background causes the signal to increase towards 1 further away from the emitters, as background counts become the primary source of g^(2)(0) counts. With the weighting chosen to minimise w_eff, we obtain superresolution factors of γ = 8 ± 1, γ = 360 ± 10, and γ = 1.60 × 10^5 ± 2 × 10^3, each at α = 0.1, for times 100 P_{i,0}t, 5.456 × 10^6 P_{i,0}t, and 10^12 P_{i,0}t, respectively. Figure 8 (d) shows the AIC goodness of fit given the data as a function of time. For this configuration, L_AIC = 1 for the two emitter model at time 100 P_{i,0}t, with a 0.63 goodness of fit score for the three emitter model. For all subsequent times, L_AIC = 1 for the three emitter model, with relatively low goodness of fit scores for all other models as we approach the long time limit.

[Figure 8: Background increases g^(2)(0) at all locations, and g^(2)(0) increases asymptotically to 1 away from the emitters.]

B. Four emitters of equal brightness with constant background
For this configuration, we have added an extra emitter to a field of constant background and have also increased the average spacing of the emitters. As we have noticed that the inclusion of background increases the g^(2)(0) values obtained further away from the emitters, we will study whether this leads to changes in the AIC and w_eff scaling results.
Figure 9 (a) shows the intensity map of four equal-brightness emitters in a field with a background of 0.2Pi,0, as would be obtained after a measurement time of 10^12 Pi,0 t. Overlaid on the intensity map is the corresponding g(2)(0) contour. The emitters are located at positions (x1, y1) = (−1, 0.2)σ, (x2, y2) = (−1.4, −0.3)σ, (x3, y3) = (0.6, −0.5)σ, and (x4, y4) = (0.1, 0.6)σ. Unlike in the previous cases, the average spacing of the emitters is greater than σ, at 1.420σ. However, the minimum spacing of the emitters in this case is still less than σ, else this configuration would already be fully resolvable.
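The quoted spacings can be checked directly from the listed coordinates. A short sketch (the helper name is ours) computing the mean and minimum pairwise spacing in units of σ:

```python
import itertools
import math

def pairwise_spacings(positions):
    # All pairwise Euclidean distances between emitter positions (units of sigma)
    d = [math.dist(p, q) for p, q in itertools.combinations(positions, 2)]
    return sum(d) / len(d), min(d)

# Emitter positions from Figure 9 (a), in units of sigma
positions = [(-1.0, 0.2), (-1.4, -0.3), (0.6, -0.5), (0.1, 0.6)]
mean_d, min_d = pairwise_spacings(positions)
```

This reproduces the average spacing of 1.420σ quoted above, and a minimum spacing of about 0.640σ, matching the d_min = 0.6403σ used in the scaling analysis.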
As in the previous case, we observe areas where g(2)(0) exceeds the expected maximum of 0.75 for a configuration of four equal-brightness emitters. This can be attributed to the inclusion of background-related coincidence events. We observe closed-loop regions of g(2)(0) where the signal reduces further away from the emitters before increasing towards 1 as we move to the edges of the field of view. This region is a transition area where the primary source of g(2)(0) signal changes from mostly emitter coincidences to background-related coincidences.

FIG. 8 (caption fragment). Red dashed lines show the 95% confidence interval of fits. t_knee is 9 × 10^3 ± 2 × 10^3 Pi,0 t. (c) Mean w_eff achieved with different intensity to g(2)(0) weighting ratios, α. Compared to the three-emitter, no-background case (Fig. 4 (c)), higher weightings of g(2)(0), and g(2)(0) alone, show improved localisation performance. (d) AIC likelihood of fit for models with differing numbers of emitters. Compared to the three-emitter, no-background case (Fig. 4 (d)), there is a reduction in the time required for L_AIC = 1 to match the ground truth. Additionally, there is less variation in optimal models as time increases.
Figure 9 (b) shows the w_eff scaling with respect to measurement time. The resolution scaling begins with m1 = −0.42 ± 0.08 until approximately t_knee = 200 ± 300 Pi,0 t, where we transition to m2 = −0.502 ± 0.002. This transition occurs approximately where w_eff becomes less than d_min = 0.6403σ.
Figure 9 (d) shows the AIC goodness of fit given the data as a function of time. For this configuration, L_AIC = 1 for the three-emitter model at times 100 Pi,0 t and 336 Pi,0 t. However, at time 336 Pi,0 t the four-emitter model has a goodness-of-fit score of 0.94, making it almost as viable a fit as the three-emitter model. For all subsequent times, L_AIC = 1 for the four-emitter model, with relatively low goodness-of-fit scores for all other models as we approach the long-time limit.
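The L_AIC scores used throughout can be read via the standard relative-likelihood construction: the best model scores 1, and every other model scores exp((AIC_min − AIC_i)/2). A minimal sketch, assuming least-squares fitting so that AIC is computed from the residual sum of squares (the function names are ours):

```python
import math

def aic_least_squares(rss, n_points, k_params):
    # AIC for a least-squares fit with k free parameters over n data points
    return n_points * math.log(rss / n_points) + 2 * k_params

def relative_likelihoods(aic_values):
    # Relative likelihood of each model versus the best: exp((AIC_min - AIC_i)/2).
    # The optimal model scores exactly 1; others score in (0, 1).
    a_min = min(aic_values)
    return [math.exp((a_min - a) / 2) for a in aic_values]
```

A model sitting 2 AIC units above the best has relative likelihood exp(−1) ≈ 0.37, which is how goodness-of-fit scores such as the 0.94 quoted above should be interpreted.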

C. Discussion on constant background
The scaling of our configurations in Figures 8 & 9 behaves in the same manner as the cases studied with zero background, and is consistent with the heuristic developed in Section III A: t_knee is observed approximately where w_eff becomes less than d_min.
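The heuristic above can be made operational by fitting two power laws to w_eff(t) in log-log space and taking their intersection as t_knee. This sketch is our own illustration; the paper's split point and fitting details may differ.

```python
import numpy as np

def knee_fit(t, w_eff, split):
    # Fit w_eff ~ t^m separately before and after the index `split`,
    # then interpolate t_knee as the intersection of the two log-log lines.
    lt, lw = np.log10(t), np.log10(w_eff)
    m1, c1 = np.polyfit(lt[:split], lw[:split], 1)
    m2, c2 = np.polyfit(lt[split:], lw[split:], 1)
    t_knee = 10.0 ** ((c2 - c1) / (m1 - m2))
    return m1, m2, t_knee
```

On data that follows t^−0.1 before the knee and t^−0.5 after it, this recovers the two slopes and the knee time; in practice the split can be chosen where w_eff first drops below d_min.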
We observe that w_eff is minimised at α values higher than in the zero-background cases (Figs. 4 & 5) for times 100 Pi,0 t, 5.456 × 10^6 Pi,0 t, and 10^12 Pi,0 t, with the optimal α being 0.1 compared to 0.05. Together with the observation that, for cases with background, AIC converges to the ground truth model faster than in the zero-background cases, this suggests that background is assisting in localisation. From the w_eff results in Figures 8 (c) & 9 (c), we see that high values of α, including α = 1, perform noticeably better than in the zero-background cases, particularly at time 10^12 Pi,0 t, where α = 1 previously performed on the order of, or worse than, intensity-only localisation. From this, we can infer that the improvement in localisation that comes with background is caused by the increased coincidence counts from background emitters.

D. Effects of increasing background
To determine the effect of increasing background on w_eff, we consider two four-emitter configurations in which we vary the background level. These cases are summarised in Table III and shown in Figures 10 and 11. m2 and t_knee are calculated in each case for background levels ranging from 0.002Pi,0 to 10Pi,0. For cases where t_knee cannot be interpolated from m1 and m2, t_knee is given as the first data point where w_eff < d_min and is marked by *.
We observe from Table III that, for the configuration in Figure 10, t_knee reduces as we increase background from 0.002Pi,0 to 2Pi,0. As observed in the cases from Figures 8 & 9, background coincidence counts are aiding in localisation. Looking at Figure 10 (b), which shows the α weighting at a background level of Pi,0, we see that a shift favouring higher values of α has occurred, even compared to the results seen in Figures 8 & 9, which were at a background of 0.2Pi,0. Here, w_eff is minimised at α = 0.33, α = 0.90, and α = 0.66 for times 100 Pi,0 t, 5.456 × 10^6 Pi,0 t, and 10^12 Pi,0 t, respectively. As suggested in Section IV C, this is further evidence that background-emitter and background-background correlations can aid in localisation.
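Throughout, α sets the relative weight of the two data channels in the localisation fit. A minimal sketch of one possible combined objective; the convention that larger α weights g(2)(0) more heavily (α = 0 intensity-only, α = 1 g(2)(0)-only) is our assumption based on the trends described, and the paper's exact normalisation may differ.

```python
import numpy as np

def combined_rss(alpha, intensity_residuals, g2_residuals):
    # Weighted residual sum of squares mixing intensity and g2(0) misfits.
    # alpha = 0 -> intensity-only fit; alpha = 1 -> g2(0)-only fit (assumed convention).
    return ((1.0 - alpha) * np.sum(intensity_residuals ** 2)
            + alpha * np.sum(g2_residuals ** 2))
```

An optimiser would minimise this over candidate emitter positions and brightnesses at fixed α; sweeping α and recording the resulting w_eff produces curves like those in Figures 10 (b) and 11 (b).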
While increasing the number of g(2)(0) counts by increasing background levels does aid in localisation, we can also see from Figure 10 and Table III that there appears to be a limit, as seen for background level 10Pi,0. While we do eventually reach a point where w_eff < d_min, around the data point at 1.28 × 10^4 Pi,0 t, there is negligible change in w_eff from this point, as the algorithm is unable to localise any more precisely. At this extreme background, we expect that at longer times the background-background correlations will mostly overpower any emitter-related correlations, preventing further localisation.
Unlike in the previous case, there is little change in t_knee between background levels until we reach 10Pi,0. This can be seen from Table III and Figure 11 (a), where the w_eff lines occupy mostly the same region. Background level 10Pi,0 is a noticeable exception: though we achieve an m2 close to the 1/√t scaling, there is a higher uncertainty relative to lower background levels, and t_knee has a considerably high uncertainty due to fluctuations in w_eff in the long-time region. Our ability to resolve the particles despite the high background can be attributed to the spacing of the emitters around the field of view, which allows for more regions where g(2)(0) does not immediately go to 1 due to background-background coincidences.
As in Figure 10 (b), in Figure 11 (b) we see w_eff minimisation take place at higher α at background level P_bg = Pi,0 compared to zero-background and lower-background cases such as Figures 5 & 9. Here, w_eff is minimised at α = 0.60, α = 0.60, and α = 0.80 for times 100 Pi,0 t, 5.456 × 10^6 Pi,0 t, and 10^12 Pi,0 t, respectively.
We study the effects of background on AIC for the cases shown in Figures 10 and 11 by determining AIC at background levels P_bg = 0.002Pi,0, P_bg = 0.2Pi,0, and P_bg = Pi,0 (Figures 12 and 13).
For the close-emitter case in Figure 12, we see that the model resulting in L_AIC = 1 is consistent through to the long-time limit, with a slight decrease in the time required for model number 4 to be the optimal model as we increase background levels. For the far-emitter case in Figure 13, we see an increase in the consistency of the optimal model at higher background levels, and once again observe a slight decrease in the time required for model number 4 to be the optimal model. This suggests that background coincidence counts can contribute information that the algorithm can use to constrain the emitter number, which can also play a part in improving localisation precision.

FIG. 10 (caption fragment). … g(2)(0) weighting ratios, α. There is improved localisation at higher α compared to the four-emitter, no-background case (Fig. 5 (c)). Red-dashed line indicates 1/√t w_eff scaling, which is the best expected scaling result. (b) w_eff for varying g(2)(0) weightings, α, at a background of P_bg = Pi,0. Note that w_eff is minimised at higher α compared to cases studied at zero background.

V. STUDY OF UNEQUAL BRIGHTNESS EMITTER CONFIGURATIONS
So far, we have studied the simple case of equal-brightness emitters, i.e., Pi,0 = Pj,0. To show that this approach can be applied to the general case, and to determine whether this results in a change in scaling behaviour, we now study fields of emitters of unequal brightness. We consider cases both with and without background.
A. Four emitters of unequal brightness with no background
By comparing Figure 14 (a) to the previous equal-brightness no-background cases, we can see that while the general g(2)(0) contour geometry resembles the equal-brightness no-background case, a shift has occurred in the location of the highest g(2)(0) value. The region of highest g(2)(0) corresponds to the area where the received brightnesses of all emitters are equal, which for the equal-brightness case is in the middle, equidistant from all emitters. Here, the shift has occurred in the direction of the weaker emitters, because the change in brightness further away from the emitters is no longer the same for every emitter.

FIG. 11 (caption fragment). … with a background ranging from P_bg = 0.002Pi,0 to P_bg = 10Pi,0. Due to the larger emitter spacing, the effects of high background are not as significant as in Figure 10. Red-dashed line indicates 1/√t w_eff scaling, which is the best expected scaling result. (b) w_eff results at various g(2)(0) weightings, α, with a background of P_bg = Pi,0. w_eff is slightly more minimised at higher α weighting compared to cases studied at zero background.
Figure 14 (b) shows the w_eff scaling with respect to measurement time. The w_eff scaling begins at m1 = −0.04 ± 0.06, until approximately t_knee = 7 × 10^4 ± 1 × 10^4 Pi,0 t, where we transition to m2 = −0.504 ± 0.008. This transition occurs approximately where w_eff becomes less than d_min = 0.3536σ, following the same trend as in the equal-brightness cases. The relationship between t_knee, w_eff, and d_min is unchanged from the equal-brightness cases.
In Figure 14 (c), we show the calculated w_eff with varying α. With optimal weighting to minimise w_eff, we ob-

FIG. 12. AIC results at background levels P_bg = 0.002Pi,0, P_bg = 0.2Pi,0, and P_bg = Pi,0 for the configuration studied in Figure 10. The time required for AIC to converge to the ground truth model (model number 4 in this case) reduces slightly as background increases.
Figure 14 (d) shows the AIC goodness of fit given the data as a function of time. For this configuration, the model resulting in L_AIC = 1 varies between the three- and five-emitter models until approximately 1.438 × 10^5 Pi,0 t, after which all optimal models are four-emitter models.
Compared to the previous equal-brightness no-background cases (Figs. 4 & 5), both the weighting and AIC behaviour appear unchanged, with the optimal α lying between 0 and 1 and the AIC results converging on the model matching the ground truth as time increases.

B. Four emitters of unequal brightness with constant background
Figure 15 (a) shows the intensity map of four unequal-brightness emitters on a field with a uniform background.

FIG. 13. AIC results at background levels P_bg = 0.002Pi,0, P_bg = 0.2Pi,0, and P_bg = Pi,0 for the configuration studied in Figure 11. As in Figure 11, an increase in background leads to a minor reduction in the time required for the optimal model to converge to model number 4, which is the number of ground truth emitters.
In this configuration, we once again see a shift in the g(2)(0) contour compared to the previous equal-brightness with-background cases. However, note that the shift has occurred in the direction of the brightest emitters rather than the weakest, as was seen in Figure 14. This shift is a consequence of introducing background, as the contributions of background-background correlations, which increase g(2)(0) towards 1, are more significant in areas further away from the brightest emitters. In Figure 15 (c), we show the calculated w_eff with varying α. With optimal weighting to minimise w_eff, we obtain a superresolution factor of γ = 6 ± 5 at α = 0.1, γ = 210 ± 10 at α = 0.25, and γ = 8.2 × 10^4 ± 2 × 10^3 at α = 0.5, for times 100 Pi,0 t, 5.456 × 10^6 Pi,0 t, and 10^12 Pi,0 t, respectively. As in the previous cases with background, we see that the inclusion of background can increase the value of α that minimises w_eff.
Figure 15 (d) shows the AIC goodness of fit given the data as a function of time. For this configuration, the model resulting in L_AIC = 1 increases from the two- to the five-emitter model until approximately 1.274 × 10^4 Pi,0 t, after which all optimal models are four-emitter models. For this case, convergence on the ground truth model does not occur as rapidly as in the previous equal-brightness cases, whether with constant background or with no background.

C. Relationship between scaling and emitter spacing for unequal brightness cases
The scaling behaviour of the unequal-brightness cases is the same as that of the equal-brightness cases with and without background. We once again see that the time required to achieve 1/√t scaling, t_knee, is tied to d_min, and we observe two separate scaling behaviours before and after the knee. To corroborate this for various emitter spacings, and to compare correlation-with-intensity to intensity-only localisation for unequal-brightness cases, we consider the scaling of additional cases as done in Section III D.
In Figure 16, we show a comparison of the w_eff scaling results with respect to time for several unequal-brightness emitter cases, sorted by each case's relative minimum emitter spacing and emitter number. These cases are summarised in Table IV.
As seen in the equal-brightness cases (Figure 6), we observe an increase in t_knee as d_min decreases. t_knee is higher when using intensity-only compared to intensity-with-correlation localisation, and tends to have high relative uncertainty when it can be interpolated. Additionally, we see that m2 is unable to reach 1/√t scaling when using intensity only for our '4 mid', '4 close', and '3 close' cases.

VI. CONCLUSION
Our results show that combining intensity and g(2)(0) information provides improved localisation of few single photon emitters relative to that obtained through intensity information alone. Except in the limit of very large background, where the background intensity is much greater than the emitter brightness, we observe diffraction-unlimited localisation that asymptotically scales as 1/√t, where t is the total measurement time. Localisation scaling of 1/√t can be achieved in the limit of high background when the spacing of the emitters is close to the optical point spread function's standard deviation. Prior to achieving 1/√t scaling, all of the configurations we tested showed a weaker localisation scaling with time, which held until the protocol had isolated the emitters, after which localisation improved more rapidly. We term the point where the scaling laws shift the 'knee', and observe that t_knee depends on the exact geometry of the system under consideration.

TABLE IV. Summary of configurations shown in Figure 16. Configurations are categorised as 'far', 'mid', or 'close' based on their minimum emitter spacing relative to other configurations with the same number of emitters, N. Results obtained using g(2)(0)-with-intensity (I & C) and intensity-only (I) fitting are compared. Results are in order of the emitters' minimum spacing, d_min. For cases where the interpolated t_knee is < 0, the first data point where w_eff < d_min is used and marked with *. If w_eff is < d_min from the first data point, or m1 cannot be obtained accurately due to too few data points, N/A is used for the corresponding data. As was seen in Table II, the results here follow the same trend of t_knee increasing as d_min decreases, with higher t_knee when using intensity only, particularly at small d_min.
While our results provide a simple heuristic to predict scaling behaviour based on t_knee and the geometry of the configuration, further work is required to understand the scaling behaviour prior to the knee, as this could be used to predict the measurement time required to achieve 1/√t scaling, and therefore localise individual emitters, when the scale and geometry of the sample are somewhat known. It is probable that in our model the scaling prior to the knee is a consequence of fitting multiple Gaussians to the data when the emitters are unresolved and/or the number of emitters is unknown. Our results demonstrate that the Akaike Information Criterion can be used to constrain the number of emitters. However, long measurement times are required to converge on the ground truth model in cases with very low or no background.
With the current model, we predict that this quantum correlation technique could be applied in a widefield approach using an array of detectors, or combined with other superresolution techniques, to improve resolution by an order of magnitude. This may aid in the imaging of samples where the amount of light used must be considered, such as in bioimaging.

FIG. 1 .
FIG. 1. (a) Schematic of the HBT experiment setup. Here, we are considering a confocal scan of some sample area containing N_SPS single photon emitters. The beam splitter and two detectors (D1, D2) monitored in coincidence allow for the study of photons from multiple emitters. Photon counts are correlated at C to obtain the second-order correlation function as a function of delay time, τ. (b) Example cross section of the normalised intensity profile obtained by summing photon counts collected at each measurement location. As the two emitters are close with respect to the diffraction limit, we expect to see a single peak. (c) Example second-order correlation function (g(2)) output obtained by correlating photon counts. The probability of all possible two-photon detection events at detectors D1 and D2 for a given delay time is normalised by uncorrelated detection events (τ = ±∞). The example here is for the case of two equal-brightness emitters, which results in g(2)(τ = 0) = 0.5.

FIG. 2 .
FIG. 2. (a) Intensity image of N = 4 equal-brightness emitters at the locations shown by the black dots. The color axis is normalised to the maximum obtained brightness. (b)-(e) Four different fits to the data shown in (a) for 2 to 5 emitters (fitting results shown as crosses). The color shows the percentage difference between the fitted brightness and the original signal brightness at each measurement location. While the difference in RSS values shows changes in the algorithm's preference for fitting, without prior knowledge of emitter number and locations any of these fits could be a potential candidate.

Figure 5.
Figure 5 (b) shows the w_eff scaling with respect to measurement time. The resolution scaling begins with m1 = −0.10 ± 0.04 until approximately t_knee = 4.7 × 10^5 ± 6 × 10^4 Pi,0 t, where we transition to m2 = −0.50 ± 0.01. As in the previous case, this transition corresponds to when the mean w_eff approximately becomes less than d_min, which is 0.1803σ for this configuration.

FIG. 4 .
FIG. 4. (a) Intensity plot and g(2)(0) contour of three emitters on a field with no background at positions (x1, y1) = (−0.25, −0.3)σ, (x2, y2) = (−0.05, −0.45)σ, and (x3, y3) = (0.1, 0)σ. Black dots indicate the ground truth emitter positions. (b) w_eff scaling with measurement time. Blue line shows fitted data belonging to w_eff,1, used to obtain m1. Black line shows fitted data belonging to w_eff,2, used to obtain m2. Red dashed lines show the 95% confidence interval of fits. The lines are extrapolated past the fitted data, showing the projected trajectory of w_eff if the scaling behaviour did not change. t_knee is interpolated to be 6.7 × 10^4 ± 7 × 10^3 Pi,0 t. (c) Average of all emitter w_eff achieved with different intensity to g(2)(0) weighting ratios, α. An improvement in w_eff with a weighting between 0 and 1 means that the algorithm performs best using both data sources. (d) AIC goodness of fit given the data for models with differing numbers of emitters. The best model has a score of 1, with others having a score showing the goodness of fit relative to the optimal model.

FIG. 6 .
FIG. 6. (a) Localisation scaling with time of six different emitter configurations, categorised by their minimum emitter spacing relative to other configurations with the same emitter number, with 'close' having the smallest spacing and 'far' the largest. (b) Resolution scaling for the same configurations as in (a), using only intensity information. From top to bottom according to the legend, the minimum spacing for each configuration is: 1.75σ, 0.81σ, 0.52σ, 0.40σ, 0.25σ, and 0.18σ. Red-dashed line indicates 1/√t w_eff scaling, which is the best expected scaling result.

FIG. 9 .
FIG. 9. (a) Intensity plot and g(2)(0) contour of 4 emitters on a field with background 0.2Pi,0 at positions (x1, y1) = (−1, 0.2)σ, (x2, y2) = (−1.4, −0.3)σ, (x3, y3) = (0.6, −0.5)σ, and (x4, y4) = (0.1, 0.6)σ. Black dots indicate the ground truth emitter positions. (b) w_eff scaling with measurement time. Blue line shows fitted data belonging to w_eff,1, used to obtain m1. Black line shows fitted data belonging to w_eff,2, used to obtain m2. Red dashed lines show the 95% confidence interval of fits. t_knee is 200 ± 300 Pi,0 t. (c) Average of all emitter w_eff achieved with different intensity to g(2)(0) weighting ratios, α. There is improved localisation at higher α compared to the four-emitter, no-background case (Fig. 5 (c)). (d) AIC likelihood of fit for models with differing numbers of emitters. The best model has a score of 1, with others having a score that shows the likelihood of the model being an alternative fit to the data. Similar to Figure 8 (d), we see more consistency in the optimal model at lower times. The optimal model is also determined at a lower time than in the zero-background case (Fig. 5 (d)).

TABLE I .
Summary of configuration shown in Fig. 3 … 10^11, and 10^11 to 10^15, respectively.

TABLE III .
Summary for cases in Figs. 10 and 11.

TABLE IV .
Summary of configurations shown in Figure 16.