Coherent fiber links operated for years: effect of missing data

Aiming at delivering a highly available service, the French national optical fiber link network is run mostly unmanned and automatically, with the help of a global supervision. However, at a year scale, missing data are seemingly unavoidable. Here, we present a first study of the uncertainty of coherent fiber links with missing data. We present the tools to assess statistical properties for processes which are not strictly stationary, and a simulation of optical fiber links depending only on a handful of parameters. We show how missing data affect the phase-coherent optical fiber links, and how to mitigate the issue with a fill-in procedure that preserves the statistical properties. First, we apply the method to a 5-year-long data set of a 1410 km long fiber link. Second, we apply the method to the case of optical clock comparisons, where the downtimes of the optical clocks degrade the coherence of the links. We show that our methodology of processing the missing data is robust and converges to consistent mean values, even with very low uptimes. We present an offset and uncertainty contribution from the French fiber network of $2.4(9.0) \times 10^{-20}$, an improvement by a factor of 5 compared with a processing that does not take the effect of missing data into account.


Introduction
Optical clocks are a new generation of atomic clocks, with superior frequency stability and accuracy as compared to clocks based on microwave transitions. However, the comparison of these optical clocks over large distances is very challenging, since traditional means of comparison of the clocks through the global navigation satellite system (GNSS) are no longer viable. Reliable and long-term comparison of optical clocks at $10^{-18}$ fractional frequency uncertainty is among the mandatory criteria set on the roadmap for a new definition of the SI-second [1,2]. The operation of optical clocks in a network is a very exciting prospect, as it enables stringent tests of special and general relativity [3][4][5][6], search for dark matter [7], and direct measurement of the differential redshift between frequency standards at different geopotentials [8][9][10][11]. Coherent optical fiber links have shown great promise, allowing the comparison of optical clocks on a continental scale [10]. This has led to the development of coherent optical networks around the world [12]. In France, the Réseau Fibré Métrologique à Vocation Européenne (REFIMEVE) [13] network disseminates an ultra-stable optical frequency over large distances to multiple users, aiming at more than 30 laboratories all over the country [14]. Applications range from atomic and molecular spectroscopy [15][16][17][18][19][20], radio-astronomy [21], laser frequency control [22,23], tests of fundamental physics [5,7,24,25,26], as well as particle and high energy physics [27,28]. Aiming at delivering a highly available service, the links are run mostly unmanned and automatically, with the help of a global supervision. A large amount of data is gathered continuously over the years from all the links across the network. The data sets of two of the links of this network are used in this study.
Figure 1. Illustration of the active noise compensation scheme (ANC) used for long distance fiber links. The end-to-end measurement is recorded as the beat note between the local ultra-stable cavity and the light disseminated first in the uplink, and afterward in the downlink interferometer.
* Author to whom any correspondence should be addressed.
3 Now affiliated at Leibniz-Institut für Astrophysik Potsdam (AIP) An der Sternwarte 16, D-14482 Potsdam, Germany.
4 Now affiliated at 2.
Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
Beyond the need for the redefinition of the SI unit 'second', ultra-precise frequency comparisons with an uncertainty below $10^{-18}$ rely either on continuous optical frequency dissemination or on a rigorous evaluation of the effect of the missing data. Interruptions do however occur in the form of unlocks of the link, cycle slips, polarization adjustments, and general maintenance on the network. A rigorous evaluation of the performance and statistical behavior of the fiber links is needed to handle the issues from missing data. Similar evaluations have been performed in the case of frequency transfer through the GNSS and two-way satellite time and frequency transfer [29][30][31][32][33][34][35][36][37], but were never performed in the frame of coherent fiber links, which raises novel challenges.
Indeed, the strength of optical fiber links is that they operate in the phase-coherent regime, which ensures that their instability decreases faster with averaging time. This phase coherence, however, is the origin of aliasing effects [38], and adds complexity to the post-processing in the case of missing data, where aliasing can degrade the evaluation of the frequency transfer.
The second difficulty with fiber links arises from the fact that the noise is not strictly stationary [39,40]. The performance of a link depends on external conditions that are imperfectly under control. The comparison of performances varying only one parameter is not always obvious.
A first approach for optical fiber links was introduced and discussed in [14,41]. Here we provide a concise and consistent evaluation of the fiber links reduced to a handful of parameters, both on short/medium term (on the scale of seconds to hours) and very long term (on the scale of years). As the former case operates in the phase-coherent regime, which is not the case of the latter, we will split up their analyses: we first evaluate the short/medium term behavior of the fiber links, followed by an analysis of the aliasing effects arising from the sampling of missing data. Afterward, we analyze the long-term behavior of the links, using a data set of 1890 days from the REFIMEVE network. We apply the study to the case of optical clock comparisons and evaluate the effect of the downtime of the clocks on the uncertainty budget of the contribution of the link.

The experiment
In this paper, we will show studies of two links in the REFIMEVE network, which we will call Link A and Link B. Link A stretches from Systèmes de Référence Temps-Espace (SYRTE) in Paris to Laboratoire de Physique des Lasers (LPL) in the city of Villetaneuse north of Paris, where it is looped back to SYRTE. It has a total length of 86 km and is a two-span cascaded link.
Link B is 1410 km long and stretches from SYRTE to the University of Strasbourg, where it is looped back to SYRTE. This link is used in the clock comparisons between SYRTE and Physikalisch-Technische Bundesanstalt (PTB, Braunschweig, Germany). This link is a four-span cascaded link, which enables us to compensate more noise in each segment and reach a higher correction bandwidth and noise rejection [42]. It uses repeater laser stations (RLSs) and since 2018 multibranch laser stations (MLSs) [14,43]. This link is optically amplified in 16 bidirectional erbium-doped fiber amplifiers (EDFAs).
Each segment of the fiber links consists of a strongly imbalanced Michelson interferometer, where an acousto-optic modulator (AOM) is used as an actuator on the outgoing light, compensating the noise of the reflected signal in the fiber [44]. In this paper, we will focus on the properties of the end-to-end (E2E) signal of the fiber link, which is an out-of-loop assessment of the frequency transfer measured by comparing the outgoing light with the looped back signal. This scheme is illustrated in figure 1. The E2E signal of the fiber link is counted by dead-time free frequency counters. Our experiments use K+K counters [45] with a fundamental sampling interval of 1 ms, operated in Λ mode with a gate time of $\tau_G = 1$ s, corresponding to a bandwidth of 0.5 Hz [46]. The reference clock is provided by an active H-Maser disseminated by fiber links across the campus and has negligible contribution to the uncertainty [47]. Further information on the links and technical details can be found in [14,43].

Modeling fiber links
The phase of a continuous E2E signal without any missing data can be expressed as
$$\varphi(t) = 2\pi\nu_0 \int_0^t y(t')\,\mathrm{d}t' + \sum_n \varphi_n(t), \qquad (1)$$
with $\nu_0$ being the optical carrier frequency and $y(t)$ the relative stochastic frequency fluctuations of the link. The terms $\varphi_n(t) = A_n \sin(2\pi f_n t)$ sum up any periodic perturbations to the phase affecting the fiber link, where $A_n$ is the modulation amplitude and $f_n$ the modulation frequency. Pseudo-periodic perturbations are expected, since the optical length of the short arm of the interferometers is sensitive to temperature variations that cannot be compensated from the interferometric measurement [44,48,49]. On top of day-night fluctuations, several groups report observations of the effects of air conditioning systems that impose much shorter time periods, typically 400 s to 2000 s. More recently, humidity variations were also found to play a similar role [47]. The global mathematical frame of the study is given in appendix A.

Case study with mid-haul and long-haul fiber links
We show in figure 2 four representations of the end-to-end measurement for a duration of half a day, showing a red trace for the mid-haul (Link A) and a blue trace for the long-haul (Link B) fiber link. Figure 2(a) shows the fractional frequency fluctuations over the acquisition period, which we conveniently chose without any missing data. Figure 2(b) shows the distribution of the data. We fit this distribution with a pseudo-Voigt profile, and extract from the fit the full width at half maximum (FWHM) [...] data, with a window of 5 s. Using this approach, the short-term periodic perturbations of Link B become clearly visible. As displayed in table 1, the FWHM of Link A is as low as 16 mHz, and is 122 mHz for Link B, showing that the degradation of the transmitted laser is very low and can be neglected. Although an extensive study of the line width of the distributed signal is beyond the scope of this paper, we can notice that these line widths are much lower than that of lasers usually used for precision spectroscopy, and thus the transmitted signals are worth using at the user end to reduce the line width of local lasers [12,18,22]. We also notice that the weight of the Lorentzian part is lower for Link B than for Link A, showing that the contribution of white frequency noise is higher than that of white phase noise, as expected for a longer link. Similarly, the modified Allan deviation (MDEV) of $4 \times 10^{-17}$ at 1 s averaging time for Link A is typically ten times lower than for Link B.
To identify the periods and amplitudes of the pseudo oscillations, elaborate methods applied to GNSS and geodetic surveys are reported in the literature, such as principal component analysis (PCA), independent component analysis (ICA) and wavelets [50][51][52]. Here, we use a simple peak identification procedure on the auto-correlation plot to extract their amplitude and their frequency. It appears to be sufficient for the case of fiber links, where there are generally only a few components, the signal-to-noise ratio is greater than 1, and there is no phase ambiguity. Periodic perturbations are clearly visible for Link B and are discussed below.
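The peak identification on the auto-correlation can be illustrated with a minimal sketch. It is not the processing code used for the links: the function names, the test signal and the choice of keeping only the highest peak are assumptions made for illustration, and only the period (not the amplitude) is extracted.

```python
import math

def autocorrelation(x, max_lag):
    """Normalized sample auto-correlation of x for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    var = sum(v * v for v in d)
    return [sum(d[i] * d[i + lag] for i in range(n - lag)) / var
            for lag in range(max_lag + 1)]

def first_peak_lag(acf):
    """Lag of the highest local maximum after lag 0 (None if no peak)."""
    peaks = [k for k in range(1, len(acf) - 1)
             if acf[k - 1] < acf[k] >= acf[k + 1]]
    return max(peaks, key=lambda k: acf[k]) if peaks else None

# Hypothetical test signal: a pure modulation with a 50 s period,
# sampled every 1 s (same gate time as the counters).
period = 50
signal = [math.sin(2 * math.pi * t / period) for t in range(1000)]
lag = first_peak_lag(autocorrelation(signal, 120))
```

With a gate time of 1 s, the lag of the peak is directly the period of the perturbation in seconds.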
Power spectral densities. Figure 3 shows the power spectral density (PSD) of the phase noise for the two links with the same color code. As expected, we observe that the longest link has a higher amount of noise across the whole spectrum. First, we identified periodic perturbations of Links A and B, highlighted with arrows in figure 3. In total, 2 components are identified for Link A and 6 components for Link B. Then, we fit the rolling median of the PSDs with a window size of 35 data points (in order to remove the contribution of the peaks) and determine the noise coefficients $b_n$ for the two links (see equation (A.3) in appendix A). The obtained coefficients are indicated by dashed lines in figure 3. It is noticeable that for Link A, a non-zero flicker phase component with a power law of $f^{-1}$ at low Fourier frequencies is found, whereas Link B exhibits only white phase noise and white frequency noise components. For Link A, at frequencies higher than a few 10 mHz, the phase PSD is limited by the bandwidth of the detection system. For Link B, by contrast, the white phase noise behavior below 1 Hz is due to the active phase stabilization of the link delay. The white frequency noise at lower Fourier frequencies can arise from various phenomena breaking the coherence, such as the spontaneous emission of the optical amplifiers.
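The role of the rolling median can be sketched as follows. This is an illustrative implementation under stated assumptions: the window is centered and shrinks at the edges of the spectrum, which the text does not specify, and the flat spectrum with one peak is a made-up example.

```python
from statistics import median

def rolling_median(values, window=35):
    """Centered rolling median; the window shrinks at the edges."""
    half = window // 2
    return [median(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]

# Hypothetical flat spectrum with one narrow peak, standing in for a
# periodic perturbation on top of white phase noise.
psd = [1.0] * 200
psd[100] = 50.0
smooth = rolling_median(psd)
```

Because a narrow peak occupies far fewer than half of the 35 points in the window, the median ignores it, and the noise coefficients can then be fitted on the smoothed spectrum.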
Coherence time. The coherence time of the fiber link is defined here as the time at which the PSD of the phase noise changes its slope from white phase noise to white frequency noise. When flicker phase noise is negligible and when $b_{-2}$ is not zero, this is simply given by the relation
$$\tau_{\mathrm{coh}} = \sqrt{b_0 / b_{-2}}, \qquad (2)$$
where $b_0$ and $b_{-2}$ are respectively the white phase noise and white frequency noise coefficients. This ratio should determine the slope change in the overlapping Allan deviation, whose slope is the same for flicker phase noise and white phase noise. When flicker phase noise is not zero, a generalization of equation (2), as well as the relation between these noise coefficients and the two statistical estimators, phase PSD and MDEV, can be derived and is given in appendix A. Three poles can be found as detailed in appendix A and in the following we will consider the coherence time of Link A to be defined by the crossing of the terms $b_0$ and [...]. The coherence time of Link B is 55 s, whereas that of Link A is as long as 214 s, due to its shorter length and thus lower noise. The coherence time of Link A, being longer than that of Link B, is less clear from the plot in figure 3. This is due to the reduced resolution at lower frequencies, which increases the uncertainty of the $b_{-2}$ noise parameter.
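As a numerical illustration, the coherence time can be computed from the two noise coefficients, assuming a phase PSD of the form $S_\varphi(f) = b_0 + b_{-2} f^{-2}$ without flicker phase noise, so that the two terms cross at $f = \sqrt{b_{-2}/b_0}$. The function name and the coefficient values are hypothetical; the values are chosen only so that the result matches the 55 s quoted for Link B.

```python
import math

def coherence_time(b0, b_minus2):
    """Coherence time sqrt(b0 / b_-2) for S_phi(f) = b0 + b_-2 / f**2.

    b0 is the white phase noise coefficient and b_minus2 the white
    frequency noise coefficient; flicker phase noise is neglected.
    """
    if b0 < 0 or b_minus2 <= 0:
        raise ValueError("need b0 >= 0 and b_-2 > 0")
    return math.sqrt(b0 / b_minus2)

# Hypothetical coefficients (arbitrary units) with a ratio of 55**2.
tau_coh = coherence_time(3025.0, 1.0)
```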
Reduction to a few parameters. With the outcome of the analysis of the auto-correlations, phase noise PSDs, and frequency noise distributions, one can draw a complete picture of the physical characteristics of a link, reduced to just a few parameters. We present in table 1 the parameters describing the two links introduced above, enabling a straightforward comparison between them. One observes that the shorter Link A has fewer and smaller periodic perturbations. We believe that it is due to its simpler architecture and shorter length, which expose the fiber to fewer noise sources. Furthermore, this should lead to a lower amplitude of the noise, since the level of fiber noise is typically proportional to the fiber length.

Fiber link simulation
With the link parameters identified in table 1, which are stable over typically months, we can appropriately simulate the behavior of fiber links. We can generate data sets at will, with a similar statistical behavior as that of the link, with no missing data, and thus study the effects of missing data within these data sets. We ran Monte-Carlo simulations, repeating calculations on any number of identically simulated data sets.
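A minimal sketch of such a simulation is given below. It is not the simulation code used in this work: the function name, its arguments and the per-sample noise amplitudes are assumptions, and only three ingredients are modeled (white phase noise, white frequency noise as a random walk of the phase, and sinusoidal perturbations). Flicker phase noise is left out.

```python
import math
import random

def simulate_phase(n, tau_g, sigma_wpn, sigma_wfn, tones, seed=0):
    """Simulated phase samples of a link with no missing data.

    sigma_wpn: standard deviation of the white phase noise per sample.
    sigma_wfn: standard deviation of each random-walk step of the phase,
               which models white frequency noise.
    tones:     list of (amplitude, frequency) pairs for the periodic
               perturbations A * sin(2 * pi * f * t).
    """
    rng = random.Random(seed)
    walk = 0.0
    phase = []
    for k in range(n):
        t = k * tau_g
        walk += rng.gauss(0.0, sigma_wfn)  # integrated white frequency noise
        periodic = sum(a * math.sin(2 * math.pi * f * t) for a, f in tones)
        phase.append(rng.gauss(0.0, sigma_wpn) + walk + periodic)
    return phase

# Hypothetical parameters: half a day at 1 s gate time, one 50 mHz tone.
phase = simulate_phase(43200, 1.0, 1.0, 0.02, [(0.5, 0.05)], seed=1)
```

Changing the seed gives statistically identical data sets, which is what a Monte-Carlo study of missing data requires.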
In figure 4 we show a comparison between a case of continuous experimental data from Link B (blue trace), and a simulation of the link (green trace). Both traces show a data set without any missing data. We show the end-to-end phase variations and phase PSD of Link B and the simulation in figures 4(a) and (b) respectively. We observe a very good correspondence between the two, which shows that the simulated link can be used to study the behavior of the real link. Taking advantage of the flexibility offered by the control of the various parameters, we are able to look for the impact of missing data on the link performance, as shown in the next part.

Links with missing data
Missing data in frequency comparisons have been studied for several applications, mainly using GNSS frequency transfer, for the comparison of microwave fountain clocks [30], or tests of fundamental physics [53]. A common conclusion is that the handling of, and the uncertainty introduced by, the missing data is dependent on the noise model of the data. Here we study missing data in coherent optical fiber links, both in relation to the fundamental behavior of the links themselves, and in their applications to the comparison of optical clocks.
In figure 5 we illustrate an extreme case of missing data. We show the (a) systematic and (b) statistical uncertainty of the same continuous frequency measurement of Link B. Removing every second data point in post-processing introduces an artificial frequency shift to either lower (green) or higher (orange) values, changing the relative frequency offset from $-5(6) \times 10^{-20}$ to $\sim \pm 1(2) \times 10^{-18}$. We expect that it is related to a technical limitation of the dead-time free counters, even with their very good resolution. The decreased stability performance in figure 5(b) is a consequence of the aliasing arising from the Dick effect [54], which is detailed below.
At each run of our simulation of the fiber links, we can likewise generate missing data. With the same tools, and following the same general formalism, we can control the density of missing data and their statistical properties. This is indeed a necessity in the study of external noise sources to the fiber links like missing data, as the noise of the links is not strictly stationary. This is also intended to avoid effects that would arise from a peculiar arrangement of data and missing data, that would not be reproducible. First, we describe here our noise model for the missing data. We then present a study on the effects of missing data on the statistical and systematic evaluation of the fiber link on reproducible data sets, i.e. with a steady and controlled statistical behavior. As for the data represented by equation (1), it can be split into a stochastic component and a systematic component expressed as a sum of pseudo-periodic components, which we study individually.

Noise model of missing data
To represent missing data, we use the annihilation operator $g(t)$. At any point in time $t = t_k$, it represents either valid data ($g(t_k) = 1$) or missing data ($g(t_k) = 0$). The density of missing data is given by
$$h = \frac{\tau_G}{T} \sum_k \left[ 1 - g(t_k) \right],$$
with $T$ being the total duration of the data sample and $\tau_G$ the time gap between consecutive measurements. By this definition the two extreme cases are given by $h = 0$, corresponding to a complete data set without missing data, and $h = 1$, describing the case of an empty data set. This allows us to represent the effective relative frequency of a sample data set as
$$\tilde{y}(t_k) = g(t_k) \cdot y(t_k).$$
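In sampled form, the two definitions above reduce to a few lines. The sketch below is illustrative; the function names are hypothetical and the mask is stored as a list of 0 and 1 values with one entry per gate time.

```python
def missing_density(g):
    """Density h: fraction of samples flagged as missing (g == 0)."""
    return sum(1 - gk for gk in g) / len(g)

def effective_frequency(g, y):
    """Effective relative frequency g(t_k) * y(t_k)."""
    return [gk * yk for gk, yk in zip(g, y)]

# Hypothetical 5-sample record with one missing point.
g = [1, 1, 0, 1, 1]
y = [2e-17, -1e-17, 4e-17, 0.0, 3e-17]
h = missing_density(g)
y_eff = effective_frequency(g, y)
```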

Gaussian distribution of missing data
In this study, we first assume that missing data are incoherent, follow a Gaussian distribution, and exhibit a white frequency noise spectrum. The noise density $h$ is the only parameter and plays a role equivalent to that of $b_{-2}$. This is illustrated first in figure 6(a), where we have calculated the distribution $G(d, h)$ of the mean distance $d$ between missing data for 3 densities of missing data $h$. The mean of the three Gaussian distributions is given by the inverse of the density, and their width is approximately $h^{-3/2}$, as expected (see appendix B). In figure 6(b) we show the Fourier spectrum of the annihilation operator $g(t)$ for the same values of $h$. It shows a white frequency noise behavior, which can be described analytically, and is given in appendix B.

Periodic effects
We will now consider the simple case of periodic sampling, where 1 measurement value is removed every $\tau_G / h$ seconds. This is a well known case, found for instruments with 'dead time', radars and lidars, and particularly well known in metrology for clocks with non-continuous operation. The latter case was introduced by John G Dick in the late 80s to describe the periodic sampling of low-noise oscillators by the atomic response [54]. The increase in frequency noise due to the Dick effect as a function of the density of missing data $h$ can be written, following [55], as [...] with $S_\varphi(f)$ being the nominal phase noise of the signal without missing data. Applying an upper limit of the summation of $n \le \lfloor 1/(2h) \rfloor$, with $\lfloor \cdot \rfloor$ denoting the 'floor' function, and evaluating the function in the case of a nominal signal of pure white phase noise $S_\varphi(f) = b_0$, we find the following expression for the effective phase noise [...]. The second term corresponds to the Dick noise, and is a white frequency noise term due to the aliasing of the original white phase noise. We introduce the notation $D(h)$ to express the sensitivity of the phase noise PSD to the density of missing data.

Sensitivity function approach
The sensitivity approach was developed in the frame of cold atom clocks (atomic fountains) to simplify the modeling of the sensitivity of the phase noise power spectral density of the servo-looped oscillator (the clock output), as a function of the ratio of the interrogation time of the atomic clock transition over the clock cycle time [56]. The sensitivity function formalism is very convenient to describe periodic perturbations, as shown in the previous part. Here, we adapt this formalism to show the convergence between the white frequency noise model and the periodic model for the missing data. As illustrated in figure 7, the missing data for a given density can be distributed with (a) no preferred order, or with a specific order: (b) the missing data are periodically spaced, or (c) all the missing data are at the beginning or the end. For a periodic spacing between the missing data, equation (5) can be applied. We thus simulate missing data with various densities, using a Gaussian distribution or a periodic distribution, and compare the resulting phase noise with expression (6) of the Dick effect.
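The three arrangements of missing data at a fixed density can be sketched as masks of 0 and 1 values. This is an illustration under stated assumptions: the function names are hypothetical, and the unordered case is drawn here by placing the missing points uniformly at random, which is a simple stand-in and not necessarily the generator used for the simulations of this paper.

```python
import random

def random_mask(n, h, seed=0):
    """(a) No preferred order: round(h * n) missing points placed at random."""
    rng = random.Random(seed)
    missing = set(rng.sample(range(n), round(h * n)))
    return [0 if k in missing else 1 for k in range(n)]

def periodic_mask(n, h):
    """(b) Periodic spacing: one missing point every round(1 / h) samples."""
    step = round(1 / h)
    return [0 if k % step == step - 1 else 1 for k in range(n)]

def block_mask(n, h):
    """(c) All missing points grouped at the beginning of the record."""
    m = round(h * n)
    return [0] * m + [1] * (n - m)

# Hypothetical record of 1000 samples with a density h = 0.05.
masks = [random_mask(1000, 0.05), periodic_mask(1000, 0.05), block_mask(1000, 0.05)]
```

All three masks have the same density, so any difference in the resulting phase noise comes from the arrangement of the missing data alone.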

Results
Aliasing effects. We first consider a simulated link signal with a white phase noise component and a strong periodic modulation $\varphi_p(t) = A \sin(2\pi f_m t)$ around 50 mHz. The four different curves shown in figure 8 correspond to the same signal with a varying amount of periodically missing data. The phase noise using equation (6) is shown as a dotted line and predicts the simulated noise very well, except at very high densities of missing data (red curve). In that last case, the assumption of a low density of missing data leading to equation (6) is no longer fulfilled. We can see that the frequency noise is increasing with the amount of missing data, as expected from equation (5). Furthermore, the sampling of the strong carrier signal can be seen, resulting in an additional aliasing effect.
Coherence time. The increase in white frequency noise according to equation (6) directly corresponds to a decrease in coherence time, as described by equation (2). For a signal consisting of white phase noise and white frequency noise, we can find the following expression for the effective coherence time as a function of the density of missing data: [...] with $\tau_{\mathrm{coh}}(0)$ being the nominal coherence time without missing data. Figure 9 illustrates how missing data lead to increased frequency noise, and thereby a decrease in effective coherence time. The blue trace shows the short term phase noise of Link B without any missing data, which serves as a reference. In orange, we have removed 5% ($h = 0.05$) of the data points, using a Gaussian distribution of the missing data (with a white frequency noise spectrum). This leads to an increase in frequency noise, and a decrease in effective coherence time. Repeating the simulation with a varying density of missing data ($h$), the aliasing effects of the missing data are shown in figure 10(a). Here we have calculated the effective coherence time as a function of the density of missing data for five nominal coherence times, ranging from 10 s to 300 s. The dotted line corresponds to equation (7) and perfectly predicts the simulated data, although there is no free parameter for the simulation. This shows that the sensitivity function of equation (5), even though derived for periodically spaced missing data, can be applied for a Gaussian distribution of missing data as well. This is possible for any distribution of missing data as long as the mean of the data is unaltered, which is the most important assumption for applying equation (6) [37,57].

Mitigating the effects of missing data
Up to here, we have discussed the effect of missing data while keeping the phase constant across the missing data, which conserves the length of the data set. However, such an approach biases the short-term noise level. Here, we test a second approach of dealing with the missing data, and simulate how it affects the effective coherence time. This second approach is to replace the missing data with simulated noise (according to equation (A.3)), with the same statistical parameters as the 'original' data. Our method relies on inserting the new data point as a stochastically generated point into the phase. For this part, for simplicity, we consider a link without periodic perturbations to the phase, as described in the derivation of equation (7). The mean of the distribution is determined by the mean of phase data points just before the missing data, where the number of points to be considered depends on the coherence time of the data. An initial comparison between the two approaches, as well as the option of concatenating the effective data set, can be found in [41].
We observe that there is no significant difference between the two approaches for small densities of missing data, $h < \tau_G / \tau_{\mathrm{coh}}(0)$. However, for densities $h > \tau_G / \tau_{\mathrm{coh}}(0)$ we start to see divergence between the two models: when the phase is kept constant, the effective coherence time is limited by equation (7). This limit does not apply when the missing data are replaced with simulated data. In this case, we observe that the link coherence is increasing toward its nominal value without missing data. Thus, by treating missing data and replacing them with simulated data, we can have a better estimation of the link behavior without missing data.
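The fill-in step can be sketched as follows. This is a simplified illustration and not the procedure used for the results of this paper: the replacement points are drawn from a single Gaussian (white phase noise only) instead of the full noise model of equation (A.3), and the function name, the width `sigma` and the number of reference points `n_ref` are assumptions.

```python
import random

def fill_missing_phase(phase, g, sigma, n_ref, seed=0):
    """Replace phase samples flagged as missing (g == 0) by random draws.

    Each draw is Gaussian with standard deviation `sigma`, centered on the
    mean of up to `n_ref` samples just before the missing point. Points
    filled earlier in the same gap are part of that reference window.
    """
    rng = random.Random(seed)
    filled = list(phase)
    for k, gk in enumerate(g):
        if gk == 0:
            ref = filled[max(0, k - n_ref):k]
            center = sum(ref) / len(ref) if ref else 0.0
            filled[k] = rng.gauss(center, sigma)
    return filled

# Hypothetical 4-sample phase record; the third sample is missing and its
# stored value (0.0) is only a placeholder.
filled = fill_missing_phase([1.0, 3.0, 0.0, 5.0], [1, 1, 0, 1], sigma=0.0, n_ref=2)
```

In practice `n_ref` would be chosen from the coherence time of the link, as described above, and `sigma` from the measured white phase noise level.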

Long term behavior of fiber links
In section 3.1 we analyzed the behavior of the short-term noise of the links, with up to half a day of data. At these timescales it is possible to acquire continuous data, and the assessment of the links' statistical behavior becomes straightforward. Assessing the long term behavior, on the scale of years, of a fiber network presents many challenges, however, from both an operational and a computational point of view.
Here we present the analysis of the long term behavior of Link B, starting in April 2015, and ending in June 2020. The total period lasts 5 years and 64 days, with a total uptime of 41.6%. As the frequency data is recorded with a gate time of 1 s, 1890 days corresponds to more than 163 million data points. This is more than twice the duration in [14].
First, we processed data over subsets of 3 months using similar filtering procedures as in [58]. For data sets longer than 3 months, our data filtering procedures start to become critically slow, so that we have to reduce the number of data points. We used an additional filtering stage to check that the mean fractional frequency over subsets of 1000 points is below $9 \times 10^{-18}$, so that each reduced data point does not critically depend on the uptime of the subset. Then, we reduced the data to 40 data points per day, i.e. averaging over 2160 s. For our statistical analysis, we remove all the subsets with an uptime less than 100%, resulting in an uptime of 17% for further analysis. Then, missing data are replaced with data containing the statistical properties as analyzed in section 3, in a similar way as illustrated in figure 4.
Phase coherence plays no role here, because the data are averaged over 2160 s intervals, which is much longer than the coherence time of the link. In figure 11 we plot (a) the frequency noise PSD, given by $S_{\delta\nu}(f) = f^2 S_\varphi(f)$, and (b) the auto-correlation of the rolling mean of the frequency fluctuations, for the reduced data. We observe a white frequency noise behavior, consistent with the white frequency noise behavior observed with the data of shorter duration (see table 1), which is highlighted with the dashed green line. Furthermore, we observe a peak with a period of one day, highlighted by the orange shaded area. This one-day period is confirmed by the auto-correlation plot in the region of a few days. This peak is not represented by the simulated data computed using the noise model, and reveals information contained solely in the experimental data. This result is revealed when repeatedly replacing the missing data, and averaging the resulting phase PSD. This approach averages out all sporadic peaks that appear by chance in single iterations, and leaves solely the peaks hidden in the data. We believe that the one-day periodic perturbation is due to daily temperature variations, which affect the reference arms of the interferometers set in the RLSs at data centers in Paris and in Strasbourg.
The analysis of the pseudo-periodic perturbations with periods longer than one day is beyond the scope of this article. Indeed, such an analysis would rely on a good knowledge of the long term behavior without missing data, or a more elaborate statistical analysis.
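The reduction to 2160 s averages with a 100% uptime requirement can be sketched as below. The function name and the use of `None` to flag a rejected subset are assumptions, and the additional filter on the mean of 1000-point subsets is not included.

```python
def reduce_full_bins(y, g, bin_size=2160):
    """Average y over consecutive bins, keeping only bins with 100% uptime.

    Returns one entry per complete bin: the bin mean, or None when at
    least one sample of the bin is missing (g == 0).
    """
    out = []
    for start in range(0, len(y) - bin_size + 1, bin_size):
        if all(g[start:start + bin_size]):
            out.append(sum(y[start:start + bin_size]) / bin_size)
        else:
            out.append(None)
    return out

# Hypothetical 6-sample record reduced with bins of 2 samples.
reduced = reduce_full_bins([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [1, 1, 1, 1, 0, 1], bin_size=2)
```

With 1 s data and the default `bin_size`, one day yields 40 reduced points; the `None` entries are the ones later replaced by simulated data.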
We show the resulting MDEV and associated means and uncertainty in figure 12. The stability plot spans from 1 s integration time to $4 \times 10^7$ s. The short/medium term stability of Link B is shown in light blue, for the same period without any missing data as analyzed in section 3.1. The stability of the 5+ years of data, where the missing data have been replaced with simulated noise, is shown in dark blue, and the stability of the data set concatenated over the missing data is shown in orange.
We see an excellent agreement between the three stabilities in the region $\tau = 2000$ s to $\tau = 10\,000$ s. The dashed lines show the noise terms of the link: $0.038\, b_0 / (\nu_0^2 \tau^3)$ corresponding to white phase noise, and $b_{-2} / (4 \nu_0^2 \tau)$ corresponding to white frequency noise. They cross at the expected value of $0.39\, \tau_{\mathrm{coh}}$, indicated by the vertical line on the plot [59]. These noise terms and the coherence time translated to integration time are discussed in appendix A.
The coherence time as revealed by the spectrum of Link B (in figure 3) is shorter than what could be deduced from the inflection point of the MDEV. This is due to the two periodic perturbations $f_1$ and $f_2$, which have periods shorter than the coherence time of the link, and induce a small bump in the stability around 10 s.
Finally, the accuracy of the frequency transfer was evaluated by calculating the mean value of the reduced end-to-end beat-note frequency offset. Although the data were acquired with a dead-time free counter in Λ-mode, one expects an excellent convergence at long term with an evaluation for data acquired in Π-mode [60]. Following [61,62], we estimate its statistical fractional uncertainty as the long-term overlapping Allan deviation at 864 000 s or 4 320 000 s of the data set without and with a treatment of the missing data respectively. The results are plotted in figure 12(a). The mean is $-1(2) \times 10^{-21}$ when the missing data were treated using the noise model, and $-4(5) \times 10^{-21}$ when the missing data are not treated. For both cases we obtain a mean which is consistent with zero within the statistical 1σ uncertainty. However, the uncertainty obtained with the missing data treatment is reduced by a factor of 2 compared to the case of concatenated data, highlighting the role of data treatment.
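The factor 0.39 can be checked by equating the two noise terms. The short derivation below assumes that both terms are contributions to the modified Allan variance and that the coherence time is $\tau_{\mathrm{coh}} = \sqrt{b_0 / b_{-2}}$, i.e. a link without flicker phase noise.

```latex
\begin{aligned}
\frac{0.038\, b_0}{\nu_0^2\, \tau^3} &= \frac{b_{-2}}{4\, \nu_0^2\, \tau}
\;\Longrightarrow\;
\tau^2 = 4 \times 0.038\, \frac{b_0}{b_{-2}} \approx 0.152\, \tau_{\mathrm{coh}}^2 , \\
\tau &\approx \sqrt{0.152}\; \tau_{\mathrm{coh}} \approx 0.39\, \tau_{\mathrm{coh}} .
\end{aligned}
```

For the 55 s coherence time of Link B, this places the crossing of the two dashed lines at about 21 s.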

Uncertainty contribution to clock comparisons
When comparing distant optical clocks, both the clocks and the links between them must be running at the same time. If we consider the simple case of the comparison of two clocks, called C 1 and C 2 , they will at any given time either be considered to be valid or not (up or down), indicated by their respective annihilation operators g C 1 (t) and g C 2 (t). We denote the annihilation operator of the comparison chain between them g ρ (t), which consists of optical frequency combs, ultra stable cavities, as well as the optical fiber link connecting them. The total uptime of the comparison can be written as the product g tot (t) = g C 1 (t)g ρ (t)g C 2 (t).
In the following we will evaluate the links' contribution to the uncertainty of the comparison in the case of missing data, evaluating the effective relative frequency fluctuations of the link $y_{\mathrm{eff}}(t) = g_{\mathrm{tot}}(t)\, y(t)$, where $y(t)$ represents the link without any missing data (see footnote 5).
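As an illustration of the product of uptime indicators and of the effective link signal defined above, consider the following minimal sketch. The helper names and the choice of marking down samples as NaN (rather than multiplying by zero, which would bias later averages) are assumptions made for the example, not a description of the actual processing chain.

```python
import numpy as np

def total_uptime(*masks):
    """Element-wise product of 0/1 uptime indicators:
    the result is 1 only where every subsystem is up."""
    g_tot = np.ones_like(np.asarray(masks[0]), dtype=int)
    for g in masks:
        g_tot = g_tot * np.asarray(g, dtype=int)
    return g_tot

def effective_signal(y, g_tot):
    """Link signal sampled by the total uptime.
    Samples where the comparison is down are marked as NaN."""
    y = np.asarray(y, dtype=float)
    return np.where(g_tot == 1, y, np.nan)

# Hypothetical 10-sample example with two clocks and one link.
g_c1  = [1, 1, 1, 0, 0, 1, 1, 1, 1, 1]
g_rho = [1, 0, 1, 1, 1, 1, 1, 1, 1, 1]
g_c2  = [1, 1, 1, 1, 1, 1, 1, 0, 0, 1]
g_tot = total_uptime(g_c1, g_rho, g_c2)
uptime_fraction = g_tot.mean()
```

The uptime fraction of the comparison can never exceed that of its least available subsystem, which is why the clocks' downtime can dominate the sampling of the link signal.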
In figure 13 we show the uptime of the comparison of two optical clocks during a comparison campaign in December 2018. The two clocks are located at SYRTE in Paris [63] and at PTB in Braunschweig [64]. Each laboratory compares its clock to an ultra-stable cavity, from which an ultra-stable frequency is disseminated to the city of Strasbourg, conveniently located at the midpoint between the two laboratories at the French-German border [10]. There, a local two-way comparison of the disseminated cavity signals is performed, which yields an indirect comparison of the optical clocks.
During the 9 days of comparison, we observed several periods of simultaneous operation of the two clocks. This is shown in figure 13, where we show the individual uptime of each of the two clocks, the uptime of the link from Paris to Strasbourg, as well as the total uptime of all three systems $g_{\mathrm{tot}}(t)$. Here we took neither the link between Strasbourg and PTB nor the combs into account. We have chosen to break up the periods of simultaneous operation into 6 subsets of varying duration and uptime, which we have analyzed individually.
For each subset of data, we have calculated the systematic and statistical uncertainty of the effective link signal $y_{\mathrm{eff}}(t)$ contributing to the comparison, and compared it to the direct link signal $\tilde{y}(t) = g_\rho(t)\, y(t)$. As seen in figure 13, the main contribution to the downtime of $y_{\mathrm{eff}}(t)$ comes from the two atomic clocks.
For each of the 6 subsets, calculating both $y_{\mathrm{eff}}(t)$ and $\tilde{y}(t)$, we have replaced the missing data due to the clocks by simulated noise. We have shuffled the distribution of missing data in each subset, and repeated this simulation 1000 times, keeping the total uptime of the subset the same. We have thus found a mean frequency shift, with a mean error, which would be expected from each period. The error of the mean offset is calculated from the long-term overlapping Allan deviation.
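The shuffle-and-fill procedure described above can be sketched as a small Monte Carlo loop. Several simplifications are assumed here and are not taken from this paper: the missing data are shuffled by permuting fixed-length blocks of the uptime mask, the simulated noise is zero-mean white Gaussian noise scaled to the valid data (a stand-in for the noise model of the link), and the error is taken as the scatter of the mean over the trials rather than from the overlapping Allan deviation. All function names and the `block` parameter are hypothetical.

```python
import numpy as np

def shuffle_mask(mask, block, rng):
    """Permute fixed-length blocks of a 0/1 mask.
    The number of ones, and hence the total uptime, is unchanged."""
    mask = np.asarray(mask)
    n_blocks = mask.size // block
    head = mask[: n_blocks * block].reshape(n_blocks, block)
    tail = mask[n_blocks * block:]
    # Generator.permutation shuffles the rows (blocks) of a 2D array.
    return np.concatenate((rng.permutation(head).ravel(), tail))

def fill_gaps(y, mask, rng):
    """Replace samples where mask == 0 by zero-mean white Gaussian noise
    with the standard deviation of the valid samples (assumed noise model)."""
    y = np.asarray(y, dtype=float)
    mask = np.asarray(mask)
    filled = y.copy()
    n_gap = int((mask == 0).sum())
    filled[mask == 0] = rng.normal(0.0, y[mask == 1].std(), n_gap)
    return filled

def monte_carlo_offset(y, mask, block=100, trials=1000, seed=0):
    """Mean offset and its scatter over many shuffled gap distributions."""
    rng = np.random.default_rng(seed)
    means = np.empty(trials)
    for k in range(trials):
        shuffled = shuffle_mask(mask, block, rng)
        means[k] = fill_gaps(y, shuffled, rng).mean()
    return float(means.mean()), float(means.std(ddof=1))
```

Permuting whole blocks rather than individual samples keeps gaps of a finite duration, which is closer to real clock downtimes; the appropriate block length depends on the data and is left as a free parameter in this sketch.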
In the two top plots in figure 14 we show the mean relative frequency offset of the effective link signal, when sampled by the uptime of the clocks, in two different cases: (a) where the effective data have been concatenated, and (b) where the missing data have been replaced by simulated noise. A weighted average of all 6 subsets yields a total systematic offset of $1.2(3.5) \times 10^{-19}$ for the concatenated data set, and $2.4(9.0) \times 10^{-20}$ for the data set in which the missing data have been replaced by simulated noise. This method ensures reproducible results, and it implies that the performance of the fiber link does not depend on the uptime of the other subsystems in the comparison. These results are compatible with the scientific goal set by the BIPM of comparing remote optical clocks at least to the order of a few $10^{-18}$ [1,2], as a part of the roadmap for redefining the SI second in terms of one or more optical transitions [1,2,65] in the coming years.
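The combination of the per-subset offsets into a single value can be illustrated with an inverse-variance weighted mean. The weighting scheme is an assumption made for this sketch, since the text does not state which weights were used, and the function name is hypothetical.

```python
import numpy as np

def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its 1-sigma uncertainty.
    values: per-subset mean offsets; sigmas: their 1-sigma uncertainties."""
    values = np.asarray(values, dtype=float)
    weights = 1.0 / np.asarray(sigmas, dtype=float) ** 2
    mean = np.sum(weights * values) / np.sum(weights)
    sigma = 1.0 / np.sqrt(np.sum(weights))
    return float(mean), float(sigma)
```

With this choice, subsets with a small uncertainty dominate the average, and the combined uncertainty is never larger than that of the best single subset.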

Conclusion
In this paper, we have discussed the noise behavior of coherent optical fiber links. Using a study of two French links of very different lengths, we have shown how the links can be described with just a few parameters, making the intercomparison of the links more straightforward. We have used this knowledge to develop a simulation of one of the links, enabling us to generate data with realistic simulated noise and oscillatory contributions on-demand. This method has been used to study the effects that missing data have on the statistical evaluation of the links, in particular the decrease in the coherence time of the links. We have shown that this decrease in coherence time can be reduced by replacing the missing data with simulated noise.
We have then investigated two different applications where missing data are unavoidable. First, we have studied the long-term noise behavior of a fiber link, analyzing more than 5 years of data. Using the method of replacing the missing data with simulated noise, we have presented the resulting MDEV, frequency noise spectrum and auto-correlation, where we see periodic perturbations with a period of 1 day. We furthermore see a good agreement between the short-term and long-term evaluation of the links, further enhancing our trust in the results.
Lastly, we have investigated the uncertainty contributions of the links when using them to connect and compare two optical atomic clocks. By replacing missing link data with simulated noise, we have evaluated the uncertainty contribution of the links to be $2(9) \times 10^{-20}$, which is compatible with the goal set by the BIPM of comparing optical atomic clocks to the level of at least $10^{-18}$. Here the profiles have been assumed to be centered around zero, which will be the case for an actively compensated fiber link.