A summary of the Planck constant determinations using the NRC Kibble balance

We present a summary of the Planck constant determinations using the NRC watt balance, now referred to as the NRC Kibble balance. The summary includes a reanalysis of the four determinations performed in late 2013, as well as three new determinations performed in 2016. We also present a number of improvements and modifications to the experiment resulting in lower noise and an improved uncertainty analysis. As well, we present a systematic error that had been previously unrecognized and we have quantified its correction. The seven determinations, using three different nominal masses and two different materials, are reanalysed in a manner consistent with that used by the CODATA Task Group on Fundamental Constants (TGFC) and includes a comprehensive assessment of correlations. The result is a Planck constant of 6.626 070 133(60)  ×10−34 Js and an inferred value of the Avogadro constant of 6.022 140 772(55)  ×1023 mol−1. These fractional uncertainties of less than 10−8 are the smallest published to date.


Introduction
It is expected that the world's measurement system, the SI, will soon be revised with seven defining constants at its foundation [1]. In preparation for this event the CODATA Task Group on Fundamental Constants has requested that laboratories performing high accuracy determinations of these constants, in particular the Planck, Avogadro and Boltzmann constants, submit their latest results by July 1, 2017 [2,3]. These and previous results will be used to establish the fixed values for the constants to be ultimately recommended to International Committee on Weights and Measures (CIPM) and then used in the revised SI [4].
In 2014 we published four watt balance results [5,6], each associated with a different test mass, which represented the most precise Planck constant determination at that time. These were the result of a few years of effort resolving major issues of the previous NPL watt balance including mass exchange errors [7], coil suspension and alignment [8]. Since that time the balance has been disassembled, primarily to modify the tare mass lift and to make modifications to the interferometer. The resulting improvements have been used in three new determinations: the first performed in February 2016 with a silicon 500 g mass, the second in November 2016 with a goldplated copper (AuCu) mass of 1 kg and the third in December 2016 with a AuCu 500 g mass.
The original four determinations were reanalysed including new corrections, and complete re-evaluations of their uncertainty budgets were made using the results of new auxiliary tests. The three new determinations were similarly analysed. There are significant correlations between these seven determinations and they must be taken into account to not underestimate the overall uncertainty. In general this is a rather difficult task for the CODATA TGFC to perform because the critical information is usually not available or not presented in a consistent manner. In fact for determinations made within the same laboratory the authors themselves are the source of such information and it is optimal for them to make this assessment in a transparent fashion.
The use of the term 'Kibble balance' has been endorsed by the CCU/CIPM [9] in recognition of the contributions of the late Bryan Kibble to metrology and in particular for conceiving the concepts of the watt balance. The NRC watt balance project especially recognizes this contribution because we have benefited from the design and construction of both Bryan Kibble and Ian Robinson. The Kibble or watt balance has been well described before [5,10] and the following is only a superficial outline.
The NRC watt balance consists of a classic equal arm beam balance with a test mass, coil and radial magnetic field on one side and a counter balancing tare mass on the other side; see figure 1. With the tare and test masses raised and zero current through the coil the beam is approximately balanced. An interferometer measures the position and velocity of the coil.
In the weighing phase the interferometer controls a feedback current which passes through the coil and generates a force to balance the beam. The general force equation is where F is the force, m is the test mass, g is the acceleration due to gravity, BL is the integrated cross product of the magnetic flux density and the coil length and I is the coil current.
In the moving phase both the tare mass and test mass are raised and the balanced beam is moved by a voice coil on the tare side of the balance. The interferometer controls a feedback current which now passes through the voice coil and produces a constant coil velocity. The resulting coil voltage is described by: (2) where V is the coil voltage and v is the coil velocity.
Assuming that BL is the same in the weighing and moving phases then In 2012 the NRC watt balance mass exchange errors [7] were detailed. The largest of these errors was caused by a tilting of the balance platform during the mass exchange and it was eliminated by moving the mass lift off the platform and onto the base. The tare lift was also mounted on the balance platform and it also caused a platform tilt but since the tare mass remains on during the entire weighing phase it did not introduce an error that is synchronous with the main mass status. However, lowering the tare mass does cause a small displacement and tilting of the coil, with respect to the coil's position in the moving phase, and this had been assessed as a 3 × 10 −9 type B relative uncertainty. The balance itself was disassembled so that the tare mass lift could be moved off the balance platform and onto the base. It was hoped that this modification would completely eliminate the coil displacement but a small residual coil shift remains, probably due to loading distortions of the flexure and knife edge pivots.

Interferometer noise reduction
Before 2014 the measurements of BL in the moving phase showed a typical relative scatter of about 60 × 10 −9 , a level of noise which is significantly higher than that in the weighing phase and also higher than can be expected from the instrumentation. Fourier analysis of the separate voltage and velocity signals, as well as their ratio, provided clear evidence that the excess noise originated from sharp resonances in the velocity measurement occurring below 200 Hz. The resonant frequencies were relatively stable and their amplitudes drifted slowly over days but usually increased. Many possible causes such as electrical interference, coil vibration modes, acoustically coupled vibration, laser output characteristics (spectral purity, amplitude, polarization and pointing stability) were considered and tested but failed to indicate the cause. Eventually the problem was traced to the beam splitter of the Michaelson interferometer which was mounted in such a way that the beam splitter was distorted and strained.
A mild distortion of the beam splitter caused the interference pattern to vary over a fraction of a fringe. Remounting the beam splitter with minimal distortion improved the uniformity of the interference pattern and increased the optical contrast but also caused the sharp resonances to disappear.
We believe that the resonances were caused by strain induced vibrational modes of the beam splitter in vacuum. The reduction in noise of the moving phase is substantial, almost a factor of ten as seen in figure 2. Measurements over the last two years confirm that the improvement is permanent.

Local field compensation
With this reduced noise in the moving phase we were now able to clearly observe and correct for the local environmental magnetic flux density. The voltage induced across the coil responds to the total flux density in the gap which includes the large, slowly varying flux produced by the magnet, as well as a contribution from the much smaller local flux which can vary quickly and randomly on the time scale of these measurements. The local flux density is measured independently with a conventional magnetometer. The strength of the coupling into the gap was determined by fitting the data during a large amplitude variation of the local field. A local field to gap field sensitivity of 1.2 T/T has been measured. Removing the local flux contribution from the moving data improves the fitting of the slowly varying portion of BL. The weighing data then utilizes this low frequency value of BL combined with the local flux during the weighing phase to calculate an equivalent mass. Reductions in the uncertainty of the measured mass of up to 40% have been observed when the local field varies considerably.

Magnetization
Whenever current passes through the coil, magnetization of the material closest to the coil, usually part of the yoke, is inevitable. This process has been considered before [5,10,11] and is commonly modelled as a current dependence of the flux density in the gap, where B0 is the flux density in the gap with zero current in the coil and α and β are the fractional coefficients related to the first and second order current dependence.
In the weighing phase two states are involved, both with the tare mass lowered. The first state has the mass off the pan and  an associated coil current I off . The second state has the mass on the pan and an associated current, I on ≈ −I off . Using the Kibble equation it can be shown that the error due to the first order magnetization term is given by α(I on + I off ). Previously, we have determined the first order fractional dependence, α = 0.56 × 10 −6 mA −1 . This is done by weighing with various beam or tare mass imbalances. The mean current of the Mass_On and Mass_Off states is typically kept very small with respect to the nominal current and the fractional error is ~1 × 10 −8 before correction.
Determination of the second order term, β, has been more difficult. To date we have not managed to get consistent determinations of β within their estimated uncertainties using asymmetric weighing. However, an upper bound of the second order coefficient is set by the differences in Planck constant determination values using different nominal masses (i.e. 1 kg and 0.5 kg). This estimate was previously used as an uncertainty component of the second order effect [5] and has been reconsidered with new data that shows (h 1 kg − h 0.5 kg )/ h 1 kg = −0.01 × 10 −9 . This would suggest that the upper limit of this error is <0.013 × 10 −9 for the 1 kg results. We remain uneasy about this estimate. In the end we have accepted an upper bound estimate of the β uncertainty based on the uncertainty of the difference of the 1 kg and 500 g results which is 3.94 × 10 −9 . This suggests | β | < 0.0205 × 10 −9 mA −2 and corresponds to 5.3 × 10 −9 , 1.3 × 10 −9 and 0.3 × 10 −9 uncertainty components for the1 kg, 500 g and 250 g results.
It is important to note that for this system, current induced magnetization is predominately a linear process because the ratio of the effects of the linear to second order magnetization is about 0.001 .
Another effect, magnetic hysteresis, is well known to exist but quite complicated to model, especially in this situation and at these levels [10][11][12][13]. To understand the nature of this error it is necessary to consider the coil current states of our watt balance experiment and some generic properties of the hysteretic processes. Hysteresis describes any process which 'lags behind' and is typically identified as a parameter controlled system which starts from an initial state, changes as the driving parameter changes but does not return to the initial state even though the driving parameter has returned to its initial value. Let us consider an extremely simplistic hysteresis model in which the magnetic state, B i , is a function of the coil current and the difference between the previous state and present state, In our watt balance experiment the coil current states can be thought of as a number of zero-current states, representing the moving phase of the experiment, followed by successive I minus and I plus coil current states representing the weighing phase, see figure 3. If we arbitrarily make γ = 0.3 then the corresponding hysteresis effect can be observed as shown in figure 3.
In figure 3 the dashed red line indicates the coil current, through the moving phase with I = 0 and through the weighing phase with the alternating I minus and I plus states. The state index is above, with state 1 as the first weighing state. The solid blue line is the magnetic state, equal to B0 through the moving phase and switching by several parts per million in the weighing phase due to the αI term. Although not shown in figure 3, at the end of the weighing phase and before the next moving phase, a demagnetizing process is performed. In this process the coil current oscillates in a decaying pattern by pushing against a mass held by the mass lift. This process effectively demagnetizes the effects caused by the weighing process and returns the magnet to the B0 state.
Note that the magnetic state 1 is smaller than magnetic states 3, 5 and 7. In this simplistic model, magnetic states 2, 4, 6, 8 are all equal. Secondly, note that the mean of states 7 and 8 equals B0. It turns out that these are general properties of all hysteretic processes driven by a linear function and for our watt balance they can be restated as: • The weighing steady state is symmetric with respect to the moving steady state. • The initial weighing states are the most affected by hysteresis but the effect decays with successive weighing states.
With this simplistic model only the first weighing state displays any hysteresis effect and it causes a positive mass error because mass is related to I plus − I minus in this substitution measurement. Other models incorporating multiple states, time dependence etc tend to spread the effect to subsequent states. The initially positive mass error remains common.
At NRC to account for drift in the balance of the beam we analyse the weighing results with a linear regression, fitting not only to the Mass_On and Mass_Off states but also to a first and second order time dependence, Removing the first and second order time dependence from data which exhibits hysteresis complicates the interpretation of the hysteresis effect but it does not affect the interpretation of the steady state solution.
Fortunately, a simple but computationally intensive technique can reveal the hysteresis effect with good precision and without additional measurements. The watt balance data is reanalysed, successively removing the initial weighing state data until only three weighing states remain, sufficient to establish the beam drift components. This is done for the several hundred weighing sequences of each Planck determination. The result is a histogram of mass changes caused by the removal of different numbers of initial weighing states, see figure 4. The noise or scatter of this process is very small because of the large size of the data.
Despite our very non-specific model the plots confirm several aspects.
• The initial mass deviation is always positive.
• The mass deviation decays within a few weighing states, and then becomes stable and repeatable.
• The noise is small; the scatter of the steady state relative mass deviations is ~0.8 × 10 −9 .
The few negative points in figure 4 are believed to be caused by the influence of the first and second order time dependence of the beam drift.
The effect of hysteresis on each Planck constant determination can be assessed as the difference of the analysis with no weighing states removed and the analysis of the steady state solution, see table 1. This is very consistent, (3.8 ± 0.8) × 10 −9 , when expressed as a fractional part of the measured mass and thus is dependent on mass and not mass squared. However, we think that the hysteresis effect should also be dependent on the coil position within the magnetic gap and this can vary with the realignment of each Planck determination. For this reason each Planck determination was corrected by its own hysteresis evaluation.
It must be noted that hysteresis does not have to be magnetic in origin. Similar effects can occur with mechanical hysteresis of the knife edges and flexures. Fortunately, the very general hysteresis principles are still applicable and the steady state solution remains valid.
The power heating of the resistor was considered during the process of removing and reanalyzing weighing data. The drift of the resistance due to self-heating has been separately evaluated by calibrating the resistor directly against the quantum Hall standard at the highest currents used, those associated with the 1 kg test mass. These indicate a correction of about 1 × 10 −9 versus the simple average resistance of the calibration.

Synchronization
The moving phase determines BL by measuring the voltage to velocity ratio. Optimum signal to noise can be achieved through three techniques • Having the voltage and velocity as constant as possible • Reducing the high frequency content of both signals as much as possible • Simultaneous data acquisition of the voltage and velocity signals We have measured the effect of desynchronization by making moving phase measurements and alternating the amount of desynchronization by synchronizing the beginning of the voltage and velocity data acquisition but intentionally retarding the completion of the voltage acquisition. The successive differences of BL are not sensitive to the drifting value of BL and the increasing uncertainty of BL is used as an indicator of the effect of desynchronization. With this technique, our typical 10 µs of synchronization error in a 0.4 s sample causes 0.6 × 10 −9 of fitting uncertainty in the determination of BL. This sets an upper limit of the desynchronization error in our measurements.

Velocity dependence
All of our seven Planck constant determinations have been performed with a moving phase velocity of 2 mm s −1 . It is important to establish if the BL determinations are independent of velocity. This independence of the moving phase has been re-evaluated at 2 mm s −1 , 1 mm s −1 and 0.5 mm s −1 . A weighted fit of the relative differences of BL versus velocity has been made which when extrapolated to 0 mm s −1 gives a correction of (1.0 ± 2.6) × 10 −9 for the 2 mm s −1 data. Figure 5. The re-pooled data, 24 h averages, of each of these three determinations is shown. The uncertainties are the type A uncertainties of the re-pooled data sets. Vertical axis is plotted as (h/h 90 − 1) × 10 9 which is mass independent and more clearly shows the deviations on a common scale. The orange data point is the dataset mean. The horizontal scales for the three plots are in days but are not identically spaced. The red square data points are made with 1000 g AuCu, the blue circle data points with 500 g Si, the red triangle data points with 500 AuCu and the small blue circle data point with 250 g. The uncertainty bars represent the combination of re-pooled standard deviation of the mean, root-sum-squared with the determination's uncertainty budget but with no account for correlations. The single gold point represents the inverse covariance weighted mean and its uncertainty bars do account for correlations.

Miscellaneous
The intensities of the optical signals of the homodyne interferometer have been balanced so that the interference amplitude has improved long term stability. The shielding of the interferometer electronics was improved to further reduce interference from other high frequency signals. A small vacuum leak which became evident in the middle of the 2013 campaign has been found and repaired.

The 2014 data set re-analysis
The four 2014 data sets were completely reanalysed. The 35 × 10 −9 correction for the 2014 mass traceability to the IPK due to a revision of the BIPM mass scale was applied with a revised uncertainty [6] and the previously described hysteresis corrections and their uncertainties were applied. A very minor correction for pressure was applied to determinations #3 and #4 to correct for the recalibration of our vacuum gauge.
A data set for each Planck value determination consists of 112 to 190 alternating weighing and moving sequences each taking approximately 45 min. The fitting of neighbouring data from sets of three moving sequences establishes the BL values used to analyse the data of each weighing sequence. Thus a different Planck value is obtained for each weighing sequence every 90 min. For the 2014 data sets, as well as the 2016 data sets (see table 2), the mean is taken as the simple mean of all the points in the data set (112-190 points). The data was then re-pooled into 24 h averages using a consistent method which  was used for the analyses of all of the determinations. The standard deviation of the mean of the re-pooled data set was then used as the Type A uncertainty of the mean value. This had the effect of reducing the degrees of freedom by a factor of about 11 (see table 3). The uncertainties were reviewed and the covariances with other determinations were established.

New Planck constant determinations
In February 2016 we measured a 500 g silicon mass and this was the first determination to benefit from the interferometer noise reduction and local magnetic field compensation. The type A noise and stability are noticeably improved. Because of the noise improvements, the moving and weighing sets are made approximately equal in time, each about 45 min and a mass result is obtained every 90 min. Again the mean is taken as the simple mean of all of the data while its Type A uncertainty is taken as the standard deviation of the mean of the re-pooled 24 h data set (see figure 5). This data set has independent resistance, laser and gravity determinations and has the least correlation with the other Planck determinations.
In November 2016 we measured a 1 kg AuCu mass and in December we measured a 500 g AuCu mass. Again the Type A noise and stability are improved. These data sets appear to be slightly degraded by vibration due to nearby construction. These two data sets have independent resistance calibrations but share, laser frequency and gravity determinations. Table 2 shows the seven Planck constant determinations and their simple uncertainties comprised of the standard deviation of the mean of their 24 h averaged data combined with the 57 other components of each uncertainty budget. The data spans a period over three years and the relative standard deviation of the seven results is only 8.1 × 10 −9 . However there are correlations between the seven determinations and a proper covariant weighted mean must be calculated to not overestimate the degrees of freedom and not underestimate the total combined uncertainty.

Covariant analysis
At the heart of any such analysis are the uncertainty budgets of the seven determinations. Appendix 1 shows the uncertainty budget of the 6th determination (1 kg AuCu, November 2016) and its correlations with all of the determinations. The derivation of the individual uncertainty components is usually self-evident from the description but the reader is directed to earlier reports [5,7,8,14] which have described them in much more detail.
Consistent with the general analysis procedures of the CODATA TGFC we have calculated an inverse variance weighted mean and account for correlations by constructing a full covariance matrix. With seven determinations and 58 uncertainty components for each uncertainty budget, the total number of individual covariances become unmanageably large (7 * 58 * 7 * 58 = 164 836). However, the common structure of the uncertainty budgets greatly reduces this number because we can assume that covariances between different types of uncertainty components are zero. Thus only covariances between different determinations, but of the same type of uncertainty component, need be considered. Because covariances are additive we can add the 58, 7 × 7 covariances associated with each uncertainty component to assemble the total covariance matrix. See appendix and table 4.
The inverse covariance weighted mean is h/h 90 = 1.000 000 1930 and its uncertainty is 0.000 000 0091, see figure 6. The chi-squared is 4.4 and the Birge ratio is 0.85 . Table 2 shows that the total uncertainties of the seven determinations are very similar despite the improvements in the Type A component of determinations 5, 6 and 7. In part this is a consequence of the uncertainty floors of the different categories of uncertainty, see table 5. The following is a brief summary of each category and possible improvements.

Discussion of the results
The mass uncertainty category has improved since 2014 and that data has been re-evaluated with better knowledge of the vacuum comparator and artefacts, and improved analysis. The mass uncertainty is now often dominated by the closure of the vacuum mass determinations before and after being measured in the watt balance. We assume the value of the watt balance reference mass at the time of the h determinations is the average of the opening and closure calibrations with an assigned uncertainty that is half of the observed mass change. We note that the closure measurements always show a mass gain suggesting the reference artefact is absorbing mass during measurement in the watt balance. Some experiments have been performed to determine the mechanism of absorption, in particular studies on gold plated copper artefacts show the mass uptake correlates strongly with the number of airvacuum cycles in the watt balance rather than with time under vacuum. This type of mass uptake due to cycling has been observed elsewhere [15]. The time independence of the mass uptake is further supported by the lack of any drift observed in the watt balance weighings over the course of the full series that could explain the closure difference. These differences have been up to 16 × 10 −9 and would be resolvable by the watt balance. We presently hypothesise that the mass gain occurs along a typical self-limiting absorption trend during pump down, with absorption sites liberated by the desorption of water subsequently occupied by hydrocarbon from the residual gas background of the watt balance. For water mass change due to vacuum exposure begins to plateau after 1-2 d [16]. This is within the stabilization time of the watt balance upon pump down and before weighings usually commence. The mass uptake due to cycling in the watt balance is not typically observed during similar cycling experiments in the NRC vacuum balance. We attribute this to a difference in the residual gas composition of the vacuum balance   compared to the watt balance. We expect the watt balance to have a significantly larger hydrocarbon background composition due to previous contamination with pump oil. However as we cannot conclusively eliminate the possibility that the mass gain occurs in full or in part during venting of the watt balance to air we therefore continue to treat the uncertainty as half the closure difference. Experiments are planned to determine the timing of the absorption and this should allow us to reduce the closure uncertainties significantly. The gravity uncertainty category is dominated by the horizontal and vertical gravity transfer corrections from the absolute gravity measurements to the test mass position. These measurements can be repeated but it will require complete disassembly of the watt balance including removal of the magnet. We do not anticipate starting such a disassembly until after redefinition of the SI.
Alignment uncertainties in practice are limited by the operator's patience and to a lesser degree the stability of the adjustments. Remote adjustment of the magnet/coil tilt and Abbe adjustment are being considered.
Resistance uncertainties are affected by the resistor stability over a couple of weeks. The uncertainty can be decreased by improving the resistors or by measuring them daily against the QHR.
Among the various weighing uncertainties the most significant is the uncertainty associated with the 2nd order magnetization effect. This is only roughly estimated and the different mass results suggest that it could be much less. This issue needs better evaluation.
Similarly while the velocity dependence is only (1.0 ± 2.6) × 10 −9 the uncertainty might be improved with more extensive tests.
The covariant analysis resulted in a final uncertainty about 2 to 3 times more than simpler mean estimates and clearly illustrates the effect of the correlations. The fact that the simple standard deviation is so comparable to the final uncertainty gives us confidence that we have not seriously underestimated the uncertainties.
Finally, we turn to the values of the seven determinations and their mean. The seven results span a three year period and include balance rebuilding in the middle of this period. There is excellent stability over this period and good agreement of results measured using different nominal masses. The covariant weighted mean is nearly identical to the simple mean value indicating that the covariant analysis is really only critical for the uncertainty analysis. While our seven results are in close agreement they are all shifted by about 4 × 10 −9 from our previous result due to the hysteresis correction described in section 2.3.

Conclusions
We have reported various improvements to the NRC Kibble balance and characterized a previously unreported systematic error. Our previous results and three new Planck determinations are presented and evaluated accounting for correlations. The results show excellent agreement and stability over time and this, along with their comprehensive uncertainty analysis, establishes a new level of confidence in the Planck constant value.

Acknowledgments
We wish to acknowledgements the assistance and support of D Inglis, A Madej and P Dube, all of NRC and I Robinson of NPL. R Behr and L Palafox of PTB have generously provided the programmable Josephson chip. Table A1 shows the uncertainty budget of the sixth determination using the 1 kg AuCu mass measured in November 2016. The first column is the component index, the second is a description of the component category, the third is a description of the specific uncertainty component, and the fourth column is the estimated relative uncertainty   times of the pair of determinations. If the product of those variations is constant (or nearly constant) then the correlation is '1'. If the product of those variations vary randomly, or even systematically, over most of the uncertainty estimate then the correlation is '0'. It is the varying or randomization of the effect over the times of the two determinations that sets the value of the correlation. Table A2 shows the 6th uncertainty budget sorted into common categories. It is interesting to note that the components of alignment, resistance, mass and gravity are all comparable in magnitude. Any future improvements will require advances in all four areas. Table A3 shows the correlation of each of the seven Planck determinations with each other.

Appendix B. Covariant analysis
We have seven determinations of a variable y, each with an uncertainty budget consisting of 58 uncertainty components, as well as correlation assessments between all possible uncertainty components.
For the kth uncertainty component the vector, S k , is made of the seven uncertainties, u k i , , associated with kth component And the kth covariance sub-matrix, C k is given by, The total covariance matrix, C is then given by the sum of the 58 covariance sub-matrices.
To solve for the best estimate of an inverse covariance weighted mean, y, we create the row vector y y y y y y y y y y y y y y , , , [ ] (B.4) and note that the chi-squared is given by Solving for y is obtained by minimizing chi squared with respect to y. The solution is a consequence of the Gauss Markov theorem which states that in a linear regression model in which the errors have expectation zero and are uncorrelated and have equal variances, the best linear unbiased estimator (BLUE) of the coefficients is given by the ordinary least squares estimator. The technique of minimizing the chisquared is very general and also used by the CODATA TGFC primarily because it is more robust for non-linear solutions.