Focus on measurement uncertainty

Guest Editors


Walter Bich, Istituto Nazionale di Ricerca Metrologica, Italy
Clemens Elster, Physikalisch-Technische Bundesanstalt, Germany


BIPM Workshop on Measurement Uncertainty

Foreword

In June 2015 a BIPM workshop on measurement uncertainty was held, organized by scientists from INRIM, NIST, NPL, PTB and the BIPM itself. Over 100 participants from more than 40 countries attended the workshop, and a number of other registered participants were able to see and hear the presentations live over the internet.

One key issue was a discussion of the current proposal for a revision of the twenty-year-old Guide to the Expression of Uncertainty in Measurement (GUM). The GUM presents recommendations intended to unify the understanding and evaluation of measurement uncertainty, particularly in metrology. Differing views were presented and the workshop allowed for an open debate.

Merits and deficiencies of the present GUM and its Supplements were pointed out and illustrated by examples, with emphasis given to the GUM revision and its impact on various metrological aspects, such as calibration procedures and Calibration and Measurement Capabilities.

Challenges of uncertainty characterization beyond the GUM were also discussed. A spectrum of applications was presented where uncertainty needs to be evaluated and different tools, not necessarily GUM-related, are required, according to the application.

This focus issue contains papers that reflect a selection of the contributions presented at the workshop.

Focus issue papers

A multi-thermogram-based Bayesian model for the determination of the thermal diffusivity of a material

Alexandre Allard et al 2016 Metrologia 53 S1

The determination of thermal diffusivity is at the heart of modern materials characterisation. The evaluation of the associated uncertainty is difficult because the determination is performed in an indirect way, in the sense that the thermal diffusivity cannot be measured directly. The well-known GUM uncertainty framework does not provide a reliable evaluation of measurement uncertainty for such inverse problems, because in that framework the underlying measurement model is supposed to be a direct relationship between the measurand (the quantity intended to be measured) and the input quantities on which the measurand depends. This paper is concerned with the development of a Bayesian approach to evaluate the measurement uncertainty associated with thermal diffusivity. A Bayesian model is first developed for a single thermogram and is then extended to the case of several thermograms obtained under repeatability and reproducibility conditions. This multi-thermogram-based model is able to take into consideration a large set of influencing quantities that occur during the measurements and yields a more reliable uncertainty evaluation than the one obtained from a single thermogram. Different aspects of the Bayesian model are discussed, including the sensitivity to the choice of the prior distribution, the Metropolis–Hastings algorithm used for the inference and the convergence of the Markov chains.
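In schematic form (our notation, not necessarily that of the paper), the multi-thermogram model combines the thermograms through Bayes' theorem, assuming them conditionally independent given the parameters:

\[
p(a, \theta \mid y_1, \dots, y_N) \;\propto\; p(a, \theta)\,\prod_{i=1}^{N} p(y_i \mid a, \theta),
\]

where a denotes the thermal diffusivity, \theta collects the other influencing quantities and y_i is the i-th thermogram; the resulting posterior distribution is then explored with the Metropolis–Hastings algorithm.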

Bayesian regression versus application of least squares—an example

Clemens Elster and Gerd Wübbeler 2016 Metrologia 53 S10

Regression is an important task in metrology and least-squares methods are often applied in this context. Bayesian inference provides an alternative that can take into account available prior knowledge. We illustrate similarities and differences of the two approaches in terms of a particular nonlinear regression problem. The impact of prior knowledge utilized in the Bayesian regression depends on the amount of information contained in the data, and by considering data sets with different signal-to-noise ratios the relevance of the employed prior knowledge for the results is investigated. In addition, properties of the two approaches are explored in the context of the particular example.
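To make the contrast concrete, the following minimal sketch fits a simple nonlinear model y = a·exp(−b·x) both by least squares and by a Bayesian grid evaluation with Gaussian priors; the model, priors, noise level and data are illustrative assumptions and not the example analysed in the paper.

"""Illustrative comparison of least-squares and Bayesian estimates for a
simple nonlinear regression y = a*exp(-b*x) + noise (all settings assumed)."""
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(1)

def model(x, a, b):
    return a * np.exp(-b * x)

# simulated data with a chosen signal-to-noise ratio
x = np.linspace(0.0, 2.0, 15)
sigma = 0.2                                  # known noise standard deviation
y = model(x, 2.0, 1.5) + rng.normal(0.0, sigma, x.size)

# (1) least squares: no prior knowledge used
ls_est, ls_cov = curve_fit(model, x, y, p0=[1.0, 1.0])
ls_unc = np.sqrt(np.diag(ls_cov))

# (2) Bayesian estimate on a grid, with Gaussian priors for a and b
a_grid = np.linspace(0.5, 3.5, 301)
b_grid = np.linspace(0.1, 3.0, 301)
A, B = np.meshgrid(a_grid, b_grid, indexing="ij")
log_prior = -0.5 * (((A - 2.0) / 0.5) ** 2 + ((B - 1.5) / 0.5) ** 2)
resid = y[None, None, :] - A[..., None] * np.exp(-B[..., None] * x)
log_like = -0.5 * np.sum((resid / sigma) ** 2, axis=-1)
log_post = log_prior + log_like
post = np.exp(log_post - log_post.max())
post /= post.sum()

a_bayes = np.sum(A * post)
b_bayes = np.sum(B * post)
a_bayes_unc = np.sqrt(np.sum((A - a_bayes) ** 2 * post))
print("least squares       :", ls_est, "uncertainties:", ls_unc)
print("posterior means     :", a_bayes, b_bayes, "u(a):", a_bayes_unc)

Lowering the signal-to-noise ratio in such a sketch shows the prior pulling the Bayesian estimate away from the least-squares one, which is the dependence on the information content of the data studied in the paper.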

Introducing a Simple Guide for the evaluation and expression of the uncertainty of NIST measurement results

Antonio Possolo 2016 Metrologia 53 S17

The current guidelines for the evaluation and expression of the uncertainty of NIST measurement results were originally published in 1993 as NIST Technical Note 1297, which was last revised in 1994. NIST is now updating its principles and procedures for uncertainty evaluation to address current and emerging needs in measurement science that Technical Note 1297 could not have anticipated or contemplated when it was first conceived. Although progressive and forward-looking, this update is also conservative because it does not require that current practices for uncertainty evaluation be abandoned or modified where they are fit for purpose and when there is no compelling reason to do otherwise. The updated guidelines are offered as a Simple Guide intended to be deployed under the NIST policy on Measurement Quality, and are accompanied by a rich collection of examples of application drawn from many different fields of measurement science.

Evaluating the measurement uncertainty of complex quantities: a selective review

B D Hall 2016 Metrologia 53 S25

This paper reviews a selection of topics related to the evaluation and expression of measurement uncertainty in complex quantities, which are prevalent in electromagnetic measurements at radio and microwave frequencies. Methods appropriate for complex quantities are presented that extend those described, for real-valued quantities, in the Guide to the Expression of Uncertainty in Measurement.
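In outline (our notation), a complex quantity is treated as a bivariate real quantity, so that a standard uncertainty becomes a 2 x 2 covariance matrix and the law of propagation of uncertainty takes a matrix form:

\[
\mathbf{z} = \begin{pmatrix} x \\ y \end{pmatrix}, \qquad
\mathbf{V}(\mathbf{z}) = \begin{pmatrix} u^2(x) & u(x,y) \\ u(x,y) & u^2(y) \end{pmatrix}, \qquad
\mathbf{V}(\mathbf{w}) \approx \mathbf{J}\,\mathbf{V}(\mathbf{z})\,\mathbf{J}^{\mathsf T},
\]

where z = x + iy, w = f(z), and J is the Jacobian of the real and imaginary parts of f with respect to (x, y).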

Markov chain Monte Carlo methods: an introductory example

Katy Klauenberg and Clemens Elster 2016 Metrologia 53 S32

When the Guide to the Expression of Uncertainty in Measurement (GUM) and methods from its supplements are not applicable, the Bayesian approach may be a valid and welcome alternative. Evaluating the posterior distribution, estimates or uncertainties involved in Bayesian inferences often requires numerical methods to avoid high-dimensional integrations. Markov chain Monte Carlo (MCMC) sampling is such a method—powerful, flexible and widely applied. Here, a concise introduction is given, illustrated by a simple, typical example from metrology.

The Metropolis–Hastings algorithm is the most basic and yet flexible MCMC method. Its underlying concepts are explained and the algorithm is given step by step. The few lines of software code required for its implementation invite interested readers to get started. Diagnostics to evaluate performance and common algorithmic choices are illustrated, in order to calibrate the Metropolis–Hastings algorithm for efficiency. Routine application of MCMC algorithms is currently hindered by the difficulty of assessing the convergence of MCMC output and thus of assuring the validity of results. An example points to the importance of convergence and initiates a discussion of advantages as well as open areas of research. Available software tools are mentioned throughout.
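As an invitation of exactly this kind, a minimal random-walk Metropolis–Hastings sampler is sketched below; the data, flat prior, starting value and step size are illustrative assumptions, not the metrological example treated in the paper, and are chosen so that the result can be checked against the analytic posterior.

"""Minimal random-walk Metropolis-Hastings sampler for the posterior of a
mean mu given normal data with known sigma and a flat prior (all assumed)."""
import numpy as np

rng = np.random.default_rng(0)
data = np.array([9.8, 10.2, 10.1, 9.9, 10.3])   # observations (assumed)
sigma = 0.2                                      # known standard deviation

def log_post(mu):
    # flat prior: log-posterior equals log-likelihood up to a constant
    return -0.5 * np.sum(((data - mu) / sigma) ** 2)

n_samples, step = 20000, 0.2
chain = np.empty(n_samples)
mu = 9.0                                         # deliberately poor start
lp = log_post(mu)
accepted = 0
for i in range(n_samples):
    prop = mu + step * rng.normal()              # symmetric proposal
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:     # accept/reject step
        mu, lp = prop, lp_prop
        accepted += 1
    chain[i] = mu

burn = chain[2000:]                              # discard burn-in
print("acceptance rate: %.2f" % (accepted / n_samples))
print("posterior mean %.3f, std %.3f" % (burn.mean(), burn.std()))
print("analytic  mean %.3f, std %.3f" % (data.mean(), sigma / np.sqrt(data.size)))

Tuning the step size towards a moderate acceptance rate, discarding a burn-in period and comparing several chains started from different values are the kinds of diagnostics and algorithmic choices discussed in the paper.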

Estimation and uncertainty analysis of dose response in an inter-laboratory experiment

Blaza Toman et al 2016 Metrologia 53 S40

An inter-laboratory experiment for the evaluation of toxic effects of NH2-polystyrene nanoparticles on living human cancer cells was performed with five participating laboratories. Previously published results from nanocytotoxicity assays are often contradictory, mostly due to challenges related to producing a reliable cytotoxicity assay protocol for use with nanomaterials. Specific challenges include reproducibility in preparing nanoparticle dispersions, biological variability from testing living cell lines, and the potential for nano-related interference effects. In this experiment, such challenges were addressed by developing a detailed experimental protocol and using a specially designed 96-well plate layout which incorporated a range of control measurements to assess multiple factors such as nanomaterial interference, pipetting accuracy, cell seeding density, and instrument performance. Detailed data analysis of these control measurements showed that good control of the experiments was attained by all participants in most cases. The main measurement objective of the study was the estimation of a dose-response relationship between the concentration of the nanoparticles and the metabolic activity of the living cells, under several experimental conditions. The dose-response curve estimation was achieved by embedding a three-parameter logistic curve in a three-level Bayesian hierarchical model, accounting for uncertainty due to all known experimental conditions as well as between-laboratory variability in a top-down manner. Computation was performed using Markov chain Monte Carlo methods. The fit of the model was evaluated using Bayesian posterior predictive probabilities and found to be satisfactory.
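A common parameterization of such a three-parameter logistic dose-response curve (the exact form used by the authors may differ) is

\[
f(c) = \frac{\theta_1}{1 + \exp\{\theta_2\,(\log c - \theta_3)\}},
\]

where c is the nanoparticle concentration, \theta_1 the response of untreated cells, \theta_2 a slope parameter and \theta_3 the log-concentration at half-maximal response; in a hierarchical model such curve parameters are allowed to vary between experimental conditions and laboratories around study-level means.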

Evaluation of uncertainty in the adjustment of fundamental constants

Olha Bodnar et al 2016 Metrologia 53 S46

Combining multiple measurement results for the same quantity is an important task in metrology and in many other areas. Examples include the determination of fundamental constants, the calculation of reference values in interlaboratory comparisons, or the meta-analysis of clinical studies. However, neither the GUM nor its supplements give any guidance for this task. Various approaches are applied, such as weighted least squares in conjunction with the Birge ratio, or random effects models. While the former approach, which is based on a location-scale model, is particularly popular in metrology, the latter represents a standard tool used in statistics for meta-analysis. We investigate the reliability and robustness of the location-scale model and the random effects model, with particular focus on the resulting coverage or credible intervals. The interval estimates are obtained by adopting a Bayesian point of view in conjunction with a non-informative prior that is determined by a currently favored principle for selecting non-informative priors. Both approaches are compared by applying them to simulated data as well as to data for the Planck constant and the Newtonian constant of gravitation. Our results suggest that the proposed Bayesian inference based on the random effects model is more reliable and less sensitive to model misspecifications than the approach based on the location-scale model.
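For reference, the two approaches compared in the paper can be summarized as follows (standard formulas, our notation). For results x_i with standard uncertainties u_i, the weighted least-squares estimate and the Birge ratio are

\[
\bar{x} = \frac{\sum_i x_i/u_i^2}{\sum_i 1/u_i^2}, \qquad
u^2(\bar{x}) = \Big(\sum_i 1/u_i^2\Big)^{-1}, \qquad
R_{\mathrm{B}} = \sqrt{\frac{1}{n-1}\sum_i \frac{(x_i-\bar{x})^2}{u_i^2}},
\]

with u(\bar{x}) commonly inflated by the factor R_B when R_B > 1, whereas the random effects model writes x_i = \mu + \lambda_i + \varepsilon_i with \lambda_i ~ N(0, \tau^2) accounting for between-result variability and \varepsilon_i ~ N(0, u_i^2).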

When the model doesn’t cover reality: examples from radionuclide metrology

S Pommé 2016 Metrologia 53 S55

It could be argued that activity measurements of radioactive substances should be under statistical control, considering that the measurand is unambiguously defined, the radioactive decay processes are theoretically well understood and the measurement function can be derived from physical principles. However, comparisons invariably show a level of discrepancy among activity standardisation results that exceeds expectation based on the uncertainty evaluations. Also, decay characteristics of radionuclides determined from different experiments show unexpected inconsistencies. Arguably, the problem lies mainly in incomplete uncertainty assessment. Of the various reasons leading to incomplete uncertainty assessment, ranging from human error to the limits of state-of-the-art knowledge, a selection of cases is discussed in which imperfections in the modelling of the measurement process can lead to unexpectedly large underestimations of uncertainty.

Examples of S1 coverage intervals with very good and very bad long-run success rate

Nicola Giaquinto and Laura Fabbiano 2016 Metrologia 53 S65

The paper illustrates, by means of selected examples, the merits and the limits of the method for computing coverage intervals described in Supplement 1 to the GUM. The coverage intervals are assessed by evaluating their long-run success rate. Three pairs of examples are presented, corresponding to three different ways of generating incomplete knowledge about quantities: the toss of dice, the presence of additive noise, and quantization. In each pair, the first example results in a coverage interval with a long-run success rate equal to the coverage probability (set to 95%); the second, instead, yields an interval with a success rate near zero. The paper shows that the propagation mechanism of Supplement 1, while working well in certain special cases, yields unacceptable results in others, and that these problematic issues cannot be neglected. The conclusion is that, if a Bayesian approach to uncertainty evaluation is adopted, propagation is a particularly delicate issue.
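The procedure being assessed can be sketched in a few lines: for each simulated experiment, state-of-knowledge distributions are assigned to the input quantities, propagated through the measurement model by Monte Carlo sampling, and a 95% probabilistically symmetric coverage interval is read off; repeating this many times gives the long-run success rate. The model, distributions and sample sizes below are illustrative assumptions for a well-behaved normal case, not one of the paper's examples, and here the success rate does come out close to 95%.

"""Long-run success rate of a Supplement-1-style coverage interval for the
additive model Y = X1 + X2 (model, noise levels and sample sizes assumed)."""
import numpy as np

rng = np.random.default_rng(0)
x1_true, x2_true = 1.0, 2.0
s1, s2 = 0.05, 0.10            # standard uncertainties of the indications
y_true = x1_true + x2_true

M, N = 2000, 10000             # repeated experiments, MC draws per experiment
hits = 0
for _ in range(M):
    # one simulated experiment: noisy indications of X1 and X2
    x1_obs = x1_true + s1 * rng.normal()
    x2_obs = x2_true + s2 * rng.normal()
    # Supplement-1-style step: sample the state-of-knowledge distributions
    # assigned to the inputs and push them through the measurement model
    x1_draw = x1_obs + s1 * rng.normal(size=N)
    x2_draw = x2_obs + s2 * rng.normal(size=N)
    y_draw = x1_draw + x2_draw
    lo, hi = np.percentile(y_draw, [2.5, 97.5])
    hits += (lo <= y_true <= hi)

print("long-run success rate: %.3f (target 0.95)" % (hits / M))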

Bayesian conformity assessment in presence of systematic measurement errors

Carlo Carobbi and Francesca Pennecchi 2016 Metrologia 53 S74

Conformity assessment of the distribution of the values of a quantity is investigated by using a Bayesian approach. The effect of systematic, non-negligible measurement errors is taken into account. The analysis is general, in the sense that the probability distribution of the quantity can be of any kind, that is, it may even differ from the ubiquitous normal distribution, and the measurement model function, linking the measurand with the observable and non-observable influence quantities, can be non-linear. Further, any joint probability density function can be used to model the available knowledge about the systematic errors. It is demonstrated that the result of the Bayesian analysis developed here reduces to the standard result (obtained through a frequentist approach) when the systematic measurement errors are negligible. A consolidated frequentist extension of that standard result, aimed at including the effect of a systematic measurement error, is compared directly with the Bayesian result, whose superiority is demonstrated. Application of the results to the derivation of the operating characteristic curves used in sampling plans for inspection by variables is also introduced.
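In generic terms (our notation, not the authors'), the quantity of interest in such an assessment is a conformance probability obtained by marginalizing over the unknown parameters, including those describing the systematic errors:

\[
p_{\mathrm{c}} = \Pr\!\left(Y \in [T_{\mathrm{L}}, T_{\mathrm{U}}] \mid \text{data}\right)
= \int \Pr\!\left(Y \in [T_{\mathrm{L}}, T_{\mathrm{U}}] \mid \boldsymbol{\theta}\right)
p(\boldsymbol{\theta} \mid \text{data})\,\mathrm{d}\boldsymbol{\theta},
\]

where [T_L, T_U] is the tolerance interval and \boldsymbol{\theta} collects the parameters of the distribution of Y together with the systematic-error quantities.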

In pursuit of a fit-for-purpose uncertainty guide

D R White 2016 Metrologia 53 S107

Measurement uncertainty is a measure of the quality of a measurement; it enables users of measurements to manage the risks and costs associated with decisions influenced by measurements, and it supports metrological traceability by quantifying the proximity of measurement results to true SI values. The Guide to the Expression of Uncertainty in Measurement (GUM) ensures uncertainty statements meet these purposes and encourages the world-wide harmony of measurement uncertainty practice. Although the GUM is an extraordinarily successful document, it has flaws, and a revision has been proposed. Like the already-published supplements to the GUM, the proposed revision employs objective Bayesian statistics instead of frequentist statistics. This paper argues that the move away from a frequentist treatment of measurement error to a Bayesian treatment of states of knowledge is misguided. The move entails changes in measurement philosophy, a change in the meaning of probability, and a change in the object of uncertainty analysis, all leading to different numerical results, increased costs, increased confusion, a loss of trust, and, most significantly, a loss of harmony with current practice. Recommendations are given for a revision in harmony with the current GUM and allowing all forms of statistical inference.

On challenges in the uncertainty evaluation for time-dependent measurements

S Eichstädt et al 2016 Metrologia 53 S125

The measurement of quantities with time-dependent values is a common task in many areas of metrology. Although well established techniques are available for the analysis of such measurements, serious scientific challenges remain to be solved to enable their routine use in metrology. In this paper we focus on the challenge of estimating a time-dependent measurand when the relationship between the value of the measurand and the indication is modeled by a convolution. Mathematically, deconvolution is an ill-posed inverse problem, requiring regularization to stabilize the inversion in the presence of noise. We present and discuss deconvolution in three practical applications: thrust-balance, ultra-fast sampling oscilloscopes and hydrophones. Each case study takes a different approach to modeling the convolution process and regularizing its inversion. Critically, all three examples lack the assignment of an uncertainty to the influence of the regularization on the estimation accuracy. This is a grand challenge for dynamic metrology, for which to date no generic solution exists. The case studies presented here cover a wide range of time scales and prior knowledge about the measurand, and they can thus serve as starting points for future developments in metrology. The aim of this work is to present the case studies and demonstrate the challenges they pose for metrology.
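A toy version of the inversion problem common to the three case studies, with Tikhonov regularization standing in for the various regularization choices, may look as follows; the system response, noise level and regularization parameter are invented for illustration and are not the case-study values.

"""Toy deconvolution with Tikhonov regularisation (all settings assumed)."""
import numpy as np
from scipy.linalg import toeplitz

rng = np.random.default_rng(0)
n = 200
t = np.linspace(0.0, 1.0, n)
x_true = np.exp(-0.5 * ((t - 0.4) / 0.05) ** 2)      # unknown input signal

# low-pass system response and its (lower-triangular) convolution matrix H
h = np.exp(-t / 0.05)
h /= h.sum()
H = toeplitz(h, np.zeros(n))

y = H @ x_true + 0.01 * rng.normal(size=n)           # noisy indication

# naive inversion amplifies the noise; Tikhonov regularisation stabilises it
lam = 1e-3                                           # regularisation parameter
x_naive = np.linalg.solve(H, y)
x_reg = np.linalg.solve(H.T @ H + lam * np.eye(n), H.T @ y)

print("rms error, naive inversion :", np.sqrt(np.mean((x_naive - x_true) ** 2)))
print("rms error, regularised     :", np.sqrt(np.mean((x_reg - x_true) ** 2)))

The missing piece highlighted by the paper is an uncertainty contribution for the influence of the regularization term on the regularised estimate, for which no generic recipe yet exists.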

Towards a new GUM—an update

Walter Bich et al 2016 Metrologia 53 S149

The contents of the current edition, JCGM 100:2008, of the Guide to the Expression of Uncertainty in Measurement (GUM) and its Supplements are reviewed and remarks made concerning a proposed revision of the GUM. A committee draft of the revision was circulated to member organizations of the Joint Committee for Guides in Metrology (JCGM) and all national metrology institutes in December 2014. The motivation for the proposed changes is given and reactions to the committee draft are summarized. Some of the contents of this paper are solely an expression by the authors and do not constitute an official statement by the JCGM.

Selected articles previously published in Metrologia related to the subject of this Focus Issue:

Bayesian analysis of a flow meter calibration problem

G J P Kok et al 2015 Metrologia 52 392

A turbine flow meter indicates the volume of fluid flowing through the device per unit of time. Such a flow meter is commonly calibrated at a few known flow rates over its measurement range. A calibration curve relating the pulse factor of the meter to the flow rate is then fitted to the calibration data using an ordinary least squares approach. This approach does not take into account prior knowledge that may exist about the flow meter or the calibration procedure. A Bayesian analysis enables such prior knowledge to be taken into account, and results in a posterior distribution for the unknown parameters of the calibration curve that may be seen as the most comprehensive uncertainty information about these unknowns. This paper investigates, for a flow meter calibration problem, the effect of incorporating prior knowledge on the values of the calibration curve and their associated uncertainties. It presents the results of a Bayesian analysis and compares them with those obtained by an ordinary least squares approach.
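The least-squares baseline of such an analysis is quickly sketched; the data, curve order and units below are invented for illustration and are not the calibration data of the paper.

"""Ordinary least-squares fit of a flow meter calibration curve (pulse factor
K as a quadratic function of flow rate q); data and units are assumed."""
import numpy as np

q = np.array([5.0, 10.0, 20.0, 40.0, 60.0, 80.0])       # flow rate / (m^3/h)
K = np.array([101.2, 100.6, 100.1, 99.8, 99.7, 99.6])   # pulse factor / (pulses/L)

coeff, cov = np.polyfit(q, K, deg=2, cov=True)           # OLS fit, 2nd order
u_coeff = np.sqrt(np.diag(cov))                          # standard uncertainties

print("fitted coefficients    :", coeff)
print("standard uncertainties :", u_coeff)
print("predicted K at q = 30  :", np.polyval(coeff, 30.0))

A Bayesian treatment replaces this point estimate and covariance with a posterior distribution for the curve parameters into which prior knowledge about the meter enters explicitly.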

Explanatory power of degrees of equivalence in the presence of a random instability of the common measurand

Gerd Wübbeler et al 2015 Metrologia 52 400

The degrees of equivalence are the main outcome of the analysis of key comparison data, and they are used for the approval of the calibration and measurement capabilities of the participating laboratories. Typically, the calibration and measurement capability of a participating laboratory is regarded as approved when the corresponding unilateral degree of equivalence does not differ significantly from zero.

The relevance of degrees of equivalence may deteriorate in the presence of an instability of the common measurand. In order to assess this deterioration quantitatively, we propose to consider the loss of power of a hypothesis test associated with checking whether a degree of equivalence differs significantly from zero. Based on the resulting loss of power, one can decide whether the size of the instability of the common measurand may be tolerated. We illustrate the concept in terms of results obtained for the recent key comparison CCM.FF-K6.2011.
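In the usual key-comparison notation (standard definitions; the hypothesis test itself is detailed in the paper), the unilateral degree of equivalence of laboratory i and the associated approval criterion are

\[
d_i = x_i - x_{\mathrm{ref}}, \qquad U(d_i) = k\,u(d_i)\ \ (k = 2), \qquad
|d_i| \le U(d_i),
\]

and a random instability of the common measurand adds a further variance contribution to u^2(d_i), which reduces the power of the corresponding test to detect a genuinely non-zero d_i.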