Median Statistics Estimate of the Distance to M87

de Grijs & Bono compiled 211 independent measurements of the distance to the galaxy M87 in the Virgo cluster from 15 different tracers, and reported 31.03 ± 0.14 mag, the arithmetic mean of a subset of this compilation, as the best estimate of the distance. We compute three different central estimates, the arithmetic mean, the weighted mean, and the median, and the corresponding statistical uncertainty for the full data set as well as three sub-compilations. For all three central estimates, the error distributions show that the data sets are significantly non-Gaussian. We therefore conclude that the median is the most reliable of the three central estimates, as median statistics do not assume Gaussianity. We use median statistics to determine the systematic error on the distance by analyzing the scatter among the 15 tracer subgroup distances. From the 211 distance measurements, we recommend a summary M87 distance modulus of 31.08 +0.05 −0.04 (statistical) +0.04 −0.06 (systematic) mag, or, combining the two errors in quadrature, 31.08 +0.06 −0.07 mag, equivalent to 16.4 ± 0.5 Mpc, all at 68.27% significance.


INTRODUCTION
The extragalactic distance ladder is essential to astrophysics and cosmology and must constantly be refined. As the galaxy M87 lies near the center of the Virgo cluster, the closest galaxy cluster to us, it is an important rung on the distance ladder and allows us to extend the ladder to more distant clusters such as Coma and Fornax. de Grijs & Bono (2019), hereafter deGB, compiled a list of 211 distance measurements to M87 obtained with 15 different distance tracers. They report a mean value of 31.03 ± 0.14 mag after reducing their data set to 24 measurements from 3 tracers which they believe to be well calibrated and independent.
Following Crandall & Ratra (2014), Penton et al. (2018), and Yu et al. (2020), we analyze the deGB data sets using median statistics (Gott et al. 2001), which is free from assumptions about the distribution underlying the data set and its errors. Because median statistics does not take into account the error bars on individual measurements, it is generally less constraining. Nevertheless, we believe the median to provide a more accurate estimate of the distance to M87 than methods which rely on the unsatisfied assumption of Gaussianity. In addition to using median statistics to estimate a more reliable statistical uncertainty in the M87 distance measurement, we also use it to estimate the systematic uncertainty in this measurement, based on the scatter in the M87 distance estimated from each of the 15 different tracer subgroups in the deGB compilation. The results of this median statistics analysis of the entire data set of 211 distance measurements provided by deGB yield an M87 distance modulus of 31.08 +0.04 −0.04 (statistical) +0.04 −0.06 (systematic) mag at 68.27% significance. Combining the two errors in quadrature, we get 31.08 +0.06 −0.07 mag, or 16.4 ± 0.5 Mpc. In Section 2 we introduce the different data compilations studied. In Section 3 we summarize median statistics and outline the Gaussianity test we use. In Section 4 we study the Gaussianity of the deGB compilations and argue that our median statistics result is a better representation of the true distance to M87 than a more conventional mean analysis. We conclude in Section 5.
DATA

deGB compiled 211 M87 distance measurements from the NASA Astrophysics Data System (ADS) that they found to be statistically independent. They included measurements both to M87 and to the geometric center of the Virgo cluster, since these values were found to be statistically indistinguishable. These were grouped into 15 tracers. deGB selected 5 of these tracers: Cepheids, the planetary nebula luminosity function (PNLF), surface brightness fluctuations (SBF), the tip of the red giant branch (TRGB) magnitude, and novae, which they found to be internally consistent and to provide tight averages, as opposed to the other 10 tracers. These 5 tracers correspond to a total of 44 measurements. They adjusted the measurements from these 5 tracers to agree with their best estimate of the distance modulus to the LMC of 18.49 ± 0.09 mag (de Grijs et al. 2014). This value is in good agreement with the median statistics estimate of 18.49 ± 0.13 mag found by Crandall & Ratra (2015). deGB's final recommended value is the arithmetic mean of a set of 24 points from three tracers: Cepheids, SBF, and TRGB. The PNLF and novae tracers were removed due to foreground- and background-biased outliers, respectively. The result of deGB's mean analysis of the 24 points from these 3 tracers is a distance modulus of 31.03 ± 0.14 mag at 1σ sample standard deviation; we find an identical result.
In this paper we analyze the following four subsets of the deGB compilation and provide the best estimate for each subset with 1σ uncertainty, as explained in Section 3.1. All 15 refers to deGB's full list of 211 unadjusted data points from 15 tracers. All 15 without averages is the same but excluding the averages tracer. Best 5 refers to the adjusted 44 measurements from 5 tracers from Table 1 of deGB that they determined to be internally consistent. Best 3 refers to the adjusted 24 data points from 3 tracers (excluding the Tammann et al. 2000 point) from Table 1 of deGB that they used to compute their favored summary measurement value.

ANALYSIS
Conventional methods such as mean and χ² analyses assume (Gott et al. 2001):

1. Individual data points are statistically independent.

2. There are no systematic effects.

3. The errors are Gaussianly distributed.

4. One knows the standard deviation of the errors.
Median statistics was developed by Gott et al. (2001) as a powerful alternative to mean and χ² analyses. The essential idea is that the true value of the quantity being measured is the median of a set of repeated error-affected measurements as the number of measurements tends toward infinity. This follows from the assumptions that the data set contains only independent measurements and that it does not have any overall systematic error. The individual measurement errors have no effect on the computation of the median of a data set, so assumptions 3 and 4 are not necessary for the application of median statistics. This is advantageous for analyzing non-Gaussian data sets, since for such cases any mean analysis is suspect due to the failure to satisfy assumption 3. Since the individual measurement errors are not taken into account (assumption 4 is dropped), a median statistics analysis will generally provide a less constraining central estimate than mean statistics. We argue that, despite this, median statistics provides the most reliable central estimate in the case of a non-Gaussian data compilation.
We study the Gaussianity of the compilations described in Section 2 by creating error distributions of the data from various central estimates and comparing them to the Gaussian probability distribution. Based on the results of this analysis, we argue that the median is the most accurate and reliable central estimate of the deGB compilations.

Computing the Central Estimate
To study the Gaussianity of a data set, we construct an error distribution using a central estimate. We compute the median, weighted mean, and arithmetic mean central estimates and create an error distribution for each.
The true median of a data set is defined as the median as the number of measurements, N, goes to infinity. Gott et al. (2001) showed that the probability that the true median lies between any two adjacent measurements M_i and M_{i+1} of the ordered data is given by the appropriately normalized binomial distribution

P_i = 2^{−N} N! / [i! (N − i)!],  (1)

where M_0 ≡ −∞ and M_{N+1} ≡ +∞. To compute the uncertainty in our estimate of the median: (i) for every index i, we find the probability P_i that the true median lies between M_i and M_{i+1}; (ii) we split the data into two halves, above and below the median of our distribution; (iii) we compute the probability contained in each half; (iv) we iterate outward from the median until 68.27% of the probability of a half is exceeded; and (v) the datum at the index which yields a probability closest to 68.27% is recorded as M_{i+} or M_{i−}. These indices in the ordered data then define the uncertainties

σ_+ = M_{i+} − M_med,  σ_− = M_med − M_{i−},  (2)

where M_med is the median.
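As an illustration of the procedure above, the following sketch (our own code, with hypothetical example data, not the paper's implementation) computes the median and its binomial 68.27% range:

```python
import numpy as np
from scipy.stats import binom

def median_stats(values, cl=0.6827):
    """Median with asymmetric errors from the binomial probability
    P_i = 2**-N * N! / (i! (N-i)!) that the true median lies between
    adjacent order statistics (Gott et al. 2001)."""
    m = np.sort(np.asarray(values, dtype=float))
    n = m.size
    # p[i], i = 0..N, is the probability that the true median lies
    # between sorted points M_i and M_(i+1) (with infinite endpoints)
    p = binom.pmf(np.arange(n + 1), n, 0.5)
    cum = np.cumsum(p)  # cum[j] = P(true median < m[j])
    lo = m[np.searchsorted(cum, (1.0 - cl) / 2.0)]
    hi = m[min(np.searchsorted(cum, (1.0 + cl) / 2.0), n - 1)]
    med = float(np.median(m))
    return med, med - lo, hi - med  # median, sigma_minus, sigma_plus

# hypothetical distance moduli (mag), for illustration only
mu = [31.0, 31.1, 30.9, 31.2, 31.05, 31.08, 30.95, 31.15, 31.03, 31.12]
med, sig_minus, sig_plus = median_stats(mu)
```

Because the interval endpoints are discrete data points, the resulting errors are in general asymmetric, especially for small N.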
The weighted-mean central estimate, while it carries the added assumption of Gaussianity, has the benefit of using the reported uncertainties and so is a more constraining estimate than the median. Given a data set M_i ± σ_i, we compute the weighted-mean central estimate as (Podariu et al. 2001)

M_wm = Σ_{i=1}^{N} (M_i/σ_i²) / Σ_{i=1}^{N} (1/σ_i²),  (3)

with standard deviation

σ_wm = [Σ_{i=1}^{N} (1/σ_i²)]^{−1/2}.  (4)

We also employ the arithmetic mean central estimate with the standard error of the mean,

M_mean = (1/N) Σ_{i=1}^{N} M_i,  σ_mean = s/√N,  (5)

where s is the sample standard deviation. These central estimates are used to construct error distributions, which we compare with a Gaussian in order to evaluate the Gaussianity of the data set.
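A minimal sketch of the weighted-mean and arithmetic-mean central estimates, using hypothetical measurements (our illustration, not the paper's code):

```python
import numpy as np

def weighted_mean(m, sigma):
    """Weighted mean M_wm = sum(M_i/sigma_i^2) / sum(1/sigma_i^2)
    with standard deviation (sum(1/sigma_i^2))**-0.5
    (Podariu et al. 2001)."""
    m, sigma = np.asarray(m, float), np.asarray(sigma, float)
    w = 1.0 / sigma**2
    return float(np.sum(w * m) / np.sum(w)), float(np.sum(w) ** -0.5)

def arithmetic_mean(m):
    """Arithmetic mean with the standard error of the mean s/sqrt(N),
    where s is the sample standard deviation."""
    m = np.asarray(m, float)
    return float(m.mean()), float(m.std(ddof=1) / np.sqrt(m.size))

# hypothetical measurements with errors, for illustration only
mu = [31.0, 31.1, 30.9, 31.2]
err = [0.1, 0.2, 0.1, 0.2]
wm, wm_sig = weighted_mean(mu, err)
am, am_sig = arithmetic_mean(mu)
```

Note that the weighted mean pulls toward the better-measured points, while the arithmetic mean treats all points equally.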

Error Distributions
We create error distributions of our data in order to study their Gaussianity. The error distribution N_σi measures how many standard deviations each individual measurement deviates from the central estimate.
Given measurements M_i ± σ_i and a corresponding central estimate M_CE ± σ_CE, the error distribution is

N_σi = (M_i − M_CE) / √(σ_i² + σ_CE²).  (6)

This expression assumes that the central estimate is not correlated with the data set. This assumption is not satisfied here, as our central estimates are computed directly from the data compilations. Finding such a formula for an error distribution of a correlated median is beyond the scope of this work.
The weighted-mean case has been solved when the weighted mean is correlated with the data set. The error distribution using a weighted-mean central estimate that is correlated with the data set is

N_σi = (M_i − M_wm) / √(σ_i² − σ_wm²).  (7)

We often have asymmetric error bars on the median, in which case we slightly alter Equation (6): we use the upper error of the median when M_i > M_med and the lower error when M_i < M_med. After symmetrizing the error distribution about 0, we use the Kolmogorov–Smirnov (KS) test to study the Gaussianity of the data compilations.
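The three error-distribution variants described above can be sketched as follows (our illustration; the array inputs and function names are our own):

```python
import numpy as np

def nsigma_uncorrelated(m, sigma, m_ce, sigma_ce):
    """N_sigma_i = (M_i - M_CE) / sqrt(sigma_i^2 + sigma_CE^2),
    valid when the central estimate is independent of the data."""
    return (m - m_ce) / np.sqrt(sigma**2 + sigma_ce**2)

def nsigma_weighted_mean(m, sigma, m_wm, sigma_wm):
    """Correlated weighted-mean case: the sign under the square root
    flips because the weighted mean is built from the same data."""
    return (m - m_wm) / np.sqrt(sigma**2 - sigma_wm**2)

def nsigma_median(m, sigma, m_med, sig_minus, sig_plus):
    """Median case with asymmetric errors: use the median's upper
    error for points above it and its lower error for points below."""
    sig_med = np.where(m > m_med, sig_plus, sig_minus)
    return (m - m_med) / np.sqrt(sigma**2 + sig_med**2)

# hypothetical values, for illustration only
m = np.array([31.0, 31.2])
sigma = np.array([0.1, 0.1])
ns = nsigma_median(m, sigma, 31.1, 0.05, 0.05)
```

The symmetrization step in the text then amounts to appending the negated values of the resulting array before applying the KS test.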

The Kolmogorov-Smirnov Test and Testing Gaussianity
The KS test is a statistical test that compares empirical error distributions with continuous probability density functions (PDFs). The first step is calculating the D-statistic, which is the largest difference between the empirical cumulative distribution function and that of the relevant PDF.
The D-statistic is then used to compute (Press et al. 2007)

z = D (√N + 0.12 + 0.11/√N),  (8)

which is used to compute the p-value

p = Q_KS(z) = 2 Σ_{j=1}^{∞} (−1)^{j−1} e^{−2j²z²}.  (9)

Explicitly, the p-value is the probability that the D-statistic could be smaller than measured if a similar data set were used. The p-value represents the probability with which we can reject the null hypothesis that these data do not come from the PDF of interest. Conventionally, if p ≥ 0.95, we can reject the null hypothesis that our data do not come from the relevant PDF. Therefore, if p ≤ 0.95, we conclude that these data are not consistent with having been drawn from a Gaussian distribution. We also introduce a scale factor S, such that S > 1 corresponds to decreasing the errors, or narrowing the distribution, and S < 1 corresponds to increasing the errors, or widening the distribution. We then run the KS test, varying S from 0 to 10 to find S*, the value of S which optimizes the p-value. This allows us to compare the error distribution to a Gaussian. If S* > 1, that is, if the optimal scale factor is such that the distribution must be narrowed to fit a Gaussian, the distribution is broader than a Gaussian and we conclude that the errors may have been overestimated. Similarly, if S* < 1, the distribution is narrower than a Gaussian and we conclude that the errors may have been underestimated.
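The scale-factor search can be sketched with scipy's built-in KS test. This is our reading of the procedure, under the paper's convention that S > 1 narrows the distribution, so we compare N_σ/S against a unit Gaussian; the grid and function name are our own:

```python
import numpy as np
from scipy.stats import kstest

def best_scale(nsigma, s_grid=None):
    """Scan scale factors S, comparing the symmetrized error
    distribution, narrowed by 1/S, against a unit Gaussian; return
    the S* that maximizes the KS p-value, along with that p-value."""
    if s_grid is None:
        s_grid = np.linspace(0.05, 10.0, 400)
    sym = np.concatenate([nsigma, -nsigma])  # symmetrize about 0
    best_s, best_p = 1.0, -1.0
    for s in s_grid:
        p = kstest(sym / s, 'norm').pvalue
        if p > best_p:
            best_s, best_p = s, p
    return best_s, best_p

# sanity check: a distribution about twice as wide as a unit Gaussian
# should yield an optimal scale factor S* near 2
rng = np.random.default_rng(1)
s_star, p_star = best_scale(rng.normal(0.0, 2.0, size=500))
```

In practice one would pass the N_σi array from Equations (6) or (7) as `nsigma`; the synthetic sample here only demonstrates that the scan recovers the width of the input distribution.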

Estimating Systematic Error
All uncertainties computed thus far have corresponded to statistical errors. We perform an analysis of the systematic error present in the All 15 data set, and in the All 15 without averages subset, using the procedure outlined in Chen & Ratra (2011). Within each tracer subgroup there is statistical error resulting in a spread of measurements, and between the tracers there can be systematic error resulting from the different techniques and calibrations.
We construct a new data set consisting of the median of each tracer subgroup. We perform a median statistics analysis on this new data set to find the median of medians and its associated uncertainty. If we assume that these medians differ only systematically from each other, this uncertainty corresponds to the systematic uncertainty of the entire group of tracers.
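The median-of-medians construction can be sketched as follows (our illustration of the Chen & Ratra 2011 procedure, with hypothetical subgroups standing in for the tracers):

```python
import numpy as np
from scipy.stats import binom

def median_of_medians(groups, cl=0.6827):
    """Systematic-error estimate: take the median of each tracer
    subgroup, then the median of those medians; the binomial 68.27%
    range of the subgroup medians is read as the systematic error."""
    meds = np.sort([float(np.median(g)) for g in groups])
    n = meds.size
    # binomial probabilities for the true median of the medians
    cum = np.cumsum(binom.pmf(np.arange(n + 1), n, 0.5))
    lo = meds[np.searchsorted(cum, (1.0 - cl) / 2.0)]
    hi = meds[min(np.searchsorted(cum, (1.0 + cl) / 2.0), n - 1)]
    med = float(np.median(meds))
    return med, med - lo, hi - med  # median, sys_minus, sys_plus

# hypothetical tracer subgroups, for illustration only
groups = [[31.0, 31.1], [30.9, 31.0, 31.05], [31.2], [31.05, 31.1], [30.95]]
med, sys_minus, sys_plus = median_of_medians(groups)
```

With only a handful of subgroup medians, the 68.27% range is coarse, which is why this estimate is only attempted for the data sets with many tracers.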

RESULTS
We perform a median statistics analysis on deGB's compilation of 211 distance measurements. We calculate three central estimates and study the Gaussianity of the All 15 data set (comprising 211 measurements from 15 tracers), the All 15 without averages data set, the Best 5 data set (comprising 44 adjusted measurements from 5 tracers), and the Best 3 data set (comprising 24 adjusted measurements from 3 tracers). These data sets are outlined in Section 2. The central estimates and results of the KS test for these data are shown in Table 1. (Notes to Table 1: a. The p-values quoted are for the unscaled error distribution, S = 1. b. The 1.0 values in the p* column all lie within the range (0.995, 0.999). c. Some data lacked errors; these were set to the mean of the uncertainties for that tracer in order to perform the weighted mean analysis.)

We find all four data sets to be inconsistent with Gaussianity. Therefore, we argue that the median provides the best central estimate of each of these compilations. As the optimal scale factor S* > 1 for all of these data sets, the error distributions are all wider than a Gaussian. This could be due to an overestimation of some of the errors.
We report a median of 31.08 +0.04 −0.04 mag for the All 15 data set, 31.08 +0.05 −0.04 mag for the All 15 without averages data set, 31.02 +0.04 −0.05 mag for the Best 5 data set, and 31.03 +0.04 −0.01 mag for the Best 3 data set, where all errors are at a 68.27% confidence level. In the case of the All 15 and All 15 without averages data sets, there are enough tracers to estimate the systematic error in the median of the entire data set, as outlined in Section 3. The results of this analysis are shown in Table 2. The All 15 data set has the most measurements, enough to allow an estimate of the systematic uncertainty, and this also makes it the best protected against the effects of small-number statistics. Therefore, we choose the results of the analysis of this data set as our final reported value.

CONCLUSION
After analyzing the data sets compiled by deGB, we recommend a median statistics M87 distance modulus from the All 15 without averages data set of 31.08 +0.05 −0.04 (statistical) +0.04 −0.06 (systematic) mag at 68.27% significance. Combining the two errors in quadrature, we have 31.08 +0.06 −0.07 mag, or 16.4 ± 0.5 Mpc. This estimate is consistent with deGB's result of 31.03 ± 0.14 mag based on the Best 3 data set. We argue that our reported value is more reliable than deGB's mean statistics analysis since these data are not Gaussianly distributed. Using the larger data set also allowed us to estimate the systematic uncertainty and include it in our reported value.
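The error combination and unit conversion quoted above can be reproduced directly; the following is a simple check of that arithmetic using the numbers stated in the text:

```python
import math

# statistical and systematic (upper, lower) errors from the text, in mag
stat_plus, stat_minus = 0.05, 0.04
sys_plus, sys_minus = 0.04, 0.06

# combine statistical and systematic errors in quadrature
tot_plus = math.hypot(stat_plus, sys_plus)    # ~0.064 mag, rounds to 0.06
tot_minus = math.hypot(stat_minus, sys_minus) # ~0.072 mag, rounds to 0.07

# convert the distance modulus mu = 31.08 mag to a distance in Mpc:
# mu = 5 log10(d / 10 pc)  =>  d = 10**((mu - 25) / 5) Mpc
d_mpc = 10.0 ** ((31.08 - 25.0) / 5.0)        # ~16.4 Mpc
```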

Notes to Table 2: a. All Data are the 211 data points including the Tammann et al. (2000) measurement; the next 15 rows are the individual tracer types; only for tracer types with more than 10 measurements do we show their uncertainty in the last column; the Subgroup Medians row shows the results of a median statistics analysis of the preceding 15 medians, and its uncertainty is the reported systematic uncertainty. b. The shift is the difference between the tracer median and the All Data median in the first row. c. Error on the median of the previous column.

Table 1.
Central estimates and KS test results for the four data compilations.

The median of the All 15 data set is 31.08 +0.04 −0.04 (statistical) +0.04 −0.06 (systematic) mag, and the median of the All 15 without averages data set is 31.08 +0.05 −0.04 (statistical) +0.04 −0.06 (systematic) mag, at 68.27% significance. The systematic error due to the tracer differences is comparable to the statistical uncertainty on the median. Combining the two errors in quadrature, we get 31.08 +0.06 −0.07 mag, or 16.4 ± 0.5 Mpc, for both the All 15 and the All 15 without averages data sets.

Table 2.
All 15 and All 15 without averages systematic uncertainty analyses.