BAYESIAN TECHNIQUES FOR COMPARING TIME-DEPENDENT GRMHD SIMULATIONS TO VARIABLE EVENT HORIZON TELESCOPE OBSERVATIONS

Published 2016 November 29 © 2016. The American Astronomical Society. All rights reserved.
Citation: Junhan Kim et al. 2016 ApJ 832 156. DOI: 10.3847/0004-637X/832/2/156

ABSTRACT

The Event Horizon Telescope (EHT) is a millimeter-wavelength, very-long-baseline interferometry (VLBI) experiment that is capable of observing black holes with horizon-scale resolution. Early observations have revealed variable horizon-scale emission in the Galactic Center black hole, Sagittarius A* (Sgr A*). Comparing such observations to time-dependent general relativistic magnetohydrodynamic (GRMHD) simulations requires statistical tools that explicitly consider the variability in both the data and the models. We develop here a Bayesian method to compare time-resolved simulation images to variable VLBI data, in order to infer model parameters and perform model comparisons. We use mock EHT data based on GRMHD simulations to explore the robustness of this Bayesian method and contrast it to approaches that do not consider the effects of variability. We find that time-independent models lead to offset values of the inferred parameters with artificially reduced uncertainties. Moreover, neglecting the variability in the data and the models often leads to erroneous model selections. We finally apply our method to the early EHT data on Sgr A*.

1. INTRODUCTION

The Event Horizon Telescope3 (EHT) is a global array of telescopes that performs very-long-baseline interferometry (VLBI) at 1.3 and 0.8 mm wavelengths with unprecedented angular resolution. When the full array of stations, from Arizona to the South Pole and from Hawaii to France, is incorporated, the longest baselines (14 Gλ at 0.8 mm) will provide angular resolution better than 20 μas. The main targets of the EHT are the Galactic Center black hole Sagittarius A* (hereafter Sgr A*) and the nuclear black hole in the galaxy M87, at the center of the Virgo cluster of galaxies. For both of these sources, the apparent size of the event horizon is larger than the EHT resolution, millimeter emission is expected to originate very close to the horizon, and synchrotron opacity is not expected to obscure the innermost regions, thus allowing a clear and high-resolution view of their event horizons.

The images of the accretion flows around these black holes are expected to exhibit a shadow surrounded by a ring of emission (Bardeen 1973; Luminet 1979; Falcke et al. 2000), with a size and shape that depend primarily on the black hole mass and only very weakly on its spin (Johannsen & Psaltis 2010). The shadows of the nuclear black holes of the Milky Way and M87 have apparent diameters of 50 μas and 36 μas, respectively, assuming Kerr black holes (Johannsen et al. 2012). Observations of Sgr A* with three stations in Hawaii, California, and Arizona measured a source size that is comparable to the event horizon scale (Doeleman et al. 2008) and provided evidence for variability of its emission (Fish et al. 2011). The Sgr A* observations of Fish et al. (2011) reported not only an increase of the total flux but also a brightening of the correlated flux density on long baselines. Three-station VLBI observations of M87 detected structure that was identified with the base of the relativistic jet and had a size comparable to that of the black hole shadow (Doeleman et al. 2012). For the rest of this paper, we will focus on Sgr A*, although our results are very general and can be applied to interferometric observations of any target.

Sgr A* is the closest supermassive black hole to the Earth. It has long been observed at wavelengths ranging from radio to γ-rays to study accretion physics around the black hole (see Falcke & Markoff 2013 for a recent review). In the near-infrared, adaptive optics observations of orbiting stars provide the mass of and distance to the black hole (Ghez et al. 2008; Gillessen et al. 2009a, 2009b; Chatzopoulos et al. 2015). The angular size of Sgr A*, set by the ratio of its mass to its distance, is more accurately measured than that of any other black hole and is also the largest.

Observations at millimeter wavelengths provide the opportunity to understand the accretion process near the event horizon of Sgr A*. At these wavelengths, the strong interstellar scattering that greatly blurs the image at longer wavelengths (e.g., Doeleman et al. 2001; Bower et al. 2006) is reduced and the spectrum of Sgr A* implies a transition from optically thick to thin emission, providing a clear view of the event horizon (e.g., Narayan et al. 1998; Özel et al. 2000; Marrone et al. 2006; Bower et al. 2015). Sgr A* is observed to vary on intraday timescales at nearly all wavelengths, including millimeter wavelengths (e.g., Marrone et al. 2008; Yusef-Zadeh et al. 2009; Dexter et al. 2014), which suggests a highly dynamic environment around the black hole.

There have been many theoretical efforts to simulate accretion onto Sgr A* (see Yuan & Narayan 2014 for a recent review). These calculations have generally focused on reproducing its spectrum and measured emission size through both semi-analytic stationary models and time-dependent magnetohydrodynamic (MHD) and general relativistic MHD (GRMHD) simulations. Time-independent models, including Özel et al. (2000), Yuan et al. (2003), and Huang et al. (2007), have used variations in the magnetic field, electron density, and temperature profiles computed from hydrodynamic models to match observational constraints. Incorporating GR ray-tracing, these models can also be used to constrain properties such as the black hole spin and orientation (e.g., Broderick et al. 2009, 2011). Time-resolved simulations of the accretion flow can similarly be adjusted through a variety of parameters, prescriptions, and initial conditions to better match existing observations. Such simulations have the potential to capture additional observational details that are not encoded in the stationary simulations, including polarization properties (e.g., Goldston et al. 2005; Shcherbakov et al. 2012; Gold et al. 2016), and especially the variability of the total emission and source structure. Several (GR)MHD simulations that can match numerous properties of Sgr A* (e.g., Chan et al. 2009; Mościbrodzka et al. 2009; Dexter et al. 2010, 2012; Shcherbakov et al. 2012; Mościbrodzka & Falcke 2013; Chan et al. 2015a, 2015b) now exist in the literature.

The prospect of millimeter-wavelength EHT observations that can resolve Sgr A* in both space and time, alongside simulations and existing observations that show temporal changes, indicates a need to compare data and models in a time-dependent manner. This should help us constrain the GRMHD models and the physical parameters that describe the accretion flow around Sgr A*. However, there are so far no tools with which time-dependent data can be analyzed in the context of time-variable models. For example, given the likely changes in the structure of the emission region and the stochastic nature of MHD turbulence, a snapshot-by-snapshot comparison is neither feasible nor meaningful. In this paper, we develop a statistical method to compare interferometric measurements with simulated mock data from time-dependent GRMHD models. In particular, we employ Bayesian tools to compute the posterior likelihood of model parameters given the observational data. We apply this to several simulations and verify the method using mock data. Then, we use EHT observational data from 2007 and 2009 to demonstrate the application of the method on Sgr A*.

2. BAYESIAN DATA ANALYSIS

In this section, we develop a general Bayesian framework that allows us to compare an ensemble of data points, such as VLBI visibilities, to an ensemble of simulated data predicted by a theoretical model, when both the data and the model exhibit intrinsic variability.

2.1. Formalism

We follow the standard Bayesian approach to calculate the posterior likelihood $P({\boldsymbol{w}}\,|\,\mathrm{data})$ that a theoretical model described by a vector of parameters ${\boldsymbol{w}}$ is in agreement with a set of data, i.e.,

$$P({\boldsymbol{w}}\,|\,\mathrm{data}) = C\,P_{\mathrm{pri}}({\boldsymbol{w}})\,P(\mathrm{data}\,|\,{\boldsymbol{w}}). \qquad (1)$$

Here, Ppri($w$) is the prior likelihood over the model parameters and we calculate the proportionality constant C such that the integral of the posterior likelihood over the parameter space is equal to unity.

The vector of parameters for the GRMHD models that will be compared to EHT observations includes the properties of the black hole spacetime (its mass, spin, and any deviations from the Kerr solution), geometric properties of the orientation of the observer (the orientation and inclination of the spin axis on the observer's sky), and parameters related to the microphysical properties of the plasma (e.g., the ratio of electron-to-ion temperatures in phenomenological models). Moreover, the results of the GRMHD simulations will need to be convolved with a model that describes the image blurring caused by interstellar scattering, which itself will have a number of parameters. Even though we explore below the characteristics of this Bayesian framework using a simplified two-parameter comparison between simulations and data, our methods are general and the full parameter space will need to be considered and explored when fitting actual EHT data.

Hereafter, we will consider a small set for the black hole spin, which has little effect on the overall properties of the black hole image (see Johannsen & Psaltis 2010; Broderick et al. 2011), and fix its inclination with respect to the observer, which can be constrained significantly by the overall size of the 1.3 mm image of the black hole (Psaltis et al. 2015). We will also consider discrete models for the plasma thermodynamics (see Chan et al. 2015b), with parameters chosen such that the simulations reproduce the broadband spectral properties of Sgr A*. This set of choices leaves us with two parameters that need to be obtained by comparing models to interferometric data, i.e., an overall multiplication factor for the flux density, F0, and the position angle ξ of the projection of the black hole spin angular momentum onto the sky plane, measured in degrees east of north (e.g., Broderick et al. 2011).4 Under these assumptions, Bayes' theorem becomes

$$P(\xi,F_{0}\,|\,\mathrm{data}) = C\,P_{\mathrm{pri}}(\xi)\,P_{\mathrm{pri}}(F_{0})\,P(\mathrm{data}\,|\,\xi,F_{0}). \qquad (2)$$

We assume flat prior likelihoods over the two model parameters Ppri(ξ) and Ppri(F0), with −180° ≤ ξ < 180° and Fmin ≤ F0 ≤ Fmax, with the range of the overall flux normalization to be specified later.

The last term in Equation (2), $P(\mathrm{data}| \xi ,{F}_{0})$, measures the likelihood that a particular set of data is consistent with a set of model parameters. For the case of EHT, data and simulations need to be compared directly as interferometric observables. A fundamental interferometric observable is the visibility, which is a sample of the Fourier transform of the sky image, I(α, β), at discrete spatial frequencies. The visibility, ${ \mathcal V }(u,v)$, is defined as

$${\mathcal V}(u,v) = \int\!\!\int I(\alpha,\beta)\,e^{-2\pi i(u\alpha + v\beta)}\,d\alpha\,d\beta \qquad (3)$$

for a projected antenna separation of (u, v) wavelengths within the plane perpendicular to the line of sight.
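For illustration, Equation (3) can be evaluated directly (without an FFT) at a single (u, v) point. The following sketch is ours, not the code used in this work; the pixel scale and the Gaussian test source are arbitrary placeholders.

```python
import numpy as np

def visibility(image, dalpha, dbeta, u, v):
    """Direct evaluation of Equation (3): sample the complex visibility
    V(u, v) of a sky image I(alpha, beta).

    image         : 2D array of intensities (e.g., Jy/sr)
    dalpha, dbeta : pixel sizes in radians
    u, v          : projected baseline components in wavelengths
    """
    ny, nx = image.shape
    alpha = (np.arange(nx) - nx // 2) * dalpha     # pixel centers, radians
    beta = (np.arange(ny) - ny // 2) * dbeta
    A, B = np.meshgrid(alpha, beta)
    phase = np.exp(-2j * np.pi * (u * A + v * B))
    return np.sum(image * phase) * dalpha * dbeta

# Example: a circular Gaussian source normalized to 1 Jy total flux
uas = np.pi / (180 * 3600 * 1e6)                   # one microarcsecond in radians
dalpha = dbeta = 2 * uas
x = (np.arange(256) - 128) * dalpha
X, Y = np.meshgrid(x, x)
img = np.exp(-(X**2 + Y**2) / (2 * (20 * uas)**2))
img /= img.sum() * dalpha * dbeta
print(abs(visibility(img, dalpha, dbeta, u=3.0e9, v=0.0)))   # ~0.19 on a 3 Glambda baseline
```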

At any given observing epoch, the EHT will generate a set of simultaneous visibility amplitudes at many baselines as well as sets of closure phases on baseline triangles and closure amplitudes. For the purposes of this initial investigation, and to simplify our notation, we will assume that we only have observations of visibility amplitudes and that the measurement uncertainties are independent of one another. In other words, we will assume that

$$P(\mathrm{data}\,|\,\xi,F_{0}) = \prod_{i=1}^{M}\prod_{j=1}^{N} P_{ij}(\mathrm{data}\,|\,\xi,F_{0}), \qquad (4)$$

where i = 1, 2,..., M and j = 1, 2,..., N denote baselines and epochs, respectively.

Each observation is characterized by a likelihood ${P}^{\mathrm{obs}}({ \mathcal V };{{ \mathcal V }}_{{ij}},{\sigma }_{{ij}})$, centered at ${{ \mathcal V }}_{{ij}}$ and with a dispersion of σij. If, for any pair of model parameters (ξ, F0), the model made a single prediction for the visibility amplitude for baseline i and epoch j, say ${{ \mathcal V }}_{{ij}}^{\mathrm{sim}}(\xi ,{F}_{0})$, then the single-baseline, single-epoch likelihood in Equation (4) would simply be equal to

$$P_{ij}(\mathrm{data}\,|\,\xi,F_{0}) = P^{\mathrm{obs}}\!\left[{\mathcal V}_{ij}^{\mathrm{sim}}(\xi,F_{0});\,{\mathcal V}_{ij},\sigma_{ij}\right]. \qquad (5)$$

However, GRMHD models are highly variable (e.g., Chan et al. 2015a) and, for each baseline, they predict a range of visibility amplitudes, with a likelihood that we denote by ${P}_{{ij}}^{\mathrm{sim}}({ \mathcal V };\xi ,{F}_{0})$. In this case, we write the posterior likelihood that a particular observation is consistent with the model predictions as

$$P_{ij}(\mathrm{data}\,|\,\xi,F_{0}) = \int d{\mathcal V}\;P^{\mathrm{obs}}({\mathcal V};{\mathcal V}_{ij},\sigma_{ij})\,P_{ij}^{\mathrm{sim}}({\mathcal V};\xi,F_{0}). \qquad (6)$$

Our goal is to find the set of model parameters (ξ and F0 in the example here) that maximizes the posterior likelihood in Equation (4), given a set of observations. This Bayesian analysis process, presented in Figure 1 as a flow chart, selects models for which the distribution of predicted visibilities in each baseline matches the distribution of observed visibilities.

Figure 1. Flow chart of the algorithm to test the performance of the Bayesian data analysis.

The first factor in Equation (6), ${P}^{\mathrm{obs}}({ \mathcal V };{{ \mathcal V }}_{{ij}},{\sigma }_{{ij}})$, is well understood. The real and imaginary components of the complex visibility individually follow Gaussian distributions. The visibility amplitude is the magnitude of the complex vector described by those variables, and its posterior likelihood is not Gaussian. The appropriate posterior likelihood for the visibility amplitude, known as a Rice distribution, is

$$P^{\mathrm{obs}}({\mathcal V};{\mathcal V}_{ij},\sigma_{ij}) = \frac{{\mathcal V}}{\sigma_{ij}^{2}}\,\exp\!\left[-\frac{{\mathcal V}^{2}+{\mathcal V}_{ij}^{2}}{2\sigma_{ij}^{2}}\right] I_{0}\!\left(\frac{{\mathcal V}{\mathcal V}_{ij}}{\sigma_{ij}^{2}}\right), \qquad (7)$$

where I0 is the zeroth-order modified Bessel function of the first kind. This likelihood approaches a Gaussian for large signal-to-noise ratios (see Chapter 6 of Thompson et al. 2001).
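For concreteness, the Rice likelihood and its Gaussian limit can be written as in the sketch below (our notation; the scipy Bessel routine is used purely for numerical stability):

```python
import numpy as np
from scipy.special import i0e

def rice_pdf(V, V_obs, sigma):
    """Rice likelihood of Equation (7) for an amplitude V, given a measured
    amplitude V_obs with thermal noise sigma.  Written with the exponentially
    scaled Bessel function i0e to avoid overflow at high signal-to-noise."""
    return (V / sigma**2) * np.exp(-(V - V_obs)**2 / (2 * sigma**2)) \
        * i0e(V * V_obs / sigma**2)

# The Rice pdf approaches a Gaussian as the signal-to-noise ratio grows:
V = np.linspace(0.0, 2.0, 2001)
for snr in (1.0, 5.0, 20.0):
    sigma = 1.0 / snr
    gauss = np.exp(-(V - 1.0)**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)
    print(snr, np.max(np.abs(rice_pdf(V, 1.0, sigma) - gauss)) / gauss.max())
```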

In order to simplify the calculation of the integral in Equation (6), we make use of the fact that the distribution of simulated visibility amplitudes ${P}_{{ij}}^{\mathrm{sim}}({ \mathcal V };\xi ,{F}_{0})$ consists of discrete values rather than a continuous distribution, as it is obtained from snapshot images. For T snapshots,

$$P_{ij}^{\mathrm{sim}}({\mathcal V};\xi,F_{0}) = \frac{1}{T}\sum_{t=1}^{T}\delta\!\left[{\mathcal V}-{\mathcal V}_{ij}^{\mathrm{sim}}(t;\xi,F_{0})\right], \qquad (8)$$

where ${{ \mathcal V }}_{{ij}}^{\mathrm{sim}}(t;\xi ,{F}_{0})$ is the visibility amplitude sampled at (uij, vij) coordinates from the tth snapshot with the given set of parameters (ξ, F0). Therefore, Equation (6) simplifies to

$$P_{ij}(\mathrm{data}\,|\,\xi,F_{0}) = \frac{1}{T}\sum_{t=1}^{T}P^{\mathrm{obs}}\!\left[{\mathcal V}_{ij}^{\mathrm{sim}}(t;\xi,F_{0});\,{\mathcal V}_{ij},\sigma_{ij}\right] \qquad (9)$$

for the likelihood ${P}_{{ij}}^{\mathrm{sim}}({ \mathcal V };\xi ,{F}_{0})$, with given parameters.
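A minimal sketch of Equations (4) and (9) in code, assuming the snapshot visibility amplitudes have already been computed for each baseline and epoch; rice_pdf is the Rice likelihood from the sketch above, and the array shapes are our own convention:

```python
import numpy as np
from scipy.special import i0e

def rice_pdf(V, V_obs, sigma):          # Equation (7), as in the earlier sketch
    return (V / sigma**2) * np.exp(-(V - V_obs)**2 / (2 * sigma**2)) \
        * i0e(V * V_obs / sigma**2)

def likelihood_ij(V_obs, sigma, V_sim_snapshots):
    """Equation (9): average the observed-amplitude likelihood over the T
    snapshot amplitudes predicted by the model at (u_ij, v_ij)."""
    return np.mean(rice_pdf(np.asarray(V_sim_snapshots), V_obs, sigma))

def log_likelihood(V_obs, sigma, V_sim):
    """Equation (4): product over baselines i and epochs j, accumulated in
    log space.  V_obs and sigma have shape (M, N); V_sim has shape (M, N, T)."""
    logL = 0.0
    for i in range(V_obs.shape[0]):
        for j in range(V_obs.shape[1]):
            logL += np.log(likelihood_ij(V_obs[i, j], sigma[i, j], V_sim[i, j]))
    return logL
```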

In the following set of examples, we calculate the posterior likelihoods over the model parameters using Equations (2), (4), and (9). In real applications, however, observations on different baselines are performed simultaneously, so we must consider a joint probability distribution in visibility space. For example, when an observation is carried out with three stations (or equivalently, three baselines), the posterior likelihood for the observations, ${P}^{\mathrm{obs}}({ \mathcal V })$, becomes a multivariate Rician. Moreover, we need to consider the posterior likelihood, ${P}^{\mathrm{sim}}({ \mathcal V })$, that the model predicts a particular combination of simultaneous visibilities on the same baselines.

Finally, we can trivially incorporate into our Bayesian inference other interferometric observables, such as closure phases and closure amplitudes. The closure phase is the sum of visibility phases around the triangle of baselines formed by any set of three stations. It is independent of instrumental and atmospheric phase fluctuations and depends only on the visibility phase of the source. If we have ${\mathcal N} \geqslant 3$ stations, ${}_{{\mathcal N}-1}C_{2} = ({\mathcal N}-1)({\mathcal N}-2)/2$ independent closure phases can be measured. The posterior likelihood for closure phase data becomes

$$P(\mathrm{data}\,|\,\xi,F_{0}) = \prod_{l}\prod_{j}\int d\Phi\;P^{\mathrm{obs}}(\Phi;\Phi_{lj},\sigma_{lj})\,P_{lj}^{\mathrm{sim}}(\Phi;\xi,F_{0}) \qquad (10)$$

for $l = 1, 2, \ldots, {}_{{\mathcal N}-1}C_{2}$ sets of telescopes (closure phase "triangles") and j = 1, 2,..., N observations. Here $P^{\mathrm{obs}}(\Phi; \Phi_{lj}, \sigma_{lj})$ and $P_{lj}^{\mathrm{sim}}(\Phi; \xi, F_{0})$ are the likelihoods for closure phases from the observations and the simulations, similar to ${P}^{\mathrm{obs}}({ \mathcal V };{{ \mathcal V }}_{{ij}},{\sigma }_{{ij}})$ and ${P}_{{ij}}^{\mathrm{sim}}({ \mathcal V };\xi ,{F}_{0})$ in Equation (6).
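As a concrete example, the count of independent closure phases and one conventional choice of independent triangles (all triangles sharing a common reference station) can be enumerated as follows; the station list is illustrative:

```python
from itertools import combinations

def independent_triangles(stations):
    """Return (N-1)(N-2)/2 independent closure-phase triangles by keeping
    the first station in the list as the common reference."""
    ref, others = stations[0], stations[1:]
    return [(ref, a, b) for a, b in combinations(others, 2)]

triangles = independent_triangles(["SMA", "CARMA", "SMT", "LMT", "ALMA", "SPT"])
print(len(triangles))      # (6 - 1)(6 - 2) / 2 = 10
print(triangles[:3])
```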

2.2. Model Comparison

In addition to inferring the most likely parameters of a given model from observations, we are also interested in comparing the ability of different models to describe the observed distributions of visibility amplitudes and closure quantities.

Information criteria approaches such as the Akaike Information Criterion and the Bayesian Information Criterion (BIC) have been adopted elsewhere to evaluate the quality of astrophysical models (e.g., Liddle 2007). Broderick et al. (2011) applied these criteria in evaluating the relative success of time-independent models of Sgr A* in fitting EHT data. The BIC is defined as

$$\mathrm{BIC} \equiv -2\ln{\mathcal L}_{\max} + k\ln N, \qquad (11)$$

where ${{ \mathcal L }}_{\max }$ is the maximum likelihood over the parameters within the model, k is the number of free parameters, and N is the number of data points. The definition implies that a smaller BIC value indicates a better fit to the data. However, criteria such as the BIC consider only the maximum posterior likelihoods and do not account for the relative extent of the volumes in parameter space within which each model is consistent with the data.
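For reference, Equation (11) and the ΔBIC values quoted in the tables below can be computed as in this short sketch; the log-likelihood values are hypothetical placeholders:

```python
import numpy as np

def bic(log_L_max, k, N):
    """Bayesian Information Criterion of Equation (11)."""
    return -2.0 * log_L_max + k * np.log(N)

# Hypothetical maximum log-likelihoods of two competing models,
# each with k = 2 free parameters (xi, F0), fit to N = 69 amplitudes.
delta_bic = bic(-134.0, k=2, N=69) - bic(-120.0, k=2, N=69)
print(delta_bic)    # 28: by the usual scale, decisive evidence against the second model
```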

In the context of our work, we will perform model comparisons in terms of an appropriately defined Bayesian evidence. If we consider each model as a single point in a discrete parameter space, then, given the data, we can use Bayes' theorem to calculate the posterior likelihood for each model as

$$P(m,{\boldsymbol{w}}_{m}\,|\,\mathrm{data}) \propto P_{\mathrm{pri}}(m,{\boldsymbol{w}}_{m})\,P(\mathrm{data}\,|\,m,{\boldsymbol{w}}_{m}). \qquad (12)$$

Here m is the discrete index specifying each model {${{ \mathcal M }}_{m}$}, which is governed by a set of parameters ${{\boldsymbol{w}}}_{m}$. If we denote by Ppri(m) the prior likelihood for each model and by Ppri,m(${{\boldsymbol{w}}}_{m}$) the prior likelihoods for each set of parameters within model m, then we can write

$$P(m,{\boldsymbol{w}}_{m}\,|\,\mathrm{data}) \propto P_{\mathrm{pri}}(m)\,P_{\mathrm{pri},m}({\boldsymbol{w}}_{m})\,P(\mathrm{data}\,|\,m,{\boldsymbol{w}}_{m}). \qquad (13)$$

The last term in Equation (13) is the posterior Bayesian likelihood of the data, given a model m and a set of model parameters, e.g., Equation (4).

In order to calculate the Bayesian evidence for model m, we marginalize Equation (13) over the space of model parameters, i.e.,

$${\mathcal L}(m\,|\,\mathrm{data}) \equiv \int P(m,{\boldsymbol{w}}_{m}\,|\,\mathrm{data})\,d{\boldsymbol{w}}_{m}. \qquad (14)$$

Assuming a flat prior over models and the same prior likelihood over all model parameters for each model, this integral simplifies to

$${\mathcal L}(m\,|\,\mathrm{data}) = C^{\prime}\int P(\mathrm{data}\,|\,m,{\boldsymbol{w}}_{m})\,d{\boldsymbol{w}}_{m}, \qquad (15)$$

where C' is an appropriate normalization constant.

The integrated likelihood ${ \mathcal L }(m| \mathrm{data})$ provides our model selection statistic. As is standard practice in Bayesian inference, we use the Bayesian evidence only for comparisons among models and not to assign a posterior likelihood to any model individually.
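With flat priors and the likelihood already evaluated on a discrete (ξ, F0) grid, Equation (15) reduces to a sum over the grid. A minimal sketch, in which the grid spacings and the log-sum-exp trick are our own choices:

```python
import numpy as np

def log_evidence(logL_grid, d_xi, d_F0):
    """Equation (15): marginalize P(data | m, xi, F0) over a flat prior grid.
    logL_grid has shape (n_xi, n_F0).  The overall constant C' is omitted,
    so only differences between models (i.e., Bayes factors) are meaningful."""
    logL_max = logL_grid.max()
    return logL_max + np.log(np.sum(np.exp(logL_grid - logL_max)) * d_xi * d_F0)

# Bayes factor between two models evaluated on the same grid:
# K = exp(log_evidence(logL_A, d_xi, d_F0) - log_evidence(logL_B, d_xi, d_F0))
```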

3. LIKELIHOOD WITH MOCK DATA

We now turn our attention to the application of the statistical method described in the previous section to mock and real EHT data. To this end, we make use of a set of previously studied GRMHD simulations, labeled A to E, for which time-dependent spectra and images were computed and fit to the time-averaged properties of Sgr A* (Chan et al. 2015b).

3.1. GRMHD Simulations

In a recent study, Chan et al. (2015a) investigated the variability and millimeter/infrared flare characteristics of GRMHD simulations. The simulation of the accretion flow was performed with the three-dimensional GRMHD code HARM (Gammie et al. 2003; McKinney 2006; McKinney & Blandford 2009; Narayan et al. 2012; Sądowski et al. 2013) and the ray-tracing of photon trajectories was performed with the fast GPU-based algorithm GRay (Chan et al. 2013). The simulated Sgr A* images with horizon-scale resolution and their spectra were fit to both the observed broadband spectra and the overall 1.3 mm source size determined by EHT observations (Chan et al. 2015b).

Chan et al. (2015a, 2015b) simulated high-cadence images of the accretion flow around Sgr A* for each model. The simulation results were recorded with a timestep of 10 GM c−3, which corresponds to 212 s for the mass of Sgr A*. Each snapshot contains four levels of field of view: 16M × 16M, 64M × 64M, 256M × 256M, and 1024M × 1024M, each containing 512 × 512 pixels. Here 1 GM c−2 corresponds to 5.1 μas for the distance to Sgr A*. We use a composite of the multi-level resolutions for our analysis in order to encompass all the emission near and away from the black hole and to generate interferometric visibilities. It is reasonable to limit our images to 1024M × 1024M, because the corresponding 10.2 μas pixel scale is considerably finer than the highest spatial frequency (58.9 μas fringe spacing) sampled by the EHT data we use as a comparison.
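The quoted conversions follow from the mass and distance of Sgr A*; the short check below uses representative values (M ≈ 4.3 × 10^6 M☉ and D ≈ 8.3 kpc, our assumption) that reproduce the numbers in the text:

```python
import numpy as np

G, c = 6.674e-11, 2.998e8                 # SI units
M_sun, kpc = 1.989e30, 3.086e19           # kg, m
M, D = 4.3e6 * M_sun, 8.3 * kpc           # representative Sgr A* mass and distance

GM_c3 = G * M / c**3                      # gravitational timescale, seconds
GM_c2 = G * M / c**2                      # gravitational radius, meters
rad_to_uas = 180.0 / np.pi * 3600 * 1e6

print(10 * GM_c3)                         # ~212 s (snapshot cadence)
print(GM_c2 / D * rad_to_uas)             # ~5.1 uas per GM c^-2
```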

The GRMHD models of Chan et al. (2015a, 2015b, see also Narayan et al. 2012; Sądowski et al. 2013) were categorized into two classes based on their initial magnetic field configurations: Standard And Normal Evolution (SANE, disk-dominated) and Magnetically Arrested Disk (MAD, jet-dominated) models, which use multi-loop and single-loop initial magnetic fields, respectively. The simulated images at 1.3 mm for these two classes of models show markedly different features: in SANE models, the emission originates from a crescent-like region in the disk, whereas in MAD models it is concentrated at the footprints of the outflowing material. Chan et al. (2015a, 2015b) also explored the dependence of the predictions of the GRMHD models on the black hole spin parameter a and on the way in which the electron temperature was prescribed.

In some of their models, the electron temperature in the funnel region of the accretion flow was fixed to be constant (constant Te,funnel). In other models, the ratio of the electron and ion temperatures in the funnel was fixed (constant θfunnel). In both cases, the electron and ion temperatures have a constant ratio in the disk. Among all the configurations they explored, they narrowed the possibilities down to five models that account for both the broadband spectrum of Sgr A* and its overall image size at 1.3 mm:

  • (1)  
    Model A: a = 0.7, SANE, constant Te,funnel.
  • (2)  
    Model B: a = 0.9, SANE, constant Te,funnel.
  • (3)  
    Model C: a = 0.0, MAD, constant θfunnel.
  • (4)  
    Model D: a = 0.9, MAD, constant Te,funnel.
  • (5)  
    Model E: a = 0.9, MAD, constant θfunnel.

All models show variability over the entire length of the simulation, i.e., over ∼10,000 GM c−3 (∼60 hr). Figure 2 shows the 1.3 mm light curves for the simulations. Overall, the SANE models produce more variability than the MAD models.

Figure 2. Light curves of Models A to E at 1.3 mm. The total flux of each snapshot is plotted with a 10 GM c−3 timestep, over a 10,000 GM c−3 duration. The combination GM c−3 corresponds to 21.2 s for the mass of Sgr A*. SANE models not only show larger variability than MAD models, but also often display sudden increases of the total intensity, i.e., flares.

We generated visibilities from the simulations using our own VLBI simulation code. The algorithm calculates (u, v) coordinates from the observing time and the celestial coordinates of the source, performs a Fourier transform of the input image, and samples complex visibilities according to the telescope array setup. We tested the results of this code against the MIT Array Performance Simulator5 and found excellent agreement between the simulated visibilities.
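The essential steps of such a simulator can be sketched as follows. This outline is ours (not the code used in the paper): the baseline is specified by its equatorial (X, Y, Z) components, the source only by its declination and hour-angle coverage, and the visibilities are read off the FFT grid with a nearest-pixel lookup rather than proper interpolation.

```python
import numpy as np

def uv_track(bx, by, bz, dec, hour_angles, lam):
    """Project an equatorial baseline (bx, by, bz, meters) onto the (u, v)
    plane, in wavelengths, for a source at declination dec (radians)
    observed at the given hour angles (radians)."""
    h = np.asarray(hour_angles)
    u = (np.sin(h) * bx + np.cos(h) * by) / lam
    v = (-np.sin(dec) * np.cos(h) * bx + np.sin(dec) * np.sin(h) * by
         + np.cos(dec) * bz) / lam
    return u, v

def sample_visibilities(image, pixel_rad, u, v):
    """Fourier transform a snapshot image and read off complex visibilities
    at the requested (u, v) points (nearest grid cell, for simplicity)."""
    n = image.shape[0]
    vis = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(image))) * pixel_rad**2
    freqs = np.fft.fftshift(np.fft.fftfreq(n, d=pixel_rad))     # wavelengths
    iu = np.argmin(np.abs(freqs[None, :] - np.asarray(u)[:, None]), axis=1)
    iv = np.argmin(np.abs(freqs[None, :] - np.asarray(v)[:, None]), axis=1)
    return vis[iv, iu]
```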

Figure 3 shows the range of predicted visibility amplitudes for the five GRMHD models. The SANE models show more variability in the visibility space than the MAD models, just as they do in total intensity (Figure 2). This variability is explored in more detail in the companion paper, Medeiros et al. (2016).

Figure 3. Sample VLBI visibility amplitudes of Models A to E. We generated the simulated visibility amplitudes using three stations in Hawaii, California, and Arizona. We produced the visibility amplitudes from the GRMHD images, using the (u, v) coordinates of the EHT observations in 2007 and 2009, without rotation (ξ = 0°) or flux scaling (F0 = 1). For all (u, v) coordinate pairs, we plot the mean visibility amplitudes of 1024 snapshots as a solid line, as well as their ±1σ range as a shaded region.

In the following sections, we describe tests of our Bayesian analysis using mock data from these simulations. These tests demonstrate that our statistical method successfully recovers model parameters (Section 3.2) and can faithfully distinguish between models (Section 3.5).

3.2. Parameter Estimation

In our first test we investigated the performance of our statistical method in inferring model parameters. We constructed the test as follows:

  • (1)  
    We assumed a fiducial set of model parameters (ξ, F0) = (−30°, 1).
  • (2)  
    We generated data using three baselines with three stations: Hawaii—Arizona, Hawaii—California, and Arizona—California.
  • (3)  
    We sampled (u, v) coordinates during the period when three stations can perform simultaneous observations of Sgr A*.
  • (4)  
We generated visibilities using model D, which is a MAD model.
  • (5)  
    We randomly selected visibility amplitudes from the distribution of simulated visibilities, ${P}_{{ij}}^{\mathrm{sim}}({ \mathcal V };-30^\circ ,1)$, for each baseline. Visibilities were taken from the same simulation frame for all three baselines. We then added random Gaussian noise to the complex visibilities to produce mock observational data from the visibility amplitude samples.
  • (6)  
    We computed the posterior likelihood $P(\mathrm{data}| \xi ,{F}_{0})$ using simultaneous observations on three baselines, as described in Equation (4). We considered orientation ξ in the range from −90° to 90°, as visibility amplitudes without phase information are degenerate under rotations of 180°. We repeated the calculation for 5000 different realizations of the mock data.

We grouped the posterior likelihoods into 500 sets of data, where each set contains N = 10 observations, and averaged the likelihoods within each set. For every set of data, we found the orientation and flux scale at which the maximum likelihood occurs. The distributions of these maximum-likelihood parameters for our input simulation are shown in Figures 4(a) and (b). The red vertical lines mark the model parameters (ξ, F0) that were assumed when generating the mock visibility data. The histograms indicate that the orientations and flux scales of the highest likelihoods of all data sets are distributed around the assumed parameters. We do not see any bias in the inferred parameters.
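The steps above can be condensed into the following sketch. The arrays of simulated snapshot visibilities and the parameter grid are placeholders, rice_pdf is the Rice likelihood from Section 2, and none of the variable names are taken from the paper.

```python
import numpy as np
from scipy.special import i0e

rng = np.random.default_rng(42)

def rice_pdf(V, V_obs, sigma):           # Equation (7), as before
    return (V / sigma**2) * np.exp(-(V - V_obs)**2 / (2 * sigma**2)) \
        * i0e(V * V_obs / sigma**2)

def mock_amplitudes(V_sim_complex, sigma):
    """Step (5): pick one snapshot (the same frame for all baselines) and add
    thermal Gaussian noise to the complex visibilities before taking amplitudes.
    V_sim_complex has shape (M_baselines, T_snapshots); sigma has shape (M,)."""
    t = rng.integers(V_sim_complex.shape[1])
    m = V_sim_complex.shape[0]
    noise = sigma * (rng.normal(size=m) + 1j * rng.normal(size=m))
    return np.abs(V_sim_complex[:, t] + noise)

def best_fit(V_obs, sigma, V_sim_amp, xi_grid, F0_grid):
    """Step (6): evaluate the likelihood on a (xi, F0) grid and return the
    maximum-likelihood pair.  V_sim_amp[a][b] holds the (M, T) snapshot
    amplitudes predicted for parameters (xi_grid[a], F0_grid[b])."""
    logL = np.full((len(xi_grid), len(F0_grid)), -np.inf)
    for a in range(len(xi_grid)):
        for b in range(len(F0_grid)):
            per_baseline = [np.mean(rice_pdf(V_sim_amp[a][b][i], V_obs[i], sigma[i]))
                            for i in range(len(V_obs))]
            logL[a, b] = np.sum(np.log(per_baseline))
    a, b = np.unravel_index(np.argmax(logL), logL.shape)
    return xi_grid[a], F0_grid[b], logL
```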

Figure 4. Parameter estimation test with mock data generated using model D and assuming a black hole spin orientation of ξ = −30° and an overall flux normalization of F0 = 1. The first two panels show the one-dimensional likelihoods for 500 realizations of observations, each containing 10 epochs, over (a) the orientation (ξ) and (b) the flux scale (F0), evaluated at cross sections of the two-dimensional parameter space that pass through the point of maximum likelihood. Panels (c) and (d) show the averaged likelihoods in parameter space for (c) 500 realizations each containing 10 epochs and (d) 200 realizations each containing 25 epochs. Maximum likelihoods occur at ξ = −30° and F0 = 1 for both cases. Contours indicate the 68.3%, 95.5%, and 99.7% credible regions. Given a realistic observational setup, our Bayesian method recovers the assumed parameters with no biases. Panel (e) is similar to (d), but with the more complete coverage of a six-station EHT (see Section 3.3).

Figure 4(c) shows the averaged posterior likelihoods of 500 sets of data. The red dotted lines in the parameter space again mark the combination of the assumed model parameters (ξ, F0) = (−30°, 1). The lines cross very near the peak of the posterior likelihood, and their intersection is located inside the 68% credible region of the likelihood. Figure 4(d) shows the averaged likelihoods when the 5000 mock visibilities are grouped into 200 sets, each including N = 25 observations. Compiling a larger number of observations causes the allowed region of the parameter space to contract, as expected.

3.3. Effect of Future EHT Stations

In the near future, more stations will be incorporated into the EHT, such as the Atacama Large Millimeter/submillimeter Array (ALMA) and the South Pole Telescope (SPT). We performed a mock observation test to explore the effect of future EHT stations. The test is similar to that of the previous section, but assumes 15 baselines formed by 6 stations. The stations are the Submillimeter Array (SMA) in Hawaii, the Combined Array for Research in Millimeter-wave Astronomy (CARMA) in California, the Submillimeter Telescope (SMT) in Arizona, the Large Millimeter Telescope (LMT) in Mexico, ALMA in Chile, and the SPT at the South Pole. The sampled (u, v) coordinate sets include non-simultaneous observations, because considering only simultaneous detections with six stations greatly limits the observing period. Visibility amplitude errors are estimated from the baseline sensitivity, which is proportional to the geometric mean of the System Equivalent Flux Densities (SEFDs) of the antennas.6 We assumed a 10 s coherent integration time and a 2 GHz bandwidth for the sensitivity.
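For reference, the baseline thermal noise follows from the geometric mean of the station SEFDs, the bandwidth, and the coherent integration time. A sketch using the SEFD values listed in the footnotes (quantization-efficiency factors are ignored here):

```python
import numpy as np

def baseline_sigma(sefd1, sefd2, bandwidth_hz, t_int_s):
    """Thermal noise (Jy) on one baseline for a single coherent integration."""
    return np.sqrt(sefd1 * sefd2 / (2.0 * bandwidth_hz * t_int_s))

sefd = {"SMA": 4000, "CARMA": 10000, "SMT": 11000,     # Jy, from the footnotes
        "LMT": 1400, "ALMA": 100, "APEX": 3600, "SPT": 9000}
print(baseline_sigma(sefd["SMT"], sefd["ALMA"], 2e9, 10))   # ~0.005 Jy
print(baseline_sigma(sefd["SMT"], sefd["APEX"], 2e9, 10))   # ~0.03 Jy
```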

There are three factors that affect the allowed parameter ranges in the Bayesian analysis: the overall sensitivity of the interferometric array, (u, v) coverage in the visibility space, and sampling of the intrinsic variability of the source. Figure 4(e) shows the combined effect of these: the increased sensitivity and (u, v) coverage of a six-station EHT will narrow the allowed parameter space compared to Figure 4(d), which has the same number of observations but fewer stations and lower sensitivity. To assess the importance of source variability in parameter estimation, we constructed three array setups as follows:

  • (a)  
    The early EHT array of three stations at Hawaii, Arizona, and California, with 920 observing epochs.
  • (b)  
    The full EHT array with six stations when the sensitivity of ALMA is replaced with that of the co-located Atacama Pathfinder Experiment (APEX) telescope,7 with 61 observing epochs.
  • (c)  
    The full EHT array with six stations, including ALMA, with four observing epochs.

In these three scenarios, we have varied the number of observing epochs such that the accumulated point-source sensitivity of the array is constant. That is, the total thermal noise is the same for all the arrays, after accounting for the varying number of observing epochs. The comparison of the six-station array with APEX is an interesting counterpoint to the case with ALMA, because it holds the (u, v) coverage (and the accumulated sensitivity) fixed but varies the number of observing epochs. Figure 5 shows that the parameter constraints are much tighter when we average over more observing epochs, emphasizing the importance of sampling the intrinsic source variability when attempting to constrain the model parameters with EHT observations. Figure 5(a) has the poorest (u, v) coverage by far, but achieves the tightest constraint, while fixing both the (u, v) coverage and the sensitivity in Figures 5(b) and (c) again shows that simply sampling more epochs will tighten the parameter constraints significantly.

Figure 5. Parameter estimation test with the same underlying parameters as Figure 4. The three panels show the averaged likelihoods for 20 realizations each in three separate setups. The number of observations in each setup is chosen such that the raw overall sensitivity is held fixed across the three setups. The three panels correspond to (a) 920 epochs on the SMA–CARMA–SMT baseline, where previous EHT observations were done; (b) 61 epochs on baselines including SMA, CARMA, SMT, LMT, APEX, and SPT; (c) 4 epochs including the same baselines as panel (b) but with ALMA replacing APEX. These figures show that sampling the source variability is more important for parameter estimation than the raw sensitivity of the array or the (u, v) coverage.

3.4. Effect of Time Averaging

In the parameter estimation test, we calculated likelihoods over the parameter space with both time-dependent and time-averaged GRMHD images in order to assess the effect of time averaging. Figure 6 contrasts the likelihoods of time-resolved (black contours) and time-independent (red contours) inputs to our Bayesian method when the same number of mock observations are analyzed. Figures 6(a) and (b) show the likelihoods assuming the early three-station baselines and the six-station EHT with ALMA, respectively. The figure shows that a much smaller region of parameter space is allowed when comparing the mock data to a time-averaged image. However, the best-fit parameters in the time-averaged case are offset from the true parameters, as is most clearly evident in Figure 6(b). This demonstrates the false certainty implied by the small allowed parameter region. The time-resolved analysis, by contrast, delivers allowed parameter ranges that are consistent with the input simulation.

Figure 6. Effect of time averaging in the parameter estimation test for (a) the early EHT baselines and (b) the full EHT baselines including ALMA. We use mock-data parameters identical to those in Figure 4. The likelihoods of the averaged GRMHD image are overplotted in red contours. The likelihood using the averaged images underestimates the uncertainties of the inferred parameters, and the best-fit parameters derived from the time-dependent and averaged images are offset from each other as well.

3.5. Model Selection

We now use the Bayesian method to identify the relative posterior likelihood among several models, as discussed in Section 2.2. For this test we generated mock visibilities following the same procedure as the test of the previous section. We used (u, v) coordinates identical to those in the EHT observations of 2007 and 2009. As discussed earlier, SANE models present greater variability than MAD models (Figure 3). Therefore, we generated two mock data sets, one from model A (SANE) and one from model E (MAD), and performed our model selection test twice (labeled test A and test E, respectively).

We present in Table 1 the marginal likelihoods of Equation (15) and the ΔBIC values of Equation (11). We obtained these values by fitting the mock model A data and the mock model E data with all five models. The ratio of the marginal likelihoods of two models is the Bayes factor, which measures the relative strength of the evidence for one model over the other. Jeffreys's scale provides guidance for the interpretation of the Bayes factor (Kass & Raftery 1995): a ratio between 10 and 100 is interpreted as strong evidence, and a ratio greater than 100 is regarded as decisive. A similar approach uses the difference of information criteria, with ΔIC greater than 5 and 10 understood as strong and decisive evidence, respectively.

Table 1.  Marginal Likelihoods of Model Estimation Tests

          Test A | Test E
          ${ \mathcal L }(m| \mathrm{data})$   ΔBIC | ${ \mathcal L }(m| \mathrm{data})$   ΔBIC
Model A   1.00   0.0 | 5.36 × 10−7   28.5
Model B   0.799   1.5 | 5.31 × 10−10   42.8
Model C   7.20 × 10−9   37.5 | 9.90 × 10−4   12.7
Model D   1.81 × 10−4   17.3 | 3.47 × 10−5   20.8
Model E   2.98 × 10−10   41.4 | 1.00   0.0

Note. Test A uses mock data generated from model A, and test E uses mock data generated from model E. The marginalized likelihoods, ${ \mathcal L }(m| \mathrm{data})$, are normalized to the best model for each test, and the ΔBIC values are quoted relative to the best model.

In each test, our analysis prefers the model from which the data were generated. In test A, model B is only slightly disfavored compared to model A. This results from the similar shapes of the visibility amplitude distributions in the SANE simulations. The fact that model B has a different spin than model A is also an indication that black hole spin has a very small effect on the overall appearance of SANE models for Sgr A*. In this test, the MAD models are decisively rejected, as expected, because the smaller variability of the model visibilities in the MAD simulations is inconsistent with the mock data. Conversely, test E decisively favors model E and rejects all four other models.

4. APPLICATIONS TO EARLY EHT DATA

In this section, we apply the Bayesian method to early, limited EHT visibility amplitude data. These EHT observations of Sgr A* were carried out in 2007 and 2009 with only three stations (Doeleman et al. 2008; Fish et al. 2011): the James Clerk Maxwell Telescope in Hawaii (HI), CARMA in California (CA), and the SMT of the Arizona Radio Observatory (ARO) in Arizona (AZ). The observations measured 18 visibility amplitudes in 2007 (HI–AZ: 4, AZ–CA: 14) and 51 (HI–AZ: 19, HI–CA: 12, AZ–CA: 20) in 2009.

In our comparison of the GRMHD models to the data, we only consider values for the orientation ξ from −90° to 90°, as before, because visibility amplitudes have a 180° degeneracy. Closure phase information is required to resolve the ambiguity, and the closure phases were not measured well in these early data.

In order to compare the simulations with real data, which are affected by interstellar scattering, we incorporated the effects of scattering using a simple prescription. Multiwavelength observations of Sgr A* show that its scatter-broadened size scales as λ² because of scattering by electrons along the line of sight to the Galactic Center. The scattering is described as an elliptical Gaussian kernel (Bower et al. 2006) having major and minor axes

Equation (16)

and

Equation (17)

with a position angle of the major axis PA = 78° east of north. In visibility space, this corresponds to an elliptical taper of the visibility amplitude (Fish et al. 2014). Psaltis et al. (2015) revisited the inference of the scattering kernel parameters from observations and evaluated their uncertainties. Johnson & Gwinn (2015) described refractive scintillation in more detail; the real effect of scattering is likely more complicated than this simple convolution prescription. For the purposes of our initial study, we will not consider these two refinements further.
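In practice, this amounts to multiplying the model visibilities by an elliptical Gaussian taper before comparing them to the data. A sketch of that step is given below, assuming u points east and v points north; the kernel FWHM values at 1.3 mm used in the example are approximate placeholders obtained by scaling the cm-wavelength law to 1.3 mm.

```python
import numpy as np

def scattering_taper(u, v, theta_maj, theta_min, pa_deg):
    """Visibility-domain taper of an elliptical Gaussian scattering kernel with
    FWHM axes theta_maj, theta_min (radians) at position angle pa_deg east of
    north.  u, v are in wavelengths (u toward east, v toward north)."""
    pa = np.deg2rad(pa_deg)
    # components of (u, v) along the kernel's major and minor axes
    u_maj = u * np.sin(pa) + v * np.cos(pa)
    u_min = u * np.cos(pa) - v * np.sin(pa)
    c = np.pi**2 / (4.0 * np.log(2.0))
    return np.exp(-c * ((theta_maj * u_maj)**2 + (theta_min * u_min)**2))

uas = np.pi / (180 * 3600 * 1e6)
# approximate 1.3 mm kernel (placeholder values) at PA = 78 deg east of north
print(scattering_taper(3.0e9, 1.0e9, 22 * uas, 11 * uas, 78.0))
```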

Figure 7 shows the normalized posterior likelihoods when comparing the EHT data to the five GRMHD models. Notably, the observational data identify different physical parameters for the five models, even though all models are physically plausible descriptions of the structure of Sgr A*. For all five models, the 2009 EHT data dominate the outcome because of the larger number of detections in that campaign. Parameters with maximum likelihoods are given in Table 2 with 68% errors. While there is some agreement about the orientation parameter among most of the models with limited data, it is clear that inferences about the spin orientation are model specific and not general. Of course, the analysis presented here does not explore the effect of varying other parameters, such as the inclination. Expanding the dimensionality of the parameter space for each model will most likely affect the inferred model parameters and their uncertainties.

Figure 7. Normalized posterior likelihoods over the black hole spin orientation ξ and the overall flux normalization F0, when the five GRMHD models A to E of Chan et al. (2015a) are compared to two epochs of EHT data. The likelihood when 2007 data, 2009 data, and the combined data set are used is shown in the three sets of panels. The parameter constraints derived from comparing the data to time-averaged images are overplotted in red contours. The parameter constraints are both offset from the constraints derived when considering the intrinsic variability of the model, and the allowed parameter range is incorrectly reduced compared to the true uncertainty reflected by the time-dependent analysis.

Table 2.  Best-fit Parameters and Model Evidences with EHT Observations in 2007 and 2009

          Time-dependent analysis | Time-averaged analysis
          ξ   F0   ${ \mathcal L }(m| \mathrm{data})$   ΔBIC | ξ   F0   ${ \mathcal L }(m| \mathrm{data})$   ΔBIC
Model A   ${-46^\circ }_{-{6}^{^\circ }}^{+{9}^{^\circ }}$   ${1.41}_{-0.19}^{+0.23}$   1.84 × 10−8   37.1 | ${-{42}^{^\circ }}_{-{3}^{^\circ }}^{+{3}^{^\circ }}$   ${1.27}_{-0.03}^{+0.03}$   5.98 × 10−1   0.0
Model B   ${-36^\circ }_{-{8}^{^\circ }}^{+{43}^{^\circ }}$   ${1.23}_{-0.22}^{+0.19}$   5.99 × 10−15   68.2 | ${-46^\circ }_{-{2}^{^\circ }}^{+{2}^{^\circ }}$   ${1.37}_{-0.03}^{+0.03}$   4.93 × 10−2   4.5
Model C   ${50^\circ }_{-{5}^{^\circ }}^{+{6}^{^\circ }}$   ${1.11}_{-0.06}^{+0.06}$   2.71 × 10−2   6.9 | ${53^\circ }_{-{5}^{^\circ }}^{+{4}^{^\circ }}$   ${1.05}_{-0.02}^{+0.03}$   3.62 × 10−2   6.0
Model D   ${-37^\circ }_{-{6}^{^\circ }}^{+{14}^{^\circ }}$   ${2.44}_{-0.15}^{+0.15}$   1.18 × 10−3   15.7 | ${-32^\circ }_{-{4}^{^\circ }}^{+{11}^{^\circ }}$   ${2.24}_{-0.05}^{+0.05}$   6.56 × 10−4   16.0
Model E   ${-39^\circ }_{-{3}^{^\circ }}^{+{4}^{^\circ }}$   ${2.50}_{-0.11}^{+0.12}$   1.00   0.0 | ${-37^\circ }_{-{3}^{^\circ }}^{+{4}^{^\circ }}$   ${2.41}_{-0.05}^{+0.05}$   1.00   0.8

Figure 7 and Table 2 also compare the likelihoods obtained when fitting time-dependent and time-averaged GRMHD images to the data. The time-averaged analysis is similar to the time-resolved one, except that ${P}^{\mathrm{sim}}({ \mathcal V })$ is composed of visibilities generated from a single image obtained by averaging all the snapshots. As with the mock data, the lack of variability in the averaged model image results in a substantial underestimation of the uncertainties in the model parameters.

Remarkably, comparing model evidences shows that SANE Models A and B are decisively rejected in the time-dependent analysis, while they are among the most favored in the time-averaged analysis. This happens because, even though the images of Models A and B have an overall structure that matches the observations (compare, e.g., the data and the predictions of the five models at baselines larger than ∼3 Gλ in Figure 8), they show much larger variability than the current limited data (compare the range of visibilities predicted at ≲1 Gλ by all five models in Figure 8). Clearly, retaining knowledge of the variability predicted by a particular GRMHD (or other) simulation is important for evaluating how well that simulation matches the data.

Figure 8. VLBI visibility amplitudes of Models A to E with the best-fit parameters listed in Table 2, together with the EHT data. The visibility amplitudes are plotted with their ±1σ range in the shaded region. The EHT data include two days of observations in 2007 and three days of observations in 2009. Although the simulated visibility amplitudes and the observations appear to agree well with each other for all five models, the marginal likelihoods and BIC show that model E is the favored one given the limited EHT data.

The Bayesian evidence for the different models shown in Table 2 seems to suggest that the existing EHT data favor MAD models over SANE ones. However, it is important to emphasize here a requirement that needs to be satisfied before the Bayesian method can be applied to real data: both the simulations and the observations must have adequately sampled the variability at each baseline, so that they cover the entire range of possibilities. This is clearly not the case for the early EHT data that we are using here, since they comprise only ∼10 hr of total integration time. As a result, while we present this analysis as a proof of principle, inferring model parameters from real EHT data with our Bayesian method will be possible only after a substantial amount of data has been collected and their variability has been characterized.

There exist in the literature several analyses of the current EHT data based on time-averaged models. Broderick et al. (2009, 2011) modeled the accretion flow with time-independent, semi-analytic RIAF models and inferred the following as the most probable black hole parameters: viewing angle8 $i={50^\circ }_{-{10}^{^\circ }}^{+{10}^{^\circ }}$ and position angle $\xi ={-20^\circ }_{-{16}^{^\circ }}^{+{31}^{^\circ }}$ for the 2007 observations only, and $i={68^\circ }_{-{20}^{^\circ }}^{+{5}^{^\circ }}$ and $\xi ={-52^\circ }_{-{15}^{^\circ }}^{+{17}^{^\circ }}$ for the 2007 and 2009 observations. Dexter et al. (2010, 2012) performed three-dimensional, time-dependent GRMHD simulations and then averaged the simulated images to fit the VLBI data. They found $i={50^\circ }_{-{15}^{^\circ }}^{+{35}^{^\circ }}$ and $\xi ={-23^\circ }_{-{22}^{^\circ }}^{+{97}^{^\circ }}$ from the 2007 observations, and $i=60^\circ \pm 15^\circ $ and $\xi ={-70^\circ }_{-{15}^{^\circ }}^{+{86}^{^\circ }}$ using the 2007 and 2009 observations.9

Although these values are broadly consistent with each other and with those we inferred here for most models, their uncertainties are significantly tighter than what we found for the SANE models (the only type of models considered in those earlier works). This is not surprising because, as the above analysis suggests, fitting models that do not allow for source variability yields artificially small parameter uncertainties when the source is, in fact, variable.

5. CONCLUSION

In this work, we developed a statistical tool that will allow us to interpret the variable EHT observational data in the context of time-dependent model predictions. We tested the method for parameter estimation and model selection statistics. We then applied this Bayesian statistical analysis to existing EHT observations from 2007 and 2009, as a proof of principle, and investigated its applicability and statistical power.

We find that taking into account the time variability in the data and in the models increases the uncertainties in the inferred parameters beyond what is obtained when using time-averaged emission structures. Moreover, ignoring the variable nature of the data and of the simulations may lead to erroneous model selections.

In this paper, we considered only two parameters: the orientation of the black hole spin vector in the sky plane and the flux normalization. Additional parameters, such as the inclination angle of the spin vector, the black hole spin, and the plasma parameters of the accretion flow, can also be investigated with this method by expanding the parameter space and injecting the corresponding simulation data sets.

Finally, the method presented in this paper is quite general and can work with any time-resolved simulation. Although only visibility amplitudes were included in our tests, other observables such as closure phases and closure amplitudes can easily be incorporated as future observations with additional baselines become available.

J.K. and D.P.M. acknowledge support from NSF grants AST-1207752 and AST-1440254. C.K.C., F.O., and D.P. were partially supported by NASA/NSF TCAN award NNX14AB48G and NSF grants AST 1108753 and AST 1312034. L.M. acknowledges support from NSF GRFP grant DGE 1144085. All ray-tracing calculations were performed with the El Gato GPU cluster at the University of Arizona, which is funded by NSF award 1228509.

Footnotes

  • Another often used angle to measure the same orientation is denoted by ϕ and is the complementary angle to ξ, measured in degrees north of east (e.g., Bartko et al. 2009); clearly, ξ = 90° − ϕ.

  • SMA phased array: 4000 Jy, CARMA: 10,000 Jy, SMT: 11,000 Jy, LMT: 1400 Jy, ALMA phased array: 100 Jy, SPT: 9000 Jy.

  • SEFD of 3600 Jy.

  • Note that Broderick et al. (2009, 2011) uses θ instead of i for the observer's inclination angle.

  • Dexter et al. (2010, 2012) use 90% confidence for the uncertainty of estimated parameters.
