UvA-DARE (Digital Academic Repository) Selective Dynamical Imaging of Interferometric Data

Recent developments in very long baseline interferometry (VLBI) have made it possible for the Event Horizon Telescope (EHT) to resolve the innermost accretion flows of the largest supermassive black holes on the sky. The sparse nature of the EHT's (u, v)-coverage presents a challenge when attempting to resolve highly time-variable sources. We demonstrate that the changing (u, v)-coverage of the EHT can contain regions of time over the course of a single observation that facilitate dynamical imaging. These optimal time regions typically have projected baseline distributions that are approximately angularly isotropic and radially homogeneous. We derive a metric of coverage quality based on baseline isotropy and density that is capable of ranking array configurations by their ability to produce accurate dynamical reconstructions. We compare this metric to existing metrics in the literature and investigate their utility by performing dynamical reconstructions on synthetic data from simulated EHT observations of sources with simple orbital variability. We then use these results to make recommendations for imaging the 2017 EHT Sgr A* data set.


Introduction
Interferometric astronomical observations offer much larger resolving power than do single telescopes, with the interferometric resolution depending on the distance between the elements rather than the diameters of the individual apertures. Since an interferometer probes the Fourier transform of an on-sky source (and not the source image itself), the placement, selection, and availability of baselines to maximize coverage of the (u, v)-plane is an important and open optimization problem inherent to the interferometric image synthesis. For very long baseline interferometry (VLBI), which represents some of the most extreme interferometric observations, the arrays are generally very sparse, and optimized placement or selection of baselines becomes critical to recovering reliable information about the source.
Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
For any ground-based interferometric observations, including VLBI, Earth rotation causes each individual baseline to trace an elliptical path in the (u, v)-plane. Such "Earth rotation aperture synthesis" can be used to improve the (u, v)-coverage of sparse arrays. Standard VLBI imaging uses this additional coverage under the assumption that the target source structure remains unchanged for the duration of the observation. For the cases of sources that vary on shorter timescales (e.g., the Galactic center, X-ray binaries, microquasars), "snapshot images" can be produced using only time intervals over which the source can be considered static (e.g., Martí-Vidal et al. 2011; Massi et al. 2012; Miller-Jones et al. 2019). In the most extreme scenarios, only instantaneous coverage may be suitable for use in static image reconstructions, severely limiting the (u, v)-coverage and resulting image quality. Such a case is the main concern of this paper. Moreover, this limitation will generally result in portions of an observation that are better than others for imaging because the snapshot coverage is time dependent.
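The elliptical track described above can be illustrated with the standard textbook projection of a baseline onto the (u, v)-plane. The following is a minimal sketch, not the procedure used in this paper: the function name `uv_track`, the baseline components, and the declination value are hypothetical, and the baseline is assumed to be expressed in wavelengths in the usual Earth-fixed equatorial frame.

```python
import math

def uv_track(lx, ly, lz, dec_rad, hour_angles_rad):
    """Return the (u, v) points traced by one baseline as Earth rotates.

    lx, ly, lz: baseline components in wavelengths (assumed Earth-fixed
    equatorial frame); dec_rad: source declination; hour_angles_rad:
    iterable of source hour angles.
    """
    track = []
    for h in hour_angles_rad:
        u = lx * math.sin(h) + ly * math.cos(h)
        v = (-lx * math.sin(dec_rad) * math.cos(h)
             + ly * math.sin(dec_rad) * math.sin(h)
             + lz * math.cos(dec_rad))
        track.append((u, v))
    return track

# Hypothetical baseline (units of G-lambda) and a declination near -29 deg.
dec = math.radians(-29.0)
hours = [math.radians(15.0 * t) for t in range(-4, 5)]  # -4 h to +4 h
track = uv_track(3.0, 4.0, 2.0, dec, hours)
```

Every point of `track` satisfies u² + ((v − lz cos δ)/sin δ)² = lx² + ly², which is an ellipse centered on (0, lz cos δ) whose axis ratio is set by sin δ. A snapshot samples a single point of this ellipse per baseline.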
In particular, directionally biased (u, v)-coverage results in a nonisotropic resolution that complicates interpretation of the reconstructed source geometry. The (u, v)-coverage of a given array varies with the decl. of the source, due to its dependence on the projected baseline lengths and orientations. The impact of source decl. on the (u, v)-coverage varies throughout a night of observation as the target rises and sets. A network of baselines that produces an evenly distributed (u, v)-coverage of an equatorial source will be east-west biased when targeting a northern source. Observations for which the source morphology and orientation are unknown cannot exploit a priori knowledge to determine whether a directionally biased (u, v) configuration will be able to reproduce source structure reliably in an image. Therefore, for cases where the source morphology is unknown, the optimal (u, v)-coverages for producing high-quality image reconstructions are those that are approximately isotropic in angular distribution.
Periods of sustained isotropic (u, v)-coverage may allow multiple high-quality image reconstructions on short timescales to be strung together, resulting in "dynamical" reconstructions. For instance, the most basic dynamical reconstruction can be produced by joining a series of snapshot images. The quality of such a reconstruction will be time dependent owing to the evolving snapshot coverage, and it may be necessary to exclude periods with poor (u, v)-coverage. In periods of time where the (u, v)-coverage is optimal, the coverage used to produce each snapshot of a dynamical reconstruction will be sufficiently isotropic to reproduce source features on snapshot timescales, allowing for the production of reconstructions that adequately recover the evolution of highly time-variable sources. By contrast, periods of suboptimal (u, v)-coverage may be unable to provide high-quality reconstructions on snapshot timescales, making the algorithmic detection and subsequent flagging of these periods important for the production of meaningful dynamical reconstructions.
The challenges of observing a variable source with a sparse interferometer can be exacerbated by poor coverage geometry. In particular, given the shortage of available facilities supporting high-frequency radioastronomical observations, corresponding VLBI arrays exhibit substantial variance in coverage quality in time.
The Event Horizon Telescope (EHT) is a unique VLBI network of telescopes that exploits the full diameter of Earth and the performance at the challenging 1.3 mm (230 GHz) wavelength to achieve the required angular resolution for horizon-scale tests of general relativity for the largest black holes on the sky (see, e.g., Johannsen & Psaltis 2010; Psaltis et al. 2015; Psaltis 2019). The EHT operates at millimeter wavelengths, the optimal range that enables the resolution of the Sgr A* black hole shadow and reduces the impact of the interstellar medium scattering effects dominant at longer wavelengths, while still being observable and manageable in the radio interferometric framework. Operating at 230 GHz, the EHT achieved a resolution of ∼20-25 μas, which the EHT Collaboration (EHTC) used to produce the first images of a supermassive black hole (Event Horizon Telescope Collaboration et al. 2019a, 2019b, 2019c, 2019d). These data and the resulting images were used to estimate a mass of M ≈ 6.5 × 10^9 M⊙ for the supermassive black hole in M87 (Event Horizon Telescope Collaboration et al. 2019f), based on an observed angular shadow diameter of ∼42 μas (Event Horizon Telescope Collaboration et al. 2019a, 2019d). During the 2017 campaign, the EHT also observed the radio source Sgr A* in the Galactic center (Event Horizon Telescope Collaboration et al. 2022a, 2022b, 2022c, 2022d), associated with a supermassive black hole with M ≈ 4.1 × 10^6 M⊙ (GRAVITY Collaboration et al. 2018a; Do et al. 2019; Event Horizon Telescope Collaboration et al. 2022f). The expected mass-to-distance ratio (M/D) of Sgr A* yields a predicted angular shadow diameter of ∼50 μas (GRAVITY Collaboration et al. 2018b) and a minimum variability timescale (light-crossing time) of GM/c^3 ≈ 20 s.
The corresponding timescale for M87 is ∼1600 times longer owing to the larger mass. Indeed, structural variability of the M87 shadow has been reported on timescales from ∼1 week (Event Horizon Telescope Collaboration et al. 2019d) to several years (Wielgus et al. 2020b). The rapid minimum variability timescale of Sgr A*, combined with the extreme sparsity of the EHT, presents an urgent and unique need to characterize the effects of time-dependent instantaneous (u, v)-coverage.
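The quoted timescales and resolution can be checked with a few lines of arithmetic. This is a rough sketch using rounded physical constants and the masses quoted in the text; taking the longest baseline to be one Earth diameter (about 12,742 km) is an assumption made only for this estimate.

```python
import math

G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8            # speed of light, m s^-1
M_SUN = 1.989e30       # solar mass, kg
RAD_TO_UAS = 180.0 / math.pi * 3600.0 * 1e6  # radians to microarcseconds

def light_crossing_time(mass_msun):
    """Gravitational timescale GM/c^3 in seconds for a mass in solar masses."""
    return G * mass_msun * M_SUN / C**3

t_sgra = light_crossing_time(4.1e6)   # roughly 20 s
t_m87 = light_crossing_time(6.5e9)
ratio = t_m87 / t_sgra                # roughly 1600

# Diffraction-limited scale lambda / B for a 1.3 mm wavelength and an
# Earth-diameter baseline (ASSUMED value of 12,742 km).
resolution_uas = 1.3e-3 / 1.2742e7 * RAD_TO_UAS
```

Under these assumptions the light-crossing time for Sgr A* comes out near 20 s, the M87 timescale is longer by a factor close to 1600, and λ/B is about 21 μas, in line with the ∼20-25 μas resolution quoted above.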
In this paper, we develop a procedure for selective imaging of highly variable sources. In Sections 2 and 3, we summarize the synthetic data generation and imaging methods used herein. In Section 4, we show the limitations of imaging in sparse and uneven coverage. In Section 5, we survey several metrics capable of ranking (u, v)-coverage quality. Additionally, we derive a novel isotropy-based metric that addresses the limitations described in Section 4. In Section 6, we apply these metrics to the 2017 EHT coverage of Sgr A*, validate their ability to predict reconstruction quality from (u, v)-coverage geometry, and make recommendations for selective dynamical imaging of the 2017 EHT Sgr A* data set. In Section 7, we briefly discuss the utility of coverage metrics in ranking and selecting between different available observing periods. Finally, in Section 8, we summarize our results.

Model Definition and Synthetic Data Generation
In order to test the ability of various EHT array configurations to recover source variability in different observation periods, we designed and generated synthetic data for three different models. The models are chosen owing to their structural similarity (in image and visibility domain) to expected images of Sgr A*. (Similarity between the model and data was characterized as either displaying time variability or producing a Bessel function Fourier representation with nulls between 2 and 4 Gλ and between 6 and 9 Gλ.) The models are described in Section 2.1. The synthetic data generation is expanded on in Section 2.2.

Models
Here we describe the suite of models used to test the effects of coverage on reconstructions. Examples of each model, with and without CLEAN beam convolution, can be seen in the first two rows of Figure 1.

Rotating Elliptical Gaussian
The rotating elliptical Gaussian model is generated using a bivariate exponential with major-axis FWHM Γ_a, minor-axis FWHM Γ_b, and overall flux density A. The image is padded and rotated by polar angle j after the model is generated. By default, the overall flux density A was set to 1.0 Jy. Periods for the rotation ranged between 30 and 1000 minutes (longer than one night of observation).
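A minimal sketch of such a model is given below. It assumes the standard normalized bivariate Gaussian, in which A is the total (integrated) flux density, and it rotates the coordinates analytically instead of rotating a padded image as described above. The function names and the FWHM values in the example are hypothetical.

```python
import math

# Conversion between FWHM and Gaussian standard deviation.
FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def elliptical_gaussian(x, y, fwhm_a, fwhm_b, total_flux=1.0, angle=0.0):
    """Brightness at (x, y) of an elliptical Gaussian rotated by `angle` (rad).

    ASSUMPTION: total_flux is the integral of the brightness over the sky.
    """
    sa = fwhm_a * FWHM_TO_SIGMA
    sb = fwhm_b * FWHM_TO_SIGMA
    # Express (x, y) in the frame aligned with the ellipse axes.
    xr = x * math.cos(angle) + y * math.sin(angle)
    yr = -x * math.sin(angle) + y * math.cos(angle)
    norm = total_flux / (2.0 * math.pi * sa * sb)
    return norm * math.exp(-0.5 * ((xr / sa) ** 2 + (yr / sb) ** 2))

def rotation_angle(t_minutes, period_minutes):
    """Position angle (rad) of the major axis at time t for a given period."""
    return 2.0 * math.pi * t_minutes / period_minutes

# Hypothetical 40 x 20 uas Gaussian, a quarter of the way through a
# 30 minute rotation period.
peak = elliptical_gaussian(0.0, 0.0, 40.0, 20.0)
quarter_turn = rotation_angle(7.5, 30.0)
half_max = elliptical_gaussian(0.0, 20.0, 40.0, 20.0, angle=quarter_turn)
```

After a quarter turn the major axis lies along y, so the brightness at (0, Γ_a/2) is half of the peak value.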

Ring and Orbiting Hot Spot
The ring model is generated by the subtraction of two concentric uniform-brightness disks, equivalent to the crescent model described in Kamruddin & Dexter (2013), with the parameters a = b = 0. The positive disk has a radius of 25 μas, and the subtracted disk has a radius of 18 μas. All ring models in this paper use these parameters with the exception of the ring model in Figure 4, which has a subtracted disk radius of 20 μas. The ring model is used with two synthetic data tests: the static ring (with no hot spot) and the dynamic ring (a static ring plus an orbiting hot spot, referred to as ring+hs). The underlying ring has a diameter of 50 μas and a flux density of 1.0 Jy. A hot spot with a total flux density of 0.25 Jy and an FWHM of 10 μas is added to the image, centered on the ring. After construction, the total flux density of each image is normalized to 1.0 Jy. In the dynamic ring model, the orbiting hot spot is centered on the ring and circularly orbiting at a distance of 21.5 μas with periods of 30 and 270 minutes. The static ring model is a special case of the ring+hs model with the flux density of the hot spot set to zero.
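The construction above can be sketched on a small pixel grid as follows. The radii, fluxes, FWHM, and orbital radius are taken from the text; the grid size, the field of view, the function name, and the choice of a circular Gaussian profile for the hot spot are assumptions of this illustration.

```python
import math

def ring_hotspot_image(t_min, period_min=30.0, hs_flux=0.25,
                       npix=64, fov_uas=128.0):
    """Return an npix x npix image (nested lists) with total flux 1 Jy."""
    r_out, r_in = 25.0, 18.0          # disk radii from the text (uas)
    hs_fwhm, hs_orbit = 10.0, 21.5    # hot spot FWHM and orbital radius (uas)
    sigma = hs_fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    phase = 2.0 * math.pi * t_min / period_min
    hx, hy = hs_orbit * math.cos(phase), hs_orbit * math.sin(phase)
    pix = fov_uas / npix
    coords = [(k + 0.5) * pix - fov_uas / 2.0 for k in range(npix)]

    # Uniform outer disk minus uniform inner disk leaves a flat annulus.
    ring = [[float(math.hypot(x, y) <= r_out) - float(math.hypot(x, y) <= r_in)
             for x in coords] for y in coords]
    # ASSUMPTION: the hot spot is a circular Gaussian.
    spot = [[math.exp(-0.5 * ((x - hx) ** 2 + (y - hy) ** 2) / sigma ** 2)
             for x in coords] for y in coords]

    # Scale the ring to 1.0 Jy and the hot spot to hs_flux, then add them.
    ring_sum, spot_sum = sum(map(sum, ring)), sum(map(sum, spot))
    image = [[rv / ring_sum + hs_flux * sv / spot_sum
              for rv, sv in zip(ring_row, spot_row)]
             for ring_row, spot_row in zip(ring, spot)]
    # Renormalize the summed image to a total flux density of 1.0 Jy.
    total = sum(map(sum, image))
    return [[value / total for value in row] for row in image]

frame = ring_hotspot_image(t_min=0.0)                 # ring+hs frame
static = ring_hotspot_image(t_min=0.0, hs_flux=0.0)   # static ring
```

Calling the function for a sequence of times yields the frames of a ring+hs movie, and setting `hs_flux=0.0` reproduces the static ring as the special case noted above.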

Synthetic Data Generation
Synthetic data were generated based on April 7 of the 2017 EHT coverage using the eht-imaging library (Chael et al. 2018). A simulated observation corresponded to approximately 11.5 hr of observing on all available baselines. Snapshot images from the simulated sources and the resulting amplitudes and closure phases can be seen in Figure 1 and Appendix A.
Observation parameters (e.g., R.A. and decl. of the source, observing frequency, bandwidth) were duplicated from the 2017 EHT survey of Sgr A*. The stations included in the simulated observations were the Atacama Large Millimeter/submillimeter Array (ALMA), the Atacama Pathfinder Experiment (APEX), the Large Millimeter Telescope (LMT), the James Clerk Maxwell Telescope (JCMT), the Submillimeter Array (SMA), the IRAM 30 m telescope on Pico Veleta (PV), and the South Pole Telescope (SPT). Weather complications were ignored, and the simulated observations assumed that all stations were observing the source for the chosen night.
All data were generated with thermal noise only. Realistic values for the thermal noise power were based on estimates from the real 2017 EHT data. No other noise, scattering, leakage, or simulated gain errors were applied to the simulated visibilities.

Imaging Approaches
Since the interferometric measurements are often incomplete in the Fourier domain, the inverse problem of reconstructing an image from the observed data set is usually underdetermined. Consequently, the image reconstruction requires prior information, assumptions, or constraints to derive a reasonable image from the infinite number of possibilities that can explain the measurements.
The two most popular categories of imaging methodologies are inverse modeling (e.g., CLEAN) and forward modeling (e.g., regularized maximum likelihood). See Event Horizon Telescope Collaboration et al. (2019d) for a general overview of the two methods. For time-variable sources, both approaches may allow for more effective reconstructions of dynamic structures than snapshot imaging by including assumptions or constraints on temporal variations of the source structure in addition to the spatial properties regularized in static imaging (e.g., Event Horizon Telescope Collaboration et al. 2019d). One such way imposes a temporal similarity constraint between images at different times and between each time snapshot and the time-averaged structure. In the following subsections, we briefly describe each dynamical reconstruction method used in this paper. See Event Horizon Telescope Collaboration et al.
Figure 1. The three synthetic models detailed in Section 2 are displayed. The first row shows each model as seen on-sky at 6 UT, just after the observation begins. The second row shows the model convolved with an 18 μas diameter CLEAN beam. The white circle in the lower right corner shows the size of the beam. The third row shows the measured visibility amplitudes as a function of baseline length for the entire observation. A static imaging routine would fit to the full set of these amplitudes; however, a dynamical imaging routine only attempts to fit to small chunks of the full data set at any one time.

Inverse-modeling Approaches
Imaging of radio interferometric data is traditionally carried out through CLEAN deconvolution algorithms (e.g., Högbom 1974; Clark 1980). These inverse-modeling approaches iteratively deconvolve the interferometer's point-source response (the so-called "dirty beam"), which encodes the limited sampling of the (u, v)-plane, from the inverse Fourier transform of the measured visibilities, commonly referred to as the "dirty image." The source brightness distribution is modeled as a collection of point sources, which are extracted at the location of the peaks in the dirty image through an iterative process until some specified stopping criterion is reached. In observations with limited (u, v) sampling, such as those obtained with the EHT, it is important to guide the CLEAN deconvolution process through the inclusion of so-called "cleaning windows," restricting the sky areas within which the point components are localized.
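The iterative extraction of point components can be illustrated with a one-dimensional, Högbom-style loop. This is a toy sketch and not the Difmap implementation: the function name, loop gain, threshold, and the five-pixel dirty beam are all hypothetical.

```python
def hogbom_clean_1d(dirty, beam, window, gain=0.1, threshold=1e-3,
                    max_iter=1000):
    """Toy 1D CLEAN loop.

    dirty: dirty image (list of floats); beam: dirty beam with its peak of
    1.0 at index len(beam) // 2; window: iterable of pixel indices in which
    components may be placed (the "cleaning window").
    """
    residual = list(dirty)
    model = [0.0] * len(dirty)
    center = len(beam) // 2
    for _ in range(max_iter):
        # Locate the strongest residual inside the cleaning window.
        peak = max(window, key=lambda i: abs(residual[i]))
        if abs(residual[peak]) < threshold:
            break  # stopping criterion reached
        flux = gain * residual[peak]
        model[peak] += flux
        # Subtract the scaled dirty beam centered on the new component.
        for k, b in enumerate(beam):
            i = peak + k - center
            if 0 <= i < len(residual):
                residual[i] -= flux * b
    return model, residual

# A 2 Jy point source at pixel 5 observed with a beam that has sidelobes.
beam = [-0.2, 0.5, 1.0, 0.5, -0.2]
dirty = [0.0] * 11
for k, b in enumerate(beam):
    dirty[5 + k - 2] = 2.0 * b
model, residual = hogbom_clean_1d(dirty, beam, window=range(3, 8))
```

In this example all of the recovered flux is placed at pixel 5, and the residual falls below the threshold after a few dozen iterations. Shrinking `window` plays the role of the tight cleaning windows discussed in the text.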
Mitigation of the a priori calibration uncertainties is commonly carried out through multiple rounds of CLEAN deconvolution followed by self-calibration, which solves for the station gains that maximize consistency between the current model and the measured visibilities (e.g., Wilkinson et al. 1977; Readhead et al. 1980; Cornwell & Wilkinson 1981; Pearson & Readhead 1984). Amplitude self-calibrations are necessarily limited to intervals of time larger than the expected variability in order to retain information about source variability. The final image is obtained by convolving the model components with a Gaussian CLEAN beam that approximates the central lobe of the point-spread function of the interferometer, with the addition of the last residual image, which represents some unconvolved additional structure and noise. In this paper we use the Difmap software (e.g., Shepherd 1997 and M. Shepherd 2011, private communication) for CLEAN imaging.
Once the imaging procedure converges based on a specified stopping criterion into an average static image, CLEAN dynamic imaging is performed by first dividing the data set into smaller portions with a time duration similar to that of the expected source variability timescale (i.e., "snapshots"). Under the assumption of small structural changes over time, the model corresponding to the static image is used as an initial model, upon which we look for structural changes by cleaning the residual map corresponding to each data snapshot. To guide the deconvolution with such a limited (u, v)-coverage, we limit the extra cleaning to the imaging regions in which we have emission in the averaged image by placing tight cleaning windows. In addition, further self-calibrations in phase and amplitude are performed to refine antenna gain corrections.
The CLEAN algorithms do not enforce similarity between snapshots, other than the use of common initial image priors, which facilitates tracking of rapid source structural changes at arbitrarily separated spatial locations. However, these image changes are restricted to occur within the tight cleaning windows established around the emission found in the averaged static image.

Forward-modeling Approaches
Unlike the inverse-modeling methods, which solve for a sparse image in the image domain from the dirty map transformed from the measurement sets, the forward-modeling methods solve for an image by evaluating the data likelihood derived from the consistency between actual measurements and the model data set forward-transformed from the image. This approach offers flexibility to the imaging through robust data products (e.g., closure quantities that are not affected by station-based calibration uncertainties) and incorporates various observing effects into the observational equation used in the forward transform.
Regularized maximum likelihood (RML) methods (see Event Horizon Telescope Collaboration et al. 2019d, for an overview) optimize a cost function composed of χ² terms (proportional to log-likelihood terms) of visibility components and regularization terms that describe the prior assumptions for images. Each regularization term is described by a product of its relative weight (i.e., hyperparameter) and regularization functions. These regularization functions include, e.g., maximum entropy (e.g., Narayan & Nityananda 1986; Chael et al. 2016), total variation and its variants (e.g., Akiyama et al. 2017; Kuramochi et al. 2018), and sparsity priors (e.g., Honma et al. 2014). The cost function can be interpreted as a maximum a posteriori (MAP) estimation by considering the regularization terms as log prior distribution of the image, although regularization functions do not always have a probabilistic interpretation. The final reconstruction is convolved with the CLEAN beam of the interferometer to remove the effects of methodology-specific super-resolution.
The RML approach can be extended to dynamic reconstruction (henceforth RML dynamic imaging) in a conceptually simple way (Johnson et al. 2017). The likelihood term can be formulated by forward-transforming snapshots of a video, instead of a single image, to data. One can add temporal regularization terms that penalize temporal variations of the source structure by defining a metric for the "distance" between adjacent frames. A popular choice is a sum of squared pixel differences between two adjacent snapshots, assuming that snapshot-to-snapshot transition of the source brightness is piecewise smooth (e.g., the R_Δt regularizer in Johnson et al. 2017). Another widely used choice is a sum of squared differences between the time-averaged image and each snapshot, based on an assumption conceptually similar to dynamic CLEAN imaging (Section 3.1) that the deviations of each snapshot from the mean image are small and sparse (e.g., the R_ΔI regularizer in Johnson et al. 2017). The temporal regularization term necessarily suppresses intrinsic source variability if weighted too high; however, what constitutes "too high" varies depending on the source structure and variability timescale. Popular image distance metrics include the Euclidean norm or a relative entropy such as Kullback-Leibler divergence (Kullback & Leibler 1951).
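The two temporal penalties described above can be written in a few lines. This sketch implements only the plain sums of squared differences named in the text; the regularizers in Johnson et al. (2017) and in the imaging libraries may include additional weighting, normalization, or smoothing that is not reproduced here. The function names are hypothetical, and each frame is assumed to be a flattened list of pixel values.

```python
def r_delta_t(frames):
    """Sum of squared pixel differences between adjacent frames."""
    return sum((a - b) ** 2
               for prev, nxt in zip(frames, frames[1:])
               for a, b in zip(prev, nxt))

def r_delta_i(frames):
    """Sum of squared differences between each frame and the mean frame."""
    n = len(frames)
    mean = [sum(pixel_values) / n for pixel_values in zip(*frames)]
    return sum((a - m) ** 2 for frame in frames for a, m in zip(frame, mean))

# A two-pixel "video" in which the first pixel brightens steadily.
video = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
penalty_adjacent = r_delta_t(video)   # 1 + 1 = 2
penalty_mean = r_delta_i(video)       # 1 + 0 + 1 = 2
```

Both penalties vanish for a static video, so weighting them heavily in the cost function pulls the reconstruction toward a time-independent image.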
StarWarps (Bouman et al. 2018) is another forward-modeling method for dynamical imaging adopted in this work. StarWarps, based on a probabilistic graphical model, solves for snapshots of a video by solving its posterior probability distribution defined as a product of three terms: data likelihood, multivariate Gaussian distributions for each snapshot, and transitional probabilities between adjacent snapshots effectively working as spatial and temporal regularizations, respectively. StarWarps allows for the exact inference of the video by computing a mean and covariance of the image, which provides a complete description under Gaussian approximation. By contrast, the RML dynamic reconstruction derives only a MAP estimation. StarWarps requires an initial static image, which can be either data driven (e.g., a best-fitting static image of the entire data set being dynamically reconstructed) or prior driven (e.g., a synthetic image of a ring).
In this paper, we use the RML dynamic imaging algorithms implemented in eht-imaging (also referred to as ehtim) and SMILI (Sparse Modeling Imaging Library for Interferometry) and the StarWarps algorithm in eht-imaging. See Event Horizon Telescope Collaboration et al. 2022c for more details of regularization functions and other imaging parameters used in the reconstructions.

Limitations of Sparse (u, v)-coverage
In this section, we explore limitations of imaging with limited and directionally biased (u, v)-coverage. In particular, we will show that sparse (u, v)-coverage results in predictable limitations (i.e., deviations from the true source morphology), which are determined by the geometry of the (u, v)-coverage. Figure 2 shows how imaging using directionally biased (u, v)-coverage (column (a)) fails to properly recover the orientation of an intrinsically noncircularly symmetric source when the (u, v)-coverage does not sufficiently sample the source structure in the relevant direction. By contrast, the same baselines, oriented in a more isotropic way (column (b)), are capable of recovering the source profile in all directions. In addition, reconstructions of circularly symmetric sources by directionally biased (u, v)-coverage (column (a) of Figure 3) can introduce a lack of circular symmetry that is not present in the underlying source or in reconstructions performed on nondirectionally biased coverage (column (b) of Figure 3). Because isotropic coverage minimizes the artifacts that incomplete (u, v)-coverage introduces into a reconstruction, imaging algorithms work better when applied to isotropic (u, v)-coverages.
In addition to angular inhomogeneity, radially inhomogeneous coverage also leads to ambiguous image reconstruction. The Fourier transforms of various source types (e.g., rings, crescents, Gaussians) are approximately degenerate when observed using an interferometer that only marginally resolves the image (e.g., Thompson et al. 2017; Issaoun et al. 2019). More complex approximate degeneracies exist for radially inhomogeneous interferometers that only probe short and long but not intermediate baselines (e.g., Doeleman et al. 2008). For the EHT observing a ∼50 μas diameter ring, as expected for Sgr A*, short baselines correspond to those with length <2 Gλ and long baselines correspond to those with length >6 Gλ. Figure 4 demonstrates that a Gaussian model describes the simulated EHT observation of a 50 μas static ring nearly as well as the static ring model itself if only particular subsets of the data are fit. Even an infinitesimally thin ring model, when only fit to medium and long baselines, can provide a high-quality fit while misrepresenting the total intensity of the source. Without sufficient radial homogeneity (i.e., coverage of short, medium, and long baselines, as seen in column (c) of Figure 2), fitting and interpreting a model confidently can be difficult.
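The baseline ranges quoted above can be related to the ring size with a short calculation. The sketch below assumes an infinitesimally thin ring of diameter d, whose visibility amplitude is proportional to |J0(π d ρ)|, and uses the first two zeros of J0 as tabulated constants. It also gives the FWHM of the Gaussian whose visibility matches that of the thin ring to second order in ρ, which is one way to see the short-baseline degeneracy. The function names are hypothetical.

```python
import math

UAS_TO_RAD = math.pi / (180.0 * 3600.0 * 1e6)
J0_ZEROS = (2.404826, 5.520078)  # first two zeros of the Bessel function J0

def thin_ring_nulls_glambda(diameter_uas):
    """Baseline lengths (G-lambda) of the first two thin-ring visibility nulls."""
    d = diameter_uas * UAS_TO_RAD
    return [zero / (math.pi * d) / 1e9 for zero in J0_ZEROS]

def matching_gaussian_fwhm(diameter_uas):
    """FWHM of the Gaussian that matches a thin ring at short baselines.

    J0(x) ~ 1 - x^2/4 and the Gaussian visibility
    exp(-(pi*FWHM*rho)^2 / (4 ln 2)) ~ 1 - (pi*FWHM*rho)^2 / (4 ln 2)
    agree to second order when FWHM = d * sqrt(ln 2).
    """
    return diameter_uas * math.sqrt(math.log(2.0))

nulls = thin_ring_nulls_glambda(50.0)   # about 3.2 and 7.2 G-lambda
fwhm = matching_gaussian_fwhm(50.0)     # about 41.6 uas
```

For a 50 μas thin ring the nulls fall near 3.2 and 7.2 Gλ, so baselines shorter than about 2 Gλ sample only the region where the ring and a Gaussian of roughly 42 μas FWHM are nearly indistinguishable.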
Periods of (u, v)-coverage where these limitations are more likely to occur can be identified by constructing a metric that scores directional bias and radial homogeneity (i.e., coverage of short, medium, and long baselines). The prevalence and severity of reconstruction artifacts that result from coverage limitations form a continuum that can be used to rank different (u, v) configurations. A metric based on these limitations could be applied to a full observation to distinguish different observing periods (composed of many evolving (u, v) configurations) by their ability to produce high-quality reconstructions.

Coverage Metrics
Multiple (u, v)-coverage metrics with different underlying considerations exist in the literature. Here we summarize several metrics and compare the way they score a given observation. In addition, we develop a novel "isotropy metric" that has been tailored to the specific vulnerabilities detailed in Section 4.
Our "selective dynamical imaging" approach uses such a metric to identify intervals during an observation where the coverage is optimally configured for imaging. Importantly, these intervals are chosen before image reconstruction is attempted. In contrast, methods such as lucky imaging (e.g., Fried 1978) identify particularly useful images (e.g., with minimal distortion) after and on the basis of the reconstruction. A single score computed directly from (u, v)-coverage known a priori is preferable to an empirical approach (e.g., performing a simulated observation of a synthetic model and comparing reconstructions with the model), as it provides source-structure-agnostic assessments with a substantially lower performance cost.
Any metric capable of scoring different periods of (u, v)-coverage would have demonstrable limitations. Within a single observation, a comparison of (u, v)-coverage at two different points in time is a reliable way of determining which time region of the observation will produce superior image reconstructions. However, certain reconstruction-impacting data quantities can vary independently of the (u, v)-coverage. Sensitivity, calibration, and systematic uncertainty can also be important factors but are not probed by coverage metrics.

Normalized Cross-correlation
The normalized cross-correlation between two images is a measure of their similarity. By performing a dynamical reconstruction on a simulated observation and comparing each image of the dynamical reconstruction to the model, we can heuristically identify which portions of the observation produced the best reconstruction (i.e., the portions of the reconstruction with the greatest similarity to the model). We define the normalized cross-correlation ρ_NX(X, Y) of two images X and Y in an identical fashion to Event Horizon Telescope Collaboration et al.
A normalized cross-correlation between a model and the associated reconstruction is the most straightforward way to identify trustworthy periods of an observation, assuming that the reconstruction on synthetic data will behave in a similar fashion to a reconstruction on real data. This assumption is only upheld if care is taken to ensure that the qualities of the synthetic data match those of the real data. In addition, heuristic tests such as the normalized cross-correlation can be biased depending on the structure and inherent variability of the model chosen. If the source randomly aligns with collimated (u, v)-coverage at some point in the observation, it can result in a misleadingly high normalized cross-correlation that cannot be replicated for a different source model.
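A common form of the normalized cross-correlation is the mean product of the mean-subtracted images divided by the product of their standard deviations. The sketch below implements that form for two images given as flattened lists of equal length. It is an assumption that this matches the definition referenced above in every detail, and any maximization over relative image shifts is omitted. The function name is hypothetical.

```python
import math

def normalized_cross_correlation(x, y):
    """Normalized cross-correlation of two flattened images of equal size."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return covariance / (std_x * std_y)

model_image = [0.0, 1.0, 2.0, 3.0]
reconstruction = [1.0, 3.0, 5.0, 7.0]   # same structure, rescaled and offset
score = normalized_cross_correlation(model_image, reconstruction)
```

Because the images are mean-subtracted and normalized, the score is insensitive to an overall flux scaling or offset: it approaches 1 for images with identical structure and −1 for images that are exact negatives of each other.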

(u, v) Filling Fraction
Palumbo et al. (2019) propose a geometric scoring procedure for the (u, v)-coverage based on the specification of a desired array resolution θ_res and imaging field of view θ_FOV. θ_res sets an outer boundary with radius 1/θ_res in the (u, v)-plane within which a "filling fraction" is computed, and it is typically taken to be the nominal array resolution set by the longest baseline in an observation. θ_FOV determines a convolution radius of 0.71/θ_FOV corresponding to the scale in the (u, v)-plane over which the Fourier response to a filled disk on the sky of diameter θ_FOV would decay to half of its maximum amplitude; we use θ_FOV = 100 μas for the filling fraction computation in Figure 6. Intuitively, the largest image feature considered in the optimization of coverage sets the smallest scale of interest in the (u, v)-plane; thus, convolving a proposed set of (u, v) points by 0.71/θ_FOV yields a measure of what region in the (u, v)-plane is sampled by measured visibilities. The fraction of the bounding circle sampled by the convolved coverage is the filling fraction.
Figure 4. [...] shown as a function of radial baseline length ρ. Short (≤ 2 Gλ), medium (2 Gλ < ρ < 6 Gλ), and long (≥ 6 Gλ) baselines are displayed in green, red, and blue, respectively. Continuous fits of different (but equally well-fitting) models are overlaid. Fitting a ring model with infinitesimal thickness (denoted by J_0, representing a zeroth-order Bessel function of the first kind) to only medium and long baselines accurately represents the source shape and size but poorly constrains the total intensity of 1 Jy. In addition, an equally good fit can be obtained with a simple Gaussian fit to only short and long baselines. Data from all three baseline types are required to correctly constrain key properties of the source. This result highlights how model misspecification can lead to severe systematic errors, especially when working with limited baseline coverage.
Increasing the specified resolution (perhaps by increasing observation frequency) extends the bounding circle, decreasing the filling fraction unless θ_FOV is correspondingly decreased. In this way, the filling fraction captures some features of the "spatial dynamic range" discussed in Lal & Lobanov (2007). As shown in Figure 7 of Palumbo et al. (2019), the filling fraction metric is a successful and nearly linear predictor of image fidelity until the filling fraction reaches values near 0.9, at which point imaging techniques are limited by methodology-specific super-resolving scales, which for many imaging algorithms is at approximately half of the diffraction-limited CLEAN beamwidth in the case of the EHT (Event Horizon Telescope Collaboration et al. 2019d).
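The filling fraction can be approximated numerically by rasterizing the bounding circle and counting the grid cells that lie within the convolution radius of a sampled point. The sketch below is an illustration under stated assumptions and is not the implementation of Palumbo et al. (2019): the grid resolution, the function name, and the explicit inclusion of the conjugate points (−u, −v) are choices made here. The (u, v) points are assumed to be in wavelengths.

```python
import math

UAS_TO_RAD = math.pi / (180.0 * 3600.0 * 1e6)

def filling_fraction(uv_points, theta_res_uas, theta_fov_uas, ngrid=101):
    """Fraction of the bounding circle covered by disk-broadened (u, v) points."""
    r_out = 1.0 / (theta_res_uas * UAS_TO_RAD)    # bounding radius
    r_conv = 0.71 / (theta_fov_uas * UAS_TO_RAD)  # convolution radius
    # ASSUMPTION: each sample also covers its conjugate point, since the
    # visibilities of a real-valued image are Hermitian.
    points = list(uv_points) + [(-u, -v) for u, v in uv_points]
    inside = covered = 0
    for j in range(ngrid):
        gv = -r_out + 2.0 * r_out * j / (ngrid - 1)
        for i in range(ngrid):
            gu = -r_out + 2.0 * r_out * i / (ngrid - 1)
            if math.hypot(gu, gv) > r_out:
                continue  # outside the bounding circle
            inside += 1
            if any(math.hypot(gu - u, gv - v) <= r_conv for u, v in points):
                covered += 1
    return covered / inside

# A single sample at the origin, 25 uas resolution, 100 uas field of view.
single = filling_fraction([(0.0, 0.0)], theta_res_uas=25.0, theta_fov_uas=100.0)
```

For the single sample above, the result is close to the area ratio (r_conv/r_out)² ≈ 0.03. Shrinking the field of view enlarges the convolution radius and raises the fraction, which matches the trade-off between resolution and field of view described in the text.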

Largest Coverage Gap
An alternative metric probing the coverage isotropy is based on identifying the largest gap in the (u, v)-coverage; hence, we refer to it as the largest coverage gap (LCG) metric (Wielgus et al. 2020a). In this approach we treat the coverage as a set of sampled (u, v)-plane locations and find the largest circle that can be drawn within the limits of the coverage without containing a coverage point. Such a largest circular gap can be efficiently calculated with Delaunay triangulation of the coverage set (Barber et al. 1996). The diameter of the gap, d_max, can then be turned into a metric coefficient m_LCG defined in terms of d_max and ρ_max, where ρ_max is the longest projected baseline length. If we demand that the (u, v) distance corresponding to the center of the circle is less than ρ_max, then we have 0 ≤ m_LCG ≤ 1, with m_LCG = 1 corresponding to the limit of a complete continuous coverage. A coverage consisting of a single detection would correspond to m_LCG = 0. Unlike the filling fraction metric, the LCG metric is independent of the assumed field of view.
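The search for the largest empty circle can be sketched without a triangulation library by testing the circumcircle of every triple of coverage points, which is a brute-force stand-in for the Delaunay approach mentioned above (the empty circumcircles are exactly those of the Delaunay triangles). The sketch returns only the gap diameter d_max; converting it to m_LCG is left out because the defining expression is not reproduced here. The function names, the inclusion of conjugate points, and the neglect of circles whose centers sit exactly on the ρ_max boundary are assumptions of this illustration.

```python
import math
from itertools import combinations

def circumcircle(p, q, r):
    """Center and radius of the circle through three points (None if collinear)."""
    (ax, ay), (bx, by), (cx, cy) = p, q, r
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy), math.hypot(ax - ux, ay - uy)

def largest_gap_diameter(uv_points):
    """Diameter of the largest empty circle centered within rho_max."""
    # ASSUMPTION: conjugate points (-u, -v) count as sampled locations.
    points = list(uv_points) + [(-u, -v) for u, v in uv_points]
    rho_max = max(math.hypot(u, v) for u, v in points)
    best = 0.0
    for triple in combinations(points, 3):
        circle = circumcircle(*triple)
        if circle is None:
            continue
        (cx, cy), radius = circle
        if math.hypot(cx, cy) > rho_max:
            continue  # center outside the extent of the coverage
        # Keep the circle only if no sampled point lies strictly inside it.
        if all(math.hypot(x - cx, y - cy) >= radius - 1e-9 for x, y in points):
            best = max(best, 2.0 * radius)
    return best

sparse = largest_gap_diameter([(1.0, 0.0), (0.0, 1.0)])
denser = largest_gap_diameter([(1.0, 0.0), (0.0, 1.0), (0.0, 0.0)])
```

In this toy case (units of Gλ), adding a zero-spacing point shrinks the largest gap diameter from 2 to √2. The brute-force search scales as the fourth power of the number of points, which is acceptable for the few tens of baselines in an EHT snapshot but not for dense arrays.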

Isotropy and Radial Homogeneity
We propose a novel metric of (u, v)-coverage isotropy and radial homogeneity (hereafter referred to as the "isotropy metric") based on the limitations described in Section 4. Similarly to the LCG metric, the isotropy metric penalizes anisotropy of the coverage, although the two approaches differ appreciably. In this approach, we treat the distribution of baselines in the Fourier plane as a mass distribution and quantify the radial and angular homogeneity using the second moments of inertia. We define the isotropy metric coverage parameter 𝒞 for a given snapshot as

\[ \mathcal{C} \equiv \left(1 - \frac{\sqrt{\left(\langle u^2\rangle - \langle v^2\rangle\right)^2 + 4\langle uv\rangle^2}}{\langle u^2\rangle + \langle v^2\rangle}\right)\left(1 - \frac{\mathcal{K}}{\mathcal{K}_{\max}}\right), \]

where ⟨u²⟩, ⟨v²⟩, and ⟨uv⟩² are the second moments of the baseline distribution, 𝒦 is the Kolmogorov-Smirnov (K-S) distance of the radial distribution of baseline lengths from uniform, and 𝒦_max is the maximum value of 𝒦 at any point in time during the observation (or an arbitrary value for the purpose of cross-observation comparisons). The isotropy metric has the benefit of being fully analytic and automatically normalized between 0 and 1. A full derivation of the isotropy metric is presented in Appendix B.
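A per-snapshot evaluation of such a metric can be sketched as follows. The sketch assumes that the metric is the product of an isotropy term built from the second moments and a radial term 1 − 𝒦/𝒦_max, which is one form consistent with the description above; the function names are hypothetical, and the caller is assumed to pass the baselines together with their Hermitian conjugates.

```python
import math

def ks_distance_from_uniform(radii, r_max):
    """K-S distance between the empirical CDF of baseline lengths and the
    uniform CDF on [0, r_max]."""
    xs = sorted(r / r_max for r in radii)
    n = len(xs)
    dist = 0.0
    for i, x in enumerate(xs):
        # The uniform CDF at x is x; compare it with the empirical CDF
        # just after and just before the step at x.
        dist = max(dist, (i + 1) / n - x, x - i / n)
    return dist

def isotropy_term(uv_points):
    """1 minus the normalized difference of the principal second moments."""
    pts = list(uv_points)
    n = len(pts)
    uu = sum(u * u for u, _ in pts) / n
    vv = sum(v * v for _, v in pts) / n
    uv = sum(u * v for u, v in pts) / n
    return 1.0 - math.sqrt((uu - vv) ** 2 + 4.0 * uv ** 2) / (uu + vv)

def isotropy_metric(uv_points, k_max, r_max=None):
    """Isotropy term times radial homogeneity term (assumed product form).

    k_max : reference K-S distance (assumed to satisfy K <= k_max).
    r_max : longest baseline of the whole observation; defaults to the
            longest baseline in this snapshot.
    """
    pts = list(uv_points)
    radii = [math.hypot(u, v) for u, v in pts]
    if r_max is None:
        r_max = max(radii)
    k = ks_distance_from_uniform(radii, r_max)
    return isotropy_term(pts) * (1.0 - k / k_max)
```

Because every step is a sum over baselines plus one sort, the cost per snapshot is small, which is in line with the performance comparison reported in the discussion of the metrics.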

Discussion of Metrics
Despite differences in methodology and implementation, the metrics we examined find similar fluctuations in (u, v)-coverage quality and identify similar candidate time regions for high-quality imaging. A comparison of the metrics detailed in Sections 5.1-5.4 applied to the 2017 EHT (u, v)-coverage of Sgr A * is shown in Figure 6. In general, the second half of the observation has superior (u, v)-coverage as indicated by the metrics, and the period from ∼01:00 GMST to ∼03:30 GMST maximizes the various metrics.
The (u, v) filling fraction and LCG metric generally produce results similar to the isotropy metric. Disagreements between the metrics can be seen especially at the beginning of the observation (e.g., 17-19 GMST) and in the middle (e.g., 21-23 GMST). These are periods where the (u, v)-coverage is extremely sparse and not suitable for imaging, though the degree to which these periods are determined to be unsuitable varies depending on the specific considerations of the individual metrics.
The normalized cross-correlation metric, while the most direct measurement of time-varying reconstruction quality, lacks source structure agnosticism. If the source structure is known, the normalized cross-correlation metric can be a useful method of determining what periods of an observation are most advantageous for imaging that particular source structure. However, if the source structure is unknown, then a wide suite of representative source models must be tested to mitigate possible biases. Additionally, the normalized cross-correlation method will be tied to the particular imaging algorithm and hyperparameters used, making this metric less robust than the others considered.
The particular constructions of the metrics can lead directly to unintuitive or undesirable behavior. One example of undesirable behavior is a metric punishing a coverage for adding data points. Intuitively, more baselines lead to better coverage of the Fourier plane and therefore more information about the source. However, if these additional baselines are placed strategically, they can result in an unintuitive score assignment. A trivial example of this can be generated for the LCG metric. Consider a coverage with maximum baseline length less than ρ that achieves m_LCG ≈ 1. By placing a single baseline of length L far outside the initial coverage (i.e., L ≫ ρ), ρ_max = L and d_max goes as ∼L − ρ. This drives m_LCG to zero and seems to indicate that the coverage has become demonstrably worse, when in reality the coverage quality has largely stayed the same, with the improvement of a single ultralong baseline. This type of array pathology does not occur in the 2017 EHT coverage but may present an issue if the EHT goes to ultralong space baselines.
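The scaling in this example can be written out explicitly. The worked step below assumes that the LCG coefficient has the form 1 − d_max/ρ_max and uses the idealized gap size d_max ≈ L − ρ from the text; the numerical values are hypothetical and serve only as an illustration.

```latex
m_{\rm LCG} \approx 1 - \frac{L - \rho}{L} = \frac{\rho}{L}
\;\longrightarrow\; 0 \quad \text{as } L/\rho \to \infty ,
\qquad \text{e.g., } \rho = 8~{\rm G}\lambda,\; L = 80~{\rm G}\lambda
\;\Rightarrow\; m_{\rm LCG} \approx 0.1 .
```

Under these assumptions the score is set by the ratio of the original coverage extent to the new longest baseline, independent of how well the inner region is sampled.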
The isotropy metric exhibits similar misbehavior, as demonstrated in Figure 5. Given an isotropic coverage with a low number of baselines, adding just two baselines strategically can decrease the isotropy metric value by a substantial amount. With so few baselines, the addition of new baselines would intuitively be considered an improvement. However, the metric detects a decrease in isotropy and reports accordingly. This problem is only present for arrays with small numbers of baselines; for larger arrays, it is difficult or impossible to significantly alter the isotropy of an array configuration using only a few baselines without resorting to ultra-long-baseline placement as in the LCG example.
Any metric based purely on coverage has an additional limitation that arises from unusual source structure. The metrics described above attempt to predict reconstruction quality by analyzing the coverage available, but this prediction is performed under the assumption of "reasonable" source structure (i.e., source structure with smooth, continuous Fourier representation). However, we can construct simple examples that would render these metrics unhelpful by violating the assumption of reasonable source structure. Consider a source whose Fourier transform has zero flux density everywhere an array has (u, v)-coverage, and nonzero flux density everywhere else. Regardless of how good the coverage itself is (and therefore how well a given metric may score the coverage), there is no way to produce an accurate reconstruction of the source. This limitation is not likely to be an issue, as the restriction that source emission be nonnegative induces sufficient correlation in the Fourier domain that the possibility for arbitrary pathologies to "hide" in coverage-deficient swathes of the plane is severely limited.
The isotropy metric offers substantial performance (i.e., time and overhead cost) benefits over the other metrics considered, while reporting similar results. We test the performance of each metric by computing the per-snapshot score for a full 12 hr observation five times, resulting in ∼10,000 snapshots available for scoring. We run the performance assessments on an i5-1038NG7 10th-generation Intel x86_64 processor with 16 GB of RAM. The normalized cross-correlation, which must reconstruct hundreds of images and compare them to model images, takes several hours to complete. The isotropy metric performs ∼10^5 times faster than the full normalized cross-correlation, ∼20 times faster than the (u, v) filling fraction, and ∼10 times faster than the LCG metric, while producing similar assessments of coverage. The substantial performance differences between the metrics will become more pronounced with larger arrays, such as the next-generation EHT (ngEHT) coverage. The low overhead generated by the isotropy metric in comparison to the other metrics examined makes the isotropy metric a more efficient method of scoring (u, v)-coverages in important contexts, such as real-time track selection and long-term ngEHT site placement (Raymond et al. 2021), which is expanded on in Section 7.

Application of Metric to 2017 EHT Array
We apply the coverage metrics discussed in Section 5 to the EHT (u, v)-coverage of Sgr A * corresponding to 2017 April 7. The instantaneous metric values for each snapshot of the observation are shown in Figure 6. This method of scoring the observation clearly identifies distinct periods of varying coverage quality. The time region from ∼01:30 GMST to ∼03:10 GMST (denoted as "Region II" in Figure 6) has the highest overall isotropy and baseline density of the observation. We select this period as a candidate time region for high-quality imaging (a "good" time region, i.e., one where the typical metric score is high). In contrast, the time region from ∼19:45 GMST to ∼21:00 GMST (denoted as "Region I" in Figure 6), while relatively stable, displays substantially lower coverage quality. We select this time region to examine the behavior of reconstructions in periods of ambiguous coverage quality. Exact time stamps for these time regions are given in Table 1.
By performing reconstructions in these time periods, we can validate the capability of the (u, v)-coverage metrics to predict reconstruction quality based on coverage alone. We reconstruct four configurations of the ring+hs toy model detailed in Section 2: a 270-minute orbital period clockwise and counterclockwise, and a 30-minute orbital period clockwise and counterclockwise. The reconstructions in each time region are produced according to the RML and CLEAN imaging methods in Section 3, and we perform feature extraction on each image using the REX module of eht-imaging (Event Horizon Telescope Collaboration et al. 2019d) to recover the position angle of the hot spot at each moment in time. The extracted hot spot angles and the model orbits for Regions I and II are shown in Figures 7 and 8, respectively. Images sampled from dynamical reconstructions generated by each method are displayed in Appendix C. The "success" of a reconstruction is determined by the successful extraction of the hot spot position angle.

Figure 5. A pathological case for the isotropy metric that demonstrates an undesirable behavior. The isotropic coverage shown (blue) results in an isotropy metric value of ≈0.3 (𝒞_1, shown in blue). Adding two data points (red) strategically (i.e., in an anisotropic configuration) decreases the overall isotropy of the array and lowers the metric score by a factor of ≈1/2 (𝒞_2, shown in red+blue=purple). This change makes sense given the considerations of the metric: the new array is more anisotropic and therefore has a lower score. However, this behavior is undesirable since, intuitively, we expect that an array with more baselines will perform better than an array with fewer baselines. Note: in order to compute the isotropy metric as defined in Section 5.4, the example coverages shown above are assumed to be part of the 2017 April 7 EHT coverage of Sgr A * , and the corresponding value of 𝒦_max is adopted (see Appendix B).
We find that Region II produces reconstructions that facilitate accurate recovery of dynamical variability. The reconstructions show a ring of approximately 50 μas in diameter with a distinct hot spot. The recovered hot spot orientations are shown in Figure 8, along with a comparison to the model values. To compare the N recovered position angles with the corresponding model angles, we use a phase-adjusted rms. Overall, the Region II reconstructions successfully recover the dynamical variability in the model, with the rms between the recovered and model angles varying between 0.16 and 0.20 rad. Decreases in coverage quality as measured by the metrics (indicated pointwise in Figure 8 by increases in transparency, using the isotropy metric as an example) correlate with lower-quality recovery of the hot spot position angle. These low-quality angle recoveries are most obvious in the RML reconstructions of the counterclockwise T = 30-minute case. The CLEAN algorithm appears to be more resistant to the sudden loss of coverage and maintains reconstruction quality even through drops in metric score. Excluding these lapses in coverage quality, Region II clearly facilitates the recovery of source structure and at least one kind of dynamical variability, covering a wide range of periods and directions.
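One way to compare angles that wrap around at 2π is sketched below. This is an assumed form of a phase-adjusted rms, not necessarily the definition used for the values quoted above: each difference between a recovered and a model angle is wrapped into [−π, π) before the root mean square is taken. The function names are hypothetical.

```python
import math

def wrapped_difference(angle_a, angle_b):
    """Signed difference angle_a - angle_b in radians, wrapped to [-pi, pi)."""
    return (angle_a - angle_b + math.pi) % (2.0 * math.pi) - math.pi

def phase_adjusted_rms(recovered, model):
    """Root mean square of the wrapped differences between two equal-length
    sequences of position angles in radians."""
    diffs = [wrapped_difference(r, m) for r, m in zip(recovered, model)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))
```

Without the wrapping step, a recovered angle of 0.1 rad compared with a model angle of 2π − 0.1 rad would contribute an error of about 6.08 rad rather than 0.2 rad, which would inflate the rms for orbits that cross the branch cut.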
By contrast, a comparatively low scoring time region (Region I) does not produce reconstructions capable of accurately recovering dynamical variability. Dynamically reconstructed images show a ring-like feature, but the time variation of the brightness asymmetry does not match the model. Overall, the Region I reconstructions fail to recover the dynamical variability in the model, with rms of the recovered angles and model varying between 1.21 and 1.73 rad. The rms on the recovery in Region I is between 7 and 10 times higher than in Region II. For all tests, the scatter around the model is substantially larger than the scatter in Region II, rendering the accurate extraction of a period difficult or impossible. Reconstructions in periods outside of Regions I and II cannot recover even basic source structure without significant a priori information. Based on these results, we expect that reconstructions on real data will produce the most accurate and robust recoveries of orbital dynamical variability in Region II of April 7 of the 2017 EHT observations of Sgr A * . Reasonable imaging and feature extraction procedures failed to produce meaningful results in Region I. This ranking is consistent with the predictions of the coverage metrics described in Section 5 and Appendix B. A wide array of factors impact reconstructions, and the tests presented here do not constitute a realistic synthetic data suite for assessing whether or not Region II can accurately recover dynamical variability for Sgr A * . The tests provided here solely demonstrate that Region II is the best time region for performing dynamical reconstructions based on coverage considerations. Additional testing on more complex source types with realistic data corruptions can be found in Event Horizon Telescope Collaboration et al. (2022c).

Note. Regions I and II correspond to the time regions identified in red and blue, respectively, in Figure 6. The LMT dropout corresponds to the sudden loss of coverage that occurs partway through Region II. All time stamps correspond to 2017 April 7 unless otherwise noted. In UT, the observation begins and ends on April 7; however, when converted to GMST, Region I lies on 2017 April 7 while Region II lies on 2017 April 8.

Metric-based Array Comparisons
Go/no-go decisions about whether to proceed with an observing run are often made with limited information about the readiness or weather conditions at particular sites. Simulating (and scoring) array configurations with different dropouts, and characterizing changes in the size and quality of the identified candidate temporal regions, can facilitate those go/no-go choices while incorporating uncertainties about station status. One observation configuration can be considered "better" than another if it provides a candidate time region with a higher metric score, or a similar metric score for a longer duration. Figure 9 shows an example of this kind of analysis performed with a hypothetical ngEHT array. The top panel shows the isotropy metric (see Section 5.4) score per snapshot for the full array during a night of observation in which every station is observing, which represents the ideal scenario. The middle and bottom panels reproduce the metric score per snapshot assuming that four sites are unable to observe. By identifying a candidate time region and characterizing its quality (based on, e.g., average metric score in the time region) and duration (denoted as Δτ in Figure 9), we can track these characterizations through different combinations of site dropouts and estimate how critical the sites are to the array. Based on the computations in Figure 9, ALMA, JCMT, LMT, and SMA dropping out would be catastrophic to the array performance, as the optimal time region for dynamical imaging reduces in duration by ≈80% and reduces in quality by ≈50%. By comparison, the combined dropout of PV, PDB, CARMA, and LMT does not substantially change the duration or quality of the identified candidate time region.
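The comparison of dropout scenarios can be sketched in a few lines. The example below assumes that a candidate time region is the longest contiguous run of snapshots whose metric score stays at or above a chosen threshold; the threshold rule and the function name are hypothetical and are not taken from the analysis behind Figure 9.

```python
def best_time_region(times, scores, threshold):
    """Return (t_start, t_end, mean_score) for the longest contiguous run of
    snapshots scoring at or above `threshold`, or None if there is none."""
    times, scores = list(times), list(scores)
    best = None   # (first index, last index) of the longest run found so far
    start = None  # first index of the run currently being scanned
    # A trailing sentinel below any threshold closes a run that reaches the end.
    for i, score in enumerate(scores + [float("-inf")]):
        if score >= threshold:
            if start is None:
                start = i
        elif start is not None:
            last = i - 1
            if best is None or (times[last] - times[start]
                                > times[best[1]] - times[best[0]]):
                best = (start, last)
            start = None
    if best is None:
        return None
    i0, i1 = best
    segment = scores[i0:i1 + 1]
    return times[i0], times[i1], sum(segment) / len(segment)
```

Running this once on the full-array scores and once per dropout scenario gives a duration (end minus start) and a mean score for each case, and the fractional changes in those two numbers are the quantities compared in the text.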
A metric-based comparison additionally provides a natural and quantitative ranking for identifying which day of an observation campaign produced the best coverage for dynamical imaging. We can rank observations on separate days in a similar fashion to the dropout scenarios visualized in Figure 9 by comparing the duration and quality of the identified candidate time regions for each day. An example of such a comparison between the April 7 and April 10 runs of the 2017 EHT observation campaign is shown in Figure 10. Instantaneous metric scores are computed for each scan of both days, and candidate time regions for dynamical imaging (green) are identified. While both candidate time regions are of approximately the same duration, the candidate time region associated with April 7 displays substantially better coverage, making April 7 a better choice for dynamical imaging than April 10.

Conclusions
We have demonstrated that limitations of imaging associated with sparse baseline distribution can be inferred from the specific geometric properties of the (u, v)-coverage. Highly anisotropic coverage produces artifacts in reconstructions that distort the image in a direction consistent with stripes in the dirty beam. Additionally, an uneven radial distribution of the Fourier plane (u, v)-coverage tends to result in ambiguous image reconstruction. These limitations can be partially avoided by imaging in radially and angularly homogeneous coverage. The reconstruction issues associated with sparse coverage are exacerbated by rapid short-timescale variability, as seen in a wide array of astrophysical sources, including Sgr A * and the precessing jets of X-ray binaries.

… was used to perform the computation (see Appendix B). Generally, dropouts will impact the maximum coverage score achieved throughout the observation and the duration (shown as Δτ) of the most optimal time region. Characterizing a site's importance to the observation is a useful way of informing a go/no-go decision, which takes into account station readiness and probability of dropout. For the above observational scenario, the loss of, e.g., ALMA, JCMT, and SMA would likely motivate a "no-go" decision, whereas the loss of, e.g., CARMA and PV would not motivate canceling the night of observation.

Next, we surveyed and compared existing geometric measures of (u, v)-coverage, in addition to deriving a novel isotropy-based metric that addresses the specific limitations demonstrated in Section 4. The examined metrics included the normalized cross-correlation (Equation (2)), the (u, v) filling fraction (see Palumbo et al. 2019), and the LCG metric (Equation (3); see Wielgus et al. 2020a). The isotropy metric treats the distribution of baselines as a mass distribution and examines the second moment to rank coverages by homogeneity in the Fourier plane. The isotropy metric gives similar results to other (u, v)-coverage-based metrics while being more computationally efficient.
These metrics were applied to the April 7 data of the 2017 EHT coverage of Sgr A * (Event Horizon Telescope Collaboration et al. 2022a, 2022b, 2022c, 2022d) and used to select candidate time regions for high-quality dynamical imaging. All metrics identify a period from ∼01:30 GMST to ∼03:10 GMST ("Region II") that minimizes the coverage limitations each metric addresses. We also select a period from ∼19:45 GMST to ∼21:00 GMST ("Region I") with substantially lower coverage quality in order to test its reconstruction capability. Reconstructions of time-variable sources allowed successful recovery of the characteristic source variability in Region II. In contrast, reconstructions in Region I were unable to recover the characteristic motion. The ranking determined by the suite of reconstructions performed on synthetic data verifies the predictions made by the examined coverage metrics. We expect that attempts to recover variability in real EHT observations of the Galactic center will produce the most robust and accurate recoveries in Region II of the 2017 April 7 data set, and therefore we recommend performing dynamical imaging procedures in that time region.
Coverage metrics have additional utility for ranking observations against one another based on their ability to recover dynamical variability, which has a variety of applications to the broader field of interferometry. These metrics provide the ability to select observation time slots based on the capability of available antennas to recover particular dynamical evolution in the target. Such a scored assessment of coverage could well prove of use to other VLBI arrays, both as they make go/no-go decisions about whether to observe on a particular night and when identifying the periods of best coverage to perform static and dynamic imaging.
We thank the National Science Foundation (awards OISE-1743747, AST-1816420, AST-1716536, AST-1440254, AST-1935980) and the Gordon and Betty Moore Foundation (GBMF-5278) for financial support of this work. This work was supported in part by the Black Hole Initiative, which is funded by grants from the John Templeton Foundation and the Gordon and Betty Moore Foundation to Harvard University. Support for this work was also provided by the NASA Hubble Fellowship grant HST-HF2-51431.001-A awarded by the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Inc., for NASA, under contract NAS5-26555.
The Event Horizon Telescope Collaboration thanks the following organizations and programs: the Academy of Finland (projects 274477, 284495, 312496, 315721); the Agencia Nacional de Investigación y Desarrollo (ANID), Chile via NCN19_058 (TITANs) and Fondecyt 3190878; the Alexander von Humboldt Stiftung; an Alfred P. Sloan Research Fellowship; Allegro, the European ALMA Regional Centre node in the Netherlands, the NL astronomy research network NOVA and the astronomy institutes of the University of Amsterdam, Leiden University, and Radboud University; the Black Hole Initiative at Harvard University, through a grant (60477) from the John Templeton Foundation; the China Scholarship Council; Consejo Nacional de Ciencia y Tecnología (CONACYT, Mexico, projects U0004-246083, U0004-259839, F0003-272050, M0037-279006, F0003-281692, 104497, 275201, 263356). We gratefully acknowledge the support provided by the extended staff of the ALMA, both from the inception of the ALMA Phasing Project through the observational campaigns of 2017 and 2018. We would like to thank A. Deller and W. Brisken for EHT-specific support with the use of DiFX. We thank J. Delgado for helpful discussions and feedback. We acknowledge the significance that Maunakea, where the SMA and JCMT EHT stations are located, has for the indigenous Hawaiian people.

Appendix A Sample Synthetic Data Products
Here we present representative samples of the visibility amplitudes and closure phases that are generated from the synthetic data described in Section 2.1. Sample amplitudes are shown in Figure 11, and sample closure phases are shown in Figure 12.

Figure 11. Example amplitudes for the three main synthetic data types described in Section 2.1. The time-variable models have periods of 270 minutes. The amplitudes recorded during Regions I and II (see Table 1) are shown in red and blue, respectively. Region II has higher radial homogeneity in (u, v) distance, which contributes to its higher metric score and increased dynamical imaging capability. Error bars show 1σ thermal noise.

Appendix B Derivation of Isotropy-based Coverage Metric
Section 4 demonstrated that quantifying the isotropy of a (u, v)-coverage configuration can indicate whether it is suitable for producing accurate reconstructions of a dynamic source. We adopt a coverage metric of the form

\[ \mathcal{C}(B) = I(B)\, R(B), \tag{B1} \]

where B is the set of 2N baselines (including their Hermitian conjugates), I is a measure of the isotropy of the coverage, and R is a measure of the radial homogeneity of the coverage.
To estimate the radial homogeneity, we compare the cumulative distribution function (CDF) of the distribution of baseline lengths against the uniform CDF via the K-S test. The uniform distribution examined for this test ranges from 0 Gλ to the maximum baseline length achieved in the observation. This test returns a "distance" 𝒦 between the distributions, which increases as the distribution becomes less radially homogeneous, making the test in this context a measure of radial inhomogeneity. To convert the result of the K-S test into a measure of radial homogeneity, we select an upper bound 𝒦_max corresponding to the maximum distance from uniform that any individual baseline distribution attains throughout the observation. We then subtract the result of the K-S test from this maximum and normalize by it, i.e.,

\[ R = 1 - \frac{\mathcal{K}}{\mathcal{K}_{\max}} \tag{B2} \]

is our new metric of homogeneity. This metric is conveniently bounded between 0 and 1. To make this metric absolute, a fixed value of 𝒦_max can be chosen arbitrarily and applied to multiple observations. For the absolute comparisons in this paper (see, e.g., Section 7) a value of 𝒦_max = 0.513338437261774 ≈ 0.513 is adopted. This value of 𝒦_max is chosen to be the maximum value of 𝒦 achieved during the April 7 observation.
In order to measure the isotropy of the coverage, we examine the second moment (moment of inertia) of the distribution of baselines. As a spatial configuration of points with uniform weighting, the (u, v)-coverage can be treated as a mass distribution.
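For a two-dimensional mass distribution, a disk is considered isotropic, a rod is considered anisotropic, and the spectrum between the two cases can be probed using the moment of inertia tensor. Given 2N baselines (u_i, v_i) with uniform weighting, the tensor is built from the second moments of the distribution,

\[ \begin{pmatrix} \langle v^2\rangle & -\langle uv\rangle \\ -\langle uv\rangle & \langle u^2\rangle \end{pmatrix}, \qquad \langle u^2\rangle = \frac{1}{2N}\sum_{i=1}^{2N} u_i^2, \quad \langle v^2\rangle = \frac{1}{2N}\sum_{i=1}^{2N} v_i^2, \quad \langle uv\rangle = \frac{1}{2N}\sum_{i=1}^{2N} u_i v_i . \]

The principal moments of inertia can be computed from the eigenvalues λ_1 ≥ λ_2 of this tensor. From these, we can derive the following orientation-independent measure of isotropy:

\[ I = 1 - \frac{\lambda_1 - \lambda_2}{\lambda_1 + \lambda_2} . \]

This measure of isotropy is naturally normalized between 0 and 1. Substituting this expression and Equation (B2) into Equation (B1) gives the following expression for a coverage:

\[ \mathcal{C} = \left(1 - \frac{\sqrt{\left(\langle u^2\rangle - \langle v^2\rangle\right)^2 + 4\langle uv\rangle^2}}{\langle u^2\rangle + \langle v^2\rangle}\right)\left(1 - \frac{\mathcal{K}}{\mathcal{K}_{\max}}\right) . \]

This measure of coverage quality can be applied to partition an arbitrary VLBI observation into time regions ranked by their ability to accurately reconstruct dynamical sources.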
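The step from the eigenvalues to a closed-form expression in the second moments can be written out explicitly. The sketch below assumes that the tensor is the symmetric 2×2 matrix of second moments of the baseline distribution and that the isotropy measure is one minus the normalized eigenvalue difference; both are assumptions consistent with the description above rather than a transcription of the original equations.

```latex
\lambda_{1,2} = \tfrac{1}{2}\left[\left(\langle u^2\rangle + \langle v^2\rangle\right)
  \pm \sqrt{\left(\langle u^2\rangle - \langle v^2\rangle\right)^2 + 4\langle uv\rangle^2}\,\right],
\qquad
1 - \frac{\lambda_1 - \lambda_2}{\lambda_1 + \lambda_2}
  = 1 - \frac{\sqrt{\left(\langle u^2\rangle - \langle v^2\rangle\right)^2 + 4\langle uv\rangle^2}}
             {\langle u^2\rangle + \langle v^2\rangle} .
```

Two limiting cases serve as a check under these assumptions: a rod along the u axis has ⟨v²⟩ = ⟨uv⟩ = 0 and gives a value of 0, while a ring has ⟨u²⟩ = ⟨v²⟩ and ⟨uv⟩ = 0 and gives a value of 1.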

Appendix C Sampled Results from Synthetic Data Reconstructions
In Section 6, we test the coverage metrics examined in Section 5 by performing reconstructions of synthetic observations in selected time regions of the 2017 EHT observational coverage of Sgr A * . Here we provide representative snapshot images from each of the time regions. Sampled snapshot images from Region I are shown in Figure 13, and sampled snapshot images from Region II are shown in Figure 14.
Reconstructions in both Region I and Region II demonstrate clear recovery of a ring-like feature of approximately 50 μas in diameter, as both sufficiently probe the radial distribution of the source Fourier transform to constrain the overall size of the source. However, the directionally biased coverage of Region I produces incorrect reconstructions of hot spot location. By contrast, reconstructions in Region II repeatedly recover the correct hot spot location across all periods, directions, and imaging algorithms.
Though both time regions recover a ring-like feature of the approximately correct size, the Region II reconstructions provide a more accurate ring-to-central-depression flux density ratio. The increased accuracy is present in both the CLEAN and RML reconstructions. By contrast, reconstructions in Region I fail to consistently provide a visually distinctive depression and misrepresent the angular brightness profile of the source.

Figure 13. Sampled reconstructions from Region I. The top row shows the true model image for each configuration. All panels show model images and reconstructions at ∼21:00 GMST on April 7. The white circle in the lower right corner of each panel corresponds to an 18 μas diameter CLEAN beam. Even with substantial prior assumptions that facilitate ring reconstruction, the hot spot is frequently placed incorrectly, rendering this time region unsuitable for recovery of orbital angular variability.