Spectropolarimetric measurements of hidden broad lines in nearby megamaser galaxies: a lack of clear evidence for a correlation between black hole masses and virial products

High-accuracy black hole (BH) masses require excellent spatial resolution that is only achievable for galaxies within ~100 Mpc using present-day technology. At larger distances, BH masses are often estimated with single-epoch scaling relations for active galactic nuclei. This method requires only luminosity and the velocity dispersion of the broad line region (BLR) to calculate a virial product, and an additional virial factor, $f$, to determine BH mass. The accuracy of these single-epoch masses, however, is unknown, and there are few empirical constraints on the variance of $f$ between objects. We attempt to calibrate single-epoch BH masses using spectropolarimetric measurements of nine megamaser galaxies from which we measure the velocity distribution of the BLR. We do not find strong evidence for a correlation between the virial products used for single-epoch masses and dynamical mass, neither for the megamaser sample alone or when combined with dynamical masses from reverberation mapping modeling. Furthermore, we find evidence that the virial parameter $f$ varies between objects, but we do not find strong evidence for a correlation with other observable parameters such as luminosity or broad line width. Although we cannot definitively rule out the existence of any correlation between dynamical mass and virial product, we find tension between allowed $f$ values for masers and those widely used in the literature. We conclude that the single-epoch method requires further investigation if it is to be used successfully to infer BH masses.


INTRODUCTION
Water megamasers are extremely luminous sources of 22 GHz radiation generated by the amplification of microwave signals through stimulated emission (for a review, see Lo 2005). They can be found within a few parsecs of active galactic nuclei (AGN) and may be used to probe the kinematics of this inner region (e.g. Greenhill et al. 1996;Trotter et al. 1998;Peck et al. 2003). Some special disk systems, such as NGC 4258, have masers which trace the ridge-line of an edge on disk, allowing for precise measurements of the disk dynamics (Herrnstein et al. 1999). Measurement of the acceleration of systemic features provide an independent evaluation of H 0 (e.g. Kuo et al. 2013Kuo et al. , 2015Reid et al. 2013;Gao et al. 2016;Pesce et al. 2020). The rotation axis of the maser disk is also observed to align with the jet axis when jets are detected, suggesting the maser disk can be used to understand the geometry of the accretion disk (e.g. Greene et al. 2013;Kamali et al. 2019). * based on observations made with the Southern African Large Telescope (SALT) In addition, the Keplerian rotation of the sub-parsec scale accretion disk traces the black hole (BH) mass with high precision and accuracy (e.g. Miyoshi et al. 1995). Currently, there are only ∼ 100 BHs with masses determined by dynamical tracers such as masers, stars, or gas including both active and non-active galaxies (e.g. Kormendy & Ho 2013;McConnell & Ma 2013;Saglia et al. 2016), as this method requires that the BH sphere of influence be spatially resolved. Therefore, with current adaptive optics we are limited to objects within ∼ 100 Mpc for dynamical estimates, except for the most massive BHs. No other BH has a mass measured with the same precision as that in the Galactic Center, but the masers offer the most precise measurement after that (e.g. Maoz 1998;Kuo et al. 2010). Beyond the available dynamical masses, all other BH masses have been estimated using only indirect tracers, often involving accretion (e.g. Shen 2013). We will use the high accuracy and precision of the maser masses to test the fidelity of other methods of BH mass measurements, specifically single-epoch scaling relations in AGN.
In the absence of available maser measurements, the most accurate method for determining BH masses using emission from AGN is reverberation mapping (RM). This approach uses broad line region (BLR) gas that is not spatially resolved as a dynamical tracer to measure velocity from line widths. While it is not possible to resolve the gravitational sphere of influence in most of these AGN, it is possible to determine a size scale for the broad line region by measuring the time lag between variations in the AGN continuum to those in the emission lines of the BLR (e.g. Blandford & McKee 1982;Peterson 1993). This delay provides an estimate of the light-travel time across the BLR, from which the average radius of the BLR can be determined. The BLR radius (R), in combination with the gas velocity measured from the width of the broad emission lines (W) and gravitational constant (G), can then be used to calculate a virial product (Equation 1).
Under the assumption that the BH dominates the gravity in this region, the virial product is expected to be correlated with the mass of the BH, but there is a virial pre-factor, f , required to account for the geometry and dynamics of the disk. The pre-factor is defined such that f multiplied by the virial product gives the virial BH mass (Equation 2).
In general, we do not know the shape or kinematics of the BLR, whether it is flattened or round, or whether the gas is inflowing, outflowing or neither. In a growing number of cases, this has been modeled (e.g. Pancoast et al. 2014b;Grier et al. 2017;Li et al. 2018Li et al. , 2022Williams et al. 2018Williams et al. , 2020Bentz et al. 2021;Villafaña et al. 2022, see more discussion below) but in general, an average value of f has been used.
As well as providing an estimate of the BH mass, RM has been used to calibrate a relationship between the luminosity of the AGN and the BLR size (e.g. Bentz et al. 2013). With this relationship, the BH masses of distant AGN can be estimated through the virial product with only luminosity and a measured line width (Vestergaard 2002), known as the single-epoch method. Most inferences about the cosmic evolution of BH mass density, and potential evolution in BH-galaxy scaling relations, have relied on these single-epoch virial masses (e.g. Laor 1998;Wandel et al. 1999;McLure & Jarvis 2002;Vestergaard & Peterson 2006;Kelly & Shen 2013;Volonteri & Reines 2016;Pensabene et al. 2020). Therefore, it is crucial to determine whether the virial product provides an accurate value of BH mass.
The virial estimate relies on multiple assumptions about the structure of the BLR. Uncertainties in cal-ibrations alone lead to estimates of 0.3-0.5 dex scatter (Vestergaard & Peterson 2006;Shen 2013). There are also as-yet unquantified systematic errors that are still a large cause for concern. Primarily, the virial estimate assumes that broad emission lines are virialized (e.g. Peterson & Wandel 1999, 2000Onken & Peterson 2002;Kollatschny 2003;Bentz et al. 2010). Additionally, the gas velocity of the BLR is assumed to be at the radius measured by reverberation mapping, but this is not necessarily the case (Krolik 2001). Although gravity is assumed to dominate in the BLR, there is an unknown contribution from radiation pressure (e.g. Krolik 2001;Marconi et al. 2008Marconi et al. , 2009Netzer & Marziani 2010). This could introduce additional scatter, as well as a luminosity dependence (Shen 2013), which may also be generated from BLR breathing modes (Wang et al. 2020).
The determination of an accurate black hole mass also depends on the choice of f , which in turn must be calculated through an independent method. Historically, f has been calibrated by aligning the M-σ * relation of reverberation mapped AGN with that of quiescent galaxies (e.g. Onken et al. 2004;Collin et al. 2006;Woo et al. 2010;Graham et al. 2011;Grier et al. 2013;Batiste et al. 2017). Although f is often taken to be a constant, it has been found to vary significantly between objects (Yu et al. 2019). An independent method of determining f by modeling the BLR has also found variation in the virial pre-factor (e.g. Pancoast et al. 2014b;Grier et al. 2017;Williams et al. 2018). This modeling is intensive, and requires densely sampled light curves, so it has only been done for 27 objects so far (Villafaña et al. 2022). More comparisons are needed.
Evaluating the accuracy of the virial product as a probe of BH mass requires a comparison to a known, dynamical mass. In general, this is very challenging since luminous AGN are needed for RM, but severely complicate stellar or gas dynamical measurements, as their light swamps that of the stars. Although there are a few objects with both measurements (e.g. Davies et al. 2006;Onken et al. 2007;Den Brok et al. 2015), building up this sample will be slow. We instead use the larger sample of objects with maser dynamical masses. However, the accretion disk is at very high inclination to enable masing, so the BLRs of maser galaxies are all obscured. Therefore, we must use spectropolarimetric measurements to probe the BLR and measure the hidden broad lines so that we may calculate a virial product. Similar work has been done previously for smaller samples of megamaser galaxies (Kuo et al. 2010;Du et al. 2017). By expanding this sample through additional spectropolarimetric measurements of maser galaxies, we sought to test the hypothesis that a direct correlation exists between virial product and BH mass.
In Section 2 we describe our spectropolarimetric observations of megamaser galaxies and subsequent data reduction. We present our measurements of broad line widths, along with additional values from the literature, in Section 3. BH masses determined through the virial product or through RM modeling are given in Section 4. Section 5 includes discussion of the virial pre-factor, and implications for the virial mass and BLR structure. We summarize our results in Section 6 and discuss possible future work.

Sample
We began with the sample of megamaser galaxies with known, dynamical BH masses listed in Kuo et al. (2020). The dynamical masses were determined through the work of the Megamaser Cosmology Project (MCP; Reid et al. 2009;Braatz et al. 2010). We selected any megamaser disk with a published BH mass, even if doublepeaked rather than triple-peaked with more complex kinematics. Uncertainties in the BH mass are adopted from Kuo et al. (2020) or Greene et al. (2016) based on the dynamical modeling papers referenced therein.
Of the 22 megamasers included in the parent sample, we observed nine as described in Section 2.2. Among these nine objects, we find evidence of a polarized broad line in three (Section 3.2). In addition to the nine we observed, we include six additional galaxies with measured polarized broad lines in the literature. Our complete sample of objects with broad line widths is described in Section 3.

Data
Linear spectropolarimetry of nine megamaser galaxies with known disk dynamical BH masses was obtained with the Robert Stobie Spectrograph (RSS) on the Southern African Large Telescope (SALT). See Table 1 for dates of observation and exposure times. Each of the nine objects was observed on one night, except for NGC 1194 which was observed on two. Exposure time was divided evenly between four waveplate angles, with three observations at each angle. The seeing was approximately 0.6 . Resolution was R ≈ 1065, and the pixel scale was 0.1267 per pixel. The spectra were taken in a wavelength range of 4200Å-7270Å.

Reduction
The data were reduced with the polsalt 1 extension to the pysalt 2 (Crawford et al. 2010) reduction pipeline with a few minor modifications. Basic reduction steps include overscan subtraction; corrections for gain, crosstalk, and distortion; and cosmic ray cleaning. We modified the wavelength calibration method slightly to ensure that the wavelength was fit over the full pixel domain. After wavelength calibration, individual spectra were extracted by manually selecting the center and width. The O and E spectra for each observation were interpolated to use the same wavelength solution, then combined to calculate the Stokes parameters.
At this step, the reduction pipeline was modified to account for masked pixels. In the original software, if any of the three observations at a given waveplate angle had a masked pixel at a certain wavelength, the corresponding pixel in the combination would be masked. We altered the reduction such that if only one pixel out of three were to be masked, that pixel would be replaced with the average value of the remaining two pixels.
The Stokes parameters were combined to generate the total intensity, polarization fraction (P), and polarization angle (θ) for each object. The resulting spectra are missing ∼50Å of data between ∼5220-5270Å and ∼6260-6310Å due to the location of the chip gaps. These gaps do not affect the analysis of the broad Hα region.
Before fitting the broad Hα line ( §3.2), we performed additional continuum subtraction from the Stokes parameters following the method described in Capetti et al. (2021). We estimated the continuum polarization for each object by taking regions of 30-80Å on either side of the Hα line, then performing a constant fit to the values of I, Q, and U between these regions. The continuum fit was then subtracted before the Stokes parameters were combined to find P and θ. The regions used for the background fit were chosen in each object to avoid emission and absorption lines, as well as the chip gaps. This subtraction improved the detection of the polarized broad lines and removed interstellar polarization in the region of interest.

Standard Star
To ensure the observations of the nine sample objects are properly calibrated, we observed a standard polarized star, BD-12 5133. This star has a known polarization fraction of 4.27 ± 0.02 % in the V -band, and a polarization angle of 145.88 ± 0.09 • (Cikota et al. 2016). We apply the polsalt data reduction to the standard star and measure a polarization fraction of 4.6 ± 0.2 % and a polarization angle of 144 ± 1 • averaged over the V band (5070 -5950Å). The values of polarization fraction and angle are consistent within 2σ so we are assured the reduction is well calibrated, although we discuss a possible calibration issue in Section 3.1.

Spectra
For each object, we show P and θ in 25Å bins as well as the total and polarized intensity for the full observed wavelength spectrum. We also show total and polarized intensity in the Hα region after performing continuum subtraction. One example is shown for IC 2560 in Figure  1, and the rest in Appendix A.
We find the average θ and P in the V-band continuum (5800 -6300Å) and across Hα (6500 -6625Å) in the rest frame wavelength for each object following Ramos Almeida et al. (2016). We also find the signalto-noise ratio of P in the same regions. These values are given in Table 2. We find that θ is not well determined due to large scatter over short wavelength ranges. Our values have fractional errors of ∼ 30%. A precise value of θ, however, is not important for our analysis of the virial product.
We find the results of P and θ for IC 2560 and NGC 3393 to be consistent within errors when compared to the observations in Ramos Almeida et al. (2016). Our measurements of NGC 1068 agree with the results presented in Inglis et al. (1994) and Young et al. (1995) for observations of the nucleus.
The value of P increases towards the red end of the spectrum for both IC 2560 and NGC 5765b ( Figure 1). This is most likely an issue with calibration rather than S/N as we do not see the same feature in the other objects in the sample all with similar values of S/N. This might be indicative of a red galaxy continuum present in our polarized spectra. However, as we subtract the total continuum before fitting the broad feature ( §2.3), this issue should not affect our measured values.

Fitting the Spectra
For each of the nine observed objects, we fitted the Hα-[NII] lines in both total and polarized intensity using the astropy LevMarLSQFitter function. In the total intensity, we used three Gaussian components for the narrow lines and a constant background continuum. The narrow lines were fixed to have the same width in velocity space. The relative amplitudes of the [NII] lines were fixed in a 1:3 ratio, and the relative positions were set by the known wavelength difference. In the polarized intensity, we fitted for the same three lines, although they are not always present in the polarized light, with the same constraints and a constant background. We also included a broad component represented by an additional Gaussian peak. The broad peak was initialized at the same wavelength as Hα, but its position was allowed to vary. In the case of NGC 1068, there are so many velocity components within our aperture that we could not find a model including narrow lines to fit the polarized spectrum well. Therefore, we only included the broad feature and constant background.
We are primarily interested in polarized intensity, and do not need a precise polarized fraction. Therefore, we do not perform starlight subtraction. We do note, however, that because we do not subtract the continuum, our polarization fractions should be considered lower limits.
Of the nine objects, three show evidence of a broad Hα feature: IC 2560, NGC 1068, and NGC 5765b. We find evidence of a broad feature in the weaker Hβ line only for NGC 1068. The total and polarized fits for these objects along with residuals are shown in Figures  2 -4.
To confirm the presence of a broad line in these three galaxies, we fitted the spectra with and without a broad component and calculated ∆χ 2 between the two models. For the broad component to be considered significant, ∆χ 2 must be greater than 2.7 (90% confidence) times the number of additional parameters (three). We found ∆χ 2 to be greater than 8.1 in all cases.
To estimate the error in the broad line width, we took 1000 random samples from the spectrum assuming a normal distribution of polarized flux at each wavelength. The mean and standard error were assigned to be the observed values output by the reduction pipeline ( §2.4). We then refit both the narrow and broad components in the artificial observations. Some samples did not require a broad line component by our ∆χ 2 test. We did not include these cases in calculating the distribution of broadline properties. Any broad line with lower width than the narrow features was also excluded. Fewer than 15% of samples were excluded by these conditions for each object. For each accepted sample, we determined the broad line parameters. From the distribution of broadline widths, we found the difference from the mode to the 1σ level of significance (the 84 th and 16 th percentiles, respectively). This provided the upper and lower errors on the FWHM measured from the original spectrum.
In addition, for each sample we calculated the percentage of total polarized flux contained by the broad line by integrating both the broad feature and the total polarized flux. The flux contained in the broad feature was not consistent with zero in any of the three objects, providing additional evidence that the broad feature is a significant component.
Observations of the remaining six objects were insufficient to confirm the presence of a broad feature. Although we do not find a polarized broad line in NGC 3393 with our observation, a broad line was found in this object by Ramos Almeida et al. (2016).

Broad Line Widths
We combined our observed broad-line widths with additional objects from the literature. Here, broad-line width is defined using FWHM rather than line dispersion (σ line ), which produces a different standard value of f (see e.g. Wang et al. 2019). We took all spectropolarimetric measurements with broad Hα features in megamaser galaxies with known dynamical masses. The broad line widths are either provided directly from these sources, or in the case of NGC 2273 estimated by Kuo et al. (2010) from the spectrum provided by Moran et al. (2000).
We found additional values of FWHM for eight objects. For two of these objects, IC 2560 and NGC 1068, we also observed polarized broad features and took the perform a very similar fitting method to that described in Section 3.2 on data from VLT/FORS2, and find broad features in four megamaser disk galaxies. All measurements in our final sample are of polarized broad features, with the exception of NGC 4258. A broad feature in total intensity is observed by Ho et al. (1997) for this object. Barth et al. (1999) observe a polarized spectrum of NGC 4258, and fit the broad line fixing the width to the Ho et al. (1997) value. Although Barth et al. (1999) determine this is not a significant detection of a broad feature, we include this measurement because NGC 4258 is the archetypal megamaser galaxy and this is the only example of a broad line width for this object. Because Barth et al. (1999) fix the broad line width, we had to assign an uncertainty to the value. We chose a similar fractional error to the highest uncertainty measurements in our sample.
In the case of NGC 4388, we have both a direct-light spectrum from Ho et al. (1997)  Nominally these are consistent within the uncertainties for NGC 4388, but the difference is large, so we should view NGC 4258 with some skepticism.
All broad line widths including our observations and literature values are given in Table 3.

BLACK HOLE MASSES
Our main goal in this paper is to use the secure BH masses derived from the maser dynamics to test the fidelity of single-epoch BH masses. We thus review the strengths and limitations of each method briefly before turning to our comparison.

Maser Dynamical Masses
BH masses can be measured through the observation of dynamical tracers such as stars, gas, and masers (e.g. . Of these dynamical methods, the most precise extragalactic BH masses are determined with maser dynamics. There are significant limitations to this method, however. Megamaser disks must be edge-on to be detected and within ∼ 100 Mpc to be spatially resolved (Kuo et al. 2010). To accurately determine the BH mass, the galaxies must also have a Keplerian rotation curve (Kuo et al. 2010).

Reverberation Mapping
While we cannot spatially resolve the broad line region, we can use temporal variability to determine a characteristic size. All AGN show variability in their disk emission on timescales of days to months, and because the BLR is photoionized by the UV photons from the accretion disk, this leads to variability in the broad line emission. There is a lag, however, because the BLR sits light-days from the BH. This time separation can be measured, and provides an estimate of the size scale of the BLR (e.g. Blandford & McKee 1982;Peterson 1993). With the known speed of light, the average radius of the BLR can be determined. This can be used in combination with a velocity estimate and the virial parameter f to calculate a virial mass (Equation 2).
Onken et al. (2004) find f to be a constant value of 1.4 when using FWHM as a velocity estimate. The value of f is generally calibrated using the relationship between BH mass and galaxy stellar velocity dispersion (σ * ) by assuming that the scaling relations for quiescent galaxies are the same as those of AGN (e.g. Onken et al. 2004;Collin et al. 2006;Woo et al. 2010;Graham et al. 2011;Grier et al. 2013;Batiste et al. 2017). It is typically taken to be a constant of order unity, although it has been shown to vary between objects (Yu et al. 2019) and there is an uncertainty of ∼0.4 dex on the value from calibration alone (Shen 2013). Any dependence of the scale factor f on AGN properties (e.g. luminosity) constitutes a major systematic uncertainty in BH mass determination ( §5).
High cadence, high signal-to-noise RM data allow for modeling of the BLR, as it becomes possible to measure the lag between continuum and line emission as a function of velocity. These data can constrain idealized models of the BLR that include flattening, inclination, and kinematic structure as free parameters, (e.g. Pancoast et al. 2014b;Grier et al. 2017;Williams et al. 2018Williams et al. , 2020Bentz et al. 2021;Villafaña et al. 2022). In practice, simulated line profiles are generated from BLR models and then compared to the observed, reverberation mapping data to produce constraints on the model parameters. Using this technique, the mass of the BH is determined independently of the virial product, and therefore does not require a choice of f . Instead, the f -factor can be extracted as a model parameter along with the BH mass and other values of interest.
While the modeling is independent of f , it is limited by the models of the BLR that go into the fitting. If the models do not span the space of real BLRs in dynamics or structure, then it is possible to induce systematic errors in the BH masses. One interesting test is to attempt to recover BH masses from a simulated BLR. In an analysis of multiple RM modeling methods, Mangham et al. (2019) use a rotating, biconical disk wind BLR model and generate mock data with simulations of ionization and radiative transfer. These data are passed through the RM modeling programs to evaluate how well they extract the BLR kinematics and parameters. Interestingly, while the CARAMEL model (Pancoast et al. 2011(Pancoast et al. , 2014a fails to recover the proper kinematics, it does accurately recover the time delay, inclination of the BLR, and the BH mass. We

Single-Epoch Masses
It is possible to use the radius-luminosity relation calibrated from RM to estimate a BH mass from a single spectrum, the so-called "single-epoch" BH mass (Vestergaard 2002). Again, assuming that the BLR is virialized, one takes the luminosity and infers the size scale of the BLR using the radius-luminosity relation. As in RM, the velocity of the BLR gas comes from the line-width, and the virial factor f is usually derived as a constant scaling that makes the ensemble of RM masses obey the M-σ * relation. Single-epoch masses typically have an uncertainty of ∼ 0.5 dex (Shen 2013). Here, our goal is to test these single-epoch masses against the well-known maser mass using our polarized broad line measurements We estimate the virial products of our observed sample with Equation 1. We measure the velocity scale, represented by W in Equation 1, with the FWHM of the polarized broad lines, see Table 3. The size of the BLR is estimated through the radius-luminosity relation determined from RM, and is given by Equation 3 (Bentz et al. 2013).
log 10 (R BLR /1 lt-day) = 1.527 +0.031 −0.031 + 0.533 +0.035 −0.033 log 10 (λL λ /10 44 erg/s) This relation gives the radius of the broad line region as a function of its 5100Å luminosity. Although recent results point to possible variation in the slope of the R-L relation (Alvarez et al. 2020), given the limited luminosity range of our sample, any slope variation will be minimal. The optical luminosity of the BLR cannot be measured directly in obscured AGN, so instead we choose to use high energy (E > 10 keV) X-rays to provide a proxy for the bolometric luminosity. Hard X-rays are highly penetrating even in the most Compton-thick AGN, which several of the masers sample are known to be (e.g. Masini et al. 2016).
The most sensitive, and currently only, focusing hard X-ray telescope is the Nuclear Spectroscopic Telescope Array (NuSTAR), which provides high-quality X-ray spectroscopy in the 3-79 keV band. All nine of the masers considered here have previous observations with NuSTAR (Arévalo et al. 2014;Baloković et al. 2014;Bauer et al. 2015;Masini et al. 2016Masini et al. , 2019). These studies self-consistently fit each individual NuSTAR spectrum with a well-motivated transmission and reflection model to fully account for even the heaviest obscuration along the line of sight, providing the most direct measure of the intrinsic X-ray luminosity in the 2-10 keV band derivable directly from the high-energy emission. We convert the intrinsic 2-10 keV luminosities provided by these studies to the luminosity distances adopted throughout (see Table 3). To estimate the bolometric luminosity for each maser, we use these hard X-ray de- Here, L X is the 2-10 keV intrinsic X-ray luminosity. The error on K X is dominated by the intrinsic scatter of 0.37 dex. Duras et al. (2020) provides an additional bolometric correction for 4400Å luminosity, Equation 5.
Again, the error is dominated by intrinsic scatter with a value of 0.26 dex. To convert from 4400Å to 5100Å luminosity, we use the power law fit to the composite spectrum in Vanden Berk et al. (2001), Equation 6.
This gives the 5100Å luminosity to be approximately 80% of that at 4400Å. The final luminosity values can then be used in Equation 3 to calculate the radius of the broad line region.
With these values, the virial products of each object can be estimated, and are given in Table 3. The virial products can then be compared to dynamical masses from maser disks (Greene et al. 2016;Kuo et al. 2020).
In Figure 6, we compare the virial product of each maser to its known dynamical mass. These products do not include the f -factor. Rather, different values of f are represented by the dashed lines in the figure. If all objects were to fall on the center, bold line, the BH mass would be equal to the virial product with no additional geometric factor. Each additional line has an f value that is five times higher than the line above it. Therefore, within this maser sample f ranges from approximately 0.1 to 40 (or 0.1 to 1.3 if NGC 4258 is excluded).
For a measured value of the virial product, the true dynamical mass of the object can vary greatly. Sources with a virial product of approximately 10 7 M , for example, can have a dynamical mass from 10 6 -10 8 M within only this sample of nine objects. Therefore, f must be well calibrated to determine the true dynamical mass from the virial product. We will consider relationships between f and other observable parameters in Section 5.2. We will also explore whether we should expect to find a well-defined f factor given the uncertainties in our virial product.

DISCUSSION
We first discuss theoretical models of the BLR, and the importance of different measured parameters. Then we turn to our final sample, which includes 9 masers with broad polarized lines and 16 RM+dynamical modeling objects from the literature. We consider whether the single-epoch masses are correlated with the dynamical mass measurement, first with the maser sample alone, where we understand well the dynamical masses and uncertainties, then including the full sample. We conclude with a discussion of the caveats to these results.

Theoretical expectations for BLR structure
We have focused on using the BLR as a tracer of the BH mass. The BLR, however, is also one of our primary tools for understanding accretion onto supermassive BHs, as the ionization structure gives us clues about the SED of the accretion disk, and the dynamics are tied to the emission from the disk. Over many decades, several different models have been proposed for the BLR, including orbiting clouds, inflowing and outflowing gas, and rotating disk winds (see reviews in Mathews & Capriotti 1985;Sulentic et al. 2000; Czerny 2019), each of which show some success, particularly in explaining the photoionization of the BLR gas (e.g.

Kwan & Krolik 1981; Korista & Goad 2004
). RM is one of the best ways to probe this region, and velocity resolved RM is now within reach for large samples.
One of the models that has been explored most is that of the disk wind (e.g. Shields 1977;Emmering et al. 1992;Chiang & Murray 1996)  In the disk wind model, there are several concrete predictions for how f might depend secondarily on other parameters. For instance, we expect the geometry of the system, including inclination and relative positions of the BLR and polar scatterers, to affect observed line width, an effect which must be accounted for in f (e.g.  (2004) find the disk wind to be sensitive to Eddington ratio. In general, the disk structure and temperature depend on luminosity, and therefore too the BLR, so there is a strong motivation to explore this question empirically.

Correlations between observable and derived values
For the maser sample as a whole, we do not find clear evidence of a correlation between virial product and dynamical BH mass. We will quantify the lack of correlation below. However, in this section we also try to determine whether there is a secondary parameter driving the relation between f and dynamical BH mass, which might help us understand the virial BH masses.
We evaluate the possible trend between the virial product and dynamical mass, along with other correlations, using the Pearson correlation test. Because the values we will compare have individual, possibly correlated, errors we use a Monte Carlo (MC) method to explore the correlations rather than taking the Pearson correlation as measured. To perform the MC correlation test, 10,000 samples of BLR radius (or luminosity), FWHM, and dynamical mass are taken assuming a normal distribution for each with the mean and standard error set by values in Table 3 and Equations 3-6. These values are then used to recalculate the virial product, and we calculate f by dividing the dynamical mass by the virial product. We measure the r and p values from  the Pearson test on each sample and look at the distributions of r and p over all samples to evaluate the correlation between different values. The distributions of these values for the relationship between virial product and the mass of the BH are given in Table 4. Additional correlations are given in Table 6 in Appendix B. If the virial product is a good estimator for the dynamical mass of the BH, there should be a high correlation between the two values. In the full sample of masers, however, we see no correlation between the mass of the BH and the virial product. We additionally see no correlation between the mass of the BH and the individual components of the virial product: luminosity or BLR radius and FWHM. Even if NGC 4258 is removed, there is no significant correlation. This implies that the f -factor is unlikely to be a constant value, and that the BH mass depends sensitively on the per-object value of f . Given that f is not independent of dynamical mass, we do recover an expected trend between implied per-object f -value and M dyn for the maser sample.
If a relationship existed between the observed parameters, i.e. luminosity and FWHM, with the f -factor, these values could be used to calibrate f and find an accurate mass from the virial product. Additionally, from our understanding of the structure of the BLR, we may expect a dependence on luminosity. Therefore, we search for a correlation between f and luminosity or FWHM, but do not find a strong correlation with either. We similarly would expect a relationship between Eddington ratio and f -factor, and we observe a possible correlation in Figure 6. When the Pearson test is performed, however, we do not see a strong relationship between these parameters.
After exploring the correlations between parameters using only the masers, we combine the maser sample with the RM sample described in Section 4.2. Figures 7  and 8 show comparisons between f and observable and derived values, respectively, including all objects. The two panels of Figure 7 show f compared to luminos- ity and FWHM respectively. We find no evidence of a correlation between these values. The f -factor does appear to have a relationship with the Eddington ratio and BH mass when including the maser and RM samples, as seen in Figure 8. Considering only the maser sample, we would expect f to be related to Eddington ratio and dynamical mass because the mass of the BH is included in all three values; f is directly proportional to the dynamical mass and Eddington ratio is inversely related. We note that we still see this possible relationship in the combined sample, where f is not derived directly from M BH as in the masers. The Pearson test, however, does not provide evidence for a strong relationship between these parameters. Additionally, even if there was a relationship between f , Eddington ratio, and dynamical mass, these values are not directly measurable and therefore would not be useful for determining the value of f for a given object.
In the past, one confirmation that single-epoch measurements are accurate tracers of BH mass has been the measured correlation between σ * and virial product. Indeed, f has been calibrated by solving for the value bringing the virial products in line with the M-σ * relation. Therefore, we also examine the relationship be-tween BH mass, σ * , and f . Because the M-σ * relationship is used to calibrate f , we should see a correlation between σ * and BH mass or f . We compile values of σ * for the majority of the objects in our sample, see Table 7 in Appendix C. When performing the correlation test, however, we do not find evidence for a relationship between σ * and BH mass or f in either the sample of megamasers taken alone or when RM modeling objects are included.
Given the large uncertainties on individual virial products, it is important to ask whether we could measure a single f value from our sample even if virial product and dynamical mass were perfectly correlated. We must test if our sample is strong enough to rule out a relationship between virial product and dynamical mass, and do so as follows.
First, we generate artificial sources by selecting dynamical masses from a uniform distribution spanning the range of our maser sample. Assuming a perfect correlation between dynamical mass and virial product, we calculate virial products by choosing a constant value of f . We assign errors to both quantities by taking the average error associated with the virial products and dynamical masses in our maser sample. This allows us to  simulate an observation of each object by sampling a Gaussian with mean given by the generated virial product or dynamical mass, and a standard error from the average value. We repeat the process to generate a random sample with 10 4 simulated maser objects, each with an observed dynamical mass and virial product. These values are combined to generate an observed f value. The Kolmogorov-Smirnov (K-S) test is used to compare the generated sample to our true sample of nine maser objects. This is done for different values for f , and including both the masers alone and the combined sample of masers and RM modeling objects. Results are shown as the solid lines in Figure 9.
From the results of the K-S test, we can rule out a correlation between dynamical mass and virial product with a single value of f = 1.4 (Onken et al. 2004). We can also almost entirely reject the 1σ range of f = 1.12± 0.3 given by Woo et al. (2015) with a p-value below 0.05. However, we cannot rule out a perfect correlation for all values of f . For example, when including the maser sample alone, values of f between ∼0.1-0.5 are allowed. As seen in Figure 6, the majority of maser objects fall  . K-S test p-value generated by comparing the observed maser or maser and RM modeling sample to a randomly generated, perfectly correlated sample. The blue shaded region represents the values of f included in the 1σ range of Woo et al. (2015), f = 1.12 ± 0.3. The vertical blue line gives f = 1.4 as predicted by (Onken et al. 2004). The horizontal line represents a p value of 0.05. If a value of f falls below the line, we can reject a perfect correlation between dynamical mass and virial product with that single f value. The test is performed for virial products calculated with the values of FWHM given in Table  3 (solid lines), and the values of FWHM after 2000 km s −1 broadening is subtracted in quadrature (dashed lines). We find it to be unlikely that our unchanged sample could be drawn from a perfect correlation between dynamical mass and virial product with f from either Woo et al. (2015) or Onken et al. (2004). It would be more difficult, however, to reject these f values if thermal broadening were present.
in this range of f values. To show that virial product and dynamical mass are not correlated with any value of f , we would need to observe additional maser objects or reduce the uncertainties on existing objects.
The major sources of error in our virial product estimates are our understanding of the polarized broad line width and measurements of the intrinsic luminosity. These will be discussed in the following sections. Reducing these uncertainties will be difficult, so we can instead estimate how many additional maser objects would be required to rule out a single value of f . Starting with the observed sample of masers, we assume that there is no relationship between virial product and dynamical mass. We generate a random value of each from a uniform distribution over our sample range. Using these two values, we calculate f . We perform the K-S test again with this additional object included in the maser sample, and continue adding objects until p < 0.05. We repeat the test 1000 times and fit the resulting distribution with a Poisson function to statistically determine how many additional objects are required. For a fixed f of 0.3, we find that approximately six additional maser objects would be required to rule out a correlation.

Caveats from intrinsic luminosity
The requirement of converting a hard X-ray luminosity in a heavily obscured AGN to optical continuum measurement introduces several sources of error. As discussed in Section 4.3, there is significant uncertainty in the bolometric corrections. The X-ray correction has an intrinsic scatter of 0.37 dex, while the optical has a scatter of 0.26 dex (Duras et al. 2020). Future surveys that attempt to measure virial products across all types of AGN could potentially reduce the combined uncertainty on the luminosity through the consistent calibration to a wavelength region that is relatively insensitive to dust and gas obscuration, such as using high spatial resolution imaging in the mid-infrared.

Caveats in using polarized line widths
We have presented comparisons between dynamical mass and virial products using broad-line widths measured from polarized light. We must address whether the polarized line widths may either under or overestimate the true velocity distribution.
In megamaser galaxies, the disk must be nearly edgeon to observe masing. When jets are seen, they align with the angular momentum of the disk, suggesting the masers and inner disk are aligned (Kamali et al. 2019). With this geometry we would expect polar scattering, where scatterers are located above and below the disk, to dominate. If this were the case, the polarization angle is predicted to be perpendicular to the radio jet (e.g. Smith et al. 2002). For the objects in our sample which are observed to have a jet, we do find these two angles to be approximately perpendicular. We can make this comparison for eight objects from the maser sample, see Table 5. Polar scattering would produce narrower polarized lines compared to the full distribution of velocities in the BLR (e.g. Smith et al. 2002Smith et al. , 2004. Therefore, the geometry of the object may lead to narrowing of the Note-Polarization angles measured in the continuum and radio jet position angles for objects in our maser sample with an observed radio jet. Angles are measured East of North. We find the these angles to be roughly perpendicular, with an unweighted, average separation of ∼ 70 ± 20 • . References: (1)  polarized broad lines. We expect the order of this effect to be comparable to the inclination effects for the BLR sample.
In addition to line narrowing due to the scattering geometry in these objects, we also expect some thermal broadening due to the nature of the scatterers themselves. Studies of NGC 1068 have shown evidence for scattering by both dust and electrons with the dominant scatterer varying by region (e.g. Miller et al. 1991). By comparing the observed polarized line width between dust scattering regions, which are not expected to cause thermal broadening, and electron scattering areas, which do produce broadening, the effect of thermal broadening can be estimated. NGC 1068 was found to have thermal broadening of approximately 3360 km s −1 with a corresponding electron temperature of ∼ 10 5 K (Miller et al. 1991).
The thermal broadening in NGC 1068 likely represents an extreme case. First, the broadening of ∼ 3400 km s −1 is larger than the total line width of the majority of objects in our sample. Additionally, regions dominated by dust or electron scattering could be observed independently in NGC 1068, while we mostly likely see a mix of both scatterers when observing the objects in our sam-ple. We can consider NGC 4388 for which we have both direct-light and polarized broad line measurements with FWHMs of 3900 km s −1 (Ho et al. 1997) and 4500±1400 km s −1 (Ramos Almeida et al. 2016) respectively. If we take the difference between these values to be solely due to thermal broadening, we find a broadening of ∼ 2200 km s −1 with a corresponding electron temperature of 3 * 10 4 K. This is significantly lower than the broadening in NGC 1068.
Although we do not expect all of our objects to have as much thermal broadening as NGC 1068, we may consider what effect more moderate broadening would have on our results. In general, our virial masses overestimate the dynamical masses of the BHs if a typical value of f = 1.4 is assumed. It is possible that thermal broadening is responsible for an increase in the observed FWHM leading to this overestimation. Therefore, we determine the value of thermal broadening that must be subtracted in quadrature from the FWHM of each object such that the virial and dynamical masses agree for f = 1.4. For this test, we exclude NGC 4258 which would require the FWHM to be narrower to agree. We find our sample to require an average thermal broadening of ∆v ≈ 3000±1000 km s −1 to match the dynamical masses using the accepted value of the virial parameter. The temperature of the scattering electrons that would produce this broadening can be estimated with T = m e ∆v 2 /16k B ln(2) (Miller et al. 1991). We find the corresponding electron temperatures to be between ∼ 10 4 − 10 5 K.
It is possible that electron scattering could cause thermal broadening of this level in almost all objects in our sample. However, if we remove this estimated broadening from our values of FWHM we find the resulting widths to be systematically smaller when compared to the broad line widths in the RM sample, which are not affected by thermal broadening. It is unlikely for the maser sample to have inherently smaller values of FWHM compared to the RM objects since they live in similar host galaxies. Additionally, any geometrical effects leading to line narrowing should have similar magnitudes across both samples. Therefore, although there may be some thermal broadening in our sample, it is likely to be less than ∆v ≈ 3000 km s −1 .
We consider the effects more moderate thermal broadening would have on our conclusions about the likely value of f . To do so, we recreate the results described in Section 5.2 after subtracting 2000 km s −1 of thermal broadening from the FWHM of each maser object. Although this represents less thermal broadening than in NGC 1068, or the value required to match f = 1.4, it allows for objects with a value of FWHM less than 2000 km s −1 to be included in our test. Additionally, this amount of thermal broadening does not cause the maser FWHM values to be systematically smaller than the RM sample values. After subtracting this thermal broadening, we reproduce the K-S test to determine if our objects could correspond to a correlation between virial product and dynamical mass. These results are shown as the dashed lines in Figure 9. If 2000 km s −1 thermal broadening were present in each maser object we can no longer completely reject the range of f values given by Woo et al. (2015), although we still find the value of f = 1.4 given by Onken et al. (2004) to be unlikely. Therefore, it is possible that thermal broadening is the source of disagreement between the dynamical masses and our observed virial products. However, the true value of thermal broadening in each individual object is not known and would require more detailed observations to determine.

Implications for the structure of the broad line region and virial masses
There are many reasons why the single-epoch mass estimate may break down, which have been discussed thoroughly in Shen (2013). First, this method relies upon the assumption that the BLR is virialized. In a number of AGN, measurements of different line widths and time lags using RM data show the expected virial scaling (e.g. Peterson & Wandel 1999, 2000Onken & Peterson 2002;Kollatschny 2003). This is not sufficient to confirm the region is virialized, however. For example, radiation pressure would lead to a similar scaling (Krolik 2001). Even if the BLR were not virialized, the measured line widths would not be expected to deviate significantly from the expected virial value (Shen 2013).
Another issue may be the difference in measurement between the radius and width used in the virial product. Krolik (2001) considers the possibility that the weighting over radial distribution used to determine line width could be different than that for radius of the BLR. Therefore, the product of these values would not be an accurate estimate of the enclosed mass.
The value used for width is another area of concern, as either the FWHM or σ line could be used. FWHM is used more commonly because it can be measured more easily than σ line , which often requires modeling (Dalla Bontà et al. 2020 The assumption of a single f value is not necessarily valid either. The standard f value is calculated with the M-σ * relation, which is assumed to hold between AGN and quiescent galaxies in the calibration of f . However, within the dynamical sample there is some evidence for different scaling relations for different galaxy morphologies (e.g. Hu 2008;Greene et al. 2008;Graham 2008;Graham & Li 2009;Hu 2009;Gültekin et al. 2009;Greene et al. 2010;McConnell & Ma 2013). At this point, it is unclear whether the maser galaxies behave differently from spirals without nuclear activity (Greene et al. 2016). Therefore, the choice of objects can produce different values of f (e.g. Shen 2013).
We find f to vary between our observed galaxies, but must consider the limitations of our method for determining f for megamaser galaxies. Due to the orientation of these objects, we can only measure the broad lines in polarized light. As discussed in Section 5.4, this could introduce unaccounted for bias into our calculation. However, when we add the RM sample we see very similar results, suggesting that maser measurements alone are not biased.

SUMMARY AND FUTURE WORK
We have used spectropolarimetric measurements of megamaser galaxies with known dynamical BH masses to determine the accuracy of the single-epoch method.
We do not find strong evidence for a correlation between the virial product and dynamical mass. Additionally, f was found to vary significantly between objects and was not found to correlate with any observable parameters. Although we cannot rule out a correlation between virial product and dynamical mass, we show that this correlation is unlikely for specific values of f previously proposed in the literature. We supplement our sample with RM-modeled objects, and find consistent results.
Further observations would be necessary to calibrate the virial product and reach a better determination of the value of f . Additional spectropolarimetric measurements of megamaser galaxies may also provide stronger evidence for a lack of correlation between virial prod-uct and dynamical mass. Multi-object RM happening now with SDSS, and planned to continue with SDSS-V, will yield new information about BLR (Shen et al. 2014;Homayouni et al. 2020). Additionally, Las Cumbres (Brown et al. 2013), and in the future Rubin Observatory (Bianco et al. 2021;Abell et al. 2009), will provide high-cadence monitoring of AGN that will hopefully produce new insight into the BLR structure. ELTs will in principle measure dynamical BH masses to z > 1 (Gültekin et al. 2019), further expanding the possible comparison sample.
Our understanding of the cosmic evolution of BH mass density and BH-galaxy scaling relations often relies on single-epoch virial masses (e.g. Laor 1998;Wandel et al. 1999;McLure & Jarvis 2002;Vestergaard & Peterson 2006;Kelly & Shen 2013;Volonteri & Reines 2016;Pensabene et al. 2020). If geometry and other unknown factors significantly affect f or the broad line width, then the weak relation between black hole mass and virial product may introduce large unquantified uncertainties in the inference of a mass. Such uncertainties make individual measurements of BHs very challenging, and these measurements should be approached cautiously, particularly in small samples of objects.  Table 6 lists the results of all correlation tests described in Section 5.2. This includes correlations between additional variable pairs, and correlations for the maser sample excluding possible outliers. Table 7 compiles information for all megamaser galaxies included in Table 3