Testing White Dwarf Age Estimates using Wide Double White Dwarf Binaries from Gaia EDR3

White dwarf (WD) stars evolve simply and predictably, making them reliable age indicators. However, self-consistent validation of the methods for determining WD total ages has yet to be widely performed. This work uses 1565 wide (>100 au) WD+WD binaries and 24 new triples containing at least two WDs to test the accuracy and validity of WD total age determinations. For these 1589 wide double-WD binaries and triples, we derive total ages of each WD using photometric data from all-sky surveys, in conjunction with Gaia parallaxes and current hydrogen-atmosphere WD models. Ignoring initial-to-final-mass relations and considering only WD cooling ages, we find that roughly 21-36% of the more massive WDs in a system have a shorter cooling age. Since more massive WDs should be born as more massive main-sequence stars, we attribute this unphysical disagreement as evidence of prior mergers or the presence of an unresolved companion, suggesting that roughly 21-36% of wide WD+WD binaries were once triples. Among the 423 wide WD+WD pairs that pass high-fidelity cuts, we find that 25% total age uncertainties are generally appropriate for WDs with masses>0.63 Msun and temperatures<12,000 K, and provide suggested inflation factors for age uncertainties for higher-mass WDs. Overall, WDs return reliable stellar ages, but we detail cases where total ages are least reliable, especially for WDs<0.63 Msun.


INTRODUCTION
White dwarf (WD) stars are the end stages of all stars with initial masses less than roughly 8−10 M (Doherty et al. 2015) and have been used for decades as stellar chronometers (see review by Fontaine et al. 2001). Since WDs no longer undergo core fusion, their evolution is generally a cooling problem. The development of robust cooling models (e.g., Bergeron et al. 1995) enables WDs to act as precise and accurate age indicators. However, to determine the total age of a WD, the time from the zero-age main sequence (ZAMS) to the current state of the WD is required. The progenitor lifetimes from ZAMS to the WD phase are determined through the use of a semi-empirical initial-final mass relation (IFMR, Catalán et al. 2008, Kalirai et al. 2008, Salaris et al. 2009, Cummings et al. 2018 in conjunction with stellar evolution model grids, such as MESA (Dotter 2016) and PARSEC (Bressan et al. 2012).
The total ages of WDs can be used to date a variety of astronomical objects, including wide binary companions, such as M dwarfs (e.g., Fouesneau et al. 2019;Qiu et al. 2021, Kiman et al. 2021, brown dwarfs (Zhang et al. 2020), and a growing number of planet-host stars (Martin et al. 2021). WDs can also age-date larger populations, such as stars in the disk (e.g., Winget et al. 1987), or even the halo of our Galaxy (Kilic et al. 2019;Torres et al. 2021). WD ages can also be used to study the star formation history of the Milky Way (e.g., Tremblay et al. 2014). An understanding of the limitations and reliability of WD total ages is crucial to the inferences derived from these studies.
The total age derived for a WD is strongly dependent on the IFMR used, especially for the lowest-mass WDs (M < 0.7 M ). However, the low-mass region of the IFMR is poorly sampled because nearby clusters containing at least one WD are not old enough. Attempts have been made to alleviate this problem by using wide binary companions to WDs (Zhao et al. 2012;Barrientos & Chanamé 2021) and the distribution of WDs in the color-magnitude diagram , but this still results in a large range of predicted progenitor masses across different IFMRs. Further, the IFMR may show evidence of a kink where the mass-loss relationship is no longer monotonic (Marigo et al. 2020). The IFMR is likely also sensitive to the metallicity and initial rotation of the progenitor star ). These parameters are not easy to determine, and thus assumptions for their values must be made.
Widely separated double WD binaries (WD+WD binaries) are ideal systems for checking the viability of our existing methods for determining WD total ages. It is safe to assume the two WDs bound to each binary system formed from the same collapsing molecular cloud at nearly the same time with the same chemical composition (White & Ghez 2001;Kraus & Hillenbrand 2009). Wide WD+WD binaries that are sufficiently separated (> 100 au) have not experienced any significant mass transfer events or other interactions between the two stars during their lifetimes. At intermediate separations (1-100 au) where tidal forces, mass transfer, and accretion are not a significant source of interaction, it has been shown that the interaction between a companion and the circumstellar disk can affect the angular momentum evolution of the star (Morgan et al. 2012). In the absence of any interactions between the two stars, any methods for determining the total age of a WD should independently return the same total age for both components of the binary.
Since the launch of the Gaia spacecraft (Gaia Collaboration et al. 2016, Gaia Collaboration et al. 2018a, Gaia Collaboration et al. 2021, the sample of wide WD+WD binary systems has increased by an order of magnitude (Andrews et al. 2015, Tian et al. 2020. This increase in systems presents the opportunity for a large statistical study into the accuracy of the models and methods used to determine the total ages of WDs, which is the focus of this work. In Section 2, we present the sample selection of resolved WD+WD binary systems, including 23 new WD+WD+MS systems and one bound WD+WD+WD system. Section 3 discusses the techniques used to determine their ages. In Section 4, we discuss comparisons of the total ages in each binary. We end with a summary of the findings and results in Section 5.

SAMPLE SELECTION
Obtaining a large sample of wide double WD pairs is crucial for this work. The improved parallaxes and proper motions for more than 1 billion sources all-sky with the release of Gaia EDR3 (Gaia Collaboration et al. 2021) significantly increased the potential yield of searches for wide common-proper-motion pairs. With this larger sample we can delve into the systematic errors present in WD total age determinations.

Binary Sample
For this work, the base of our sample is composed of the 1565 wide WD+WD binaries from El-Badry et al. (2021). Candidate pairs in this work were selected to have projected separations less than a parsec, parallaxes that are consistent within 3σ, and proper motions consistent within 2σ of a bound orbit. This sample is restricted to objects with fractional parallax uncertainties less than 20% for both components, with most systems within 300 pc. The sample covers a wide range of magnitudes, spanning 13 < G < 21 mag, with roughly 68% of the 3130 individual WDs in the sample fainter than G > 19 mag. El-Badry et al. (2021) calculate a chance alignment factor (R chance align) for each candidate binary pair through the use of a shifted Gaia catalog. Conducting a search for binaries in the shifted catalog only produces chance alignments, and thus can be used as a way to determine the likelihood that a binary pair is a chance alignment (see their Sec 3). This parameter can be used to filter out low-fidelity wide binaries.

Triple Sample
To further increase the sample of wide double WDs, we conduct a search for wide, resolved triple star systems containing at least two WDs. In El-Badry et al. (2021), all triple systems are intentionally removed from the sample by the neighboring pairs cut (see their Sec 2.1). In this work, we take the sample of binary pairs from El-Badry et al. (2021) before the cleaning of the sample and remove the neighboring pairs cut, but retain the cut on neighboring sources of N < 30, where a neighboring source is defined, following El-Badry et al. (2021), as a source with (i) projected on-sky separation less than 5 parsecs; (ii) proper motion difference less than 5 km s −1 with a 2σ tolerance; (iii) parallaxes consistent within 2σ.  (2021) used in this work, including 23 of our newly discovered wide WD+WD+MS triples and one WD+WD+WD triple: distribution in galactic coordinates, color-magnitude diagrams for the primaries and secondaries, apparent magnitude distributions, magnitude difference, distances, and projected angular and physical separations. The primaries are defined to be the more massive WD component. A large range of the WD cooling tracks is covered by the sample, which allows us to test WD total ages across a representative sample of WD masses and effective temperatures. Cooling tracks for a 0.6 M and 0.9 M WD model from Bédard et al. (2020) are shown for reference on the color-magnitude diagrams, as well as dots representing WD cooling ages of 0.5, 1, 2, 4, and 6 Gyr for each. The separation distribution of pairs with R chance align > 0.1 is shown in cyan for reference in the bottom right panels. The upturn in the distribution at high separations is caused by these possible chance alignments.
Next, we make a list of systems of various sizes by combining all binaries with at least one common source between them. We iterate this process until each system is independent of each other.
To clean this new sample of possible wide systems, we use the classifier for spurious astrometric solutions provided in Rybizki et al. (2021). We require each component of the triple to have the fidelity v1 parameter greater than 0.5. This assures that each source has a reliable astrometric solution. We are only concerned with finding systems with three components in this extended list and throw out any systems with a size not equal to three. (Repeating this exercise for N = 4, 5, . . ., we do not find any systems with a size greater than 3 in this extended list of wide systems with at least two WDs.) With these cuts, we identify 23 new WD+WD+MS triples and one WD+WD+WD triple. This list includes the first resolved triple WD system found by Perpinyà-Vallès et al. (2019). All WD+WD+MS and WD+WD+WD triple systems are detailed in Table A2. All but three of the triples in our sample are hierarchical, where we have defined a hierarchical triple as one in which the largest separation is greater than five times the smallest separation. An excess of hierarchical triples is expected since the hierarchy improves dynamical stability in the system (Hut & Bahcall 1983).
The total ages on the non-hierarchical systems (WDWDMS EDR3 3681220906103896192, WDWDMS EDR3 6756233901661753472, WDWDMS EDR3 929001808377561216) can provide interesting constraints on the timescales of dissipation of such systems. In both WDWDMS EDR3 3681220906103896192 and WDWDMS EDR3 6756233901661753472, one of the WDs does not have a total age estimate since it has a mass of 0.2M and 0.35M , respectively, but the total ages of the other WDs in these triples are 92 +3 −5 Myr and 2.9 +0.4 −0.2 Gyr, respectively. The weighted mean total age of the WDs in WDWDMS EDR3 929001808377561216 is 3.3 +3.1 −1.3 Gyr. Further information is found in Table A2. Although two of these non-heirarchical wide triples seem too old (> 2.5 Gyr) to still be bound, it has recently been shown that the unstable phase of these triples on average lasts 10 3 − 10 4 crossing times, and can even last as long as 10 6 − 10 7 (Toonen et al. 2021). Assuming the crossing time is of similar magnitude to the outermost orbital period, the two older systems would have a crossing time of ∼ 10 4 years, indicating the older systems have experienced ∼ 10 5 crossing times. Of the 21 systems that are hierarchical, 19 systems have an inner pair of two WDs, although this is likely an observational bias caused by the high luminosity ratio of a MS star to a WD.
Our search is sensitive to wide triple WD systems, but we do not find any new wide WD+WD+WD systems.
We do find a possible candidate for a new WD+WD+WD system (Gaia EDR3 source  IDs: 261249666477351552, 261249670772776832, 261249636413169920) but one component has a fidelity v1 parameter from Rybizki et al. (2021) of 0.01. Still, all three WDs have significant parallaxes and proper motions that are in better than 2σ agreement. With all components G > 20 mag, this potential new WD+WD+WD is significantly fainter than the first published triple WD system. This limits the parallaxes of each WD to 10%-15% uncertainties. The system lies somewhere between 100 − 130 pc, with the weighted mean distance being 118±8 pc. This possible WD triple has a similar hierarchy as the first such identified system in Perpinyà-Vallès et al. (2019); the system has an inner pair separated by ≈170 au and an outer component separated by ≈6500 au. (The first triple WD system has an inner separation of ≈300 au and outer separation of ≈6400 au.) Follow-up analysis is needed to confirm the true nature of this possible WD+WD+WD system.

Full Sample
Various parameter distributions of the full sample, including a color-magnitude diagram (CMD) of all wide double WDs, is shown in Figure 1. Our full sample of wide WD+WD pairs spans a large range of the WD cooling sequence, covering many masses and effective temperatures. The sample also spans many different evolutionary stages. In fact, we expect roughly 430 WDs (13%) to be mostly crystallized, based on the 80% crystallization boundary from Tremblay et al. (2019). However, many of our targets are quite faint; the majority of sources are fainter than G > 19.5 mag (see Figure 1). The separation distribution shows an excess of high separation pairs. These pairs are likely chance alignments and can be removed with a cut on R chance align from El-Badry et al. (2021), as discussed in Section 4.
In preparation for the eventual total age determination, we gather additional photometric data on each WD in order to sample a larger range of wavelengths than just the three broad photometric bands in Gaia EDR3. We use the crossmatches provided by the Gaia team for EDR3 (Marrese et al. 2021) to get photometric data from the Sloan Digital Sky Survey (SDSS), the Panoramic Survey Telescope and Rapid Response System (PanSTARRS), the SkyMapper Southern Survey, and the Two Micron All Sky Survey (2MASS). We find that 32%, 70%, 23%, and 4% of the WDs in our sample have photometry in SDSS, PanSTARRS, Skymapper, and 2MASS, respectively.

TOTAL AGE DETERMINATION
WD total ages have two necessary components that need to be determined: the time the star has spent as a WD (its cooling age) and its progenitor main-sequence and giant branch lifetimes. To calculate the cooling age of a WD, atmospheric parameters of the WD must be determined, which can be derived using either spectroscopic or photometric data in conjunction with model atmospheres. Fundamentally, the WD surface gravity and temperature must be determined.
Spectroscopic observations are available in the literature for roughly 300 of the WDs in our sample, so we lack spectroscopy for the vast majority of our 3179 objects. Therefore, we resort to conducting spectral energy distribution (SED) fitting of the average fluxes in various photometric bands from all sky surveys. In practice, SED fitting is sensitive to the radius and temperature of a WD. This radius can be converted into a surface gravity through the use of the mass-radius relationship for WDs (Bédard et al. 2020).
Once a surface gravity and temperature are determined, an estimate of the mass of the WD can be calculated using the same mass-radius relationship. This estimate of the mass, with the use of an IFMR and stellar evolutionary grids, can be used to generate an estimate of the lifetime of the ZAMS progenitor. We discuss this process in more detail below.

WD Atmospheric Parameters
To determine masses and effective temperatures of the WDs in our sample, we use available photometry from Gaia EDR3 and the all-sky surveys discussed in Section 2.3, in conjunction with a fitting technique similar to the photometric technique described in Bergeron et al. (2019).
Reddening effects from interstellar gas and dust can be important for the WDs in our sample (the majority are at a distance beyond 150 pc). Thus, the observed magnitudes in each band are dereddened following the process outlined in Gentile Fusillo et al. 2019, which we summarize here. We use reddening maps from Schlafly & Finkbeiner (2011) to determine the extinction coefficient in the Johnson V band, A V . Then, the A V values are converted to extinction coefficients in the other bands using the appropriate R values. We assume the material is concentrated along the galactic plane and has a scale height of 200 pc. When parameterised this way, the dereddened magnitude in a given band x is determined by, where M x,obs is the observed magnitude, A x is the extinction coefficient, b is the galactic latitude, and ω is the parallax in arcseconds. A more sophisticated reddening determination can be used similar to what is done in Gentile Fusillo et al. (2021), but in comparisons with atmospheric parameters derived in their work, we find no systematic shifts and the parameters are in good agreement (as demonstrated below). The observed, dereddened magnitudes are converted to average fluxes using the appropriate zero points for each bandpass. We apply the G-band magnitude correction for all sources with a six-parameter astrometric solution provided in Gaia Collaboration et al. (2021). To account for the offset of SDSS magnitudes from the AB magnitude standard, we apply an offset of −0.04 and 0.02 mags in the u band and z band, respectively (Eisenstein et al. 2006). For each survey, we remove photometry that has been flagged for various issues. For SDSS, we remove photometry that has been flagged with EDGE, PEAKCENTER, SATUR, and NOTCHECKED. We remove photometry from PanSTARRS that uses rank detections less than one. For SkyMapper, we only use photometry that has no raised flags. We also enforce a minimum uncertainty in each bandpass of 0.03 mag to account for systematic errors in the conversion of magnitudes to av-erage fluxes. This also gives equal weight to each bandpass and one point with a small error will not dominate the behavior of the fits.
We generate synthetic averages fluxes in a given bandpass using synthetic hydrogen atmosphere WD (spectral type DA) spectra from Koester (2010), convolved with the appropriate filter transmission profiles. The observed and synthetic fluxes are adjusted to a distance of 10 pc using the weighted mean of the parallaxes of the two components of the binary, or the weighted mean of all three components for our triples. To adjust the synthetic fluxes to 10 pc, the radius of the WD must be calculated from a mass-radius relation. We use the WD evolutionary models from Bédard et al. (2020) to convert the temperature and surface gravity of a given model to a radius. The photometric uncertainties are then added in quadrature with the uncertainty on this weighted mean parallax. The use of the weighted parallax in these systems allows for more precise determinations of the atmospheric parameters due to its higher precision. For the sample of triples, the precision of the parallax is six times better on average, because the MS companions are much brighter. This leads to the WDs in these triples having smaller mass uncertainties than what is expected given their brightness (an average improvement of 40% for the brighter WD component and 15% for the fainter WD component).
Due to the lack of spectral information for the majority of our objects, we assume all of our WDs are DAs. Although fitting a non-DA WD with a DA model can introduce systematic mass errors on the order of 10 − 15% (Giammichele et al. 2012), we are forced to choose a model. A DA model is the simplest assumption, considering >65% of WDs in magnitude-limited samples within Gaia are DAs (Kleinman et al. 2013).
We use the python package emcee (Foreman-Mackey et al. 2013), which uses an affine invariant Markov Chain Monte Carlo (Goodman & Weare 2010), to get the posterior range on the temperature and surface gravity for each WD in the sample. We use the 50 th percentile of these posteriors as an estimate of the best fit. We use flat priors on both effective temperature and surface gravities with bounds defined from the limits of the models (3000 − 40,000 K and log g of 6.0 − 9.5 dex) and assume Gaussian errors. Using the WD evolutionary models from Bédard et al. (2020), which assume thick H layers, we convert the surface gravities and temperatures to masses and cooling ages. The lower and upper errors on these parameters are taken as the 16th and 84th percentiles, respectively. To account for any unknown systematics, we set a lower limit on the uncertainty of the atmospheric parameters derived from the   fits of 0.038 dex for surface gravity and 1.2% in temperature (Liebert et al. 2005). An example of a fit can be seen in Figure 2. As a check on our methods, we compare the atmospheric parameters we derive to the parameters from the 100-pc WD sample of Kilic et al. (2020), and to the full Gaia EDR3 WD catalog from Gentile Fusillo et al. (2021), which were also determined from the photomet-ric technique we employ. The comparison to the 100pc WD sample from Kilic et al. (2020) reveals no significant systematic differences between the two works. The atmospheric parameters on average agree within 0.9σ (100 K) and 0.6σ (0.03 dex) for effective temperature and surface gravity, respectively. We observe some small dispersion at higher temperatures, which we expect since our work incorporates interstellar reddening whereas Kilic et al. (2020) does not. Comparing to the Gaia EDR3 WD catalog, we find that the atmospheric parameters on average agree within 0.4σ (300 K) and 0.35σ (0.1 dex) for effective temperature and surface gravity, respectively. The quoted uncertainties we find are significantly smaller than the ones in Gentile Fusillo et al. (2021) due to the use of more photometric points in the SED and the weighted parallax. Although we have smaller uncertainties and use different methods to determine the amount of interstellar reddening, we do not find any systematic shifts in derived WD atmospheric parameters.
We also search for spectral types of our objects that have been published in the literature, finding 305 objects with previously reported spectra. The sources for these spectral types can be found in the catalog of WD pairs described in Table A1.
We supplement these previously published spectral types with 57 spectra taken on the DeVeny Spectrograph mounted on the 4.3-m Lowell Discovery Telescope (LDT, Bida et al. 2014) in Happy Jack, Arizona. Using a 300 line mm −1 grating we obtain a roughly 4.5Å resolution, and rotate the position angle of the slit to capture both WDs simultaneously in our observations. We extracted the spectra using a quick-look pipeline written using the ccdproc and specutils packages from the AstroPy package (Astropy Collaboration et al. 2018), and wavelength calibrated the spectra with a HgArNe lamp. We do not attempt to fit these spectra and only conduct a visual inspection for the presence of hydrogen absorption features, and include their spectral type in the sp type column described in Table A1. Future work will further analyze atmospheric parameters derived from this spectroscopy.
Comparing the derived atmospheric parameters of the previously identified DAs with the parameters derived in this work, we find that the effective temperatures agree well but that the determination of the surface gravities from spectroscopic observations vary from our own values derived from photometric information. We find that the atmospheric parameters of our work and the spectroscopic parameters on average agree within 0.9σ (300 K) and 1.3σ (0.1 dex) for effective temperature and surface gravity, respectively. We also find that the spectroscopic fits find higher surface gravities for the WDs on average and thus hotter temperatures as well. Although the absolute differences are similar to the other comparisons, the quoted uncertainties from the spectroscopic parameters are smaller and result in a stronger disagreement in surface gravity (1.3σ compared to 0.9σ and 0.35σ).

Initial Masses and Progenitor Lifetimes
To get the total ages of the WDs in our sample, we need an estimate of the lifetime of the ZAMS progenitor. The first step towards this is to determine an initial main-sequence mass for each of the WDs in our sample. To get these masses, an initial-final mass relation is needed. In this work, we use the theoretical IFMR from Fields et al. (2016), which was generated by running a suite of solar metallicity stellar evolution models from the Modules for Experiments in Stellar Astrophysics (MESA, Paxton et al. 2011, Paxton et al. 2013, Paxton et al. 2015 starting at the pre-MS phase through the thermally pulsing asymptotic giant branch to the final WD phase. This allows us to remain self-consistent within both our IFMR and our determination of ZAMS progenitor lifetimes using the same open-source stellar evolution code and allows us to use a more theoretically motivated shape for the IFMR. The theoretical IFMR (shown as the dashed line in Figure 3) lies below what is observed for WDs in near solar metallicity clusters, an effect that is still not fully understood, though might involve uncertainties in the amount of convective overshoot that is used in the models (Fields et al. 2016) or uncertainties in nuclear reaction rates that can affect the core mass (e.g., deBoer et al. 2017). To account for this systematic effect, we fit a positive offset to the IFMR using the same WDs in solar metallicity clusters. We find the best-fitting offset to apply to the final WD mass is +0.0636 M . The points of this shifted IFMR can be found in Table B3.
Using this modified IFMR (the solid line in Figure 3), we generate progenitor MS masses for all the WDs in our sample. We convert these MS masses into progenitor lifetimes by interpolating the solar metallicity, zero rotation MESA evolutionary tracks of Dotter (2016) and Choi et al. (2016) and summing the main-sequence and giant branch lifetimes.
The uncertainties on the progenitor main-sequence masses and progenitor lifetimes are determined by taking the 1σ uncertainties on the WD mass, determined from the model-atmosphere fitting described in Section 3.1, which determine an upper-and lower-limit on the MS mass. Then the uncertainty on the progenitor lifetime is quoted as the difference between these 1σ bounds and the determined value. Total ages  Table B3).
are then determined through the sum of the cooling age and the progenitor lifetime, and the uncertainties on total age are calculated by adding the uncertainties on cooling age and progenitor lifetime in quadrature. A table of all double WD pairs with atmospheric parameters and total ages can be accessed at https://sites.bu.edu/buwd/files/2022/05/ Table A1.csv; a description of the table columns and data can be seen in Table A1.
As a check, we compare the progenitor lifetimes from our MESA IFMR to ones derived from the three-piece, cluster-calibrated IFMR from Cummings et al. (2018). We find that the progenitor lifetimes from both IFMRs are within 25% of each other on average. The difference between the total ages is on average 10% when comparing the two IFMRs. Overall, we find that the use of the IFMR from Cummings et al. (2018) does not significantly change the conclusions discussed in subsequent sections.

Precision of Total Ages
The WD mass is the dominant factor for determining the total age of the WDs in our sample, and sub-sequently, the uncertainties in the WD mass strongly influence the degree of precision on the total ages. This is especially true for the lower-mass WDs. For a typical 0.6 M WD (Kepler et al. 2007;Tremblay et al. 2016), the initial progenitor MS mass is around 1.2 M which corresponds to a progenitor lifetime of ≈ 6 Gyr. A 5% decrease in the WD mass results in a ZAMS progenitor mass of 0.88 M and progenitor lifetime around 17 Gyr. Thus, for WDs near this mass range, it is extremely difficult to constrain the total age without very precise WD mass estimates. This problem is compounded by the fact that the lower-mass region of the IFMR is the least constrained. Depending on the IFMR used, initial mass values for a 0.6 M WD can range between 1.2 − 2 M which corresponds to progenitor lifetimes between 1.4 − 6 Gyr. This problem is also exaggerated when considering that for these low-mass WDs, the progenitor lifetimes are a major fraction of the total age.
We stress that low-mass WDs are still useful in providing a lower limit on total ages. A 0.6 M WD with a 5% error on its mass can provide a 1σ lower limit of 1.4 Gyr on the progenitor lifetime, modulo uncertainties on the IFMR used. This, in conjunction with a precise cooling age, can still provide useful information on these WDs and their companions.
Fortunately, the opposite is true for the higher-mass WDs. Higher-mass WDs (> 0.8 M ) come from massive progenitors, which have short ZAMS progenitor lifetimes. Thus, even larger mass uncertainties for highmass WDs do not result in large errors on the progenitor lifetime (see Figure 4). Put another way, high-mass WDs can have large relative uncertainties in their progenitor lifetimes but still have relatively precise total ages. Additionally, the IFMR of these high-mass WDs are better empirically constrained from cluster WDs.

COEVAL ANALYSIS
Before engaging in any analysis on the total ages, we remove systems where we know our methods described in the previous section return unreliable total ages. We apply cuts on the sample to remove all systems with on-sky separation <2 , as well as systems with R chance align < 0.1. We also remove systems with at least one component with T eff < 3200 K, as well as those with an inferred total age > 14 Gyr.
The cut to on-sky separation removes systems with significant blending of the photometry, which confuses the SED fits (154 of the WD pairs have on-sky separations < 2 ). We adopt the cut on the R chance align parameter provided in El-Badry et al. (2021); cutting at 0.1 ensures the remaining systems have a high prob-ability of being bound (> 90%). Although we cut on a < 10% chance alignment probability, we emphasize that the expected contamination fraction from chance alignments is much less than 10%, because most of the pairs in the sample have chance alignment probabilities that are much lower than 10%. From the actual chance alignment probabilities of the pairs in our sample, we estimate that there are 4±2 chance alignments in the sample, corresponding to a contamination fraction of about 0.3%. The wide WD+WD sample contains 1390 high fidelity pairs with R chance align < 0.1 (see Figure 1).
The cut on effective temperature is employed because our model grid stops at 3000 K. Since our fitting routine does not attempt to extrapolate, objects with reported temperatures near the cutoff will have underestimated uncertainties and their temperatures will be artificially hotter than their true temperatures. We also ignore all systems with a total age greater than that of the Universe. Many of these systems come from overluminous sources in the Gaia CMD (see Figure 1). These sources will have low WD masses that, if they are actually so low mass, could not have formed from single-star evolution. At lower temperatures, the observed WD sample does not follow the cooling model trends which may result in these lower unreliable masses (see Figure 1). There are a total of 670 WDs with inferred total ages > 14 Gyr, which results in 548 pairs being removed.
After these cuts, the high-fidelity sample used for analysis contains 1272 WDs in 636 wide double WD pairs. The mean magnitude of this surviving sample is G = 19.3 mag.

Age Comparison
The first test of the total ages made is to check the absolute age difference in each pair. Since we assume the stars in each wide pair in our sample are coeval, we expect them to return identical total ages. This absolute age difference is plotted in Figure 5. The total age difference between each component is typically close to zero until one of the components masses is below a value of ∼ 0.63M . These systems do not provide very precise ages, and it is evident that the systematic uncertainty in these systems is too large to provide statistically significant comparisons.
This leads us to remove any systems that have a WD with a mass less than < 0.63 M from any further analysis. These systems can still be useful to provide lower limits on the total age, as discussed in Section 3.3, but we are not able to assess their reliability due to the large systematics in conjunction with their large uncertainties. After removing these systems, the number of wide WD+WD pairs is reduced to 423, with a total of 846 WDs. The second test of the accuracy of the total ages is to compare the ages of the WDs in each binary in relation to the uncertainties on those total ages. To do this, we explore distributions of age agreement, which we define as where τ 1 and σ τ1 are the total age and uncertainty on the total age of the primary (the most massive component of each binary) and τ 2 and σ τ2 are the total age and uncertainty on the total age of the less massive component. Systems with an age agreement (σ 1−2 ) equal to 5 have individual component ages that disagree at a 5σ level. We show in Figure 6 the distribution of the σ 1−2 disagreement for the 423 high-fidelity pairs where both WDs have M > 0.63M . The distribution exhibits a long tail out to large absolute values and has a notable bias towards negative values. Fewer than half (47%) of the systems have total age agreement within 1σ. The slight tendency towards negative σ 1−2 disagreement values shows that the more massive component of each pair has a younger total age on average compared to its companion. Possible causes of this asymmetry are further explored in Section 4.2. The long tail of the distribution reveals that there is a population of at least 60 systems (14%) with values of the σ 1−2 disagreement greater than five where the systematic errors (∆τ ) are non-negligible compared to the random errors reported (σ τ ).

Ill-Behaved Systems
Substantial uncertainty in the total ages in these WDs comes from the process of converting the WD mass to a ZAMS progenitor lifetime. We remove the most modeldependent aspect in the determination of the WD's progenitor lifetime by simply considering that the IFMR is monotonic and increasing, such that a higher-mass WD comes from a higher-mass progenitor. This assumption may not hold for all WD masses, as some evidence of a kink is seen in the IFMR for a range of WD masses around 0.65 M (Marigo et al. 2020).
Still, for the vast majority of cases, the more massive WD in the binary comes from a more massive mainsequence star, and it's progenitor lifetime should be smaller than the progenitor lifetime of its companion. For a coeval binary pair, the cooling age of the more massive WD must be longer than the cooling age of its companion. Any systems where the more massive WD has a shorter cooling age than its lower mass companion are discrepant before invoking any theoretical IFMR or other process for determining a progenitor lifetime.
Our sample contains many systems where this is the case, as shown by Figure 7. The figure includes systems that pass the temperature, separation, and chance alignment cuts discussed in Section 4, but we do not apply the total age and mass cut described in that section for this comparison. Roughly 43% lie in the negative region of the plot, where the more massive WD has a shorter cooling age. This statistic requires we properly define the more massive WD (see representative uncertainty in Figure 7); 21% of our systems are more than 1σ away from having the more massive component with a shorter cooling age. This is a conservative estimate for the number of these ill-behaved systems because the masses of the two WDs in each pair have a positive correlation. This is due to the fact that the two WDs are effectively at the same distance, and the uncertainty on the distance affects the determined uncertainties on the masses. If in fact the system is closer and therefore brighter, then both WDs would be brighter and the derived masses A significant fraction of objects (43% for the full sample, 35% for the DA+DA systems) sit in the negative region, which implies the more massive component has a shorter cooling age, which is unphysical and likely implies a large fraction of wide WD+WD binaries were once triple systems.
would both be lower, leaving the determination of the more-massive primary unchanged. Thus, somewhere between 21-43% of the wide WD+WD systems have the more massive WD with a shorter cooling age. Given a monotonic and increasing IFMR, these systems will have large σ 1−2 disagreement values. The systems that have negative cooling age differences shown in Figure 7 show no bias towards any mass or temperature range. Removal of the 187 ill-behaved systems that pass the cuts described in Section 4 results in a substantial fraction of systems in the long negative tail of the σ 1−2 disagreement distribution being removed (see Figure 8).
One possible explanation for this discrepancy is that we are assuming all of our WDs have a hydrogen dominated atmosphere. Applying a DA (hydrogendominated atmosphere) model to a non-DA WD can cause systematic errors on the order of 10-15% on the WD mass (Giammichele et al. 2012) which can lead to the wrong determination of the primary. The objects in the sample that are known to have two pure DA WDs still show objects in the negative region where the higher mass WD has a shorter cooling age (see Figure 7). The amount of wide DA+DAs that are ill-behaved is around . The distribution of σ1−2 disagreement for the 423 wide binaries that pass the cuts outlined in Section 4 (black) compared to the distribution of σ1−2 disagreement for the sample of systems where the more massive WD has a shorter cooling age (red). These systems can account for the bias towards negative values including the most discrepant systems.
36% of 74 systems (with 21% being more than 1σ discrepant). Systems where one of the WDs is known from spectroscopy to be a non-DA have 63% out of 73 systems in the negative region (with 52% being more than 1σ discrepant).
Our spectroscopic follow-up using the LDT was designed to specifically target systems in the negative region of Figure 7 and thus biases our results to this region. An alternative way to determine if there is an excess of non-DA systems in the ill-behaved region of Figure 7 without having to worry about the bias introduced by our follow-up observations is to compare relative rates of spectral types within the sample of the ill-behaved objects. There are 101 systems in the negative region and 156 of the 202 WDs have a spectral type determined. 66% of these WDs are DAs. In magnitude limited samples, we expect ≈ 65% of the WDs to be DAs (Kleinman et al. 2013). For the systems that are more than 1σ discrepant, 59% of the WDs with known spectral types are DAs.
Although there is some evidence for an overabundance of non-DAs in the negative region of Figure 7, we conclude it is unlikely that a large population of non-DAs is the full explanation for the number of these ill-behaved systems. To avoid any contamination from non-DAs, we will refer to the values for the DA+DA subsample (21-36%) when discussing the percentage of ill-behaved systems.
Instead, we propose that most of the ill-behaved systems are the descendants of triples that have an inner pair that either experienced a merger that effectively reset the age of the more massive WD or is currently unresolved. Through binary population modelling, Temmink et al. (2020) find that assuming single star evolution for a merger remnant can result in underestimating the age by 3 − 5 times on average, which could manifest as an anomalous cooling age like those in Figure 7. They also find 10−30% of isolated WDs have experienced a merger in their past, usually before the WD phase is reached, which is similar to the percentage of ill-behaved systems in our sample (21 − 36%). While that value is for single WDs once in binaries, it is a valuable number to anchor expectations.
If all the systems in the negative region of Figure 7 are or were previously triples, this implies that roughly 21-36% of the wide WD+WD pairs in our sample started as a triple system and have either undergone a merger or have a close inner pair that is unresolved. This range is consistent with the expected binary and triple fraction (30−35% and 10−20%, respectively) for WD progenitor stars that are 2−4 M on the ZAMS (Moe & Di Stefano 2017). We also find that 50% of the double WD pairs in the triple systems are in the negative region of Figure 7. This would imply that these systems are descended from quadruple systems.
At least one well-studied merger remnant is included in our sample: the massive (> 1.1 M , Külebi et al. 2010), strongly magnetic (up to 800 MG, Burleigh et al. 1999), rapidly rotating (725 s, Vennes et al. 2003) object REJ0317-853 (Ferrario et al. 1997). This WD is in a system with a σ 1−2 disagreement value of −17 and its effective temperature is at least twice that of its lower mass companion. We conclude that the majority of these illbehaved systems are likely merger remnants or unresolved triples. Spectroscopic follow-up of these systems can help to disentangle the subset of these objects that are the result of a merger from the unresolved triples which can place empirical constraints on the triple fraction of WD progenitors.

Dependence on Mass and Temperature
The dependence of the age disagreement on the WD mass and temperature can provide insights into where our models are insufficient. A treatment and correction for these dependencies is crucial if WD ages are to be used on large scales to age various populations.
We explore the dependence of the age disagreement on the WD mass and temperature by splitting the systems Inflation Factor Figure 9. The absolute value of the total age disagreement and the resulting total age error inflation factors as a function of WD mass. Each point represents the median of the absolute value of the total age disagreement of a bin centered at the point. The total age error inflation factors are calculated by dividing this median by the expectation for a folded Gaussian distribution. The black points represent the full sample of 423 systems that survive the cuts described in Section 4. The grey background points represent the median age disagreement for a sub-sample of systems where both WDs have masses contained in each bin. Each bin contains 20% of the sample, which is roughly 170 individual WDs per bin for the black points (the grey points are around 3 − 5 times fewer). The red line is a fit to the black points that is described in Section 4.3 to define an inflation factor; a coarse mapping for this fit can be found in Table 1. into individual WDs, and inspecting the median of the absolute value of the age disagreement of each system in various bins of effective temperature and mass. In this analysis, we do not remove any of the ill-behaved systems discussed in Section 4.2, since it is not usually feasible to separate these possible merger remnants when looking at an isolated WD or WD+MS system in the field.
We find that the age disagreement is effectively constant as a function of temperature, but shows a strong trend as a function of mass. This can be seen in Figure 9. The highest masses (> 0.8M ) have a median |σ 1−2 | disagreement value of ≈ 1.5 while the lowest mass bin (< 0.67M ) has a median value of ≈ 0.6. The uncertainties in the medians shown are found through a bootstrapping method with replacement. To ensure that this trend is the true underlying trend, we calculate the median age disagreement for a sub-sample of systems where the mass of both WDs in the system are in the same bin. This removes cross contamination from systems that have WDs with very different masses. The grey points in Figure 9 show the trend for this sub-sample, and are consistent with the points for the whole sample. This trend in mass can be caused by larger systematic errors in the total ages of massive WDs causing larger relative age differences in binaries containing them, or by an underestimation of the total age uncertainties of high mass WDs. We find that the relative age difference is essentially constant (≈ 25%) as a function of WD mass. This leads us to conclude that the uncertainties on the total ages of the high mass WDs in the sample are underestimated.
To determine the amount of inflation of the total age errors needed, we take the median age disagreement and divide by the expected median value of a folded Gaussian distribution with a standard deviation of one. These values can be seen as a second y-axis in Figure 9. Next, we fit these points with a sigmoid function, defined as, where M is the mass of the WD and a, b, and c are all constants. We fit this function to the inflation factor points shown in Figure 9. The best fit parameters are a = 2.2, b = 0.68, and c = 22. The resultant fit can be seen in Figure 9 as the red dashed line, and select points are given in Table 1. The total age error inflation factor for the lowest mass bin is 0.85, but we do not recommend deflating any total age errors. The uncertainty makes the point consistent with a total age error inflation factor of 1.0. Thus, we recommend using the fit function only for WDs with masses greater than 0.67M . To further understand where WD ages are the most reliable, we split the parameter space into 25 bins in temperature and mass. For each bin, we check the median total age, absolute age difference, and total age error inflation factor needed. The results of this are shown in Figure 10. The most problematic areas of parameter space are the higher masses, which is what is seen in Figure 9, and especially at the hotter temperatures where the cooling age contribution to the total age is minimized. These hot, high-mass regions of parameter space are where we would expect the merger remnants to be at their highest contamination rate. This is also where the Q-branch is located, which was first seen in the HR diagram in Gaia DR2 (Gaia Collaboration et al. 2018b;Cheng et al. 2019), as well as the onset of crystallization . The cooling models we use do include crystallization, but not other exotic cooling effects, such as sedimentation of neutron-rich ions (e.g., Bildsten & Hall 2001;Bauer et al. 2020;Blouin et al. 2021).
The typical percent age difference across all of the bins is around 25%, ranging from 60% in the worst case (T eff > 12000 K, 0.8 M < M < 0.9 M ) and 15% in the best case (7550 K < T eff < 9100 K, 0.73 M < M < 0.8 M ). The WDs with the hottest temperatures show larger percent age differences on the order of 35%-60% (0.2-0.5 Gyr) compared to the rest of parameter space. Thus, 25% total age uncertainties are a representative approximation for most WD age estimates on an individual WD for WDs > 0.63 M and temperatures cooler than 12,000 K. This median WD age precision compares well to similar results from an independent analysis of many of the same objects (Qiu et al. 2021). It should be noted that with additional information through spectroscopy we can achieve better age precision (Moss et al. 2022).

CONCLUSIONS
In this work, we constructed a sample of 1565 wide WD+WD binaries as well as 24 mostly new wide triple WD+WD+X systems to test the accuracy and precision of WD total age determinations when no spectral type information is known. We measured effective temperatures, surface gravities, and masses for all WDs in our sample by using hydrogen-dominated atmosphere models and fitting photometry from a variety of all-sky surveys. Using these atmospheric parameters, we determined the total ages of each WD through the use of WD cooling models, a theoretically motivated and observationally calibrated initial-to-final-mass relation, and stellar evolution model grids.
Using a high-fidelity sample of wide WD+WD pairs with uncontaminated photometry described in Section 4, the main conclusions of this work can be summarized as follows: · We find 24 mostly new widely separated triples with at least two WDs, and possibly the second ever resolved triple-WD system. We find 21 of the 24 triples have a hierarchical structure. The three triples not in a hierarchical structure will likely dissipate due to dynamical instability, but the ages of the WDs in these systems can provide useful lower limits on the dissipation timescale. We find that two of the non-heirarchical triple systems are ≈ 3 Gyr old, while the other is quite young at ≈ 90 Myr old.
· We find that the total ages of the lowest mass WDs in the sample (< 0.63M ) are too uncertain to be used in the age comparisons conducted in this work. The absolute age difference of systems containing at least one lower-mass WD quickly diverges towards high values of a few Gyr. A lower limit on the total age can still be determined for these WDs, but we do not attempt to test the absolute reliability of their total ages. This is unfortunate, since the mean mass of field WDs is roughly 0.63 M (Tremblay et al. 2016;Kepler et al. 2016Kepler et al. , 2019. · When comparing the total ages of each component of the WD pairs, we find a significant fraction of systems (21 − 43%) where the more massive WD has a shorter cooling age. We find that this trend holds even for a subsample of DA+DA systems, with 21-36% having the more massive WD with a shorter cooling age. This is the opposite of what is expected for a monotonic and increasing initial-tofinal-mass relation. We attribute this discrepancy to the presence of many merger remnants or unresolved triples in the sample, such that roughly 21−36% of our sample of currently wide WD+WD binaries started as bound triple systems. These WDs have large age disagreement values and can cause problems when determining ages for field WDs, but unresolved companions or a merger history is a problem for many techniques for determining stellar ages. · We find that the level of age disagreement strongly depends on the mass of the WD. Wide binaries containing at least one higher-mass WD (>0.8 M ) generally have a lower level of agreement compared to those with lower-mass WDs, reflecting the fact that the formal age uncertainties are often underestimated for higher-mass WDs. We fit this dependence to determine the amount of error inflation on the total ages needed to improve the agreement. We find that errors on the total ages of the highest-mass WDs (>0.9 M ) need to be inflated, up to a factor of 2.2, when assuming that the WD is a DA. Fortunately, higher-mass WDs also have more precisely determined ages (median total age uncertainties of ≈ 0.2 Gyr), so they remain very useful chronometers. Inflation Factor Figure 10. The inflation factor and median total age and age difference for 25 bins in mass and temperature of roughly similar size. Each box is colored by the inflation factor and the white box shows the median total age (Gyr, upper left), median total age difference (Gyr, upper right), the total age error inflation factor (lower left), and the number of WDs in the bin (lower right). The upper right bin also includes the parameters represented in the white boxes. For example, the 50 WDs in the bottom left bin (4000 − 6350 K, 0.63 − 0.67 M ) have a median total age of 4.6 Gyr, their ages differ from their companions by 1.4 Gyr on average, and their total age uncertainties need to be inflated by a factor of 1.2 for their ages to come into 1σ agreement.
· Looking at 25 bins in mass and effective temperature space in Figure 9, we find that total ages are roughly accurate at the 25% level for WDs with masses >0.63 M using only photometry. The WDs with the hottest temperatures have the largest inaccuracy, with uncertainties of 35%-60%, but these are relatively small absolute errors on the order of 0.3 Gyr.
Due to the large sample size, we assumed that all the WDs in our sample are DA. When a spectral type is known, a more accurate total age can be determined and the recommendations discussed above are not immediately applicable. A large sample of wide double WDs with known spectral types needs to be investigated to determine the accuracy of WD total ages when the spectral type is known.
Future work is planned to quantify how atmospheric parameters from spectroscopy can alleviate the systematic errors mentioned above. Spectroscopic observations do not suffer from the same problems as the photometric methods employed in this work, and should be more sensitive to the WD mass. A spectrum can also provide important clues as to whether a WD has a prior merger history or is part of an unresolved binary (for example, if it is strongly magnetic, or if it is radial-velocity variable). With our follow-up efforts using the LDT, we continue to increase the sample of wide WD+WD pairs with available spectra in hopes of gathering a large sample of wide double WDs with spectroscopically determined atmospheric parameters.
In general, we find that WD total ages agree very well in wide double WDs, but the uncertainties occasionally need to be inflated to compensate for systematics on the total ages, the presence of an unresolved companion, or merger history. Given the massive increase in the number of systems with a wide WD companion thanks to Gaia, we hope this result provides a first road map to providing more reliable stellar ages using these wide binaries. This work has made use of data from the European Space Agency (ESA) mission Gaia (https://www. cosmos.esa.int/gaia), processed by the Gaia Data Processing and Analysis Consortium (DPAC, https://www. cosmos.esa.int/web/gaia/dpac/consortium). Funding for the DPAC has been provided by national institutions, in particular the institutions participating in the Gaia Multilateral Agreement. These results made use of the Lowell Discovery Telescope (LDT) at Lowell Observatory. Lowell is a private, non-profit institution dedicated to astrophysical research and public appreciation of astronomy and operates the LDT in partnership with Boston University, the University of Maryland, the University of Toledo, Northern Arizona University and Yale University. The upgrade of the DeVeny optical spectrograph has been funded by a generous grant from John and Ginger Giovale and by a grant from the Mt. Cuba Astronomical Foundation.

Column Units Description
Note-Columns dealing with properties of the individual WDs will have " 1" or " 2" at the end to denote the primary and secondary, respectively, where the primary has been defined as the more massive WD. The final 26 rows contain the wide double WD pairs from the triples found in this paper. * Spectral types in this work are collected from Sion et al. (1991); Finley & Koester (1997)

B. SHIFTED IFMR
The table used for the initial-to-final-mass relation in Figure 3 is detailed here in Table B3, adapted from Fields et al. (2016). The WD mass is M f and the ZAMS mass is M i .