The 50 Mpc Galaxy Catalog (50MGC): Consistent and Homogeneous Masses, Distances, Colors, and Morphologies

We assemble a catalog of 15424 nearby galaxies within 50 Mpc with consistent and homogenized mass, distance, and morphological type measurements. Our catalog combines galaxies from HyperLeda, the NASA-Sloan Atlas, and the Catalog of Local Volume Galaxies. Distances for the galaxies combine best-estimates for flow-corrected redshift-based distances with redshift independent distances. We also compile magnitude and color information for 11740 galaxies. We use the galaxy colors to estimate masses by creating self-consistent color -- mass-to-light ratio relations in four bands; we also provide color transformations of all colors into Sloan (g-i) by using galaxies with overlapping color information. We compile morphology information for 13744 galaxies, and use galaxy color information to separate early and late-type galaxies. This catalog is widely applicable for studies of nearby galaxies, and placing these studies in the context of more distant galaxies. We present one application here; a preliminary analysis of the nuclear X-ray activity of galaxies. Out of 1506 galaxies within the sample that have available Chandra X-ray observations, we find 291 have detected nuclear sources. Of the 291 existing Chandra detections, 249 have log(L$_{X}$)$>$38.3 and available stellar mass estimates. We find that the X-ray active fractions in early-type galaxies are higher than in late-type galaxies, especially for galaxy stellar masses between 10$^9$ and 10$^{10.5}$ M$_\odot$. We show that these differences may be due at least in part to the increased astrometric uncertainties in late-type galaxies relative to early-types.


INTRODUCTION
The galaxies in the nearby Universe provide our best laboratory for studying a wide range of scientific questions. Most fundamentally, the properties of nearby galaxies are the outcome of the process of galaxy evolution over the lifetime of our Universe. We can use our detailed view of nearby galaxies to quantify their morphology, kinematics, and star formation histories.
Our most complete census of galaxies comes from local measurements. Only within the Local Group can the faintest galaxies be found (e.g. Willman et al. 2005; McConnachie 2012; Simon 2019), and even with modern surveys like the Sloan Digital Sky Survey (SDSS), we can get a census of bright dwarf galaxies with masses (M$_\star$) ∼10$^9$ M$_\odot$ only out to ∼50 Mpc (e.g. Blanton et al. 2005). The detailed structures within nearby galaxies can reveal many aspects of their history and evolution, including their faint stellar halos (e.g. Crnojević et al. 2016), star cluster populations (e.g. Krumholz et al. 2019; Hughes et al. 2021), dwarf galaxy satellites (e.g. Carlsten et al. 2022), central stellar structures (e.g. Erwin et al. 2015; Neumayer et al. 2020), and their supermassive black holes (e.g. Saglia et al. 2016). Nearby galaxies also provide a unique laboratory for stellar and high-energy astrophysics, such as quantifying rare stages of stellar evolution that aren't represented in the Milky Way (e.g. Melbourne et al. 2012), finding and characterizing rare sources like ultraluminous X-ray sources (Kaaret et al. 2017), and transient events including gravitational wave events (Abbott et al. 2017) and rare types of supernovae (e.g. Tartaglia et al. 2018). Finally, nearby galaxies provide a baseline for distance measurements in the Universe, and measurements of their distances are key for understanding cosmology (e.g. Freedman 2021).
A number of existing catalogs of nearby galaxies are regularly used in selecting sources for these science explorations. The Karachentsev et al. (2013) Local Volume Galaxy catalog$^{1}$ is a regularly updated and very complete catalog of the nearest galaxies with distance <11 Mpc and includes distance and photometric information. Beyond this distance, the most widely used sources of galaxy information are HyperLeda$^{2}$ (Makarov et al. 2014) and the NASA Extragalactic Database (NED)$^{3}$. In addition, many studies make use of catalogs of the nearest rich cluster of galaxies, Virgo (Binggeli et al. 1985; Kim et al. 2014). Distance measurements to nearby galaxies have been a subject of intense study, and much of this information for nearby galaxies has been compiled within a subsection of NED (Steer et al. 2017). All of these works combine heterogeneous photometric, distance, and morphological measurements, and make some attempt to synthesize and homogenize these. Modern studies of galaxies typically characterize galaxies based on their stellar masses and colors (e.g. Baldry et al. 2004). Unfortunately, there are no uniform and homogeneous estimates of these quantities for nearby galaxies, and the varying photometric data as well as the heterogeneous distance estimates available for these galaxies complicate their comparison to more distant galaxy samples.
The goal of this paper is to provide the first homogenized source of nearby galaxy masses, distances, colors, and morphologies. We have chosen a distance limit of ∼50 Mpc for this catalog, as many of the existing studies of nearby galaxies are limited to galaxies within this distance limit. We find in §4 that these galaxies have stellar masses that range from log(M$_\star$/M$_\odot$) of 6.6 to 10.9 (at the 99th percentile). We also present one application that motivated the creation of this catalog: a comparison of nuclear X-ray activity between early- and late-type galaxies. These results are the first step to determining constraints on the occupation fraction of central massive black holes in nearby galaxies (Miller et al. 2015; Gallo & Sesana 2019), which will be presented in a follow-up paper (Gallo et al., in prep).
This paper is organized as follows: we describe our sample selection and photometry in §2. We discuss distance estimates to the sample galaxies in §3, determine their masses based on consistent color-mass-to-light ratio measurements in §4, and describe their morphologies and group membership in §5. Finally, in §6, we apply the catalog by combining it with Chandra archival data to present our X-ray active fraction measurement before concluding. The full galaxy catalog is in the Appendix in Table 2. This table is available through the publisher and (potentially updated) at https://github.com/davidohlson/50MGC. Throughout the paper, when we refer to column names without other clarifying information, we refer to the full catalog; columns are indicated through their font, i.e. column font format.

SAMPLE SELECTION & PHOTOMETRY
In this section, we detail the construction of our sample from multiple sources as well as the compilation of photometry that enables luminosity and mass estimates for each galaxy.

Base Catalogs & Combination
Our catalog is formed from three base sources: the Local Volume Galaxy (LVG) catalog (Karachentsev et al. 2013), HyperLeda (Makarov et al. 2014), and the NASA-Sloan Atlas (NSA)$^{4}$. All three contain a wide variety of information including redshifts and photometry. The strength of the NSA catalog is its uniform Sloan Digital Sky Survey photometry extending to quite faint sources; however, this photometry only extends over ∼1/3 of the sky. The LVG and HyperLeda catalogs combine heterogeneous data sets and are continuously maintained and updated. The LVG catalog contains nearly complete measurements of galaxies brighter than $M_B$ ∼ −12 within 10 Mpc; however, it lacks optical color information for the galaxies. Both NSA and HyperLeda extend to greater distances and have optical color information. Combined, these catalogs can provide the data needed to obtain distances, masses, and Hubble types for galaxies out to a maximum target distance of 50 Mpc.
From HyperLeda, we initially began with a subsample of 18726 objects designated as galaxies and with radial velocity < 3500 km s$^{-1}$. We chose to remove the ∼20% of objects which were missing absolute B-band magnitude values in column mabs from our sample, because HyperLeda's calculation of absolute magnitude relied upon apparent B-band magnitude, distance modulus, and extinction. Removal of these objects from the sample ensured the inclusion of the most essential galaxy measurements. This cut the initial HyperLeda sample down to 14941 objects.
For the LVG sub-sample, we combined the "Catalog of Nearby Galaxies" and "Global Parameters of the Nearby Galaxies" tables available from the LVG catalog website$^{5}$. The combined tables totalled 1240 galaxies$^{6}$.
Finally, for the NSA data, we used the nsa_v1_0_1 catalog and found 8242 objects with cz < 3500 km s$^{-1}$. We also added available SDSS spectral line flux data to these objects, matched by plate and fiber IDs. We note that the NSA catalog includes not just objects with SDSS redshift information, but also galaxies with redshifts measured from other sources (ALFALFA, NED, 6dF, 2dF, and ZCAT).

Combining the Sample
The three subsamples contain both duplicate objects and contaminants. Here, we describe how we combined the catalogs while eliminating duplicates and contaminants. We note that the presence of a galaxy in each input catalog is given by flags in our catalog: hl_obj, lvg_obj, and nsa_obj. If a galaxy is present in a given catalog the corresponding flag has a value of 1; if it is not, the flag has a value of 0.
The HyperLeda sample included some duplicate galaxy entries; to remove these we crossmatched each entry to its nearest neighbor within the sample. We followed up on galaxies with nearby neighbors within a 2′ separation radius for galaxies within 20 Mpc, and a 1′ separation for galaxies between 20 and 50 Mpc. For the 580 potential duplicate galaxies we examined available data and flagged them as follows. 0: individual galaxy with the best available data; 1: duplicated galaxy with less available data; 2: off-nuclear position in duplicate galaxy; 3: image shows a bright or star-forming region within a larger galaxy. We include only objects flagged with 0, which removed 183 of the 580 galaxies. This cut resulted in an initial HyperLeda sample of 14758 galaxies.
We first created a combined LVG and HyperLeda sample. We used astropy.coordinates.SkyCoord to conduct a coordinate-based crossmatch with a maximum separation distance of 30″. Of the 1240 galaxies in the LVG, 744 were found in HyperLeda. The 496 unmatched LVG galaxies are almost entirely very low-luminosity. This explains their exclusion, as these galaxies are likely to lack the photometric and redshift measurements required to have been included in our initial HyperLeda sample. When combining sources between these catalogs, we prioritize the LVG galaxy properties in almost all cases, i.e. radial velocities, B-band magnitudes, and distances (see §3).
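The nearest-neighbor crossmatch with a 30″ limit can be sketched as follows. This is a minimal, brute-force illustration in plain Python and not the code used to build the catalog, which relied on astropy.coordinates.SkyCoord; the function names `angular_sep_arcsec` and `crossmatch` and the example coordinates are hypothetical.

```python
import math

def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcsec (inputs in degrees), haversine form."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(min(1.0, h)))) * 3600.0

def crossmatch(catalog_a, catalog_b, max_sep_arcsec=30.0):
    """For each (ra, dec) in catalog_a, return the index of the nearest
    entry in catalog_b, or None if that entry lies beyond max_sep_arcsec."""
    matches = []
    for ra, dec in catalog_a:
        best_idx, best_sep = None, float("inf")
        for j, (rb, db) in enumerate(catalog_b):
            sep = angular_sep_arcsec(ra, dec, rb, db)
            if sep < best_sep:
                best_idx, best_sep = j, sep
        matches.append(best_idx if best_sep <= max_sep_arcsec else None)
    return matches

# Made-up positions: the first source has a neighbor 18 arcsec away,
# the second has no counterpart.
lvg_like = [(10.0, 20.0), (50.0, -30.0)]
hyperleda_like = [(10.0, 20.005), (120.0, 5.0)]
matched = crossmatch(lvg_like, hyperleda_like)
```

For catalogs of this size, a tree-based matcher such as the one astropy provides would be far faster; the quadratic loop is kept here only for readability.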
We then crossmatched NSA catalog objects to our combined HyperLeda and LVG samples using a maximum separation distance of 30″. Of the 8242 NSA objects, 6966 matched the combined HyperLeda & LVG catalog. For the 1276 unmatched objects, we found that many were contaminants. To distinguish these contaminants from real galaxies, we visually inspected each galaxy's SDSS imaging and compared redshifts to those listed in the NASA Extragalactic Database (NED). At the lowest redshifts, the contaminants included foreground stars on top of image artifacts or distant galaxies. At larger distances, bright portions of galaxies with separate spectroscopic measurements from SDSS were often included as separate catalog entries. Some contaminants were also just image artifacts, and a small number contained no obvious image source whatsoever. We flagged each of the 1276 unmatched NSA objects based on our findings. 1: image shows a bright or star-forming region within a larger galaxy. 2: image shows no object, an imaging artifact, a star, or a foreground star over a high redshift galaxy. 3: duplicates of otherwise good galaxies, of which we flagged the entry with the most reliable redshift source to keep, in order: SDSS, NED, ZCAT. 4: good imaging of a galaxy with redshift matching NED. These quality flags are given for all 1276 galaxies in Table 1. Only objects in category 4 are considered as possible unique new NSA galaxies to add to the catalog.
Finally, to minimize the potential for duplicate galaxies, we wanted to ensure none of the unique NSA objects in category 4 were observations of the edge of a galaxy already included in our sample. To do this, we first needed uniform angular diameter measurements for each sample galaxy.
HyperLeda gives galaxy diameter measurements as the major axis using a standard 25 mag arcsec$^{-2}$ isophote (d25), whereas LVG uses a diameter measurement corresponding to the Holmberg isophote (∼26.5 mag arcsec$^{-2}$; which we call d26). We compared d25 HyperLeda diameters to d26 LVG diameters for galaxies found in both source catalogs to find the typical ratio between these, and then multiplied the LVG diameters by ∼0.838 to approximate d25 HyperLeda values for all the nearest neighbor galaxies.
For each NSA object, we then divided its angular separation from its nearest neighbor by that neighbor's angular diameter to create a "match separation ratio", δ$_{sep}$/d25. Of the galaxies confirmed to include in our sample, we only included those with a ratio δ$_{sep}$/d25 ≥ 0.75 (i.e. lying beyond 1.5 galaxy radii). Of the 208 objects in category 4, 170 galaxies passed this cut and were added to our final sample. These galaxies have a median distance of 29.9 Mpc and a median mass of 8.08 × 10$^7$ M$_\odot$ (log(M) ∼8), and thus add significantly to our sample.
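The diameter homogenization and the separation-ratio cut can be written compactly, as in the sketch below. It assumes angular separations and diameters are in the same units (e.g. arcsec); the function names and the `diam_is_d26` switch are hypothetical and are not catalog fields.

```python
D26_TO_D25 = 0.838  # approximate LVG d26 -> HyperLeda d25 scaling quoted in the text

def match_separation_ratio(sep, neighbor_diam, diam_is_d26=False):
    """Separation from the nearest neighbor divided by that neighbor's d25.
    If the diameter is an LVG Holmberg (d26) value, rescale it to d25 first."""
    d25 = neighbor_diam * (D26_TO_D25 if diam_is_d26 else 1.0)
    return sep / d25

def is_distinct_galaxy(sep, neighbor_diam, diam_is_d26=False, min_ratio=0.75):
    """True if the object lies at or beyond 0.75 diameters (1.5 radii) of its neighbor."""
    return match_separation_ratio(sep, neighbor_diam, diam_is_d26) >= min_ratio

# An object 60 arcsec from a neighbor with d25 = 100 arcsec is rejected,
# while one 90 arcsec away is kept.
keep_far = is_distinct_galaxy(90.0, 100.0)
keep_near = is_distinct_galaxy(60.0, 100.0)
```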
We note that many studies (e.g. Geha et al. 2012; Reines et al. 2013) have used the NSA catalog as a sample of nearby galaxies without removing contaminants. We find that ∼13% of nearby NSA galaxies (1106/8242) are contaminants. Most contaminants show a low mass estimate from NSA redshift distances and photometry, with a median mass of 2.6 × 10$^7$ M$_\odot$. We therefore include the full list of NSA objects that don't match HyperLeda & LVG and the criteria we used to cut these down in Table 1 for future users.

Galaxy Coordinates
We want the coordinates in our catalog to be centered as accurately as possible on galaxies' nuclear regions for the purpose of AGN selection and observation. Our catalog combines multiple subsamples, each with slightly different astrometric methods and sources. HyperLeda applies a homogenized position computed by averaging published measurements weighted by an assigned quality flag. NSA provides coordinates of the measured object's photometric center in the SDSS. The source of LVG coordinates was not given. Each time we combined subsamples with different astrometric sources, we compared the coordinates for galaxies in both. We visually inspected 150 galaxies with overlap between HyperLeda and either LVG (739 galaxies) or NSA (7136 galaxies). We randomly selected these galaxies to have a range of offsets between 1″ and 10″. We compared the galaxy positions to imaging with SDSS and 2MASS, both of which have good astrometry (Pier et al. 2003; Skrutskie et al. 2006). This visual inspection suggests that in general the HyperLeda positions are better aligned with the photocenter than the LVG and NSA samples. We therefore use the following priority order for the ra and dec in our catalog: HyperLeda, LVG, and NSA.
In addition to the comparison and selection between position sources from multiple catalog sources, we also compared our selected Right Ascension and Declination values to the primary position provided by NED. We retrieved NED positions using a name-based query. After removing 25 matches with separation >5′ due to misnamed entries in HyperLeda and NED, we reliably matched positions for 14412 catalog galaxies (∼93%). We used the visual inspection process described in the previous paragraph to determine whether NED coordinates are noticeably preferable to our catalog position. We found our positions were generally better than those from NED, with a larger spread between our catalog positions and NED than between HyperLeda, LVG, and NSA. We find 1680 galaxies have a separation > 3″ between our catalog position and the NED position. We investigate the relation between coordinate offsets, mass, and galaxy type in more detail in §6. Because of the variation between positions from different sources, we choose to include columns for NSA and NED positions in the final catalog, in addition to our chosen compilation of coordinates described above.

Color and Luminosity
The luminosities and colors of our galaxies are two crucial parts of our catalog. In addition to being intrinsically useful, the colors are also important for calculating stellar masses (§4) and for assessing the galaxy type (early vs. late; §5). To obtain these estimates, we use four different sources for colors and associated luminosities: (1) HyperLeda, (2) NSA, (3) the Siena Galaxy Atlas (SGA), and (4) the NASA/IPAC Extragalactic Database (NED). All luminosity calculations utilize the best distances described in §3.
Extinction Correction: For our foreground extinction corrections, we uniformly apply the Schlafly & Finkbeiner (2011) corrections; all our sources have Schlegel et al. (1998) extinction estimates available, and we multiply these by 0.86 to convert them to Schlafly & Finkbeiner (2011) values$^{7}$. For HyperLeda, we used B and V magnitudes, and scaled their $A_B$ as described above; then to determine $A_V$ values we assumed $A_V/A_B$ = 0.769 (Cardelli et al. 1989). For NSA, we used the provided Schlegel et al. (1998) extinction values. Finally, for NED, we use B and R band magnitudes from the same source for consistency; these primarily come from the southern ESO Uppsala survey (Lauberts & Valentijn 1989). We use the HyperLeda B-band extinction, and assume $A_R/A_B$ = 0.628 based on the extinction curves of Cardelli et al. (1989) and O'Donnell (1994) taken from the Padova CMD website$^{8}$.
We did not apply any internal extinction corrections; thus the luminosities and colors presented here are only foreground extinction corrected. Internal extinction values are available from both HyperLeda and LVG. However, we found no correlation between these two.
Luminosities & Colors from HyperLeda ($(B-V)_0$, $L_V$, $L_B$): HyperLeda provides two different B − V colors: (1) the total asymptotic corrected color, bvtc, which is corrected for both internal and foreground extinction, and (2) the color measured within the effective aperture in which half the total B-flux is emitted, bve, which is not corrected for extinction. Various HyperLeda galaxies have one or both measurements available. Our first step to obtain consistent values was to undo the HyperLeda extinction corrections on the bvtc column, and apply the Schlafly & Finkbeiner (2011) foreground extinction correction to both the total and effective colors; we call these extinction corrected quantities $(B-V)_{t,0}$ and $(B-V)_{e,0}$ respectively. Because we are using the colors to calculate total masses, we also attempt to correct the small offset between $(B-V)_{t,0}$ and $(B-V)_{e,0}$; the difference between these colors for galaxies that have both values is shown in Fig. 1. The offset is clearly color dependent, and the line in that figure shows the color dependent correction applied to the $(B-V)_{e,0}$ to get our final $(B-V)_0$ values for galaxies that only had bve data available. We flagged these data using the bvcolor_f value in our table.
For calculating luminosities and total magnitudes in the B and V bands, we found that the given bt and vt columns were not consistent with the given colors (after correcting for the different extinction corrections applied); this is likely due to the heterogeneity of data sources used. Given the ubiquity of btc values, we first calculate the B total magnitude by removing the extinction corrections and applying Schlafly & Finkbeiner (2011) foreground extinction corrections to get total $B_{t,0}$ values. We then derived the total V-band magnitude by subtracting the $(B-V)_{t,0}$ value from $B_{t,0}$ to get a foreground extinction corrected V-band magnitude, $V_{t,0}$. We note the use of $B_{t,0}$ values throughout this paper when plotting galaxy luminosities. These are transformed to luminosities using the Vega-based absolute magnitudes of the Sun from http://mips.as.arizona.edu/cnaw/sun.html.
Luminosities & Colors from NSA ($(g-i)_0$, $L_i$, $L_B$): NSA provides Petrosian flux and extinction values for each of the FNugriz bands. We calculate the Pogson g-, r- and i-band magnitudes using the SDSS relation (https://www.sdss.org/dr12/algorithms/magnitudes/): $m = 22.5 - 2.5\log_{10} f$. We then extinction correct all individual magnitudes as described above. For calculating galaxy masses (§4), we use the $g_0$ and $i_0$ magnitudes to calculate an extinction corrected $(g-i)_0$ color. We also calculate an i-band luminosity for the galaxies ($L_i$) using the solar AB magnitude from http://mips.as.arizona.edu/cnaw/sun.html.
To analyze a single luminosity distribution for our entire sample, we also estimate the B-band luminosity $L_B$ from the SDSS magnitudes using the stellar transformation $B_0 = g_0 + 0.39\,(g-r)_0 + 0.21$ from Jester et al. (2005).
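The magnitude conversions described for the NSA photometry amount to three short steps, sketched below under the assumption that fluxes are in nanomaggies and that the input extinction is a Schlegel et al. (1998) value in the same band. The function names are hypothetical; the numerical constants are those quoted in the text.

```python
import math

SFD_TO_SF11 = 0.86  # Schlegel et al. (1998) -> Schlafly & Finkbeiner (2011) scaling

def nanomaggies_to_mag(flux_nmgy):
    """Pogson magnitude from a flux in nanomaggies: m = 22.5 - 2.5 log10(f)."""
    if flux_nmgy <= 0:
        raise ValueError("flux must be positive to define a magnitude")
    return 22.5 - 2.5 * math.log10(flux_nmgy)

def deredden(mag, a_sfd):
    """Apply the rescaled foreground extinction to an observed magnitude."""
    return mag - SFD_TO_SF11 * a_sfd

def jester_b_mag(g0, r0):
    """Jester et al. (2005) stellar transformation: B0 = g0 + 0.39 (g - r)0 + 0.21."""
    return g0 + 0.39 * (g0 - r0) + 0.21

# Example with made-up fluxes and extinctions (not catalog values).
g0 = deredden(nanomaggies_to_mag(100.0), 0.10)
r0 = deredden(nanomaggies_to_mag(150.0), 0.07)
b0 = jester_b_mag(g0, r0)
```

Non-positive fluxes raise an error here rather than returning a magnitude, which mirrors the exclusion of objects with g or i Petrosian flux ≤ 0 nanomaggies.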
Luminosities & Colors from SGA ($(g-r)_0$, $L_r$, $L_B$): A complementary source of photometry for primarily southern galaxies is the Siena Galaxy Atlas (SGA$^{10}$). We matched our catalog to the SGA catalog, after removing sources with known redshifts above 3500 km s$^{-1}$, using a maximum separation distance of 30″. In cases of multiple galaxies within this threshold, we matched the galaxy with the smallest angular separation. This gave us $(g-r)_0$ data for 7681 galaxies, 1616 of which otherwise lacked color.
The SGA catalog contains a number of different magnitude measurements. To ensure consistency, we compared overlapping sources with NSA and found [G,R,Z]_MAG_SB26 best matched the NSA magnitudes. We then calculated reddenings and extinction using the IRSA online portal$^{11}$ to get Schlafly & Finkbeiner (2011) extinction values. We corrected for these values to calculate the $(g-r)_0$ color and $L_r$, and to estimate the B-band luminosity $L_B$ using the same Jester et al. (2005) equation applied to NSA photometry.
To expand our mass estimations for southern galaxies, we used astroquery to request NED photometry for each sample object. We added available B and R data to the sample, specifically objects with an observed passband of B (Cousins) (B_25) and R (Cousins) (R_25). If multiple magnitude measurements were available for a single object, we included a mean value. Queries returned a uniform B and R magnitude error of 0.09 for all objects. We used these B and R band magnitudes, corrected for foreground extinction, to calculate the $(B-R)_0$ color, $L_R$, and $L_B$ where available.
Combined Color and Luminosity Estimates: In Section 5 we describe color transformations between $(g-i)_0$ and other colors. We use these to derive an estimate of the $(g-i)_0$ color for all galaxies with colors. This is contained in column gi_color. For galaxies with multiple available colors, we give precedence to source colors in the following order: NSA $(g-i)_0$, and then synthesized $(g-i)_0$ values based on $(g-r)_0$, $(B-V)_0$, and $(B-R)_0$. We compile the $L_B$ values from all sources in column B_lum; the vast majority of these are taken directly from HyperLeda, while a handful are synthesized from other sources as described above, with the precedence order being HyperLeda, NED, NSA, SGA.

Cleaning the Photometric Measurements Used for Mass Estimates
Because a number of objects have both NSA and HyperLeda data available, we wanted to test which source provided more reliable photometry when both were available. We compared g vs. B magnitudes, due to their similar wavelength, and checked galaxies with significantly differing magnitudes against available NED values. Generally, by comparing to available photometry compiled in NED, we find that NSA photometry is more reliable in cases where it is discrepant from HyperLeda, and we therefore typically prioritize NSA data in estimating galaxy masses. An exception to this is a selection of galaxies with NSA g > 16 and B − g > 1; for these galaxies we use HyperLeda values preferentially, as the HyperLeda values more closely matched other measurements from NED. Based on this analysis, we created the mag_flag column to signify which source provided the more accurate measurement at a given magnitude. 0: the difference between B and g is within one magnitude, so NSA values are used for calculating masses. 1: B and g are discrepant; NSA photometry is more reliable. 2: B and g are discrepant; HyperLeda photometry is more reliable, and is used for estimating masses.
There are 446 additional objects where the NSA photometry is not used. These include objects with flags indicating unreliable photometry, specifically those with the 5th or 6th bitwise flag set in the NSA DFLAGS field (Blanton, M., private communication). We also don't use NSA photometry for 20 objects with g or i Petrosian flux ≤ 0 nanomaggies. Of these 446 galaxies, 306 had other colors and luminosities available for calculating masses.

DERIVING BEST DISTANCES
In this section we discuss our derivation of the best available distance estimates across our full galaxy sample. These distance estimates fall into three broad categories: (i) redshift-based distances determined using the Cosmicflows-3 Distance-Velocity Calculator (Kourkchi et al. 2020), (ii) redshift-independent distances taken from the LVG and from the NASA/IPAC Extragalactic Database (NED; Steer et al. 2017; Steer 2020), and (iii) Virgo Cluster distances, for which objects matching Virgo samples and catalogs were assigned either measured distances or an appropriate mean distance.
In general, we prioritize the redshift-independent and Virgo distances except in cases where the available redshift-based distances are more accurate. Roughly 20% of redshift-based distances are more accurate; this happens once distances exceed 30 Mpc and the error becomes a smaller fraction of the total distance.

Redshift Based Distances
Due to the peculiar motions of galaxies in the nearby Universe, redshift-based distances are not always accurate. However, this accuracy can be improved by using models of the local mass density based on the measured peculiar motions of galaxies (Courtois et al. 2012). We chose to use the Cosmicflows-3 Distance-Velocity Calculator (Kourkchi et al. 2020) because it covered the full distance range of our sample using an up-to-date local mass density model (Graziani et al. 2019). We entered heliocentric velocities into the calculator to retrieve distance estimates corrected for the local mass density. Velocity measurements are given precedence in the order NSA, LVG, and HyperLeda where available.
To estimate errors on the redshift-based distances, we first estimated a typical peculiar velocity for galaxies in the Local Volume. We based this on galaxies with redshift-independent measurements (see next subsection) with errors <10%. We then used the CMB-relative velocity measurements from HyperLeda to calculate the peculiar velocity of each galaxy using $v_{pec} = v_{rad} - H_0 d$, assuming $H_0$ = 70 km s$^{-1}$ Mpc$^{-1}$. We then measured the width of this distribution using the 16th and 84th percentiles, taking half the width to get a typical peculiar velocity of ∼400 km s$^{-1}$.
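A minimal sketch of this estimate is given below. It assumes velocities in km s$^{-1}$ and distances in Mpc, and uses a simple linearly interpolated percentile; the helper names and the five-galaxy example are hypothetical and only illustrate the arithmetic.

```python
def percentile(values, q):
    """Linearly interpolated q-th percentile (0-100) of a non-empty sequence."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)

def typical_peculiar_velocity(v_rad, dist_mpc, h0=70.0):
    """Half the 16th-84th percentile width of v_pec = v_rad - H0 * d."""
    v_pec = [v - h0 * d for v, d in zip(v_rad, dist_mpc)]
    return 0.5 * (percentile(v_pec, 84) - percentile(v_pec, 16))

# Five made-up galaxies at 10 Mpc with peculiar velocities of
# -400, -200, 0, +200, and +400 km/s.
sigma_pec = typical_peculiar_velocity([300.0, 500.0, 700.0, 900.0, 1100.0],
                                      [10.0] * 5)
```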
Our redshift-based distance errors were then calculated by adding and subtracting this typical peculiar velocity from the heliocentric velocity and recalculating the distance using the Cosmicflows-3 Distance-Velocity Calculator. The final redshift-dependent distance error uses a symmetrization of the difference between upper- and lower-limit corrected distances. These distance errors are typically ∼65.3% at 10 Mpc and ∼17.4% at 40 Mpc.
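The error propagation can be illustrated as below. Two assumptions are made that are not taken from the text: a pure Hubble-flow function stands in for the Cosmicflows-3 calculator (which is an external tool and would return different, flow-corrected distances), and the "symmetrization" is taken to be half the difference between the upper and lower distances.

```python
def hubble_distance(v_kms, h0=70.0):
    """Stand-in for the Cosmicflows-3 calculator (assumption): pure Hubble flow."""
    return v_kms / h0

def redshift_distance_error(v_helio, v_pec=400.0, distance_fn=hubble_distance):
    """Symmetrized distance error from shifting the velocity by +/- v_pec."""
    d_hi = distance_fn(v_helio + v_pec)
    d_lo = distance_fn(v_helio - v_pec)
    return 0.5 * abs(d_hi - d_lo)

# Fractional error for a galaxy at cz = 2800 km/s (40 Mpc in this toy model).
v = 2800.0
frac_err = redshift_distance_error(v) / hubble_distance(v)
```

In this toy model the fractional error at 40 Mpc is about 14%, somewhat below the ∼17.4% quoted above, as expected given that the stand-in ignores the local flow model.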

Redshift Independent Distances
We use redshift-independent distances from two sources: (1) the updated LVG catalog distances and (2) the compilation of redshift-independent distances in the NED distance database (NED-D; Steer et al. 2017). We chose to give highest priority to the LVG catalog distances, as these distances typically represent the best available measurements and are kept up to date. We manually examined distance estimates where the values from LVG conflicted with those in NED-D and NED-MED, and found LVG to typically be superior; for instance, the distances to the Maffei group galaxies were based on the recently updated TRGB distances presented in Anand et al. (2019).
For galaxies in NED-D, we chose our distances using the following list of primary indicators, in order of precedence based on the order given in Steer (2020): Cepheids, TRGB, RR Lyrae, Red Clump, SNIa, FGLR, Horizontal Branch, SBF, GCLF, CMD, Type II Cepheids, Miras, PNLF, AGB, Carbon Stars, SNIa SDSS. We use the single most recent distance estimate from the preferred indicator, except if two preferred measurements were from the same year, in which case we apply an average of these preferred distances and errors. If NED-D provided more than two older measurements from the best primary indicator, we ensured that the newest distance was not more than two standard deviations from the averaged older distances. For galaxies with large spreads in their preferred distances, we use the mean of all preferred distances with the best primary indicator. We encode this information in the field dist_nedd_flag. 0: uses single best NED-D measurement; 1: best NED-D distance averaged from multiple measurements; 2: NED-D best indicator mean used, due to discrepancy, as previously described.
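The selection rules above can be summarized in a short function. This is a sketch of one reading of those rules, not the code used for the catalog: the input layout (tuples of indicator, year, distance), the function name, and the omission of error propagation are all simplifying assumptions.

```python
from statistics import mean, stdev

PRECEDENCE = ["Cepheids", "TRGB", "RR Lyrae", "Red Clump", "SNIa", "FGLR",
              "Horizontal Branch", "SBF", "GCLF", "CMD", "Type II Cepheids",
              "Miras", "PNLF", "AGB", "Carbon Stars", "SNIa SDSS"]

def pick_nedd_distance(measurements):
    """measurements: list of (indicator, year, distance_mpc).
    Returns (distance_mpc, flag) following the 0/1/2 flag scheme, or None."""
    for indicator in PRECEDENCE:
        rows = [m for m in measurements if m[0] == indicator]
        if rows:
            break
    else:
        return None  # no measurement with a listed primary indicator
    newest_year = max(year for _, year, _ in rows)
    newest = [d for _, year, d in rows if year == newest_year]
    older = [d for _, year, d in rows if year < newest_year]
    best = mean(newest)
    flag = 0 if len(newest) == 1 else 1
    # With more than two older measurements, check the newest against them.
    if len(older) > 2 and abs(best - mean(older)) > 2 * stdev(older):
        return mean(d for _, _, d in rows), 2
    return best, flag

# Made-up example: TRGB outranks SBF, and the 2019 TRGB value is the newest.
result = pick_nedd_distance([("TRGB", 2015, 3.6), ("SBF", 2020, 4.5),
                             ("TRGB", 2019, 3.5)])
```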

Galaxies in Virgo
Due to the nearby Virgo Cluster's large velocity dispersion, redshift-based distances to Virgo are much more inaccurate than for galaxies in the rest of our sample. Because of this, we treated Virgo as a special case for distance purposes. We matched and compared our sample to galaxies from Mei et al. (2007) and the Extended Virgo Cluster Catalog (EVCC; Kim et al. 2014); we included "certain members" from their catalog as special cases to have their distances revised. These galaxies have velocities within the range of galaxies gravitationally bound to Virgo at a given projected radius.
For these galaxies, we first combined the separate redshift-based and redshift-independent distance lists, using whichever available option had the lowest fractional error. Galaxies with surface brightness fluctuation distances from Mei et al. (2007) are assigned that distance and error. Other Virgo galaxies with only redshift-based distances available are assigned the average distance of 16.5 ± 1.1 Mpc given by Mei et al. (2007). For galaxies with redshift-independent distances, we found more divergent values than expected, including some that didn't agree with Virgo distances. After careful evaluation of individual cases, we assigned the 16.5 ± 1.1 Mpc distance for most objects, keeping the redshift-independent distance only if |D − 16.5| > 3σ, where σ is the error on the distance; this is a total of 862 objects.
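The Virgo assignment rule can be expressed as a small decision function. This sketch covers only the general rule stated above; the individual case-by-case evaluations are not reproduced, and the function and argument names are hypothetical.

```python
VIRGO_DIST, VIRGO_ERR = 16.5, 1.1  # mean Virgo distance and error in Mpc (Mei et al. 2007)

def virgo_distance(sbf=None, zindep=None):
    """Return (distance, error) in Mpc for a Virgo member.
    sbf:    optional (distance, error) from Mei et al. (2007) SBF measurements
    zindep: optional (distance, error) from another redshift-independent source"""
    if sbf is not None:
        return sbf  # SBF distances are adopted directly
    if zindep is not None:
        dist, err = zindep
        # Keep the measurement only if it is >3 sigma from the cluster mean.
        if abs(dist - VIRGO_DIST) > 3.0 * err:
            return dist, err
    return VIRGO_DIST, VIRGO_ERR

# A 17.0 +/- 1.0 Mpc measurement is replaced by the cluster mean,
# whereas a 23.0 +/- 1.5 Mpc measurement is retained.
replaced = virgo_distance(zindep=(17.0, 1.0))
retained = virgo_distance(zindep=(23.0, 1.5))
```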

Final Compilation of Distances
For galaxies with multiple distance sources, our aim was to identify the more reliable and precise distance source. Our distance sources included Virgo distances, LVG distances, NED-D distances with lowest error, and redshift-based distances from the Cosmicflows-3 calculator. We prioritized Virgo distances, and otherwise chose the distance with the lowest error between NED-D and the Cosmicflows-3 redshift-based distances.
Of the 55 objects which had no assigned distance from any of our sources, 29 are assigned the given distances and errors from HyperLeda. The remaining 26 are NSA-only objects for which velocity measurements returned a Cosmicflows-3 corrected distance D ≤ 0; we don't supply a distance in this case.
As well as best distances (the bestdist column) and errors (bestdist_error), each galaxy in the catalog has a field for each specific distance indicator if available. We have also included the field bestdist_source, whose values are 0: Virgo, 1: LVG, 2: NED-D, 3: redshift-based, 4: HyperLeda, 'nan': no distance. The final distribution of galaxy distances, separated using these classifications, is shown in §4.

MASS ESTIMATION
Our goal in this section is to estimate consistent stellar masses across the full sample based on galaxy colors and luminosities. We use the $g-i$ vs. $M/L_i$ color-M/L relation from Taylor et al. (2011) as the basis for our measurements and derive self-consistent relations in other colors using galaxies with overlapping measurements. We use the $g-i$ vs. $M/L_i$ relation as our preferred relation as previous work has found it is accurate to ∼0.1 dex due to the alignment of the extinction and stellar population effects (Zibetti et al. 2009; Taylor et al. 2011). We note that while NIR photometry is widely used to estimate stellar masses assuming a constant M/L (e.g. Meidt et al. 2014), this approach is compromised in galaxies with younger stellar populations by its sensitivity to difficult-to-model TP-AGB stars and hot dust emission (e.g. Kriek et al. 2010; Taylor et al. 2011; McGaugh & Schombert 2014; Telford et al. 2020).
One challenge in deriving stellar masses in our sample is the different photometry available for different galaxies. This creates two issues. First, McGaugh & Schombert (2014) have shown that individual color-M/L relations are not consistent across multiple bands, i.e. using different colors in the Bell et al. (2003) or Zibetti et al. (2009) relations gives systematically different masses for observed galaxy colors. Second, many published relations are not available across both the Sloan and Johnson photometry that we are using for our galaxy sample (e.g. Roediger & Courteau 2015). To solve both of these issues, we translate the Taylor et al. (2011) g − i vs. M/L i relation into the other colors used ((g − r) 0 , (B − V ) 0 , and (B − R) 0 ) using overlapping galaxies to ensure self-consistency.
We first used Equation (7) from Taylor et al. (2011),
$\log(M_{\star,i}/L_i) = -0.68 + 0.70\,(g-i)_0$, (1)
to calculate galaxy stellar mass estimates (log(M ⋆,i /L i )) using (g − i) 0 and L i derived from NSA data. To translate this relation into other bands, we used overlapping samples. First, for 6677 NSA galaxies with both (g − i) 0 and (g − r) 0 colors available, we compared the stellar masses calculated from the i-band to the r-band luminosities, i.e. log(M ⋆,i /L r ). We then fit the M ⋆,r /L r vs. (g − r) 0 relationship using the RANSAC regressor from sklearn.linear_model, which creates a linear fit to this comparison while ignoring outliers, as shown in Fig. 7. We followed the same process with 986 overlapping galaxies between NSA and HyperLeda to translate the color-M/L relation to (B − V ) 0 . Our sample lacks significant overlap between galaxies with both (g − i) 0 and (B − R) 0 , because the latter come primarily from southern sky observations. We therefore used the same method on 822 galaxies with (g − r) 0 and (B − R) 0 to bootstrap an appropriate and self-consistent M/L relation. The resulting equations from these transformations are:
We multiplied the color-based M/L ratios by their corresponding luminosities to calculate stellar mass estimates. We created separate columns for mass estimates from each color: g − i, g − r, B − V , and B − R. To compile our final column of "best" mass estimates for 11740 galaxies, we gave precedence, when available, to (1) 6626 log(M ⋆,i ) values calculated from NSA data using Equation 1, (2) 2705 log(M ⋆,r ) values from SGA data using Eq. 2, (3) 1182 log(M ⋆,V ) values from HyperLeda using Eq. 3, and (4) 1227 log(M ⋆,R ) values from retrieved NED photometry using Eq. 4.
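The final step above (multiplying a color-based M/L by the corresponding luminosity) can be sketched for the i-band case as follows. This is only an illustration: it uses the coefficients of the Taylor et al. (2011) g − i relation (their Equation 7), while the function names and the solar absolute magnitude `M_SUN_I = 4.58` (AB) are assumptions made for the example and are not quoted in the text.

```python
# Coefficients of Taylor et al. (2011), their Equation 7:
#   log10(M*/L_i) = A_GI + B_GI * (g - i)
A_GI, B_GI = -0.68, 0.70

# Assumed i-band absolute magnitude of the Sun (AB); not stated in the text.
M_SUN_I = 4.58


def log_ml_i(g_minus_i):
    """log10 of the i-band stellar mass-to-light ratio in solar units."""
    return A_GI + B_GI * g_minus_i


def log_lum_i(abs_mag_i):
    """log10 of the i-band luminosity in solar units from an absolute magnitude."""
    return -0.4 * (abs_mag_i - M_SUN_I)


def log_stellar_mass(g_minus_i, abs_mag_i):
    """log10(M*/Msun): M/L times luminosity, which is a sum in log space."""
    return log_ml_i(g_minus_i) + log_lum_i(abs_mag_i)


if __name__ == "__main__":
    # Hypothetical galaxy with (g - i)_0 = 1.0 and M_i = -20.0
    print(round(log_stellar_mass(1.0, -20.0), 3))
```

The same two-term structure would apply to the other bands once their fitted slope, intercept, and solar magnitude are substituted.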
To obtain errors on the mass estimates, we use the scatter from multiple estimates for the same objects. To do this, we take the sample of 986 galaxies with both (B − V ) 0 and (g − i) 0 colors, and compare the masses derived with these two separate colors. This comparison includes scatter in the mass estimates due to different bandpass information and independent photometric measurements. The 68th percentile of the mass differences is 0.136 dex. We use this value as an assumed floor on the log(M ⋆ /M ⊙ ) error. We also propagated distance errors by calculating high and low masses after adding and subtracting bestdist error from the distance. For our final logmass error, we use the larger of these two error estimates.
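A minimal sketch of this error assignment is given below. It assumes that the mass scales with distance squared (through the luminosity) and that the distance-based error is the larger of the upward and downward shifts in log mass; the text does not specify how the high and low masses are combined, so that choice and the function names are assumptions.

```python
import math

ERROR_FLOOR_DEX = 0.136  # 68th-percentile scatter quoted in the text


def distance_error_dex(dist_mpc, dist_err_mpc):
    """Shift in log10(M) from moving the distance by +/- its error.

    Luminosity (and hence mass) scales as distance squared, so
    log10(M) shifts by 2 * log10(d_shifted / d).
    """
    high = 2.0 * math.log10((dist_mpc + dist_err_mpc) / dist_mpc)
    low_dist = dist_mpc - dist_err_mpc
    if low_dist <= 0:
        # Lower bound is undefined; fall back to the upward shift (assumption).
        return high
    low = 2.0 * math.log10(dist_mpc / low_dist)
    return max(high, low)


def logmass_error(dist_mpc, dist_err_mpc):
    """Larger of the scatter floor and the distance-propagated error."""
    return max(ERROR_FLOOR_DEX, distance_error_dex(dist_mpc, dist_err_mpc))
```

For a hypothetical galaxy at 20 ± 1 Mpc the floor dominates, whereas at 20 ± 5 Mpc the distance term does.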

Mass Comparisons
Our mass estimates are based on those of Taylor et al. (2011) and thus should be consistent with the widely used GAMA mass functions (Baldry et al. 2012; Kelvin et al. 2014; Wright et al. 2017; Driver et al. 2022) that provide the deepest mass function estimates for galaxies in the local Universe. The Taylor et al. (2011) mass estimates are based on the Bruzual & Charlot (2003) stellar population models and assume a Chabrier (2003) initial mass function and the dust extinction law from Calzetti et al. (2000). Taylor et al. (2011) provide an extensive comparison of their color-M/L relation to earlier color-M/L relations. In particular, their Fig. 13 shows that their relation is ∼0.2 dex below the Bell et al. (2003) relation (due to IMF differences), and has a similar normalization to but a significantly different slope from the Zibetti et al. (2009) relation, due to differences in the weighting of stellar population models. Here we compare our derived masses against dynamical mass estimates from ATLAS 3D (Cappellari et al. 2011) and IR-based mass estimates from the S4G survey (Sheth et al. 2010).

ATLAS 3D Dynamical Mass Comparison
ATLAS 3D is a volume-limited sample of 260 nearby early-type galaxies within 42 Mpc (Cappellari et al. 2011). Dynamical masses were derived using Jeans anisotropic modeling of the integral field kinematics in Cappellari et al. (2013a) and Cappellari et al. (2013b). We compare our masses to those derived from the stellar-only dynamical (M/L) stars estimates from Cappellari et al. (2013b), shown in Fig. 5. The ATLAS 3D masses are mostly larger than our masses, with a median offset of 0.34 dex. The offset is near zero in the lowest mass galaxies (log(M ⋆ /M ⊙ )<10), and is highest (nearly 0.5 dex) at the highest masses. These offsets are consistent with the findings of Cappellari et al. (2012), who found that a vast majority of ATLAS 3D galaxies had dynamical mass estimates that were heavier than stellar population model estimates based on a Chabrier (2003) IMF, with a typical offset roughly a factor of 2 (0.3 dex) higher. They found that this offset depends strongly on the observed velocity dispersion, with the highest dispersion and mass galaxies having the largest offsets, just as observed here. Overall, this comparison suggests that our mass estimates are accurate given their assumption of a Chabrier IMF, but that this assumption may be a poor one, especially for the most massive early-type galaxies, and our stellar mass estimates may significantly underestimate the masses of these galaxies.

S4G IR-based Mass Comparison
The Spitzer Survey of Stellar Structure in Galaxies (S4G) is a volume-, magnitude-, and size-limited survey of over 2300 nearby galaxies at 3.6 and 4.5 µm (Sheth et al. 2010). The available mass estimates are based on IR luminosities using the relation of Eskew et al. (2012). This relation uses resolved-star stellar mass estimates of the LMC from Harris & Zaritsky (2009), which assume a Salpeter IMF, and combines these with IR fluxes at 3.6 and 4.5 µm to get a luminosity to stellar-mass conversion. This relation was applied to all the S4G galaxies to derive the stellar masses available in the v. 2 S4G catalog. Using these values, we match 2346/2352 galaxies (99.7%) and plot our derived masses vs. the S4G Eskew et al. (2012) masses (after subtracting 0.24 dex to account for the difference expected in mass between a Chabrier and Salpeter IMF; Cappellari et al. 2012) in Fig. 5. After the correction for the Salpeter IMF, we see very close agreement between our masses and the S4G masses, with a bias of just 0.03 dex (with S4G masses being very slightly more massive). The one-to-one relation extends over a wide range of galaxy masses, and the 1σ scatter around this relation is just 0.24 dex. Overall, we find good agreement between the S4G masses and the masses we derive after accounting for different IMF assumptions.

Mass Completeness
Many science applications benefit from complete, volume-limited samples of galaxies. To gauge the completeness of our sample, we compare our galaxy catalog to mock galaxy samples created from galaxy stellar mass functions. Above, we have derived masses for our galaxies based on the color-stellar mass relations used in the GAMA survey (Baldry et al. 2010; Taylor et al. 2011); we therefore use the GAMA galaxy stellar mass functions to evaluate the completeness of our catalog.
For the full sample of galaxies, we use the most recent stellar mass function published by GAMA from Driver et al. (2022), which extends down to M ⋆ ∼ 10^7 M ⊙ . We simulate a galaxy population within a volume of (4/3)π(50 Mpc)^3 based on the Driver et al. (2022) mass function. We then compare the number of galaxies with available masses in our catalog to this mock galaxy sample in 0.2 dex mass bins to calculate the fraction of the expected number of galaxies that we detect. We then translate this fraction into a "Volume Completeness Radius", which we define as the radius out to which we would have a complete sample based on the total number of galaxies in our sample. For instance, if we detect 50% of the number of expected galaxies, this translates to a Volume Completeness Radius of ∼40 Mpc. Fig. 6 shows that the catalog is >50% complete at M ⋆ ≳ 10^8.5 M ⊙ , which corresponds to an effective volume with radius R ∼ 40 Mpc. At lower masses, the completeness drops significantly. At the highest masses, the sample also appears to become less complete, a point which we come back to below.
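The conversion from detected fraction to Volume Completeness Radius follows from the volume scaling as radius cubed. A short sketch, assuming a uniform expected number density within 50 Mpc (the function name is hypothetical):

```python
R_MAX_MPC = 50.0  # outer radius of the catalog volume


def volume_completeness_radius(n_observed, n_expected, r_max=R_MAX_MPC):
    """Radius of the sphere that the observed galaxies would fill completely.

    With a uniform number density, N scales as R**3, so the radius scales
    as the cube root of the detected fraction. Fractions above one are not
    clipped and give radii larger than r_max.
    """
    if n_expected <= 0:
        raise ValueError("n_expected must be positive")
    fraction = n_observed / n_expected
    return r_max * fraction ** (1.0 / 3.0)


if __name__ == "__main__":
    # Detecting half of the expected galaxies in a mass bin
    print(round(volume_completeness_radius(50, 100), 1))
```

Detecting half of the expected galaxies gives a radius of about 39.7 Mpc, consistent with the ∼40 Mpc quoted above.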
The Driver et al. (2022) paper does not have a morphology or color separation comparable to what we have done here, so we use an earlier GAMA paper by Baldry et al. (2012), which is more directly comparable to our color-based morphology separation (i.e. the color type in our sample). We note that the color separations used here and in Baldry et al. (2012) are similar but not identical. Fig. 6 shows that for the late-type galaxies, the Volume Completeness Radius rises to the expected 50 Mpc at stellar masses of ∼10^10 M ⊙ . The early-types achieve a similar completeness at much lower masses, but then the completeness declines at the highest masses. Values greater than one in this figure could be due to: (1) a real difference between the number density of galaxies in the local Universe and that in the regions used for the Baldry et al. (2012) mass function, or (2) differences in the color separation between our study and theirs. Note that a direct comparison using identical color separations is not possible due to their use of u − r colors in separating their populations, and the lack of u − r colors for many galaxies in our sample. The bottom panel of Fig. 6 shows the fraction of early-type galaxies as a function of stellar mass in our sample compared to the Baldry et al. (2012) mass function. This shows that the fraction of red galaxies is higher than expected in the local Universe at low masses, but lower at higher masses.
The lack of high mass galaxies overall, and specifically of high mass early-type galaxies, is a surprising result, because we expect to be complete at the highest masses. This therefore implies that these galaxies are underdense in the local Universe relative to the volumes probed in the GAMA survey. We re-emphasize that our stellar mass estimates should be consistent with GAMA, thus making the difference more likely to represent a real deficit of massive early-type galaxies locally relative to the GAMA volume.
Figure 6. [...] Colored lines show comparisons between color type separated catalog galaxies and color-separated mock samples created using the Baldry et al. (2012) galaxy stellar mass functions. Bottom - Percent of galaxies classified as "Red" by our catalog and the Baldry et al. (2012) sample in each bin.

Galaxy Morphologies
Many galaxies in the nearby universe have been morphologically classified using the Hubble and de Vaucouleurs classification systems. Galaxies can be separated into two broad categories: early-types are bulge-dominated, elliptical or lenticular in shape, and host little-to-no current star formation. Late-types have a spiral disk shape with smaller central bulges and undergo current star formation. Numerical Hubble T classification ranges from -6 to 10, with negative values corresponding to early-type galaxies and positive values to late-types. In addition, early- and late-type galaxies are often separated using galaxy color-magnitude or color-mass diagrams. At a given stellar mass, early-type galaxies are typically associated with redder colors, while late-types are associated with bluer colors. In this section, we discuss the source of our morphological type information. We then describe how we translate this into a color-based separation of early- and late-type galaxies. We then evaluate the X-ray active fraction as a function of stellar mass and morphology/color in §6.
Figure 7. [...] Taylor et al. (2011) relation to other colors to ensure consistency between masses calculated in different bands. We note that although this (g − r)0-M/Lr relation is used for galaxies with SGA data, we fit this relation with just NSA photometry to ensure a large sample of galaxies and minimize photometric outliers. A similar process was used to also create the other color-M/L relations given in §4.
We use the numerical T -type translation of the morphological classifications in this work. Together, HyperLeda and LVG provide numerical T -types for 13744 of the 15424 (89%) galaxies in our catalog. Of the remaining galaxies, 603 sample galaxies have available color information. Furthermore, as can be seen in the left panel of Fig. 9, the early- and late-type galaxies (a.k.a. the red sequence and blue cloud) show considerable overlap, especially for low-mass (<10^9 M ⊙ ) galaxies. In order to give all sample objects an appropriate morphology, we worked to minimize the potential for morphological misclassification and calculated a color-based type for all galaxies with available color information.
To separate galaxies into early- and late-types using the color-mass diagram, we first calculated a morphological separation line by minimizing the misclassification of galaxies with both morphological T -types and NSA colors. We separated galaxies into early- and late-type based on their T -type (with T > 0 being late-type), and fit a line to the color-mass diagram that minimizes the number of misclassifications, i.e. late-types redder than the line and early-types bluer than the line. To fit this line, we first divided the data into 0.25 dex mass bins and calculated the fraction of misclassifications as a function of (g − i) 0 color. We then took the (g − i) 0 value with minimum misclassification in each bin and found the best-fit line to these values to establish our color-based morphological separation line. We note that we only fit the line between log(M ⋆ /M ⊙ ) of 8.5 and 10.5, where galaxy mass estimates are more robust. This line's equation is:
This equation appears as the upper dashed line in the right panel of Fig. 9.
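The per-bin procedure described above can be sketched as follows. This is an illustrative version only: it scans a caller-supplied grid of candidate colors, counts misclassifications rather than fractions (equivalent within a single bin), and uses an ordinary least-squares line through the per-bin minima. The function names and the input layout are assumptions.

```python
def best_color_threshold(colors, is_late, candidates):
    """Return (threshold, n_misclassified) with the fewest misclassifications.

    A late-type redder than the threshold, or an early-type at or bluer
    than it, counts as misclassified.
    """
    best = None
    for t in candidates:
        n_bad = sum(
            1
            for c, late in zip(colors, is_late)
            if (late and c > t) or (not late and c <= t)
        )
        if best is None or n_bad < best[1]:
            best = (t, n_bad)
    return best


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

In use, `best_color_threshold` would be called once per 0.25 dex mass bin, and `fit_line` would then be applied to the bin centers and their best thresholds.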
Visual inspection of SDSS imaging of the galaxies misclassified by our morphological separation line suggested that while most red late-types showed distinct signatures of being dusty late-type galaxies (i.e. spiral arms or clumpy regions of active star formation), many blue early-types were in fact not properly classified and also showed these signatures of ongoing star formation. This was especially true for early-type galaxies that had colors much bluer than our morphological separation line. To better separate early- and late-type galaxies, we decided to reclassify very blue early-type galaxies. We created a second reclassification cutoff line below which we reclassified these galaxies in our best type column (as described below). To create this line, we sorted the early-type contaminants by color in 0.25 dex mass bins. For each mass bin, we visually classified these contaminants as early- or late-type and then calculated a threshold color such that legitimate misclassification was minimized within the cutoff. Then, as above, we fit a reclassification cutoff line to the colors determined in each bin to get:
This equation appears as the lower dotted line in the right panel of Fig. 9.
Figure 9. Morphological classifications of our galaxies in the color-mass plane. Left - a color-mass diagram of 6677 galaxies with g − i data from NSA and morphologies from HyperLeda and LVG. Right - the same color-mass distribution (early-types in red contours, late-types in blue), with the black line showing our linear fit to most cleanly separate early- and late-type galaxies; this fit is extrapolated to larger and smaller masses (gray line). The dotted line and blue shaded region show where we reclassify blue early-type galaxies as late-type based on visual inspection of these galaxies (see §5).
To apply our color-based reclassification across our full sample, we needed to transform our morphological separation and reclassification cutoff lines to (g − r) 0 , (B − V ) 0 , and (B − R) 0 . Note that for deriving the (B − R) 0 transformation, we added (B − R) 0 data from Cook et al. (2014) for galaxies in our sample. Using overlapping galaxies and RANSAC fitting as described above, we derived the following color transformations.
Applying these color transformations, we get the following morphological separation lines:
and the reclassification lines:
For our final catalog, we provide several morphological types. In addition to our catalog columns for numerical T -types and their source, we created two type columns which broadly categorize galaxies as 'early' or 'late'. Our color type classifies a galaxy as early if its color lies above the morphological separation line, and late if it lies below the line. Our recommended best type column follows this process:
(1) Most galaxies are classified based on their T -type from HyperLeda or Karachentsev et al. (2013), with T ≤ 0 being 'early' and T > 0 being 'late'.
(2) The 387 early-type galaxies that have colors bluer than the reclassification cutoff are reclassified as late-type.
(3) The 603 galaxies without a morphology are assigned their color-based type. We reclassify each galaxy based on the color used to estimate mass in the final catalog.
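The three-step assignment above can be written as a small decision function. The sketch below is illustrative: each line is passed in as a (slope, intercept) pair in the color vs. log mass plane, and the coefficients used in the usage note are placeholders, not the fitted values from this work.

```python
def color_type(color, logmass, separation_line):
    """'early' if the color lies above the separation line, otherwise 'late'."""
    slope, intercept = separation_line
    return "early" if color > slope * logmass + intercept else "late"


def best_type(t_type, color, logmass, separation_line, reclass_line):
    """Combine T-type and color information following steps (1) to (3).

    t_type, color, and logmass may be None when unavailable.
    Returns 'early', 'late', or None if nothing is available.
    """
    has_color = color is not None and logmass is not None
    if t_type is not None:
        # Step (1): classify from the numerical T-type.
        morph = "early" if t_type <= 0 else "late"
        # Step (2): very blue early-types are reclassified as late-type.
        if morph == "early" and has_color:
            slope, intercept = reclass_line
            if color < slope * logmass + intercept:
                return "late"
        return morph
    # Step (3): no morphology, so fall back to the color-based type.
    if has_color:
        return color_type(color, logmass, separation_line)
    return None
```

For example, with placeholder lines `separation_line = (0.1, 0.0)` and `reclass_line = (0.1, -0.3)`, a T = −2 galaxy with color 0.5 at log mass 10 falls below the cutoff value of 0.7 and is returned as 'late'.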
In total, we have best type values for 14347 out of 15424 galaxies, and color type values for 11740 galaxies. We show the masses of galaxies separated into early- and late-type in the left panel of Fig. 10, and also include the masses of all galaxies with T -types in the right panel.

Galaxy Environments: Group Catalog Assignment
The galaxies in the local Universe span a wide range of environments, from dense clusters like Virgo to galaxies in voids. One common way of identifying galaxies in dense environments is through group catalogs. To enable easy identification of galaxies in groups in our survey, we use the Lambert et al. (2020) group catalog based on the 2MRS survey. This group catalog identifies 3022 groups from the 2MRS catalog (which extends well beyond our 50 Mpc limit) using a friends-of-friends algorithm.
To match our galaxy catalog, we first used Table 3 from Lambert et al. (2020), which gives individual galaxies that are part of a group or subgroup using the paper's friends-of-friends algorithm. We crossmatched our sample to galaxies with v cmb < 3867 km s −1 (based on our highest redshift galaxy), resulting in 2463/2864 unique matches within 30 ′′ . We wanted to ensure that we also matched low-mass sample galaxies which might not be included in the Lambert et al. (2020) catalog due to the magnitude limit of 2MRS. To do this, we used Table 1 from Lambert et al. (2020), which includes group positions, comoving distances, velocity dispersions, and radius/R 200 estimates. We crossmatched our entire sample to find the nearest group for each galaxy, considering sample galaxies to be part of a group if they fulfilled two requirements: (i) the velocity of our sample galaxy must be within 2× the velocity dispersion of the group, and (ii) the angular separation between the galaxy and its nearest group must be smaller than the angular radius of that group (d ≤ R 200 ). Using this three-dimensional approximation, we flagged an additional 790 sample galaxies as likely group members. As expected, the median log(M ⋆ /M ⊙ ) for the distinct group method matches is ∼8.4, compared to ∼10 for the direct galaxy matches.
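The two membership requirements can be expressed as a simple test. The sketch below makes several assumptions not spelled out in the text: the velocity criterion is taken relative to the group velocity, the angular radius is R 200 divided by the group distance (small-angle approximation), and all argument names are hypothetical.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula); inputs and output in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (
        math.sin((dec2 - dec1) / 2) ** 2
        + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2
    )
    return math.degrees(2 * math.asin(math.sqrt(a)))


def angular_radius_deg(r200_mpc, group_dist_mpc):
    """Angle subtended by R200 at the group distance (small-angle approximation)."""
    return math.degrees(r200_mpc / group_dist_mpc)


def is_group_member(gal_ra, gal_dec, gal_vel,
                    grp_ra, grp_dec, grp_vel, grp_sigma,
                    grp_r200_mpc, grp_dist_mpc):
    """Apply criteria (i) and (ii): velocity within 2 sigma, position within R200."""
    close_in_velocity = abs(gal_vel - grp_vel) <= 2.0 * grp_sigma
    separation = angular_separation_deg(gal_ra, gal_dec, grp_ra, grp_dec)
    close_on_sky = separation <= angular_radius_deg(grp_r200_mpc, grp_dist_mpc)
    return close_in_velocity and close_on_sky
```

A galaxy would be tested only against its nearest group on the sky, as described above.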
We assigned Lambert et al. (2020) Group IDs to 3253 sample galaxies, giving precedence to those with direct galaxy matches. This simultaneously flags galaxies as being part of a group, and easily allows users to seek further group information (including group radius and velocity dispersion) by cross-correlating our catalog with the catalogs available from Lambert et al. (2020) on the publisher's website.
The Virgo cluster is indicated in the Lambert et al. (2020) catalog as cluster 2987; of the 347 galaxies with that group identification, we find that 276 (79%) are included in Virgo based on the distance sources in §3. This is a small fraction of the 907 total galaxies that we identify as being part of Virgo in our catalog, because the Lambert et al. (2020) catalog is based on 2MASS and thus misses many fainter Virgo galaxies.

Background
The initial motivation for creating this sample was to quantify the demographics of central black holes in nearby lower mass galaxies. These demographics can provide constraints on the formation mechanisms of central massive black holes (see the recent review by Greene et al. 2020). X-ray measurements of AGN in nearby galaxies (≲ 50 Mpc) can provide deep and clean enough observations to constrain the occupation fraction of black holes in galaxies below 10^10 M ⊙ . Results on large samples of early-type galaxies (Miller et al. 2015; Gallo & Sesana 2019) suggest a high occupation fraction for 10^9 to 10^10 M ⊙ galaxies, in agreement with dynamical measurements in a handful of nearby early-type galaxies (Nguyen et al. 2018, 2019). However, a majority of galaxies below 10^10 M ⊙ are late-type galaxies (Baldry et al. 2012), and thus far no suitable galaxy sample exists to compare the X-ray active fraction (and thus occupation fraction) of late-type galaxies to the studies of early-type galaxies at masses below M ⋆ ∼ 10^10 M ⊙ . Recent compilations of archival X-ray data in nearby galaxy samples cut out most low mass nearby galaxies by requiring redshift-independent distance measurements (She et al. 2017a,b), or by including only brighter samples dominated by massive galaxies (Bi et al. 2020). The X-ray active fractions of more distant dwarfs (between z of 0.08 and 2.4) have also been studied by Mezcua et al. (2018) in the COSMOS field. Our new 50 Mpc Galaxy Catalog provides the best starting point for constraining the local X-ray active fraction and the occupation fraction of central black holes. We focus specifically on trying to get as large and complete a sample as possible down to lower masses (i.e. 10^8 M ⊙ ), with stellar mass estimates and Hubble type measurements based on color index. Below we quantify the X-ray active fraction using our new catalog.
Detection of a nuclear X-ray source above 10^42 erg s −1 (the "classical" AGN threshold) is invariably associated with accretion-powered emission from a massive (i.e., non-stellar) black hole. Below this threshold, and even more so below 10^40 erg s −1 , emission from bright X-ray binaries could contribute to the detected signal. The expected luminosity from low-mass X-ray binaries in particular scales with stellar mass (Lehmer et al. 2019), which makes spatial resolution key for discriminating between low-Eddington ratio massive BHs and contaminants. Owing to its factor of ∼4 higher resolution, Chandra is thus preferable to XMM-Newton for searching for and identifying massive black holes with luminosities lower than traditional AGN.
We present the Chandra-based X-ray active measurements of our galaxy catalog here, without an attempt to model the distribution or account for contamination from X-ray binaries. However, in a subsequent paper, we will use these data to constrain the L X − M ⋆ relationship and the occupation fraction as a function of galaxy stellar mass, including estimates of contamination from X-ray binaries (Gallo et al. in prep).

50 Mpc Active Fraction Measurement
We determined which galaxies in our catalog have available Chandra X-ray data using the Chandra Source Catalog 2.0 (Evans et al. 2010, and Evans in prep), which includes Chandra observations made public before the end of 2014. We search for X-ray sources coincident with our galaxies' nuclei using their catalog ra and dec. Note that our choice of coordinates has been discussed in §2.1.2, and the effect of uncertainties on these central coordinates is considered in more detail in §6.2.2. Using Chandra's CSCView, we retrieved two tables from a crossmatch with our catalog: (1) a table of limiting sensitivities for all 1506 objects with Chandra data, and (2) a table of detections within a given angular separation (1-3 ′′ ) of the galaxies' ra and dec. The second table of detections contains 291 unique galaxies with matching detections when using a 1 ′′ search radius (note that 20 objects returned multiple sources). We use our distance estimates (best dist) to calculate X-ray luminosities from the 0.5-7 keV fluxes. These have a median log(L X /erg s −1 ) of 39.21 (hereafter we just refer to this as log(L X )), with a full range of log(L X ) from 33.19 to 42.26. While some of these sources are likely contaminants from X-ray binaries and not accreting massive black holes, we attempt to minimize this contamination in two ways: (1) we restrict our X-ray activity analysis below to the 249 (85%) sources that have available log(M ⋆ /M ⊙ ) and log(L X )>38.3 (as in Miller et al. 2015; Gallo & Sesana 2019), and (2) we use a 1 ′′ match as our default matching radius. We note that previous work on X-ray binary contamination in similar Chandra observations has shown that only a small fraction (∼10%) of nuclear X-ray detections with log(L X )>38.3 are likely to be contaminants (e.g. Gallo et al. 2010; Miller et al. 2012; Foord et al. 2017). We include the columns chandra observation and chandra detection to flag matched catalog galaxies using our default 1 ′′ matching, and also include a chandra detection 3arcsec column for galaxies with sources within 3 ′′ (see §6.2.2); the X-ray luminosity is given in the log lx column.
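The conversion from flux and distance to X-ray luminosity, and the log(L X ) > 38.3 cut, can be sketched as below. The sketch assumes isotropic emission with no band or redshift corrections, which is a simplification rather than a statement of the exact procedure used here; the function names are hypothetical.

```python
import math

CM_PER_MPC = 3.0857e24  # centimeters in one megaparsec


def log_lx(flux_cgs, dist_mpc):
    """log10 of the luminosity (erg/s) for a flux in erg/s/cm^2 at dist_mpc."""
    dist_cm = dist_mpc * CM_PER_MPC
    return math.log10(4.0 * math.pi * dist_cm ** 2 * flux_cgs)


def passes_luminosity_cut(flux_cgs, dist_mpc, threshold=38.3):
    """True if the source is brighter than the log(L_X) threshold used above."""
    return log_lx(flux_cgs, dist_mpc) > threshold


if __name__ == "__main__":
    # Hypothetical source: 1e-14 erg/s/cm^2 in the 0.5-7 keV band at 10 Mpc
    print(round(log_lx(1e-14, 10.0), 2))
```

Because the luminosity scales with distance squared, a fixed flux limit corresponds to a luminosity limit 100 times higher at 50 Mpc than at 5 Mpc.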
Of the 249 galaxies used for our active fraction analysis, 159 have available SDSS imaging. For visualizing the X-ray sources in our sample, we show 42 of these galaxies in Fig. 11. In the first three rows of the figure we show the 18 lowest-mass SDSS galaxies, and a sampling of galaxies at higher masses in the bottom half. The 18 lowest mass galaxies have log(M ⋆ /M ⊙ ) from 7.99 to 9.61; there are an additional 4 galaxies in this mass range without available SDSS imaging.

Local Active Fraction as a Function of Galaxy Mass and Type
We calculate the X-ray active fraction by dividing the number of galaxies with Chandra detections by the number of galaxies with Chandra observations. We use a 1 ′′ matching radius to limit the impact of X-ray binary contamination and take advantage of Chandra's excellent resolution (e.g. Gallo et al. 2010); however, we examine the results of larger matching radii in the next subsection. Results of this X-ray active fraction, binned by mass and separated by morphological type, are shown in Fig. 12. Although some detections may be caused by nuclear X-ray binaries, this fraction serves as an upper limit of AGN occupation for our catalog.
Figure 11. Images of catalog galaxies with Chandra sources within 1 ′′ of their centers that have log(LX )>38.3 and have available SDSS imaging. The white circle is centered on the catalog's galaxy coordinates and shows the 1 ′′ radius used to match to Chandra detections, while the red dot marks the Chandra source position. Each image has a width of 20 ′′ . The galaxies are arranged in order of increasing log(M⋆/M⊙). The top three rows show the 18 galaxies with the lowest log(M⋆/M⊙), and represent 18/22 lowest mass galaxies with X-ray sources. The bottom four rows show galaxies at higher masses sampled evenly in log(M⋆/M⊙). Image labels provide catalog objname, logmass, and best type (red are early-types, and blue are late-types).
We observe an apparent difference in detection fraction between early- and late-type galaxies, particularly within the mass range 9 ≲ log(M ⋆ /M ⊙ ) ≲ 10.5. We examine the significance of this difference following the methods of Hoyer et al. (2021). We calculate p-values under the null hypothesis that both early- and late-type galaxies have the same X-ray active fraction. We split the data into 0.25 dex bins from 7.75 ≤ log(M ⋆ /M ⊙ ) ≤ 11.25. We use the total X-ray active fraction in each bin as an estimator of the binomial success probability. Using this fraction, we draw n samples from a binomial distribution, where n is the number of galaxies per bin in both the early- and late-type subsamples. We repeat this exercise 10^6 times, and use the fraction of estimates that match or exceed our observed difference between early- and late-type galaxies to estimate a p-value for each bin. Using Fisher's method (Fisher 1992), we combine all p-values into a single parameter, under the assumption that the null hypothesis is true and that the data in each mass bin are independent. The final p-value is 0.0004 over the entire mass range, thus suggesting that the observed enhancement of early-type galaxies is significant at a level >3σ. This significance increases if, for instance, we consider just the galaxies between 9 ≤ log(M ⋆ /M ⊙ ) ≤ 10.5. Thus, it appears there is a significant enhancement of X-ray detections in early-type galaxies versus late-type galaxies. We examine whether this result is due to a higher AGN fraction in early-type galaxies below.
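The resampling test and the combination of per-bin p-values can be sketched with the standard library alone. Several choices below are assumptions made for illustration: the difference is taken as early minus late (one-sided), the p-value uses the (hits + 1)/(n_sim + 1) convention so that it is never exactly zero, and the default number of simulations is far smaller than the 10^6 used above so that the example runs quickly.

```python
import math
import random


def binomial_draw(rng, n, p):
    """Number of successes in n Bernoulli(p) trials."""
    return sum(1 for _ in range(n) if rng.random() < p)


def bin_p_value(k_early, n_early, k_late, n_late, n_sim=2000, seed=0):
    """Monte Carlo p-value for one mass bin under a shared active fraction."""
    rng = random.Random(seed)
    pooled = (k_early + k_late) / (n_early + n_late)
    observed = k_early / n_early - k_late / n_late
    hits = 0
    for _ in range(n_sim):
        sim_early = binomial_draw(rng, n_early, pooled) / n_early
        sim_late = binomial_draw(rng, n_late, pooled) / n_late
        if sim_early - sim_late >= observed:
            hits += 1
    return (hits + 1) / (n_sim + 1)


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p) follows a chi-square distribution with
    2k degrees of freedom, whose survival function has the closed form
    exp(-X/2) * sum_{j<k} (X/2)**j / j!.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals X / 2
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half_stat / j
        total += term
    return math.exp(-half_stat) * total
```

In use, `bin_p_value` would be evaluated for each 0.25 dex mass bin and the resulting list passed to `fisher_combine`; bins with no galaxies of one type would need to be skipped first.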

Local Active Fraction Uncertainties and Systematics
One significant potential uncertainty is the locations of galaxy centers, and whether our catalog coordinates accurately reflect the true center of each galaxy. To gauge the impact of this uncertainty, we conducted the same analysis using 2 ′′ and 3 ′′ separation constraints for Chandra detection matching. As shown in the left panel of Fig. 13, larger matching radii increase the X-ray active fraction for both types overall. The difference between early- and late-type active fractions depicted in Fig. 12 still exists for larger separation constraints, but the statistical significance decreases to p = 0.0019 for 2 ′′ and p = 0.0104 for 3 ′′ .
The decreasing difference in X-ray active fractions between early- and late-types with larger matching radii could be explained if the centers of late-type galaxies were less certain due to their lower surface brightness and more complicated morphology. The central surface brightness does depend on galaxy mass and luminosity (e.g. Ferrarese et al. 2006; Blanton et al. 2003), with decreasing surface brightness in lower mass galaxies below roughly the Milky Way's luminosity/mass. Late-type galaxies also generally have lower surface brightness than early-type galaxies (e.g. Blanton et al. 2003). In addition, many of our late-type galaxies are irregulars; these dominate our late-type galaxy number counts below log(M ⋆ /M ⊙ )=8.0. Finally, distance plays a role in our ability to accurately locate the angular center of our galaxies, with more distant galaxies likely having more accurate angular coordinates.
As noted in §2.1.1, we see significant discrepancies between our catalog positions (mostly from HyperLeda) and the best available positions from NED. These offsets give us a sense of how uncertain the nuclear position of each galaxy is. In the right panel of Fig. 13, we show the fraction of galaxies with offsets larger than 1″, 2″, and 3″ as a function of galaxy type (early vs. late) and stellar mass. The fraction of galaxies with significant position differences between NED and our catalog is noticeably larger for late-type galaxies than early-type galaxies in the mass range 7.5 ≲ log(M⋆/M⊙) ≲ 9.5. For a 1″ position match, the differences between early- and late-types are large enough (>20% at some masses) to explain the lower active fractions we see in late-type galaxies. These larger positional uncertainties are likely tied to the lower surface brightness and irregular morphology amongst our late-type galaxies. However, at higher masses (9.5 ≲ log(M⋆/M⊙) ≲ 10.5), the positional uncertainty differences become smaller than the active fraction differences.
To further examine the possibility of uncertain nuclear coordinates impacting our active fraction measurements, we visually examined all 17 galaxies (16 of which are late-type) with log(M⋆/M⊙) ≲ 9.5 that had X-ray sources within 3″ of their centers, but not within 1″. In 11 of these 17 galaxies, the X-ray sources are clearly offset from the galaxy centers. In another five galaxies it was ambiguous whether the X-ray source was nuclear or not, while in only one galaxy (NGC 598/M33) it was clear that the X-ray source was nuclear and that our nuclear position is offset by ∼1″ from the true center of the galaxy (this galaxy has a known nuclear star cluster that is visible in SDSS imaging). This shows that accurate nuclear positions are an important factor to consider when examining the presence of AGN in nearby galaxies, especially in lower-mass late-type galaxies. However, it does not appear that errors in nuclear positions fully account for the difference we see between early- and late-type galaxy active fractions.
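The offset fractions discussed above can be illustrated with a short sketch. It assumes each galaxy has one catalog position and one NED position in decimal degrees; the function names and the example coordinates are hypothetical and are not taken from the catalog.

```python
import math

def angular_separation_arcsec(ra1_deg, dec1_deg, ra2_deg, dec2_deg):
    """Great-circle separation between two sky positions, in arcseconds (haversine form)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1_deg, dec1_deg, ra2_deg, dec2_deg))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    h = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0

def fraction_with_large_offset(catalog_pos, ned_pos, threshold_arcsec):
    """Fraction of galaxies whose two positions differ by more than the threshold."""
    seps = [angular_separation_arcsec(r1, d1, r2, d2)
            for (r1, d1), (r2, d2) in zip(catalog_pos, ned_pos)]
    return sum(s > threshold_arcsec for s in seps) / len(seps)

# Made-up positions (degrees): declination offsets of 0.5, 1.5, 2.5 and 3.5 arcsec.
catalog_pos = [(10.0, 0.0), (150.0, 30.0), (200.0, -45.0), (23.46, 30.66)]
ned_pos = [(10.0, 0.5 / 3600), (150.0, 30.0 + 1.5 / 3600),
           (200.0, -45.0 + 2.5 / 3600), (23.46, 30.66 + 3.5 / 3600)]

fractions = {t: fraction_with_large_offset(catalog_pos, ned_pos, t) for t in (1.0, 2.0, 3.0)}
```

In practice the same calculation would be repeated in bins of stellar mass and separately for early- and late-type galaxies.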

Local Active Fraction as a Function of Galaxy Mass and Color
We also repeated the process to calculate X-ray active fractions using only color information to separate early- and late-type galaxies (using the color type described in §5.1). This is shown in the lower-right panel of Fig. 12. When classifying type based only on color, X-ray active fractions appear similar in all mass bins. This change from our primarily morphology-based types results from a suppression of the X-ray active fraction in red galaxies relative to early-type galaxies. This is primarily caused by galaxies that are classified as late-type but have red colors that lie above our color-based type separation (Fig. 9). Visual inspection shows that the majority of these red late-type galaxies are dusty spiral galaxies, many of which are viewed at high inclinations. This suggests that, despite the potential issues with visual morphological classifications (see §5.1), they provide valuable information on galaxy type over and above the color information. This is particularly true for red late-type galaxies, which are quite numerous and have accurate morphological classifications.
The active fraction measurement we present here is the first step in a more sophisticated occupation fraction analysis (including accounting for XRB contamination) that will be presented in a follow-up paper (Gallo et al., in prep).
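The shaded bands in the lower panels of Fig. 12 are Agresti-Coull binomial confidence intervals. A minimal sketch of that interval for a single mass bin follows; the default z = 1 (roughly a 68% interval) is an assumption, since the confidence level is not stated here, and the function name is illustrative.

```python
import math

def agresti_coull_interval(n_detected, n_observed, z=1.0):
    """Agresti-Coull interval for a binomial fraction.

    z is the normal quantile; z = 1.0 (about 68%) is an assumed default.
    """
    n_tilde = n_observed + z ** 2
    p_tilde = (n_detected + 0.5 * z ** 2) / n_tilde
    half_width = z * math.sqrt(p_tilde * (1.0 - p_tilde) / n_tilde)
    # Clip to the physically meaningful range [0, 1].
    return max(0.0, p_tilde - half_width), min(1.0, p_tilde + half_width)

# Hypothetical bin: 5 detections with log(L_X) > 38.3 among 20 observed galaxies.
low, high = agresti_coull_interval(5, 20)
```

Unlike the simple normal approximation, this interval stays well behaved in bins with few galaxies or zero detections, which is the regime of the lowest-mass bins.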

CONCLUSIONS
In this paper, we present a robust catalog of 15424 nearby galaxies within 50 Mpc with self-consistent distance, mass, and morphological type measurements. This catalog was constructed primarily to enable X-ray active fraction and occupation fraction measurements, but the utility of this catalog spans a much broader range of possible applications. In particular, it provides a more complete catalog of lower mass galaxies than were included in recent similar studies (She et al. 2017a; Bi et al. 2020).
Our catalog combines galaxies from HyperLeda, the NASA-Sloan Atlas, and the Local Volume Galaxy catalog. We also include additional photometry from the Siena Galaxy Atlas and NED. For galaxies with data from multiple sources, we compared values to the literature and include best possible measurements in the catalog. We extensively cleaned the catalog, including individual investigation into extreme outliers. Through individual investigation of SDSS imaging for NSA non-matches, we discovered that ∼13% of nearby NSA galaxies are contaminants.

Figure 13. Position offsets and their potential effects on active fraction as a function of galaxy mass and type. Left - Comparison between the X-ray active fractions calculated using different separation radii to crossmatch with Chandra observations. The solid line shows active fraction using a matching radius of 1″, dashed lines use a radius of 2″, and dotted lines use a radius of 3″. Red and blue lines show early- and late-type galaxies respectively. Right - Fraction of catalog galaxies with large position offsets between our catalog and NED as a function of mass and type. The solid line shows the fraction with offset > 1″, dashed lines with offset > 2″, and dotted lines with offset > 3″. Large offsets are seen much more commonly in late-type galaxies than in early-type galaxies, especially at lower masses.
We compile extinction-corrected magnitude, luminosity, and color information for 11674 galaxies (76% of our full sample). We use the galaxy colors to estimate masses by creating self-consistent color–mass-to-light ratio relations in four bands. We also provide errors on our mass estimates that include both the scatter around our color-M/L relations and distance errors. We also compare our masses to photometric masses from S4G and dynamical masses from ATLAS3D.
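One way such a mass error could be assembled is sketched below. It is not necessarily the exact recipe used for the catalog: it assumes the two contributions are independent and add in quadrature, and it uses the fact that luminosity scales as distance squared, so a fractional distance error σ_d/d maps to 2σ_d/(d ln 10) dex in log luminosity. The function name and example numbers are illustrative.

```python
import math

def logmass_error(scatter_dex, dist_mpc, dist_err_mpc):
    """Error on log(M*/Msun) in dex, combining two assumed-independent terms.

    scatter_dex  : scatter around the color-M/L relation (dex)
    dist_mpc     : adopted distance (Mpc)
    dist_err_mpc : 1-sigma distance error (Mpc)
    """
    # L is proportional to d^2, so sigma(log L) = 2 * sigma_d / (d * ln 10).
    distance_term = 2.0 * dist_err_mpc / (dist_mpc * math.log(10.0))
    return math.hypot(scatter_dex, distance_term)

# Hypothetical galaxy: 0.1 dex relation scatter, 20 +/- 2 Mpc.
example_error = logmass_error(0.1, 20.0, 2.0)
```

For this example the distance term (about 0.087 dex) is comparable to the assumed relation scatter, so neither contribution can be neglected.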
Galaxy morphologies are available for 13744 galaxies. We combine this information with galaxy masses and colors to fit a line that optimizes the separation of early- and late-type galaxies. Through visual inspection, we find that early-type galaxies bluer than this line are frequently incorrectly classified, and we therefore reclassify those that fall more than ∼0.13 mags below our morphological separation line. Red late-type galaxies are typically truly late-type; many of these are edge-on dusty galaxies.
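The reclassification rule above can be written compactly. In this sketch the slope and intercept of the separation line are placeholders (the fitted values are given in §5 and are not reproduced here), and the fallback to a purely color-based type when no morphology is available is an assumption about how the combined type is built.

```python
def combined_type(morph_type, gi_color, logmass, slope, intercept, blue_margin=0.13):
    """Return 'early' or 'late' from morphology plus position in the color-mass plane.

    morph_type : 'early', 'late', or None if no morphology is available
    gi_color   : (g - i) color in mag, or None if no color is available
    slope, intercept : separation line, color = slope * logmass + intercept
                       (placeholder values; the fitted line is defined in Section 5)
    blue_margin : early-types bluer than the line by more than this are reclassified
    """
    if gi_color is None:
        return morph_type  # nothing to compare against the line
    line_color = slope * logmass + intercept
    if morph_type is None:
        # Assumed fallback: a purely color-based type.
        return "early" if gi_color > line_color else "late"
    if morph_type == "early" and gi_color < line_color - blue_margin:
        return "late"  # blue "early-type": likely misclassified
    return morph_type  # red late-types keep their morphological type

# Hypothetical line: color = 0.1 * logmass, i.e. 0.95 mag at log(M) = 9.5.
types = [combined_type(m, c, 9.5, slope=0.1, intercept=0.0)
         for m, c in [("early", 0.90), ("early", 0.70), ("late", 1.10)]]
```

Note the asymmetry: only blue early-types are moved, while red late-types are left alone, consistent with the finding that the latter are mostly genuine dusty spirals.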
In addition we:
• combine best estimates for flow-corrected redshift-based distances with redshift independent distances, with a focus on minimizing distance error, and including special treatment for galaxies in the Virgo Cluster.
• use galaxies with overlapping color information to calculate and provide empirical transformations of B − V , B − R, and g − r colors into Sloan g − i.
• identify galaxies in the catalog which belong to dense groups with the help of the Lambert et al.
Lastly, we present a preliminary analysis of X-ray sources and find that 249 out of the 1506 galaxies with existing Chandra observations have X-ray detections with log(LX) > 38.3 and available stellar mass estimates. We present evidence that the X-ray active fractions for early- and late-type galaxies differ. In particular, we find the active fraction is higher for early-types within the mass range 9 < log(M⋆/M⊙) < 10.5. We show that larger astrometric uncertainties in late-type galaxies relative to early-types may be partially responsible for this difference in active fractions by galaxy type. In an upcoming paper (Gallo et al., in prep) we use this catalog to conduct the first robust analysis of AGN occupation fraction down to log(M⋆/M⊙) = 8, both overall and separated by morphological type.
We acknowledge the usage of the Siena Galaxy Atlas, based on the Legacy Survey. The Legacy Surveys consist of three individual and complementary projects: the Dark Energy Camera Legacy Survey (DECaLS; Proposal ID #2014B-0404; PIs: David Schlegel and Arjun Dey), the Beijing-Arizona Sky Survey (BASS; NOAO Prop. ID #2015A-0801; PIs: Zhou Xu and Xiaohui Fan), and the Mayall z-band Legacy Survey (MzLS; Prop. ID #2016A-0453; PI: Arjun Dey). DECaLS, BASS and MzLS together include data obtained, respectively, at the Blanco telescope, Cerro Tololo Inter-American Observatory, NSF's NOIRLab; the Bok telescope, Steward Observatory, University of Arizona; and the Mayall telescope, Kitt Peak National Observatory, NOIRLab. Pipeline processing and analyses of the data were supported by NOIRLab and the Lawrence Berkeley National Laboratory (LBNL). The Legacy Surveys project is honored to be permitted to conduct astronomical research on Iolkam Du'ag (Kitt Peak), a mountain with particular significance to the Tohono O'odham Nation.
The Siena Galaxy Atlas was made possible by funding support from the U.S. Department of Energy, Office of Science, Office of High Energy Physics under Award Number DE-SC0020086 and from the National Science Foundation under grant AST-1616414.

APPENDIX
A. FULL CATALOG
Magnitude for all galaxies; gi color - Estimated g − i color for all galaxies with color measurements; mag flag - Flags galaxies with magnitude preference exceptions as described in Section 2.3; cf3 dist - Distances from CosmicFlows3 Calculator; cf3 dist error - Error on cf3 dist; zind dist - Redshift independent distances; zind dist error - Error on zind dist; zind indicator - Redshift independent distance indicator listed in NED-D; bestdist method - General method used for best distances: CF3-Z, Karachentsev, NED-D, Mei, Cantiello, EVCC, HyperLeda; bestdist source - Source/Reference for bestdist, see Section 3; dist ned flag - Flags galaxies with NED-D best distance exceptions as described in Section 3.2; logmass gi - log(M⋆/M⊙) from (g − i); logmass gr - log(M⋆/M⊙) from (g − r); logmass BV - log(M⋆/M⊙) from (B − V); logmass BR - log(M⋆/M⊙) from (B − R); chandra observation - True if CSCView crossmatch returned a limiting sensitivity at matching position as described in Section 6; chandra detection - True if CSCView crossmatch returned flux information within a 1″ matching radius as described in Section 6; log lx - X-ray luminosity log(LX/erg/sec) calculated from the CSC 0.5-7 keV flux, using a 1″ match radius; chandra detection 3arcsec - True if CSCView crossmatch returned flux information within a 3″ matching radius as described in Section 6; log lx 3arcsec - X-ray luminosity log(LX/erg/sec) calculated from the CSC 0.5-7 keV flux, using a 3″ match radius.
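The log lx columns described above are luminosities derived from a CSC 0.5-7 keV flux and a distance. A minimal sketch of that conversion follows, assuming isotropic emission, L_X = 4πd²F, with d taken to be the adopted best distance and no K-correction or absorption correction applied; the function name and example values are illustrative.

```python
import math

CM_PER_MPC = 3.0857e24  # centimeters in one megaparsec

def log_xray_luminosity(flux_cgs, dist_mpc):
    """log10 of L_X in erg/s from a flux in erg/s/cm^2 and a distance in Mpc."""
    dist_cm = dist_mpc * CM_PER_MPC
    return math.log10(4.0 * math.pi * dist_cm ** 2 * flux_cgs)

# Hypothetical source: flux of 1e-14 erg/s/cm^2 seen at 10 Mpc and at 20 Mpc.
log_lx_10mpc = log_xray_luminosity(1e-14, 10.0)
log_lx_20mpc = log_xray_luminosity(1e-14, 20.0)
```

At fixed flux, the source at 10 Mpc falls below the log(LX) > 38.3 cut used in the active fraction analysis while the one at 20 Mpc lies above it, which shows how distance errors propagate directly into the detection sample.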

Figure 1. Difference between HyperLeda galaxies with overlapping total ((B − V)t,0) and effective ((B − V)e,0) colors vs their (B − V)e,0 value. The red line shows a linear correction applied to (B − V)e,0 values to correct for the slight color-dependent offset to allow combination of colors into a single field. Correction coefficients are based on a linear fit from 0.2 ≤ (B − V)e,0 ≤ 1.0.

Figure 2. Examples of SDSS imaging of NSA contaminant objects which are removed from our catalog. (a) Imaging shows a bright, star forming region of a galaxy (NSAID: 177945). (b) A foreground star centered on a distant galaxy which causes incorrect redshift values for the galaxy (NSAID: 401970). (c) An imaging artifact treated as a galaxy by NSA (NSAID: 460798). (d) An artifact from a bright star in the top left surrounds a nearby star in the center, which causes both classification as a galaxy and provides redshift of the star (NSAID: 460798).

Figure 3. The distribution of the B-band luminosities log(LB) for the entire catalog (shaded histogram), along with the subset that have available masses (dark gray line). The colored lines indicate the source of photometry used in estimating the galaxy masses (see §4).

Figure 4. Distribution of our chosen best distances for the entire catalog (shaded histogram). Colored lines indicate the source (bestdist source) used for the final compilation of distances (see §3).
Figure 5. Comparison of our mass estimates (logmass best) against those from S4G (blue; Sheth et al. 2010; Eskew et al. 2012) and ATLAS3D dynamical stellar mass estimates (red; Cappellari et al. 2011). The S4G masses are corrected from a Salpeter to Chabrier IMF by subtracting 0.24 dex for consistent comparison to our catalog. The solid line shows the one-to-one relation, while dashed lines indicate a 0.5 dex offset.

Figure 6. Completeness of our catalog as a function of stellar mass. Top - Mass completeness of our catalog, using 0.2 dex bins. The y-axes show the radius of the effective volume (left) and the completeness fraction (right). The black line compares all catalog galaxies against a < 50 Mpc mock sample created using the Driver et al. (2022) galaxy stellar mass function. Colored lines show comparisons between color type separated catalog galaxies and a color-separated mock sample created using the Baldry et al. (2012) galaxy stellar mass functions. Bottom - Percent of galaxies classified as "Red" by our catalog and the Baldry et al. (2012) sample in each bin.

Figure 7. An example of how we transform the Taylor et al. (2011) (g − i)0-M/Li relation to (g − r)0-M/Lr. The y-axis is determined by taking masses calculated from the NSA data using the (g − i)0-M/Li relation, and then dividing by the NSA Lr values. This is then plotted against NSA (g − r)0 values and fit using a robust outlier-rejection algorithm, RANSAC. Also shown is the somewhat steeper relation from Zibetti et al. (2009). Our translation of the Taylor et al. (2011) relation to other colors ensures consistency between masses calculated in different bands. We note that although this (g − r)0-M/Lr relation is used for galaxies with SGA data, we fit this relation with just NSA photometry to ensure a large sample of galaxies and minimize photometric outliers. A similar process was used to also create the other color-M/L relations given in §4.
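The transformation described in the Figure 7 caption can be sketched as follows. This is a simplified, self-contained RANSAC line fit rather than the implementation used for the catalog: the inlier tolerance, iteration count, and the synthetic relation log(M/Lr) = 0.7 (g − r) − 0.3 are all assumptions chosen only to make the example run.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def ransac_line(xs, ys, tol=0.1, n_iter=200, seed=0):
    """Minimal RANSAC: keep the largest inlier set found, then refit it by least squares."""
    rng = random.Random(seed)
    indices = range(len(xs))
    best_inliers = []
    for _ in range(n_iter):
        i, j = rng.sample(indices, 2)
        if xs[i] == xs[j]:
            continue  # vertical trial line, skip
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        inliers = [k for k in indices if abs(ys[k] - (slope * xs[k] + intercept)) <= tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if len(best_inliers) < 2:
        raise ValueError("no consensus set found")
    return fit_line([xs[k] for k in best_inliers], [ys[k] for k in best_inliers])

# Synthetic data. In the real procedure the y-values would be
# log(M_gi / L_r) = logmass_gi - log(L_r) for each NSA galaxy.
gr_color = [0.2 + 0.03 * k for k in range(20)]
log_ml_r = [0.7 * c - 0.3 for c in gr_color]       # assumed relation, for illustration
gr_color += [0.3, 0.5, 0.6, 0.4]                    # photometric outliers
log_ml_r += [1.5, -1.2, 1.0, -0.9]

slope, intercept = ransac_line(gr_color, log_ml_r)
```

Because the final coefficients come from a least-squares refit of the consensus set only, the four outliers have no influence on the recovered relation.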

Figure 8. Distribution of log(M⋆/M⊙) estimates for the entire catalog (shaded histogram). Colored lines indicate which colors are used to estimate stellar mass, using color-mass-to-light ratio relations and precedence as described in §4.
Figure 9. Morphological classifications of our galaxies in the color-mass plane. Left - a color-mass diagram of 6677 galaxies with g − i data from NSA and morphologies from HyperLeda and LVG. Right - the same color-mass distribution (early-types in red contours, late-types in blue), with the black line showing our linear fit to most cleanly separate early- and late-type galaxies; this fit is extrapolated to larger and smaller masses (gray line). The dotted line and blue shaded regions show where we reclassify blue early-type galaxies as late-type based on visual inspection of these galaxies (see §5).

Figure 10. Mass distribution of galaxies as a function of their morphology. Left - Distribution of log(M⋆/M⊙) for the full catalog (shaded histogram), using best type as described in §5.1 to separate into early- (red) and late-types (blue). Right - Comparison of numerical T-Type vs log(M⋆/M⊙) for all catalog galaxies with an available t type. Color gradient is included for visual representation of galaxy type and coincides with T-Type value.

Figure 12. X-ray active fraction of catalog galaxies as a function of galaxy mass and type, using a 1″ matching radius. Top row - the log(M⋆/M⊙) distribution of catalog galaxies with Chandra data. The solid line shows galaxies with Chandra observations, and the filled histogram shows galaxies with Chandra detections with log(LX) > 38.3 for (a) early-types and (b) late-types. Panel c - X-ray detection fraction as a function of log(M⋆/M⊙) and separated by best type, a combination of morphological and color-based types (see §5). Panel d - X-ray detection fraction as a function of log(M⋆/M⊙) and separated using color type. Shaded areas in bottom figures show Agresti-Coull binomial confidence intervals.
- Right Ascension provided by NSA; dec nsa - Declination provided by NSA; ra ned - Primary Right Ascension provided by NED; dec ned - Primary Declination provided by NED; d25 - Apparent Diameter from Karachentsev and HyperLeda; v cmb - Radial velocity with respect to CMB radiation; v source - Original source for compiled velocity: HyperLeda, LVG, NSA; hl obj - True for objects in HyperLeda; lvg obj - True for objects in Karachentsev's Catalog of Local Volume Galaxies; nsa obj - True for objects in NASA-Sloan Atlas; sga obj - True for objects in Siena Galaxy Atlas; color type - Color-based Type, see Section 5.1; a B leda - B-band extinction from HyperLeda multiplied by 0.86 to translate to Schlafly, AV = 0.769 * a B leda and AR = 0.629 * a B leda; a g nsa - g-band extinction from NASA-Sloan Atlas, the i-band extinction used is 0.550 * a g nsa; EBV irsa - E(B − V) value from IRSA dust website Schlafly values for all galaxies. Used only for the Siena Galaxy Atlas sources, Ag/EBVsga = 3.303, Ar/EBVsga = 2.285; Bt0 leda - Extinction corrected total B band magnitude from HyperLeda; BV color leda - (B − V) color from HyperLeda; B lum - B-band Luminosity, derived for non-HyperLeda sources as described in Section 2.2; gi color nsa - Extinction corrected (g − i) color from NASA-Sloan Atlas; i lum nsa - i-band Luminosity, calculated using Mi,⊙ = 4.53; gr color sga - Extinction corrected (g − r) color from Siena Galaxy Atlas; r lum sga - r-band Luminosity, calculated using Mr,⊙ = 4.65; BR color ned - Extinction corrected (B − R) color from NED; R lum ned - R-band Luminosity from NED, calculated using MR,⊙ = 4.60; BMag - Estimated absolute B-Band

Table 1. NSA objects that don't match galaxies in the HyperLeda and LVG catalogs.

Table 2. Stub of catalog, shortened to include only the most important columns.
Note - objname - Common galaxy name; ra - Right Ascension from combined sources; dec - Declination from combined sources; v h - Heliocentric radial velocity, see Section 3.1; t type - Numerical Hubble T-Type; best type - Combined galaxy type, see Section 5.1; bestdist - Our chosen best distance estimate; bestdist error - Error for best distance estimate; logmass - Compiled best log(M⋆/M⊙) estimate; logmass error - Error on logmass; logmass src - Color used for best mass estimate: g − i, g − r, B − V, B − R.
Columns found in full catalog, available from the publisher and at https://github.com/davidohlson/50MGC: pgc - PGC number (from HyperLeda); nsa id - NASA-Sloan Atlas NSAID number; group id - Galaxy is member of group, number based on Lambert et al. 2020 group catalog; ra nsa