Multiwavelength Catalog of 10,000 4XMM-DR13 Sources with Known Classifications

We present a collection of $\sim10,000$ X-ray sources from the 4th XMM-Newton Serendipitous Source Catalog (4XMM-DR13) with literature-verified classifications and multiwavelength (MW) counterparts. We describe the process by which the MW properties were obtained and an interactive online visualization tool that we developed.


INTRODUCTION
Collections of reliably classified X-ray sources can be used to train supervised machine learning algorithms, which can then quickly classify large numbers of X-ray sources (see, e.g., McGlynn et al. 2004; Yang et al. 2022; Tranin et al. 2022). While bright X-ray sources can often be classified solely from their X-ray properties, the classification of fainter but much more numerous sources can benefit greatly from including the properties of their multiwavelength (MW) counterparts. We created such a dataset from 4XMM-DR13 (Webb et al. 2020).

METHODS
We searched the literature described in Yang et al. (2021, 2022), with the addition of the class-specific catalogs of Jackim et al. (2020) and Luo et al. (2015), for reliably classified sources. These sources were sorted into 9 broad astrophysical classes: active galactic nuclei (AGN), pulsars and isolated neutron stars (NS), non-accreting X-ray binaries (NS BIN), cataclysmic variables (CV), high-mass X-ray binaries (HMXB), low-mass X-ray binaries (LMXB), high-mass stars (HM-STAR), low-mass stars (LM-STAR), and young stellar objects (YSO). We then cross-matched these sources with 4XMM-DR13 within r = 10′′, avoiding some particularly crowded environments (e.g., globular clusters, galaxies, the Galactic center), regions with complex diffuse X-ray emission (e.g., bright pulsar wind nebulae, supernova remnants), and infrared (IR)-bright fields (e.g., central parts of star-forming regions). Sources of populous classes (AGN, HM-STAR, LM-STAR, CV, and YSO) were omitted if the separations of their 4XMM-DR13 counterparts were larger than the combined positional uncertainties (PUs) from the literature and 4XMM-DR13 (at 95% confidence) or > 3′′. For rare-type sources, we manually checked the counterparts by reviewing the publications on individual sources and inspecting the X-ray and MW images. Furthermore, all selected sources were matched to SIMBAD (Wenger et al. 2000), and sources with classifications conflicting with the main SIMBAD class were omitted from the dataset (unless a mistake in the SIMBAD class was identified from the original publications). For 8.5% of our 4XMM-DR13 matches, we found counterparts in the Chandra Source Catalog version 2.1 (Evans et al. 2020), which provides more accurate positions; we adopted the Chandra positions for these sources.
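The separation cut applied to the populous classes can be illustrated with a minimal sketch. The text does not state how the literature and 4XMM-DR13 PUs are combined, so the quadrature sum below is an assumption, and the function and argument names are hypothetical.

```python
import numpy as np

def keep_populous_match(sep_arcsec, pu_lit_95, pu_xmm_95, max_sep=3.0):
    """Return True where a literature/4XMM-DR13 pair passes the separation cut.

    sep_arcsec : angular separation of the pair (arcsec)
    pu_lit_95  : 95% positional uncertainty of the literature position (arcsec)
    pu_xmm_95  : 95% positional uncertainty of the 4XMM-DR13 position (arcsec)
    max_sep    : hard upper limit on the separation (arcsec)
    """
    sep = np.asarray(sep_arcsec, dtype=float)
    # ASSUMPTION: the two uncertainties are combined in quadrature.
    combined = np.hypot(pu_lit_95, pu_xmm_95)
    # A pair is omitted if sep > combined PU OR sep > max_sep, so it is
    # kept only when both conditions are satisfied.
    return (sep <= combined) & (sep <= max_sep)
```

For example, a pair separated by 2′′ with a combined PU of 1′′ would be rejected, as would a pair separated by 3.5′′ even if its combined PU were larger than that.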
As a next step, within r = 30′′ of each selected X-ray source, we combined sources from five all-sky lower-frequency catalogs into a single merged MW catalog. In this process, the sources from (near-)IR catalogs were first matched to Gaia DR3 sources, with the Gaia source coordinates adjusted to each catalog's epoch using the Gaia proper motions. If the separation between a Gaia source and a source from another catalog was smaller than the 5σ PUs of the two catalogs, similar to Marrese et al. (2019), the two sources were considered a match. For (near-)IR sources lacking Gaia DR3 counterparts, we used CatWISE2020 as the reference catalog instead of Gaia DR3. Sources from the 2MASS and AllWISE catalogs with no match to Gaia or CatWISE2020 were matched with each other without any proper motion corrections. We found that the proper motion corrections from CatWISE2020 are not as accurate as those from Gaia, and we did not use them if the ratio of the total proper motion to its uncertainty was < 5 or the reduced chi-squared of the astrometric fit was > 1.5. We did not use the proper motions from AllWISE since they do not account for parallax. When a source was matched to multiple catalogs, we took its position and positional uncertainty from Gaia, followed by CatWISE2020, 2MASS, and AllWISE, in that order of priority. Finally, we used the probabilistic cross-matching algorithm NWAY (Salvato et al. 2018) to match the merged MW catalogs to the positions of the X-ray sources while accounting for their individual positional uncertainties. Approximately 11% of the X-ray sources have more than one MW counterpart matched within the cross-matching radius. These X-ray sources were omitted from further consideration to avoid confusion.
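The epoch adjustment and the 5σ match criterion described above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used for the catalog: the propagation is linear in proper motion (no parallax or radial velocity terms), the PUs of the two catalogs are combined in quadrature (the text does not specify this), and all function names are hypothetical. The Gaia DR3 reference epoch of 2016.0 and the convention that the Gaia proper motion in right ascension includes the cos δ factor are properties of the Gaia catalog.

```python
import numpy as np

MAS_PER_DEG = 3.6e6        # milliarcseconds per degree
GAIA_DR3_EPOCH = 2016.0    # reference epoch of Gaia DR3 positions (Julian year)

def propagate_gaia(ra_deg, dec_deg, pmra_masyr, pmdec_masyr, target_epoch):
    """Shift a Gaia DR3 position to target_epoch with a linear model.

    pmra_masyr is mu_alpha* = mu_alpha * cos(dec), as in the Gaia catalog.
    """
    dt = target_epoch - GAIA_DR3_EPOCH
    dec_new = dec_deg + pmdec_masyr * dt / MAS_PER_DEG
    ra_new = ra_deg + pmra_masyr * dt / (MAS_PER_DEG * np.cos(np.radians(dec_deg)))
    return ra_new % 360.0, dec_new

def separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
    a = (np.sin((dec2 - dec1) / 2.0) ** 2
         + np.cos(dec1) * np.cos(dec2) * np.sin((ra2 - ra1) / 2.0) ** 2)
    return np.degrees(2.0 * np.arcsin(np.sqrt(a))) * 3600.0

def is_match(sep_arcsec, sigma_ref, sigma_other, n_sigma=5.0):
    """True if the separation is below n_sigma times the combined 1-sigma PU.

    ASSUMPTION: the two PUs are combined in quadrature.
    """
    return sep_arcsec < n_sigma * np.hypot(sigma_ref, sigma_other)

def catwise_pm_usable(pm_total, pm_total_err, red_chi2, min_snr=5.0, max_chi2=1.5):
    """CatWISE2020 proper motions are used only if pm/err >= 5 and chi2_red <= 1.5."""
    return (pm_total / pm_total_err >= min_snr) & (red_chi2 <= max_chi2)
```

As a usage example, a star on the celestial equator with a proper motion of 1000 mas/yr in right ascension, propagated from 2016.0 back to an epoch of 2000.0, moves by 16′′; without the correction such a source would fail any sub-arcsecond match criterion.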
Additionally, extended sources with a Gaia BP/RP flux excess factor phot_bp_rp_excess_factor > 20 or with an extended source flag = 5 raised by AllWISE, and AGNs with large extinction values (E(B − V) > 0.05), were removed from the training dataset (TD). We also removed HM-STAR, LM-STAR, and YSO sources lacking MW counterparts, because sources from these classes are expected to have MW counterparts. Finally, we removed MW counterparts matched with isolated NSs by chance coincidence, because virtually all isolated NSs (except the Crab pulsar) are too faint to be detected in the surveys used here.

A screenshot from the interactive visualization tool, available at https://yichaolin-astro.github.io/4XMM-DR13-XCLASS/, shows HR1 versus $F_X/F_o$, with an LM-STAR source selected. LM-STAR and AGN sources form two separate clusters, while other source classes overlap to some degree, so different pairs of source properties are required to distinguish them. Individual sources can also be selected to display their information and MW images.
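The final cleaning cuts listed in the Methods (extended sources, highly extinguished AGN, stellar classes without MW counterparts, and chance MW matches to isolated NSs) can be summarized in a short sketch. The record layout and the keys `cls`, `ext_flag`, `ebv`, `has_mw`, and `is_crab` are hypothetical; only `phot_bp_rp_excess_factor` is an actual Gaia column name. The sketch also assumes that the extended-source cut applies to all classes, which is one reading of the text.

```python
# Classes that are expected to always have an MW counterpart.
REQUIRE_COUNTERPART = {"HM-STAR", "LM-STAR", "YSO"}

def apply_cuts(src):
    """Return a cleaned copy of a source record, or None if it is removed.

    src is a dict with hypothetical keys:
      cls       : class label, e.g. "AGN", "NS", "LM-STAR"
      phot_bp_rp_excess_factor : Gaia BP/RP flux excess factor (or None)
      ext_flag  : AllWISE extended source flag (or None)
      ebv       : extinction E(B-V)
      has_mw    : True if an MW counterpart was matched
      is_crab   : True for the Crab pulsar
    """
    excess = src.get("phot_bp_rp_excess_factor")
    # ASSUMPTION: the extended-source cut is applied regardless of class.
    if excess is not None and excess > 20:
        return None
    if src.get("ext_flag") == 5:
        return None
    # AGN with large extinction are removed.
    if src["cls"] == "AGN" and src.get("ebv", 0.0) > 0.05:
        return None
    # Stellar classes must have an MW counterpart.
    if src["cls"] in REQUIRE_COUNTERPART and not src.get("has_mw", False):
        return None
    out = dict(src)
    # For isolated NSs the source is kept, but a chance MW match is discarded.
    if src["cls"] == "NS" and not src.get("is_crab", False):
        out["has_mw"] = False
    return out
```

Note that the last rule differs from the others: it drops only the counterpart, so the NS itself stays in the dataset with X-ray properties alone.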

ACKNOWLEDGEMENT
This research has made use of data obtained from the 4XMM XMM-Newton serendipitous source catalogue compiled by the XMM-Newton Survey Science Centre consortium. This work was supported by NASA award 80NSSC22K1575.