Survey for Distant Stellar Aggregates in Galactic Disk: Detecting Two Thousand Star Clusters and Candidates, along with the Dwarf Galaxy IC10

Although Gaia provides data for more than 10^9 stars, fewer than 10^4 star clusters and candidates have been discovered. Distant star clusters in particular are rarely identified because of heavy extinction and great distance. However, Gaia data have continued to improve, enabling even fainter cluster members to be distinguished from field stars. In this work, we introduce a star cluster search method based on the DBSCAN algorithm, improved to make it better suited for identifying clusters among dimmer and more distant stars. After removing member stars of known Gaia-based clusters, we have identified 2086 objects with |b| < 10 deg, of which 1488 are highly reliable open star clusters, along with 569 candidates, 28 globular cluster candidates, and 1 irregular galaxy, IC 10, at low Galactic latitudes. We found that the proper motion of IC 10 is similar to, yet slightly different from, the water maser observations, which is an important result for the comparison between Gaia and VLBA. Compared with the star clusters appearing in Gaia DR2/EDR3, we have found nearly three times as many new objects beyond a distance of 5 kpc, including hundreds with A_V > 5 mag. The method has also enabled us to detect a larger number of old clusters, over a billion years old, that are difficult to detect because of observational limitations. Our findings significantly expand the remote cluster sample and enhance our understanding of the limits of Gaia DR3 data in stellar aggregate research. The full figure set for 2085 clusters can be seen at \url{https://nadc.china-vo.org/res/r101258/}


INTRODUCTION
Gaia is a billion-star surveyor that can achieve astrometric uncertainties of tens of micro-arcseconds (Gaia Collaboration et al. 2016). This makes it an invaluable tool for studying the Milky Way and the stellar objects in it, especially open clusters (hereafter OCs) (Cantat-Gaudin 2022). OCs provide accurate distance and kinematic information through the astrometric data of cluster members and span a valuable wide age range (from 1 Myr to < 10 Gyr), which makes them ideal tracers for studying the Galactic structure and stellar evolution (e.g. Kuhn et al. 2019; Cantat-Gaudin et al. 2020a; Castro-Ginard et al. 2021), as intended by the Gaia mission (Perryman et al. 2001).
However, most of the Gaia OCs and candidates found in the Galactic disk are within 5 kpc, with only ∼370 having a parallax lower than 0.2 mas based on Gaia DR2/EDR3 (Gaia Collaboration et al. 2021, EDR3). The primary limitation for finding distant OCs in Gaia data has been Galactic extinction, with most known OCs having an extinction below A_V ∼ 3 mag (Cantat-Gaudin et al. 2020b, hereafter CG20). OCs with A_V greater than 5 mag were considered hard to detect, which has led to less focus on searching for them. This absence of distant findings makes it challenging to trace the Galactic structures beyond 5 kpc based on OCs (e.g. Cantat-Gaudin et al. 2018; He et al. 2021b). Even so, although optical observations are limited in the Galactic disk, we believe that the vast astrometric dataset in Gaia DR3 is still not being fully utilized.
Applied to Gaia data, the DBSCAN algorithm (Ester et al. 1996) has proven to be an efficient clustering method that has contributed significantly to new OC searches. The pioneering work of Castro-Ginard et al. (2018) combined it with the kth nearest neighbor distance (hereafter kth NND) algorithm, and this approach identified 1214 new OCs and candidates in Gaia DR2 and EDR3 (Castro-Ginard et al. 2020, 2022). Inspired by Castro-Ginard et al. (2018), our previous research used a two-Gaussian fit to the kth NND histogram to obtain the clustering coefficient and included linear velocity/distance in the clustering vector. This improved method is more efficient in OC searches, as it is insensitive to the variable parallax, and it has identified 2541 new OCs/candidates (He et al. 2021a, 2022b, 2023) and 616 known nearby clusters (He et al. 2022a). However, to study more distant regions, we need to use the angular scale and take narrower parallax cuts to reduce the impact of parallax errors.
For OC objects present in Gaia data, Cantat-Gaudin & Anders (2020) presented an effective way to distinguish between a star cluster and an asterism based on the apparent radius and proper motion dispersions of cluster members. However, we found that some clusters have low dispersion in astrometric data but poor Color-Magnitude Diagrams (hereafter CMDs) and few members. We found that possible asterisms appear in the clustering results because of nearby stellar aggregates or surrounding dense star fields. To avoid such impacts, here we introduce an improved method called "Two-Gaussian Fitting for Isolated Groups" (TGFIG) to obtain a more accurate clustering coefficient and to distinguish possible star clusters from field stars more clearly. In this method, we clip stars in dense fields, such as existing clusters and unclustered dense regions, which reduces the pseudo-cluster signal in DBSCAN clustering.
This work focuses on finding new star clusters in the Galactic disk. In Section 2, we introduce the data and star cluster catalogs we used, and in Section 3, we outline our TGFIG method and search steps, including cluster search, crossmatch, and isochrone fitting. In Section 4, we present the newly found results, including different types of objects and some interesting examples. Finally, we present our conclusions and prospects in Section 5.
DATA

Gaia astrometric data
Compared with Gaia DR2, the astrometric data in Gaia EDR3/DR3 are superior. For example, at G = 17 mag, the average parallax uncertainty in Gaia DR3 is 0.07 mas, compared to 0.1 mas in Gaia DR2. Additionally, the proper motions in Gaia EDR3/DR3 have improved threefold, from 0.2 to 0.07 mas yr^-1. It should be noted that, for distant cluster searches, the improved parallaxes may not significantly help to separate two distant stars. For example, for two stars at 1 kpc and 2 kpc, the parallax difference is 0.5 mas. For stars at 5 kpc and 10 kpc, the difference is only 0.1 mas, which approaches the mean parallax error at G = 17 mag. Since more distant clusters are usually fainter than nearby objects, their main-sequence member stars are mostly fainter than 17 to 18 mag. Therefore, the new parallax determination may not significantly improve distant cluster searches, particularly for old clusters. However, the increased accuracy of proper motions can improve the separation between cluster members and field stars. Thus, some faint/distant clusters that went undiscovered in Gaia DR2 could be found in Gaia EDR3/DR3.
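The parallax arithmetic in this paragraph can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the 0.07 mas uncertainty is the G = 17 mag value quoted in the text.

```python
def parallax_mas(distance_kpc: float) -> float:
    """Parallax in mas for a distance in kpc (parallax[mas] = 1 / d[kpc])."""
    return 1.0 / distance_kpc


def parallax_gap_mas(near_kpc: float, far_kpc: float) -> float:
    """Parallax difference between a nearer and a farther star, in mas."""
    return parallax_mas(near_kpc) - parallax_mas(far_kpc)


# Mean parallax uncertainty at G = 17 mag quoted in the text for Gaia DR3.
SIGMA_PLX_G17_MAS = 0.07

for near, far in [(1.0, 2.0), (5.0, 10.0)]:
    gap = parallax_gap_mas(near, far)
    ratio = gap / SIGMA_PLX_G17_MAS
    print(f"{near:g} kpc vs {far:g} kpc: gap = {gap:.2f} mas "
          f"({ratio:.1f} x the G = 17 mag uncertainty)")
```

The second pair gives a gap of only about 1.4 times the quoted uncertainty, which is why the proper motions carry most of the separating power at large distances.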
For the search for new star clusters, we utilized Gaia EDR3 data, taking advantage of the astrometric data (l, b, ϖ, µ_α*, µ_δ) to identify clusters and the photometric data (G, BP-RP) for isochrone fitting. To identify clusters located in the Galactic disk, we selected sources with |b| < 10 deg. Once we had identified all clusters, we cross-matched member stars with Gaia DR3 data to obtain radial velocity values. The cross-matching radius was 0.1 arcsec.
Unlike most cluster searches, we did not impose an apparent magnitude cut here, so as to include the entire database and to identify fainter and more distant stars. However, researchers can impose their own criteria on each member in the final tables. It should be noted that the flux of some faint red sources may be overestimated, especially in the BP band (Riello et al. 2021). Therefore, when doing the isochrone fits, we imposed a magnitude cut of 19.2 mag, and we carefully checked each result. For input data near l = 0 deg, we accounted for the transition from 360 to 0 deg near the Galactic centre.
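The 360 to 0 deg transition mentioned above can be handled by shifting longitudes into a window centred on the search cell. The sketch below shows one common way to do this; the helper names are hypothetical and it is not necessarily the procedure used in this work.

```python
def wrap_longitude_deg(l_deg: float, center_deg: float) -> float:
    """Shift a longitude into [center - 180, center + 180) so that a cell
    straddling l = 0 deg becomes contiguous."""
    return (l_deg - center_deg + 180.0) % 360.0 - 180.0 + center_deg


def in_cell(l_deg: float, l0_deg: float, half_width_deg: float) -> bool:
    """True if a star at longitude l_deg lies within l0 +/- half_width,
    taking the 360 -> 0 deg wrap into account."""
    return abs(wrap_longitude_deg(l_deg, l0_deg) - l0_deg) <= half_width_deg


# A star at l = 359.5 deg is only 0.5 deg from a cell centred at l = 0 deg.
print(wrap_longitude_deg(359.5, 0.0))
print(in_cell(359.5, 0.0, 1.5))
```

With this shift, distances in l computed inside a cell near the Galactic centre are not inflated by the jump from 360 to 0 deg.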

Cluster catalogs
To build a catalog of new star clusters and to reduce the impact of high-density regions on the clustering parameters (see Section 3.1), we needed to remove all known cluster (and candidate) members, so we relied on previously reported OC and globular cluster (hereafter GC) catalogs, including the works of Cantat-Gaudin et al. However, not all member stars of clusters have been reported, so we used the dispersion (sigma) measures (e.g. CG20, He22) or standard deviations in (l, b, ϖ, µ_α*, µ_δ) to extract possible cluster members, removing stars within the 5-sigma range in (l, b) and the 3-sigma range in (ϖ, µ_α*, µ_δ) of every cataloged object. For catalogs with no dispersion measures or member star information, we used a median dispersion that depends on the parallax of each cluster. Specifically, we selected the CG20 clusters with parallaxes within ± 0.2 mas and used their median dispersion as the clip range. To extract most GC members, we used a fixed dispersion range of (0.1 deg, 0.2 mas, 0.2 mas yr^-1).
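The member-clipping rule described above (5 sigma in position, 3 sigma in parallax and proper motion) can be sketched as follows. The `KnownCluster` record, the dictionary keys, and the function names are hypothetical, and for brevity the sketch ignores the longitude wrap and any cos b factor.

```python
from dataclasses import dataclass


@dataclass
class KnownCluster:
    # Median values: l, b in deg; plx in mas; proper motions in mas/yr.
    l: float
    b: float
    plx: float
    pmra: float
    pmdec: float
    # 1-sigma dispersions in the same units.
    s_l: float
    s_b: float
    s_plx: float
    s_pmra: float
    s_pmdec: float


def is_probable_member(star, cl, n_pos=5.0, n_ast=3.0):
    """True if the star lies within n_pos sigma in (l, b) and within
    n_ast sigma in (parallax, pmra, pmdec) of the cataloged cluster."""
    return (
        abs(star["l"] - cl.l) <= n_pos * cl.s_l
        and abs(star["b"] - cl.b) <= n_pos * cl.s_b
        and abs(star["plx"] - cl.plx) <= n_ast * cl.s_plx
        and abs(star["pmra"] - cl.pmra) <= n_ast * cl.s_pmra
        and abs(star["pmdec"] - cl.pmdec) <= n_ast * cl.s_pmdec
    )


def clip_known_members(stars, catalog):
    """Keep only the stars that match no cataloged cluster."""
    return [s for s in stars
            if not any(is_probable_member(s, c) for c in catalog)]
```

Stars that survive `clip_known_members` form the input for the subsequent clustering, so that known dense groups no longer influence the clustering parameters.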
Although most cluster members were clipped, these cuts may miss clusters' tidal tails, stellar streams, or other extended structures. During the preparation of this work, several new cluster catalogs were published (Hao et al. 2022a; Chi et al. 2023; Qin et al. 2023), and we were unable to clip those cluster members in our search procedures. However, we cross-matched those cluster catalogs after the search. In the days leading up to our submission, Hunt & Reffert (2023, hereafter HR23) reported a catalog of 7200 cluster objects, with 4114 of these classified as reliable open clusters. We also cross-matched our results with HR23 and present an online table with the matched cluster identifications. Additionally, we checked a total list of 129 GC candidates (hereafter GCCs) found in the last two decades that were not cataloged in Vasiliev & Baumgardt (2021), including Willman et al. (2005) Garro et al. (2022). Although none of those clusters was found using Gaia data alone, some member stars of those GCCs could be detected in Gaia data. Nonetheless, only a few of those GCCs have parallax and proper motion values, so we checked them through CMDs and positions after the search. Finally, we also cross-matched the Gaia-based cluster catalog of Dias et al. (2021) and the pre-Gaia clusters in Dias et al. (2002) and Kharchenko et al. (2013), and we present the newly identified pre-Gaia OCs in this work.
The main steps of the TGFIG method were introduced in He et al. (2021a, 2023), but we have made improvements to make it suitable for distant OC searches. The method is based on the clustering algorithm DBSCAN. For an input data vector, such as the Gaia astrometric data (l, b, ϖ, µ_α*, µ_δ), the algorithm calculates the distances between the points in the vector and counts the number of neighbors n_point within a radius ε of each point. Based on two input parameters (ε, MinPts), which are the radius and the minimum number of neighbors, respectively, each point of the vector is assigned to one of three subsets: core member (n_point ≥ MinPts), outer member (n_point < MinPts, but the point is a neighbor of a core member), or noise (n_point < MinPts, and the point is not a neighbor of any core member). Some of the steps below come from our previous studies, and others were changed to allow the search for distant clusters.
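The three-way labeling described above can be illustrated with a small pure-Python sketch. It only assigns the core/outer/noise labels and does not group core members into separate clusters; the function name is hypothetical, and counting a point as its own neighbor is an assumption (it follows a common convention, not necessarily the one used in this work).

```python
import math


def label_points(points, eps, min_pts):
    """Label each point 'core', 'outer' or 'noise' for a given (eps, MinPts).

    Assumption: a point counts as its own neighbour when n_point is computed.
    """
    n = len(points)
    # Indices of all points within eps of point i (brute force, O(n^2)).
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_pts for nb in neighbours]
    labels = []
    for i in range(n):
        if is_core[i]:
            labels.append("core")
        elif any(is_core[j] for j in neighbours[i]):
            labels.append("outer")  # not dense itself, but next to a core member
        else:
            labels.append("noise")
    return labels


points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.25, 0.1), (5.0, 5.0)]
print(label_points(points, eps=0.2, min_pts=4))
```

In practice the points would be the five-dimensional astrometric vectors, usually rescaled so that no single coordinate dominates the distance.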
Inspired by the work of Castro-Ginard et al. (2018), which adopts kth NND histograms to help obtain the ε values, we used a two-Gaussian fitting method to obtain the ε values for different input vectors. In our studies, we considered that when a bound star cluster exists in the search region, the kth NND histogram of the vector can be divided into two approximately Gaussian distributions, as shown in Figure 1. We use two Gaussian curves to fit such a signature, where (a, µ, σ)_i/j are the parameters of the Gaussian functions and D_ik is the distance between data points x_i and x_k. The possible cluster has the lower kth NND values in the histogram, and once the fit has a real solution, it may have caught a cluster signature. Such an estimate depends on physically bound groups, so it is more reliable when ε lies on the right wing of the first Gaussian curve. However, in real OC searches, a huge cluster, stream, large asterism, or dense stellar region in the input dataset may increase the ε value and produce a plausible signal for unbound stellar groups, which may lead to spurious cluster candidates in the result. To reduce this effect, in this work we apply a two-fold clipping and clustering procedure in the searches.
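A rough sketch of the two ingredients described above is given below: the kth NND of every point, and the crossing point of two Gaussian curves between their means, which the Summary identifies as the DBSCAN coefficient ε. The Gaussian parameters are assumed to have been fitted already (the fit itself is not shown), and all function names are hypothetical.

```python
import math


def kth_nnd(points, k):
    """Distance from each point to its k-th nearest neighbour (brute force)."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result


def gaussian_crossing(a1, mu1, s1, a2, mu2, s2):
    """Point between mu1 and mu2 where a1*N(mu1, s1) equals a2*N(mu2, s2).

    Taking logarithms of both curves gives A*x^2 + B*x + C = 0.
    Returns None if there is no real solution between the two means.
    """
    A = 1.0 / (2.0 * s2 ** 2) - 1.0 / (2.0 * s1 ** 2)
    B = mu1 / s1 ** 2 - mu2 / s2 ** 2
    C = mu2 ** 2 / (2.0 * s2 ** 2) - mu1 ** 2 / (2.0 * s1 ** 2) + math.log(a1 / a2)
    if abs(A) < 1e-15:  # equal widths: the equation is linear
        roots = [-C / B] if B != 0.0 else []
    else:
        disc = B ** 2 - 4.0 * A * C
        if disc < 0.0:
            return None
        roots = [(-B + math.sqrt(disc)) / (2.0 * A),
                 (-B - math.sqrt(disc)) / (2.0 * A)]
    lo, hi = min(mu1, mu2), max(mu1, mu2)
    between = [x for x in roots if lo <= x <= hi]
    return between[0] if between else None


# Narrow "cluster" peak at 1 and broad "field" peak at 4 cross at x = 2.
print(gaussian_crossing(1.0, 1.0, 0.5, 1.0, 4.0, 1.0))
```

Returning `None` when no crossing exists mirrors the statement that a cluster signature is only caught when the fit has a real solution.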
Firstly, we clipped all known clusters in the search field and ran the clustering. We recorded each group detected in this first pass as a possible target group. As the star clusters recorded in the literature do not form a complete list, we then re-clustered each target group within a 100 to 500 pc region in (l, b), at least 0.5 deg in size. In this second clustering, we clipped the other groups detected in the first pass. Secondly, considering the field star contamination, especially the contamination of more distant OCs by brighter stars, we clipped all stars lying more than 5 times the dispersion away from the medians, (ϖ_0 ± σ_ϖ, µ_0 ± σ_µ), where (ϖ_0, µ_0) are the median astrometric values and σ_ϖ, σ_µ are the parallax and proper motion dispersions of the target group. After these clips, the target group is the densest set of data points in the input dataset, so it is not affected by any other dense group, and the contamination is reduced to a minimum.
In this study, we divided the Galactic disk into 6480 cells of 3 × 3 deg (l_0 ± 1.5 deg, b_0 ± 1.5 deg) in Galactic coordinates, where (l_0, b_0) represents the center of the cell. Adjacent cells overlap by ± 0.5 deg. Within each cell, we applied a parallax cut of 0.2 mas, with a step of 0.1 mas. We considered the results within (l_0 ± 1.25 deg, b_0 ± 1.25 deg, ϖ_0 ± 0.1 mas) in each cell, with a (0.25 deg, 0.05 mas) offset to ensure that the results stayed within the boundary. We used the 3-sigma range in (l, b, ϖ, µ_α*, µ_δ) to cross-match the results and to remove duplicate results in the boundary regions. We applied the DBSCAN clustering procedure to each field with 21 different MinPts values (6 to 25) and filtered out tiny groups in each step using a group-size cut (N_core ≥ 5 | N_all ≥ 30). We selected the result with the maximum N_core as the final result to ensure unbiased MinPts values.
In contrast to our previous works, we used (l, b, ϖ, µ_α*, µ_δ) as input data, because the vector (d · sin θ_l · cos b, d · sin θ_b, d · µ_α*, d · µ_δ, ϖ) is only suitable for nearby clusters, which limits distant cluster searches. However, linear distance/velocity remains valuable for nearby cluster searches. It is noteworthy that the core members of a result are mostly located in the core part of the vector, whereas the outer members have larger uncertainties but may still be member stars, particularly for faint stars down to ∼20 mag. This is limited by their low astrometric accuracy and will be better evaluated in future Gaia data releases.
Figure 1. Example of two-Gaussian fitting of the kth NND histogram and the cluster result. CWNU 3455 is detected as a reliable cluster with bound member stars (a.); the kth NND histogram fit shows a clear separation between field stars and member stars of the cluster (b.), along with its low-dispersion proper motions (c.), magnitude-parallax distribution (d.), and clear CMD with a reliable isochrone fit (e.). The cluster may also have many blue straggler stars (hereafter BSSs), which could make it more interesting for researchers. Due to the overestimation of the BP band photometry (Riello et al. 2021) at the faint end of the CMD (especially for G > 19 to 19.5 mag), the CMD shifts to the bluer side, resulting in unreliable parameters. Therefore, we excluded members with G > 19.2 mag when running the systematic isochrone fits (Section 3.3).

Visual inspection
To supplement the automatic steps above, we performed visual inspections of all groups. We kept a few groups that were not re-detected in the second DBSCAN run as final results if they had bound stars and reliable CMDs; they were missed either because of the dense search fields or because the field range (100 to 500 pc) was insufficient to distinguish them from field stars. In addition, we cross-matched all resulting clusters with the GC candidate and OC catalogs, visually checked all cross-matches, and extracted some substructures (as described in Piatti et al. 2023).
Moreover, based on our visual inspection of the kth NND histogram Gaussian fits, positions, and CMDs, we removed some (∼200) results with unbound shapes and unreliable CMDs. We kept clusters that showed separate structures in the kth NND histograms and a clear main sequence among the core members, even though some member stars, particularly the outer members, were not on the evolutionary sequence. For some clusters, the spatial distribution (l, b) was not bound or the CMD showed heavy field star contamination, but they were still separated in the kth NND histogram fits, and we kept them as candidate OC objects (see Section 4.2).

Isochrone fitting
For each OC or candidate, we performed isochrone fitting, first fitting the data automatically with our previous methods as described in He et al. (2021a, 2023). We used a fitting function in which c is the extinction coefficient, x_k is the photometric data of the kth member star, x_kN is the nearest point to x_k on the theoretical isochrone, A_0 is the V band extinction (A_V; hereafter we use A_0 instead), and d̄² is the average value over the n input x_k vectors. This function is the same as the one used in Liu & Pang (2019). However, we independently produced the Python fitting codes, which will be accessible to anyone interested in using them. We fit the Gaia EDR3 band (https://www.cosmos.esa.int/web/gaia/edr3-passbands) theoretical isochrone lines (Bressan et al. 2012; http://stev.oapd.inaf.it/cgi-bin/cmd_3.6) to the CMD of each cluster. The isochrone age range was [6.0, 10.1] dex, with a step of 0.05 dex. We should note that, as in He et al. (2023), we neglected the Galactic metallicity gradient (Magrini et al. 2009) and used solar metallicity as a fixed isochrone parameter. This may have only a minor impact on the fitting process (Salaris et al. 2004; Cantat-Gaudin et al. 2020a), and an improved fit that considers metallicity is left for future work. We determined the range of the distance modulus from the median parallax ± 0.1 mas of the member stars, with a step of 0.05. We also used an extinction A_0 step of 0.05 mag, from 0.05 to 15 mag. The extinction coefficient was derived from a polynomial function whose coefficients c_1 to c_10 were adopted from the public auxiliary data provided by ESA/Gaia/DPAC/CU5 and prepared by Carine Babusiaux. We then carefully checked each result visually and manually removed stars that were obviously far from the evolutionary sequence, such as stragglers showing up in the CMD (see Section 4.4). Note that we only removed these stars from the isochrone fits and still kept them in the member star list. In addition, for poor fits with low core membership, we used all member stars in order to obtain valid fits.
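The grid search described above can be sketched as a minimisation of the mean squared distance between each member star and its nearest isochrone point in the CMD. This is a simplified illustration and not the authors' code: the function names are hypothetical, and the two fixed extinction coefficients (`k_g`, `k_colour`) are placeholder values standing in for the polynomial extinction coefficient.

```python
def shift_isochrone(isochrone, dist_mod, a0, k_g=0.8, k_colour=0.4):
    """Move absolute (colour, magnitude) points to the observed plane.

    k_g and k_colour are placeholder constants (A_G / A_0 and
    E(BP-RP) / A_0), NOT the colour-dependent Gaia polynomial.
    """
    return [(colour + k_colour * a0, mag + dist_mod + k_g * a0)
            for colour, mag in isochrone]


def mean_sq_nearest_distance(members, isochrone):
    """Average over members of the squared CMD distance to the nearest
    isochrone point."""
    total = 0.0
    for colour, mag in members:
        total += min((colour - c) ** 2 + (mag - m) ** 2 for c, m in isochrone)
    return total / len(members)


def grid_fit(members, isochrones_by_age, dm_grid, a0_grid):
    """Return (log_age, dist_mod, a0, d2) minimising the mean squared distance."""
    best = None
    for log_age, iso in isochrones_by_age.items():
        for dm in dm_grid:
            for a0 in a0_grid:
                d2 = mean_sq_nearest_distance(members, shift_isochrone(iso, dm, a0))
                if best is None or d2 < best[3]:
                    best = (log_age, dm, a0, d2)
    return best
```

In a real fit the isochrones would be densely sampled tables, and the colour and magnitude axes might be weighted differently; both choices are left open here.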

RESULTS AND DISCUSSION

General
Following the steps outlined above, we compiled a catalog of 2057 OC-like objects, 28 of which were cataloged in Dias et al. (2021). As illustrated in Figure 2, the majority of these objects are situated in low-latitude regions (|b| < 5 deg), while some are located at higher latitudes. We believe the higher-latitude detections originate from the Galactic bulge or the warp structure, and we analyze them further in another study (He 2023, in preparation). Of the 2057 objects, 1902 were detected in both clustering passes of our TGFIG method, while 155 were detected only once. By visually analyzing the results described above, we classified them into 1488 Type 1 clusters (Figure 3 and Figure 4) and 569 Type 2 clusters (Figure 5). Type 1 clusters have more tightly bound distributions in Galactic coordinates or better CMDs than Type 2 clusters, making them the most dependable open clusters detected in this study.
We have fitted the photometry (G, BP-RP) of all OCs and OC candidates with theoretical isochrones, and the statistical parameters are included in Table 1. All parameters in Table 1 are median values of the core members, with the standard deviation giving the dispersion. It is worth noting that not all clusters have more than one member with a radial velocity; for clusters with only one stellar radial velocity value, we used the radial velocity error instead. Furthermore, we provide member star information for all results, including astrometric data, uncertainties, ruwe values, photometry, and radial velocities, all of which come from the Gaia DR3 catalog. In Table 13, we include an if_core value to designate core members (if_core = 1) and outer members (if_core = 0).

Newly detected OCs and candidates
We cross-matched the newly detected OCs and candidates with pre-Gaia OCs using MWSC (Kharchenko et al. 2013) and DIAS02 (Dias et al. 2002). First, we cross-matched them within a radius r, where the r_1 value was used for the MWSC catalog. To eliminate clusters that are superposed on a foreground cluster, we required the matched clusters to agree in distance within (1/3 × distance, 3 × distance) and in radius within (0.3 × r, 3 × r). As a result, we identified a total of 85 pre-Gaia clusters (Figure 3), 78 of which were Type 1 and 7 of which were Type 2. Figure 3 shows examples of the pre-Gaia OCs we identified as reliable OCs (Type 1), in order of increasing age, including some highly extincted (A_0 > 5 mag) and old clusters (logarithmic age > 9). The BP band overestimation affects the main sequence, making it appear unreliable at the faint end for highly extincted clusters. However, these clusters remain bound in the spatial/kinematic vector, and the CMD turn-off point is also visible. Some old clusters are set apart from other clusters by the presence of BSSs; such stars were identified in the new clusters as well (Section 4.4).
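The cross-match criteria above can be written as a small predicate. This is a sketch under assumptions: the dictionary keys and function names are hypothetical, the catalog radius is taken as the matching radius, and the distance and radius ranges are interpreted as ratios of the new object to the cataloged one.

```python
import math


def angular_sep_deg(l1, b1, l2, b2):
    """Great-circle separation in degrees between two Galactic positions."""
    l1, b1, l2, b2 = map(math.radians, (l1, b1, l2, b2))
    cos_sep = (math.sin(b1) * math.sin(b2)
               + math.cos(b1) * math.cos(b2) * math.cos(l1 - l2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_sep))))


def is_match(new, old):
    """new/old: dicts with keys l, b (deg), dist (pc), r (deg).

    A match requires the centres to lie within the catalog radius, the
    distance ratio within (1/3, 3), and the radius ratio within (0.3, 3).
    """
    sep = angular_sep_deg(new["l"], new["b"], old["l"], old["b"])
    return (sep <= old["r"]
            and 1.0 / 3.0 < new["dist"] / old["dist"] < 3.0
            and 0.3 < new["r"] / old["r"] < 3.0)


new = {"l": 100.0, "b": 0.0, "dist": 3000.0, "r": 0.1}
same = {"l": 100.05, "b": 0.0, "dist": 2500.0, "r": 0.1}
foreground = {"l": 100.05, "b": 0.0, "dist": 500.0, "r": 0.1}
print(is_match(new, same), is_match(new, foreground))
```

The distance condition is what rejects a chance alignment with a foreground cluster even when the two centres nearly coincide on the sky.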
The remaining 1972 OCs and OC candidates were labeled as 1410 reliable OCs (Type 1) and 562 candidates (Type 2). Figure 4 and Figure 5 show some Type 1 and Type 2 clusters, respectively, in order of increasing age. Type 1 clusters have more tightly bound structures and a visible CMD (with low contamination), while Type 2 clusters are unbound and/or heavily contaminated and require further research. Although the method may not detect all OCs and member stars in Gaia data, the large number of newly detected reliable OCs (Type 1) and OC candidates (Type 2) shows the high efficiency of the OC search with the TGFIG method. The isochrone fits are also valuable for cluster studies. All CMDs, isochrone fits, and Galactic coordinate distribution diagrams can be viewed online to facilitate their use.

Globular cluster candidates and dwarf galaxy
In addition to the OC samples, we have compiled a list of 28 GC candidates, of which 16 are cataloged in Palma et al. (2019); Minniti et al. (2017b, 2021b); Gran et al. (2022); Garro et al. (2020, 2022), and 12 are newly detected in Gaia data. We designate eleven of them He Zhihong 1 to 11, and the remaining one, FSR 2700, is cataloged as an OC in MWSC. Although the matched GCCs were detected before, we used Gaia data alone, which shows that the TGFIG method is also applicable to detecting other stellar aggregates in the Milky Way. The positions and astrometric statistical parameters of the GCCs can be found in Table 1. Figure 6 presents some examples of the GCCs; some of those objects located in low-latitude regions are highly extincted, making them challenging to detect in Gaia data. However, since the sequence of most of the candidates is not distinct, further research is needed, particularly for the new findings that are based only on Gaia data.
Figure 4. Same as Figure 3, but for the newly detected reliable clusters (Type 1), which follow a distinct evolutionary sequence as depicted by the increasing age of the clusters.
Meanwhile, when analyzing the results, we noticed a dwarf galaxy located at (l = 118.97 deg, b = -3.33 deg) (Figure 7a), with a radius of approximately 0.05 deg (core members) to 0.1 deg (outer members). By cross-matching with the SIMBAD database (http://simbad.cds.unistra.fr/simbad/sim-fid), we found that the object is the dense part of IC 10, a local dwarf galaxy located at a distance of approximately 660 kpc (Huchra et al. 1999; Borissova et al. 2000), which was discovered 135 years ago (Swift 1888). It also displays a unique distribution in the kth NND histogram (Figure 7b). Although the object has a median parallax of -0.02 mas (σ_ϖ = 0.13 mas), which is not useful for distance measurement, the proper motion values of IC 10 are still useful. The median proper motion is µ_α* = 0.02 mas yr^-1 (σ_µα* = 0.15 mas yr^-1), µ_δ = -0.09 mas yr^-1 (σ_µδ = 0.18 mas yr^-1), which corresponds to a tangential velocity of approximately 280 km s^-1, comparable to its radial velocity (RV = -348 km s^-1, McConnachie 2012). Previous VLBA observations of the proper motion of IC 10 showed smaller values (µ_α* = -0.039 ± 0.009 mas yr^-1, µ_δ = 0.031 ± 0.008 mas yr^-1; Brunthaler et al. 2006), but the difference between the two measurements is not significant. In the future, acquiring a more precise measurement from Gaia DR4/DR5 would be of great interest.
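The tangential velocity quoted above follows from the standard conversion v_t = 4.74 · µ · d, with µ in mas yr^-1 and d in kpc. A minimal check, using the median proper motions and the 660 kpc distance from the text (the function name is hypothetical, and no correction for the solar reflex motion is applied):

```python
import math

# km/s per (mas/yr * kpc): 1 au/yr expressed in km/s.
K_KMS = 4.74047


def tangential_velocity_kms(pm_ra_masyr, pm_dec_masyr, distance_kpc):
    """Tangential velocity from the total proper motion and the distance."""
    pm_total = math.hypot(pm_ra_masyr, pm_dec_masyr)
    return K_KMS * pm_total * distance_kpc


v_t = tangential_velocity_kms(0.02, -0.09, 660.0)
print(f"v_t = {v_t:.0f} km/s")
```

Given the proper motion dispersions of 0.15 to 0.18 mas yr^-1 quoted in the text, the uncertainty on this value is large, which is consistent with the statement that the Gaia and VLBA measurements do not differ significantly.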
This single sample indicates that the real parallax error in this work (including the Gaia parallax zero point of -0.017 mas; Lindegren et al. 2021; Fabricius et al. 2021) is about 0.02 mas, which is helpful for researchers wishing to verify the validity of Gaia parallaxes. Additionally, following the high-Galactic-latitude dwarf galaxies found in Cantat-Gaudin et al. (2018), this is the first time that a low-latitude dwarf galaxy has been reported in a blind star cluster search. This may inspire us to use TGFIG to identify local satellites (even in low-Galactic-latitude regions) and to search for star clusters in the LMC/SMC based on Gaia data.
Figure 5. Same as Figure 3, but for the newly detected candidate clusters (Type 2).
Figure 6. CMDs and spatial distribution of the GCCs re-detected in this work (upper panels) and newly identified (lower panels). Gran 4 already exhibits a distinct horizontal branch, confirming its classification as a highly reliable globular cluster. The other clusters in the dataset only show dense spatial distributions at great distances, as well as red clump stars clustered with a possible horizontal branch, although this cannot be conclusively determined because of the enormous distance and extinction. Additional research should be conducted to ascertain the nature of these potential clusters, particularly the newly identified candidates that rely solely on Gaia data.

Comparison with previous OC works
This study focuses on distant OCs and is a follow-up to our previous searches in Gaia DR2 and EDR3 (He et al. 2021a, 2022a,b, 2023). Our previous searches discovered 615 new clusters in Gaia DR2, 886 known and new clusters within 1.2 kpc, and 1665 new candidates between 1.2 and ∼5 kpc. In this work, as shown in Figure 8, we increased the previously discovered OC sample beyond a distance of 5 kpc by up to a factor of three, detecting 1249 objects with ϖ < 0.2 mas, 846 of which are reliable Type 1 OCs. This increase is significant, as it enables us to better study the distant structures of the Milky Way. The new discoveries reach larger extinction levels than before, with an increase of at least 3 mag (Figure 9). Although we used stars as faint as ∼20 mag, fainter than in previous searches (e.g. 18 mag in CG20, Castro-Ginard et al. 2022, and our previous searches), most searches show that Gaia data are limited to extinctions of about 5 mag. However, we found that 190 clusters with higher extinction values can still be detected (Figure 10); some of them have a logarithmic age of ∼8.4 (e.g. FSR 0134 in Figure 3) or larger, which could be useful in studying the Galactic extinction.

Figure 7. Same as Figure 1, but for the dwarf galaxy IC 10. The kth NND histogram (b.) shows a significant difference between the field stars and member stars. Despite its low Galactic latitude (a.) and distance of approximately 660 kpc (Huchra et al. 1999; Borissova et al. 2000), IC 10 remains detectable in Gaia data up to a magnitude of 17 (e.).
In addition, unlike in our previous discovery of young clusters, we found more old clusters, particularly those with logarithmic age larger than 9. We detected over 618 clusters in this category, with 309 of them being Type 1. Some of the old clusters contain visible BSSs (Figure 10), and we listed newly discovered clusters that may have BSSs in the Appendix. These old clusters can be helpful in the study of the evolution of the stars and the Galaxy.

Apparent radius and proper motion dispersion
To compare our new discoveries with previously reported reliable OCs, we applied the method created by CG20, as in our previous studies (He et al. 2021a), to verify whether the new clusters have OC-like apparent radii and proper motion dispersions. Figure 11 shows the total proper motion dispersion (TPD) of the new Type 1 and Type 2 clusters. All CG20 statistics were recomputed from member stars cross-matched in Gaia EDR3, whose dispersions are lower owing to the smaller astrometric errors.
The TPD of both sets is comparable to CG20's results. Type 1 clusters show slightly higher dispersion because more faint member stars are detected, and Type 2 candidates do not differ significantly from Type 1 clusters. The low dispersion in proper motion may be due to the proper motion cut in our TGFIG method (Section 3.1).
The apparent radius (Figure 12) shows that Type 1 clusters are still comparable to CG20 clusters, whereas Type 2 clusters are larger. However, some of the Type 2 clusters have cluster-like CMDs and may be related to moving groups or stellar streams (Kounkel & Covey 2019; Kounkel et al. 2020), which requires further research in future works. We also investigated the radial velocity dispersion (Figure 13), which shows that the Type 1 clusters are more reliable than the Type 2 ones. However, the larger dispersions (especially > 10 km s^-1) found in some Type 1 clusters show that field star contamination exists in the new discoveries, necessitating deeper membership determinations in future works. These issues highlight the challenges associated with OC searches and member star determinations.
Statistics on extinction (left panel) and cluster age (right panel). The newly found clusters have an extinction limit approximately 3 magnitudes greater than those previously detected, suggesting that they are considerably more obscured than the clusters located closer to the Sun. This could explain the higher number of detections of old clusters with a logarithmic age greater than 9 (right panel), since the main-sequence member stars of these billion-year-old clusters are fainter than those of younger OCs, making them more difficult to detect because of observational limitations.
Figure 11. A comparison between the total observed proper motion dispersions obtained from CG20, represented by grey dots (member stars were cross-matched in Gaia EDR3), and our work, indicated by black plus signs for Type 1 clusters and red crosses for Type 2 clusters. The x-axis in the left plot shows the parallax on a logarithmic scale, while the right plot displays the histograms of the three sets.

SUMMARY
We present the results of a survey aimed at identifying the most distant Galactic star clusters in the Gaia era. This study continues our earlier searches in Gaia DR2 (He et al. 2021a, 2022b) and for nearby (He et al. 2022a) and middle-distance (He et al. 2023) clusters. We employed an improved method, TGFIG, to search the sky at |b| < 10 deg, cataloging 2085 objects: 28 re-detected clusters cataloged in Dias et al. (2021), 1462 newly identified reliable OCs, 567 OC candidates, and 28 GC candidates (16 of which were cataloged before). Importantly, we also present a local dwarf galaxy (IC 10) in the study. This may be the first time that the two instruments (Gaia and VLBA) have been used for mutual verification at a distance of hundreds of kpc, confirming the powerful astrometric capabilities of these tools. We conducted isochrone fits for each OC and OCC and provided the membership for each cluster. The cluster members were classified as core members (if_core = 1) or outer members (if_core = 0).
The main idea of the TGFIG method is to clip other dense parts of the search field, ensuring that the result is the densest structure in the data set. We then fit two Gaussians to the kth NND histogram and take the DBSCAN parameter ε as their crossing point. Our TGFIG method proved more efficient than the methods of our previous works, and our new findings show that it is useful in distant cluster searches, particularly for highly extinguished and old clusters. Our study demonstrated that the Gaia data allow the identification of OCs with Av > 5 mag and/or distances beyond 5 kpc, and even up to 10 kpc, which are valuable for studying Galactic structure and evolution.
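The choice of ε as a crossing point can be illustrated with a minimal sketch. It assumes that the two Gaussian components have already been fitted to the kth NND histogram (for example with a least-squares routine), each in the form A exp(-(x - mu)^2 / (2 sigma^2)). The function names `kth_nn_distances` and `gaussian_crossing`, the brute-force neighbour search, and the numbers in the last line are illustrative assumptions, not the implementation used in this work.

```python
import math

def kth_nn_distances(points, k):
    """Distance from each point to its k-th nearest neighbour (brute force, O(n^2))."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result

def gaussian_crossing(a1, mu1, s1, a2, mu2, s2):
    """Return the x between mu1 and mu2 where both components have equal height.

    Each component is a * exp(-(x - mu)**2 / (2 * s**2)). Returns None when no
    crossing lies between the two means.
    """
    # Taking logarithms of both components turns "equal height" into
    # qa*x^2 + qb*x + qc = 0.
    qa = 1.0 / (2 * s2**2) - 1.0 / (2 * s1**2)
    qb = mu1 / s1**2 - mu2 / s2**2
    qc = mu2**2 / (2 * s2**2) - mu1**2 / (2 * s1**2) + math.log(a1 / a2)
    if abs(qa) < 1e-12:
        # Equal widths: the equation is linear (or degenerate).
        if abs(qb) < 1e-12:
            return None
        roots = [-qc / qb]
    else:
        disc = qb**2 - 4 * qa * qc
        if disc < 0:
            return None
        roots = [(-qb + sign * math.sqrt(disc)) / (2 * qa) for sign in (1, -1)]
    lo, hi = sorted((mu1, mu2))
    inside = [x for x in roots if lo <= x <= hi]
    return inside[0] if inside else None

# Hypothetical fitted components: a narrow "cluster" peak and a broader "field" peak.
eps = gaussian_crossing(1.0, 0.0, 1.0, 1.0, 3.0, 2.0)  # equal heights at x = 1.0
```

In such a sketch, the resulting `eps` would then be handed to a DBSCAN implementation (for instance the `eps` argument of scikit-learn's `DBSCAN`), so that stars with kth NND below the crossing point are treated as belonging to the dense component.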
However, the limitations of searching for OCs in Gaia data remain unclear. Our search results show that these limitations depend on factors such as extinction, distance, and cluster age. We found that clusters located 3 to 5 kpc from the Sun are often undetectable when Av ≥ 10 mag and the logarithmic age is around 8.5, indicating that older clusters may be concealed by interstellar extinction and distance. On the other hand, young OCs surrounded by star-forming regions are often embedded clusters, which makes them difficult to identify. Despite this, their brighter magnitudes may make them easier to detect than older OCs, especially in regions near the Sun.
We cross-matched our data with the 4114 reliable OCs in HR23 within a 3-sigma range and identified 232 clusters (matched to 234 reported clusters in HR23); the matched catalog is presented in an online table. HR23 also reported an additional 3086 objects that were not classified as reliable clusters, and we found 362 of these objects near 357 of our identified clusters. We believe that our matched Type 1 clusters are reliable OCs or GCs detected through the TGFIG method, based on the membership determination and CMD/isochrone data (e.g. CNNU 3451 in Figure 1, pre-Gaia OCs in Figure 3, and Gran 4 in Figure 6). However, more studies are needed to verify the physical origin of those reported objects.
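One way to picture a 3-sigma cross-match is sketched below. It assumes, purely for illustration, that two catalog entries match when their mean Galactic coordinates, parallax, and proper motions each agree within three times the quadrature sum of the quoted dispersions. The dictionary keys (`l`, `b`, `plx`, `pmra`, `pmdec` and their `e_` counterparts) and the function name `within_n_sigma` are hypothetical, and the criterion actually applied to HR23 may differ.

```python
import math

def within_n_sigma(a, b, keys=("l", "b", "plx", "pmra", "pmdec"), n_sigma=3.0):
    """True if clusters a and b agree in every parameter within n_sigma.

    a and b are dicts holding a mean value per key and a dispersion under
    "e_" + key (hypothetical layout). The tolerance for each parameter is
    n_sigma times the quadrature sum of the two dispersions.
    """
    for key in keys:
        diff = abs(a[key] - b[key])
        if key == "l":
            # Galactic longitude wraps around at 360 deg.
            diff = min(diff, 360.0 - diff)
        sigma = math.hypot(a["e_" + key], b["e_" + key])
        if diff > n_sigma * sigma:
            return False
    return True
```

Requiring agreement in all five parameters at once is a deliberately strict choice in this sketch; a looser criterion (for example, position only) would produce more matches and more chance coincidences.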
The forthcoming release of Gaia DR4 could greatly improve the accuracy of the astrometric data, and ultimately help to distinguish physically bound clusters from random groupings, as well as to detect a diversity of stellar aggregates, such as OCs, GCs, stellar streams, globular/irregular dwarfs, and other galaxies. Although the TGFIG method shows high efficiency in cluster searches, we believe it is not comprehensive and that additional methods should be developed or improved to increase the likelihood of discovering new clusters. With every new discovery of star clusters, astronomers are better able to study the Milky Way and to achieve the primary goals of the Gaia mission.

Table 3. A catalog of newly detected clusters (Type 1) that exhibit a potential presence of blue straggler stars, with the left portion showing a higher likelihood of containing real stragglers and the right portion a lower probability.