Edge-on Low-surface-brightness Galaxy Candidates Detected from SDSS Images Using YOLO

Low-surface-brightness galaxies (LSBGs), fainter members of the galaxy population, are thought to be numerous. However, due to their low surface brightness, the search for a wide-area sample of LSBGs is difficult, which in turn limits our ability to fully understand the formation and evolution of galaxies as well as galaxy relationships. Edge-on LSBGs, due to their unique orientation, offer an excellent opportunity to study galaxy structure and galaxy components. In this work, we utilize the You Only Look Once object detection algorithm to construct an edge-on LSBG detection model by training on 281 edge-on LSBGs in Sloan Digital Sky Survey (SDSS) gri-band composite images. This model achieved a recall of 94.64% and a purity of 95.38% on the test set. We searched across 938,046 gri-band images from SDSS Data Release 16 and found 52,293 candidate LSBGs. To enhance the purity of the candidate LSBGs and reduce contamination, we employed the Deep Support Vector Data Description algorithm to identify anomalies within the candidate samples. Ultimately, we compiled a catalog containing 40,759 edge-on LSBG candidates. This sample has similar characteristics to the training data set, mainly composed of blue edge-on LSBG candidates. The catalog is available online at https://github.com/worldoutside/Edge-on_LSBG.


INTRODUCTION
Low-surface-brightness galaxies (LSBGs) are galaxies whose surface brightness is generally fainter than that of the night sky (McGaugh et al. 1995). Conventionally, LSBGs are face-on galaxies with central B-band surface brightness between 22 mag arcsec⁻² and 23 mag arcsec⁻² (Impey et al. 2001; Ceccarelli et al. 2012), or, alternatively, with central R-band surface brightness µ0,R > 24 mag arcsec⁻² (Adami et al. 2006). The LSBGs make up a large proportion of the luminosity density in the local Universe (Impey & Bothun 1997), making their contribution to the Universe indispensable.
Due to their orientation, edge-on LSBGs provide an excellent opportunity to study the vertical structure of galaxies and galactic components (De Grijs 1998; Bizyaev et al. 2017). Several edge-on LSBG samples have been identified and investigated. Matthews & Gao (2001) selected eight nearby edge-on LSBGs and made CO observations of them using the National Radio Astronomy Observatory 12 m telescope. Bizyaev & Kajsin (2004) selected a sample of 11 edge-on galaxies from the faintest surface-brightness galaxies in the Revised Catalog of Flat Galaxies (Karachentsev et al. 1999). Matthews et al. (2005) selected 15 late-type edge-on LSBGs and made CO observations with the IRAM 30 m telescope. Caldwell & Bergvall (2006) selected a sample of 970 edge-on LSBGs from the SDSS DR4 data to study the galactic halo emission. Bergvall et al. (2010) selected a sample of 1510 edge-on LSBGs in the SDSS DR5 database to explain the "red halo" phenomenon. Bizyaev et al. (2014) obtained 5747 edge-on galaxies by parameter cuts and visual inspection of SDSS DR7 (Abazajian et al. 2009). Du et al. (2017) selected a sample of 12 edge-on LSBGs from the catalog of Bizyaev et al. (2014) and made spectral observations of them. He et al. (2020) selected a sample of 281 edge-on LSBG candidates from the catalog obtained by crossmatching SDSS DR7 with 40% of the Arecibo Legacy Fast ALFA survey (ALFALFA; Giovanelli 2007) and analyzed the optical and HI properties of the sample.
In previous studies, edge-on LSBGs have primarily been obtained by applying cuts in central surface brightness and axis ratio. The process of selecting edge-on LSBG samples often involved complicated steps. In addition, the inclusion of skylight contamination in faint galaxies (Du et al. 2015) and the sensitivity of edge-on galaxies to the initial parameters of the profile fitting (He et al. 2020) have further complicated the selection process, requiring more manual intervention. Consequently, it is challenging to conduct a large-scale, automated search for edge-on LSBGs across sky surveys.
Some machine-learning methods have been applied to identify and classify LSBGs more efficiently. Traditional machine-learning methods such as support vector machines (SVMs; Platt 1998) and random forests (Breiman 2001) have been employed for LSBG selection. However, their accuracy in identifying LSBGs is only around 50%, often necessitating a combination with manual inspection (Greco et al. 2018; Tanoglidis et al. 2021b). The low efficiency of these methods is primarily due to their use of galaxy parameters as training data. The faint surface brightness and complex morphologies of LSBGs make their features difficult to extract accurately, leading to significant errors in the identification process. Fortunately, in recent years, deep learning has made significant advances, greatly enhancing the image analysis capabilities of machine-learning models. Neural networks, with their powerful feature extraction capabilities, can learn galaxy features directly from images, thereby improving the identification of LSBGs. For example, by using convolutional neural networks (CNNs; LeCun et al. 1998) to differentiate between LSBG images and artifact images, the model named DeepShadows achieved an accuracy of 92% (Tanoglidis et al. 2021a).
The premise of using cutout galaxy images to identify LSBGs is that a galaxy list has already been obtained. However, some faint galaxies are difficult to recognize accurately, especially irregular or peculiar galaxies. Additionally, the components of a large galaxy may be mistakenly identified as multiple small galaxies, leading to errors in the subsequent identification of LSBGs. Deep-learning-based object detection provides a potential approach for automatically identifying LSBGs. This technique enables the direct recognition and localization of multiple objects in a large image. For example, Yi et al. (2022) developed an automated detection model that mainly identifies face-on LSBGs in SDSS images and achieved a detection accuracy of 92%.
In this study, we aim to detect wide-area edge-on LSBGs from SDSS DR16 (Ahumada et al. 2020). Initially, we selected edge-on LSBG candidates using photometric parameters (expAB_g ≤ 0.3 or expAB_r ≤ 0.3, µ0,B ≥ 22 mag arcsec⁻²) and obtained 875,993 candidates. However, by examining a subsample of 500 sources, we found that more than half of the samples do not exhibit the morphology of edge-on LSBGs, but rather dense stellar streams, star wings of bright stars, galaxies of nonelongated shape, or irregular morphology. Obtaining true LSBGs requires time-consuming manual inspection of candidates. In this study, we automate the detection of edge-on LSBGs using both object detection and anomaly detection techniques. The constructed object detection model was utilized to automatically identify edge-on LSBGs from SDSS field images, providing both the classification and location of these galaxies.
The layout of this paper is as follows. Section 2 introduces the training and test samples used for building our object detection model. The development of the detection model is described in Section 3. In Section 4, we present detection results in SDSS DR16 and the process of purifying the candidates. In Section 5, we introduce the properties of the candidate LSBGs. We summarize and conclude our study in Section 6.

DATA PREPARATION
In this study, we used an edge-on LSBG sample set from He et al. (2020) to build an object detection model. This sample set contains 281 edge-on LSBGs that were selected from the crossmatched sample of the 40% ALFALFA (Giovanelli 2007) catalog and SDSS DR7 (Abazajian et al. 2009), with axis ratio b/a ≤ 0.3 in the g band or r band and corrected B-band central surface brightness µ0,B ≥ 22.5 mag arcsec⁻². Most of the samples are "blue" galaxies, while a few are "red" galaxies according to the color (g − r; Bernardi et al. 2010). In the selection of these edge-on LSBGs, the disparity in surface brightness resulting from the different inclinations of face-on and edge-on galaxies has been corrected, according to He et al. (2020). Figure 1 shows SDSS images of six edge-on LSBGs in this sample set.
We obtained 281 gri-band composite images of 2048 × 1489 pixels from SDSS DR16, each image containing one edge-on LSBG sample. We labeled these samples using LabelImg to obtain the appropriate size of the bounding box, which is the rectangular box that contains an object. The label parameters are (x, y, w, h), where (x, y) is the central coordinate of the target and w and h are the width and height of the bounding box, respectively. Furthermore, we divided the 281 photometric images with a ratio of 8:2, using 225 images as the training set and 56 images as the test set. Figure 2 shows one edge-on LSBG in an SDSS field image.
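The labeling and splitting step can be sketched as follows. This is only a minimal illustration, assuming the boxes are stored in the normalized center format that YOLO-style tools commonly use; the function names, file names, and random seed are hypothetical and are not taken from the original pipeline.

```python
import random

IMG_W, IMG_H = 2048, 1489  # SDSS field image size in pixels (from the text)


def to_center_format(x_min, y_min, x_max, y_max, img_w=IMG_W, img_h=IMG_H):
    """Convert a corner-style pixel box to normalized (x, y, w, h)."""
    x = (x_min + x_max) / 2.0 / img_w
    y = (y_min + y_max) / 2.0 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return x, y, w, h


def split_train_test(items, train_fraction=0.8, seed=0):
    """Shuffle a list of image names and split it into training and test sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical file names standing in for the 281 labeled field images.
images = [f"field_{i:03d}.jpg" for i in range(281)]
train, test = split_train_test(images)
print(len(train), len(test))
```

With 281 images and an 8:2 ratio, rounding 224.8 gives the 225 training and 56 test images quoted above.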

YOLO
In this work, we built a model to detect edge-on LSBGs from SDSS field images utilizing the YOLOv5 algorithm, which is a member of the You Only Look Once (YOLO) family (Redmon et al. 2016; Redmon & Farhadi 2017, 2018; Bochkovskiy et al. 2020). As a typical representative of one-stage algorithms, the YOLO algorithm directly extracts feature values and offers advantages in detection speed, accuracy, and learning capability. The YOLOv5 algorithm combines the ideas of the previous YOLO algorithms and introduces improved data augmentation, which mitigates the problem of missed small targets. In addition, YOLOv5 has a highly desirable speed of training and detection. In this work, we chose the medium-sized network architecture YOLOv5m provided by the Ultralytics YOLOv5 library to build an object detection model for detecting edge-on LSBGs.
The base model we used is the "Medium" variant of the YOLOv5 models, named YOLOv5m. The network structure of YOLOv5m consists of four parts: the input module, which is used to acquire input images and perform data augmentation; the backbone module, which extracts high-, medium-, and low-level features from images; the neck module, which fuses feature information from all levels of the image and extracts large, medium, and small feature maps; and the head module, which applies anchor boxes to the generated feature maps for final detection. The loss function consists of the localization loss L_loc, classification loss L_cls, and confidence loss L_conf. We used complete intersection over union (CIoU) as the localization loss and binary cross-entropy loss as the classification loss. The total loss is a linear sum of the three losses, as shown in the following equation:
$$L = \lambda_1 L_{\mathrm{loc}} + \lambda_2 L_{\mathrm{cls}} + \lambda_3 L_{\mathrm{conf}},$$
where λ1, λ2, and λ3 are weighting factors assigned to the localization loss, classification loss, and confidence loss.
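To make the loss terms concrete, the sketch below computes the CIoU of two boxes in its commonly published form and combines three loss values as a weighted linear sum. It is an illustration under stated assumptions rather than the YOLOv5 implementation: the function names are hypothetical, the weights default to 1.0 as placeholders, and the localization loss is taken to be 1 − CIoU.

```python
import math


def ciou(box_a, box_b, eps=1e-9):
    """Complete IoU of two boxes given as (x_center, y_center, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Corner coordinates of both boxes.
    a_x1, a_x2, a_y1, a_y2 = ax - aw / 2, ax + aw / 2, ay - ah / 2, ay + ah / 2
    b_x1, b_x2, b_y1, b_y2 = bx - bw / 2, bx + bw / 2, by - bh / 2, by + bh / 2
    # Plain intersection over union.
    inter_w = max(0.0, min(a_x2, b_x2) - max(a_x1, b_x1))
    inter_h = max(0.0, min(a_y2, b_y2) - max(a_y1, b_y1))
    inter = inter_w * inter_h
    iou = inter / (aw * ah + bw * bh - inter + eps)
    # Squared center distance over the squared diagonal of the enclosing box.
    enc_w = max(a_x2, b_x2) - min(a_x1, b_x1)
    enc_h = max(a_y2, b_y2) - min(a_y1, b_y1)
    rho2 = (ax - bx) ** 2 + (ay - by) ** 2
    c2 = enc_w ** 2 + enc_h ** 2 + eps
    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(bw / bh) - math.atan(aw / ah)) ** 2
    alpha = v / (1 - iou + v + eps)
    return iou - rho2 / c2 - alpha * v


def total_loss(l_loc, l_cls, l_conf, lambdas=(1.0, 1.0, 1.0)):
    """Weighted linear sum of the three loss terms (weights are placeholders)."""
    l1, l2, l3 = lambdas
    return l1 * l_loc + l2 * l_cls + l3 * l_conf


predicted = (100.0, 50.0, 80.0, 20.0)
labeled = (104.0, 52.0, 90.0, 18.0)
l_loc = 1.0 - ciou(predicted, labeled)
print(total_loss(l_loc, l_cls=0.10, l_conf=0.30))
```

The extra distance and aspect-ratio penalties are what make CIoU useful for elongated targets such as edge-on galaxies, where two boxes can overlap substantially while still differing in center position or shape.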

Model Training
In the training process of the model, the input image was resized to 640 × 640 pixels. The batch size was set to 12. To optimize the model parameters, the Adam optimizer (Kingma & Ba 2014) was adopted. Moreover, the image translation, image scale, and image mosaic parameters were set to 0.1, 0.5, and 1.0, respectively. The other hyperparameters, such as the initial learning rate and the IoU training threshold, were kept at their default values.
After 150 epochs of training, the localization loss and confidence loss of the model converged smoothly. We thus obtained an object detection model that can be used to search for edge-on LSBGs in SDSS DR16 composite images.

Performance Evaluation
We used recall and purity to evaluate the performance of our detection model on the test data set. The two measures are calculated as follows:
$$\mathrm{recall} = \frac{\mathrm{TP}}{N}, \qquad \mathrm{purity} = \frac{\mathrm{TP} + \mathrm{CP}}{\mathrm{TP} + \mathrm{FP}},$$
where TP (true positives) is the number of detected samples among the labeled LSBGs, N is the total number of labeled samples, FP (false positives) is the number of newly detected objects, and CP (check positives) is the number of candidate LSBGs confirmed by checking their shape and central surface brightness. SDSS parameters were used to calculate the B-band central surface brightness according to He et al. (2020); a galaxy with B-band central surface brightness greater than 22 mag arcsec⁻² is considered a candidate LSBG.

Model Testing
Our model outputs the center position of the detected object, the width and height of the bounding box, and the confidence. The confidence is the probability that the bounding box contains an edge-on LSBG. In our experiments, the default confidence threshold of 0.25 was used in model testing; that is, objects with a confidence lower than 0.25 were not retained.
The built model detected edge-on LSBGs from SDSS images in the test set. From the 56 SDSS field images of the test set, we detected 90 sources, which include all 56 previously labeled edge-on LSBGs and 34 newly detected sources. We performed a visual inspection of the newly detected sources to see whether they were galaxies. Furthermore, their B-band central surface brightness was calculated using the SDSS parameters, and the cut µ0,B ≥ 22 mag arcsec⁻² was applied. Finally, 25 of the 34 newly detected sources are considered to be candidate LSBGs.
Figure 3 shows the confidence distribution of the 56 correctly detected samples from the test set. Among them, only three had significantly lower confidence (0.48, 0.48, and 0.44) because of their abnormal shapes, while the remaining 53 samples had confidence values greater than 0.65.
We attempted to enhance the purity of the detected samples by increasing the confidence threshold. We set the confidence threshold to 0.45 and 0.65, resulting in purities of 93.33% and 95.38% and recall rates of 98.21% and 94.64%, respectively. The detailed test results of the model at three different confidence thresholds are shown in Table 1. The test results suggest that as the confidence threshold increases, the purity of the detected samples increases, but the recall rate decreases. Trading off recall and purity, we chose a confidence threshold of 0.65 for the subsequent edge-on LSBG detection.
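The effect of the confidence threshold on recall can be reproduced from the numbers quoted above. The sketch below is illustrative only: it assumes recall = TP/N and purity = (TP + CP)/(TP + FP), and because the text gives the confidences of only the three low-scoring galaxies, the value 0.90 for the other 53 is a placeholder that merely lies above 0.65.

```python
def recall(tp, n_labeled):
    """Fraction of labeled galaxies that the model recovers."""
    return tp / n_labeled


def purity(tp, fp, cp):
    """Fraction of all detections (TP + FP) that are labeled or confirmed (TP + CP)."""
    return (tp + cp) / (tp + fp)


# Three low confidences are quoted in the text; 0.90 is a placeholder for the
# remaining 53 labeled galaxies, which are only known to exceed 0.65.
labeled_conf = [0.48, 0.48, 0.44] + [0.90] * 53

for threshold in (0.25, 0.45, 0.65):
    tp = sum(c >= threshold for c in labeled_conf)
    print(f"threshold {threshold:.2f}: TP = {tp}, recall = {recall(tp, 56):.2%}")

# Counts quoted in the text for the default threshold of 0.25.
print(f"purity at 0.25: {purity(tp=56, fp=34, cp=25):.2%}")
```

Raising the threshold from 0.25 to 0.45 and 0.65 drops one and then three labeled galaxies, which gives the recall rates of 98.21% and 94.64% reported above.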

SEARCHING FOR EDGE-ON LSBGS FROM SDSS DR16
Here we describe the pipeline used to search for edge-on LSBGs from SDSS DR16. First, we use the established object detection model to search for edge-on LSBGs across all composite images from SDSS DR16. In addition, we trained an anomaly detection model to help remove contamination among the candidates, thereby improving the automation of the sample purification process.

Data Preparation
We obtained the gri-band composite images from the SDSS DR16 Science Archive Server (SAS), a total of 938,046 images, each with a size of 2048 × 1489 pixels. These images have been preprocessed with flat-field, bias, and bad-pixel corrections and sky subtraction.

Results
The detection process was executed on a platform equipped with an NVIDIA GTX 1660 GPU. Each image's detection took approximately 0.023 s, resulting in a cumulative processing time of 18.91 hr for all 938,046 images. Within this data set, our model identified 52,293 candidate LSBGs. The model predicted their central coordinates, bounding box dimensions (width and height), and assigned a confidence level. As an illustration, Figure 4 displays an SDSS image where our model identified two candidate LSBGs.

Purify the Samples Using Anomaly Detection
Through visual inspection of a random subsample of the detected candidate LSBGs, we found a small number of contaminants that resemble edge-on LSBGs, such as star wings and thin artifacts. Therefore, to further improve the purity of the sample of candidate LSBGs, we next tried to identify the anomalous candidates.
Anomaly detection (Chandola et al. 2009) is a data analysis technique used for identifying anomalous samples in a data set. As one of the anomaly detection algorithms based on deep learning, Deep Support Vector Data Description (Deep-SVDD; Ruff et al. 2018) can be used to process complex data such as images and directly identify a small percentage of anomalous image samples. Deep-SVDD uses deep neural networks to model complex data distributions and can learn a compact representation of the normal class data for solving one-class classification problems. One-Class Deep-SVDD finds a hypersphere of minimum volume with center c and contracts the sphere by minimizing the mean distance of all data representations to the center. To achieve this goal, the neural network must extract the common factors of variation. For some input space X ⊆ R^d and output space F ⊆ R^p, ϕ(·; W) : X → F is a neural network with L ∈ N hidden layers and set of weights W = {W^1, ..., W^L}, where W^l are the weights of layer l ∈ {1, ..., L}. Given some training data D_n = {x_1, ..., x_n} on X, the One-Class Deep-SVDD objective is defined as follows:
$$\min_{W} \; \frac{1}{n} \sum_{i=1}^{n} \lVert \phi(x_i; W) - c \rVert^2 + \frac{\lambda}{2} \sum_{l=1}^{L} \lVert W^l \rVert_F^2 .$$
The first term of the loss function employs a quadratic loss for penalizing the distance of every network representation ϕ(x_i; W) to the center c ∈ F. The second term is a network weight decay regularizer with hyperparameter λ > 0, where ∥·∥_F denotes the Frobenius norm.
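The two terms of the One-Class Deep-SVDD objective can be written out in a few lines. The sketch below is a toy illustration in plain Python, assuming the objective of Ruff et al. (2018): the embeddings ϕ(x_i; W) are passed in as precomputed lists instead of being produced by a network, and the function name and example values are hypothetical.

```python
def one_class_svdd_loss(embeddings, center, weight_matrices, lam):
    """Mean squared distance to the center plus (lam / 2) times the sum of
    squared Frobenius norms of the layer weight matrices."""
    n = len(embeddings)
    # First term: quadratic penalty on the distance of each embedding to c.
    distance_term = sum(
        sum((z_k - c_k) ** 2 for z_k, c_k in zip(z, center)) for z in embeddings
    ) / n
    # Second term: weight decay over all layers.
    decay_term = 0.5 * lam * sum(
        sum(w ** 2 for row in matrix for w in row) for matrix in weight_matrices
    )
    return distance_term + decay_term


# Toy values: two 2D embeddings, one 2 x 2 weight matrix.
embeddings = [[1.0, 0.0], [0.0, 1.0]]
center = [0.0, 0.0]
weights = [[[1.0, 1.0], [1.0, 1.0]]]
print(one_class_svdd_loss(embeddings, center, weights, lam=0.5))
```

In practice the embeddings depend on the weights, so the loss is minimized by backpropagating through the network; the function above only evaluates the objective for fixed values.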
During testing, the input data points are mapped to the latent space using the trained neural network. The distance of each data point from the center of the hypersphere, measured as the squared Euclidean norm following Ruff et al. (2018), gives the anomaly score, and if the anomaly score is greater than a threshold, the data point is considered an anomaly. The equation for calculating the anomaly score is as follows:
$$s(x) = \lVert \phi(x; W^{*}) - c \rVert^2,$$
where x represents the test data, W* represents the network parameters of a trained one-class classification model, ϕ(x; W*) represents the network representation, and c represents the center of the hypersphere.
We used the previously mentioned data set of 281 edge-on LSBGs as training samples to train the anomaly detection model. These samples were obtained in the form of composite images with dimensions of 250 × 250 pixels from the SDSS SkyServer. As part of the preprocessing step, we conducted image scaling and applied random flipping. For building the anomaly detection model, we first utilized a deep convolutional autoencoder (DCAE; Masci et al. 2011) to initialize the network weights and obtain the hypersphere center c, which is set to the mean of the mapped data after pretraining. The DCAE consists of an encoder and a decoder. The encoder compresses the input data to learn informative features, while the decoder decompresses the learned representations to reconstruct the original input. Here, the DCAE encoder has the same architecture as the Deep-SVDD network. A LeNet-type CNN is used in the Deep-SVDD network, where each convolutional module consists of a convolutional layer followed by leaky ReLU activations and 2 × 2 max pooling. Three CNN modules are used, with 32 × (5 × 5 × 3) filters, 64 × (5 × 5 × 3) filters, and 128 × (5 × 5 × 3) filters, followed by a dense layer of 128 units (see Ruff et al. 2018 for details). The pretraining used the normal samples of edge-on LSBGs. After pretraining the DCAE, we obtained the hypersphere center c. Then we trained the LeNet-type CNN using the one-class classification loss, with initial weights from the trained DCAE, to obtain the final model. We trained both the DCAE and the LeNet-type CNN for 500 epochs. After completing the training process, we obtained a model capable of assigning anomaly scores to candidate images. Finally, the anomaly detection model was applied to the 52,293 candidate LSBGs to obtain their anomaly scores. The scores were distributed in the range 0.01-19.21. Most of the candidates had small anomaly scores close to 0.01, while a few had relatively large scores.
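The scoring step described above reduces to two small operations: fixing the center c as the mean of the embedded training samples, and measuring how far a new embedding lies from it. The following sketch is a toy illustration with hypothetical function names and 2D values; it assumes the score is the squared Euclidean distance, as in Ruff et al. (2018), and it takes the embeddings as given instead of computing them with the CNN.

```python
def hypersphere_center(embeddings):
    """Mean of the embedded training samples, used as the fixed center c."""
    n = len(embeddings)
    return [sum(column) / n for column in zip(*embeddings)]


def anomaly_score(embedding, center):
    """Squared Euclidean distance between one embedding and the center."""
    return sum((z_k - c_k) ** 2 for z_k, c_k in zip(embedding, center))


# Toy 2D embeddings standing in for the mapped training images.
train_embeddings = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
c = hypersphere_center(train_embeddings)

# One embedding at the center and one far from it.
scores = [anomaly_score(z, c) for z in ([1.0, 1.0], [4.0, 5.0])]
print(c, scores)
```

A candidate whose embedding lands near c receives a score close to zero, which matches the observation that most candidates have scores near the lower end of the range.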
To establish an appropriate threshold for anomaly scores, we employed the box-plot method (Tukey et al. 1977), which visually illustrates the distribution, dispersion, and skewness characteristics of numerical data using quartiles (DuToit et al. 2012). We set the upper limit in the box plot to Q3 + 3 IQR, where Q1 is the lower quartile, Q3 is the upper quartile, IQR = Q3 − Q1 is the interquartile range, and the factor 3 is used to identify extreme outliers. For this case, Q1 and Q3 were calculated as 0.042594 and 0.154938, respectively. Consequently, we established an upper-limit value of 0.49. Furthermore, we computed the anomaly scores for all 281 training samples, and all of them fell below the upper-limit line at y = 0.49. Figure 5 displays the distribution of the anomaly scores for the candidate LSBGs. There are 3403 candidates located above the upper-limit line (y = 0.49), accounting for about 6.51% of the total candidate LSBGs. A visual inspection of these candidates revealed that most exhibited unusual characteristics, including artifacts, galaxies with larger axis ratios, merging galaxies, star wings, and star lines. They were incorrectly identified, either because of their similar shape to edge-on LSBGs or because of the interference from nearby bright stars. Figure 6 shows six images of these anomalous candidates detected by the anomaly detection model. After a visual inspection, we removed 1981 anomalous sources that did not match the profile of edge-on LSBGs. The remaining 1422 candidates exhibited the characteristic shape of edge-on galaxies, and thus they were retained. Their elevated anomaly scores primarily stemmed from the presence of other celestial objects in proximity, such as bright stars and galaxies. Consequently, 50,312 candidate LSBGs remained.
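The threshold arithmetic can be checked directly from the quoted quartiles. The snippet below is a small worked example; the function names are hypothetical, and the quartile values and candidate counts are the ones given in the text.

```python
def upper_limit(q1, q3, factor=3.0):
    """Box-plot upper limit Q3 + factor * IQR, with IQR = Q3 - Q1."""
    return q3 + factor * (q3 - q1)


def flag_anomalies(scores, limit):
    """Indices of scores that lie above the upper limit."""
    return [i for i, s in enumerate(scores) if s > limit]


limit = upper_limit(0.042594, 0.154938)
print(f"upper limit: {limit:.2f}")

# Fraction of the 52,293 candidates that lie above the limit (3403 sources).
flagged_fraction = 3403 / 52293
print(f"flagged fraction: {flagged_fraction:.2%}")

# Toy scores to show the flagging step.
print(flag_anomalies([0.01, 0.20, 0.75, 19.21], limit))
```

Using a factor of 3 instead of the more common 1.5 flags only extreme outliers, which keeps the number of candidates sent to visual inspection small.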
Following the manual inspection of candidates with outlier scores exceeding 0.49, we conducted random sampling checks on the remaining 48,890 samples with scores below 0.49. These checks were performed through five sets of sampling, each comprising 200 samples, and were designed to detect potential contaminants. Here we considered only contamination that is straightforward to identify, such as artifacts and star wings, and did not take into account more intricate parameters such as surface brightness or axis ratio, which can be challenging for the human eye to discern. Across the five sets, we observed contamination rates ranging from 2.5% to 0.5%, which decreased as the outlier scores decreased. The average contamination rate across the five sets stood at approximately 1.6%. While our automated process did not entirely eliminate erroneous sources, this 1.6% contamination rate signifies a significant improvement in purity compared to the results achieved by previous machine-learning methods. For reference, LSBG candidates identified using an SVM classifier exhibited a contamination rate of approximately 47% (Tanoglidis et al. 2021a).

DISCUSSION
To show the properties of the detected candidate LSBGs, we crossmatched the detected samples with the galaxy catalog of SDSS DR16 to obtain the parameters provided by SDSS. With a search radius of 12″, we obtained a total of 49,972 matched galaxies. For the remaining 340 unmatched candidates, we could not find a corresponding source in the SDSS DR16 galaxy catalog. Upon inspecting their images, we determined that 201 of them were edge-on galaxies, while the remaining 139 were artifacts and star lines. After excluding these 139 samples, our data set comprised 50,173 edge-on LSBG candidates. Figure 7 shows nine of the edge-on LSBG candidates that we have identified. Among these candidates, the six in the top two rows are listed in the SDSS catalog, whereas the three candidates in the last row were newly identified by our model.
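A positional crossmatch of this kind can be sketched as a nearest-neighbor search within a fixed angular radius. The example below is a brute-force illustration with hypothetical function names and toy coordinates; it is not the tool used for the catalog. For a full-sky catalog, an indexed matcher such as astropy's `SkyCoord.match_to_catalog_sky` would be the more practical choice.

```python
import math

ARCSEC_PER_RAD = 180.0 / math.pi * 3600.0


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula) of two positions in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return 2 * math.asin(math.sqrt(h)) * ARCSEC_PER_RAD


def crossmatch(detections, catalog, radius_arcsec=12.0):
    """For each detection, return the index of the nearest catalog source
    within the search radius, or None if there is no such source."""
    matches = []
    for ra, dec in detections:
        best_index, best_sep = None, radius_arcsec
        for i, (cat_ra, cat_dec) in enumerate(catalog):
            sep = angular_separation_arcsec(ra, dec, cat_ra, cat_dec)
            if sep <= best_sep:
                best_index, best_sep = i, sep
        matches.append(best_index)
    return matches


# Toy positions in degrees: the first detection lies 10 arcsec from catalog
# source 0, the second lies far from both catalog sources.
catalog = [(150.0, 2.0), (150.1, 2.0)]
detections = [(150.0, 2.0 + 10.0 / 3600.0), (150.05, 2.0)]
print(crossmatch(detections, catalog))
```

Detections that return None correspond to the unmatched candidates, which in the text were then inspected visually.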

Properties of candidate LSBGs
For the 49,972 matched edge-on LSBG candidates, we obtained their photometric parameters from SDSS SkyServer. To get the B-band central surface brightness µ0,B, we removed 3136 candidates with invalid parameters (expRad_g or expRad_r less than 1″, redshift z = −9999), leaving 46,836 candidate LSBGs. Then we obtained their µ0,B following the calculation method of He et al. (2020), in which µ0,B is corrected for the inclination effect of edge-on LSBGs. Figure 8 shows the distribution of number density of galaxies against the B-band central surface brightness µ0,B for the detected sample (in blue). For comparison, the density distribution of µ0,B of the training samples (in red) is also presented in Figure 8.
As can be seen from Figure 8, the µ0,B of the detected candidate LSBGs is mainly distributed in the range of 21-26 mag arcsec⁻², in agreement with that of the training samples. The peak value of the µ0,B of the detected samples is 23.76 mag arcsec⁻², about 0.11 mag arcsec⁻² lower than the peak value of the central surface-brightness distribution of the training samples. One possible reason for the slight deviation in surface brightness is that surface brightness is inherently a challenging feature to learn, as it involves many related factors. Another influencing factor is our limited number of training samples.
Additionally, we present the axis ratio distributions of the training samples and detected samples in the g band and r band, determined from the SDSS photometric parameters expAB_g and expAB_r, as depicted in Figure 9. In the plane of axis ratio versus B-band central surface brightness (Figure 10), the identified candidates and the training samples show the same distribution: as the axis ratio decreases, galaxies with lower surface brightness become detectable, indicating that galaxies with lower axis ratios are easier to detect due to the accumulation of luminosity. Additionally, the detected galaxies exhibit a broader distribution of axis ratios and surface brightness than that of the training sample. Some detected sources fall beyond the range of axis ratios and surface brightness for edge-on LSBGs. This indicates that the model is capable of detecting relatively faint galaxies with slender shapes, but lacks the ability to accurately distinguish sources with axis ratio and surface brightness near the threshold boundaries. The small number of negative samples with axis ratio and surface brightness near the cut thresholds may have resulted in insufficient learning of the identification boundaries.
The color-magnitude relation for training samples and detected samples is shown in Figure 11. The absolute magnitude M_r and color (g − r) have been corrected according to He et al. (2020), considering the differences in internal extinction and color changes between face-on and edge-on galaxies. The green line is the dividing line between "red"-sequence galaxies and "blue" cloud galaxies (Bernardi et al. 2010). "Red" galaxies lie above this line, while "blue" galaxies lie below it. Figure 11 shows that most of our edge-on LSBG candidates are located in the "blue" region, exhibiting excellent consistency with the training data.
In summary, the properties of the detected candidates are generally consistent with those of the training samples, affirming the effectiveness of our detection model. The identified galaxies exhibit slender shapes and relatively low surface-brightness features, indicating that our detection model has effectively captured the key characteristics of edge-on LSBGs. It is worth noting that the surface brightness and axis ratio of some detected samples slightly exceed the predefined range for edge-on LSBGs, which represents a precision limitation of the model. Nonetheless, our model significantly enhances the level of automation in the recognition of edge-on LSBGs, reducing the need for manual intervention.

The Catalog of the Candidate LSBGs
To ensure the production of a more dependable catalog, we implemented selection criteria based on axis ratio and central surface brightness (expAB_g < 0.3, µ0,B > 22 mag arcsec⁻²). In addition, we removed three cosmic-ray contaminants through visual inspection, resulting in a final count of 40,759 edge-on LSBG candidates. Among them, 40,558 candidate LSBGs correspond to SDSS galaxies, while 201 candidates are newly detected sources not present in the SDSS Galaxy view. A portion of the catalog is shown in Table 2 and the full version is available online at https://github.com/worldoutside/Edge-onLSBG.
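The final parameter cut amounts to a simple filter on two quantities per source. The sketch below applies the thresholds quoted above to a few made-up records; the dictionary keys, the source names, and the values are hypothetical and serve only to show the logic.

```python
def passes_catalog_cuts(source, max_axis_ratio=0.3, min_mu0_b=22.0):
    """True if the source satisfies expAB_g < 0.3 and mu0_B > 22 mag arcsec^-2."""
    return source["expAB_g"] < max_axis_ratio and source["mu0_B"] > min_mu0_b


# Made-up records for illustration only.
candidates = [
    {"name": "toy-1", "expAB_g": 0.21, "mu0_B": 23.4},  # passes both cuts
    {"name": "toy-2", "expAB_g": 0.35, "mu0_B": 23.9},  # too round
    {"name": "toy-3", "expAB_g": 0.18, "mu0_B": 21.6},  # too bright
]

selected = [s["name"] for s in candidates if passes_catalog_cuts(s)]
print(selected)
```

Applying the cut after detection, as done here, trims the sources near the threshold boundaries that the detection model alone does not separate cleanly.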

CONCLUSION
In this paper, we present an edge-on LSBG catalog identified from SDSS DR16 field images using deep-learning methods. With a sample of 281 edge-on HI-rich LSBGs from He et al. (2020), a deep-learning object detection model was built using the YOLOv5 algorithm, achieving a recall of 94.64% and a purity of 95.38% for the test set. We then applied the model to search for edge-on LSBGs from 938,046 composite images in SDSS DR16, and 52,293 candidate LSBGs were identified. Subsequently, the candidate LSBGs were purified by the Deep-SVDD anomaly detection model. We showed the properties of the sample of candidate LSBGs, including the B-band central surface brightness, axis ratio, and color-magnitude relation. The properties of the detected samples are in good agreement with those of the training sample. Finally, we provided a sample that includes 40,759 candidate edge-on LSBGs, which is a wide-area sample for future studies investigating the properties of edge-on LSBGs.
This study utilizes deep-learning methods for the automatic detection of edge-on LSBGs, leading to a significant enhancement in the automation of LSBG detection while reducing the need for manual inspection. This approach remains effective in identifying sources whose parameters are difficult to extract with traditional methods, including relatively dim and irregular galaxies. Importantly, the established detection model operates independently of photometric parameters, enabling the identification of sources that might be missed by photometric methods. This technology holds promise for the development of intelligent image analysis tools to support future large-scale sky surveys.
We thank the SDSS team for the released SDSS images and parameter catalog. Funding for the SDSS has been provided by the Alfred P. Sloan Foundation, the Participating Institutions, the National Science Foundation, the US Department of Energy, NASA, the Japanese Monbukagakusho, the Max Planck Society, and the Higher Education Funding Council for England. The SDSS Web Site is http://www.sdss.org.

Figure 1. Images of six edge-on LSBG samples in SDSS DR7.

Figure 2. An edge-on LSBG in a labeled bounding box. The image is one of the training samples, which is a gri-band composite image of SDSS DR16 obtained from the SAS.

Figure 3. Distribution of confidence for all the labeled edge-on LSBGs in the test set detected by the YOLOv5m model.

Figure 4. Two candidates detected by the model in an SDSS composite image, with the confidence of the candidates predicted by our model.

Figure 5. The distribution of the anomaly scores. The left panel shows a scatter plot, and the right panel is a histogram. The red line in the scatter plot represents the upper limit of 0.49.

Figure 7. Six candidate LSBGs identified by our object detection model that are also included in the galaxy view of SDSS DR16 (the first two rows), and three candidate LSBGs identified by our model that are not included in the galaxy view of SDSS DR16 (the last row).

Figure 8. The normalized distribution of µ0,B of the training edge-on LSBGs and the detected candidate LSBGs. The red and blue vertical lines are marked at the peaks of the training samples and detected samples, respectively.

Figure 9. Distribution of axis ratio in g band and r band of training edge-on LSBGs (top panel) and identified candidate LSBGs (bottom panel).

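Figure 10. Distribution of axis ratio vs. B-band central surface brightness for training and detected samples in g band (left panel) and r band (right panel).
Figure 11. Color-magnitude relation for the training and detected samples. The green line is the dividing line between "red"-sequence galaxies and "blue" cloud galaxies (Bernardi et al. 2010).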

a Flag is used to denote the source of R.A. and Decl. When Flag=0, the first two columns correspond to the SDSS R.A. and SDSS Decl. When Flag=1, the first two columns represent the model-predicted R.A. and Decl.
supported by Shandong Province Natural Science Foundation grant No. ZR2022MA089, the National Natural Science Foundation of China (NSFC) grant Nos. U1931209 and 11803016, and the Chinese Space Station Telescope project. D.W. is supported by the National Natural Science Foundation of China (NSFC) grant Nos. U1931109 and 11733006 and the Youth Innovation Promotion Association, Chinese Academy of Sciences (CAS), No. 2020057. Y.B. is supported by the National Natural Science Foundation of China under grant No. 11873037 and partially supported by the Young Scholars Program of Shandong University, Weihai (2016WHWLJH09), and the science research grants from the China Manned Space Project with Nos. CMS-CSST-2021-B05 and CMS-CSST-2021-A08.

Table 1. Test Results of Our Detection Model at Three Confidence Thresholds.
a TP (true positives) is the number of detected samples that are already labeled in the training data set.
b FP (false positives) is the number of newly detected candidates.
c CP (check positives) is the number of newly detected candidates that are confirmed as candidate LSBGs by checking the shape and central surface brightness in B-band.

Table 2. Catalog of Candidate LSBGs Detected in SDSS DR16, Sorted by Model-predicted Confidence
Notes. A copy of the catalog is also available at https://github.com/worldoutside/Edge-onLSBG (the full catalog includes 40,759 candidate LSBGs).