Taxonomic Analysis of Asteroids with Artificial Neural Networks

We study the surface composition of asteroids with visible and/or infrared spectroscopy. For example, asteroid taxonomy is based on the spectral features or multiple color indices in visible and near-infrared wavelengths. The composition of asteroids gives key information to understand their origin and evolution. However, we lack compositional information for faint asteroids due to the limits of ground-based observational instruments. In the near future, the Chinese Space Survey Telescope (CSST) will provide multiple colors and spectroscopic data for asteroids of apparent magnitude brighter than 25 and 23 mag, respectively. With the aim of analyzing the CSST spectroscopic data, we applied an algorithm using artificial neural networks (ANNs) to establish a preliminary classification model for asteroid taxonomy according to the design of the survey module of CSST. Using the SMASS II spectra and the Bus–Binzel taxonomic system, our ANN classification tool composed of five individual ANNs is constructed, and the accuracy of this classification system is higher than 92%. As the first application of our ANN tool, 64 spectra of 42 asteroids obtained by us in 2006 and 2007 with the 2.16 m telescope in the Xinglong station (Observatory Code 327) of National Astronomical Observatory of China are analyzed. The predicted labels of these spectra using our ANN tool are found to be reasonable when compared to their known taxonomic labels. Considering its accuracy and stability, our ANN tool can be applied to analyze CSST asteroid spectra in the future.


INTRODUCTION
Small Solar System objects (S3Os) are thought to be remnants of planetesimals from the early stage of planetary formation in the Solar System. Compared to the planets, S3Os may retain more information about protoplanetary conditions because they have undergone less secondary chemical and geological evolution, although collisions, space weathering, and dynamical and thermal evolution have shaped their present physical and orbital properties (DeMeo & Carry 2014). At present, most discovered S3Os are asteroids, which are thought to originate from the inner planetesimals that served as the building blocks of the terrestrial planets. The distribution of asteroid compositions with respect to their orbits can provide clues to the origin and evolution of asteroids, as well as constraints on planetary formation models in the inner Solar System (Bottke et al. 2002).
The surface composition of asteroids can be inferred from photometric colors and/or spectroscopic data in the visible (Vis) and near-infrared (NIR) wavelength regimes. The overall shape and absorption features of the reflectance spectra of asteroids reflect their composition and mineralogy. Additionally, four effects influence the reflectance spectra of asteroids: (1) phase reddening, which raises the spectral slope by 8-12% in the phase angle range of 0°-100° (Clark et al. 2002); (2) space weathering, which darkens and reddens a planetary surface; (3) the particle size distribution of the regolith, which changes the spectral slope and band depth; and (4) the temperature of the surface, which affects the shapes of spectral bands, e.g., those associated with olivine and pyroxene minerals. Spectral taxonomy is a frequently used method to understand the surface composition of S3Os, and is based on the spectral slope and absorption features in the spectra. With the contributions of many astronomers, several classification systems of asteroids have been established. Using the data of the Eight-Color Asteroid Survey (ECAS; wavelength range 0.337-1.041 µm), the Tholen taxonomy classified asteroids into 14 classes (Zellner et al. 1985). The SMASS II taxonomy, based on the data of the Small Main-Belt Asteroid Spectroscopic Survey in 2002 (SMASS II; wavelength range 0.435-0.925 µm), classified asteroids into 26 types (Bus & Binzel 2002a). The Bus-DeMeo taxonomy (DeMeo et al. 2009) classified asteroids into 24 classes using both visible and near-infrared data (wavelength range 0.45-2.45 µm).
Recently, taxonomic studies of asteroids have started to apply machine learning techniques to, e.g., multiple photometric colors (Colazo et al. 2022; Popescu et al. 2018), spectra (Penttilä et al. 2021), albedo (Belskaya et al. 2017; Mahlke et al. 2022), and spectral slope (Zellner et al. 1985; DeMeo et al. 2009). Various machine learning algorithms have been applied to asteroid taxonomy, for example, principal component analysis (PCA; Bus & Binzel 2002a; DeMeo et al. 2009), random forest (RF; Huang et al. 2017), cluster analysis (CA; Mahlke et al. 2022), multinomial logistic regression (MLR; Klimczak et al. 2021), naive Bayes (Klimczak et al. 2021), support vector machines (SVM; Klimczak et al. 2021), gradient boosting (GB; Klimczak et al. 2021), and the multilayer perceptron (MLP), or feed-forward neural network (Klimczak et al. 2021; Penttilä et al. 2021; Howell et al. 1994). The PCA method transforms a multidimensional dataset so that most of its information is concentrated in a few variables, which reduces the dimensionality of the data (Baron 2019; Ivezić et al. 2019; Ball & Brunner 2010). The RF method is composed of decision trees, each of which gives a classification model. After repeatedly constructing decision trees, the RF outputs the final classification by a vote among the constructed trees (Baron 2019; Ivezić et al. 2019; Ball & Brunner 2010; Huang et al. 2017). The MLP is the most widely used model among the artificial neural networks (ANNs), and is usually trained with the back-propagation (BP) algorithm. Comparatively, the feed-forward neural network can achieve a higher accuracy when used to classify the spectroscopic data of asteroids (Klimczak et al. 2021). Such a neural network has good nonlinear function approximation performance and prediction ability. It is robust to irrelevant or redundant attributes, which allows it to adapt to complex data with various features (Baron 2019; Ivezić et al. 2019; Ball & Brunner 2010).
In the analysis of asteroid taxonomy, neural network techniques can make full use of the spectral trend and absorption features, especially those that are difficult to distinguish by eye. The ANN is a supervised machine learning algorithm and has been applied successfully in the analysis of asteroid taxonomy. For example, Penttilä et al. (2021) applied the algorithm to build a neural network of 11 classes, reduced from the Bus-DeMeo taxonomic system (DeMeo et al. 2009), with a combined dataset covering the Vis-NIR wavelength range between 0.45 and 2.45 µm. Penttilä et al. (2021) showed that the established ANN works well in classifying asteroids, even when the wavelength range of the tested spectra differs from that of the original taxonomic system, and concluded that the ANN could be applied to the spectral data of the ESA Gaia space telescope.
In practice, most of the discovered asteroids have no spectral data because they are too faint. In the near future, the situation will improve with large ground-based telescopes, e.g., the Vera C. Rubin Observatory, and space survey telescopes such as the CSST (Chinese Space Survey Telescope) and the ESA Euclid.
The ESA Euclid mission will employ a 1.2-m telescope to survey a sky area of 15,000 square degrees during its six-year operation. Its Vis-NIR imaging and spectroscopic instruments could detect celestial objects of apparent magnitude brighter than $V_{\rm AB} = 24.5$ mag. Accordingly, roughly 150,000 S3Os (mainly main-belt asteroids) could be detected by the imaging mode of Euclid, and 100,000 could be observed spectroscopically in the wavelength range of 0.5-2 µm.
The CSST, a 2.0-m telescope with a field of view of 1.2 × 1.2 deg, is expected to be launched in 2024 and to be located next to the Chinese space station (Zhan 2021). At present, the CSST has five observational modules: the Survey module, the Terahertz module (THz), the Multi-Channel Imager (MCI), the Integrated Field of view Spectrometer (IFS), and the Cool Planet Image star Coronagraph module (CPI-C). During its ten-year operation, the survey module will occupy 70% of the total observation time. Consequently, a 17,500 deg² deep-sky area and a 400 deg² extreme-deep sky area could be covered. The exposure times for the deep-sky area and the extreme-deep sky area will be 150 s and 240 s, respectively. The survey module of the CSST is composed of 30 CCD detectors with imaging and spectroscopy functions (see Fig. 1). The imaging part consists of 18 CCDs and 7 color filters: NUV (0.22-0.32 µm), u (0.32-0.4 µm), g (0.4-0.55 µm), r (0.55-0.69 µm), i (0.69-0.82 µm), z (0.82-1.0 µm), and possibly a near-infrared filter (0.9-1.7 µm). The slitless spectroscopic part involves 12 CCDs and three gratings (Zhan 2021): GU (0.255-0.4 µm), GV (0.4-0.6 µm), and GI (0.6-1.0 µm). With the three gratings as designed, the spectral observations will cover wavelengths from 0.255 to 1.0 µm. Roughly, the limiting magnitudes of imaging observations for the deep-sky and extreme-deep sky modes are 25.5 and 26.5 mag, respectively, and those of spectroscopic observations are 23 and 24 mag, respectively. According to the planned ten-year observations with the CSST's survey module and the averaged magnitude limit, around 200,000 known S3Os will be observed and new S3Os will be discovered in the survey observations.
In the near future, a greatly extended spectral database of asteroids will be obtained by large ground-based and space-based telescopes. Taxonomic analysis of these new spectral data is necessary in order to understand the origin and evolution of asteroids. To meet this goal, we developed a spectral analysis platform, based on an artificial neural network (ANN) technique, for the spectral data of asteroids obtained by the CSST survey module.
To build the ANN for asteroid taxonomy to be applied in the future to the CSST spectroscopic data, we used the SMASS II data (Bus & Binzel 2002a) and the Bus-Binzel taxonomy system (Bus & Binzel 2002b). The spectroscopic units of the CSST's survey module are designed to provide spectral data over a wavelength range of 0.255 to 1.0 µm with a resolution of roughly 200 (Zhan 2021). Therefore, the SMASS II data in the wavelength range of 0.435 to 0.925 µm, used to train and test the ANN, are re-sampled according to the resolution of the CSST's spectroscopic gratings. As an application of the derived ANN, 64 spectra of 42 asteroids observed by us in 2006 and 2007 with the 2.16-m telescope at the Xinglong site of the National Astronomical Observatory of China are analysed.
This paper is arranged as follows. Section 2 introduces the construction of the ANN applied in this work, and Section 3 describes the preparation of the training and test data to be used with the ANN. The training and evaluation of the ANN are described in Section 4. Section 5 presents the analysis results obtained with the derived ANN as an application, and Section 6 gives a summary of the work.

ARTIFICIAL NEURAL NETWORK
Machine learning techniques are frequently used to solve problems such as classification and regression. Machine learning algorithms are usually divided into four types: supervised learning, unsupervised learning, self-supervised learning, and reinforcement learning. The ANN built for the taxonomic analysis of asteroids belongs to supervised learning: it "learns" by looking for the mapping between the features of the input data and their corresponding labels. As a result, a trained ANN classification model that gives a high accuracy for test data can be applied to new spectra of asteroids.

Structure of the artificial neural network
We constructed an ANN with an input layer, one hidden layer, and an output layer (see Fig. 2). Neurons in the input layer take the elements of the input data, i.e., the features of a spectrum, represented by a vector $X = [x_i]$, $i = 1, \ldots, R$. The number of neurons in the input layer, $R$, is the number of features in the input spectral data. The $K$ neurons in the output layer present a discrete probability distribution $Y = [y_j]$, $j = 1, \ldots, K$, over the $K$ classes ($j$ is the class index, and neuron $y_j$ in the output layer gives the probability that the input spectrum belongs to class $j$).
Here, we temporarily set the number of neurons in the input layer to $N_{\rm in} = 151$, considering the wavelength range of the available training data (i.e., SMASS II, 0.435-0.925 µm) and the lower limit of 200 on the resolution of the CSST slitless spectroscopic units. The wavelength $\lambda_i$ of the $i$-th element of the input data of our ANNs is computed by Eq. (1), in which $\lambda_0$ is the shortest wavelength in the training or test data, $i$ is the index of the element, and the factor $(1 + 1/R)$ sets the spacing between two adjacent elements $i - 1$ and $i$. In Eq. (1), $R$ denotes the spectral resolution of 200 rather than the number of input neurons. For spectral data covering 0.435 to 0.925 µm, the maximum of $i$ is 151:

$$\lambda_i = \lambda_0 \left(1 + \frac{1}{R}\right)^{i}. \qquad (1)$$

The number of neurons in the output layer is set to $K = 10$, namely, only the ten most populated classes of asteroids are included in the presented ANNs. Detailed information about the chosen asteroid classes is given in Sec. 3.
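The count of 151 input channels can be checked numerically. The short sketch below assumes that Eq. (1) has the form $\lambda_i = \lambda_0 (1 + 1/R)^i$ with a resolution of $R = 200$ and that the index starts at $i = 1$; the function name `log_wavelength_grid` is introduced here only for illustration.

```python
def log_wavelength_grid(lam0=0.435, lam_max=0.925, resolution=200):
    """Return wavelengths lam0 * (1 + 1/resolution)**i, i = 1, 2, ...,
    keeping only the points that do not exceed lam_max (micrometres)."""
    step = 1.0 + 1.0 / resolution  # constant ratio between adjacent channels
    grid = []
    i = 1  # assumption: the first channel has index 1
    while lam0 * step ** i <= lam_max:
        grid.append(lam0 * step ** i)
        i += 1
    return grid


grid = log_wavelength_grid()
print(len(grid))           # number of resampled channels
print(grid[0], grid[-1])   # first and last wavelength in micrometres
```

Under these assumptions the grid has 151 points and the channel spacing grows with wavelength, which is the behaviour the resampling is meant to reproduce.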
The number of neurons in the hidden layer is usually hard to determine a priori because it can be chosen freely. Often, it is determined by iterating over different choices. Generally, more neurons in the hidden layer allow more complicated problems to be solved by the ANN. One should take care when choosing this number, because too few neurons in the hidden layer result in under-fitting of the input data, while too many neurons can lead to over-fitting. In other words, in the under-fitting case the ANN gives a low prediction accuracy for the input data, and in the over-fitting case the ANN can give a high accuracy for the training data but a low accuracy for the test data. In our case, we found $N_h = 45$ neurons in the hidden layer to work well.
As a feed-forward neural network, the input information is transferred to neurons in the hidden layer, and the output of the hidden layer is continued to be transferred to the neurons of the output layer, and the output of the output layer offers us the results.The transfer, or the connections between the neurons in two consecutive layers, is done using weights, biases, and activation functions.
In detail, the input layer receives the data by storing the features of the input in its neurons. Each neuron $h$ in the hidden layer receives signals from all neurons $[x_i]$, $i = 1, \ldots, N_{\rm in}$, in the input layer, multiplies them by the weights $v_{i,h}$, adds the bias $b^{1}_{h}$, and finally outputs $a_h$ after a nonlinear activation function $f_1$ (see Eq. (2)). Here we use the ReLU function, which can improve the convergence rate of the estimation of the weights and biases:

$$a_h = f_1\left(\sum_{i=1}^{N_{\rm in}} v_{i,h}\, x_i + b^{1}_{h}\right), \qquad f_1(u) = \max(0, u). \qquad (2)$$

The output of the hidden layer, $a_h$, $h = 1, \ldots, N_h$, is taken as the input of the output layer. With a similar procedure but a different activation function, the information is passed on to the neurons in the output layer with the weights $w_{h,j}$ and the biases $b^{\rm out}_{j}$. For a classification problem such as ours, the so-called softmax function is taken as the activation function, denoted as $f_2$, between the hidden layer and the output layer. After the transfer function $f_2$, the output of the output layer, $Y = [y_j \in [0, 1]]$, $j = 1, \ldots, K$ (see Eq. (3)), is the predicted probability distribution over the classes:

$$y_j = f_2(u_j) = \frac{e^{u_j}}{\sum_{k=1}^{K} e^{u_k}}, \qquad u_j = \sum_{h=1}^{N_h} w_{h,j}\, a_h + b^{\rm out}_{j}. \qquad (3)$$

For the ANN to classify asteroids using spectral data, its output $y_j$ should be close to that of the known labeled class. If the weights and biases are properly fitted, the predicted values of the ANN should be consistent with the labeled values of the input data. Otherwise, the weights and biases involved in the hidden layer and the output layer need to be adjusted. In our case, there are $(N_{\rm in} + 1) N_h + (N_h + 1) K = 7300$ parameters (weights and biases) that need to be optimized to achieve accurate prediction of the input data classes. The procedure to find the best values for these parameters is called a 'learning' or 'training' procedure.
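As an illustration of the forward pass and the parameter count described above, the following is a minimal sketch in plain Python. It assumes the standard ReLU and softmax definitions; the function names and the random weights are hypothetical and do not correspond to the trained network.

```python
import math
import random


def count_parameters(n_in, n_hidden, n_out):
    """Weights plus biases of a network with one hidden layer."""
    return (n_in + 1) * n_hidden + (n_hidden + 1) * n_out


def relu(u):
    return max(0.0, u)


def softmax(u):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    m = max(u)
    exps = [math.exp(val - m) for val in u]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, v, b1, w, b_out):
    """x: input features; v[i][h], b1[h]: hidden layer; w[h][j], b_out[j]: output layer."""
    n_hidden, n_out = len(b1), len(b_out)
    a = [relu(sum(x[i] * v[i][h] for i in range(len(x))) + b1[h])
         for h in range(n_hidden)]
    u = [sum(a[h] * w[h][j] for h in range(n_hidden)) + b_out[j]
         for j in range(n_out)]
    return softmax(u)


# Illustrative random weights (assumption: small Gaussian initial values).
rng = random.Random(0)
n_in, n_hidden, n_out = 151, 45, 10
v = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)] for _ in range(n_in)]
b1 = [0.0] * n_hidden
w = [[rng.gauss(0.0, 0.1) for _ in range(n_out)] for _ in range(n_hidden)]
b_out = [0.0] * n_out

y = forward([1.0] * n_in, v, b1, w, b_out)
print(count_parameters(n_in, n_hidden, n_out))  # 7300
print(sum(y))  # probabilities over the 10 classes sum to 1 (up to rounding)
```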

Learning procedure
Given the structure of our ANN, the back-propagation algorithm is applied to optimize the weights and biases. The cross-entropy (see Eq. (4)) is used as the loss function of our ANN (Boer et al. 2005); it measures the difference between the predicted values and the labeled values in the training data. Given the labels of the input data, a smaller value of the loss function means a better performance of the network. If there are $N_{\rm sample}$ training samples in a training set, the loss function can be written as

$$L = -\frac{1}{N_{\rm sample}} \sum_{n=1}^{N_{\rm sample}} \sum_{i=1}^{K} p_{(i,n)} \ln q_{(i,n)}, \qquad (4)$$

where $P = [p_{(i,n)}]$ ($i = 1, \ldots, K$; $n = 1, \ldots, N_{\rm sample}$) and $Q = [q_{(i,n)}]$ are the labeled and predicted values, respectively. The subscript $(i, n)$ refers to class $i$ and sample $n$.
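To make the behaviour of the cross-entropy concrete, the sketch below evaluates it for a toy set of one-hot labels and two sets of predictions. Averaging over the samples and clipping the predicted probabilities at a small `eps` are assumptions of this sketch, not details given in the text.

```python
import math


def cross_entropy(labels, predictions, eps=1e-12):
    """Mean cross-entropy between one-hot labels P and predicted distributions Q.

    labels, predictions: lists of per-sample lists of length K.
    eps guards against log(0); the averaging over samples is an assumption.
    """
    total = 0.0
    for p, q in zip(labels, predictions):
        total -= sum(p_i * math.log(max(q_i, eps)) for p_i, q_i in zip(p, q))
    return total / len(labels)


labels = [[1, 0, 0], [0, 1, 0]]
good = [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]]   # confident and correct
bad = [[0.2, 0.4, 0.4], [0.5, 0.3, 0.2]]      # mostly wrong

print(cross_entropy(labels, good))  # small loss
print(cross_entropy(labels, bad))   # larger loss
```

A prediction that puts most of the probability on the labeled class gives a loss near zero, which is why minimizing this quantity drives the network toward the labeled values.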
The learning procedure is in fact an optimization of the ANN, in which the weights and biases are adjusted to minimize the loss function. The optimization method we applied here is the mini-batch gradient descent (MBGD) algorithm (Goodfellow et al. 2016). It is a variant of stochastic gradient descent (SGD), and combines the merits of the SGD and the batch gradient descent (BGD). The MBGD uses a small subset of samples picked randomly from the whole training data set, called a batch of samples. With a batch of samples, the weights and biases are iteratively corrected until the cross-entropy reaches a minimum. The most widely used approach in the optimization of an ANN is to move along the gradient direction of the loss function. Let a vector $z = [w, b]^T$ represent the parameters, with $w$ the weights and $b$ the biases. The parameter vector is iteratively updated along the negative gradient of the loss function with Eq. (5):

$$z_{k+1} = z_k - \alpha\, g_k. \qquad (5)$$

From Eq. (5), the updated parameters $z_{k+1}$ depend on the previous values $z_k$ and on the gradient of the loss function, $g_k = [g_1, g_2]^T$, whose components $g_1$ and $g_2$ are taken with respect to the weights and the biases, respectively. The quantity $\alpha$ ($> 0$), called the learning rate, is usually set with an efficient convergence of the training procedure in mind.
Then, a new batch of samples is picked from the rest of the training data set, and the parameters are updated continuously until every sample in the training data set has been used. In practice, the entire training data set is randomly split into batches according to the number of samples in a batch. The round in which all batches are used in training the ANN is called a training epoch. The training data set is then randomly split again and the next epoch of training is carried out, until some condition is satisfied, for example, the variation of the loss function is smaller than a threshold or the number of epochs reaches the maximum limit. Based on our experience in training the ANN, the MBGD has a stable convergence speed and is more efficient than the BGD.
Here we build an ANN tool to classify future spectral data of asteroids from the CSST and newly observed spectra from a ground-based telescope. The future CSST spectral data of asteroids could be obtained by one of the 12 spectroscopic units of the survey module (see Fig. 1), four for each of GU (0.255-0.4 µm), GV (0.4-0.6 µm), and GI (0.6-1.0 µm). Each spectroscopic unit consists of two gratings with opposite dispersion directions, one band-pass filter, and a CCD detector. The +1 and -1 order spectra of the two gratings could give a stellar spectral image 3.4 to 4.1 mm in length, equivalent to a spectral image of roughly 300-400 pixels in a 9K CCD image assuming a pixel size of 10 µm. Spectra of asteroids could cover a wavelength range of 0.255 to 1.0 µm if the target passes through the GU, GV, and GI spectroscopic units. Considering the wavelength range of the CSST spectral data (0.255 to 1.0 µm) and of our newly obtained spectral data (0.40 to 0.83 µm), the SMASS II data and the classification labels from the Bus-Binzel taxonomy system are chosen to construct our ANN tool.
To train and test the ANN of asteroid classification, we used the spectral data from the database of the Small Main-Belt Asteroid Spectroscopic Survey Phase II (Bus & Binzel 2002a,b, SMASS II) and the Bus-Binzel taxonomy system (Bus & Binzel 2002a).

Sample selection
The SMASS data are the first spectral data of asteroids obtained with a spectroscopic instrument using a CCD detector, so more features are detected than with broad-band spectrophotometric colors. The SMASS II database contains spectral data of 1447 asteroids covering a wavelength range of 0.435-0.925 µm. Each spectrum is composed of 182 data points with a sampling interval of 2.5 nm in wavelength.
To inherit from the Tholen taxonomy, Bus & Binzel (2002a) constructed a feature-based asteroid taxonomy with a PCA method based on the SMASS II data and information.It is called the Bus-Binzel or SMASS II taxonomic system, in which 26 classes (A, B, C, Cb, Cg, Cgh, Ch, D, K, L, Ld, O, Q, R, S, Sa, Sk, Sl, Sq, Sr, T, V, X, Xc, Xk and Xe) are identified from the spectra of 1447 asteroids.
The SMASS II spectral data we used here were downloaded from a website. Among the 1447 classified asteroids in the SMASS II system, 1139 belong to the C-, S-, and X-complexes, each of which is further divided into subclasses according to features in the spectral data. The rest are identified as the special classes A, B, D, K, L, O, Q, R, T, and V. The occurrence of subclasses in the Bus-Binzel system reflects the connection or transition of a class to its closely neighboring classes (Bus & Binzel 2002a). We consider that it is also important to understand the spectral variations due to heterogeneous surfaces and to different observational geometries and epochs.
As the first stage of the analysis to the spectral data using CSST and ground-based telescopes, 10 classes (A, B, C, D, K, L, S, T, V, and X-type) are included in our ANN to have enough samples in each class.The classification to subclasses of asteroids will be considered in future work.In all, 834 samples are picked from the SMASS II database.The detailed numbers of samples for each class are listed in Table 1.For convenience, we attributed integer numbers 0, . . ., 9 to the classes we use here (A, B, C, D, K, L, T, V, X, and S-type).

Resampling the SMASS II spectra
The future slitless spectral data of the CSST will cover a wavelength range of 0.255 to 1.0 µm with a lower limit of 200 on the resolution (Zhan 2021). Before the construction of the ANN for the CSST spectra, the training data, i.e., the SMASS II spectra, need to be re-sampled to match the CSST observations. Considering the shortest wavelength of $\lambda_0 = 0.435$ µm and the longest wavelength of 0.925 µm, the re-sampled SMASS II data are composed of 151 data points (the resampling follows expression (1)).
In actually observed spectral data, the data points become more and more sparse as the wavelength increases. Resampling with expression (1) therefore brings the training data closer to the actual observations. The reflectance value corresponding to each re-sampled wavelength is obtained by cubic spline fitting to the spectral data. Considering the wavelength interval of the original SMASS II spectra, we set 0.02 µm as the bin interval of the cubic spline fitting, so at least 10 data points are involved in fitting each piece-wise spline function. Most re-sampled spectra are very close to the original ones; only some very noisy data are smoothed in the spline fitting procedure (see Fig. 3). The re-sampling procedure gives us 834 spectra, each having 151 features/channels.

Cloning of spectra
From Table 1, it is easy to notice that the sample counts of different classes are not balanced, for example, the D-type has only 9 samples whereas the S-type has the largest number of samples, 382.If such data is used to train the ANN, the neural network could optimize and learn the features of the classes with large numbers of samples only.To balance the samples in different classes, we added some synthetic spectra into those classes with few samples.The synthetic spectra of a class are generated by a clone method based on all real data in that class.
The spectra of asteroids of the same class show similar shapes with some variation. To obtain the synthetic spectra, the representative spectrum of each class is first found by averaging all re-sampled spectra in the class (the red line in Fig. 4). In detail, we calculated the mean reflectance at wavelengths $\lambda_i$ ($i = 1, \ldots, 151$) and the corresponding standard deviations $\sigma_i^k$ (with $k$ as the class index, $k = 0, \ldots, 9$) over all re-sampled spectra in class $k$. Then, a clone spectrum of class $k$ is simulated by adding Gaussian noise with a standard deviation of $\sigma_i^k$ to the representative spectrum of class $k$. Cloned spectra are generated repeatedly until the total number of spectra in each class reaches 400. The cloned and real spectra for each class are shown in Fig. 4. The 10 × 400 spectra form the dataset used in training, validating, and testing our ANN in the next step. During the construction of the ANN, the dataset is divided into training, validation, and test data sets.
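The cloning step can be sketched as follows. This is an illustrative version under stated assumptions: the noise is drawn independently in each channel, the population standard deviation is used, and the real spectra are kept in the balanced set; the function name `clone_spectra` and the toy input are hypothetical.

```python
import random
import statistics


def clone_spectra(real_spectra, target=400, seed=0):
    """Pad one class with synthetic spectra until it holds `target` spectra.

    Each clone is the class-mean spectrum plus Gaussian noise whose standard
    deviation in channel i is the scatter of the real spectra in that channel.
    """
    rng = random.Random(seed)
    n_channels = len(real_spectra[0])
    mean = [statistics.fmean([s[i] for s in real_spectra]) for i in range(n_channels)]
    std = [statistics.pstdev([s[i] for s in real_spectra]) for i in range(n_channels)]

    spectra = [list(s) for s in real_spectra]  # keep the real spectra
    while len(spectra) < target:
        spectra.append([rng.gauss(m, sd) for m, sd in zip(mean, std)])
    return spectra


# Toy class with 9 "real" spectra of 5 channels each (made-up numbers).
real = [[1.0 + 0.01 * k + 0.1 * i for i in range(5)] for k in range(9)]
balanced = clone_spectra(real)
print(len(balanced))  # 400
```

Because the noise is independent from channel to channel in this sketch, the clones are rougher than real spectra of the same class; a correlated noise model would be one possible refinement.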

CONSTRUCTION OF THE ANN
The construction of the ANN is a procedure to find an optimal mapping relationship between the input data and the labels.In detail, for an ANN of three layers (the input layer of N in = 151 neurons, one hidden layer of N h = 45 neurons, and the output layer of K = 10 neurons), there are (N in + K + 1) × N h + K = 7300 parameters to be optimized.

ANN training
Using all spectra in every optimization round would be time-consuming and computationally heavy. Therefore, we applied the MBGD (mini-batch gradient descent) algorithm in the optimization with multiple parameters. The learning rate controls the step size when updating the parameters in each iteration step. Too large a learning rate may cause the parameter values to oscillate. With the ReLU function as the transfer function between the input layer and the hidden layer, a small learning rate is preferred. At the beginning of training, a grid over the parameter space (batch size, epochs, learning rate) is scanned to find good values for the three parameters. Each combination of the three parameters is tested by checking the efficiency of the training performance. Finally, a batch size of 64, an epoch number of 1000, and a learning rate of 0.001 were chosen. All the ANN parameter values are listed in Table 2. We have an input dataset of 4000 spectra to train and test the ANN, and the 4000 spectra are divided randomly into the training and test datasets with a ratio of 4:1. The training procedure of the ANN starts by initializing the weights and biases with random values; the weights and biases are then optimized iteratively with the following steps: (1) separate the input dataset into training and test data sets with the ratio 4:1; (2) split the training data set randomly into batches according to the batch size of 64; (3) correct the weights and the biases by the gradient descent method to minimize the loss function, in which the back-propagation method is used to implement the optimization; (4) repeat the update of the weights and the biases with a new batch of data until all batches are used, which completes one epoch of iteration in constructing the ANN; (5) repeat (1)-(4) to reach the required number of epochs; (6) check and test the ANN performance with the test dataset.
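The data handling in steps (1) and (2) can be sketched as follows, using index lists in place of the spectra. The helper name `make_batches` and the fixed random seed are assumptions made for illustration.

```python
import random


def make_batches(n_samples, batch_size, rng):
    """Shuffle sample indices and cut them into consecutive mini-batches."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[s:s + batch_size] for s in range(0, n_samples, batch_size)]


rng = random.Random(42)

# Step (1): random 4:1 split of the 4000 spectra into training and test sets.
all_indices = list(range(4000))
rng.shuffle(all_indices)
train, test = all_indices[:3200], all_indices[3200:]

# Step (2): one epoch's worth of mini-batches of 64 training samples.
batches = make_batches(len(train), 64, rng)
print(len(train), len(test), len(batches))  # 3200 800 50
```

With 3200 training spectra and a batch size of 64, each epoch consists of exactly 50 parameter updates, and every training sample is used once per epoch.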
Additionally, we introduced a dropout layer in our ANN to avoid an over-fitting problem.The ratio of dropout layer is set as 0.2, which means that random 20 % of neurons in the hidden layer are not connected to the neurons in the output layer in each round of training.
Optimization with the gradient descent (GD) family of methods often finds a local minimum that is directly related to the initial values of the optimized parameters. To overcome this, five ANNs with different initial parameter values are constructed, and the final classification of an input spectrum is determined by a vote among the five ANNs.
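The vote among the five ANNs can be sketched as a simple majority count. How ties are resolved is not stated in the text; in this sketch a tie goes to the label that appears first in the list, which is purely an assumption.

```python
from collections import Counter


def vote(labels):
    """Return the most frequent label; ties go to the earliest label in the list."""
    counts = Counter(labels)
    best = max(counts.values())
    for label in labels:
        if counts[label] == best:
            return label


# Labels predicted by five independently initialized networks (made-up example).
print(vote(["C", "X", "C", "C", "S"]))  # C
```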

ANN accuracy
Using 834 real spectra of the SMASS II dataset, the classification performance of the trained ANN is assessed using the classification accuracy, the count ratio of correctly predicted cases to the entire sample.When using the softmax function as the activation function of the output layer, the ANN outputs a probability distribution of the taxonomic classes.The final predicted label of a spectrum is the class with the maximum probability in the probability distribution.To make the prediction result more robust, the vote result of five predicted labels suggested by five ANNs, initialized with different random numbers, is taken as the final predicted label for the input sample.
We investigated the accuracy of the classification ANN by comparing the final predicted labels for the 834 chosen real spectra from SMASS II to their labels in the Bus-Binzel taxonomic system. Overall, 764 out of 834 asteroids are correctly classified, which gives an accuracy of 92%. Considering each class separately, the accuracies of the A, B, L, and S-classes are higher than 94%, whereas the C, D, T, X, and K-classes have slightly lower accuracies. The detailed values for the 10 classes are listed in Table 3. The confusion matrix (see Fig. 5) shows where the problematic cases are. The diagonal elements of the confusion matrix are the numbers of samples whose predicted labels are consistent with those in the Bus-Binzel taxonomy, and conflicting labels appear as the off-diagonal elements. The most frequent conflicts are between the C and X-classes, and between the S and K-classes. For example, 25 C-class asteroids are predicted to be X-class, and 14 S-class asteroids are labeled as K-class by our ANN. Among the 25 C-class samples in the Bus-Binzel system labeled as X-class by our ANN, we found 9 that show a gentle convex shape between 0.6 and 0.8 µm, indicating that they might belong to the Xc-class or to the transition between the C and X-classes. The 14 S-class samples in the Bus-Binzel system are predicted as K-class by our ANN probably because the two classes show very similar spectral shapes.
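The accuracy and confusion matrix used above can be computed as in the following sketch. The toy labels are made up for illustration; only the 764/834 ratio is taken from the text.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """matrix[t][p] counts samples of true class t predicted as class p."""
    index = {c: k for k, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


classes = ["C", "K", "S", "X"]
true = ["C", "C", "C", "X", "S", "S", "K"]
pred = ["C", "X", "C", "X", "S", "K", "K"]

m = confusion_matrix(true, pred, classes)
print(m[0][3])              # one C-class sample predicted as X-class
print(accuracy(m))          # 5 of 7 toy samples correct
print(round(764 / 834, 3))  # 0.916, quoted as 92% in the text
```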

Test with the Small Solar System Objects Spectroscopic Survey data
As an additional test of our ANN, we selected spectra of 415 asteroids from the Small Solar System Objects Spectroscopic Survey (S3OS2; Lazzaro et al. 2007), which were obtained with the 1.52-m telescope of the European Southern Observatory between November 1996 and September 2001. The S3OS2 data cover the wavelength range 0.49-0.92 µm. Lazzaro et al. (2007) classified the S3OS2 data with reference to the Bus taxonomy system; we denote their results here as S3OS2(B). Among the 415 samples picked from the S3OS2 dataset, 80 are of S-type, 205 of B, C, or X-type, and 130 of A, D, K, L, T, or V-type.
With a similar procedure, these test data are first re-sampled according to the format of the input data of our ANN, and then analyzed with our ANN. Comparing our classification results to those in S3OS2(B), 362 out of 415 samples have the same taxonomy labels. The ratio of matched samples to the whole data set is ∼87%. In detail, the accuracies for the S-type, the B/C/X-type group, and the A/D/K/L/T/V-type group are 0.89, 0.87, and 0.86, respectively.

APPLICATION
As the first application of our ANN tool, we analyzed spectra obtained by our group in 2006-2007 with the 2.16-m telescope at the Xinglong site of the National Astronomical Observatory of China (observatory code 327). The aims of the spectroscopic observations are to investigate the diversity of spectra among primitive asteroids and the spectral variations of individual asteroids at different rotational phases. We therefore chose targets that are primitive asteroids (i.e., C-complex) and that could be observed well with the 2.16-m telescope during the allotted time (with a small zenith distance at the asteroid's meridian passage). In what follows, detailed information is given on the spectroscopic observations, data reduction, and the classification results.

Observations and data processing
The spectroscopic observations were carried out on five nights: three in 2006 (December 26th, 27th, and 28th) and two in 2007 (November 17th and 19th). The spectra of the selected asteroids were obtained with the OMR Cassegrain spectrograph and a PI 1340×400 CCD detector. The 300 lines/mm grating of the OMR and a long slit 2.5″ wide give a resolution of R = 200. The long slit of the OMR instrument is oriented along the south-north direction. On each night of observations, at least two solar analog stars (selected from HR4030, HR3538, HR3951, HD28099, HD191854, and HR996) were observed once or twice. Each observation of an asteroid or a solar analog star was followed by a spectrum of the HeAr lamp installed in the OMR system for the wavelength calibration. To minimize the effect of atmospheric extinction, we scheduled the spectroscopic observations of the asteroids at the smallest possible zenith distance.
Because an asteroid moves relative to the stars in the same field of view, it can pass close to a star, so that stellar light enters the slit and contaminates the spectral data of the asteroid. We checked the observation images of the asteroids and identified and rejected the contaminated data. In addition, part of the spectral data from December 28th, 2006 are not used because of abnormal operation of the OMR. Finally, 64 spectra of 42 asteroids are included in the analysis. Table 4 lists the detailed observational information for each spectrum, including the time of observation, exposure time, airmass, and the solar analog stars used.
The data reduction of the spectroscopic observations of asteroids was done following the standard procedure (Bus et al. 2002) with the IRAF software. First, the systematic effects in the spectroscopic images, i.e., bias, dark current, and flat-field effects, were corrected; cosmic rays in the images were then identified with a threshold of 4 times the standard deviation of the sky background and removed. Second, one-dimensional spectra of the asteroids, solar analog stars, and HeAr lamp were extracted with an optimal aperture and a fitted background. The wavelength calibration of a target's spectrum was done with the spectrum of the HeAr lamp taken after the target. For the atmospheric extinction, we used an average extinction curve of the Xinglong site (Code 327).
On each observing night, multiple bias frames were obtained at the beginning and end of the night and averaged into a combined bias image, called the best bias image. The best bias image was subtracted from all other spectroscopic images, i.e., those of the targets, flats, and HeAr lamp, to correct the bias of the CCD detector. Similarly, a combined flat image was generated from multiple observed flat images, which are spectroscopic images of an arc lamp in the telescope. The illumination and response effect in the combined flat was modeled and then removed from it. The final flat image is thus a normalized combined flat image without the illumination and response effect. The flat-field effect of the CCD detector is corrected by dividing all spectroscopic images by the final flat image.
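The bias and flat-field steps described above can be illustrated with a minimal pure-Python sketch. It is not the IRAF procedure used in this work: the function names are ours, frames are plain nested lists, and the flat is simply normalized to unit mean, so the removal of the illumination and response effect is not reproduced here.

```python
def mean_frame(frames):
    """Average a list of equally sized 2-D frames pixel by pixel."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(f[r][c] for f in frames) / n for c in range(cols)]
            for r in range(rows)]

def subtract(frame, bias):
    """Subtract the bias image from a frame, pixel by pixel."""
    return [[p - b for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, bias)]

def normalize(frame):
    """Scale a frame so that its mean pixel value is 1 (simplified flat)."""
    values = [p for row in frame for p in row]
    mean = sum(values) / len(values)
    return [[p / mean for p in row] for row in frame]

def divide(frame, flat):
    """Divide a frame by the normalized flat, pixel by pixel."""
    return [[p / f for p, f in zip(frow, flrow)]
            for frow, flrow in zip(frame, flat)]

def calibrate(raw, bias_frames, flat_frames):
    """Bias-subtract and flat-field one raw spectroscopic image."""
    best_bias = mean_frame(bias_frames)
    combined_flat = mean_frame([subtract(f, best_bias) for f in flat_frames])
    final_flat = normalize(combined_flat)
    return divide(subtract(raw, best_bias), final_flat)
```

For real data one would work with array libraries and FITS readers; nested lists are used here only to keep the sketch free of external dependencies.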
The one-dimensional spectra of celestial objects were extracted with a trace procedure: the peak flux is traced along the dispersion direction in the two-dimensional spectroscopic image with an optimal width (or aperture), and all flux within the aperture is then summed perpendicular to the dispersion direction to form the one-dimensional spectrum. The optimal aperture for the extraction is determined by balancing the maximum target flux against the minimum background flux in the aperture.
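A simplified version of this aperture extraction is sketched below. It makes several assumptions that are ours and not part of the reduction described above: the trace is located independently in each column as the brightest pixel, the aperture half-width is given by hand rather than optimized, and the background is the median of the pixels outside the aperture instead of a fitted background.

```python
from statistics import median

def extract_spectrum(image, half_width):
    """Sum background-subtracted flux in an aperture centred on the trace.

    image[row][col]: rows run along the slit (spatial axis) and columns
    run along the dispersion axis.
    """
    n_rows, n_cols = len(image), len(image[0])
    spectrum = []
    for col in range(n_cols):
        column = [image[row][col] for row in range(n_rows)]
        # Trace: brightest pixel along the slit in this column.
        peak_row = max(range(n_rows), key=column.__getitem__)
        lo = max(0, peak_row - half_width)
        hi = min(n_rows, peak_row + half_width + 1)
        # Sky level from the pixels outside the aperture.
        outside = column[:lo] + column[hi:]
        sky = median(outside) if outside else 0.0
        spectrum.append(sum(column[lo:hi]) - sky * (hi - lo))
    return spectrum
```

A wider `half_width` collects more target flux but also more background noise, which is the trade-off behind the choice of the optimal aperture.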
For the wavelength calibration of a target spectrum, we used a Legendre function of order three or five. The calibration is derived by fitting the pixel positions of the emission lines in the HeAr lamp spectrum taken after the target to their wavelengths. On average, our spectra cover the wavelength range 0.40-0.83 µm.
To reduce the influence of extinction as much as possible, we scheduled the observation of each asteroid for the time when it was as close to the zenith as possible. In practice, we applied an average extinction curve of the Xinglong site of the National Astronomical Observatory: each source spectrum is corrected for extinction according to the airmass at the time of the observation.
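The airmass-dependent correction can be written in the usual magnitude form, in which the observed flux is multiplied by 10^(0.4 k(λ) X), with k(λ) the extinction coefficient in magnitudes per airmass and X the airmass. The sketch below assumes this standard form; the function name and the coefficient values are hypothetical and do not reproduce the actual Xinglong extinction curve.

```python
def correct_extinction(flux, k_mag_per_airmass, airmass):
    """Scale observed fluxes to their above-atmosphere values.

    flux and k_mag_per_airmass are sampled on the same wavelength grid.
    """
    return [f * 10 ** (0.4 * k * airmass)
            for f, k in zip(flux, k_mag_per_airmass)]

# Hypothetical extinction coefficients (blue to red), for illustration only.
k_curve = [0.30, 0.20, 0.12]
observed = [1.0, 1.0, 1.0]

low_airmass = correct_extinction(observed, k_curve, airmass=1.22)
high_airmass = correct_extinction(observed, k_curve, airmass=2.61)
```

Because k(λ) is larger in the blue, the correction changes the spectral slope more strongly at large airmass, which is why observations near the zenith are preferred.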
To obtain the reflectance of an asteroid, we observed the selected solar analog stars on the same night as the asteroid. The spectra of both the asteroid and the solar analog star are normalized to unity at 0.55 µm, and the reflectance of the asteroid is the normalized spectrum of the asteroid divided by the normalized spectrum of the selected solar analog star. When multiple spectra of a single solar analog star are obtained during one night, their average is used to derive the reflectance of the asteroids. If spectra of multiple solar analog stars are obtained, we prefer to use the spectrum of a relatively stable star, or the average spectrum of those stars with no obvious activity.
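The reflectance computation described above can be sketched as follows, assuming that the asteroid and solar analog spectra are already sampled on a common wavelength grid and that the grid point nearest to 0.55 µm is used for the normalization. The function names are ours.

```python
def normalize_at(wavelengths, flux, ref_wavelength=0.55):
    """Normalize a spectrum to unity at the grid point nearest ref_wavelength."""
    idx = min(range(len(wavelengths)),
              key=lambda i: abs(wavelengths[i] - ref_wavelength))
    return [f / flux[idx] for f in flux]

def mean_spectrum(spectra):
    """Average several spectra of a solar analog taken on the same grid."""
    n = len(spectra)
    return [sum(values) / n for values in zip(*spectra)]

def reflectance(wavelengths, asteroid_flux, analog_flux):
    """Normalized asteroid spectrum divided by the normalized analog spectrum."""
    asteroid = normalize_at(wavelengths, asteroid_flux)
    analog = normalize_at(wavelengths, analog_flux)
    return [a / s for a, s in zip(asteroid, analog)]
```

By construction the resulting reflectance equals 1 at the normalization wavelength.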
The sample of 55 reflectance spectra for 41 asteroids is shown in Fig. 6 and an example with 9 reflectance spectra of asteroid (469) Argentina is shown in Fig. 7.

Taxonomy analysis with our ANN tool
According to the format of the input data of our ANN tool (151 channels ranging from 0.43 to 0.92 µm, with the span increasing as $(1 + 1/R)^i$, $i = 1, \ldots, 151$), the 64 spectra of 42 asteroids are re-sampled using the cubic spline  -class (1512). For the three E/M/P-type asteroids in the Tholen system, (469) is identified as T-class, (514) as K-class, and (663) as S-class. Considering the large albedo of E/M/P-type asteroids, they usually have a moderate spectral slope in the visible band. The new labels for the three asteroids are reasonable.
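The input wavelength grid mentioned above, in which each channel is wider than the previous one by a constant factor (1 + 1/R), can be generated as in the sketch below. We assume here that both end points (0.43 and 0.92 µm) are fixed and derive the implied R from them; the exact indexing convention of the original grid may differ, and the cubic-spline interpolation onto the grid is not shown.

```python
def geometric_grid(start, stop, n_channels):
    """Wavelength grid whose spacing grows by a constant ratio (1 + 1/R).

    Returns the grid and the resolving power R implied by the end points.
    """
    ratio = (stop / start) ** (1.0 / (n_channels - 1))
    grid = [start * ratio ** i for i in range(n_channels)]
    resolving_power = 1.0 / (ratio - 1.0)
    return grid, resolving_power

grid, implied_r = geometric_grid(0.43, 0.92, 151)
print(f"{len(grid)} channels, implied R = {implied_r:.0f}")
```

With these end points the implied R comes out slightly below 200, close to the resolution quoted for the spectra.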
The 24 asteroids labeled as C-type or a sub-class of C in the Tholen system are classified as C, B, X, or T-class by our ANN tool (in detail, 8 asteroids as C, 11 as B, 3 as X, and 2 as T). Checking the data of the Bus-Binzel taxonomic system, we found that 16 of the 42 observed asteroids are included: fourteen are labeled as sub-classes of the C-complex (C, Ch, Cb, or Cg-type), and two, (199) and (907), as X-complex. Of these fourteen C-complex asteroids, six are classified as B-type, five as C-type, two as X-type, and one as T-type by our ANN tool. Asteroid (199) is classified as T-type and asteroid (907) as B-type by our ANN tool.
Generally, asteroids of C-type in the Tholen system or of the C-complex in the Bus-Binzel system are classified as one of the C, B, X, and T-types by our ANN tool. We think this is due to the featureless spectra of the B, C, X, and T-types. Some features, namely the slope between 0.55 and 0.86 µm, a shallow absorption in the UV, an absorption around 0.70 µm, and possibly an absorption around 0.85 µm (see Fig. 4), are the keys to distinguishing these four types. The confusion between the four types could result from the low signal-to-noise ratio of our data in the UV band and the lack of data above 0.83 µm. Nevertheless, we cannot rule out spectral variations of an asteroid arising from heterogeneous composition over its surface and/or changes in the observational geometry, e.g., the solar phase angle.

Spectral variation of asteroids at different rotational phases
In our data, 14 asteroids have two spectra each, and one asteroid has 9 spectra obtained on three adjacent nights. For the classification of asteroids, we built five ANNs that determine the type of each spectrum by voting. Checking the final type of each spectrum for the asteroids with more than one spectrum, we found that 13 asteroids received a consistent type and two asteroids received different labels (see Table 5). To understand the subtle spectral variation of an asteroid at different rotational phases, we chose the one ANN among our five whose results are most similar to the voted results. We then compared the maximum probability output by the chosen ANN for each spectrum of the asteroids with multiple spectra.
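The voting step and the selection of the most representative ANN can be sketched as below. The predicted labels are hypothetical, the function names are ours, and the tie-breaking rule (first label encountered among equally frequent ones) is an assumption, since the handling of ties is not specified in the text.

```python
from collections import Counter

def vote(labels):
    """Return the label predicted by the largest number of ANNs."""
    return Counter(labels).most_common(1)[0][0]

def most_representative_ann(per_ann_labels, voted_labels):
    """Index of the ANN whose labels agree most often with the voted labels."""
    agreement = [
        sum(a == v for a, v in zip(labels, voted_labels))
        for labels in per_ann_labels
    ]
    return max(range(len(agreement)), key=agreement.__getitem__)

# Hypothetical predictions: five ANNs (rows) for three spectra (columns).
per_ann_labels = [
    ["C", "T", "B"],
    ["C", "T", "X"],
    ["B", "T", "B"],
    ["C", "C", "B"],
    ["C", "T", "B"],
]
voted_labels = [vote(column) for column in zip(*per_ann_labels)]
best_ann = most_representative_ann(per_ann_labels, voted_labels)
```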
Of the 14 asteroids with two spectra each, 12 are classified as the same type for both spectra, while two asteroids, (696) and (780), received different labels (see Fig. 8). The maximum probability that a spectrum belongs to a certain type (column 3 of Table 6) can be used to investigate the extent of spectral variation of an asteroid, or among different asteroids of the same type. For the two asteroids (696) and (780), we checked the extinction effect on their observations. The airmasses for the two spectra of (696) on the two nights are 1.64 and 1.22, respectively, and the airmasses of (780) on the same nights are large (2.61 and 2.63). We therefore cannot rule out the possibility that the differing labels of the two asteroids arise from atmospheric effects, although the extinction correction was done with the average extinction curve of the Xinglong station.
For asteroid (469) Argentina, we compared its nine spectra obtained on three adjacent nights. Considering the maximum probability of each spectrum, eight spectra are labeled as T-type with high probabilities and one spectrum as C-type (see Table 6). The photometric data of Argentina show a lightcurve more complicated than a double-peaked sinusoid, implying multiple potential periods (17.57 h, 12.84 h, and 8.79 h) of brightness variation (Colazo et al. 2021; Wang et al. 2005; Warner 2007). Here, we selected a spin period of 8.79 h (this period gives a double-peaked lightcurve over a full rotation) to calculate the rotational phase (zero phase at JD 2454096.14340). Figure 7 shows the nine spectra of Argentina arranged from bottom to top with increasing rotational phase angle. From Fig. 7, the most significant variation occurs at the phase of 0.067 deg, and the spectral variations at other rotational phases could be related to the maximum probability of each spectrum in the ANN's output layer (see Table 6).
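The folding of observation epochs into rotational phase can be illustrated with the following sketch, which uses the spin period and zero-phase epoch quoted above and returns the phase as a fraction of one rotation between 0 and 1. The function name is ours, and corrections such as light-travel time are not included.

```python
def rotational_phase(jd, jd_zero=2454096.14340, period_hours=8.79):
    """Fraction of a rotation elapsed since the zero-phase epoch, in [0, 1)."""
    period_days = period_hours / 24.0
    return ((jd - jd_zero) / period_days) % 1.0

# Example: an epoch a quarter of a rotation after the zero-phase epoch.
quarter_turn = rotational_phase(2454096.14340 + 0.25 * 8.79 / 24.0)
```

Multiplying the returned fraction by 360 gives the phase as an angle in degrees, if that convention is preferred.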

SUMMARY
The main aim of our work is to build an ANN tool to classify the future slitless spectra of asteroids from the Chinese Space Survey Telescope (CSST). In addition, we made a taxonomic analysis of 64 unpublished spectra of 42 asteroids with the established ANN tool. The ANN tool is composed of five ANNs and gives the final result for a tested spectrum by voting. Each ANN consists of an input layer of 151 neurons, a hidden layer, and an output layer of 10 neurons (i.e., A, B, C, D, T, K, V, L, S, and X-type). Considering the wavelength coverage of the CSST's spectroscopic instrument, spectra of 834 asteroids selected from the SMASS II dataset and their labels from the Bus-Binzel taxonomic system are used to train our ANNs with different initial weights and biases.
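To make the network layout concrete, the sketch below runs a forward pass through a network with 151 inputs, one hidden layer, and 10 outputs. Only the input and output sizes and the class labels come from the text; the hidden-layer size of 32, the sigmoid transfer function, the softmax output, and the random weights are assumptions chosen for illustration, and no training is performed.

```python
import math
import random

CLASS_LABELS = ["A", "B", "C", "D", "T", "K", "V", "L", "S", "X"]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(z):
    """Convert raw output scores into probabilities that sum to 1."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (illustrative initialization)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(x, weights, biases):
    """Weighted sum plus bias for every neuron of a layer."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def forward(x, hidden, output):
    """Input -> hidden layer (sigmoid) -> output layer (softmax)."""
    h = [sigmoid(z) for z in dense(x, *hidden)]
    return softmax(dense(h, *output))

rng = random.Random(1)
hidden_layer = init_layer(151, 32, rng)   # hidden size 32 is an assumption
output_layer = init_layer(32, 10, rng)
probabilities = forward([1.0] * 151, hidden_layer, output_layer)
predicted = CLASS_LABELS[probabilities.index(max(probabilities))]
```

The largest entry of `probabilities` plays the role of the maximum probability discussed for the spectra with multiple observations.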
Following the resolution of the CSST spectroscopic data, the SMASS II spectra are re-sampled with a cubic spline method. To overcome the imbalance of sample numbers between the categories in SMASS II, we cloned samples of each type until the number of samples of each type reached 400. The final training data set thus contains a total of 4,000 spectra. The training procedure applies the mini-batch gradient descent (MBGD) algorithm. The accuracy of each of the five ANNs is higher than 92%.
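The balancing of the training set can be sketched as a simple oversampling routine. We assume here that a clone is an exact copy of a randomly chosen existing spectrum and that no class starts with more than 400 samples; whether the clones in this work were perturbed in any way is not stated, and the function name is ours.

```python
import random

def oversample_by_cloning(samples_by_class, target=400, seed=0):
    """Clone randomly chosen samples until every class has `target` samples.

    Classes that already hold at least `target` samples are left unchanged.
    """
    rng = random.Random(seed)
    balanced = {}
    for label, samples in samples_by_class.items():
        n_clones = max(0, target - len(samples))
        clones = [rng.choice(samples) for _ in range(n_clones)]
        balanced[label] = list(samples) + clones
    return balanced
```

With ten classes and a target of 400 samples each, the balanced set contains 4,000 spectra, matching the size of the training set quoted above.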
As an application of our ANN tool, 64 unpublished spectra of 42 asteroids are analyzed. Forty of the 42 asteroids are primitive small objects, as they are classified as one of the B, C, X, T, and D-types; the remaining two are K and S-type. For the first time, asteroid (1303) received a classification: based on our analysis, it is a D-type asteroid.
Comparing our labels of the asteroids with the previous labels, inconsistencies mainly occur (1) from C-type to X and B-type and (2) from P-type to D and T-type. These cases may be explained by the space weathering effect. Lantz et al. (2018) found that space weathering could turn an original C-type spectrum into a redder X-type or a bluer B-type spectrum depending on the composition of the surface, and that a D-type asteroid could become X-type or P-type.
The ANN technique gives us the opportunity to check the spectral variation of an asteroid along the rotational phase.For example, the classification results of 9 spectra of asteroid (469) show uniform composition along its rotational phases except for a single rotational phase indicating C-type.
From the ANN analysis, 40 primitive asteroids are classified as B, C, X, T, or D-type, reflecting the diversity of primitive asteroids. Investigating the visual slopes of the 40 asteroids, we found that the slopes (from a linear fit to the spectrum between 0.55 and 0.86 µm) of the B, X, T, and D-types lie in the ranges −3.5 to 0.11, 1.5 to 4.5, 4.6 to 6.3, and 4.9 to 9.9 %/1000 Å, respectively, which is consistent with the trend of the visual slope (also the dominant feature) of these four types. The C-type asteroids show a wide range of visual slopes. In detail, most values lie between −0.82 and 1.64, while 5 spectra (see Fig. 6 for 146, 414, and 508; and
Table 6. Classification results of asteroids with multiple spectra.
To identify a 0.7 µm absorption band in a spectrum, the linear trend of the spectrum, derived by fitting the two sections 0.55-0.58 µm and 0.83-0.86 µm, is removed. Here, we consider that a possible absorption band must have a concave shape with a maximum depth greater than 1 percent of the values of the two shoulders. With this criterion, the other observed spectra are checked: 10 out of 12 B-type spectra, 9 out of 17 C-type spectra (including the 5 spectra mentioned above), 2 D-type spectra, and 1 T-type spectrum show this absorption band feature. It seems that the proportion of B-type asteroids bearing hydrous minerals is the largest, followed by the C-type.
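The slope measurement and the 0.7 µm band criterion described above can be sketched as follows. Several choices are assumptions of ours: wavelengths are in µm, the reflectance is normalized to 1 at 0.55 µm, the continuum is removed by division rather than subtraction, and only the depth threshold is tested (the concave shape of the band is not checked).

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def visible_slope(wavelengths_um, reflectance):
    """Slope of a linear fit over 0.55-0.86 um, in %/1000 Angstrom."""
    pts = [(w, r) for w, r in zip(wavelengths_um, reflectance)
           if 0.55 <= w <= 0.86]
    slope_per_um, _ = linear_fit([w for w, _ in pts], [r for _, r in pts])
    # 1000 Angstrom = 0.1 um; the factor 100 converts to percent.
    return slope_per_um * 0.1 * 100.0

def band_depth_07(wavelengths_um, reflectance):
    """Maximum fractional depth below the shoulder-fitted continuum."""
    shoulders = [(w, r) for w, r in zip(wavelengths_um, reflectance)
                 if 0.55 <= w <= 0.58 or 0.83 <= w <= 0.86]
    slope, intercept = linear_fit([w for w, _ in shoulders],
                                  [r for _, r in shoulders])
    depths = [1.0 - r / (slope * w + intercept)
              for w, r in zip(wavelengths_um, reflectance)
              if 0.58 < w < 0.83]
    return max(depths)

def has_07_band(wavelengths_um, reflectance, threshold=0.01):
    """True if the band depth exceeds 1 percent of the continuum level."""
    return band_depth_07(wavelengths_um, reflectance) > threshold
```

For example, a spectrum rising linearly from 1.0 at 0.55 µm with a slope of 0.5 per µm corresponds to 5 %/1000 Å in these units.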
We are aware that the ANN tool presented here and used to analyze our spectral data needs to be improved to meet the future needs of the asteroid spectral data from the CSST survey. This is because the new data will contain important UV data (0.255-0.43 µm) and data from 0.83-1.0 µm, which are key to understanding primordial objects in the Solar System. More importantly, the CSST ten-year survey will provide spectroscopic data of numerous faint Solar System objects, which may significantly improve our understanding of the origin and evolution of the small bodies of the Solar System, and even of planetary systems.

Figure 1. The CCDs, filters, and gratings arrangement in the survey module of the CSST.

Figure 2. The structure of the artificial neural network. The network contains an input layer, a hidden layer, and an output layer. X represents the input, v and W represent weights, b represents biases, f represents the transfer function, and y represents the output.

Figure 3. The blue line is the original data of SMASS II, and the red line is the re-sampled data.

Figure 4. Input data/spectra for the ANN. Blue color shows the real and cloned spectra, and the red line represents the average spectrum for each class.

Figure 5. The confusion matrix. Each diagonal element (green) indicates the number of samples whose labels are consistent with the Bus-Binzel taxonomy class given by the row index, while the other elements in that row (yellow) show the number of samples mislabeled into the class corresponding to the column.

Figure 7. Spectra of (469) Argentina. The rightmost column lists the rotational phase folded with a period of 8.79 h. Each spectrum is offset for clarity.

Table 1. Number of samples for each taxonomic type in our data.

Table 2. ANN parameters.
Some additional parameters are introduced: batch size, number of epochs, learning rate, and the dropout layer ratio. Batch size is the number of samples in each batch when the training data set is randomly divided into smaller batches. One training round means training the ANN with one batch. When all batches have been used to update the weights and the biases, one epoch of training is complete. The number of epochs therefore determines how many times the training data set is randomly split. When setting this value, one needs to consider the computation time and the convergence of the training procedure.

Table 3. The classification accuracy of our ANN tool.

Table 4. Observation information for 42 asteroids.

Table 5. Classification results for 42 observed asteroids.