Taxonomy of Subkilometer Near-Earth Objects from Multiwavelength Photometry with RATIR

We present results from observations of 238 near-Earth objects (NEOs) obtained with the RATIR instrument on the 1.5 m robotic telescope at San Pedro Martir’s National Observatory in Mexico, within the framework of our multiobservatory, multifilter campaign. Our project is focused on rapid-response photometric observations of NEOs with absolute magnitudes in the range 18.1–27.1 (diameters of ≈600 and ≈10 m, respectively). Data with coverage in the near-infrared and visible range were analyzed with a nonparametric classification algorithm, while visible-only data were independently analyzed via Monte Carlo simulations and a 1-Nearest Neighbor method. The rapid response and the use of spectrophotometry allow us to obtain taxonomic classifications of subkilometer objects with small telescopes, representing a convenient characterization strategy. We present taxonomic classifications of the 87 objects observed in the visible and near-infrared. We also present the taxonomic distribution of an additional 151 objects observed in the visible. Our most accurate method suggests a nonfeatured-to-featured ratio of ≈0.75, which is consistent with the value found by the Mission Accessible Near-Earth Object Survey, which conducted a similar study using spectral analysis. The results from the Monte Carlo method suggest a ratio of ≈0.8, although this method has some limitations. The 1-Nearest Neighbor method proved unsuitable for NEO classification.


Introduction
Solar system minor bodies are tracers of the system's formation and evolution, and they can be used as current samples of such processes (Delsemme 1991; Malhotra 1997). In order to understand the processes occurring in our system, studies with numerous samples focused on analyzing colors, taxonomies, and orbital and physical properties of asteroids from different populations have been made (see, for example, Ivezić et al. 2001; Carvano et al. 2010; Carry et al. 2016). Near-Earth objects (NEOs), the minor bodies closest to us, originate from the asteroid main belt (MB) between the orbits of Mars and Jupiter. Hence, these surrounding objects inform us about the physical properties in the region of the gas giants (Bottke et al. 2002; Granvik et al. 2018) and enable us to take a closer look at the processes that occur in the solar system.
It is well known that an asteroid a few tens of meters in diameter can cause serious damage when impacting Earth (a recent example is the "Chelyabinsk Event" in 2013; see, for example, Brown et al. 2013). Given that there exist nearly a million objects of that size or larger (Trilling et al. 2017), a global effort is being made to prevent a catastrophe (see, for example, Planetary Defense Strategy and Action Plan Working Group 2023). Thus, the discovery and characterization of such objects has become a high priority.
Spectroscopy is the gold standard for measuring NEO composition (e.g., Binzel et al. 2019). Different taxonomies are defined by distinctive spectral features of the objects. However, spectroscopy is expensive in telescope time and aperture. In comparison, spectrophotometry requires less telescope time and can be performed with smaller telescopes. Spectrophotometry consists of photometric measurements in a few key spectral bands, hence collecting the light within a broad bandpass instead of dispersing it as in spectroscopy. It is worth mentioning that a historical drawback of spectrophotometry is that images in different filters were taken in series, so lightcurve effects needed to be considered; as we explain in the next section, this is not the case for the instrument we use. It has been shown that spectrophotometry can be sufficient to classify asteroids using a taxonomic scheme (e.g., Tholen 1984; Hahn & Lagerkvist 1988; Ivezić et al. 2001), where objects of a given taxonomy cluster in a color diagram, generally creating different zones for each taxonomy.
Recent observations of asteroids in near-Earth space (Mommert et al. 2016; Erasmus et al. 2017; Ieva et al. 2018; Lin et al. 2018; Perna et al. 2018; Binzel et al. 2019; Devogèle et al. 2019; Navarro-Meza et al. 2019) have shown that the taxonomy of NEOs is size dependent and different from that of the objects in the MB. Currently, we are unable to observe subkilometer objects in the MB, so a comparison between these and small NEOs is not possible. The aforementioned studies agree that rocky S and Q types (the latter being the unweathered version of the former), together with the nonfeatured C type and X type, constitute the majority of small NEOs. However, the percentage of each taxonomy in the distribution reported by each study can differ by a factor larger than 3: Devogèle et al. (2019) report 17% S types, while Ieva et al. (2018) estimate 61%. Several factors affect the different estimations: the size range, sample size, observational biases, taxonomies considered, the quality of the data, the methodology used, etc. To help resolve this discrepancy, the fraction of characterized NEOs needs to be increased. During the characterization process, it would be helpful to carry out studies with larger samples (a few hundred objects) using a homogeneous classification method. In order to compare with other studies, here we assume that X-type asteroids are of a nonfeatured nature, and we will refer to them as such. Nonetheless, we remark that this is only one of the different interpretations of these objects with a weakly featured spectral signature (see Clark et al. 2004, and references therein).
Here we present a combination of spectrophotometry and rapid-response observations, i.e., observations that are obtained shortly after the discovery of the target, when it is still relatively bright. This study is part of a worldwide systematic survey of NEO compositions (see Mommert et al. 2016; Erasmus et al. 2017; Navarro-Meza et al. 2019, for our previous results), aimed at increasing the number of small NEOs with physical information, which is of interest for science, industry, and planetary safety. The rapid-response approach is generally not feasible through classical observing proposals because classically scheduled major research facilities are heavily oversubscribed.
Section 2 describes our rapid-response approach and the instrument we used. Section 3 addresses our data selection and analysis, which are different for objects with visible-only data and objects observed in the visible and near-infrared (NIR) regions. In Section 4 we provide our results, and the corresponding discussion is given in Section 5. Finally, in Section 6 we present our conclusions.

Observations
We present data taken from 2014 to 2019. Observations were performed with the Reionization And Transients InfraRed camera (RATIR; Butler et al. 2012), on the San Pedro Martir 1.5 m telescope at the National Mexican Astronomical Observatory (Observatorio Astronómico Nacional). The instrument is equipped with two visible and two NIR detectors, all of them 2048 × 2048 pixels. Each of these detectors corresponds to a different channel with specific filters in the visible and NIR, as shown in Table 1. RATIR takes four images of an object in a single shot, enabling us to observe simultaneously with six different filters. Potential targets are identified and uploaded into the RATIR queue on a daily automated basis. Based on the Provisional Designation provided by the Minor Planet Center, accessible targets are identified among those NEOs that have been discovered within the last seven weeks; this duration is partially arbitrary, but the method usually leads to a number of well-observable and bright potential targets. A target is considered accessible if it has a visible brightness V ≤ 19 and an air mass ≤ 1.8, as provided by the JPL Horizons system (Giorgini et al. 1997), for at least the duration of the estimated RATIR integration time. A signal-to-noise ratio of 100 or larger was preferred, but values down to 50 were accepted for targets of interest. This value was estimated in the V filter.
Potential targets are manually selected from the list of accessible targets, prioritizing objects with high absolute magnitudes H mag (small sizes) and large values of H mag − V, where V stands for the apparent magnitude of the target on the upcoming night. A high value of H mag − V ensures that our target is observed when it is close to the Earth. RATIR queue observing scripts are automatically created for the selected targets, using the latest orbital elements of the objects of interest provided by JPL Horizons. The exposure time of each frame, as well as the total integration time in each band per visit, is a function of the object's brightness. Exposure times usually range between 5 and 30 s, while the total integration time per target is usually less than 1 hr. An object can have a single visit during our campaign or several, made on different days or all in one night. The maximum number of visits of a target reported here is four.
Our rapid-response approach is the key feature of this project. We trigger rapid-response spectrophotometric observations of NEOs within a few days of their discovery, when the objects are generally still bright. We can observe and characterize objects as faint as H mag ≈ 27 (equivalent to a 10 m diameter using an average NEO albedo of 0.28; Thomas et al. 2011).
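The size equivalence quoted above can be checked with the commonly used relation between absolute magnitude, geometric albedo, and diameter, D = 1329 km × 10^(−H/5) / √p. The text does not spell this relation out, so treat the following as an illustrative sketch; the function name is hypothetical.

```python
import math

def diameter_m(h_mag, albedo=0.28):
    """Diameter in meters from absolute magnitude H and geometric albedo.

    Assumes the commonly quoted relation D[km] = 1329 / sqrt(p) * 10**(-H/5).
    The default albedo is the average NEO value cited in the text.
    """
    return 1329.0e3 / math.sqrt(albedo) * 10.0 ** (-h_mag / 5.0)

# Limits and median of the absolute-magnitude range reported in this work.
for h in (18.1, 21.8, 27.1):
    print(f"H = {h:4.1f} -> D ~ {diameter_m(h):7.1f} m")
```

With the adopted albedo, the bright and faint limits of the sample come out close to the 600 m and 10 m figures given in the text.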
Image reduction and photometry are carried out using a pipeline developed for gamma-ray burst observations (see, e.g., Littlejohns et al. 2015; Becerra et al. 2017). The pipeline was adapted for working with asteroids: it removes sidereal sources to extract the NEO from the image and coadds in the moving frame of the target. The point-spread function determined from the sky image is then fit to the moving-target image, and the zero-point from the sky image is applied to normalize the photometry (see Navarro-Meza et al. 2019 for details). From the pipeline we obtain the magnitude and error of the observed object in each of the six broadband filters used, which gives us up to 15 different pair-wise astronomical colors. The error associated with each color is obtained by adding in quadrature the errors from each of the corresponding bands. The colors reported in this article were calculated in the Vega system and the solar component has been subtracted.
Our observations were planned in order to observe faint objects within our facility's capabilities. However, weather or other factors might interfere during our automated observations. Hence, we use several criteria in order to identify and reject photometric outliers. In the next section we describe the different methods we used to analyze our data. One of them makes use of different colors to classify each object, forming a multidimensional space, while the other makes a classification using only one color (i.e., it is one dimensional). The rejection method is very similar for all data; however, the method using several colors is more accurate, hence we imposed more relaxed limits on the selection of the data analyzed with it.
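As an illustration of how six bands yield 15 pair-wise colors with quadrature errors, consider the sketch below. The band labels (in particular J) and the magnitudes are assumptions made for the example, and the solar-color subtraction mentioned in the text is not included.

```python
from itertools import combinations
import math

def pairwise_colors(mags, errs):
    """Return {(a, b): (color, error)} for every pair of bands.

    color = m_a - m_b; the error adds the two band errors in quadrature.
    """
    out = {}
    for a, b in combinations(mags, 2):
        out[(a, b)] = (mags[a] - mags[b], math.hypot(errs[a], errs[b]))
    return out

# Hypothetical magnitudes and errors, for illustration only.
mags = {"r": 18.20, "i": 18.05, "Z": 17.98, "Y": 17.95, "J": 17.80, "H": 17.70}
errs = {band: 0.03 for band in mags}

colors = pairwise_colors(mags, errs)
print(len(colors), "colors")
print("r - i =", colors[("r", "i")])
```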
For all targets, we visually confirm the performance of the pipeline by verifying that the image resulting from the coaddition is clean: a well-defined source and the successful removal of background sources are required. The next lines describe the selection process of the objects analyzed with the multidimensional approach. We use a 10σ clipping algorithm applied to the measured colors: for all the objects of the sample under analysis, we estimated the variance of each color, and used the largest of them as the standard deviation for the σ clipping. If any of the colors of an object are not within the 10σ limit, all measurements from that object are removed. The 10σ clipping was carried out three times. Each time, the photometric errors from the individual measurements were propagated to obtain the weights for the average. Notice that this step is conservative since it uses the color with the largest, not the smallest, variance. Using the smallest could clip many good measurements in other colors. The rejection process has an additional stage, in which we exclude those objects outside the limits of the expected distribution of colors. We define the expected distribution as the range of values of a given color in the spectra we use as a reference (we define the reference spectra in the next section), plus one-third of this range on each side. For example, if the color a − b has values between 0.0 and 0.1 in the reference sample, we allow for values as low as −0.03 and as high as 0.13. We reject observations with an error in the color larger than 0.13. We tested this upper limit by giving it values between 0.07 and 0.13, finding no direct correlation between this value and the accuracy of the results for objects with measurements in five photometric bands. For objects with measurements in three bands, the accuracy in the classification is negatively correlated with the photometric error. However, in all cases, even objects with large errors can be effectively classified as nonfeatured or featured. For example, object 2016 FB (see Table 4) has a Z − Y error of 0.127, but the probability of it being a featured object is 85%.
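The second rejection stage (reference range plus one-third of that range on each side, together with the cut on the color error) can be sketched as follows. The function names are hypothetical, and the 10σ clipping stage is not reproduced here.

```python
def expected_limits(ref_min, ref_max, margin_fraction=1.0 / 3.0):
    """Allowed color interval: reference range padded by a fraction of its span."""
    pad = margin_fraction * (ref_max - ref_min)
    return ref_min - pad, ref_max + pad

def keep_measurement(color, color_err, ref_min, ref_max, max_err=0.13):
    """True if the color lies in the padded range and its error is acceptable."""
    lo, hi = expected_limits(ref_min, ref_max)
    return (lo <= color <= hi) and (color_err <= max_err)

# Example from the text: a reference range of 0.0 to 0.1.
print(expected_limits(0.0, 0.1))
print(keep_measurement(0.12, 0.05, 0.0, 0.1))
```

Making `max_err` a parameter mirrors the test described above, in which the upper limit was varied between 0.07 and 0.13.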
For objects classified with the 1D approach, their r − i color is obtained with a weighted average of the individual measurements as described in Navarro-Meza et al. (2019); then a limit on the color index's error due to a photometric uncertainty of 0.07 was set. As explained below, the 1D analysis is made with a fit considering the characteristic colors from each of the considered types in the sample. Objects with a measured color outside the limits of the color range in the reference sample (with a margin of 0.07 for the propagated photometric error) are excluded. We require a minimum of four measurements per visit in our observations. The outcome of this selection process is 87 different objects for the multidimensional analysis. The median of the propagated error in color for these objects is 0.052. For the 1D analysis we have 151 different objects, with a median of the propagated error in color of 0.02, making a total of 238 objects.
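A weighted average of repeated r − i measurements can be sketched as below. The text defers the exact weighting to Navarro-Meza et al. (2019); inverse-variance weights are assumed here purely for illustration, and the function name is hypothetical.

```python
import numpy as np

def weighted_mean_color(colors, errors):
    """Inverse-variance weighted mean of color measurements and its error.

    Assumption: weights are 1/error**2; the source does not state the scheme.
    """
    colors = np.asarray(colors, dtype=float)
    weights = 1.0 / np.asarray(errors, dtype=float) ** 2
    mean = np.sum(weights * colors) / np.sum(weights)
    mean_err = 1.0 / np.sqrt(np.sum(weights))
    return mean, mean_err

# Hypothetical r - i measurements from one visit (at least four are required).
mean, err = weighted_mean_color([0.14, 0.17, 0.15, 0.16], [0.03, 0.05, 0.03, 0.04])
print(f"r - i = {mean:.3f} +/- {err:.3f}")
```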

Data Analysis
We use the MITHNEOS database as a reference for the asteroids' colors. This is one of the largest samples of measured NEO spectra (Binzel et al. 2019; data available at http://smass.mit.edu/minus.html). To obtain the characteristic color of each taxonomic type, its reflectance spectrum is convolved with the spectral response of each of RATIR's filters and the solar spectrum. Details of the process are discussed by Mommert et al. (2016, Section 3). Hereafter, we refer to this database as the "Control Sample." We refer to all objects we observed with RATIR as the "Observed Sample." In an ideal case, the Control Sample would cover the same wavelengths as RATIR's broadband filters. However, not all the objects in the Control Sample have observations in the same wavelength range, and not all the objects from the Observed Sample have valid measurements in all the considered bands. To address this, we consider those objects from the Control Sample that cover at least half of the wavelength range of a given RATIR filter. For this reason, the size of the Control Sample varies for the different cases described below, and in each case its size is specified.

Taxonomic Classification
We base our classification system on the Bus-DeMeo scheme (DeMeo et al. 2009). This is a widely used taxonomic scheme that combines the visible and NIR ranges, covering from 0.45 to 2.45 μm. The taxonomy includes 24 classes, most of which are subtypes of the C, S, and X complexes (see DeMeo et al. 2009). To classify an object in the Bus-DeMeo scheme formally, it is required to take equally spaced spectral measurements in the referred wavelength interval. The distinct taxonomies in this scheme are created based on the spectral features of the objects, while the spectral slopes are not considered. Here we make a "translation" of the Bus-DeMeo scheme (or the Bus scheme in Section 3.3) to an equivalent scheme based on spectral slopes. In this way we are able to assign taxonomies consistent with Bus-DeMeo based on photometric measurements.
The Control Sample includes spectra from most of the classes in the Bus-DeMeo scheme. We train our k-Nearest Neighbors (k-NN) algorithm on the characteristic colors of each type (see the next subsection), and only consider taxonomies that have at least five spectra to provide sufficient training data. Based on this constraint and test runs, we decided to use the S, C, X, and Q taxonomic complexes, but not to make a distinction between their individual subclasses; we also considered D- and V-type objects. During test runs, we considered the inclusion of L-type objects, finding a fraction of them consistent with previous studies. However, our method showed low accuracy in identifying such types, with a tendency to create false positives, hence we do not consider them in our classification scheme.

Nearest Neighbors, Nonparametric Classification
We base our analysis on k-NN, a supervised classification/regression algorithm implemented in Python via scikit-learn (Pedregosa et al. 2011). We recall that not all the observed objects have valid measurements in the same number of bands. For example, the colors involving the H band do not meet the criteria explained in Section 2, hence we decided not to use this band. We use k-NN and cross-validation via GridSearchCV from scikit-learn to study the training sample and determine which combination of bands is most effective for classifying the objects. With the use of GridSearchCV, our k-NN algorithm uses 80% of the Control Sample to train itself and 20% to evaluate the training by reclassifying the objects, keeping track of its success when doing so. This is done four times (four-fold cross-validation), shuffling the sample and taking different specific objects every time. The average success rate in assigning the correct classification is the so-called "test accuracy." We judge the different filter combination options by considering the outputs from GridSearchCV and the number of observed objects with valid measurements for each of the combinations. Table 2 shows the three combinations selected and their structure: photometric bands, number of objects classified, k, training sample accuracy, and test sample accuracy. We call each of these combinations "Sets." Notice that the selection of the Sets is dependent on our Observed and Control Samples, thus it is not a generalized recommendation of filters. The so-called confusion matrix for each Set is also shown in Figure 1. These matrices show how many objects of each kind were accurately classified, as well as the false positive and false negative classifications.
We use subsets of the Control Sample to train the algorithm. These subsets are chosen in order to match (in wavelength range) the Sets of our Observed Sample, and they constitute the Training Sample for our algorithm. The algorithm learns the regions of each taxonomic class in an N-dimensional space, where N depends on the number of bands that constitute a particular Set (see Table 2). The objects from the Observed Sample that do not fulfill the selection criteria for Set 1 are returned to the pool of observed data to be considered by Set 2. Equivalently, those not suitable for Set 2 are considered for Set 3. This last one is constituted by the r − i color and its analysis is described in Section 3.3.1. All the objects from the Observed Sample that meet the selection criteria from Section 2 belong to a Set.
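A minimal sketch of a k-NN grid search with four-fold cross-validation in scikit-learn is given below. The synthetic three-class "colors", the grid of k values, and the use of StratifiedKFold (which holds out 25% per fold rather than the 20% quoted above) are assumptions made for illustration, not the configuration used in this work.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for a training sample: 3 classes in a 3-color space.
centers = np.array([[0.0, 0.0, 0.0], [0.3, 0.1, 0.2], [0.1, 0.4, -0.1]])
X = np.vstack([c + 0.08 * rng.standard_normal((40, 3)) for c in centers])
y = np.repeat(["C", "S", "V"], 40)

# Standardize the colors, then classify with k-NN; k is tuned by the grid search.
pipe = make_pipeline(StandardScaler(), KNeighborsClassifier())
grid = GridSearchCV(
    pipe,
    param_grid={"kneighborsclassifier__n_neighbors": [1, 3, 5, 7, 9]},
    cv=StratifiedKFold(n_splits=4, shuffle=True, random_state=0),
    return_train_score=True,
)
grid.fit(X, y)

best_k = grid.best_params_["kneighborsclassifier__n_neighbors"]
test_accuracy = grid.best_score_  # mean accuracy over the held-out folds
print(best_k, round(test_accuracy, 3))
```

Repeating such a search for each candidate band combination, and weighing the scores against the number of observed objects available in that combination, is one way to arrive at a choice like the Sets described above.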
Once the algorithm is trained, it is fed with a Set from the Observed Sample and the classification of each object is returned according to its position in the N-dimensional color space. We describe this procedure for the case of Set 1: from the Control Sample, we extract 188 spectra with a wavelength range that produces 10 different colors, which we use to train our code using a 3-NN. This is equivalent to a 10D space, in which each of the taxonomies considered occupies a characteristic region. Then we feed the code with our observed objects with measurements in Set 1 and the algorithm determines the classification of each member according to its position in this 10D space. Of the 170 spectra that constitute the Training Sample for Set 1, 12 belong to the C or X complexes, 95 to the S complex, and 50 are classified as Q types, five as D types, and eight as V types. For Set 2 we follow an equivalent procedure, this time using a 7-NN, and with 239 elements in the Training Sample: 17 are classified as members of the C or X complexes, 114 belong to the S complex, and there are 85 Q types, eight D types, and 15 V types. The data were standardized prior to the k-NN analysis by using StandardScaler from scikit-learn. Figure 2 shows an example of the regions corresponding to each taxonomy in a 2D space. The plot shows that the V region is clearly defined, while the borders between the other regions are uncertain. Therefore, objects with a measured color close to those borders will have some degree of uncertainty in their classification. Such uncertainty is decreased when considering more colors (dimensions). This figure is for illustration purposes, since it only corresponds to a "slice" from our 10D model.
Notes to Table 2: The training and testing accuracies are the average of such parameters over the four folds considered by our method for the indicated k (Section 3.2). Set 3 is analyzed with a Monte Carlo (MC) approach (Section 3.3.1). (a) Notice that for this case, the test accuracy is defined in terms of the model's ability to distinguish between nonfeatured and featured objects, rather than individual types (see Section 3).
The true a − b color from one of our objects for any a and b bands is expected to be in the interval [α − ς, α + ς] at the 1σ confidence level, where α is the measured value of a − b and ς is the photometric error derived by the pipeline. In order to account for the uncertainty in our classification scheme, 10,000 copies of each Complete Set were generated and classified. From this, we can obtain the likelihood of an object being assigned to different taxonomies, as well as the error in the taxonomy distribution of the Set. The 10,000 copies of each object in the Set were generated under a Gaussian distribution with a mean α equal to the measured value of each color and a standard deviation equal to ς. For objects with multiple visits, the entire procedure is carried out independently for each visit and the resulting likelihoods are averaged. The outcome of this approach is discussed in the next section.
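The cloning step can be illustrated with a small NumPy sketch: each measured color vector is resampled from Gaussians set by its photometric errors, every clone is classified by k-NN, and the class frequencies serve as likelihoods. The two-color training clusters below are hypothetical, the k-NN is a bare-bones stand-in for the scikit-learn classifier, and standardization is omitted for brevity.

```python
import numpy as np

def knn_predict(train_X, train_y, X, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    dist = np.linalg.norm(X[:, None, :] - train_X[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    classes = np.unique(train_y)
    votes = (train_y[nearest][:, :, None] == classes[None, None, :]).sum(axis=1)
    return classes[np.argmax(votes, axis=1)]

def class_probabilities(colors, errors, train_X, train_y,
                        n_clones=10_000, k=3, seed=0):
    """Fraction of Gaussian clones of one object assigned to each class."""
    rng = np.random.default_rng(seed)
    clones = rng.normal(loc=colors, scale=errors, size=(n_clones, len(colors)))
    pred = knn_predict(train_X, train_y, clones, k=k)
    classes, counts = np.unique(pred, return_counts=True)
    return dict(zip(classes.tolist(), (counts / n_clones).tolist()))

# Hypothetical two-color training sample with two taxonomic classes.
rng = np.random.default_rng(1)
train_X = np.vstack([
    rng.normal([0.10, 0.00], 0.02, size=(30, 2)),  # stand-in "C" colors
    rng.normal([0.25, 0.10], 0.02, size=(30, 2)),  # stand-in "S" colors
])
train_y = np.array(["C"] * 30 + ["S"] * 30)

probs = class_probabilities(np.array([0.22, 0.08]), np.array([0.05, 0.05]),
                            train_X, train_y)
print(probs)
```

For an object with several visits, the same function would be called once per visit and the resulting dictionaries averaged, as described above.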

Monte Carlo Simulations with Synthetic Objects
Observations with visible-only data (Set 3) incorporate data from two bands (one color, thus 1D) and it is not possible to perform a robust analysis like the one described above. For these data, we use a modification of the model presented in Navarro-Meza et al. (2019). The mathematical description is below, but in plain words, all the objects from Set 3 are analyzed in bulk: we build a distribution using the measured r − i color of each object and its error; then, using synthetic samples of asteroids, we find the most likely classification (C or S type) of our distribution. This result applies to the whole Set, and no classification of individual objects is obtained.
The model presented in Navarro-Meza et al. (2019) is used to analyze data in one dimension and is based on kernel density estimation (KDE) with a Gaussian kernel; here we make slight modifications. The general procedure is as follows. We consider every observation as a normalized Gaussian centered on its r − i value and with a width equal to the r − i error. To account for objects observed in different telescope visits, the Gaussian's normalization factor is divided by the number of visits each object has. The Gaussians from all the objects are added up to build a probability density function (PDF). Then we perform MC simulations with synthetic asteroid samples in order to find a curve that resembles the PDF; hence, we can approximate the taxonomy distribution of the observations. To form the synthetic population we use the types that have a representation of more than 1% in the Control Sample, and which belong to one of the taxonomy complexes in the Bus scheme, namely B, C, X, S, Sr, Sw, Q, and Sq. The number of objects belonging to these taxonomies is set with a pseudorandom number generator, which is also used to assign an error to the color of each synthetic object. The total number of objects per synthetic sample is the same as the number of observed objects we aim to fit.
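The construction of the PDF from per-visit Gaussians can be sketched as follows. Each row is one visit, and the weight 1/n_visits keeps the total contribution of every object equal to one. The numbers are hypothetical and the function name is an assumption.

```python
import numpy as np

def build_pdf(grid, colors, errors, n_visits):
    """Sum of normalized Gaussians, one per visit, each weighted by 1/n_visits.

    colors, errors: r - i value and error of each visit.
    n_visits: number of visits of the object that each visit belongs to.
    """
    colors = np.asarray(colors, dtype=float)[:, None]
    errors = np.asarray(errors, dtype=float)[:, None]
    weights = 1.0 / np.asarray(n_visits, dtype=float)[:, None]
    gauss = np.exp(-0.5 * ((grid[None, :] - colors) / errors) ** 2)
    gauss /= errors * np.sqrt(2.0 * np.pi)
    return (weights * gauss).sum(axis=0)

# Two hypothetical objects: one with a single visit, one with two visits.
grid = np.linspace(-1.0, 1.5, 5001)
pdf = build_pdf(grid,
                colors=[0.10, 0.20, 0.22],
                errors=[0.02, 0.03, 0.03],
                n_visits=[1, 2, 2])

# The area equals the number of objects; divide by it for a unit-area PDF.
area = pdf.sum() * (grid[1] - grid[0])
print(round(area, 3))
```

The same function can be applied to a synthetic sample to obtain the curve that is compared against the observed PDF.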
An MC simulation requires a distribution with information from the system. For this, we use four Gaussian distributions based on the percentage of each complex in the Control Sample. The percentages are scaled to the sum of all the complexes (≈90%). This is equivalent to our model's assumption that all the objects belong to a complex. We give room for testing other distributions by means of the standard deviation of the Gaussians. In this paper, we define a diagnostic ratio of nonfeatured to featured asteroids, described further below. In this space, which here we write as $(C+X)/(S+Q)$, our simulations are centered on 0.26 and cover the range obtained when considering three standard deviations from the mean of the distribution. With each synthetic sample, we build a density function as described at the beginning of this section. We call the result a random probability density function (RPDF) to distinguish it from the PDF of our data. A total of $N_{\mathrm{sim}}$ synthetic distributions are made and the reduced chi-squared ($\chi_r^2$) between the PDF and RPDF is calculated. This methodology has two classes of variables: the taxonomic distribution and the errors of the synthetic objects. The latter are needed for the model, but they do not contribute useful information to the results. To measure the accuracy of the model, and the influence of both sets of variables, we input the Control Sample into the MC model, trying to retrieve the known taxonomic distribution. We observed that the PDF and RPDF can converge via the synthetic errors; hence, a low $\chi_r^2$ is only a proxy and does not necessarily represent an accurate solution for the taxonomic distribution. To avoid interference from the synthetic errors, we used the 100 simulations with the lowest $\chi_r^2$, but we analyzed independently the C, X, S, and Q distributions from these top cases with KDE. The width of the Gaussian is very important for building a representative probability function. For the Control Sample, we first used Python's standard bandwidth for a 1D Gaussian KDE (Kern 2004), with bandwidth = $n^{0.25}$, where n is the sample size. In this way, we retrieved the $(C+X)/(S+Q)$ ratio from the Control Sample with 99.94% accuracy. However, this bandwidth does not include information from the data, other than its size. Since our Observed Sample has large errors, we consider it important to use a bandwidth that uses information from the standard deviation of the sample. Hence, we decided to use "Silverman's rule of thumb" (Silverman 1986; Equation (1)), where σ is the standard deviation of each taxonomic distribution for the top 100 cases. This value is well below the limit for oversmoothing (Scott 2004; Equation (9)). Using Equation (1), we retrieved the $(C+X)/(S+Q)$ ratio from the Control Sample with 89.2% accuracy. While working with the Control Sample, we observed that Q objects can be classified as X, given the proximity of their r − i color indices. This behavior can be observed in the upper panel of Figure 5. There, the curve corresponding to the Q types deviates the most from the expected value, and both the Q- and X-type curves are bimodal. The two modes in both curves have approximately the same separation, indicating that Q types are commonly misclassified as X and vice versa. In terms of the $(C+X)/(S+Q)$ ratio, this behavior produces an underestimation of Q types, which is balanced out by an overestimation of S types, reasonably preserving the value of the ratio. We also observed that $N_{\mathrm{sim}} = 10^6$ is sufficient to obtain a reliable solution. The sizes of the Observed and Control Samples are similar, therefore we use the same number of simulations for extracting results from our data.
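The expression for Equation (1) is not reproduced in this text. As a hedged illustration only, one widely quoted form of Silverman's rule of thumb for a 1D Gaussian kernel is h = (4σ⁵ / 3n)^(1/5) ≈ 1.06 σ n^(−1/5); whether this exact variant is the one adopted here is an assumption. The sample below is synthetic.

```python
import numpy as np

def silverman_bandwidth(samples):
    """Rule-of-thumb bandwidth h = (4 * sigma**5 / (3 * n)) ** (1/5).

    Assumption: this is one common form of Silverman's rule; the variant
    used in the source is not shown in the text.
    """
    x = np.asarray(samples, dtype=float)
    sigma = x.std(ddof=1)  # sample standard deviation
    return (4.0 * sigma ** 5 / (3.0 * x.size)) ** 0.2

# Synthetic stand-in for one taxonomic fraction over the top 100 simulations.
rng = np.random.default_rng(0)
top_cases = rng.normal(0.30, 0.05, size=100)
h = silverman_bandwidth(top_cases)
print(f"bandwidth = {h:.4f}")
```

Unlike a bandwidth that depends only on n, this one scales with the spread of the sample, which is the property motivating its use above.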

1-Nearest Neighbor
We explore the use of the method of Erasmus et al. (2020) to distinguish between C- and S-type objects. The method can be understood as a simplification of the MC approach previously explained. A total of 10,000 clones are generated under a Gaussian distribution centered on the measured color and with a width equal to the propagated photometric error of the r − i color. The taxonomy of each synthetic object is determined by its relative distance to the characteristic colors of the C and S types. The classification of each of the observed objects is taken as the class with the highest probability, based on the classification of each of the clones.
The MC procedure presented a relative error ∼ 0.1, while the 1-NN model presented a relative error ∼ 2.9. The higher discrepancy using 1-NN is due to the characteristic color of the Q complex. Q and Sq types have characteristic colors between those of the X and S complexes. Hence, in a binary 1-NN approach, the classification of members of the Q complex will be split between X and S types, although their features make them more similar to the S types. 1-NN tests were made including the possibility of an object being classified as a Q type; however, the accuracy did not improve. We decided not to use this method for our classification.
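The clone-based 1-NN assignment described in this subsection can be sketched in one dimension as follows. The characteristic r − i values used here are placeholders, not the values derived from the Control Sample.

```python
import numpy as np

def one_nn_probability(color, error, char_colors, n_clones=10_000, seed=0):
    """Fraction of Gaussian clones whose nearest characteristic color is each type."""
    rng = np.random.default_rng(seed)
    clones = rng.normal(color, error, size=n_clones)
    names = list(char_colors)
    centers = np.array([char_colors[name] for name in names])
    nearest = np.argmin(np.abs(clones[:, None] - centers[None, :]), axis=1)
    counts = np.bincount(nearest, minlength=len(names))
    return {name: float(c) / n_clones for name, c in zip(names, counts)}

# Placeholder characteristic colors for the two classes (illustrative only).
probs = one_nn_probability(0.19, 0.03, {"C": 0.10, "S": 0.20})
best_type = max(probs, key=probs.get)
print(probs, best_type)
```

Because the decision boundary sits halfway between the two characteristic colors, any type whose color lies between them (as noted above for Q and Sq) is split between the two classes.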

Results
Here we report spectrophotometric observations of asteroids within an absolute magnitude (H mag ) range of 18.1 to 27.1, with a median of 21.8.Using the average albedo for NEOs (Thomas et al. 2011), this is equivalent to a diameter range from approximately 600 to 10 m, respectively.We observed them using RATIR, a multiimager with four channels, able to observe in the visible and NIR wavelengths at the same time.Different authors have taken different approaches and considerations while studying small NEOs.Among the factors with a larger effect on the results (ergo the comparison between different ones) are the taxonomies that are considered.For this reason we present the individual classification of each asteroid and the taxonomic distribution when possible, but we also present the results as the ratio of nonfeatured to featured asteroids, roughly equivalent to the "carbonaceous" to "rocky" types, with where C, X, Q, and S correspond to the complexes and all types within them.Taxonomies not considered under a particular model have a value equal to zero.Notice that nonfeatured objects tend to have a low albedo and a relatively similar color among them, while featured objects tend to be brighter, with similar colors among them as well.
We present our results in Table 3 and Figures 3, 4, and 5. Table 3 shows the taxonomic distribution found for the 87 objects analyzed with the multidimensional approach that have data in the visible and NIR range, equivalent to fraction = 0.75 ± 0.16. Figure 3 shows a graphical representation of such results, also including the distribution found with a 1D approach for objects with visible-only data.Figure 4 compares our results by taxonomic type with the latest study of subkilometer NEOs (Devogèle et al. 2019) and the largest study to date of this kind (Binzel et al. 2019).Similarly, Figure 3 compares the values we find for fraction with the corresponding values found in similar studies.Table 4 shows the objects most successfully classified with this method, which have a probability of belonging to a given class of P > 50%.Table 5 is equivalent to Table 4, but this time showing objects with lower classification probabilities.These tables display the probability of an object to be assigned into each of the main taxonomies, where the highest of them is denoted in bold.The last column refers to the Set that was used to estimate the probabilities: 1 is the 10D model, 2 is the 3D model, while 3 is the 1D model, all of them defined in Table 2.
For the objects with only an r − i color available, we estimate the result from the 100 cases with the lowest χ² out of the 10^6 runs in the MC simulation. A KDE is built for the C-, X-, S-, and Q-type distributions in such cases, as explained in Section 3.3.1. These distributions are shown in the bottom panel of Figure 5 and suggest that our sample is formed by 32% C, 10% X, 20% S, and 32% Q types, and 5% something else, producing fraction = 0.8 ± 0.69.
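As a quick arithmetic check of the ratio quoted above, the following minimal sketch assumes that fraction is defined as (C + X)/(S + Q), which is the form the ratio in the Figure 5 caption appears to describe. The function name is hypothetical and the inputs are the rounded percentages from the text, so the result only approximates the published value.

```python
def nonfeatured_to_featured(c, x, s, q):
    """Return (C + X) / (S + Q) for fractions or counts of each complex."""
    featured = s + q
    if featured == 0:
        raise ValueError("S + Q must be nonzero")
    return (c + x) / featured


# Rounded MC percentages quoted in the text for the visible-only sample.
ratio = nonfeatured_to_featured(c=32, x=10, s=20, q=32)
print(round(ratio, 2))
```

The roughly 5% of objects assigned to other classes do not enter the ratio, consistent with the statement that taxonomies not considered have a value of zero.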

Discussion
Five broadband filters (10 colors) are enough to build a robust model able to distinguish between the C, X, Q, S, D, V, and L types. The accuracy and level of detail of the classification depend on the size and completeness of the training sample, meaning that it must contain several members of each taxonomy considered and that the wavelength coverage of the spectra must be comparable to that of the broadband filters used. We find that a model with five bands is robust enough to allow for large photometric errors without affecting the general quality of the results.
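The count of 10 colors from five bands follows from taking every unordered pair of bands once. The sketch below illustrates this with the band names given in the note to Table 2; the ordering of each pair (bluer minus redder) is our assumption for illustration.

```python
from itertools import combinations

# Bands named in the note to Table 2, ordered from bluest to reddest.
bands = ["r", "i", "Z", "Y", "J"]

# Each unordered pair of bands yields one color (bluer minus redder),
# so r - i is counted but its mirror i - r is not.
colors = [f"{blue} - {red}" for blue, red in combinations(bands, 2)]

print(len(colors))
print(colors)
```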
We obtained our objects' Jovian Tisserand parameter (T J) from JPL's Small-Body Database search engine. Ten of our targets have T J < 3. Of those, four were assigned a "C + X" classification, four are classified as S, one as Q, and one as D. The fraction of featured objects is higher than expected; in fact, the percentage of each class roughly agrees with the percentages reported in Table 3. However, 10 objects is quite a small sample.
Devogèle et al. (2019) report that objects classified as Q, A, O, and K type using chi-square fitting to measured spectra are classified as S type if PCA is used instead. Our classification method is based on the classified spectra reported by Binzel et al. (2019), who make use of PCA. This explains why our k-NN result for S in Figure 4 is higher than the one from Devogèle et al. (2019), while our values for Q are lower than theirs.
Table 2 gives a measurement of the accuracy of the k-NN approach. It suggests that the accuracy for Set 2 is relatively low; however, as explained in Section 3.2, by using the photometric errors, we made copies of the observations to "spread" the measurements around the true value of the color. In this way, we are able to obtain highly reliable classifications for some objects. We also observed that the dependence of the result on the observational error weakens as the number of dimensions of the model increases. For the k-NN algorithm, we measure this dependence by counting the number of objects with a taxonomic probability higher than 60% and 90% as a function of the maximum error on the color allowed for the objects. The proportions from Set 1 (10D) were not significantly affected when the error limit was allowed to range from 0.07 to 0.13. For Set 2 (3D), adding objects with an error in the color higher than 0.07 produced a less reliable classification. This is not always the case, however, since an object could have a large error in the filter with the least sensitivity but smaller errors in the rest.
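The idea of spreading each observation according to its photometric errors and classifying every copy can be sketched as follows. This is a minimal illustration, not the pipeline of Section 3.2: the Gaussian perturbation, the number of copies, the function names, and the toy training set are all assumptions, and only k = 3 is taken from the Figure 2 caption.

```python
import math
import random
from collections import Counter


def knn_label(point, training, k=3):
    """Majority label among the k training points closest to `point`."""
    nearest = sorted(training, key=lambda item: math.dist(point, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def class_probabilities(colors, errors, training, n_copies=1000, k=3, seed=0):
    """Fraction of perturbed copies of one observation assigned to each class."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_copies):
        # Assumption: each copy is drawn from Gaussians centered on the
        # measured colors, with the photometric errors as standard deviations.
        copy = [rng.gauss(c, e) for c, e in zip(colors, errors)]
        votes[knn_label(copy, training, k)] += 1
    return {label: count / n_copies for label, count in votes.items()}


# Hypothetical 3D training set (three colors per object); values are made up.
training = [
    ((0.10, 0.05, 0.20), "C+X"),
    ((0.12, 0.06, 0.22), "C+X"),
    ((0.11, 0.04, 0.19), "C+X"),
    ((0.20, 0.00, 0.35), "S"),
    ((0.22, -0.01, 0.36), "S"),
    ((0.21, 0.01, 0.34), "S"),
]
probs = class_probabilities((0.13, 0.04, 0.23), (0.05, 0.05, 0.05), training)
print(probs)
```

The returned fractions play the role of the per-class probabilities listed in Tables 4 and 5: an object with small errors concentrates its copies in one class, while large errors spread them over several.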
We cross-checked the efficiency of the k-NN method with the 17 objects observed in multiple visits. Most of them obtained a consistent classification in up to four visits. There were four objects for which the classifications based on two different visits were inconsistent. The observations in such cases had poor quality, so this inconsistency is likely due to poor photometric measurements rather than to flaws in the method.
Figure 3. Comparison of the ratio of nonfeatured to featured objects (see text) found for the 87 objects analyzed with k-NN and the 151 objects analyzed with the MC approach. Also shown are the equivalent values from UKIRT (Mommert et al. 2016), MITHNEOS (Binzel et al. 2004, 2019), MANOS (Devogèle et al. 2019), and KMTNET (Erasmus et al. 2017). In all cases, we show the time span of the observations that yield the result. The value from MITHNEOS is shown as recalculated by Devogèle et al. (2019), and their sample also considers asteroids of larger sizes.
The values presented in Table 3 are obtained by adding up the probability of all objects that belong to each taxonomy. This is equivalent to adding the probability columns in Table 3 and dividing by the number of objects. This procedure is statistically more correct, since it yields the distribution of the sample as a whole instead of using the result of each individual object to build it. The difference in the percentage of each taxonomy obtained by the two different procedures is lower than 3% and the value of fraction differs only by ∼1%. The small difference between these values suggests that our model is robust, despite the fact that some of the objects in Table 3 have large uncertainties in their classification.
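The two procedures compared above can be illustrated with a short sketch. The probability rows below are hypothetical values chosen only to show the mechanics; they are not taken from Tables 4 or 5.

```python
# Hypothetical per-object class probabilities (each row sums to 1).
objects = [
    {"C+X": 0.70, "S": 0.20, "Q": 0.10},
    {"C+X": 0.40, "S": 0.35, "Q": 0.25},
    {"C+X": 0.10, "S": 0.55, "Q": 0.35},
    {"C+X": 0.30, "S": 0.30, "Q": 0.40},
]
classes = ["C+X", "S", "Q"]

# Procedure 1: add each probability column and divide by the number of objects.
summed = {c: sum(obj[c] for obj in objects) / len(objects) for c in classes}

# Procedure 2: count each object once, under its most probable class.
counted = {c: 0.0 for c in classes}
for obj in objects:
    counted[max(obj, key=obj.get)] += 1 / len(objects)

print(summed)
print(counted)
```

In this toy case the two procedures differ noticeably because several objects have no dominant class; the first procedure keeps that ambiguity in the sample distribution instead of discarding it.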
The results from the 1D MC algorithm have a value of fraction that is consistent with the one obtained with multiple colors. However, we recall that this method provides a taxonomic distribution only of the sample, not the classification of individual objects. We highlight that any method using only one variable to classify objects must be used with caution and analyzed thoroughly.
In Erasmus et al. (2020), the 1-NN method was used to distinguish between C and S types in asteroid families. They also used two broadband filters in the visible region; however, most of the members of those families are expected to be C or S types. In near-Earth space, the population of Q types is significant, and the characteristic r − i color of Q and Sq types lies between those of X and S types. Therefore, we do not consider the 1-NN method reliable for studying NEOs in r − i space.

Limitations and Comparison with Other Studies
Our observations are biased in favor of high-albedo objects. At a given distance from Earth, for objects of a given diameter, those with a higher albedo are easier to observe. For targets close to our limiting magnitude, only those with relatively high albedos will be observed. Hence, our sample may contain a higher fraction of S and Q objects than the actual asteroid population. A similar bias is imposed by the minimum orbit intersection distance (MOID), as remarked by Devogèle et al. (2019). For small objects, only those that get close to Earth (small MOID) will be visible with a small, ground-based telescope. The close encounter of an asteroid with Earth can cause the rejuvenation of its surface by tidal forces, reducing the effects of space weathering (Binzel et al. 2004). This phenomenon could increase the real proportion of Q types observed from Earth (the "nearest Earth objects"), but not necessarily in the whole NEO population.
Having access to the MITHNEOS sample (Binzel et al. 2019) allowed us to refine our MC model with respect to Navarro-Meza et al. (2019). The 1D analysis performed here includes the objects reported in Navarro-Meza et al. (2019), but we consider the slightly updated approach here to be more conservative and correct. The main limitation of this model is that it obtains the taxonomy distribution of the observed sample, but it is unable to classify objects individually.
Regarding the k-NN model, the size of our Control Sample is small for this type of algorithm. In particular, we have very few members of the D and V classes to train our model. The confusion matrix in Figure 1 suggests our model works relatively well with these classes. In any case, the reader is reminded that our results are statistical, and the methodology followed is consistent with the one from Mommert et al. (2016).
While analyzing size dependency in the subkilometer NEO population, authors have split the sample in absolute magnitude bins of H mag < 20, 20 ≤ H mag < 25, and H mag ≥ 25, or analyzed the sample by mass fraction (see Mommert et al. 2016; Erasmus et al. 2017; Binzel et al. 2019). Our observations are focused on targets of absolute magnitude in the range 20 ≤ H mag < 25; hence we have too few objects in the other ranges to make a similar size-dependence analysis. We find no significant variations within the 20 ≤ H mag < 25 range. The value of fraction we find for this bin agrees within 1σ with all the works cited. Mommert et al. (2016) and Erasmus et al. (2017) also used k-NN to analyze their data. The individual compositional fractions they report do not necessarily agree with ours or with the ones from Devogèle et al. (2019); however, it is worth mentioning that at the time of those studies, fewer classified spectra were available to take as a reference, and they consider fewer taxonomic types. Hence, the compositional fractions they present are more representative of their sample than of the observed NEO population. Nevertheless, the ratios they present agree with ours within the error bars, as can be seen in Figure 3.
A conjecture about asteroid bias can be drawn from Figure 3. The detection limits of current surveys are fainter than the limiting magnitudes of the UKIRT, MANOS, and particularly RATIR targets. Therefore, the population in the magnitude range of our observations is closer to completion, which makes it less biased. However, we select our targets by brightness, so the bias explained at the beginning of this section still applies to some degree. The higher level of completion may explain why our results are in better agreement with recent results from MANOS (Devogèle et al. 2019) and UKIRT (Mommert et al. 2016), both of which show a slightly larger fraction of dark objects than the result reported by MITHNEOS (Binzel et al. 2019), a study that started 20 yr ago. On this topic, Figure 1 shows Set 2 is likely to yield false positives on the "C + X" complex, though Set 2 only retrieved three elements with such a classification. Taking the extreme assumption that those three objects are misclassified and should rather be Q or S type, the value of fraction decreases to ≈0.65, which is approximately the value obtained by UKIRT. Therefore, the comparison with the other surveys is not affected by the limitations of Set 2. In general, Figure 3 suggests that the measured ratio of nonfeatured to featured objects slightly increases with time due to a better completion rate of the observed samples. This is just a hypothesis, however, since other factors like the methods used to extract the results may play a role.

Implications for Large-scale Projects
The methodologies explored in this article allow smaller telescopes to study the taxonomy of subkilometer NEOs. They also offer an efficient way to analyze data obtained with large telescopes. Our k-NN analysis effectively classified objects with an error in color as large as 0.13 (equivalent to a 0.091 magnitude error per band). With high-precision photometric measurements, this machinery can classify hundreds of objects in minutes with high accuracy. For example, the Vera C. Rubin Observatory (Ivezić et al. 2019) will observe with the filters u, g, r, i, Z, and Y, four of which we studied here. A larger sample of reference spectra with coverage in the visible and NIR would improve the accuracy of the method and likely enable it to distinguish between the classes that compose the taxonomic complexes. A different but promising approach is to use the active learning method described in Ishida et al. (2019), developed for photometric classification of supernovae. It is important to mention that all of the reference spectra need to be classified homogeneously.
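The correspondence between the 0.13 color error and the per-band error quoted above is consistent with adding two band errors in quadrature. The sketch below assumes independent and equal errors in the two bands that form a color; that assumption and the function name are ours, not stated in the text.

```python
import math


def color_error(err_band_1, err_band_2):
    """Uncertainty of a color (magnitude difference) from two independent band errors."""
    return math.hypot(err_band_1, err_band_2)


# With equal errors in both bands, the color error is sqrt(2) times the band error.
per_band = 0.13 / math.sqrt(2)
print(round(per_band, 3))
print(round(color_error(0.091, 0.091), 2))
```

Under this assumption, a per-band error of 0.091 mag gives a color error of about 0.13 mag.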

Conclusions
We present a taxonomic study of 238 subkilometer NEOs observed with a 1.5 m robotic telescope over the course of 5 yr. We used two methods to analyze our data. The taxonomic distribution found for the objects analyzed via k-NN agrees with the one obtained by Devogèle et al. (2019) from a spectral analysis. This represents a strategic advantage: it allows small telescopes to perform characterization and offers a more efficient method for large surveys. The MC method offers an approximation of the taxonomy distribution, and its use is suggested for cases when only two photometric bands are available. In all cases, we find that the nonfeatured asteroids, often denoted as carbonaceous, are a little less common than siliceous objects for the observed subkilometer population of NEOs.

Figure 1. Confusion matrix for Set 1 (left) and Set 2 (right). This is produced by using 20% of our sample to test our model (one-fold). The principal diagonal (highlighted) shows the objects that were accurately classified.

Figure 2. Two-color map generated with the Control Sample (bullets) and the k-NN algorithm with k = 3. This corresponds to a "two-color slice" from our 10D model generated with Set 1.

Figure 4. Taxonomic fractions in the subkilometer NEO population found by different collaborations: UKIRT (40 objects analyzed with k-NN; Mommert et al. 2016), Binzel et al. (2019; who selected 762 objects with H mag 17, most of them analyzed using a principal component analysis (PCA) of spectra), the MANOS survey (210 objects analyzed with a chi-square fit to spectra; Devogèle et al. 2019), RATIR-kNN (our results from the 87 objects analyzed with k-NN), and RATIR-MC (our results from the 151 objects analyzed with MC simulations). We remark that Binzel et al. (2019) and Devogèle et al. (2019) also considered other taxonomies; here we only plot the intersection with the classes we considered, which constitute the majority of the population. The results our team found with KMTNET are not included in this plot.

Figure 5. Results from the 1D MC approach: the 100 best-fitting solutions are used to build a KDE, where the vertical axis corresponds to likelihood. Top: Control Sample. Dashed lines show the expected value (see Section 3.3.1). Colored regions limit the 1σ values from the distribution's peak. These proportions are from the subset of the Control Sample with visible information and do not represent the general distribution. The (C + X)/(S + Q) ratio is retrieved with ∼89% accuracy. Bottom: KDE of the 151 observed objects with visible-only data. The peak of the distributions is at C = 32.4%, X = 9.6%, S = 20.4%, and Q = 31.8%, corresponding to fraction = 0.8.

Table 1
Channels of the RATIR Instrument

Table 2
For the Analysis, Our Data Are Split in Three Independent Sets
Note. By combining two different bands (out of r, i, Z, Y, and J) in a noncommutative way, Set 1 is constituted by 10 different colors, producing a 10D model. Similarly, Set 2 makes a 3D model and Set 3 a 1D one. Sets 1 and 2 are analyzed with the k-NN algorithm (see Section 3.2).

Table 3
Taxonomic Distribution Found for the 87 Objects Analyzed with the k-NN Algorithm

Table 4
Observed Targets Analyzed with k-NN and a Probability of Belonging to a Taxonomy Larger than 50%
Note. The observation midtime, duration of the observing run, and absolute magnitudes are presented for all objects. The last column refers to the Set that was used to estimate the probabilities (see Table 2). The full table is available online. (This table is available in its entirety in machine-readable form.)

Table 5
Equivalent to Table 4, but for Objects with a Probability of Belonging to a Given Taxonomy of Less than 50%
Note. The full table is available online, which includes the objects analyzed with the 1D MC method.