Impact of Rubin Observatory Cadence Choices on Supernovae Photometric Classification

The Vera C. Rubin Observatory’s Legacy Survey of Space and Time (LSST) will discover an unprecedented number of supernovae (SNe), making spectroscopic classification for all the events infeasible. LSST will thus rely on photometric classification, whose accuracy depends on the not-yet-finalized LSST observing strategy. In this work, we analyze the impact of cadence choices on classification performance using simulated multiband light curves. First, we simulate SNe with an LSST baseline cadence, a nonrolling cadence, and a presto-color cadence, which observes each sky location three times per night instead of twice. Each simulated data set includes a spectroscopically confirmed training set, which we augment to be representative of the test set as part of the classification pipeline. Then we use the photometric transient classification library snmachine to build classifiers. We find that the active region of the rolling cadence used in the baseline observing strategy yields a 25% improvement in classification performance relative to the background region. This improvement in performance in the actively rolling region is also associated with an increase of up to a factor of 2.7 in the number of cosmologically useful Type Ia SNe relative to the background region. However, adding a third visit per night as implemented in presto-color degrades classification performance due to more irregularly sampled light curves. Overall, our results establish desiderata on the observing cadence related to classification of full SNe light curves, which in turn impacts photometric SNe cosmology with LSST.

1. INTRODUCTION
Supernovae (SNe) are used for diverse astrophysical and cosmological studies, such as measurements of the Universe's accelerated expansion (e.g. Riess et al. 1998; Perlmutter et al. 1995; Astier et al. 2006; Kessler et al. 2009a; Betoule et al. 2014; Scolnic et al. 2018a; Abbott et al. 2019; Brout et al. 2022). For most cosmological analyses, SNe were spectroscopically classified to ensure a pure Type Ia sample, but this will be impossible for the large SNe sample expected from the Vera C. Rubin Observatory's Legacy Survey of Space and Time (LSST; LSST Science Collaboration et al. 2009, 2017; Ivezić et al. 2019). Thus, LSST will rely on photometric classification, utilising spectroscopically confirmed SNe samples to train classifiers.
Accurate classification requires representative training sets; the feature-space distributions of the training set should be similar to those of the test set (e.g. Lochner et al. 2016). However, classifiers are usually trained with either simulated datasets, which may suffer from model misspecification, or spectroscopically confirmed events, which are non-representative of the test set due to selection effects. Several methods have been proposed to address the second problem, predominantly based on data augmentation techniques (e.g. Revsbech et al. 2017; Pasquet et al. 2019; Boone 2019; Carrick et al. 2021). This previous work has demonstrated that the bias introduced by non-representative training sets can be corrected.
Another crucial factor which impacts the accuracy of photometric classification is the survey observing strategy (Alves et al. 2022; Lochner et al. 2022). Over the course of ten years, LSST will repeatedly observe the southern sky every few days in multiple passbands. Its observing strategy encompasses diverse aspects such as the survey footprint, season length, inter- and intra-night gaps, cadence of repeat visits in different passbands, and exposure time per visit. Changes in how LSST observes the sky can improve the scientific output of the survey; however, observing strategy optimization is challenging due to the diverse goals of LSST (LSST Science Collaboration et al. 2009; Ivezić et al. 2019).
Recently, the Survey Cadence Optimization Committee (2022) Phase 1 report (hereafter: Ph1R) narrowed down the choice of possible observing strategies and recommended new simulations to respond to the findings of the previous optimization work (e.g. LSST Science Collaboration et al. 2017; Lochner et al. 2018; Scolnic et al. 2018b; Gonzalez et al. 2018; Olsen et al. 2018; Laine et al. 2018; Jones et al. 2020; Bianco et al. 2019; Alves et al. 2022; Lochner et al. 2022) and enable further optimization. In particular, it is not yet decided whether LSST will use a rolling cadence, and whether it will visit each sky pointing two or three times per night.
In this work, we study the impact of these key observing strategy choices on photometric classification accuracy. We focus on the rolling cadence and the intra-night observing strategy, since we expect these factors to have the greatest impact on the efficacy of light-curve classification.
Our work builds upon Alves et al. (2022) by studying the performance of photometric SN classification for light curves simulated with different LSST observing strategies for the first three years of the survey; we chose this time-frame because early science drivers are one of the highest priorities for the next set of cadence decisions. First, we simulated multi-band light curves using the SuperNova ANAlysis package (SNANA; Kessler et al. 2009b). These simulated datasets included a non-representative spectroscopically confirmed training set, biased towards brighter events. Next, we followed the classification approach of Alves et al. (2022), using the photometric transient classification library snmachine (Lochner et al. 2016; Alves et al. 2022) to build a classifier based on wavelet features obtained from Gaussian process (GP) fits. We also included the host-galaxy photometric redshifts and their uncertainties as features. The simulated training set was augmented to be representative of the photometric redshift distribution per SNe class, the cadence of observations, and the flux uncertainty distribution of the test set.
In Section 2 we describe the LSST observing strategies and the framework that we used to generate our SNe datasets. Our classification and augmentation methodologies, which relied on snmachine, are presented in Section 3. Section 4 focuses on our results and their implications for observing strategy. We conclude in Section 5.

Overview
In this work we simulated LSST-like SN light curves for the first three years of the survey using three observing strategies: baseline v2.0, noroll v2.0, and presto gap2.5 mix v2.0. These observing strategies were created with the Feature-Based Scheduler (FBS; Naghib et al. 2019), which is the default scheduler for LSST. We then used the infrastructure developed for PLAsTiCC to simulate the light curves of the SNe with realistic sampling and noise properties (Kessler et al. 2019). We describe the observing strategies in Section 2.2, the SNe models in Section 2.3, and the simulations infrastructure in Section 2.4.
Following The PLAsTiCC team et al. (2018) and Kessler et al. (2019), we simulated the Wide-Fast-Deep (WFD) survey, which is the main survey of LSST (containing ∼ 98% of our simulated events), and the Deep-Drilling-Fields (DDF) survey, which covers small patches of the sky with more frequent and deeper observations. The properties of each survey mode depend on the observing strategy. Since the release of PLAsTiCC, the footprints of the DDFs have changed considerably; the three observing strategies simulated in this work used the DDF locations presented in Table 2 of Jones et al. (2020). Since their DDF sequence is generally the same, we focused our analysis on the implications of the observing strategy for the WFD survey. The DDF events are still included in the training set, as they improve our augmentation procedure (see Section 3.3), but are not included in the test set.
We used 0.2% of the simulations to construct a non-representative spectroscopically confirmed training set. The training set was biased towards brighter events, with a median redshift ∼ 0.3. The relatively small training set mimics the available data from current and near-term spectroscopic surveys at the start of LSST science operations. Following Kessler et al. (2019), we loosely based the training set on the planned magnitude-limited 4-metre Multi-Object Spectroscopic Telescope Time Domain Extragalactic Survey (Swann et al. 2019).

Observing Strategies
Rubin's Survey Cadence Optimization Committee (SCOC; for further details see https://www.lsst.org/content/charge-survey-cadence-optimization-committee-scoc) has been formed to make recommendations for the observing strategy with inputs from the community. Following their recommendations, a new set of LSST observing strategy simulations created with FBS was released to respond to the findings of previous optimizations (Ph1R), including an update of the LSST baseline observing strategy: baseline v2.0. Under the updated baseline, the telescope observes each field twice with a gap of approximately 15 min during twilight and 33 min during the rest of the night; these visit pairs are in different passbands. In the extragalactic (i.e., dust-extinction limited) WFD, the sky is divided into two regions: an 'active' area which is observed more often (rolling at 90% strength) and a 'background' area. This two-band rolling cadence is defined by declination and shown in Figure 1. In this observing strategy simulation, the telescope observes in a rolling cadence between years 1.5 and 8.5 of the survey to ensure that the first and last years have uninterrupted coverage of the entire sky (Ph1R). In this work, we simulated the first three years of the survey, and therefore only half of the light curve observations were performed with the rolling cadence.
A key aim of the new LSST observing strategy simulations is to evaluate whether a rolling cadence is suitable, as demonstrated by science metrics (Ph1R). In this work, we studied the impact of the rolling cadence on the photometric classification of SNe by comparing the baseline observing strategy with a similar strategy without the rolling (noroll v2.0, hereafter referred to as no-roll). Since the rolling starts in year 1.5 of the baseline simulation, we restricted this comparative analysis to the events observed between years 1.5 and 3 in both simulations. We refer to this subset of the baseline dataset as Y1.5-3 baseline; we used Y0-3 baseline when considering the entire three years. The simulations are otherwise identical.
Another aim of the new observing strategies is to investigate modifications of the intra-night cadence. In this work, we studied the impact of adding a third visit per night in a passband that had been previously observed; this addition is motivated by expected improvements to the performance of early classification and fast transient detection. The presto-color family (Bianco et al. 2018, 2019) encompasses a number of variations of the third visit inclusion, such as different intra-night gaps between the observations (e.g. 1.5 hrs to 4 hrs between the first pair of observations and the third), whether the initial pair of visits is in consecutive passbands (g+r, r+i or i+z) or mixed passbands (g+i, r+z or i+y), and whether to obtain the visit triplet every night or every other night (Ph1R). SNe do not vary significantly during a single night, so the difference in intra-night gaps between 1.5 hrs and 4 hrs has minimal impact. We thus chose a presto-color cadence whose third visit has an intermediate value of 2.5 hrs for the intra-night gap (presto gap2.5 mix v2.0, hereafter referred to as presto-color). Since the total number of visits per pointing is fixed, adopting a presto-color cadence results in longer inter-night gaps; further, each field is observed for fewer nights in total. Similarly to baseline, the rolling starts in year 1.5 of the presto-color cadence; thus we compare baseline and presto-color for the entire first three years of LSST. For more details on the simulations, see Ph1R and the descriptions in the associated Jupyter Notebook.

Supernovae Models
Following Alves et al. (2022), we focused on classifying SN Ia, SN Ibc, and SN II, which have been found to be difficult transient classes to distinguish (Hložek et al. 2020). We simulated each class in a similar manner to Kessler et al. (2019), using models from Guy et al. (2010); Kessler et al. (2010b, 2013); Villar et al. (2017); Pierel et al. (2018); Guillochon et al. (2018). However, similarly to Lokken et al. (2023), we did not include the SNIbc-MOSFiT model because it produces unphysical light curves. We also adjusted the relative fraction of simulated core-collapse SNe (CC SNe) to follow Table 3 of Shivvers et al. (2017). Additionally, due to the lack of SN IIb models in Kessler et al. (2019), we redistributed their fraction among the other stripped-envelope SNe (SN Ib and SN Ic); see Table 2 of Appendix A for the relative rates used to simulate CC SNe in this work. Table 1 shows the resulting number of SNe per class for each observing strategy.

Framework for Generating Simulations
Our SNe simulations were built on top of the observing strategy cadences produced by FBS previously discussed in Section 2.2 (Naghib et al. 2019); this scheduler decides the passband to use and the direction in which to point the telescope using a Markovian Decision Process, while accounting for interruptions, such as telescope maintenance downtime. Although the FBS outputs contain a record of each simulated pointing of the survey, for generating light curves it is more convenient to compute all the observations of each event and iterate over the events. Therefore, we used the Python package OpSimSummary (Biswas et al. 2020, 2022) to reorder the observations. This package also translates the FBS output into the appropriate format for use with the SNe simulation code from SNANA (Kessler et al. 2009b), which we used to generate realistic light curves in the LSST passbands. We broadly followed the methodology described in Kessler et al. (2019), which relies on SNANA to generate simulated datasets of SNe and associated metadata (e.g. host galaxy photometric redshift and its uncertainty). SNANA uses models of the SN sources, observing conditions, observing strategy, and instrumental noise to generate light curves. Then, it applies triggers to select the observations that would be seen by LSST.
Following Kessler et al. (2019), we applied the SNANA transient trigger to only keep events with at least two detections in our datasets; SNANA uses the DES-SN detection model from Kessler et al. (2015) to decide which observations are flagged as detected. See Figure 13 of Kessler et al. (2019) for a summary of the SNANA simulation stages.
Following the usage of SNANA in Kessler et al. (2019), we truncated the 10-year survey to the first three years, removed season fragments shorter than 30 days, and used the cosmological parameters $\Omega_{\rm m} = 0.3$, $\Omega_\Lambda = 0.7$, $w_0 = -1$, and $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$. However, we used an updated version of the code which included improvements to the K-corrections for events at the highest simulated redshift. We made two further changes from Kessler et al. (2019) to improve the realism of our simulations, as follows. While Kessler et al. (2019) used a pixel-flux saturation of 3,900,000 photoelectrons/pixel, we used the more realistic value of 100,000 photoelectrons/pixel. We also corrected the code to ensure that any observations in the same band in a given night are co-added and count as a single observation. We provide our SNANA input files for each observing strategy simulation on Zenodo.
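As an illustration of the nightly co-adding step, the following minimal Python sketch groups observations by night and band and merges each group into a single observation. It is not the SNANA implementation: the function name `coadd_nightly`, the use of `floor(MJD)` as the night label, and the inverse-variance weighted mean are all assumptions made here for illustration.

```python
import math
from collections import defaultdict


def coadd_nightly(observations):
    """Merge same-band observations taken on the same night.

    `observations` is an iterable of (mjd, band, flux, flux_err) tuples.
    Assumptions (not from the source): a night is labelled by floor(mjd),
    and fluxes are combined with an inverse-variance weighted mean.
    """
    groups = defaultdict(list)
    for mjd, band, flux, err in observations:
        groups[(math.floor(mjd), band)].append((mjd, flux, err))

    coadded = []
    for (night, band), rows in sorted(groups.items()):
        weights = [1.0 / err**2 for _, _, err in rows]
        weight_sum = sum(weights)
        flux = sum(w * f for w, (_, f, _) in zip(weights, rows)) / weight_sum
        mean_mjd = sum(m for m, _, _ in rows) / len(rows)
        # Uncertainty of an inverse-variance weighted mean.
        coadded.append((mean_mjd, band, flux, weight_sum**-0.5))
    return coadded
```

With this grouping, two g-band visits in one night count as one g-band observation, while a g-band and an r-band visit in the same night remain separate.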

PHOTOMETRIC CLASSIFICATION
We followed the approach of Alves et al. (2022) to photometrically classify the SNe simulated from each observing strategy. In that work, we benchmarked our classification approach against the winning PLAsTiCC entry (Boone 2019) and showed that our classification results were generalizable; they hold if we replace our classification predictions with the predictions of Boone (2019). Here we used the photometric transient classification library snmachine (Lochner et al. 2016; Alves et al. 2022) and updated it to handle the output files of SNANA (FITS files). In the sections below, we describe the main steps of the approach and any modifications relative to Alves et al. (2022).

Light Curve Preprocessing
Following Alves et al. (2022), we preprocessed the simulated light curves to include only the observing season in which the SN is detected. To isolate this season for each event, we removed all observations more than 50 days before the first detection or more than 50 days after the last. Next, we divided the remaining light curve into sequences of observations without inter-night gaps > 50 days; we selected as our preprocessed light curve the sequence of observations that contained the largest number of detections. Finally, we translated the light curve so that the first observation was at time zero. The longest resulting light curves, as measured between the first and last observations, lasted for 274, 253, and 295 days for baseline, no-roll, and presto-color, respectively.
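The preprocessing steps above can be sketched in a few lines of Python. This is a simplified illustration rather than the snmachine code: the function name `isolate_season` and its arguments are hypothetical, only observation times and detection flags are tracked, and ties between sequences are broken by taking the earliest one.

```python
def isolate_season(times, detected, pad=50.0, max_gap=50.0):
    """Return the (time, detected) pairs of the season containing the SN.

    `times` are observation times in days, sorted in ascending order;
    `detected` are the matching detection flags.
    """
    detection_times = [t for t, d in zip(times, detected) if d]
    if not detection_times:
        return []

    # Step 1: keep observations within `pad` days of the first/last detection.
    start, end = detection_times[0] - pad, detection_times[-1] + pad
    kept = [(t, d) for t, d in zip(times, detected) if start <= t <= end]

    # Step 2: split into sequences wherever the gap exceeds `max_gap` days.
    sequences = [[kept[0]]]
    for previous, current in zip(kept, kept[1:]):
        if current[0] - previous[0] > max_gap:
            sequences.append([])
        sequences[-1].append(current)

    # Step 3: keep the sequence with the most detections.
    best = max(sequences, key=lambda seq: sum(d for _, d in seq))

    # Step 4: shift times so that the first observation is at time zero.
    t0 = best[0][0]
    return [(t - t0, d) for t, d in best]
```

The light curve length used later in the analysis is then simply the time of the last element of the returned list.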

Gaussian Process Modeling of Light Curves
We used GP regression (e.g. MacKay 2003; Rasmussen & Williams 2005) to model each light curve. Following Boone (2019) and Alves et al. (2022), we fitted two-dimensional GPs in time and wavelength; we applied a null mean function and a Matérn 3/2 kernel for the GP covariance. We fixed the length-scale of the wavelength dimension to 6000 Å and used maximum likelihood estimation to optimize the time dimension length-scale and amplitude per event. We implemented the GPs with the Python package George (Ambikasaran et al. 2014). We note that Stevance & Lee (2022) investigated possible improvements to using GPs for SNe light curve fitting. We leave these extensions to future work on SNe classification.
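For reference, the covariance described above can be written out explicitly. The block below is a sketch of one common parameterization of a two-dimensional Matérn 3/2 kernel; the symbols $\sigma$ (amplitude), $\ell_t$ (time length-scale), and $\ell_\lambda$ (wavelength length-scale) are notation introduced here, and the parameterization used internally by George may differ by convention.

```latex
k\big((t,\lambda),(t',\lambda')\big)
  = \sigma^2 \left(1 + \sqrt{3}\,r\right) \exp\!\left(-\sqrt{3}\,r\right),
\qquad
r^2 = \frac{(t-t')^2}{\ell_t^2} + \frac{(\lambda-\lambda')^2}{\ell_\lambda^2},
\qquad
\ell_\lambda = 6000~\text{\AA}\ \text{(fixed)} .
```

In this notation, $\sigma$ and $\ell_t$ are the two quantities optimized per event by maximum likelihood, while the null mean function means that the GP prediction reverts to zero flux far from the data.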

Augmentation
We applied the methodology developed in Alves et al. (2022) to augment the training set of each simulated observing strategy to be representative of its respective test set in terms of the photometric redshift distribution per SNe class, the cadence of observations, and the flux uncertainty distribution. We delineate below the departures from the augmentation procedure described in Section 4 of Alves et al. (2022) and defer the details to Appendix B.
We augmented the training set SNe to generate synthetic events at a different redshift from the original; this approach relied on using two-dimensional GP models of the training set events to generate the synthetic light curves. Since we removed a SN model (as mentioned in Section 2.3), the redshift distribution of the events changed with respect to Alves et al. (2022). Consequently, we used a different distribution to produce the augmented training sets, as detailed in Appendix B.
Following our previous work, we generated 15440 WFD synthetic events for each SNe class. Figure 2 shows that for the Y0-3 baseline, the photometric redshift distribution of the augmented training set is closer to the test set than the original training set. Although the distribution does not match exactly, it is sufficiently close to expect minimal impact on performance. Ensuring identical distributions could require introducing an undesirable amount of fine-tuning to the methodology, and hence we have not attempted to obtain a closer match.
We also tuned the distribution of the number of observations and their flux uncertainty for the synthetic events of each observing strategy. We drew the target number of observations for each light curve from a Gaussian mixture model based on the test set; Table 3 of Appendix B shows the parameters used for each observing strategy. For the flux uncertainty, we followed Boone (2019) and Alves et al. (2022) and combined the uncertainty predicted by the GP in quadrature with a value drawn from the flux uncertainty distribution of the test set. Table 4 of Appendix B shows the parameters of the Gaussian mixture model used to fit the flux uncertainty distribution of each passband and observing strategy.
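The two sampling operations in this step, drawing from a Gaussian mixture and adding uncertainties in quadrature, can be sketched as follows. The function names are hypothetical, the mixture parameters in the usage note are placeholders rather than the values of Tables 3 and 4, and whether the mixture is fitted in linear or logarithmic space is not specified here.

```python
import math
import random


def draw_from_mixture(weights, means, sigmas, rng):
    """Draw one value from a one-dimensional Gaussian mixture model."""
    component = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.gauss(means[component], sigmas[component])


def combine_in_quadrature(gp_uncertainty, drawn_uncertainty):
    """Total flux uncertainty: sqrt(gp_uncertainty**2 + drawn_uncertainty**2)."""
    return math.hypot(gp_uncertainty, drawn_uncertainty)
```

For example, with `rng = random.Random(42)`, a synthetic observation with GP uncertainty `s_gp` would be assigned `combine_in_quadrature(s_gp, draw_from_mixture(w, mu, sig, rng))`, where `w`, `mu`, and `sig` stand for the mixture parameters fitted to the test set for that passband.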

Feature Extraction
For photometric classification we used the host galaxy photometric redshift, its uncertainty, and model-independent wavelet coefficients obtained from the GP fits as features. The redshift features mentioned above were directly obtained from the metadata associated with each event. For the wavelet features, we followed the feature extraction procedure of Lochner et al. (2016) and Alves et al. (2022), which we briefly summarize in the following paragraph.
To perform a wavelet decomposition on the light curves we sampled them onto a time grid. We used the two-dimensional GP that models each light curve to interpolate between the observations. For uniformity, we used the same time grid for all the observing strategies; the time range of the grid corresponds to the maximum light curve duration of the events, 295 days. Following Alves et al. (2022) we chose 292 grid points to sample the events approximately once per day. Next, we performed a two-level wavelet decomposition using a Stationary Wavelet Transform and the symlet family of wavelets. These decomposition choices resulted in 7008 redundant wavelet coefficients per event. Following Lochner et al. (2016) and Alves et al. (2022), we reduced the dimensionality of the wavelet space to 40 components using Principal Component Analysis (PCA; Pearson 1901; Hotelling 1933). We used the augmented training set of each observing strategy to construct the dimensionality-reduced wavelet space; the test set events were projected onto the corresponding wavelet space.
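The quoted number of coefficients can be checked with a short count. This sketch assumes the six LSST passbands (ugrizy) and that a stationary wavelet transform returns, at each level, approximation and detail coefficient arrays of the same length as the input; the symbols are introduced here for illustration.

```latex
N_{\rm coeff}
  = N_{\rm grid} \times N_{\rm bands} \times N_{\rm levels} \times 2
  = 292 \times 6 \times 2 \times 2
  = 7008 .
```

PCA then compresses these 7008 redundant coefficients into 40 components per event, which are used together with the two redshift features.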

Classification
We used snmachine to build a photometric classifier trained on the augmented training set. We used Gradient Boosting Decision Trees (GBDT; Friedman 2002), classifiers whose predictions are based on ensembles of decision trees. We trained the classifier for each observing strategy separately, using dedicated augmented training sets (Section 3.3) and features (Section 3.4). The GBDT classifier hyperparameters were optimized following the procedure described in Section 3.4 of Alves et al. (2022); Table 5 of Appendix B shows the values of the hyperparameters per observing strategy.

Performance Evaluation
We used the PLAsTiCC weighted log-loss metric (The PLAsTiCC team et al. 2018; Malz et al. 2019) to optimize the photometric classifiers and to evaluate their performance. Following the PLAsTiCC challenge, we gave the same weight to each SN class.
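A minimal sketch of a class-weighted log-loss of this kind is given below: the mean log-probability assigned to the true class is computed per class, and the per-class means are then averaged with the class weights. The function name, the dictionary-based inputs, and the clipping constant `eps` are choices made here for illustration, not the snmachine interface.

```python
import math


def weighted_log_loss(true_classes, predicted_probs, weights=None, eps=1e-15):
    """Class-weighted log-loss.

    `true_classes` is a list of class labels; `predicted_probs` is a list of
    dicts mapping each class label to its predicted probability.
    Equal class weights are used when `weights` is None.
    """
    classes = sorted(set(true_classes))
    if weights is None:
        weights = {c: 1.0 for c in classes}

    total = 0.0
    for c in classes:
        # Log-probabilities assigned to the true class, for events of class c.
        log_probs = [
            math.log(min(max(probs[c], eps), 1.0))
            for truth, probs in zip(true_classes, predicted_probs)
            if truth == c
        ]
        total += weights[c] * sum(log_probs) / len(log_probs)
    return -total / sum(weights[c] for c in classes)
```

Because each class contributes through its own mean, a rare class such as SN Ibc influences the metric as much as the more numerous classes when the weights are equal; lower values indicate better performance.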
Confusion matrices are commonly used to assess the performance of classifiers (see e.g. Hložek et al. 2020). To produce a confusion matrix, we first assigned each test set event to its most probable class. For ease of comparison between different classes and observing strategies, we normalized the resulting confusion matrices by dividing each entry by the true number of SNe in each class. In this setting, a perfect classification results in the identity matrix.
We measured the classification performance using the recall (also called completeness or sensitivity) and precision of each SNe class. These are defined as
\begin{equation}
\mathrm{recall} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}
\end{equation}
and
\begin{equation}
\mathrm{precision} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}} ,
\end{equation}
where, in a binary classification setting, TP, FN, and FP are, respectively, the numbers of true positives, false negatives, and false positives. The computational performance of this procedure and an estimate of the resources needed for reproducing this analysis are discussed in Appendix C.
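The row-normalized confusion matrix and the per-class recall and precision defined above can be sketched as follows, treating each class in turn as the positive class. The function names and the nested-dictionary layout are illustrative choices, not part of snmachine.

```python
def confusion_counts(true_labels, predicted_labels, classes):
    """Count events per (true class, predicted class) pair."""
    counts = {t: {p: 0 for p in classes} for t in classes}
    for truth, prediction in zip(true_labels, predicted_labels):
        counts[truth][prediction] += 1
    return counts


def normalise_by_true_class(counts):
    """Divide each row by the true number of events in that class."""
    normalised = {}
    for truth, row in counts.items():
        n_true = sum(row.values())
        normalised[truth] = {
            p: (n / n_true if n_true else float("nan")) for p, n in row.items()
        }
    return normalised


def recall_and_precision(counts, cls):
    """Recall = TP / (TP + FN); precision = TP / (TP + FP)."""
    tp = counts[cls][cls]
    fn = sum(counts[cls].values()) - tp
    fp = sum(counts[truth][cls] for truth in counts) - tp
    recall = tp / (tp + fn) if tp + fn else float("nan")
    precision = tp / (tp + fp) if tp + fp else float("nan")
    return recall, precision
```

With this normalization the diagonal of the confusion matrix is the recall of each class, which is why a perfect classifier yields the identity matrix.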

RESULTS AND IMPLICATIONS FOR OBSERVING STRATEGY
Here we present our results on the impact of rolling cadence and the intra-night gap on SNe classification performance. We perform a comparative analysis for these two cases relative to the baseline strategy in Section 4.1. In our previous work (Alves et al. 2022) we found that light curve length (time difference between the first and last observation after the light curve preprocessing described in Section 3.1) and inter-night gap (time difference between consecutive observations which are more than 12 hrs apart) were the key properties of observing strategies affecting classification performance. We thus investigate how the no-roll and presto-color families affect the recall and precision as a function of several factors including the above.
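The two light-curve properties defined above translate directly into code. The sketch below assumes observation times in days, sorted in ascending order and already preprocessed as in Section 3.1; the function names are hypothetical.

```python
from statistics import median


def light_curve_length(times):
    """Time difference between the first and last observation (days)."""
    return times[-1] - times[0]


def inter_night_gaps(times, min_gap=0.5):
    """Gaps between consecutive observations more than 12 hrs (0.5 d) apart."""
    return [b - a for a, b in zip(times, times[1:]) if b - a > min_gap]


def median_inter_night_gap(times):
    """Median of the inter-night gaps, or None if there are none."""
    gaps = inter_night_gaps(times)
    return median(gaps) if gaps else None
```

Visits taken within the same night (for example a pair separated by 33 min) produce gaps below the 12 hr threshold and therefore do not enter the median; `max(inter_night_gaps(times))` gives the longest inter-night gap used in Section 4.4.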

Overall Classification Performance
Figure 3 shows the confusion matrices for classifiers trained on the augmented training set of Y1.5-3 baseline and no-roll. The Y1.5-3 baseline classifier yields a slightly higher performance for SN Ibc and SN II, and a percent-level improvement in the PLAsTiCC log-loss metric. This small difference indicates that rolling at this level makes a negligible difference to the overall efficacy of SNe photometric classification. However, this result masks a significant difference in classification efficacy between the active and background regions, which averages out over the full sample. Therefore we also investigated the difference in performance between the active region (which we visually identified as the dark bands in Fig. 1; 65% of the test set events) and the background region (35% of the test set events). The confusion matrices in Fig. 4 show that the classification performance of the active region is higher than that of the background region for all SNe classes. Indeed, the log-loss metric improves by 25% for events in the active region as compared to the background.
Figure 5 shows the confusion matrices for Y0-3 baseline and presto-color. The baseline cadence outperforms presto-color for SN Ia and SN II; the PLAsTiCC log-loss metric degrades by ∼ 10% for presto-color. While adding a third visit per night is expected to improve performance for early classification and for fast transient detection, our results indicate that this choice moderately degrades classification performance for long-lived transients.
All the observing strategies considered in this work yield a higher performance in terms of the log-loss metric compared with the observing strategy used for PLAsTiCC (Alves et al. 2022), which points to the performance gains achieved by recent updates to the FBS scheduler.

Light Curve Length
We found in Alves et al. (2022) that the light curve length of an event has a significant impact on classification performance for long-lived transients such as SNe. In particular, longer light curves are easier for a classifier to distinguish since they incorporate more information about the time evolution of the event. In that work, we focused on events with light curve length between 50 and 175 days due to their higher performance. As shown in Figure 6, our results for the Y0-3 baseline show a similar performance behavior. We also find that these conclusions generalize to the other observing strategies analyzed here, and hence the conclusions of Alves et al. (2022) carry over to these new cadence simulations. We note that the recall and precision figures (Figure 6 and subsequent figures) show a small scatter above our statistical uncertainties, likely arising from the limited diversity of the simulations in those particular bins. Figure 7 shows that the distribution of light curve lengths is similar for all the cadences. Indeed, the cadence choices currently under consideration (v2.0) have similar distributions of gaps larger than 50 days, so the light curve length distribution correlates more with the intrinsic duration of the events and our preprocessing of the light curves than with the cadences. Therefore, even though this is a very important factor for overall classification performance, it is not strongly affected by observing strategy choices.

Median Inter-night Gaps
The observing strategies proposed for LSST have different intra- and inter-night gap distributions. Given the finite total number of observations available, these distributions are intrinsically linked. In Alves et al. (2022) we demonstrated that the median inter-night gap was a crucial factor in photometric SNe classification, and Ph1R has highlighted the necessity for science-motivated metrics to measure the impact of intra-night gaps. Since the timescale for changes in SN light curves is in days, multiple observations in a single night in the same filter do not contribute towards characterization of the light curve. Here we first investigate the impact of a higher intra-night cadence on the median inter-night gap, and hence classification performance.
Figure 8 shows that, in accordance with our expectations, a lower median inter-night gap leads to higher precision and recall for Y0-3 baseline.SN II show a larger sensitivity to cadence as compared to the other types because SN Ia with high median inter-night gaps are misclassified as SN II, driving down the precision of the latter.Overall, we find similar results for the other observing strategies analyzed.
While this overall conclusion still holds, our results show that the classification performance depends less on the median inter-night gap for this set of observing strategy simulations compared to Alves et al. (2022), where we recommended a median inter-night gap of 3.5 days.Our new results suggest a cut of 5.5 days; however, all the current observing strategies aim for a lower median inter-night gap, making such a recommendation redundant.We attribute this reduced sensitivity of classification results to cadence to the recent improvements made to the FBS scheduler.
The left panel of Figure 9 shows that the peak of the median inter-night gap for Y1.5-3 baseline is lower than for no-roll.However, while rolling improves the cadence of the events in the active region, the events in the background region are less regularly sampled than no-roll, which leads to the heavier tail of Y1.5-3 baseline.Thus overall, the classification performance is not significantly improved by rolling.
Since the total exposure time is fixed, the addition of a third visit each night leads to the presto-color events being visited on fewer nights. Consequently, this cadence has sparser observations than the Y0-3 baseline. This is reflected in a slightly higher median inter-night gap for presto-color, as shown in the right panel of Figure 9. However, the small shift in the median inter-night gap distribution does not explain the degradation in performance seen for presto-color. We now turn to other cadence properties in order to understand this result.
[Figure caption fragment: The shaded areas correspond to the 95% confidence limits obtained by bootstrapping the recall and precision values for each bin. To remove small-number effects we only present the results for bins with more than 300 events.]
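The bootstrapped 95% confidence limits quoted for the recall and precision figures can be illustrated with a percentile bootstrap. The sketch below is an assumption about the general technique rather than the exact procedure used for the figures; the function name, the number of resamples, and the fixed seed are choices made here.

```python
import random


def bootstrap_interval(is_correct, n_boot=1000, seed=0):
    """Percentile-bootstrap 95% interval for the fraction of correct events.

    `is_correct` holds one 0/1 entry per event in a bin. For recall, the bin
    contains the events whose true class is the class of interest; for
    precision, the events predicted as that class.
    """
    rng = random.Random(seed)
    n = len(is_correct)
    # Resample the bin with replacement and recompute the fraction each time.
    fractions = sorted(
        sum(rng.choices(is_correct, k=n)) / n for _ in range(n_boot)
    )
    lower = fractions[int(0.025 * n_boot)]
    upper = fractions[int(0.975 * n_boot) - 1]
    return lower, upper
```

The width of the interval shrinks roughly as the inverse square root of the number of events in the bin, which is consistent with restricting the figures to bins with more than 300 events.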

Regularity of sampling
In this section we investigate whether the performance differences seen for the cadences considered arise from the (ir)regularity of sampling.Characteristics of the regularity of sampling which are potentially important for classification include large gaps in the light curve and the number of observations near the peak.
In Alves et al. (2022) we found that the GPs successfully interpolate between large gaps (> 10 days) so the classifier is still able to identify the SNe. Figure 10 confirms that the recall and precision of SNe either slowly decrease or remain constant with the increase of the length of longest inter-night gap.These conclusions also generalize to the other cadences we study.This indicates that the GP step is generally able to interpolate large gaps.
A related consideration for characterizing the regularity of light-curve sampling is observing SNe near peak brightness, where the shape of the light curve changes rapidly. These observations are critical for obtaining a reliable cosmological distance modulus and facilitate accurate photometric classification. In this work, we estimate the SNe peak as the moment that maximizes the GP fit predicted flux in any passband. Then, we define the number of observations near the peak as those between 10 days before and 30 days after peak brightness; we sum the observations in all passbands to calculate this quantity. Similarly to Alves et al. (2022), we find that the classification performance generally increases with the number of observations near the peak for all the new observing strategies. Figure 11 shows the results for Y0-3 baseline as a representative example. For Type Ia SNe, this performance levels off around 15 observations near the peak. This is comparable to the SNe cosmology metric used in Lochner et al. (2022), which requires 5 observations before peak and 10 observations after.
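The peak estimate and the count of observations near the peak described above can be sketched as follows. The function names and the data layout (a shared time grid plus one list of GP-predicted fluxes per passband) are assumptions made for illustration, and the window boundaries are treated as inclusive.

```python
def estimate_peak_time(grid_times, predicted_flux_by_band):
    """Time at which the GP-predicted flux is largest in any passband."""
    peak_time, peak_flux = None, float("-inf")
    for fluxes in predicted_flux_by_band.values():
        for time, flux in zip(grid_times, fluxes):
            if flux > peak_flux:
                peak_time, peak_flux = time, flux
    return peak_time


def count_near_peak(observation_times, peak_time, before=10.0, after=30.0):
    """Observations (all passbands) from `before` days before the peak
    to `after` days after it."""
    return sum(
        1 for t in observation_times
        if peak_time - before <= t <= peak_time + after
    )
```

Because the window is asymmetric, an event first detected after maximum light can still accumulate observations near the peak, although only on the declining part of the light curve.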
Figure 11 also shows a drop in performance for ∼ 2 observations near the peak.We find that such events generally only contain the latter part of the transient, and their light curves tend to be flat.The classifier predicts events with flat GP fits as SNe II: the latter tend to have long light curves so it is likely that a flat part of the light curve will be observed.Once there are more observations near the peak, the light curves are not as flat so the SN II recall decreases until there is sufficient information in the light curve for the classifier to correctly identify the transient shape.
Having established the influence of these characteristics on classification performance, we now consider how they impact the relative classification performance seen in Figures 3 and 5 for the observing strategies considered.
The distribution of the longest inter-night gap in Y1.5-3 baseline exhibits two peaks, as shown in the left panel of Figure 12. This is due to the fact that different areas of the sky start rolling at different times, and Y1.5-3 baseline therefore includes some events in areas of the sky which have not yet started rolling. Thus, we see a second peak at higher values of longest inter-night gap for Y1.5-3 baseline. The peak in the distribution corresponding to the rolling region is at shorter timescales than in no-roll. Overall, these differences do not result in a significant change in performance, because there is no significant tail produced towards longer gaps. By contrast, the right panel of Figure 12 shows that, as expected, the distribution of the longest gaps for presto-color does exhibit a broad tail, due to the more irregular sampling: presto-color has 15% more events which have a long gap of 20 days or more.
Figure 13 compares the distributions of the number of observations near the SNe peak for the various cadences. The left panel of Figure 13 shows the impact of rolling (Y1.5-3 baseline), with a bimodal distribution corresponding to the 'background' and 'active' areas. While the difference between this distribution and that of no-roll may appear visually large, the latter only has 5% more events in the poorly-classified region (< 15 observations near peak). This again results in very little difference in classification performance due to rolling. The right panel of Figure 13 shows that presto-color has ∼ 10% more events with < 15 observations near peak compared to Y0-3 baseline. While this difference does not make a large visual impact, it is nevertheless in a regime which strongly affects classification performance.
Figure 5 shows that presto-color mainly impacts classification of type II SNe. In turn, Figure 11 shows that type II SNe classification performance is a strong function of the number of observations near peak as compared to the other classes, for < 15 observations near peak. Since presto-color has more events in this regime, one may therefore expect that SN II classification is particularly degraded for this cadence, and this expectation is confirmed by our results.
Overall, presto-color exhibits small but significant changes in the distribution of the longest inter-night gap and the number of observations near the peak. These combine to result in irregularly-sampled light curves, which in turn lead to degraded classification performance.
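The longest inter-night gap statistic discussed above can be sketched as follows. This is an illustrative computation rather than the code used in this work: the function names are hypothetical, and grouping observations into nights by taking the integer part of the observation time is a simplifying assumption.

```python
import numpy as np

def longest_inter_night_gap(obs_times):
    """Longest gap (in days) between consecutive observing nights of one event."""
    # Assumption: observations sharing the same integer day belong to one night.
    nights = np.unique(np.floor(np.asarray(obs_times, dtype=float)))
    if nights.size < 2:
        return 0.0
    return float(np.diff(nights).max())

def fraction_with_long_gap(light_curves, threshold=20.0):
    """Fraction of events whose longest inter-night gap is >= threshold days."""
    gaps = np.array([longest_inter_night_gap(lc) for lc in light_curves])
    return float(np.mean(gaps >= threshold))

# Toy light curves (observation times in days), not simulated LSST data.
lc_regular = [0.1, 0.2, 3.1, 10.3]   # nights 0, 3, 10  -> longest gap 7
lc_sparse = [0.0, 25.5, 26.1]        # nights 0, 25, 26 -> longest gap 25
frac = fraction_with_long_gap([lc_regular, lc_sparse])
```

Comparing `fraction_with_long_gap` between two cadences is one way to quantify a statement such as "15% more events with a long gap of 20 days or more".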

DISCUSSION AND CONCLUSIONS
We have presented the impact of LSST cadence choices on the performance of SNe photometric classification, using simulated multi-band light curves from the LSST baseline cadence, the non-rolling cadence, and a presto-color cadence. For each dataset considered, we augmented the non-representative training set to be representative of the test set and built a classifier using the photometric transient classification library snmachine. In line with previous studies, we confirmed that the light curve length, median inter-night gap and number of observations near the SNe peak, which differ between the cadences, affect the photometric classification.
Previous works argued that a rolling cadence benefits SNe science due to the improved sampling but that more in-depth simulations and studies were needed (LSST Science Collaboration et al. 2017; Lochner et al. 2018). We find that the considered rolling cadence (which increases the footprint weight of the active region by 90%) only mildly improves the overall classification performance. However, crucially, our results show that the active region of the rolling cadence as implemented in the current baseline strategy has a significantly higher classification performance than the background region. This in turn suggests that the SN Ia light curves in the active region could be better measured, and hence more useful for cosmological analyses. We now investigate this point. Lochner et al. (2022) define a set of light curve requirements for well-measured SN Ia which form the basis of seven cosmology metrics; in Appendix D we present the updated version of these requirements currently being used by LSST DESC. Considering all but one of the updated requirements (ignoring a color-related requirement due to its computationally-intensive nature), we compared the SN Ia light curves in the active and background regions of the Y1.5-3 baseline. We found that ∼ 50% of the SN Ia in the active region fulfilled the light curve requirements, compared with only ∼ 20% of the SN Ia in the background region. While these results are indicative rather than definitive due to ignoring the color requirement, they suggest that the 25% improvement in the classification performance log-loss metric in the active region is also associated with an increase of up to a factor of 2.7 in the number of cosmologically-useful SN Ia in the active region. These results taken together strongly motivate the implementation of a rolling cadence within the baseline observing strategy.
We also found that the presto-color cadence led to shorter and sparser light curves: the light curve length distribution of this cadence in Figure 7 is skewed towards lower values. Additionally, there are more presto-color events with large gaps and fewer observations near the SNe peak. These results indicate that the events simulated under this cadence have a more heterogeneous sampling than the baseline events. Irregular sampling, especially around the peak where the light curve varies more rapidly, results in worse constraints on its shape; therefore the classifier is less able to distinguish between the SNe classes.
Since the third visit per night implemented in presto-color is in part motivated by facilitating early transient classification, our results imply that there is a trade-off in the observing strategy requirements of early and full light-curve classification.
The accuracy of SN Ia photometric classification and core-collapse contamination affect the measurements of the dark energy equation of state parameter (Kessler & Scolnic 2017; Jones et al. 2017). While a Bayesian methodology can marginalize over the contamination, minimizing such contamination reduces systematic uncertainties in cosmological constraints (Kunz et al. 2007; Lochner et al. 2013; Roberts et al. 2017; Jones et al. 2018). Since SNe cosmology with LSST is expected to be limited by systematic uncertainties, the relationship between the efficacy of photometric classification and cosmological constraints is of crucial importance.

A. SIMULATED CC SN RATES
In this appendix we present the absolute and relative rates used to simulate CC SNe in this work. These rates follow Shivvers et al. (2017) with the adjustments described in Section 2.4. Table 2 includes both the rates of each CC SNe class and the models used (see Kessler et al. 2019 for further details). The resulting number of SNe for each class is shown in Table 1.

B. AUGMENTATION DETAILS AND CLASSIFICATION HYPERPARAMETERS
Section 3.3 described the differences between the augmentation procedure used in this work and the one in Section 4 of Alves et al. (2022). In particular, we changed the distribution used to create the augmented training sets because the removal of the SNIbc-MOSFiT model (mentioned in Section 2.3) altered the redshift distribution of the

Table 3. Parameters of Gaussian mixture models used to fit the number of observations of the test set light curves for each observing strategy. These values were later used to create an augmented training set (Section 3.3). We used visual inspection to select the number of components of the Gaussian mixture models; that number is indicated through the number of weights provided for each observing strategy. The weight, mean and variance of each component are displayed in the same order.

where $z_\mathrm{ori}$ is the spectroscopic redshift of the original event. However, we used a different class-agnostic target distribution. In particular, we drew an auxiliary value $z^*$ from a log-trapezoidal distribution; the probability density function of the trapezoid distribution is where $x_\mathrm{min} = \log(z_\mathrm{min})$, $x_\mathrm{max} = \log(z_\mathrm{max})$, and $\Delta x = x_\mathrm{max} - x_\mathrm{min}$. Then, we calculated the redshift of the new augmented event following Alves et al. (2022), $z_\mathrm{aug}(z^*) = -z^* + z_\mathrm{min} + z_\mathrm{max}$.
For each observing strategy, we also adjusted the parameters of the Gaussian mixture models used to fit the number of observations per light curve (Table 3) and the flux uncertainty distribution of each passband (Table 4). We fitted Gaussian mixture models to the test set, and used visual inspection to select the number of components. The resulting photometric distributions are shown in Figure 14.
In this section we also show the hyperparameter values of the GBDT classifier used for each observing strategy (Table 5).
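To make concrete how tabulated mixture parameters (weights, means and variances) can be turned into target values for augmentation, the sketch below draws samples from a one-dimensional Gaussian mixture. It is only an illustration under stated assumptions: the function name is hypothetical, the numerical parameters are invented for the example rather than taken from Tables 3 or 4, and the way snmachine consumes such draws is not reproduced here.

```python
import numpy as np

def sample_gaussian_mixture(weights, means, variances, size, rng=None):
    """Draw `size` samples from a 1D Gaussian mixture model."""
    rng = np.random.default_rng(rng)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()            # guard against rounding in tables
    means = np.asarray(means, dtype=float)
    sigmas = np.sqrt(np.asarray(variances, dtype=float))

    # First pick a component for every draw, then sample from that Gaussian.
    component = rng.choice(len(weights), size=size, p=weights)
    return rng.normal(means[component], sigmas[component])

# Illustrative two-component mixture (NOT the values of Table 3).
draws = sample_gaussian_mixture(
    weights=[0.3, 0.7], means=[10.0, 50.0], variances=[4.0, 25.0],
    size=20000, rng=0,
)
```

For a quantity such as the number of observations per light curve, the draws would still need to be rounded and clipped to physically sensible values; that step is left out of this sketch.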

C. COMPUTATIONAL RESOURCES
We simulated the observing strategy datasets on an Intel E5-2680v4 @ 2.4 GHz. Each dataset with $2.5 \times 10^6$ events takes ∼ 200 core hours to simulate. The data processing, classification and analysis were performed on an Intel(R) Xeon(R) CPU E5-2697 v2 (2.70 GHz). Using a single core, the pipeline takes ∼ 5.6 hrs to preprocess these events. Modeling them with GPs and performing their wavelet decomposition takes ∼ 44.4 hrs. Generating an augmented training set with 15440 events takes ∼ 9 hrs, and reducing the dimensionality of their wavelet features using PCA takes ∼ 45 min. Optimizing the GBDT classifier takes ∼ 5.5 hrs. Obtaining the test set predictions on the precomputed test set features with the trained classifier takes 5 min. Overall, the entire classification pipeline takes ∼ 200 + 70 core hours of computing time for each observing strategy.
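The quoted total can be checked by summing the individual steps. The short tally below uses only the timings stated above (the dictionary keys are descriptive labels chosen for this example) and shows that the processing and classification steps add up to roughly 65 hours, consistent with the rounded figure of ∼ 70 core hours quoted alongside the ∼ 200 core hours of simulation.

```python
# Single-core timings quoted in the text, converted to hours.
steps_hours = {
    "preprocessing": 5.6,
    "GP modelling and wavelet decomposition": 44.4,
    "training-set augmentation": 9.0,
    "PCA dimensionality reduction": 45 / 60,
    "GBDT optimization": 5.5,
    "test-set prediction": 5 / 60,
}

pipeline_hours = sum(steps_hours.values())   # classification pipeline only
simulation_hours = 200.0                     # per dataset, from the text
total_hours = simulation_hours + pipeline_hours

print(f"pipeline: {pipeline_hours:.1f} h, total: {total_hours:.1f} h")
```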

D. WELL-MEASURED TYPE IA SUPERNOVAE
To measure cosmological parameters accurately, it is crucial to obtain a large sample of well-measured SN Ia. Lochner et al. (2022) presented a set of requirements to denote a SN Ia light curve as well-measured; these requirements have recently been updated and refined. The updated requirements only use the light curve observations in the grizy

Table 4. Parameters of Gaussian mixture models used to fit the flux uncertainty distribution of the test set in each passband (ugrizy) and observing strategy. These values were later used to create an augmented training set (Section 3.3). We used visual inspection to select the number of components of the Gaussian mixture models; that number is indicated through the number of weights provided for each observing strategy. The weight, mean and variance of each component are displayed in the same order.

where $\lambda_\mathrm{obs}$ is the mean wavelength of the telescope in the passband of the considered observation. They also limit the light curves to the observations with phases between 20 days before and 60 days after peak; the phase of the light curve is given by where $t$ is the time of the observation and $t_\mathrm{peak}$ is the time of the SNe peak brightness. The requirements are:
• at least 3 observations before peak with phase > −20
• at least 8 observations after peak with phase < 60
• at least 1 observation with phase ≤ 10
• at least 1 observation with phase ≥ 20
• $\sigma_C < 0.04$, where $\sigma_C$ is the color uncertainty obtained when fitting the light curve with the SALT2 package (Guy et al. 2007).
In this work we ignore the last requirement because SALT2 fits are computationally intensive.We also use the light curves preprocessed as described in Section 3.1 rather than the three-year-long light curves.
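The four phase-based requirements listed above (that is, all but the SALT2 color requirement) can be expressed as a simple check. The sketch below is illustrative only: the function name is hypothetical, the phases are assumed to be already computed and restricted to the usable passbands, and an observation at phase exactly 0 is counted neither as before nor as after peak, which is a choice made for this example.

```python
import numpy as np

def is_well_measured(phases):
    """Apply the phase-based requirements, ignoring the color requirement.

    phases : phases (days relative to peak) of the usable observations.
    """
    p = np.asarray(phases, dtype=float)
    p = p[(p > -20.0) & (p < 60.0)]       # keep the -20 < phase < 60 window

    n_before = int(np.sum(p < 0.0))       # observations before peak
    n_after = int(np.sum(p > 0.0))        # observations after peak
    return bool(
        n_before >= 3                     # at least 3 before peak
        and n_after >= 8                  # at least 8 after peak
        and np.any(p <= 10.0)             # at least 1 with phase <= 10
        and np.any(p >= 20.0)             # at least 1 with phase >= 20
    )

# Toy phase lists, not simulated events.
dense = [-15, -8, -3, 2, 5, 9, 14, 21, 28, 35, 45]
sparse = [-3, 2, 5, 25]
results = (is_well_measured(dense), is_well_measured(sparse))
```

Applying such a check separately to the SN Ia in the active and background regions gives the fractions of well-measured events compared in the discussion.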

Figure 1. Footprint of the baseline cadence with the number of WFD observations in the first three years (1.5 years of nonrolling followed by 1.5 years of rolling cadence). The dark bands correspond to the 'active' area of the rolling cadence which is observed at a higher cadence, and the light bands to the 'background'.

Figure 3. Normalized test-set confusion matrix for the classifier trained on the augmented training set of the Y1.5-3 baseline (left panel) and the no-roll cadences (right panel).The uncertainty in the log-loss corresponds to the 95% confidence intervals obtained by bootstrapping.The results show a slightly higher SNe classification performance when rolling is implemented at the level in the baseline cadence.
Figure 3 shows the confusion matrices for classifiers trained on the augmented training set of Y1.5-3 baseline and no-roll. The Y1.5-3 baseline classifier yields a slightly higher performance for SN Ibc and SN II, and a percent-level improvement in the PLAsTiCC log-loss metric. This small difference indicates that rolling at this level makes a negligible difference to the overall efficacy of SNe photometric classification. However, this result masks a significant difference in classification efficacy between the active and background regions due to an averaging effect. Therefore we also investigated the difference in performance between the active region (which we visually identified as the dark bands in Fig. 1; 65% of the test set events) and the background region (35% of the test set events). The confusion matrices in Fig. 4 show that the classification performance of the active region is higher than that of the background region for all SNe classes. Indeed, the log-loss metric improves by 25% for events in the active region as compared to the background. Figure 5 shows the confusion matrices for Y0-3 baseline and presto-color. The baseline cadence outperforms presto-color for SN Ia and SN II; the PLAsTiCC log-loss metric degrades by ∼ 10% for presto-color. While adding a third visit per night is expected to improve performance for early classification and for fast transient detection, our results indicate that this choice moderately degrades classification performance for long-lived transients. All the observing strategies considered in this work yield a higher performance in terms of the log-loss metric compared with the observing strategy used for PLAsTiCC (Alves et al. 2022 reported a log-loss metric of 0.550 for this case). This indicates substantial per-

Figure 4. Normalized test-set confusion matrices for the classifier trained on the augmented training set of the Y1.5-3 baseline cadence.The left panel shows the results for the active region of the rolling cadence and the right panel for the background region.The uncertainty in the log-loss corresponds to the 95% confidence intervals obtained by bootstrapping.The results show a significantly higher SNe classification performance for the active region of the rolling cadence.

Figure 6. Test-set recall (left panel) and precision (right panel) as a function of light curve length per SNe class for Y0-3 baseline. The shaded areas correspond to the 95% confidence limits obtained by bootstrapping the recall and precision values for each bin. The dashed lines mark the high performance region between 50 and 175 days. To remove small-number effects we only present the results for bins with more than 300 events.

Figure 8. Test-set recall (left panel) and precision (right panel) as a function of inter-night gap per SNe class for Y0-3 baseline.The shaded areas correspond to the 95% confidence limits obtained by bootstrapping the recall and precision values for each bin.To remove small-number effects we only present the results for bins with more than 300 events.

Figure 9. Test-set density of events as a function of inter-night gap for Y1.5-3 baseline vs no-roll (left panel), and Y0-3 baseline vs presto-color (right panel).

Figure 10. Test-set recall (left panel) and precision (right panel) as a function of the length of the longest inter-night gap per SNe class for Y0-3 baseline. The shaded areas correspond to the 95% confidence limits obtained by bootstrapping the recall and precision values for each bin. To remove small-number effects we only present the results for bins with more than 300 events. The reduced performance below 8 days corresponds to less than 5% of the events, which tend to have very short light curves and therefore are not well classified.

Figure 12. Test-set density of events as a function of the length of the longest inter-night gap for Y1.5-3 baseline vs no-roll (left panel), and Y0-3 baseline vs presto-color (right panel).

Table 1. Breakdown of the number of SNe per class and observing strategy used in this work. (Left) events simulated between years 1.5 and 3 of the survey. (Right) events simulated between years 0 and 3 of the survey.

Table 2. Scale rate used to simulate core-collapse SNe in this work, expressed in percentage. The rates follow Shivvers et al. (2017) and the SNe models are described in Kessler et al. (2019); SNIb-Templates and SNIc-Templates are both described together as SNIbc-Templates.

France) funded by the Centre National de la Recherche Scientifique; the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231; STFC DiRAC HPC Facilities, funded by UK BEIS National E-infrastructure capital grants; and the UK particle physics grid, supported by the GridPP Collaboration. This work was performed in part under DOE Contract DE-AC02-76SF00515.