Brought to you by:

UNSUPERVISED CLUSTERING OF TYPE II SUPERNOVA LIGHT CURVES

and

Published 2016 September 12 © 2016. The American Astronomical Society. All rights reserved.
, , Citation Adam Rubin and Avishay Gal-Yam 2016 ApJ 828 111 DOI 10.3847/0004-637X/828/2/111

0004-637X/828/2/111

ABSTRACT

As new facilities come online, the astronomical community will be provided with extremely large data sets of well-sampled light curves (LCs) of transients. This motivates systematic studies of the LCs of supernovae (SNe) of all types, including the early rising phase. We performed unsupervised k-means clustering on a sample of 59 R-band SN II LCs and find that the rise to peak plays an important role in classifying LCs. Our sample can be divided into three classes: slowly rising (II-S), fast rise/slow decline (II-FS), and fast rise/fast decline (II-FF). We also identify three outliers based on the algorithm. The II-FF and II-FS classes are disjoint in their decline rates, while the II-S class is intermediate and "bridges the gap." This may explain recent conflicting results regarding II-P/II-L populations. The II-FS class is also significantly less luminous than the other two classes. Performing clustering on the first two principal component analysis components gives equivalent results to using the full LC morphologies. This indicates that Type II LCs could possibly be reduced to two parameters. We present several important caveats to the technique, and find that the division into these classes is not fully robust. Moreover, these classes have some overlap, and are defined in the R band only. It is currently unclear if they represent distinct physical classes, and more data is needed to study these issues. However, we show that the outliers are actually composed of slowly evolving SN IIb, demonstrating the potential of such methods. The slowly evolving SNe IIb may arise from single massive progenitors.

Export citation and abstract BibTeX RIS

1. INTRODUCTION

There is a controversy in the literature regarding the division of SNe II into photometric sub-classes. The classical division into Type II-P and II-L events (Barbon et al. 1979) based on the post-peak light curve (LC) has been supported by Arcavi et al. (2012) and Faran et al. (2014a, 2014b), but challenged by Anderson et al. (2014) and Sanders et al. (2015). Until now, the focus has been on extracting characteristics from LCs (e.g., peak magnitude, post-peak decline rate, plateau length), and searching for separation into populations. However, it is also known that some SNe have strikingly similar LCs, and the determination of which characteristics to compare is not straightforward.

New surveys such as the Zwicky Transient Facility (ZTF; Bellm 2014) and the Large Synoptic Survey (LSST; Ivezic et al. 2008) will provide the astronomical community with large data sets of well-sampled LCs. The rolling nature of these surveys will provide good coverage of the poorly studied early phases of SNe. However, there is and will remain a permanent lack of resources for spectroscopic classification of the transients found. Therefore, it is of utmost importance to develop tools for the study and classification of SN LCs based on their photometry alone.

Here we attempt to divide the sample of 57 LCs presented in Rubin et al. (2016)—in addition to two SN LCs that were originally included in that sample but subsequently removed—into classes of similar LC shape by using the unsupervised clustering algorithm K-Means on the LCs directly (as opposed to extracted parameters). This algorithm is one of the simplest both conceptually and algorithmically. Unsupervised clustering on SN characteristics (using a different technique) has been previously used for SNe Ia by Benetti et al. (2005). k-means has been used for many applications in the past; however, its use on SN LCs has not yet been attempted—perhaps due to the sparse sampling of typical SN LCs.

The Rubin et al. (2016) sample has the advantage of containing LCs with very well constrained times of first light, allowing us to utilize the early-time LC behavior, which has previously been unaccessible. Here we explore how the sample divides into classes, and explain this technique's limitations and pitfalls when it is applied to SN LCs.

2. ANALYSIS

We used the LC sample presented in Rubin et al. (2016), and added two SNe (iPTF13blq and iPTF14bas; I. Arcavi et al. 2016, in preparation), which were originally in that sample, but were removed for reasons explained below and discussed in Section 6. We smoothed the LCs using the algorithm described in Rubin et al. (2016). The Rubin et al. (2016) algorithm did not fit iPTF14bas well around peak, and we found that a smoothing spline performed much better. The smoothed LCs of both iPTF13blq and iPTF14bas are presented in Figure 1. See Rubin et al. (2016) for details on how the rise-time and peak magnitude were calculated.

Figure 1.

Figure 1. Smoothed LCs of iPTF13blq (top) and iPTF14bas (bottom). iPTF13blq was smoothed with the algorithm of Rubin et al. (2016) while iPTF14bas was smoothed with a smoothing spline. Dark blue and cyan symbols are the photometry from the Palomar 48'' and 60'' respectively. The empty diamonds in the iPTF13blq plot are auxiliary points added to regularize the smoothing algorithm.

Standard image High-resolution image

We used the k-means technique to automatically divide our LCs into classes. We interpolated our smoothed LCs to measure the magnitude between rest frame days 1 and 30, with uniform spacing of 0.1 days. Unless stated otherwise, days refer to days from estimated explosion in the rest frame. This is equivalent to treating the data set as 59 observations (the number of SNe in the sample) of 291 highly correlated parameters (relative R-band magnitude at 291 times). Data from $t\lt 1$ days was ignored because it contained information that was interpolated between the last limit and the first detection.

2.1. Explanation of k-means Clustering

The k-means clustering method involves solving the following problem: given a set of points ${x}_{1},{x}_{2},\ldots $ and the number of desired clusters k, partition the points into clusters and find cluster centroids Ci such that the within-cluster sum of square distances is minimized. This can be defined formally as

Equation (1)

where D is the chosen distance measure. An iterative algorithm for this problem, which is widely used today, was proposed by Lloyd (1982). The algorithm operates as follows.

  • 1.  
    Randomly select a starting point for each cluster in the multi-dimensional space.
  • 2.  
    Classify all points according to their closest cluster.
  • 3.  
    Calculate a new centroid for each cluster based on the classified points.
  • 4.  
    Repeat 2 and 3 until convergence.

This algorithm converges, but suffers from two major drawbacks: it is slow to converge and the random initialization of the centroid position tends to return less-than-optimal clusters (i.e., the sum of within cluster distances is larger than the true global minimum).

2.1.1. Improving the Initialization of k-means: k-means++

An improved initialization method called k-means++ was proposed by Arthur & Vassilvitskii (2007). According to this method, the initial centroids are chosen sequentially according to the following scheme.

  • 1.  
    Choose a data point randomly from xi and set this as the first centroid, C1.
  • 2.  
    Calculate the distance of each point xi from C1.
  • 3.  
    Choose C2 according to the probability
    Equation (2)
  • 4.  
    Classify all points according to their closest cluster.
  • 5.  
    Choose Cj according to the probability
    Equation (3)
    where Ck is the centroid closest to xi.
  • 6.  
    Repeat 4 and 5 until the number of desired clusters has been reached.

This method of initialization begins in many cases at a near-optimal solution. Arthur & Vassilvitskii (2007) have shown that this method achieves faster convergence to a more optimal result.

2.1.2. Determining the Optimal Number of Clusters: The Caliński–Harabasz Criterion

The Caliński & Harabasz (1974) criterion is often used to estimate the optimal number of clusters. The objective is to maximize the variance ratio criterion (VRC) defined as

Equation (4)

where n is the number of observations, k is the number of clusters, BGSS is the between group sum of squares and WGSS is the within group sum of squares defined by

Equation (5)

Equation (6)

where ni is the number of points in cluster i, $\bar{x}$ is the mean of the observations, Ci is the ith centroid, and D is the distance metric. This criterion agrees with the intuitive idea of clustering: a high value is achieved when the observations are close to their cluster, but the clusters are far from each other.

2.2. The use of k-means in This Work

We used the simplest distance metric—the Euclidean sum of squares—to determine proximity. On the one hand, this is reasonable because all of our observations are in magnitude differences. On the other hand, the rise is much shorter in duration than the decline/plateau. Therefore, late-time data can dominate the division. Limiting our analysis until day 30 after the estimated explosion gives similar weight to both the rise and the decline of the LCs. We chose the number of clusters empirically as described in Section 3 and found that it agreed with the number of clusters given by the Caliński & Harabasz (1974) criterion. k-means was initialized randomly 100 times to eliminate sensitivity to the arbitrary starting point of the algorithm.

In order to visualize our results, we also performed the principal component analysis (PCA). PCA finds directions in multi-dimensional sample space (in descending order) along which the variance is maximal. In other words, it finds eigenvector LCs, which give the highest variance when the sample LCs are projected onto them. Defining our m × n matrix of LCs (where m is the number of SNe and n is the number of parameters) as ${\boldsymbol{K}}$, and our PCA eigenvectors as ${{\boldsymbol{p}}}_{i}$, the principal components ${{\boldsymbol{c}}}_{i}$ are given by

Equation (7)

Note that ${{\boldsymbol{c}}}_{i}$ is a vector of the ith components of all of the LCs in the sample.

3. RESULTS

As we explored various partitioning of the data, we found that above k = 3, three SNe (iPTF13blq, iPTF14ajq, and iPTF14bas) tended to be in clusters of one or two objects. We identified these events as outliers—removed them from the sample—and discuss them separately in Section 6. We found that the remaining LCs were best divided into three clusters. This was supported by the results of the Caliński & Harabasz (1974) test (Figure 2), which consistently gave a maximum at k = 3. A color visualization of the LCs sorted by cluster is presented in Figure 3.

Figure 2.

Figure 2. Caliński & Harabasz (1974) criterion for various number of clusters. One hundred realizations are presented, indicating that the optimal k = 3 does not depend on the initialization of k-means.

Standard image High-resolution image
Figure 3.

Figure 3. Visual plot of the clusters. Color represents the magnitude (normalized to the peak). Each row is an SN, where the SNe have been ordered by cluster. The red lines highlight the cluster locations.

Standard image High-resolution image

The rise time, which is uniquely available for this sample, plays a major role. Two clusters, which have fast rise-times (II-F), are separated by their decline phase. One cluster has a fast rise and a slow decline (II-FS; similar to a II-P), while the other has a fast rise but fast decline (II-FF; similar to a II-L). The remaining cluster is separated by its slow rise (II-S). The II-FS, II-FF, and II-S clusters are of similar size. The centroid LCs and the standard deviations of each cluster are shown in Figure 4 and given in the online material (the structure of the table is given in Table 1). Histograms showing the cumulative distribution of peak magnitude, rise-time, and ${\rm{\Delta }}{m}_{15}$ within2 each cluster are shown in Figures 57 respectively. Note that the II-FS cluster does not come from the same population as II-FF and II-S in both peak magnitude and decline rate at the 95% significance level according to a Kolmogorov–Smirnov (KS) test. The II-S cluster does not come from the same population as the II-FF and II-FS clusters in rise-time at the 95% significance level according to a KS test.

Figure 4.

Figure 4. k-means clusters found using R-band light curves sampled from day 1 to 30. The median rise-time of each cluster has been marked with a vertical line. Type II-P SN 2005cs (Pastorello et al. 2009) is superimposed, and fits only the II-FS category. This is the only well-studied SN II-P with a good sampling of the rise.

Standard image High-resolution image
Figure 5.

Figure 5. Cumulative distribution function of the peak absolute R-band magnitude for each of the clusters. The number of objects in each cluster are given in parenthesis. The II-FS cluster is enriched in lower-luminosity events, and does not come from the same population as the II-FF and II-S clusters at the 95% significance level (based on a KS test).

Standard image High-resolution image
Figure 6.

Figure 6. Cumulative distribution function of the rise-time for each of the clusters. The number of objects in each cluster is given in parenthesis. The II-S cluster is not consistent with having the same parent population as the II-FF and II-FS clusters at the 95% significance level (based on a KS test).

Standard image High-resolution image
Figure 7.

Figure 7. Cumulative distribution function of ${\rm{\Delta }}{m}_{15}$ for each of the clusters. The number of objects in each cluster is given in parenthesis. The II-FS cluster is not consistent with having the same parent population as the II-FF and II-S clusters at the 95% significance level (based on a KS test).

Standard image High-resolution image

Table 1.  Mean and Standard Deviation of Each Cluster

  II-FS II-FF II-S
Phase Mean σ Mean σ Mean σ
Days $M-{M}_{\mathrm{peak}}$
1.0 0.68 0.22 0.68 0.28 1.54 0.40
30.0 −0.01 0.07 0.31 0.11 0.20 0.18

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.

Download table as:  DataTypeset image

The first four PCA eigenvectors are shown in Figure 8. The first two components describe most (90%) of the variance between LCs. Figure 9 shows that applying k-means clustering on the interpolated LCs is equivalent to performing clustering on the projection of the LCs on the first two PCA eigenvectors. Moreover, we show that applying k-means directly on the PCA projections divides our sample into identical clusters. Figure 10 shows that the algorithm is not clustering based on rise-time and decline rate. Clustering based on the LC morphology is not equivalent to clustering based on rise-time and decline rate.

Figure 8.

Figure 8. Top panel: the first four PCA eigenvectors. Bottom panel: percent of the variance explained by each component. The first two components explain 90% of the observed variance.

Standard image High-resolution image
Figure 9.

Figure 9. First two PCA components colored by their cluster association. The same result is achieved when k-means is applied to the light curves or to the PCA projections.

Standard image High-resolution image
Figure 10.

Figure 10.  ${\rm{\Delta }}{m}_{15}$ as a function of rise-time colored by cluster association.

Standard image High-resolution image
Figure 11.

Figure 11. Sample LCs colored by cluster.

Standard image High-resolution image
Figure 12.

Figure 12. Plots of the cluster $1-\sigma $ limits. Notice that II-FF and II-FS are disjoint at late times, and the two are weakly disjoint from the II-S at early times. The slow-risers bridge the gap at late-time between II-FF and II-FS.

Standard image High-resolution image

4. CAVEATS

We draw the reader's attention to the following caveats and issues we observed during this analysis.

  • 1.  
    The clustering is sensitive to the determination of the time of explosion. Running 100 simulations where we shifted each LC randomly in time over the range $[-{\rm{\Delta }}{t}_{0},+{\rm{\Delta }}{t}_{0}]$, we found that 34 objects had a probability of changing cluster less than 20%. This is extremely conservative, because for most events there are measurements during the rise, making a random uniform distribution an extreme overestimation of the uncertainty. Figure 13 shows the probability of each SN to change cluster and Figure 14 shows how these probabilities are distributed. Some objects are more sensitive to these shifts than others. Despite what one might expect, the probability to change cluster does not appear to be correlated with the uncertainty on the time of explosion (see Figure 15).
  • 2.  
    An optimal way of weighting the full LC (including information beyond day 30) is still lacking.
  • 3.  
    The Caliński & Harabasz (1974) criterion does not have a strong statistical basis. It has several desirable characteristics, but it is not proved to be optimal.
  • 4.  
    We normalize the LCs by their peak luminosity. This may be reasonable given our uncertainties regarding host galaxy extinction, and our objective of finding SNe that have very similar LC shapes, but very different peak luminosities. However, much of the physics is encoded in the peak luminosity, and with a larger sample it may be more meaningful to explore clustering while taking peak luminosity into account.
  • 5.  
    Our clustering is performed in R band. It is well known that LC shapes can vary in different filters, and R band may not be the optimal filter in which to perform such an analysis.

Figure 13.

Figure 13. Probability of each object to change cluster when LCs were shifted randomly in time between $[-{\rm{\Delta }}{t}_{0},+{\rm{\Delta }}{t}_{0}]$ and $[-{\rm{\Delta }}{t}_{0}/2,+{\rm{\Delta }}{t}_{0}/2]$.

Standard image High-resolution image
Figure 14.

Figure 14. Histogram of the probability to change cluster when LCs were shifted randomly in time between $[-{\rm{\Delta }}{t}_{0},+{\rm{\Delta }}{t}_{0}]$.

Standard image High-resolution image
Figure 15.

Figure 15. Probability to change cluster plotted against the uncertainty on the time of estimated explosion. No correlation is apparent.

Standard image High-resolution image

5. DISCUSSION

Several recent works explore the morphology of large samples of Type II LCs. Arcavi et al. (2012), Anderson et al. (2014), Faran et al. (2014a, 2014b), and Sanders et al. (2015) tried to assess whether II-P and II-L LCs actually form two disjoint classes, or are rather a continuous class. They all examined post-maximum morphology. Arcavi et al. (2012) and Faran et al. (2014a, 2014b) found that II-P and II-L are disjoint in R band, but may have suffered from small number statistics. Studies of larger samples (Anderson et al. 2014; Sanders et al. 2015) led their authors to the conclusion that the decline rates, peak magnitudes, and plateau lengths are all continuous properties, with no clear separation into disjoint populations. Anderson et al. (2014) was particularly careful in fitting the post-maximum decline rate separately from the plateau decline rate, and found these quantities to be continuous in a sample of 116 V-band LCs. More recently, Rubin et al. (2016) presented a sample of 57 very well-sampled LCs (the sample studied in this work) and did not observe any clear discontinuities in the decline rates or rise-times as can be seen in Figure 10. However, these two parameters may not be the most appropriate to discriminate between the populations.

We have, for the first time, performed unsupervised clustering on Type II LCs. Importantly, our method takes into account the full morphology until day 30, including rise-time data. Our division into three SN LC classes leads us to several conclusions. First, while the rapidly rising SNe appear to naturally separate into two populations, divided by their slow/fast declines (II-FS and II-FF, which resemble II-P and II-L) the late-time morphology of the slow rising events (II-S, Figure 4) does not appear to be very different from the classic II-P/II-L classes. Therefore, the previous classes of II-P and II-L may have had a large number of interlopers from this slow rising population—perhaps this is the source of the continuum behavior observed in the previous works mentioned. While there is some excess at longer rise-times, our sample is consistent with originating from a single population.

Figure 11 overplots these three clusters, while Figure 12 overplots the three centroid clusters and their standard deviations. It is clear from the figure that the fast rise/slow decline (II-FS, Figure 4) and fast rise/fast decline (II-FF, Figure 4) are disjoint at late times, but the II-S class bridges the gap—possibly inducing the observed continuum in previous works.

Examining Figure 5, we see that the II-FS cluster contains lower-luminosity events, while the remaining clusters are comprised of brighter SNe on average (recall that the clustering does not take into account peak magnitude). This excess of faint objects is the slow/faint SN II-P population (e.g., SN 2005cs and PTF10vdl). Using a KS test, we find that the II-FS cluster does not come from the same population as the II-FF and II-S clusters in peak magnitude at the 95% significance level. The fact that the II-FS cluster differs in an observable other than LC shape may indicate that there is a physical meaning for the clusters found.

Our comparison to PCA is instructive. We found that using the full LC morphology is equivalent to using just two parameters: the projections of the LCs on the first two principal components. Applying k-means to both produced identical results. This may have far reaching ramifications, and may indicate that Type II LCs of this quality are a two parameter problem. Including late-time data and the peak magnitude will add some complexity.

It is important to note that the classes identified in this work have some overlap, are limited to R band, and it is unclear if they represent distinct physical classes. We have found some sensitivity to the determination of the time of explosion. Moreover, the sensitivity does not appear to be correlated with the uncertainty on the time of explosion, making it unclear if higher cadence photometry would improve the results.

These clusters may be useful in photometric identification of SNe, or the completion of gaps in photometric coverage. The photometric outliers—discovered by unsupervised clustering—were confirmed to also be spectroscopically distinct from the rest of the sample. This provides additional confidence in these techniques. However, the true test of these results will be the application of this analysis on an independent sample of similar size and quality. As photometric surveys begin to supply the community with large data sets of similarly sampled LCs, it will likely become necessary to employ unsupervised techniques to identify and classify transients.

6. OUTLIERS: SNe IIb FROM MASSIVE PROGENITORS

Examining the outliers more closely, we found that two of them (iPTF13blq and iPTF14bas) have spectra that are consistent with a IIb classification, although their LCs are unusually slowly rising. This was the reason they were removed from the Rubin et al. (2016) sample after performing the analysis presented in this work. We present their spectra with similar IIb spectra superimposed in Figure 16. Only later did we realize that iPTF14ajq is also an outlier, and we confirmed that the algorithm identifies this by increasing the number of clusters to five, which clustered iPTF13blq and iPTF14ajq together, leaving iPTF14bas as a cluster alone. Returning to Rubin et al. (2016) we find that the spectrum of iPTF14ajq suffered heavy galaxy contamination. While it is conclusively an SN II, we cannot rule out a IIb classification based on the only available spectrum. All three spectra are available on WISeREP (Yaron & Gal-Yam 2012).

Figure 16.

Figure 16. SNID comparison of SNe iPTF13blq (top) and iPTF14bas (bottom) with IIb spectra.

Standard image High-resolution image

Type IIb (as well as Type Ib and Ib/c) originate from stripped envelope progenitors. While the stripping mechanism is the subject of some debate, the two main possibilities are stripping via binary interaction (Podsiadlowski et al. 1992) and stripping due to episodic mass loss or to stellar winds. While the low ejecta masses and high rates of SNe IIb (19%–25% by volume; Li et al. 2011) may point toward the binary interaction channel, it is expected that at least some SNe IIb will result from single massive stars (large ${M}_{\mathrm{ZAMS}}$) that are stripped by winds (rather than binary interaction). These are expected to have massive cores and produce a large ejecta mass—leading to a long-rising LC. SN 2009jf is an example of a related event, where a massive star was almost completely stripped of its hydrogen envelope, leading to a Ib classification. Valenti et al. (2011) and Sahu et al. (2011) found that SN 2009jf was most likely an SN Ib originating from a massive single-star progenitor with ${M}_{\mathrm{ZAMS}}\gtrsim 20\mbox{--}30$ ${M}_{\odot }$.

In Figure 17, we compare the outliers to SN 2009jf and find that the LCs are strikingly similar. We speculate that iPTF14ajq is in fact a IIb and suggest that these extremely slowly evolving IIb's originate from massive stripped envelope stars. They are also relatively luminous, two with absolute peak magnitudes less than −17.5, while iPTF14ajq peaks at −16.4, lending further support that these are high-energy events originating from very massive progenitors. It is noteworthy that the outliers we identified are similar not just in their early-time behavior but also in their later evolution, despite having been selected based on the first 30 days of their LCs. The outliers may also be of the same type identified in González-Gaitán et al. (2015) as the long-rise population. They attribute these events to interloping IIn or 1987-like events, but this is not the case for our events. Clearly this population requires further study.

Figure 17.

Figure 17. Comparison of iPTF13blq, iPTF14bas, and iPTF14ajq to SN 2009jf.

Standard image High-resolution image

We thank Y. Cao for reducing the spectrum of iPTF13blq. We thank A. Tanay and B. Zackay for helpful discussions. A.G.Y. is supported by the EU/FP7 via ERC grant No. 307260, the Quantum universe I-CORE Program by the Israeli Committee for Planning and Budgeting and the Israel Science Foundation (ISF); by Minerva and ISF grants; by the Weizmann-UK "making connections" program; and by Kimmel and ARCHES awards.

Footnotes

  • Defined as the difference in magnitude between the peak and 15 days after the peak.

Please wait… references are loading.
10.3847/0004-637X/828/2/111