Utilizing a Convolutional Autoencoder for Anomaly Detection in LIGO Spectrogram Data

Massive amounts of data generated from the continuous running of the LIGO gravitational-wave detectors come with a need to search for signals within them. Gravitational-wave data captured by the detectors contain astronomical events or glitches that last seconds. We present an unsupervised learning approach using a convolutional autoencoder trained on the no-glitch Gravity Spy dataset to perform anomaly searches on spectrogram data. Reconstruction error is used as the detection criterion, and multiple time windows are used to improve the model. Results on test data show that the model is capable of detecting signals with pronounced anomalies such as the Chirp or Koi Fish glitch. Meanwhile, detecting subtle anomalies such as the 1400 Hz Ripples is difficult because their reconstruction error is near the range of noise signals. Validating the results on confirmed gravitational-wave signals shows that the model is capable of gravitational-wave detection.


Introduction
Gravity, as stated in Einstein's general relativity, is the curvature of space-time due to the presence of massive objects. General relativity also theorised the existence of gravitational waves, analogous to how Maxwell's equations predict electromagnetic waves [1]. The theory was then confirmed by the first observation of gravitational waves from a binary black hole merger on September 14, 2015 by the Laser Interferometer Gravitational-Wave Observatory (LIGO) Scientific Collaboration and Virgo Collaboration. Spatial strain is measured using a large-scale laser interferometer to detect the gravitational wave, as seen in figure 1. The results from gravitational-wave observations enable studies that test general relativity [2].
The observation data gathered by the detectors are openly available through the Gravitational-Wave Open Science Center (GWOSC) and contain years' worth of data. Data collected by the detectors include gravitational-wave signals and transient non-Gaussian noise events known as glitches [3]. These data are important not only for gravitational-wave studies, but also for finding dark matter, cosmic strings, and many more exotic objects that can be detected using LIGO [4,5]. However, the problem is that the data we are interested in, as seen in figure 1, last seconds amongst years of data. We require a method that can search the data to discover more of these glitches and encourage more findings. In this paper, a gravitational-wave search method is proposed by utilizing a machine learning algorithm, namely a convolutional autoencoder.
Research has been done on anomaly detection based on detector strain time series data using an autoencoder by Morawski et al. [6]. We propose a method that uses the strain spectrogram data instead, which can provide more information based on the shape of the signal.

Dataset and Model
In this paper, we use the Gravity Spy dataset and a convolutional autoencoder to construct a deep learning model for anomaly detection. We first explain the data within the Gravity Spy dataset, how the data are processed, and the model that is trained.

Dataset
The data gathered from gravitational-wave observatories contain transient, non-Gaussian noise known as glitches. These glitches can be instrumental or environmental, such as fluctuations in the laser or small ground motions. Within the first observing run of LIGO, there were 10^6 glitches detected over a minimum signal-to-noise ratio threshold of 6 [3]. To help with categorizing the data, the Gravity Spy project was created. It is a crowdsourcing project that couples human classification with machine learning models to identify glitches within the gravitational-wave data and provide data for machine learning models. People from around the world classify glitches into their respective categories, and the result is a dataset containing around 30,000 images that are conveniently split into train, validation, and test sets [7]. The dataset is generated using the Q-transform method designed for the detection of gravitational-wave transients, similar to how spectrograms represent glitches in time-frequency-energy space [3]. It contains anomaly and non-anomaly data of size 569 × 479 pixels with red, green, blue, and alpha color channels. The images are categorized into 22 classes: 1 no-glitch class and 21 glitch classes. Each unique event time has four spectrograms with time windows of 0.5 s, 1 s, 2 s, and 4 s, as shown in figure 2.

The data still need to be prepared for training purposes. Firstly, we require the strength of the signal to do the anomaly detection. For that, the dataset is first converted into grayscale. Then, the data are normalized by dividing every value by the maximum 8-bit value of 255. The resulting images are then separated by their respective time windows. These images are then put into the model for training.
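The preparation steps above can be sketched as follows. This is a minimal illustration, assuming the spectrograms are already loaded as 8-bit RGBA arrays; the helper names are our own and the simple channel-average grayscale conversion is an assumption, not a detail from the Gravity Spy release:

```python
import numpy as np

def prepare_spectrogram(rgba):
    """Convert an 8-bit RGBA spectrogram of shape (H, W, 4) into a
    normalized grayscale array in [0, 1], as described in the text."""
    rgb = rgba[..., :3].astype(np.float32)   # drop the alpha channel
    gray = rgb.mean(axis=-1)                 # simple grayscale conversion (assumed)
    return gray / 255.0                      # normalize by the 8-bit maximum

def split_by_window(images, windows):
    """Group images by their time window (0.5, 1.0, 2.0, 4.0 s) so that
    each model is trained on a single window only."""
    groups = {w: [] for w in (0.5, 1.0, 2.0, 4.0)}
    for img, w in zip(images, windows):
        groups[w].append(img)
    return groups
```

Each of the four per-window models then sees only the images from its own group.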

Model
Autoencoders are neural networks trained to reconstruct their input. They have the ability to learn in an unsupervised manner and generate an "informative" representation usually known as the latent space [8]. The usual implementation encodes an input into a latent space that is then decoded, as shown in figure 3. The convolution operation in machine learning is usually done by operating a kernel/filter on a matrix like a stencil. The operation can act as an edge detector or sharpen images, depending on the kernel used [9]. We used a convolutional autoencoder with the architecture shown in figure 4, implemented using TensorFlow [10]. At each encoding and decoding step, the ReLU activation is used for its non-linear properties and fast computation; no activation is used for the encoded layer; and a sigmoid activation is used for the output layer to limit the result from 0 to 1. The cropping layer trims the resulting convolution dimensions back to the dimensions of the input. First, for training, the mean squared loss is used because it yields a higher loss when the result is far from the ground truth; with this, the model is motivated to reconstruct the image perfectly. Second, to optimize the training process, the Adam (adaptive moment estimation) optimizer is chosen for its adaptive learning rate during training [11]. Third, to not waste time in training, an early stopping callback with a patience of 5 based on the loss is used to stop training once the model has not improved for 5 iterations. Lastly, a checkpoint callback is used to save the best model, based on minimum loss, after training has finished.
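As a sketch, the components described above can be assembled in Keras. The exact layer sizes and filter counts come from figure 4 and are not stated in the text, so the values here are illustrative; likewise, the decoder here uses UpSampling2D + Conv2D, which may differ from the original. The real images are 569 × 479 pixels, which is why the cropping layer is needed; this sketch uses a pooling-friendly shape for simplicity:

```python
import tensorflow as tf
from tensorflow.keras import layers, models, callbacks

def build_autoencoder(input_shape=(480, 576, 1)):
    """Convolutional autoencoder with the components described in the text;
    filter counts and depths are illustrative, not taken from figure 4."""
    inp = layers.Input(shape=input_shape)
    # Encoder: Conv2D + ReLU followed by max pooling.
    x = layers.Conv2D(16, 3, activation="relu", padding="same")(inp)
    x = layers.MaxPool2D(2)(x)
    x = layers.Conv2D(8, 3, activation="relu", padding="same")(x)
    x = layers.MaxPool2D(2)(x)
    # Encoded (latent) layer: no activation, as described in the text.
    x = layers.Conv2D(4, 3, activation=None, padding="same")(x)
    # Decoder: upsample back to the input resolution.
    x = layers.UpSampling2D(2)(x)
    x = layers.Conv2D(8, 3, activation="relu", padding="same")(x)
    x = layers.UpSampling2D(2)(x)
    x = layers.Conv2D(16, 3, activation="relu", padding="same")(x)
    # Sigmoid output keeps pixel values in [0, 1]. For the real 569 x 479
    # images a Cropping2D layer would trim the output back to the input size.
    out = layers.Conv2D(1, 3, activation="sigmoid", padding="same")(x)
    model = models.Model(inp, out)
    model.compile(optimizer="adam", loss="mse")  # MSE loss + Adam, as in the text
    return model

# Callbacks as described: early stopping with patience 5 on the loss, and a
# checkpoint that keeps the best model by minimum loss.
stop = callbacks.EarlyStopping(monitor="loss", patience=5)
ckpt = callbacks.ModelCheckpoint("best_model.keras", monitor="loss",
                                 save_best_only=True)
```

Training then reduces to `model.fit(x, x, callbacks=[stop, ckpt])`, since the target of an autoencoder is its own input.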

Input
We chose the no-glitch data for training because glitch data also contains subtle glitch signals; training on it would give the model information on how to reconstruct those signals, making it capable of reconstructing anomalies as well as no-glitch data. For that reason, the 428 no-glitch images were chosen for training. Looking at figure 2, we see that the shape and size of the background noise differ in each time window. To ensure the model learns to reconstruct correctly, we train the model on each time window separately, bringing the training data down to 107 images per model. The result is four models: the 0.5 s, 1.0 s, 2.0 s, and 4.0 s time window models.

Results and Discussion
The resulting models are first benchmarked on the training data to determine how to identify anomalous data. Then, the method is tested on real data generated directly from GWOSC time series data.

Benchmarks
After the models have been trained on 107 images for each time window, the mean squared losses, i.e. the reconstruction errors, of the final models are shown. However, they represent only the no-glitch reconstruction. We run another calculation on 22,000 images to determine the average, standard deviation (STD), maximum, and minimum of the reconstruction error. These statistics are crucial for determining whether a spectrogram contains an anomaly. The data presented in table 1 show that the reconstruction errors of the no-glitch data and the mean glitch data are far apart, with the glitch data having higher values. From that, we can say that the model did its job of detecting anomalies by failing to reconstruct them. However, we can see its shortcomings when we view the lowest glitch values, which come mostly from the 1400 Hz Ripples. Due to their subtle signal, the model has difficulty detecting them as anomalies. Furthermore, glitches that are only visible in certain time windows can cause a false negative if the models are used individually. Therefore, we cannot use a single model's result directly to assess whether the data is an anomaly. Having multiple models lets us use multiple windows so as not to miss an anomaly. To determine whether the data in a certain window is an anomaly, we add together the average and the standard deviation as a threshold, to avoid false positives on the no-glitch data. Using this method on the validation data by reconstructing 4,800 spectrograms (labelled 0 for no-glitch and 1 for glitch) yields the accuracies in figure 5. The accuracies were calculated by taking the mean value of the predictions on the glitch data, and by swapping 0 and 1 for the no-glitch data.
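The multiple-window thresholding described above can be sketched as follows; the per-window mean and STD values are placeholders standing in for the no-glitch statistics of table 1:

```python
import numpy as np

def reconstruction_error(original, reconstructed):
    """Mean squared error between a spectrogram and its reconstruction."""
    return np.mean((original - reconstructed) ** 2)

def is_anomaly(errors_per_window, noglitch_mean, noglitch_std):
    """Flag an event as an anomaly if the reconstruction error in ANY time
    window exceeds that window's no-glitch mean plus one standard
    deviation. All arguments are dicts keyed by window (0.5, 1.0, 2.0, 4.0)."""
    return any(
        errors_per_window[w] > noglitch_mean[w] + noglitch_std[w]
        for w in errors_per_window
    )
```

Using mean + STD rather than the mean alone raises the threshold just enough that ordinary no-glitch fluctuations rarely cross it, which is why it produces fewer false positives in figure 5.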

Tests on real data
Testing on real data means that we must generate the spectrograms ourselves. Both accessing the data and generating the Q-transform are performed with the help of GWpy, a Python package for gravitational-wave astrophysics [12]. Using the TimeSeries.fetch_open_data function within the GWpy package, we choose the detector, the start and end GPS times, and a 16384 Hz sample rate. Then we perform the Q-transform with whitening, selecting frequencies ranging from 10 Hz to 2048 Hz, and mean normalization. We then generate the plot, limiting the signal values from 0 to 25.5. These settings are required to generate spectrograms similar to Gravity Spy's dataset.
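The data access and Q-transform steps can be sketched with GWpy as follows. The function names `fetch_qscan` and `crop_bounds` are our own, the 16 s fetch padding is an assumption made so the whitening has context, and the plotting/color-limit step is omitted:

```python
def crop_bounds(t0, window):
    """Start and end GPS times for a spectrogram window centred on t0."""
    return t0 - window / 2, t0 + window / 2

def fetch_qscan(detector, t0, window):
    """Fetch open strain data around GPS time t0 and compute its Q-transform,
    mirroring the settings described in the text. GWpy is imported lazily so
    this module also loads in environments without it installed."""
    from gwpy.timeseries import TimeSeries

    start, end = crop_bounds(t0, window)
    # Fetch extra data around the event so the whitening has padding (assumed).
    strain = TimeSeries.fetch_open_data(detector, t0 - 16, t0 + 16,
                                        sample_rate=16384)
    # Whitened Q-transform, 10-2048 Hz, mean normalization, cropped to the window.
    return strain.q_transform(whiten=True, frange=(10, 2048), norm="mean",
                              outseg=(start, end))
```

Calling `fetch_qscan("H1", t0, w)` for each of the four windows yields the inputs for the corresponding per-window model.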
The program follows these steps. First, we pick a starting point for the search. Second, we fetch the data corresponding to the event time and generate spectrograms for the 0.5 s, 1.0 s, 2.0 s, and 4.0 s windows. Third, we use the model for each window to calculate the reconstruction error. Fourth, we determine whether each reconstruction error indicates an anomaly. If any one window shows up as an anomaly, the corresponding event time is determined to be an anomaly. Finally, the program repeats by searching the next time after the starting point.
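The steps above can be sketched as a loop. The `make_spectrogram`, `models`, and `thresholds` arguments are stand-ins for the GWpy spectrogram generation, the four trained autoencoders, and the per-window mean + STD cut-offs; the names are our own:

```python
WINDOWS = (0.5, 1.0, 2.0, 4.0)

def mse(a, b):
    """Mean squared error between two flat sequences of pixel values."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def search(start_gps, n_steps, step, make_spectrogram, models, thresholds):
    """Scan forward from start_gps in fixed steps, flagging event times
    where any time window's reconstruction error exceeds its threshold."""
    anomalies = []
    t0 = start_gps
    for _ in range(n_steps):
        anomalous = False
        for w in WINDOWS:
            spec = make_spectrogram(t0, w)      # step 2: generate spectrogram
            recon = models[w].predict(spec)     # step 3: reconstruct it
            if mse(spec, recon) > thresholds[w]:
                anomalous = True                # step 4: one window suffices
        if anomalous:
            anomalies.append(t0)
        t0 += step                              # step 5: advance and repeat
    return anomalies
```

Because a single flagged window marks the whole event time, glitches visible in only one window are still caught, matching the multiple-window rationale above.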
Figure 6 shows the actual data and what the model reconstructs. We observe that the model is capable of reconstructing the glitch that appears, but the model also amplifies the background noise, leading to an increase in reconstruction error and the data being categorized as an anomaly. Another thing to note is that the model reconstructs black regions as gray.
Testing it on Gravitational-wave Transient Catalog (GWTC) data, we present two examples. In the first case, figure 7, the glitch is prominent in the ground truth. We also observe the same noise amplification as in the blip data. The reconstructed signal also exhibits a feathering effect, in that it thickens, leading to it being detected as an anomaly. In contrast, the failed detection in figure 8 shows that the ground truth looks similar to noise, leading to a lower reconstruction error. This shows the limitation of the model when it comes to detecting subtle signals.
Running the program through 10 samples of GWTC event times, using both detectors and counting an event as an anomaly if either detector flags one, the model was able to classify all of them correctly. The algorithm performed the anomaly search consistently within around 30 seconds: accessing the data took around 10 seconds, generating the spectrograms took around 18 seconds, and predicting the signals took around 2 seconds. These processes were run in Google Colab, an interactive Python environment for prototyping machine learning models [13].
Improvements could be made to speed up the anomaly detection process, such as downloading the data so that the algorithm only has to access it once, or performing multiple searches at once. Lowering the spectrogram dimensions could also speed up training and exploration.

Conclusion
In this paper, we developed a method to perform anomaly searches in gravitational-wave data using convolutional autoencoders, showing that it is capable of detecting glitches as well as gravitational-wave signals. The model correctly identified all 10 sampled GWTC event times as containing anomalies, demonstrating its capability in detecting gravitational-wave signals. However, it has a flaw: the model cannot detect subtle signals, whose low reconstruction error nears that of the no-glitch signal.
The search using our model and algorithm took around 30 seconds per event time on the Google Colab platform. The algorithm could be improved by using multiple searches or by batching the spectrogram images for later prediction. Furthermore, decreasing the dimensions of the spectrograms is also an option for faster model training.

Figure 1. Plots of the GW150914 detection. The left side is the Hanford interferometer and the right side is the Livingston interferometer. The time series data show the merger of two black holes rotating faster as they approach each other, generating a higher frequency. The spectrogram shows a similar depiction of the "chirping" behaviour. Figure taken from [2].

Figure 2. Spectrogram example of a glitch at differing time windows. Smaller time windows have more apparent background noise, and larger time windows have less. Figure taken from [3].

Figure 3. Visualization of an autoencoder. The input is compressed into a latent space that is then decoded. Figure taken from [8].

Figure 4. Architecture of the autoencoder used in this work. The sizes below the layer names correspond to the size of the image and its depth, in this case the number of filters. The number beside Conv2D corresponds to the kernel size and the number beside MaxPool2D corresponds to the pooling size. The small house-like shapes pointing towards the next layer visualize the convolution operation.

Figure 5. Comparison of the multiple-window assessment. The accuracies are taken from the predictions on 4,800 validation images. Detection using only the average as the threshold results in many false positives, as seen in the no-glitch accuracy. On the other hand, although using the average plus the standard deviation as the threshold may seem worse in terms of glitch accuracy, it is much better at detecting true negatives on the no-glitch data.

Figure 6. Blip spectrogram ground truth (top) and reconstruction (bottom) with time windows 0.5 s to 4.0 s from left to right. The spectrogram that contains a glitch shows amplified noise, causing the error to go up and the image to be detected as a glitch.

Figure 7.

Figure 8. GWTC spectrogram ground truth (top) and reconstruction (bottom) with time windows 0.5 s to 4.0 s from left to right. This shows a failed detection of a subtle signal. In this case, the signal blends with the noise, resulting in a lower reconstruction error.

Table 1. Benchmark of reconstruction error on the training data.