
Optimizing Numerical Simulations of Colliding Galaxies. II. Comparing Simulations to Astronomical Observations


Published August 2020 © 2020. The American Astronomical Society. All rights reserved.
Citation: Matthew Ogden et al. 2020 Res. Notes AAS 4 138. DOI: 10.3847/2515-5172/abad9c


Abstract

Our research focuses on determining the dynamical orbital parameters associated with galaxy collisions. We accomplish this by optimizing numerical models of galactic interactions against astronomical observations of colliding galaxies and measuring the uncertainty in the parameter space of these models. This note focuses on how we compare our numerical models to observed colliding galaxies. We create gravitational n-body models that capture the tidal distortions from galaxy collisions. The final particle positions are then processed to create model images with realistic intensity profiles and resolutions. We compare each model's output with the astronomical images to obtain an objective machine score that should reflect the fitness of the model. To test our scoring methods, we then compare our machine scores to the human scores obtained from the citizen science project Galaxy Zoo: Mergers.


1. Introduction

1.1. Introduction

The merging of galaxies disrupts stellar orbits, creating dramatic changes in the stellar distribution function, and triggers the collapse of gas clouds, leading to bursts of star formation. Given the importance of collisions in the evolution of galaxies, we want to better understand the interaction of observable systems by modeling them. Unfortunately, we have limited observational data with which to reconstruct these interactions. For instance, we know the projected galaxy positions on the night sky, but not their offsets along the line of sight (the z-direction). We do not know their complete velocities; we know only the z-velocity, and then only when redshift data are available for both galaxies. We can constrain the orientations and masses of galaxies from observations, but a range of possible values remains. With so many unknown and uncertain parameters, our ultimate goal is to constrain those values to better understand observable galaxy mergers.

1.2. Research Goals and Objectives

To explore and constrain so many unknown parameters, we plan to apply an adaptive genetic algorithm to our models and simulations. For more information, please see Graham West's research note, "Optimizing Numerical Simulations of Colliding Galaxies I: Fitness Functions and Optimizing Algorithms."

In this note, we discuss our approach to developing a computational method for visualizing the models by creating model images and comparing them to a target image. By doing so, we quantify how similar the two images are and work toward a computational method for comparing our models to observational data (Abazajian et al. 2009).

2. Background

2.1. Simulation Software: SPAM

SPAM is publicly released software using the restricted three-body method, created and distributed by Wallin, Holincheck, and Harvey (Wallin et al. 2015). SPAM takes predicted orbital parameters, such as the positions, velocities, orientations, masses, and sizes of each galaxy, and generates potential functions for massless point particles, resulting in O(N) computational speed. A disk of massless particles is placed in orbit around each parent galaxy and then integrated forward in time. The simulated disks undergo realistic tidal distortions that have been validated against full N-body simulations.
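The restricted three-body idea can be sketched in a few lines: each massless disk particle feels only the softened gravity of the two galaxy centers, so the force loop is O(N) in the particle count. This is an illustrative sketch, not SPAM itself; the fixed galaxy centers, the softening length, and the G = 1 unit choice are all simplifying assumptions (SPAM also integrates the galaxies' relative orbit).

```python
import numpy as np

def leapfrog_step(pos, vel, centers, masses, dt, soft=0.1):
    """Advance massless test particles one leapfrog step in the softened
    potential of two point-mass galaxies (G = 1 units)."""
    def accel(p):
        a = np.zeros_like(p)
        for c, m in zip(centers, masses):
            d = c - p                          # vector toward each galaxy center
            r2 = np.sum(d * d, axis=1) + soft ** 2
            a += m * d / r2[:, None] ** 1.5    # softened inverse-square force
        return a

    vel_half = vel + 0.5 * dt * accel(pos)
    pos_new = pos + dt * vel_half
    vel_new = vel_half + 0.5 * dt * accel(pos_new)
    return pos_new, vel_new

# A ring of test particles on circular orbits around galaxy 1,
# tidally perturbed by galaxy 2 (centers held fixed for simplicity).
theta = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
pos = np.column_stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)])
vel = np.column_stack([-np.sin(theta), np.cos(theta), np.zeros_like(theta)])
centers = np.array([[0.0, 0.0, 0.0], [5.0, 0.0, 0.0]])
masses = np.array([1.0, 0.5])

for _ in range(200):
    pos, vel = leapfrog_step(pos, vel, centers, masses, dt=0.01)
```

Because each particle is massless, no particle-particle forces are computed, which is what keeps the cost linear in the number of particles.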

2.2. Galaxy Zoo: Mergers

Galaxy Zoo: Mergers was a large-scale citizen science project to find models of ongoing galaxy interactions (Holincheck et al. 2016). Thousands of volunteers viewed ∼3.31 million models via a website to identify potential models with tidal distortions similar to those of 62 target galaxy collisions. Of those models, 58,000 were flagged by the volunteers as having possible similarity and were given a ranking, hereafter the human fitness score, based on a tournament-like competition judging how closely they match their target image.

3. Visualizing Models

3.1. Simulation

Models from Galaxy Zoo: Mergers were run through SPAM to generate high-resolution simulations, producing two particle sets for each model: the initial and final particle positions, before and after the interaction.

3.2. Image Creation

The initial and final particle sets are processed and visualized to create more realistic images of the interacting galaxies, hereafter referred to as the initial and model images. We can then directly compare model images to target images. The visualization process includes Gaussian convolution, application of realistic brightness profiles, and brightness normalization.
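A minimal sketch of this image-creation step, assuming simple 2D histogram binning, a separable Gaussian blur standing in for the telescope's point-spread function, and peak normalization; the grid size, extent, and blur width are illustrative choices, not the actual pipeline's parameters:

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur implemented with two passes of 1D convolutions."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()
    img = np.apply_along_axis(lambda row: np.convolve(row, kernel, mode="same"), 1, img)
    img = np.apply_along_axis(lambda col: np.convolve(col, kernel, mode="same"), 0, img)
    return img

def particles_to_image(xy, shape=(64, 64), extent=4.0, sigma=1.5):
    """Bin 2D particle positions onto a pixel grid, blur to mimic seeing,
    and normalize the brightness to a peak of 1."""
    img, _, _ = np.histogram2d(
        xy[:, 0], xy[:, 1], bins=shape,
        range=[[-extent, extent], [-extent, extent]])
    img = gaussian_blur(img, sigma)
    peak = img.max()
    return img / peak if peak > 0 else img

rng = np.random.default_rng(0)
xy = rng.normal(size=(5000, 2))        # stand-in for simulated particle positions
model_image = particles_to_image(xy)
```

A realistic brightness profile (e.g., weighting particles by an exponential disk) would be applied as per-particle weights in the histogram step.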

3.3. Tidal Perturbation

A key quantity used later is how much the morphology of the particles changes under the significant tidal distortions of an interaction. By applying our direct image comparisons between the initial and model images, we quantify the perturbation of the particles.

4. Machine Score: Direct Image Comparison

4.1. Goal

By comparing the model image with a target image, we quantify the similarity between our model and observational data, with the hope of capturing morphological differences. Human fitness scores from Galaxy Zoo: Mergers serve as an independent check on the accuracy of our machine scores. Unfortunately, our initial results found no strong correlation. The methods used thus far are Average Pixel Difference, Brightness Correlation, Structural Similarity Index, and Fraction of Overlapping Pixels.
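Three of these comparisons can be sketched in a few lines each; the threshold in the overlap measure and the synthetic test images are illustrative assumptions, and the Structural Similarity Index is omitted for brevity:

```python
import numpy as np

def avg_pixel_difference(a, b):
    """Mean absolute difference between two normalized images."""
    return np.mean(np.abs(a - b))

def brightness_correlation(a, b):
    """Pearson correlation between the pixel brightness values."""
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]

def overlap_fraction(a, b, thresh=0.1):
    """Fraction of 'lit' pixels (above threshold) common to both images."""
    ma, mb = a > thresh, b > thresh
    union = np.logical_or(ma, mb).sum()
    return np.logical_and(ma, mb).sum() / union if union else 1.0

# Synthetic target and a model that resembles it up to noise.
rng = np.random.default_rng(1)
target = rng.random((64, 64))
model = np.clip(target + 0.05 * rng.standard_normal((64, 64)), 0.0, 1.0)
score = brightness_correlation(model, target)
```

Each function maps an image pair to a single scalar, which is what allows the model to be ranked by a machine score.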

Figure 1.

Figure 1. Comparing a machine score against human fitness scores before and after filtering out weakly perturbed models.


4.2. Results

By graphing the perturbation of the models, we found that all machine methods are profoundly affected by models with low tidal distortion. Once these models were filtered out, the correlation improved drastically for all image comparison methods, although the filter also removed some models ranked highly by humans. Merely measuring the fraction of overlapping pixels or performing pixel-by-pixel comparisons of the brightness profiles does not, on its own, produce scores similar to the human rankings. These results show that we need metrics beyond simple pixel comparison to ensure that models are not ranked highly simply because they have a well-chosen but unperturbed disk. The effect of this single filter also suggests that regions of the image unrelated to morphology, such as bright galactic centers, profoundly affect the machine score in undesirable ways.
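The filtering step can be sketched as follows, using synthetic scores in which only sufficiently perturbed models track the human ranking; the threshold value and the toy score distributions are assumptions made for illustration:

```python
import numpy as np

def correlation_after_filter(machine, human, perturbation, min_pert=0.3):
    """Drop models whose initial-vs-final image difference (the 'perturbation')
    falls below a threshold, then recompute the machine-human correlation."""
    keep = perturbation >= min_pert
    return np.corrcoef(machine[keep], human[keep])[0, 1], keep

# Toy data: machine scores track human scores only for perturbed models;
# weakly perturbed models receive essentially random machine scores.
rng = np.random.default_rng(2)
human = rng.random(500)
perturbation = rng.random(500)
machine = np.where(perturbation >= 0.3,
                   human + 0.1 * rng.standard_normal(500),
                   rng.random(500))

r_all = np.corrcoef(machine, human)[0, 1]
r_filtered, keep = correlation_after_filter(machine, human, perturbation)
```

In this toy setup the filtered correlation exceeds the unfiltered one, mirroring the behavior described above.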

5. Machine Score: Data-driven Method

5.1. Single Layer Neural Network

In contrast to comparing the model and target images via a fitness function, we can use machine learning to train directly on the models from Galaxy Zoo: Mergers. Using half of the models for a given target, roughly 1300 models, we trained a single-layer neural network. The network reads in the model images and is trained to predict the human fitness score. Testing on the other half of the models showed excellent initial results.
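A single linear layer trained by gradient descent on flattened images captures the idea; the tiny image size, learning rate, and synthetic training data below are illustrative assumptions, not the actual Galaxy Zoo: Mergers setup:

```python
import numpy as np

def train_single_layer(images, scores, lr=0.1, epochs=2000):
    """Fit a single linear layer (weights + bias) mapping flattened model
    images to human fitness scores by full-batch gradient descent."""
    X = images.reshape(len(images), -1)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        err = X @ w + b - scores          # prediction error
        w -= lr * X.T @ err / len(X)      # mean-squared-error gradient
        b -= lr * err.mean()
    return w, b

# Synthetic stand-in data: 200 tiny "images" whose scores are an exact
# linear function of the pixels, so the layer should recover them.
rng = np.random.default_rng(3)
images = rng.random((200, 8, 8))
true_w = rng.standard_normal(64)
scores = images.reshape(200, -1) @ true_w

w, b = train_single_layer(images, scores)
pred = images.reshape(200, -1) @ w + b

# Reshaping the learned weights back to image dimensions gives an
# influence map over pixel positions.
weight_map = w.reshape(8, 8)
```

The final reshape is what makes the weight analysis possible: each entry of `weight_map` shows how strongly that pixel position pushes the predicted fitness up or down.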

5.2. Weight Analysis

In addition to producing an accurate scoring method, we can visualize the network weights and identify regions of the image that strongly influence the predicted fitness score. When visualized, the bridge and tail regions are clearly highlighted. This makes intuitive sense: models with bright pixels in these regions have a shape similar to the target image. In addition, the weights highlight negative regions that penalize the score.

6. Results and Conclusions

6.1. Results: Direct Image Comparison

As is, our direct image scoring method performs poorly and is arguably naive. Each pixel in the image contributes equally to the final machine fitness score. This approach leads to undesired results, such as models with weakly perturbed disks, bright galactic centers, and low human scores being given a high machine score. Despite this known flaw, we see some correlation between machine and human scores, which can be built upon and improved.

6.2. Results: Single-layer Neural Network

Although our neural network produced excellent results, it is important to understand why. By design, neural networks train directly on data and are known to inherit its biases. For instance, the majority of models from Galaxy Zoo: Mergers are poorly ranked, so the neural network is biased toward correctly ranking bad models. As a consequence, this method can only be applied to target systems for which we have a large set of ranked simulations; we would not be able to generalize it to new target systems.

Lastly, at no point does this particular approach consider observational data. Our goal is to develop a scoring method against observational data and to optimize on that scoring method. If we attempted to optimize using this data-driven score, it would be unlikely to converge toward the observed orbital parameters.

6.3. Conclusions and Future Development

Thus far, none of the scoring methods is fully robust. Both direct image comparison and data-driven neural networks have strengths and weaknesses. Our next goal is to develop a method for creating a weighted mask for the direct image comparison. By doing so, we can emphasize regions that capture morphological changes, and with the citizen science data we can robustly test our method for creating weighted features. We will also look into more complex direct image comparisons, such as feature extraction and pattern recognition, and it is possible to incorporate these methods into more complex neural networks.
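A weighted-mask comparison could look like the following sketch; the radial mask that downweights the bright galactic center is purely a hypothetical choice to illustrate the idea:

```python
import numpy as np

def weighted_pixel_difference(model, target, mask):
    """Direct image comparison in which each pixel's contribution is
    scaled by a weight mask emphasizing morphologically important regions."""
    weights = mask / mask.sum()            # normalize so the score lies in [0, 1]
    return np.sum(weights * np.abs(model - target))

# Hypothetical mask: upweight pixels far from the bright galactic center,
# where tidal bridges and tails live, and downweight the core.
yy, xx = np.mgrid[0:64, 0:64]
r = np.hypot(xx - 32.0, yy - 32.0)
mask = np.clip(r / r.max(), 0.1, 1.0)

rng = np.random.default_rng(4)
target = rng.random((64, 64))
model = rng.random((64, 64))
score = weighted_pixel_difference(model, target, mask)
```

In practice, such a mask could be derived from the neural-network weight maps described above rather than chosen by hand.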
