Automatic detection process of solar active region based on SDO/AIA digital image

The aim of this study is the automatic detection of sunspots on the solar disc using MATLAB. Digital images from the SDO (Solar Dynamics Observatory)/AIA (Atmospheric Imaging Assembly) satellite, in JPG format at 512 x 512 pixels, were processed in several steps: threshold segmentation to separate the object from its background, followed by extraction of the area and coordinate points (centroids) of the sunspots. The sunspot coordinates obtained from the automatic detection were then compared with the coordinates released by NASA at solarmonitor.org. The error calculation using RMSE gives 30.824022" to 1120.2831" for the horizontal position and 46.509763" to 266.74988" for the vertical position. These errors are still considered significant, since Lembaga Penerbangan dan Antariksa Nasional (LAPAN, Indonesian National Institute of Aeronautics and Space) specifies a maximum error tolerance of 20". The main reason for the difference is that the spatial characteristics of the EUV images used for automatic detection differ from those of the photosphere images used by solarmonitor.org as the reference catalogue of sunspots.


Introduction
The phenomena that occur on the Sun, collectively called solar activity, consist of sunspots, prominences, flares, coronal mass ejections (CMEs), and coronal holes. One activity whose development can be observed is the Sun's active region, or sunspot. Due to its link to other kinds of solar activity, sunspot occurrence can be used to help predict space weather [1], the state of the ionosphere [3], [4], and hence the conditions of shortwave radio propagation or satellite communications [5]-[7].
The formation of sunspots can be spotted by identifying the Sun's active regions, which can be seen in satellite images of the Sun. One of the satellites for solar observation is SDO (Solar Dynamics Observatory), whose AIA (Atmospheric Imaging Assembly) instrument outputs digital image data. NASA provides sunspot identification on the website solarmonitor.org using images at multiple wavelengths. There are also several methods for active region detection, such as a fractal-based fuzzy technique [8], thresholding and region growing [9], or even the use of solar interior properties [10]. Here, we investigate whether it is possible to detect solar active regions using only the corona image of the Sun.
MATLAB is software that is often used for computation and simulation [11]-[13]. MATLAB also provides easy tools for image processing, as a user can easily access each and every pixel [14]. Moreover, there is an Image Processing Toolbox [15] for MATLAB for this purpose. For these reasons, the automatic detection of sunspots on the solar disc is done here in MATLAB, based on digital images from SDO/AIA. Furthermore, the purpose of this research is to predict the formation of sunspots on the solar disc by using automatic detection based on SDO/AIA images.

SDO/AIA satellite image
The Solar Dynamics Observatory (SDO) is a NASA mission that has observed the Sun since 2010. SDO was launched on February 11, 2010, as part of the Living With a Star (LWS) program. SDO is a three-axis stabilized spacecraft with two solar arrays and two high-gain antennas, in a geosynchronous orbit around the Earth. The purpose of SDO is to understand the effect of the Sun on the Earth by studying the solar atmosphere on small scales of space and time and in many wavelengths simultaneously [16].
AIA (Atmospheric Imaging Assembly) provides a clearer view of the evolving phenomena that occur in the outer atmosphere of the Sun. The main purpose of AIA is scientific research. Using AIA data together with data from the other SDO instruments and from other observations significantly improves understanding of how the activity seen in the atmosphere of the Sun also plays out in the heliosphere and around the planets.
Images of solar activity can be accessed daily through the solarmonitor.org website in various formats and image sizes. This research uses images of the surface of the Sun taken from August 30th to September 3rd, 2017.

Threshold segmentation
Image segmentation is a process to divide an image into segments or non-overlapping areas. In digital image terms, resulting segmentation areas are defined by groups of neighboring or connected pixels. The segmentation process is used in many applications, and even though the methods used can be varied, they all support the same purpose which is to obtain a simple and useful representation from an image [17].
Thresholding is a segmentation technique that is commonly used for images in which the intensity values of the background and the main object differ significantly. The method requires a value that acts as the limit between the background and the main object; this value is called the threshold. Thresholding partitions the image by assigning all pixels whose intensity is greater than the threshold T to the foreground and all pixels whose intensity is smaller than T to the background, with T being a floating-point value.
Generally, the thresholding process for a gray scale image is done to generate a binary image [18], which can be written mathematically as
\[ g(x, y) = \begin{cases} 1, & f(x, y) > T \\ 0, & f(x, y) \le T \end{cases} \]
with g(x, y) being the binary image obtained from the gray scale image f(x, y), and T being the threshold value.
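The thresholding rule described above can be illustrated with a minimal sketch. It is written in plain Python rather than MATLAB, and it is not the code used in this study: the function name `threshold_image`, the sample pixel values, and the threshold of 128 are all hypothetical and chosen only for illustration.

```python
def threshold_image(gray, t):
    """Return a binary image: 1 where the gray level exceeds t, else 0."""
    return [[1 if value > t else 0 for value in row] for row in gray]


# Tiny hypothetical 3 x 4 gray scale image (gray levels 0..255).
gray = [
    [10, 20, 200, 210],
    [15, 180, 220, 30],
    [12, 14, 16, 18],
]

# Hypothetical threshold; the study does not state the value it used.
binary = threshold_image(gray, t=128)
for row in binary:
    print(row)
```

Pixels exactly equal to the threshold are assigned to the background here, which is one common convention; the opposite choice would work equally well as long as it is applied consistently.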

MATLAB regionprops function
In MATLAB, the regionprops function measures the properties of each connected region in an image, such as its area and centroid. Each object is represented as a region approximated by a rectangle. Figure 1 shows a region that consists of white pixels represented with this rectangular approach [17].
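To make the idea of region measurement concrete, the following is a rough sketch in plain Python of what a regionprops-style measurement does: it groups connected white pixels and reports the area, centroid, and bounding rectangle of each group. It is not MATLAB's implementation. The function name `region_props`, the use of 8-connectivity, and the 0-based (x, y) coordinates are assumptions made for this illustration (MATLAB reports 1-based pixel coordinates).

```python
from collections import deque


def region_props(binary):
    """Label 8-connected regions of 1-pixels; return area, centroid, bbox.

    Centroids are (x, y) in 0-based pixel indices. The bounding box is
    (x_min, y_min, width, height) of the enclosing rectangle.
    """
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    regions = []
    for y0 in range(height):
        for x0 in range(width):
            if binary[y0][x0] != 1 or seen[y0][x0]:
                continue
            # Breadth-first flood fill collects one connected region.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            pixels = []
            while queue:
                x, y = queue.popleft()
                pixels.append((x, y))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and binary[ny][nx] == 1
                                and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            xs = [p[0] for p in pixels]
            ys = [p[1] for p in pixels]
            area = len(pixels)
            regions.append({
                "area": area,
                "centroid": (sum(xs) / area, sum(ys) / area),
                "bbox": (min(xs), min(ys),
                         max(xs) - min(xs) + 1, max(ys) - min(ys) + 1),
            })
    return regions


# Hypothetical binary image with two separate white regions.
binary = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
regions = region_props(binary)
for region in regions:
    print(region)
```

The area is a plain pixel count and the centroid is the mean of the pixel coordinates, which matches the scalar and coordinate-pair forms of the two quantities used in this study.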

Image processing process
The data used in this research are images in JPG format with a temporal resolution of 15 minutes and an image size of 512 x 512 pixels, taken from the NASA SDO website [19]. The images are color images with three color components, RGB (red, green, and blue). In MATLAB, such an image is represented as a three-dimensional matrix of size 512 × 512 × 3. The color image was therefore converted to a gray scale image, which reduces each pixel to a single gray level between 0 (black) and 255 (white), so that the image is represented as a regular 512 × 512 matrix.
After the gray scale conversion, the next step is the thresholding operation. The thresholding process divides the gray levels into two classes, black and white, so that the main object and the background of the image can be identified. A pixel with a brightness level higher than the threshold is given the value 1 (white, the object), while a pixel with a brightness level below the threshold is given the value 0 (black, the background). This was done to make the bright area indicating the active region stand out from the background.
The next step is to obtain the area and the coordinates of the sunspots, the latter derived from the centroid value, using the regionprops function in MATLAB. The area is a scalar value, while the centroid is a pair of coordinates. The values of the area and the centroid of the sunspots for August 30th, 2017 can be seen in Table 1.
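The conversion from an RGB image to a gray scale image can be sketched as follows, in plain Python rather than MATLAB. The weights 0.299, 0.587, and 0.114 are the widely used luminance weights and are an assumption here, since the study does not state which conversion formula was applied; the function name `rgb_to_gray` and the sample pixels are likewise hypothetical.

```python
def rgb_to_gray(rgb):
    """Convert rows of (R, G, B) tuples (0..255) to rounded gray levels."""
    return [
        [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
        for row in rgb
    ]


# Hypothetical 2 x 2 color image: black, white, pure red, pure blue.
rgb = [
    [(0, 0, 0), (255, 255, 255)],
    [(255, 0, 0), (0, 0, 255)],
]
gray = rgb_to_gray(rgb)
for row in gray:
    print(row)
```

Each pixel goes from three numbers to one, so a 512 × 512 × 3 matrix becomes a 512 × 512 matrix, as described above.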

Comparison of the sunspots detection with NASA
To determine whether the result of the automatic detection matches the sunspots provided by NASA, the image resulting from the automatic detection was compared with the image provided by NASA on solarmonitor.org. The comparison was done by matching the centroid coordinates from the automatic detection with the coordinates from NASA.

Coordinate validation using RMSE
Because the sunspot data from the automatic detection are in pixels while the data from solarmonitor.org are in arc seconds, a transformation is needed before the validation process. Besides the difference in units, the scale and the origin point (0,0) of the Cartesian coordinates of the centroids also differ between the two images. The total width of the image from solarmonitor.org is 2460" with the origin point in the middle, while the total width of the analysed image is 512 pixels with the origin point at the top left. One pixel therefore corresponds to 2460"/512 ≈ 4.805", and the coordinates were transformed using these functions:
\[ x' = 4.805\,x - 1230 \]
\[ y' = 1230 - 4.805\,y \]
where (x, y) are the centroid coordinates in pixels and (x', y') are the coordinates in arc seconds. Results of the transformation can be seen in Table 3. The RMSE values ranged from about 30 to over 1000 arc seconds. This is far above the ideal research target determined by Lembaga Penerbangan dan Antariksa Nasional (LAPAN, Indonesian National Institute of Aeronautics and Space), which is a maximum of 20". The results of this research can be affected by the factors stated below:
- The spatial characteristics of the EUV image (used for detection) differ from the characteristics of the photosphere image (from solarmonitor.org) that serves as the reference catalogue of the sunspots.
- The image resolutions differ between the image used for detection (512 x 512 pixels) and the one taken from solarmonitor.org (1230 x 1230 pixels).
- The analysed EUV image had a different epoch. Sunspots may change in position during the rotation of the Sun (13° to 130° per day).
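The pixel-to-arc-second transformation and the RMSE check described in this section can be sketched in plain Python as follows. This is an illustrative sketch, not the code used in the study. It assumes the same scale of 2460"/512 pixels on both axes and the usual definition of RMSE as the square root of the mean squared difference; the function names and the sample numbers are hypothetical.

```python
import math

IMAGE_WIDTH_PX = 512        # analysed image, origin at the top left
FIELD_WIDTH_ARCSEC = 2460   # solarmonitor.org image, origin at the centre
SCALE = FIELD_WIDTH_ARCSEC / IMAGE_WIDTH_PX  # about 4.805 arcsec per pixel
HALF_WIDTH = FIELD_WIDTH_ARCSEC / 2          # 1230 arcsec


def pixel_to_arcsec(x, y):
    """Map pixel (x, y) to disc-centred coordinates in arc seconds.

    x grows to the right in both systems. y grows downward in the image
    but upward in the disc-centred system, hence the sign flip.
    """
    return SCALE * x - HALF_WIDTH, HALF_WIDTH - SCALE * y


def rmse(detected, reference):
    """Root mean square error between two equal-length number lists."""
    if len(detected) != len(reference) or not detected:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(d - r) ** 2 for d, r in zip(detected, reference)]
    return math.sqrt(sum(squared) / len(squared))


# The image centre maps to the origin of the arc-second system.
print(pixel_to_arcsec(256, 256))

# Hypothetical horizontal positions (arc seconds), not data from the study.
detected_x = [100.0, -250.0]
reference_x = [103.0, -253.0]
print(rmse(detected_x, reference_x) <= 20)  # compare with the 20" tolerance
```

Computing the RMSE separately for the horizontal and vertical positions, as done in this study, shows along which axis the detection deviates more from the reference.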

Conclusion
Automatic detection of sunspots was done in MATLAB with images taken from the SDO/AIA satellite. The images were solar extreme ultraviolet (EUV) images in JPG format with a resolution of 512 x 512 pixels. The steps taken included threshold segmentation and measurement with regionprops. The results of the automatic detection process were compared with the results provided by NASA on the solarmonitor.org website. The detection results were transformed so that the data would have the same unit of measurement as solarmonitor.org. Error calculation using RMSE resulted in values between 30.824" and 1120.283" for the horizontal position, and between 46.509" and 266.749" for the vertical position. These values were considered significant, as LAPAN had set the tolerance level at a maximum of 20". Because of that, validation in further research should be done using digital images taken in the same time period and with the same pixel size as the images provided by NASA.