Correlation Analysis of GNSS-Derived Precipitable Water Vapor (PWV) with Rainfall Data in Indonesia

Global Navigation Satellite Systems (GNSS) have become an important tool for various positioning-related activities, such as navigation, construction projects, and deformation measurement. GNSS can also be used to estimate Precipitable Water Vapor (PWV) for meteorological purposes. PWV is a measure of atmospheric water vapor, which can eventually precipitate as rain. Since Indonesia’s climate is mainly characterized by changes in rainfall, monitoring precipitation is a crucial step in understanding its pattern. In this study, we aim to analyze the correlation between PWV and rainfall in Indonesia. We used GNSS observations from the InaCORS network across Indonesia, as well as GSMaP rainfall data at the InaCORS station locations, from 1 January to 31 December 2019. Both datasets were normalized every five days (pentad days) and compared with each other to obtain Pearson’s correlation coefficients. From these results, we generated a heatmap of the Pearson’s correlation coefficients between pentad PWV and rainfall at InaCORS locations in Indonesia, with the highest correlation of 0.753819 in Garut Regency and the lowest correlation of 0.090146 in Fakfak Regency. Moreover, we analyzed the time series of PWV and rainfall at several sampling stations. We found that PWV and rainfall are positively correlated, and that southern regions of Indonesia show higher correlations than northern regions.


Introduction
In modern times, the Global Navigation Satellite System (GNSS) has become an integral part of the public's daily activities. GNSS is a term that refers to a set of satellite systems and technologies for positioning and navigation [1]. In addition, GNSS can be used in any weather without significant obstacles. GNSS is classed as an extraterrestrial measurement method because it utilizes objects in space. GNSS has been widely used in various countries for activities that require information about position, navigation, and time [2]. Currently, there are various GNSS satellites such as GPS (Global Positioning System) from the United States, GLONASS

Study Area
The Republic of Indonesia is a country that lies between two continents, Asia and Australia, and between two oceans, the Pacific Ocean and the Indian Ocean [12]. Indonesia has approximately 17,000 islands, located between 6 degrees north and 11 degrees south latitude and from 95 to 141 degrees east longitude. The five largest islands in Indonesia are Sumatra, Java, Kalimantan (Borneo), Sulawesi, and Papua (New Guinea) [13]. Indonesia has a tropical climate because it lies on the equator [14].
The sample stations are spread across the major islands of Indonesia, as shown in Figure 1. These islands are Sumatra, Java, Kalimantan, Sulawesi, and Papua. For Sumatra, Java, and Papua, two InaCORS stations were used on each island to compare lowland and highland stations. For Kalimantan and Sulawesi, only one InaCORS station was used on each island because these islands do not have an InaCORS station in the highlands.
The latitude and longitude coordinates and the altitude above the ellipsoid of each InaCORS station studied are listed in Table 1. The coordinates and altitudes were obtained from http://nrtk.big.go.id/, managed by Badan Informasi Geospasial (Geospatial Information Agency) [15]. Based on the table, the InaCORS stations in highland areas are CBKT, CLBG, and CWMN. The other InaCORS stations are located in lowland areas.

Precipitable Water Vapor (PWV) data
These PWV data were originally GNSS observation data from the InaCORS stations. GPS data obtained from InaCORS station observations can be processed to obtain tropospheric delay parameters. The processing uses software called GAMIT (GNSS at MIT) and was carried out by the Geospatial Information Agency (BIG). GAMIT was developed by MIT (Massachusetts Institute of Technology), the Harvard-Smithsonian Center for Astrophysics (CfA), the Scripps Institution of Oceanography (SIO), and the Australian National University [16]. The software can estimate coordinates and velocities of observing stations, stochastic or functional representations of post-seismic deformation, atmospheric delays, satellite orbits, and Earth orientation parameters. The software is written in Fortran and C. GAMIT can be run on various operating systems, such as Windows, Linux, and macOS [16].
The InaCORS data come from 237 different observation stations spread across Indonesia. For InaCORS samples, 18 stations were used. These data are then further processed to obtain values related to the disturbance parameters in GNSS positioning. The values in the processed data are GNSS observation time, Zenith Total Delay (ZTD), Zenith Hydrostatic Delay (ZHD), Zenith Wet Delay (ZWD), pressure, temperature, and PWV. The data span more than 17 years, from 1 January 2004 to 10 October 2021, at an interval of 1 hour.

Rainfall data
The rainfall data are secondary data, like the PWV data discussed earlier. These data cover the period from March 1, 2014, to September 30, 2021. The data are presented as a text document (.txt). Rainfall data can be accessed through JAXA Global Rainfall Watch (GSMaP). GSMaP has a global rainfall database that can be downloaded for free, although user registration is required. GSMaP rainfall data can be accessed at https://sharaku.eorc.jaxa.jp/GSMaP/.
As previously stated, the rainfall data used in this study are from 237 InaCORS locations. Each record contains several values. The first is the time of the rainfall observation, divided into year, month, day, and hour. The next is the amount of rainfall at that observation time. Rainfall intensity over one hour is expressed in millimeters [23,24].

PWV determination from GPS data
Mathematically, PWV is closely related to the disturbance parameters in GNSS positioning caused by atmospheric delay. These parameters, namely ZTD, ZHD, and ZWD, can be calculated with several formulas. Based on Nilsson et al. [25], ZTD can be determined from the total tropospheric refractivity as follows:
$$ ZTD = 10^{-6} \int_{h_l}^{\infty} N_t(h)\, dh \qquad (1) $$
In equation (1), $h_l$ is the height of the lowest level of the troposphere and $N_t(h)$ is the total refractivity of the troposphere. The total tropospheric refractivity can be attributed to the effects of dry gas and water vapor.
Based on equation (1), ZTD is affected by dry gas and water vapor. Dry gas is related to ZHD and water vapor is related to ZWD. In simple terms, ZTD can be expressed by the following equation [1]:
$$ ZTD = ZHD + ZWD \qquad (2) $$
To obtain the PWV value, the ZWD value must be found first. With equation (3), the ZWD value can be determined.
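The chain from ZTD to PWV described above can be illustrated with a minimal sketch. It assumes that ZWD is taken as ZTD minus ZHD and that PWV follows from a Bevis-type conversion factor. The constants, the linear fit for the weighted mean temperature, and all function names are assumptions for illustration and are not taken from this study or from the GAMIT processing.

```python
# Sketch only: convert zenith delays to PWV with a Bevis-type conversion factor.
# Every constant below is an assumed literature-style value, not a value from this paper.

RHO_W = 1000.0    # density of liquid water, kg/m^3 (assumed)
R_V = 461.5       # specific gas constant of water vapor, J/(kg K) (assumed)
K2_PRIME = 22.1   # refractivity constant, K/hPa (assumed)
K3 = 3.739e5      # refractivity constant, K^2/hPa (assumed)


def weighted_mean_temperature(ts_kelvin: float) -> float:
    """Empirical weighted mean temperature Tm from surface temperature (assumed linear fit)."""
    return 70.2 + 0.72 * ts_kelvin


def conversion_factor(tm_kelvin: float) -> float:
    """Dimensionless factor Pi such that PWV = Pi * ZWD."""
    # Divide by 100 to convert the constants from per-hPa to per-Pa (SI units).
    k_sum = (K3 / tm_kelvin + K2_PRIME) / 100.0
    return 1.0e6 / (RHO_W * R_V * k_sum)


def pwv_mm(ztd_m: float, zhd_m: float, ts_kelvin: float) -> float:
    """PWV in millimeters from zenith delays in meters and surface temperature in kelvin."""
    zwd_m = ztd_m - zhd_m  # wet delay is what remains after removing the hydrostatic part
    pi_factor = conversion_factor(weighted_mean_temperature(ts_kelvin))
    return pi_factor * zwd_m * 1000.0


# Hypothetical input values, chosen only to show the order of magnitude.
example_pwv = pwv_mm(ztd_m=2.5, zhd_m=2.3, ts_kelvin=300.0)
print(round(example_pwv, 1))
```

With these assumed constants the conversion factor is close to 0.16, so a wet delay of about 20 cm corresponds to a PWV of a few tens of millimeters, which is the order of magnitude reported for the stations in this study.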

Pentad days normalization
In this study, the normalization method of every five days (pentad days) was carried out for two reasons.
The first reason for this method is to smooth the appearance of the data when plotted on a graph.
The raw rainfall and PWV data used had a one-hour interval during 2019. If the raw data are plotted for one full year, the resulting graph makes it difficult to analyze and draw conclusions. Instead of using hourly rainfall and PWV data, the data were converted into pentad days to highlight the characteristics of rainfall and PWV at the stations under study [28]. Therefore, 1 year with 8760 hours can be divided into 73 pentad days.
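The conversion from hourly values to 73 pentad values can be sketched as follows. This is a minimal illustration that assumes each pentad value is the arithmetic mean of its 120 hourly samples. The text does not state the exact aggregation used, and the function name and the synthetic series are hypothetical.

```python
from statistics import fmean

HOURS_PER_PENTAD = 5 * 24  # 120 hourly samples in each five-day block


def pentad_means(hourly):
    """Average consecutive 120-hour blocks; a trailing partial block is dropped."""
    n_blocks = len(hourly) // HOURS_PER_PENTAD
    return [
        fmean(hourly[i * HOURS_PER_PENTAD:(i + 1) * HOURS_PER_PENTAD])
        for i in range(n_blocks)
    ]


# Synthetic stand-in for one non-leap year of hourly PWV values (not real data).
hourly_pwv = [40.0 + (hour % 24) * 0.1 for hour in range(8760)]
pentad_pwv = pentad_means(hourly_pwv)
print(len(pentad_pwv))  # 8760 / 120 = 73 blocks
```

Gaps in the hourly record are not handled here; a real workflow would need a rule for pentads with missing hours.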
To see how much the plots differ before and after normalization every five days, Figure 2 and Figure 3 can be compared. Both figures are graphical plots of the comparison of PWV and rainfall at CUKE station before and after normalization every five days. The increase and decrease of PWV and rainfall can be seen more easily in Figure 3 than in Figure 2.
The second reason the pentad days normalization method is used is to reduce the effects of random high-frequency temporal variability of the weather system [29]. This can provide an opportunity to evaluate data for monitoring extreme weather and climate events in the short term [30]. By doing so, the data used can illustrate the presence of drought and rainfall extremes that occur over relatively short periods of time [31].
Before delving deeper into the correlation coefficient values, the spatial distributions of PWV and rainfall must be considered. Throughout 2019, northern Indonesia, except for parts of northern Sumatra Island, had higher PWV values than southern Indonesia, especially Java Island and its surrounding islands. This behavior can be seen in Figure 4. On the other hand, the spatial distribution pattern of rainfall in Indonesia is more varied, as shown in Figure 5. The majority of Indonesia, especially Sumatra, Java, northern Sulawesi, and the Maluku Archipelago, has low rainfall values.

PWV and rainfall correlation in Indonesia, 2019
Based on Figure 6, the correlation coefficients across Indonesia are positive. The lowest correlation coefficient, 0.090146, is at the CFAK station in Fakfak Regency, West Papua. The highest, 0.753819, is at the CRUT station in Garut Regency, West Java. From the figure, southern Indonesia, especially Java Island, tends to have a higher correlation than northern Indonesia.

Timeseries of PWV and rainfall comparison at InaCORS samples
InaCORS samples on Sumatra. The first InaCORS station studied on Sumatra Island is the CMEN station in Tulang Bawang Regency, Lampung Province. Figure 7 shows that the PWV values normalized every five days during 2019 are in the 28-65 mm range. In this area, rainfall is relatively low from mid-March to mid-October. The rainfall trend line rises when the PWV trend line rises. Furthermore, based on the distribution plot of pentad days of rainfall against PWV at CMEN station, shown in Figure 8, rainfall ranges from 0 to 1.5 mm. The scatter plot at CMEN station shows that rainfall intensity increases with increasing PWV, particularly when PWV exceeds 50 mm. This is also shown by the trend line of the scatter plot. The resulting correlation coefficient between PWV and rainfall data after normalizing every five days is 0.64985.
The next InaCORS station on Sumatra Island is the CBKT station in Bukit Tinggi, West Sumatra Province. Based on the graph in Figure 9, the PWV value after normalization every five days is in the range of 25-48 mm. The graph also shows that PWV and rainfall decreased from the end of April to mid-September. The rainfall trend line generally rises when the PWV trend line rises. In addition, based on the distribution plot of pentad days of rainfall against PWV, shown in Figure 10, most rainfall values fall in the range of 0-1.3 mm, although some reach around 2.4 mm. The distribution plot at CBKT station shows that rainfall intensity increases with increasing PWV, particularly when PWV is greater than 35 mm. This is also indicated by the trend line of the scatter plot. The correlation coefficient between PWV and rainfall obtained after normalizing every five days at CBKT station is 0.53156.
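The station-level coefficients quoted in this section are Pearson correlation coefficients between the pentad PWV and pentad rainfall series. A minimal sketch of that calculation is given below. The function name and the short input series are hypothetical and are not station data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up pentad values (mm), for illustration only.
pwv_pentads = [30.0, 35.0, 42.0, 55.0, 60.0, 48.0]
rain_pentads = [0.0, 0.1, 0.2, 0.9, 1.4, 0.3]
r = pearson_r(pwv_pentads, rain_pentads)
print(round(r, 3))
```

For one station and one year, both input series would each hold the 73 pentad values.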

InaCORS samples on Java. On the island of Java, the first InaCORS sample studied was the CLBG station located in West Bandung Regency, West Java Province. In Figure 11, the PWV value obtained after normalizing every five days is between about 14 and 43 mm during 2019. From the figure, rainfall decreased in intensity from mid-March to mid-September. The rainfall trend line generally rises when the PWV trend line rises. The distribution pattern of pentad days PWV against rainfall for CLBG station can be seen in Figure 12. In the figure, rainfall has a value between 0 and 1.2 mm. The scatter plot shows that the intensity of rainfall events at CLBG station increases with increasing PWV, especially when PWV is above 20 mm. This is also indicated by the trend line of the scatter plot. The resulting correlation coefficient between PWV and rainfall data after normalizing every five days is 0.75047.
The second InaCORS station studied on Java Island is the CPAC station located in Pacitan Regency, East Java Province. In Figure 13, the PWV value obtained after normalizing every five days is between about 27 and 63 mm during 2019. From the figure, almost no rainfall occurs from mid-March to early October. The rainfall trend line rises when the PWV trend line rises. Figure 14 shows the distribution pattern of pentad days PWV against rainfall for CPAC station. In the figure, most rainfall values lie between 0 and 0.1 mm, with PWV of around 27-50 mm. The scatter plot at CPAC station shows that rainfall of around 0-0.1 mm remains frequent even as PWV increases. However, the scatter plot also shows that rainfall events at CPAC station with intensities above 0.1 mm increase with increasing PWV. The trend line of the scatter plot also shows this. The resulting correlation coefficient between PWV and rainfall data after normalizing every five days is 0.6456.

InaCORS samples on Kalimantan. The InaCORS station studied on Kalimantan Island is the CBAL station in Balikpapan City, East Kalimantan Province. In Figure 15, the PWV value obtained after normalizing every five days ranges between 37 and 67 mm during 2019. The rainfall trend line generally rises when the PWV trend line rises, although at times the two trend lines move in opposite directions. Figure 16 shows the distribution plot of pentad days of rainfall against pentad days of PWV at CBAL station. According to the graph, rainfall is generally distributed between 0 and 0.9 mm. However, there is one rainfall value of 1.875 mm. Rainfall in the 0-0.1 mm range occurs mostly when PWV is in the range of 37-61 mm. The scatter plot also shows that rainfall events at CBAL station with intensities above 0.1 mm increase with increasing PWV. The increase in rainfall intensity above 1 mm starts at a PWV of 50 mm and continues to over 60 mm. This is also indicated by the trend line of the scatter plot. The resulting correlation coefficient between PWV and rainfall data after normalizing every five days is 0.51522.
The last InaCORS station studied in Papua is the CUKE station located in Merauke Regency, Papua Province. In Figure 21, the pentad days PWV values range between 27 and 67 mm during 2019. Rainfall in this area is relatively low from March until the end of October. The rainfall trend line generally rises when the PWV trend line rises. Figure 22 shows the distribution plot of pentad days of rainfall against pentad days of PWV at CUKE station. Based on the figure, rainfall has values in the range of 0-3.2616 mm. Most rainfall values lie in the range of 0-0.1 mm, with PWV in the range of 27-62 mm. The scatter plot at CUKE station shows that rainfall in the range of 0-0.1 mm remains frequent even as PWV increases. However, the scatter plot also shows that rainfall events at CUKE station with intensities above 0.1 mm increase with increasing PWV. This is also indicated by the trend line of the scatter plot. The increase in rainfall intensity above 1 mm starts when the PWV is 55 mm. The resulting correlation coefficient between pentad days PWV data and pentad days rainfall data is 0.44127.

Analysis of PWV and rainfall correlation and comparison
Based on the results obtained, the correlation coefficients are positive throughout 2019. The correlation coefficients of the sampling stations located in the highland and lowland areas are given in Table 2 and Table 3. Based on the results obtained and the two tables, the highest correlation coefficient among the sampling stations is at the CLBG station on Java Island, with a value of 0.75047 at 1329.7 m above the ellipsoid. In addition, the InaCORS samples on Java Island have larger correlation coefficients than those on other islands, although the PWV values are more volatile. This shows that the relationship between PWV and rainfall on Java Island tends to be stronger. The tables also show that stations in lowland areas do not always have a smaller correlation coefficient than stations in highland areas, although the highest PWV values in the lowlands tend to be greater than the highest PWV values in the highlands.
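The comparison between lowland and highland stations can be summarized with a few lines of code. The sketch below uses the station coefficients reported in this paper; the grouping into dictionaries and the helper name are illustrative choices only.

```python
from statistics import fmean

# Correlation coefficients as reported in this paper for the sampling stations.
lowland = {"CMEN": 0.64985, "CPAC": 0.6456, "CBAL": 0.51522, "CBIT": 0.59358, "CUKE": 0.44127}
highland = {"CBKT": 0.53156, "CLBG": 0.75047, "CWMN": 0.46827}


def summarize(group):
    """Return (minimum, maximum, mean) of the coefficients in one station group."""
    values = list(group.values())
    return min(values), max(values), fmean(values)


for label, group in (("lowland", lowland), ("highland", highland)):
    low, high, mean = summarize(group)
    print(f"{label}: min={low:.5f} max={high:.5f} mean={mean:.5f}")
```

The two group means are close to each other and the ranges overlap, which is consistent with the observation that lowland stations do not always have smaller coefficients than highland stations.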
This positive correlation indicates that PWV and rainfall tend to vary in the same direction. This same-direction relationship is also visible in the graphs comparing pentad days PWV with rainfall at each sampling station. On the graphs, rainfall tends to occur after PWV has increased. Conversely, when rainfall decreases, PWV also decreases.
The positive correlation between rainfall and PWV can be explained by the Eulerian atmospheric moisture budget equation. The equation is expressed as follows [32]:
$$ E - P = \frac{\partial\, PWV}{\partial t} + \nabla \cdot \frac{1}{g} \int_{p_t}^{p_s} q\, \mathbf{v}\; dp \qquad (6) $$
In equation (6), $E$ and $P$ are the total evaporation and precipitation, respectively. Time is represented by $t$. $g$ is the gravitational acceleration, which has a value of 9.8 m/s². The variables $p_s$ and $p_t$ are the pressures at the Earth's surface and at the top of the atmosphere, respectively. The pressure at the top of the atmosphere can be considered zero [33]. The variable $q$ is the specific humidity in a particular area, and $\mathbf{v}$ is the wind vector.
Based on equation (6), if an area has E-P>0, it is experiencing net evaporation. Conversely, if an area has E-P<0, it is experiencing precipitation. The value of E is greater when the gradient of PWV increases, and the value of P is greater when the gradient of PWV decreases. This agrees with the results obtained, where PWV increases before rain occurs and decreases during the rain.
On the other hand, the distribution plots of rainfall against PWV at each station show many rainfall values in the 0-0.1 mm range, with intensity increasing as PWV increases. The large number of rainfall values between 0 and 0.1 mm can also be explained by equation (6): it can be caused by wind, which changes the specific humidity in an area.
Then, if the altitude of each InaCORS station is considered, InaCORS stations in highland areas have lower PWV values than InaCORS stations in lowland areas. The difference in PWV between the highlands and lowlands can be caused by significant pressure and temperature differences between the areas. These pressure and temperature differences also cause differences in humidity between the two areas. To understand why differences in humidity at low and high altitudes can affect PWV, the equation below expresses its relationship with specific humidity [34]:
$$ PWV = \frac{1}{g} \int_{p_t}^{p_s} q\; dp \qquad (7) $$
The specific humidity is proportional to the mixing ratio of water vapor in the atmosphere. The mixing ratio of water vapor is closely related to air pressure. The relationship is expressed as follows:
$$ r = r_s \left( \frac{p}{p_s} \right)^{\lambda} \qquad (8) $$
In equation (8), $r$ is the mixing ratio at pressure $p$ at the altitude of interest, and $r_s$ is the mixing ratio at ground level or at the reference altitude of interest. The power $\lambda$ depends on altitude, season, and atmospheric conditions [33]. Equation (7) and equation (8) show that PWV is related to air pressure. The air pressure in the lowlands is high. Conversely, air pressure in the highlands is low. In addition, based on Gay-Lussac's Law, air pressure is directly proportional to air temperature. This is consistent with actual environmental conditions, where lowlands tend to have higher ambient temperatures than highlands. Therefore, lowlands have high PWV values and highlands have low PWV values.
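The link between surface pressure and PWV can be made concrete with a small worked example. The sketch below assumes that the specific humidity equals the mixing ratio, that the mixing ratio follows the power law in pressure with the top-of-atmosphere pressure set to zero, and that 1 kg/m² of water corresponds to 1 mm of PWV. Under these assumptions the vertical integral has the closed form r_s p_s / (g (λ + 1)). The surface values and the exponent used here are hypothetical, not station values.

```python
G = 9.8         # gravitational acceleration, m/s^2 (value quoted in the text)
RHO_W = 1000.0  # density of liquid water, kg/m^3 (assumed)


def pwv_closed_form_mm(r_s, p_s_pa, lam):
    """PWV in mm for r = r_s * (p / p_s)**lam, integrated from p = 0 to p_s."""
    kg_per_m2 = r_s * p_s_pa / (G * (lam + 1.0))
    return kg_per_m2 / RHO_W * 1000.0  # 1 kg/m^2 of water is a 1 mm deep layer


def pwv_numeric_mm(r_s, p_s_pa, lam, steps=10000):
    """Midpoint-rule evaluation of the same integral, as a cross-check."""
    dp = p_s_pa / steps
    total = sum(r_s * (((i + 0.5) * dp) / p_s_pa) ** lam for i in range(steps)) * dp
    return total / G / RHO_W * 1000.0


# Hypothetical surface conditions: (mixing ratio in kg/kg, surface pressure in Pa).
lowland_pwv = pwv_closed_form_mm(r_s=0.018, p_s_pa=101000.0, lam=3.0)
highland_pwv = pwv_closed_form_mm(r_s=0.014, p_s_pa=86000.0, lam=3.0)
print(round(lowland_pwv, 1), round(highland_pwv, 1))
```

In this illustration the lower surface pressure and the drier surface air of the highland case both reduce the integral, in line with the lower PWV ranges seen at the highland stations.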
In addition, the results show that PWV in areas near the equator is higher than in areas far from the equator when highland areas are excluded. This is because areas close to the equator have higher temperatures, which lead to higher evaporation rates and higher specific humidity.

Conclusion
In this study, we used PWV data derived from GPS data together with rainfall data to determine their correlation in Indonesian regions during 2019. Several findings can be highlighted. The southern part of Indonesia, especially Java Island, has a stronger relationship between PWV and rainfall than other islands, although the PWV value is more volatile. This is indicated by the relatively higher correlation coefficient of PWV with rainfall on Java Island. For lowland areas, the correlation coefficients for CMEN station in Sumatra, CPAC station in Java, CBAL station in Kalimantan, CBIT station in Sulawesi, and CUKE station in Papua are 0.64985, 0.6456, 0.51522, 0.59358, and 0.44127, respectively. For highland areas, the correlation coefficients for CBKT station in Sumatra, CLBG station in Java, and CWMN station in Papua are 0.53156, 0.75047, and 0.46827, respectively.
The correlation coefficient between pentad days of PWV data and pentad days of rainfall data is positive at all InaCORS stations studied. The positive correlation coefficient indicates that PWV derived from GNSS data and rainfall in Indonesia tend to vary in the same direction. Both the positive correlation between pentad days of rainfall and PWV and the fact that most rainfall occurs at 0-0.1 mm intensity can be explained by the Eulerian atmospheric moisture budget equation.
If the altitude of the InaCORS station location is considered, the highlands have a lower range of PWV values than the lowlands. In addition, PWV is higher and more stable closer to the equator. These patterns are due to differences in specific humidity, which are caused by differences in pressure and temperature that affect the level of evaporation. The difference in specific humidity between low and high altitudes, and between areas close to and far from the equator, can be explained by the equation relating PWV to specific humidity and by the mixing ratio equation.

Figure 1. InaCORS samples distribution.

Figure 2. Graph of PWV and rainfall data before pentad days normalization.

Figure 3. Graph of PWV and rainfall data after pentad days normalization.

Figure 6. Heatmap of PWV and rainfall correlation in Indonesia, 2019.

Figure 7. Comparison graph of pentad days PWV with rainfall at CMEN station in 2019.

Figure 8. Scatter plot of pentad days PWV with rainfall at CMEN station in 2019.

Figure 9. Comparison graph of pentad days PWV with rainfall at CBKT station in 2019.

Figure 10. Scatter plot of pentad days PWV with rainfall at CBKT station in 2019.

Figure 11. Comparison graph of pentad days PWV with rainfall at CLBG station in 2019.

Figure 12. Scatter plot of pentad days PWV with rainfall at CLBG station in 2019.

Figure 13. Comparison graph of pentad days PWV with rainfall at CPAC station in 2019.

Figure 14. Scatter plot of pentad days PWV with rainfall at CPAC station in 2019.

Figure 15. Comparison graph of pentad days PWV with rainfall at CBAL station in 2019.

Figure 16. Scatter plot of pentad days PWV with rainfall at CBAL station in 2019.

Figure 17. Comparison graph of pentad days PWV with rainfall at CBIT station in 2019.

Figure 18. Scatter plot of pentad days PWV with rainfall at CBIT station in 2019.

Figure 20. Scatter plot of pentad days PWV with rainfall at CWMN station in 2019.

Figure 21. Comparison graph of pentad days PWV with rainfall at CUKE station in 2019.

Figure 22. Scatter plot of pentad days PWV with rainfall at CUKE station in 2019.

Table 2. Correlation coefficient of PWV with rainfall at highland InaCORS samples.

Table 3. Correlation coefficient of PWV with rainfall at lowland InaCORS samples.