Spatial-temporal dynamics of paddy productivity on the north coast of Java Island, Indonesia based on the principal component analysis of MODIS NDVI anomaly data

Comprehending the dynamics of paddy productivity is imperative for enhancing the efficacy of agricultural land developments. This study provides the application of principal component analysis (PCA) as a method for visualizing the spatial-temporal changes in paddy productivity. The analysis is conducted using the 8-day NDVI (normalized difference vegetation index) anomaly data of MODIS (Moderate Resolution Imaging Spectroradiometer) data spanning the period from 2000 to 2020. The regencies of Karawang, Subang, and Indramayu on the north coast of Java island are chosen as the study area because of their top rice production areas in Indonesia. The results show that the first leading PCA of the NDVI anomaly is related to the interannual variability of paddy productivity with 3-4 year cycles. The spatial and temporal dynamics of the first mode of eigenvectors and principal component time series can generally be grouped into nine categories. Two important categories to note are category-1 (1 January – 19 March) and category-8 (12 September – 16 December). In category-1, the NDVI anomalies move from north to middle and middle to north areas in Karawang and Subang regencies, respectively. In Indramayu Regency, the NDVI anomalies relatively remain in almost all areas. In contrast, in category-8 the NDVI anomalies move from the middle to northern areas in Karawang, Subang, and Indramayu regencies.


Introduction
Rice holds significant global importance as a fundamental food.Rice, which constitutes approximately 20% of the global caloric intake, serves as the primary food staple for a staggering population of 3 billion people.Rice holds the greatest significance as an agricultural crop in Indonesia, exerting a substantial influence on the country's economy broadly.According to data obtained from the United Nations' Food and Agriculture Organization Statistical Database (FAOSTAT), Indonesia ranks as the third most prominent global rice producer, following China and India (https://www.fao.org/faostat/en/#data/QCL/visualize).In Indonesia, it is typically to observe three distinct rice growing seasons, characterized by one rainy-season crop followed by two arid-season crops.The rainy season crop, typically cultivated between the months of October and December and harvested from March to April, accounts for is primarily attributed to cloud cover.Consequently, several model advancements in paddy monitoring have incorporated Synthetic Aperture Radar (SAR) data [14]- [17].
Few studies have intensively examined the spatial and temporal distribution of paddy productivity in Indonesia.The objective of this study is to conduct a thorough examination of the spatial-temporal changes in paddy productivity in the northern part of Java Island, with a specific focus on the Karawang, Subang, and Indramayu regencies.The analysis will cover the period from 2000 to 2020, utilizing the NDVI anomaly as the primary indicator.The NDVI anomaly is a useful measure of less or more productivity when compared to 'normal' plant health.Regions, where paddy crops are more productive than average, are given by positive anomaly, while those with less productive than average or drought are given by negative anomaly.Furthermore, the PCA model serves as an appropriate instrument for assessing extensive datasets that encompass a substantial number of dimensions or features per observation.This is achieved by reducing the dimensionality of the datasets, thereby enhancing their interpretability, while simultaneously minimizing the loss of information.

Study area
The research location was selected on the northern part of Java Island, which has become known as an important rice-producing region in Indonesia.This area encompasses both irrigated and shallow land areas.The area covers Karawang, Subang, and the western part of Indramayu regencies which are the most significant paddy field areas in Java (Fig. 1).In 2019, the rice production in Karawang, Subang, and Indramayu regencies is 642,192.32tons; 541,721.37 tons; and 790,768.93 tons, respectively [18].

Data and sources
The main data used in this study are the 8-day MODIS NDVI anomaly data for the period 2000-2020.
The source of this data is the MODIS Analysis Ready Data (ARD) developed by the National Research and Innovation Agency of Indonesia (BRIN).The MODIS ARD product is geometrically and radiometrically corrected MODIS data so that it could be directly used for various applications.The data are prepared through the following steps: daily reflectance MODIS data (spatial resolution of 500 meters) acquired from the BRIN Remote Sensing Ground Station are processed into 8-day composite reflectance using the BRIN Advanced Image Processing System.This image processing system will generate data that are geometrically, radiometrically, and topographically corrected.If there is a data void due to interference during the reception, the data were obtained from USGS (https://e4ftl01.cr.usgs.gov/MOLT/MOD09A1.006/).Afterward, the data are processed into a cloudfree 8-day ARD product with a tile size of 5 degrees x 5 degrees.Figure 2 shows the tiles of ARD MODIS data for all Indonesia areas, meanwhile, the study area is located within the E105_S05 tile.The next process is calculating NDVI for each pixel using the equation: where RNIR is a reflectance value in the near-infrared band (band 2) and RRED is a reflectance value in the red band (band 1).The areas containing photosynthetically active vegetation exhibit significantly higher absorption of radiation within the red part of the electromagnetic spectrum, in comparison to regions devoid of vegetation or containing photosynthetically inactive vegetation.The phenomenon can be attributed to the presence of pigments in the leaves, needles, and other plant structures.On the other hand, the absorption of near-infrared (NIR) radiation is minimal for both surface types.Nevertheless, the quantity of near-infrared (NIR) radiation that is transmitted or scattered is dependent on the type and roughness of the surface.Consequently, the disparity in reflectance is minimal in regions lacking any or containing photosynthetically inactive vegetation, while it is substantial in areas characterized by photosynthetically active vegetation.It is important to acknowledge that in regions abundant with vegetation, the relationship between the scattering and/or transmission of near-infrared (NIR) radiation and the type/roughness of the surface aids in mitigating saturation effects observed in the red portion of the electromagnetic spectrum.Consequently, this enables a more effective differentiation of elevated NDVI values.
The final step of the data preparation is to calculate the 8-day of mean NDVI and its associated 8day of NDVI anomalies from 2000-2020.The equations for deriving the 8-day of mean NDVI and the 8-day NDVI anomaly, consecutively are: where   ̅ is the 8-day of mean NDVI for day-i where i is the i th 8-day index spanning from 1 to 46.While n is the year index spanning from 2000 to 2020 (21 years period).Thus,   is the 8-day NDVI anomaly for day-i in year-j and   is the 8-day of NDVI.

Methods
The PCA is a data reduction technique in basic terms.Data reduction is accomplished through the identification of a collection of empirical orthogonal eigenvectors and their corresponding principal components (PCs), which collectively explain a sufficient amount of the total variance present in the initial dataset.The PCA is conducted in order to acquire the principal modes of variability.
The PCA involves the construction of a covariance matrix using the data field, which is then diagonalized to derive eigenvectors and their corresponding eigenvalues.The elements comprising the covariance matrix are derived from the deviations or anomalies relative to the mean values of the dataset.Every eigenvector represents a spatial pattern mode and its corresponding series of time coefficients (PC) that determine the progression of the spatial mode.Since the eigenvectors exhibit orthogonality in space, it follows that the corresponding principal components also demonstrate orthogonality in time.The proportion of the total variance attributed to a specific eigenvector is directly proportional to its corresponding eigenvalue.Moreover, the eigenvalues are organized in a descending sequence based on the proportion of variance linked to each of them.Eigenvectors that explain significant portions of the variance are commonly regarded as possessing physical significance.The remaining modes, which explain only a negligible portion of the variance, are regarded as statistically and presumably physically insignificant, resembling noise.Due to the orthogonality of the modes, it is possible to consider any two modes as being uncorrelated.Comprehensive explanations and practical implementations of the PCA technique can be found in various studies, such as [19]- [21].
The procedures for computation are outlined as follows.The matrix data FMxN is defined by its elements fmn, which are represented as anomaly values.The outline of the computational steps is given as follows.Here, m is a variable that takes on integer values starting from 1 and increasing by 1.The variable M represents a grid point or station position counter, while the variable N takes on values of 1, 2, and so on.The variable N serves as a time counter.The eigenvectors and eigenvalues obtained through the computation of the characteristic equation of the covariance matrix R are provided as follows: where E = [e1, e2, … eM] represents the set of eigenvectors, λ = [1, 2, … M] represents the set of eigenvalues, I represent the M-order unit matrix, and R be defined as: The matrix F T represents the transpose of matrix F. Therefore, it is possible to represent any observation vector fn as a linear combination of the M eigenvectors em.
The variable Cmn represents the component of the projection of the vector fn onto the vector em.The coefficients Cmn represent the individual elements of the MxN matrix C, as defined by the given equation.
To perform PCA, the 8-day NDVI anomaly dataset is organized into a rectangular matrix.In this matrix, the rows correspond to specific grid points, while the columns represent the 8-day data.

Result and Discussion
PCA is a widely used feature extraction technique that employs a specific mathematical procedure to combine input variables.It is possible to eliminate the variables deemed as "least important" while preserving the most significant aspects of all variables.In this study, we will analyze the first leading principal component (PC-1).The PC-1 is chosen because it can equivalently be defined as a direction that maximizes the variance of the projected data.To understand the evolution of temporal dynamics of the PC-1 of the 8-day NDVI anomaly, the correlation coefficients between the PC-1 of the n th day and the (n+1) th day of the 8-day NDVI anomaly are first calculated.  1 lists the PC-1 category along with its range of the 8-day NDVI anomaly.From these figures and table, we can see that all PC-1 can be grouped into nine categories.The 1 st to 10 th , the 11 th to 13 th , the 14 th to 21 st , the 22 nd to 24 th , the 25 th and 26 th , the 27 th to 29 th , the 30 th to 32 nd , the 33 rd to 44 th , and the 45 th  For further analysis, each PC-1 category will be represented by the 6 th , 12 th , 18 th , 23 rd , 25 th , 28 th , 31 st , 38 th , and 46 th day of the 8-day NDVI anomaly.The percentages of explained variance associated with those eigenvalues are shown in Figure 4.The percentage of explained variance refers to the proportion of variance that can be accounted for by each of the chosen components.The significance of the principal components can be determined by arranging the eigenvectors in descending order based on their corresponding eigenvalues.Based on the observations made in Figure 4, it is obvious that the initial component contains the highest amount of potential information, followed by the subsequent components in descending order of information content.During the period of category-2, the magnitudes of NDVI anomaly increased in the middle part of Karawang and Subang regencies.The high magnitudes of NDVI anomaly occur in the middle part of Karawang Regency (Kutawaluya, Tempuran, Cilamaya Kulon, and Tegalsari sub-districts).In Subang Regency, the high magnitudes of NDVI anomaly occur in the northern part, i.e., in Ciasem, Sukasari, Pamanukan, Tambakdahan, and Compreng sub-districts.In Indramayu Regency, the magnitudes of NDVI anomaly tend to increase though they are still low.During the category-3 period, the magnitudes of NDVI anomaly decreased in Karawang and Subang regencies.The NDVI anomalies in Karawang Regency weaken from the north towards the middle.In contrast, the high magnitudes of NDVI anomaly are found in the western part of Karawang Regency (Karawang Barat, Rawamerta, Karawang Timur, and Telukjambe Barat sub-districts).In Subang Regency, slightly higher magnitudes of NDVI anomaly are in the middle part (Pabuaran, Pagaden Barat, Cipunagara, and Cibogo sub-districts).Meanwhile, the magnitudes of NDVI anomalies increase from south to the north in Indramayu Regency.The high magnitudes of NDVI anomaly are found in the southern part of Indramayu Regency (Gantar, Kroya, Trisi, Cikedung, Lelea, Haurgeulis, Bongas, Gabuswetan, and Losarang sub-districts).
During the category-5 period, the extent of NDVI anomalies increases in the western part of Karawang Regency, while those in the middle-east Subang Regency and the middle Indramayu Regency remain the same as in the category-4 period.
The first mode of the eigenvector map and the corresponding PC-1 for the 28th day of the 8-day NDVI anomaly in Karawang, Subang, and Indramayu regencies represented in category-6 are presented in Figs.5f.1 and 5f.2.The corresponding variance for this PC-1 is 12.27%.The eigenvector map in Fig. 5f.1 shows that the magnitudes of NDVI anomaly act as the opposite signals in the northwest and the northeast Karawang Regency.Meanwhile, the opposite NDVI anomalies in Subang Regency are more north and south.In contrast, the variabilities of NDVI anomaly in Indramayu Regency are relatively low.2000, 2001, 2003, 2004, 2005, 2006, 2008, 2009, 2010, 2011, 2013, 2016, 2017.During the category-9 period, the magnitudes of NDVI anomaly in Karawang, Subang, and Indramayu regencies are weakening.

Conclusions
Understanding paddy productivity dynamics could determine the optimal information on agricultural land development programs.This study examines the spatial and temporal evolutions of the paddy productivity in Karawang, Subang, and Indramayu regencies on the north coast of Java Island, Indonesia from the 2000-2020 period based on the 8-day NDVI anomaly.The 8-day NDVI anomaly data are constructed from the daily MODIS reflectance data that have been pre-processed into geometric, radiometric, and topographic corrections.Furthermore, the reflectance data are processed to obtain the cloud-free 8-day of NDVI data, then organized into a tile-based ARD format that allows immediate analysis without additional user effort.The 8-day NDVI anomaly is calculated by subtracting the 8-day of NDVI data from its mean data.The advantage of using the 8-day of MODIS ARD format is its high temporal resolution which is excellent for time-series investigative analysis..
The results show that the first leading PCA of the NDVI anomaly can be grouped into nine categories and summarized as follow.
• Category-1 (1 January -19 March) The NDVI anomalies in Karawang and Subang regencies act as the opposite north and middle productive anomalies, while those in Indramayu regency act more as the opposite middle and the rest of the areas.• Category-2 (20 March -13 April) The magnitudes of the NDVI anomaly slowly decrease in Karawang, Subang, and Indramayu regencies.This category could be considered as the transition period to the next category.The signals of the NDVI anomaly slowly emerge in the western part of Karawang Regency.In the meantime, the NDVI anomalies in the southern part of Indramayu Regency are intensified and move to the middle up to the western part of the middle Subang Regency.
• Category-5 (10 July -25 July) The magnitudes of the NDVI anomaly in the western part of Karawang Regency, the middle part of the Subang Regency, and the western part of the middle Subang Regency are slowly reducing.• Category-6 (26 July -18 August) The magnitudes of the NDVI anomaly slowly decrease in Karawang, Subang, and Indramayu regencies.This category could be considered as the transition period.The magnitudes of NDVI anomaly in north Karawang and Subang regencies are decreasing.Meanwhile, the magnitudes of NDVI anomalies in the north and south parts of Indramayu Regency tend to increase slightly.This category could be considered as the transition period to the next category.
The PCA is utilized to extract fractions of variances from the 8-day NDVI anomaly data.These fractions of variances possess physical significance, but only after the noise has been removed.The findings indicate that the spatial and temporal distributions of the first mode, which represents the most significant mode of variability in the NDVI anomaly over the Karawang, Subang, and Indramayu regions, align with the underlying pattern of interannual variability in paddy productivity.Overall, this study suggests that remote sensing optical data shows promise for agricultural applications.However, some limitations, such as cloud cover and low spatial resolution led to inaccurate estimation of paddy areas.In addition, the results of this study would require further validation with the ground data.

Figure 1 .
Figure 1.Map of the study area over the high-resolution satellite image (left) and the rice field boundaries (right)

Figure 2 .
Figure 2. Tiles of MODIS ARD product with the size of 5 x 5 Figures 3a-i present the grouping results based on the correlation coefficient analysis and Table1lists the PC-1 category along with its range of the 8-day NDVI anomaly.From these figures and table, we can see that all PC-1 can be grouped into nine categories.The 1 st to 10 th , the 11 th to 13 th , the 14 th to 21 st , the 22 nd to 24 th , the 25 th and 26 th , the 27 th to 29 th , the 30 th to 32 nd , the 33 rd to 44 th , and the 45 th and 46 th days of the 8-day NDVI anomaly fall into categories 1 to 9, respectively.The PC-1 time series of category 1 show the distinct opposite phase between the years 2003, 2006, 2008, 2012, 2015, 2019 and the years 2000, 2001, 2005, 2010, 2013, 2016, 2020.This pattern clearly indicates the 3-4 year cycles or interannual variability of paddy productivity.For further analysis, each PC-1 category will be represented by the 6 th , 12 th , 18 th , 23 rd , 25 th , 28 th , 31 st , 38 th , and 46 th day of the 8-day NDVI anomaly.The percentages of explained variance associated with those eigenvalues are shown in Figure4.The percentage of explained variance refers to the proportion of variance that can be accounted for by each of the chosen components.The significance of the principal components can be determined by arranging the eigenvectors in descending order based on their corresponding eigenvalues.Based on the observations made in Figure4, it is obvious that the initial component contains the highest amount of potential information, followed by the subsequent components in descending order of information content.

Figure 3 . 8 Figure 4 .
Figure 3. (a-i) Nine categories of PC-1 derived from the correlation coefficient analysis for the n th day of the 8-day NDVI anomaly and the (n+1) th day of the 8-day NDVI anomaly

Figures 5c. 1
Figures 5a.1 and 5a.2 show the first mode of the eigenvector map and the corresponding PC-1 for the 6th day of the 8-day NDVI anomaly in Karawang, Subang, and Indramayu regencies represented in category-1.The corresponding variance for this PC-1 is 30.63%.From the eigenvector map in Fig. 5a.1, it can be seen that the opposite large spatial signals of NDVI anomaly occur in the northern and middle part of Karawang Regency, no significant signals of NDVI anomaly are detected in Subang Regency, and the low magnitudes of NDVI anomaly take place in almost all Indramayu Regency. Figure 5a.2 indicates that the signals of NDVI anomaly in years 2003, 2006, 2008, 2012, 2015, and 2019 had the opposite phase with years 2000, 2001, 2005, 2010, 2013, 2016, and 2020.Meanwhile, no NDVI anomaly variabilities occurred in 2002, 2004, 2007, 2009, 2011, 2014, 2017, and 2018.During the category-1 period, the high magnitudes of NDVI anomaly move from north to middle areas in Karawang Regency.Meanwhile, NDVI anomalies in the middle part of Subang Regency tend to move north.In Indramayu Regency, low magnitudes of NDVI anomaly are taking place.The first mode of the eigenvector map and the corresponding PC-1 for the 12th day of the 8-day NDVI anomaly in Karawang, Subang, and Indramayu regencies represented in category-2 are presented in Figs.5b.1 and 5b.2.The corresponding variance for this PC-1 is 22.53%.It is apparent from Fig. 5b.1 that the NDVI anomalies in the north have an opposite signal from those in the middle in both Karawang and Subang regencies.Meanwhile, the magnitudes of NDVI anomaly are still low in Indramayu Regency. Figure 5b.2 reveals that the signals of NDVI anomaly in years 2002, 2003, 2004, 2006, 2007, 2008, 2012, 2015, and 2019 had the opposite phase with years 2005, 2010, 2013, 2016, 2017, and 2020.In the meantime, no NDVI anomaly variabilities occurred in 2000, 2001, 2009, 2011, 2014, and 2018.During the period of category-2, the magnitudes of NDVI anomaly increased in the middle part of Karawang and Subang regencies.The high magnitudes of NDVI anomaly occur in the middle part of Karawang Regency (Kutawaluya, Tempuran, Cilamaya Kulon, and Tegalsari sub-districts).In Subang Regency, the high magnitudes of NDVI anomaly occur in the northern part, i.e., in Ciasem, Sukasari, Pamanukan, Tambakdahan, and Compreng sub-districts.In Indramayu Regency, the magnitudes of NDVI anomaly tend to increase though they are still low.Figures 5c.1 and 5c.2 show the first mode of the eigenvector map and the corresponding PC-1 for the 18th day of the 8-day NDVI anomaly in Karawang, Subang, and Indramayu regencies represented in category-3.The corresponding variance for this PC-1 is 18.67%.The eigenvector map in Fig 5c.1 shows that the variabilities of NDVI anomaly are low in the northern and middle parts of Karawang and Subang regencies.In Indramayu Regency, the spatial signals of NDVI anomaly generally are found in Figure 5d.2 indicates that the signals of NDVI anomaly in years 2006, 2012, 2013, 2015, and 2019 had the opposite phase with years 2000, 2001, 2007, and 2016.In the meantime, no NDVI anomaly variabilities occurred in Figure 5e.2 reveals that the signals of NDVI anomaly in years 2006, 2013, 2015, 2019, and 2020 had the opposite phase with years 2001 and 2009.Meanwhile, no NDVI anomaly variabilities occurred in
Figures 5g.1 and 5g.2 show the first mode of the eigenvector map and the corresponding PC-1 for the 31st day of the 8-day NDVI anomaly in Karawang, Subang, and Indramayu regencies represented in category-7.The corresponding variance for this PC-1 is 12.08%.It is seen from the eigenvector map in Fig. 5g.1 that the spatial signals of NDVI anomaly, in general, are weak in Karawang, Subang, and Indramayu regencies.Figure 5g.2 denotes that the signals of NDVI anomaly in the years 2000, 2007, and 2020 had the opposite phase with the years 2004, 2009, and 2018.Meanwhile, no NDVI anomaly variabilities occurred in 2001, 2002, 2003, 2005, 2006, 2008, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, and 2019.During the category-7 period, there is a gradual movement of NDVI anomaly magnitudes from the southern regions towards the central areas of Karawang, Subang, and Indramayu regencies.On the other hand, the levels of NDVI anomaly in the northern areas of Karawang, Subang, and Indramayu regencies exhibit relatively low magnitudes.The first mode of the eigenvector map and the corresponding PC-1 for the 38th day of the 8-day NDVI anomaly in Karawang, Subang, and Indramayu regencies represented in category-8 are presented in Figs.5h.1 and 5h.2.The corresponding variance for this PC-1 is 16.95%.The eigenvector map in Fig. 5h.1 shows the opposite north and south NDVI anomalies in Karawang, Subang, and Indramayu (particularly in the northwest part) regencies.Figure 5h.2 indicates that the signals of NDVI anomaly in years 2000, 2002, 2004, 2005, and 2007 had the opposite phase with years 2013, 2015, 2016, 2017, Figure 5i.2 reveals that the signals of NDVI anomaly in years 2002, 2007, 2014, 2018 had the opposite phase with years 2012, 2015, 2019, 2020.In the meantime, no NDVI anomaly variabilities occurred in

Table 1 .
The 8-day NDVI anomaly ranges within each PC-1 category Figures 3a-i present the grouping results based on the correlation coefficient analysis and Table • Category-3 (14 April -15 June) While the magnitudes of the NDVI anomaly in Karawang and Subang regencies are still low, those in Indramayu Regency gradually increase, particularly in the southern part.•Category-4 (16 June -9 July) • Category-7 (19 August -11 September) The spatial signals of NDVI anomaly in Karawang and Subang regencies are weakening.Meanwhile, the NDVI anomalies in the western part of the middle Indramayu Regency are slowly appearing.• Category-8 (12 September -16 December) The magnitudes of NDVI anomaly tend to increase in the middle part of Karawang, Subang, and Indramayu regencies.The variabilities of NDVI anomaly then move north-westward to Karawang and Subang regencies.Distinct opposite north and middle NDVI anomalies occur mostly in Karawang and Subang regencies.It is also found from the PC-1 time series that productivity tends to decrease over time.• Category-9 (17 December -31 December)