A near real-time satellite-based global drought climate data record

Reliable drought monitoring requires long-term and continuous precipitation data. High resolution satellite measurements provide valuable precipitation information on a quasi-global scale. However, their short record lengths limit their application in drought monitoring. In addition, long-term low resolution satellite-based gauge-adjusted data sets such as the Global Precipitation Climatology Project (GPCP) are not available in near real-time form for timely drought monitoring. This study bridges the gap between low resolution long-term satellite gauge-adjusted data and the emerging high resolution satellite precipitation data sets to create a long-term climate data record of droughts. To accomplish this, a Bayesian correction algorithm is used to combine GPCP data with real-time satellite precipitation data sets for drought monitoring and analysis. The results showed that the combined data sets after the Bayesian correction were a significant improvement over the uncorrected data. Furthermore, several recent major droughts such as the 2011 Texas, 2010 Amazon and 2010 Horn of Africa droughts were detected in the combined real-time and long-term satellite observations. This highlights the potential application of satellite precipitation data for regional to global drought monitoring. The final product is a real-time data-driven satellite-based standardized precipitation index that can be used for drought monitoring, especially over remote and/or ungauged regions.


Introduction
Droughts are typically categorized into four major classes: (a) meteorological drought, a deficit in precipitation; (b) hydrological drought, a deficit in streamflow, groundwater level or water storage; (c) agricultural drought, a deficit in soil moisture; and (d) socioeconomic drought, incorporating water supply and demand (Anderson et al 2011, Wilhite and Glantz 1985). All four categories of droughts are related to a sustained lack of precipitation and thus, having accurate, long-term, and timely precipitation data is fundamental to drought monitoring and analysis.
Content from this work may be used under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
Several indices have been developed for drought monitoring based on indicator variables such as precipitation, soil moisture, runoff and evapotranspiration (Karl 1983, McKee et al 1993, Hao and AghaKouchak 2012, Dai et al 2004, Mo 2008, Shukla and Wood 2008, Anderson et al 2011). The precipitation deficit can be expressed with the commonly used standardized precipitation index (SPI) (McKee et al 1993, Hayes et al 1999). The SPI identifies precipitation deficits at various timescales (e.g., 1-, 3-, 6- and 12-month). Its value represents the amount of precipitation relative to climatology, on a normalized scale, over a given time period (e.g., 6 months). The SPI is standardized to a normal distribution having zero mean and a standard deviation of one (Tsakiris and Vangelis 2004). A positive SPI value indicates an above-average precipitation accumulation over a period of time (for example, 6 months), whereas a negative SPI indicates a dry period with a precipitation accumulation below the climatological mean. An SPI value near zero refers to precipitation accumulation near the climatological mean. In other words, a sequence of negative (positive) SPI values indicates that the climate condition has a dry (wet) status (McKee et al 1993).
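The standardization described above can be sketched in a few lines. The following is a minimal illustration only, and it assumes a nonparametric variant of the SPI: accumulated precipitation for each calendar month is ranked, converted to an empirical probability with the Gringorten plotting position, and mapped to a standard normal quantile. The original SPI of McKee et al (1993) fits a parametric distribution instead, and this text does not state which variant is used. The function names `accumulate` and `empirical_spi` are hypothetical.

```python
from statistics import NormalDist


def accumulate(precip, window):
    """Rolling sums over `window` months; the first window-1 entries are None."""
    out = [None] * len(precip)
    for t in range(window - 1, len(precip)):
        out[t] = sum(precip[t - window + 1 : t + 1])
    return out


def empirical_spi(precip, window=6, period=12):
    """Nonparametric SPI sketch (assumption: Gringorten plotting position).

    precip : monthly precipitation totals, oldest first
    window : accumulation timescale in months (e.g., 6 or 12)
    period : number of months in the seasonal cycle
    """
    acc = accumulate(precip, window)
    spi = [None] * len(precip)
    std_normal = NormalDist()  # zero mean, unit standard deviation
    for month in range(period):
        # Compare each value only with the same calendar month in other years.
        idx = [t for t in range(month, len(precip), period) if acc[t] is not None]
        n = len(idx)
        # Rank from driest to wettest; tied values receive consecutive ranks.
        order = sorted(idx, key=lambda t: acc[t])
        for rank, t in enumerate(order, start=1):
            prob = (rank - 0.44) / (n + 0.12)  # empirical non-exceedance probability
            spi[t] = std_normal.inv_cdf(prob)
    return spi
```

With this construction the driest accumulation in a calendar month receives the most negative SPI and the median receives a value near zero, matching the interpretation given above.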
In a recent meeting (WMO 2009), drought experts reached a consensus to recommend the SPI for the characterization of meteorological droughts (Hayes et al 2011). Following the discussions at that meeting, the National Meteorological and Hydrological Services (NMHSs) have been encouraged to use the SPI for meteorological drought analysis (Hayes et al 2011, WCRP 2010). The true strength of the SPI is that precipitation anomalies can be calculated over flexible timescales in a consistent fashion. Another attractive feature of the SPI is that drought information can be provided in a timely manner for operational drought-monitoring applications if precipitation is available in near real-time. In numerous studies, the SPI was derived using long-term rain gauge data (Santos et al 2010, Gonzalez and Valdes 2006). However, the spatial distribution of rain gauges in most parts of the world is not sufficient to capture reliable estimates of precipitation and its spatial variability (Easterling 2012). Additionally, individual rain gauges of a global observation network often have different lengths of records, which can affect the climatology and thus SPI-based drought information. Satellite precipitation data are an alternative to rain gauges, as they offer near real-time accessibility, better representation of the spatial variability of precipitation, and uninterrupted measurement (Sorooshian et al 2011). Recently, satellite precipitation data sets have been used for drought monitoring (Sheffield et al 2006, Paridal et al 2008, Zhang et al 2008). For example, the experimental African Drought Monitor provides drought conditions using a land-surface model trained by remotely sensed precipitation data (Sheffield et al 2006). Anderson et al (2008) analysed droughts based on vegetation response using remotely sensed precipitation data, and concluded that satellite data can improve drought monitoring.
One limitation of the near real-time and high resolution satellite precipitation data sets for drought monitoring is the relatively short record of data (currently, 10-14 years). The Global Precipitation Climatology Project (GPCP; Huffman et al 1997, Adler et al 2003) provides satellite-based gauge-adjusted precipitation data on a global scale. However, the GPCP data set is not available in real-time. In this paper, a Bayesian algorithm is used to combine real-time satellite data sets with GPCP observations to create a long-term and near real-time record with consistent climatology for drought analysis. Using the final merged product, a global data record of the SPI is generated for drought monitoring and analysis. This data set provides a basis for greater utilization of satellite data for drought monitoring, particularly in describing the spatial extent and dependences of drought events. It should be noted that this data set is data-driven (satellite-based observations corrected with gauge data), and numerical weather/climate models are not used in its development. Therefore, this data set can be used for validation and verification of model outputs.

Data resources
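In this study, the following remotely sensed precipitation products are used for drought analysis: the long-term GPCP satellite-based gauge-adjusted data set, and the real-time PERSIANN and TRMM-RT (TMPA-RT) satellite data sets.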

Methodology
Having long-term precipitation data is fundamental to reliable drought monitoring. The GPCP, with global coverage and long-term monthly data, has been widely used in weather and climate change studies. However, the data is not available in near real-time due to post-processing needed to combine all satellite and rain gauge data sets. This issue limits the application of long-term satellite data to real-time drought monitoring. In this study, as schematically shown in figure 1, the long-term GPCP satellite data is merged with real-time satellite estimates (here, TMPA-RT and PERSIANN) for drought analysis using a Bayesian-based correction algorithm.
In the merged data set, the climatology is driven by GPCP data, whereas the near past data (approximately 9-18 months) are based on real-time satellite precipitation data (typically available to the public within a few hours to days after observation).
In a recent study, Tian et al (2010) introduced a real-time bias adjustment method for correcting satellite data. In this paper, a similar methodology is adopted for creating a consistent climatology. Having both GPCP (G) and real-time satellite data (S) for the overlap period (2000-10), one can derive the joint probability P(G, S) and, using Bayes' theorem, the conditional probability P(G|S) (equation (1)), where G and S denote GPCP and real-time satellite data (here, PERSIANN and TRMM-RT), respectively. The conditional probability P(G|S) indicates the likelihood of the measurement G given the satellite observation S. For more detail about this methodology, the reader is referred to Tian et al (2010). The right-hand side of equation (1) can be computed for the overlap period (2000-10). Then, one can derive G for any S by maximizing P(G_i, S_j) using the maximum likelihood method. Using this approach, for the period in which GPCP observations (G) are not available and only real-time data (S) exist, the algorithm estimates the likely value of G given an observed S, based on the relationship established over the overlap period. The likely value of G, and hence the adjusted satellite data, is derived from historical values of G and S over the same calendar month (e.g., January, February) to account for seasonality. Figure 2 displays example time series of GPCP (solid blue), satellite data (here, PERSIANN) before correction (dashed red), and satellite data after the Bayesian correction (solid green) for two locations. One can see that the differences between real-time satellite data and GPCP observations are reduced after applying the Bayesian correction algorithm.
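The correction step can be illustrated with a small discrete sketch. It is not the authors' implementation; it assumes that G and S are binned on a common set of precipitation intervals, that the joint probability P(G_i, S_j) is estimated by counting co-occurrences over the overlap period, and that the corrected value is the centre of the G bin with the highest count for the observed S bin. Because P(G_i | S_j) = P(G_i, S_j) / sum_i P(G_i, S_j) by the definition of conditional probability, maximizing the joint count within a fixed S bin is equivalent to maximizing P(G_i | S_j). The bin edges, the fallback for unseen S bins, the tie-breaking rule, and the names `bin_index`, `fit_joint` and `correct` are all assumptions made for this example. To account for seasonality, one would fit a separate set of counts for each calendar month.

```python
from collections import Counter


def bin_index(value, edges):
    """Index of the bin [edges[k], edges[k+1]) containing value, clamped to the valid range."""
    for k in range(len(edges) - 1):
        if value < edges[k + 1]:
            return k
    return len(edges) - 2


def fit_joint(g_overlap, s_overlap, edges):
    """Count co-occurrences of (G bin, S bin); counts are proportional to P(G_i, S_j)."""
    counts = Counter()
    for g, s in zip(g_overlap, s_overlap):
        counts[(bin_index(g, edges), bin_index(s, edges))] += 1
    return counts


def correct(s_new, counts, edges):
    """Return the most likely G (bin centre) for a new real-time observation s_new."""
    j = bin_index(s_new, edges)
    column = {i: c for (i, jj), c in counts.items() if jj == j}
    if not column:
        # Assumption: leave the value unadjusted if this S bin never occurred.
        return s_new
    # Highest count wins; ties are broken in favour of the lower G bin.
    i_best = max(column, key=lambda i: (column[i], -i))
    return 0.5 * (edges[i_best] + edges[i_best + 1])


# Hypothetical monthly totals (mm) in which the satellite overestimates GPCP.
edges = [0, 50, 100, 150, 200]
counts = fit_joint([30, 40, 120, 130, 35], [60, 70, 160, 170, 80], edges)
corrected = correct(75, counts, edges)  # falls in S bin [50, 100), maps to G bin [0, 50)
```

In practice the number of bins trades resolution against the number of samples per bin; with only about a decade of overlap per calendar month, a coarse binning is the safer choice.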

Results: near real-time global SPI data
The GPCP data is available with a spatial resolution of 2.5°, while the real-time PERSIANN and TRMM-RT satellite data are available at a 0.25° resolution. In this study, the merged products of SPI are generated at the following spatial resolutions: (a) 2.5° grid: GPCP in the original resolution and real-time satellite data re-gridded onto a 2.5° grid; (b) 0.5° grid: all data sets re-gridded onto a common 0.5° grid.
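The re-gridding method is not specified here. As one possible illustration, a fine grid can be aggregated to a coarser one by averaging non-overlapping blocks of cells, for example 10 x 10 cells when going from 0.25° to 2.5°. The sketch below assumes aligned grids, an integer resolution ratio, and unweighted averaging (no latitude-dependent area weighting); the function name `block_average` is hypothetical.

```python
def block_average(grid, factor):
    """Aggregate a 2D grid (list of rows) by averaging factor x factor blocks.

    Cells set to None are treated as missing and skipped; a block with no
    valid cells yields None.
    """
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = []
    for r0 in range(0, rows, factor):
        coarse_row = []
        for c0 in range(0, cols, factor):
            cells = [
                grid[r][c]
                for r in range(r0, r0 + factor)
                for c in range(c0, c0 + factor)
                if grid[r][c] is not None
            ]
            coarse_row.append(sum(cells) / len(cells) if cells else None)
        coarse.append(coarse_row)
    return coarse
```

Going in the other direction (2.5° GPCP onto a 0.5° grid) requires interpolation or cell replication rather than averaging, which this sketch does not cover.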
The presented results are based on GPCP data from Jan. 1979 to Dec. 2010 and real-time PERSIANN and TRMM-RT satellite data from Jan. 2011 to Mar. 2012. The SPI obtained by merging GPCP and real-time satellite data is validated by removing one year of data (2010) from the overlap period (2000-10). The final product is then validated over 2010 by comparing SPI data from the corrected real-time satellite data with SPI data from GPCP. Table 1 lists the error (%) in the number of drought pixels derived from 6-month and 12-month SPI based on PERSIANN and TRMM-RT merged with GPCP for the validation period (2010), before and after the proposed Bayesian correction to the climatology. Here, drought is defined as pixels with SPI ≤ −1 (SPI = −1 indicates moderate drought). One can see that the combined GPCP-PERSIANN data at a 2.5° resolution leads to a relatively small error of 6.58% and 3.22% for 6-month and 12-month SPI data, respectively. Figure 3 displays 6-month SPI data for March and September 2010, whereas figure 4 presents 12-month SPI data for the same periods. In the two figures, the top panels show the reference GPCP data. The second row (panels 3(c), (d), 4(c), (d)) displays SPI data from GPCP-PERSIANN without any correction. Given the biases in satellite precipitation data (see Tian et al 2009, AghaKouchak et al 2012), one can see that the SPI data are rather unrealistic (e.g., compare figures 3(a) and (c)). This issue is reflected in the high errors in the combined GPCP-PERSIANN data before the Bayesian correction (see table 1). For example, comparing North America in figures 3(a), (c) and (e), one can see that there are major discrepancies between the reference data (figure 3(a)) and the uncorrected GPCP-PERSIANN data (figure 3(c)), especially over Canada. Figure 3(c), before the Bayesian correction, shows a major drought event over northern Canada that is not consistent with the reference observation (figure 3(a)). After the Bayesian correction (figure 3(e)), the drought condition over Canada is more consistent with the reference data.
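The validation metric can be sketched as follows. The exact formula behind the errors in table 1 is not given in this text, so the example assumes the error is the absolute difference between the number of drought pixels (SPI ≤ −1) in the merged product and in the GPCP reference, expressed as a percentage of the reference count. The function names are hypothetical.

```python
def drought_pixel_count(spi_grid, threshold=-1.0):
    """Number of valid pixels at or below the drought threshold."""
    return sum(
        1
        for row in spi_grid
        for value in row
        if value is not None and value <= threshold
    )


def drought_pixel_error(reference, estimate, threshold=-1.0):
    """Percent error in drought pixel count relative to the reference (assumed definition)."""
    n_ref = drought_pixel_count(reference, threshold)
    n_est = drought_pixel_count(estimate, threshold)
    if n_ref == 0:
        raise ValueError("reference grid contains no drought pixels")
    return 100.0 * abs(n_est - n_ref) / n_ref


# Hypothetical 2 x 2 SPI maps: 2 drought pixels in the reference, 3 in the estimate.
reference = [[-1.2, 0.3], [-1.0, -0.5]]
estimate = [[-1.5, -1.1], [-1.0, 0.2]]
error_percent = drought_pixel_error(reference, estimate)
```

A count-based metric of this kind measures the total drought extent only; two maps with the same number of drought pixels in different locations would score zero error, which is why the visual comparison of figures 3 and 4 complements table 1.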
As shown in figures 3(e), (f), 4(e), (f), the Bayesian correction leads to drought information more consistent with the reference data (GPCP). Note that in these figures (similar to table 1), GPCP data from 2010 is eliminated from the correction analysis. Figures 3(g) and (h) show 6-month SPI data based on combined GPCP and TRMM-RT before the proposed Bayesian correction for September and March 2010, respectively. Figures 3(i) and (j) present 6-month SPI data based on combined GPCP and TRMM-RT after the suggested correction. Figure 4 shows similar examples for 12-month SPI data before and after correction, and for both GPCP-PERSIANN and GPCP-TRMM-RT. A visual comparison and the error values provided in table 1 confirm that the Bayesian correction algorithm improves real-time drought monitoring by combining long-term low resolution GPCP data with real-time high resolution satellite observations.
SPI data from corrected real-time satellite data are generally consistent with GPCP data. For example, all data sets correctly identify the 2010 drought in the Horn of Africa (see figure 4), and the 2010 drought in Australia is captured in all data sets (see figures 3 and 4). However, discrepancies between SPI values from reference GPCP and corrected satellite data can be observed over several regions (e.g., China). Overall, the SPI data derived from the merged GPCP-PERSIANN are in better agreement with GPCP data for the validation period. Previous publications indicate that satellite precipitation data sets are subject to uncertainties and that different algorithms have their advantages and disadvantages over different geographical and climate regions (Turk et al 2008, AghaKouchak et al 2011a, Tian et al 2009, Ebert et al 2007, AghaKouchak et al 2011b, Hong et al 2006). For this reason, the authors believe combinations of multiple real-time satellite products and long-term GPCP observations should be considered for more reliable drought monitoring. While we acknowledge uncertainties in satellite precipitation data, we believe that integrating real-time satellite observations will provide additional information on droughts, especially on drought onset. In a recent study, Mo (2011) investigated drought onset in the United States and concluded that the SPI often detects drought onset a few months earlier than other drought indicators. This indicates that improvements in global drought analysis using satellite precipitation data sets could advance drought detection and monitoring. Integration of satellite data for real-time drought analysis is particularly important for regions where dense networks of observations are not available.
Figures 5 and 6 present example time series of 6-month (figure 5) and 12-month (figure 6) SPI data for several regions across the globe using the combined GPCP and PERSIANN data sets: Texas, USA (5(a) and 6(a)); Ethiopia (5(b) and 6(b)); Amazon (5(c) and 6(c)); Central Europe (5(d) and 6(d)); Australia (5(e) and 6(e)); India (5(f) and 6(f)); and China (5(g) and 6(g)). One can see that several recent major droughts can be detected from the proposed data set. For example, the 2011 Texas drought can be observed in both figures 5(a) and 6(a) (see the negative SPI values in 2011). Also, the 2010 droughts in Ethiopia (figures 5(b) and 6(b)) and the Amazon (figures 5(c) and 6(c)) can be detected from the time series.

Summary and conclusions
Reliable drought monitoring and analysis require long-term and continuous precipitation measurements. High resolution satellite data provide valuable precipitation information on a quasi-global scale. However, their short record length limits their application to drought monitoring. On the other hand, long-term low resolution satellite-based gauge-adjusted data sets such as GPCP are not available in real-time for timely drought monitoring. The overarching goal of this study is to bridge the gap between low resolution long-term satellite gauge-adjusted data and the emerging high resolution satellite precipitation data sets. It is worth mentioning that merging multiple data sets may lead to some level of inconsistency in the climatology, since different data sets may be biased with respect to each other and lead to unrealistic changes in the climatology at different periods in the record. For this reason, methods are necessary to create a consistent climatology from multiple data sets. This paper introduces a Bayesian approach for combining long-term GPCP and real-time satellite data (here, PERSIANN and TRMM-RT) to create a long-term climate data record for drought monitoring and analysis.
The results revealed that the combined data after the Bayesian correction improved significantly compared to the uncorrected data (original GPCP and real-time satellite data stitched together; see table 1). Figures 3 and 4 confirm that the combined GPCP-PERSIANN and GPCP-TRMM-RT exhibited less error during the validation period (2010) after implementing the correction. Furthermore, several major droughts such as the 2011 Texas, 2010 Horn of Africa, and 2010 Amazon droughts were detected in the combined real-time and long-term satellite observations. This highlights the potential application of satellite precipitation data for near real-time global drought monitoring.
Currently, real-time satellite observations are available within a few hours to days from observation. This provides a unique opportunity to create a real-time climate data record with monthly or weekly updates as the observations become available. We predict that more effort will be devoted in future to combining data sets from different sensors or sources to create long-term climate records. The authors acknowledge that satellite data sets have biases and uncertainties that could affect drought analysis. Similarly, numerical models and even ground observations are subject to different levels of uncertainty. Given that over many regions of the world no other source of precipitation information is available, satellite data sets cannot be ignored. The authors argue that the presented data set is particularly important for drought monitoring over remote and/or ungauged basins.
In recent years, regional and global climate models have been extensively used to study droughts and their causes. The presented observation-driven drought data is model-independent and can be used as a validation and verification data set. Finally, climate change and its impacts on extreme climate events including droughts have been the subject of numerous studies, most of which are based on climate simulations. The presented satellite-based data sets provide the opportunity to investigate changes in patterns and severity of droughts over the past three decades. The entire record of data sets presented in this study (1979-present) can be made available to interested researchers upon request.