Statistical modeling of monthly maximum temperature in Senegal

We provide the first statistical analysis of maximum temperature in Senegal. The data are from twelve stations spread across Senegal. The generalized extreme value distribution was fitted to maximum temperature by the method of maximum likelihood. Probability and quantile plots showed that the generalized extreme value distribution provided an adequate fit for all stations. Nine of the twelve stations did not exhibit significant trends in temperature. The remaining three stations exhibited significant positive trends. Estimates of return levels are given.


Introduction
Nearly 75 percent of the population in Senegal works in the agricultural sector, which is regularly threatened by adverse weather such as droughts and by other climatic changes. Droughts in Senegal have been occurring every two or three years since the 1970s. Along the Senegalese coastline between Saint Louis in the northwest and Ziguinchor in the southwest (passing through Dakar, Mbour and Fatick), studies have shown that each year is warmer than the previous one and that all the months are affected by warming. Hence, it is important to assess the extreme values of temperature.
The generalized extreme value (GEV) distribution is a traditional and widely used model for extreme values. The GEV distribution has been applied to data from many countries, but only three papers have applied it to climate data from Senegal. Sarr et al (2015) fitted the GEV distribution to extreme precipitation data from six stations in Senegal. They found that projected changes in extreme precipitation are not consistent across stations and return periods. Sane et al (2018) computed intensity-duration-frequency rainfall curves by fitting the GEV distribution to data from fourteen stations in Senegal. The fitted curves were extrapolated to give a spatial map for Senegal. Wilcox et al (2018) fitted the GEV distribution to extreme floods in the period 1950-2015 for seven tributaries in the Sudano-Guinean part of the Senegal River basin and four data sets in the Sahelian part of the Niger River basin. For the Senegal basin, stations switched from a decreasing streamflow trend to an increasing streamflow trend in the early 1980s. In the Niger basin the trend has been generally positive since the 1970s. None of these papers addresses extreme temperature.
Several papers, however, have studied temperature in Senegal. Vukovich et al (1987) derived surface temperature and albedo relationships in Senegal using satellite data. Thiam and Singh (2002) performed space-time-frequency analysis of rainfall, runoff and temperature in the Casamance River basin, southern Senegal. Fall et al (2006) presented a geographical information systems-based analysis of monthly rainfall (twenty stations) and mean temperature (twelve stations) for 1971 to 1998 in Senegal. Stisen et al (2007) estimated diurnal air temperature using data in West Africa during the 2005 rainy season. Aifa and Dabo (2015) studied microstructures and temperature variability during the Eburnean deformations in the Dalema area, eastern Senegal. Djaman et al (2017) investigated trends in annual precipitation, sunshine duration, wind speed, annual mean minimum temperature, monthly mean minimum temperature, annual mean maximum temperature, monthly mean maximum temperature and relative humidity for six locations in Senegal for 1950-2000. Manzanas (2017) tested the suitability of statistical downscaling approaches to generate seasonal forecasts of daily maximum temperature and daily maximum precipitation for stations in Senegal during 1979-2000. Brottem and Brooks (2018) examined obstacles to rural livelihood adaptations to hotter 21st century temperatures in eastern Senegal. Sambou et al (2020) studied springtime heat wave occurrences over Senegal using data from 12 stations. Sambou et al (2021) studied long-term (1950-2100) observed and projected changes in springtime heat waves in the Sahel, Senegal and three thermally coherent zones within Senegal. Again, none of these papers applies statistical models to extreme temperature in Senegal. The aim of this paper is to provide the first statistical analysis of extreme values of temperature in Senegal.
The analysis allows us to answer the following questions and more: What are the hottest areas with respect to extreme temperature? What are the coolest areas with respect to extreme temperature? The answers could lead to actions (for example, increased agricultural production in the coolest areas and planting of drought-resistant crops in the hottest areas) that may help to improve the economy of Senegal.
The contents of the paper are organized as follows. Section 2 describes the data from twelve locations in Senegal: Dakar, Cap Skirring, Diourbel, Kolda, Kedougou, Ziguinchor, Saint Louis, Matam, Tambacounda, Linguere, Podor and Kaolack. Section 3 describes the method used to analyze the data. Section 4 presents the results of the method and their discussion. The paper is concluded in section 5.

Data
The data are monthly temperatures in degrees Celsius for twelve stations in Senegal. The station names and years of record are given in table 1. The locations of the stations are shown in figure 1. We see that the stations give a good representation of the geography of Senegal. The data were obtained from the Department of Meteorology in Dakar.
We take monthly maximum temperature for each year as the extreme value. It was computed as the maximum of the twelve monthly values. Some summary statistics (mean, median, skewness, kurtosis, standard deviation, range, minimum and maximum) of the monthly maximum temperature are also shown in table 1.
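The block-maximum step described above can be sketched as follows. This is only an illustration: the dictionary `monthly` and its values are hypothetical and are not the station data, and the function name `annual_maxima` is an assumption introduced here.

```python
# Hypothetical monthly temperatures (degrees Celsius) keyed by year.
# These numbers are made up for illustration; they are NOT the paper's data.
monthly = {
    2001: [30.1, 31.4, 33.0, 34.8, 35.2, 33.9, 31.7, 30.9, 31.5, 33.1, 32.6, 30.4],
    2002: [29.8, 31.9, 33.6, 36.0, 35.1, 34.2, 32.0, 31.1, 31.8, 33.4, 32.2, 30.0],
}


def annual_maxima(monthly_by_year):
    """Return {year: maximum of the twelve monthly values for that year}."""
    return {year: max(values) for year, values in sorted(monthly_by_year.items())}


maxima = annual_maxima(monthly)
```

Each station would yield one such series of yearly values, which is the sample to which the GEV distribution is then fitted.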
The largest monthly maximum temperature varies between 29.30 degrees Celsius and 37.32 degrees Celsius across stations. The largest temperature of 37.32 degrees Celsius was observed in Matam. The smallest monthly minimum temperature varies between 19.11 degrees Celsius and 25.44 degrees Celsius. The smallest temperature was observed in Dakar. The mean values for Kedougou, Kolda and Tambacounda are smaller than their median values, which suggests that the monthly maximum temperatures are negatively skewed for these locations. The mean values for Cap Skirring, Dakar, Diourbel, Kaolack, Linguere, Matam, Podor, Saint Louis and Ziguinchor are larger than their median values, which suggests that the monthly maximum temperatures are positively skewed for these locations. The kurtosis values for all but two of the locations are less than 3, which indicates that their distributions have lighter tails than the normal distribution. The kurtosis values for Diourbel and Kaolack are larger than 3, which indicates that their distributions have heavier tails than the normal distribution. Matam has the largest standard deviation with a value of 3.331881. Ziguinchor has the smallest standard deviation.

Method
Let X denote a random variable representing the monthly maximum temperature. According to extreme value theory (see Leadbetter et al 1983, Resnick 1987 and Embrechts et al 1997), the cumulative distribution function of X can be approximated by

$$F(x) = \exp\left\{ -\left[ 1 + \xi \, \frac{x - \mu}{\sigma} \right]^{-1/\xi} \right\} \qquad (1)$$

for 1 + ξ(x − μ)/σ > 0, where μ denotes a location parameter, σ > 0 denotes a scale parameter and − ∞ < ξ < ∞ denotes a shape parameter. Note that if ξ > 0 then X has a heavy tail bounded below by μ − σ/ξ. If ξ < 0 then X has a short tail bounded above by μ − σ/ξ. The distribution in (1) is the GEV distribution.

The GEV distribution was fitted to the data in section 2 by the method of maximum likelihood. Suppose $x_1, x_2, \ldots, x_n$ is an enumeration of the data in section 2. The maximum likelihood estimates of μ, σ and ξ were obtained by maximizing

$$L(\mu, \sigma, \xi) = \sigma^{-n} \prod_{i=1}^{n} \left[ 1 + \xi \, \frac{x_i - \mu}{\sigma} \right]^{-1/\xi - 1} \exp\left\{ -\sum_{i=1}^{n} \left[ 1 + \xi \, \frac{x_i - \mu}{\sigma} \right]^{-1/\xi} \right\}$$

over all possible values of μ, σ and ξ. The maximum likelihood estimates are the values of μ, σ and ξ corresponding to the maximum of L(μ, σ, ξ). The maximization was performed using the command fgev in the R package evd (Stephenson 2018, R Core Team 2022). Other distributions (for example, the normal distribution) may provide better fits to the monthly maximum temperature. But the GEV distribution is theoretically justified.

Let $\hat{\mu}$, $\hat{\sigma}$ and $\hat{\xi}$ denote the maximum likelihood estimates of μ, σ and ξ, respectively. A quantity of interest based on (1) is the T-year return level, loosely interpreted as the monthly maximum temperature expected on average once in every T years. Let $x_T$ denote the T-year return level corresponding to (1). It must satisfy

$$\exp\left\{ -\left[ 1 + \hat{\xi} \, \frac{x_T - \hat{\mu}}{\hat{\sigma}} \right]^{-1/\hat{\xi}} \right\} = 1 - \frac{1}{T}. \qquad (2)$$

Inverting (2), we obtain

$$x_T = \hat{\mu} - \frac{\hat{\sigma}}{\hat{\xi}} \left\{ 1 - \left[ -\log\left( 1 - \frac{1}{T} \right) \right]^{-\hat{\xi}} \right\}. \qquad (3)$$
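As a rough numerical illustration of the quantities in this section, the sketch below codes the GEV distribution function, the negative log-likelihood that a maximum likelihood routine would minimize, and the T-year return level. It is written in plain Python for illustration only; the paper's fitting was done with fgev in R. The function names and the parameter values are hypothetical, and the case ξ = 0 is not handled.

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """GEV distribution function; assumes xi != 0."""
    z = 1.0 + xi * (x - mu) / sigma
    if z <= 0.0:
        # Outside the support: below the lower bound (xi > 0)
        # or above the upper bound (xi < 0).
        return 0.0 if xi > 0 else 1.0
    return math.exp(-z ** (-1.0 / xi))


def neg_log_likelihood(data, mu, sigma, xi):
    """Negative GEV log-likelihood; returns inf outside the parameter space."""
    if sigma <= 0.0:
        return math.inf
    total = len(data) * math.log(sigma)
    for x in data:
        z = 1.0 + xi * (x - mu) / sigma
        if z <= 0.0:
            return math.inf
        total += (1.0 / xi + 1.0) * math.log(z) + z ** (-1.0 / xi)
    return total


def return_level(T, mu, sigma, xi):
    """T-year return level: the solution x_T of F(x_T) = 1 - 1/T."""
    y = -math.log(1.0 - 1.0 / T)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


# Hypothetical parameter values, chosen only to exercise the functions.
mu, sigma, xi = 35.0, 1.2, -0.2
x100 = return_level(100, mu, sigma, xi)
```

With a negative shape parameter, as in this made-up example, every return level stays below the upper bound μ − σ/ξ, and substituting a return level back into the distribution function recovers 1 − 1/T.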

Results and discussion
The GEV distribution was fitted to the monthly maximum temperature data from each of the twelve stations. The estimates, standard errors and 95 percent confidence intervals for the parameters of the GEV distribution are shown in table 2. The largest probable maximum of monthly maximum temperature is for Podor, and the second largest is for Tambacounda. The smallest probable maximum of monthly maximum temperature is for Kolda, and the second smallest is for Linguere.
The standard errors appear smallest for Cap Skirring with respect to all three parameters. The standard errors appear largest for Dakar with respect to the shape parameter and largest for Kedougou with respect to the scale and location parameters.
The fit of the GEV distribution for each station was checked by probability plots and quantile plots. The plots are shown in figures 2 and 3 for the twelve stations.
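Probability plots are plots of $\hat{F}(x_{(i)})$, the observed probabilities, versus i/(n + 1), the expected probabilities, where $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$ are the data arranged in increasing order and

$$\hat{F}(x) = \exp\left\{ -\left[ 1 + \hat{\xi} \, \frac{x - \hat{\mu}}{\hat{\sigma}} \right]^{-1/\hat{\xi}} \right\}. \qquad (4)$$

The confidence intervals in figure 2 were computed as follows: (i) simulate a random sample of size n from $\hat{F}$; (ii) refit the GEV distribution to the sample and let $\tilde{\mu}$, $\tilde{\sigma}$ and $\tilde{\xi}$ denote the parameter estimates; (iii) compute $\tilde{F}(x_{(i)})$, the value of (4) with $(\hat{\mu}, \hat{\sigma}, \hat{\xi})$ replaced by $(\tilde{\mu}, \tilde{\sigma}, \tilde{\xi})$, for i = 1, 2, …, n; (iv) repeat steps (i)-(iii) 10 000 times, giving 10 000 values for $\tilde{F}(x_{(i)})$; (v) compute the empirical distribution function of the 10 000 values in step (iv), denoting it by $G_i$; (vi) plot $G_i^{-1}(0.025)$ and $G_i^{-1}(0.975)$, the 95 percent simulated confidence interval, versus i/(n + 1) for i = 1, 2, …, n.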
The confidence intervals in figure 3 were computed as follows: (i) simulate a random sample of size n from $\hat{F}$; (ii) refit the GEV distribution to the sample and let $\tilde{\mu}$, $\tilde{\sigma}$ and $\tilde{\xi}$ denote the parameter estimates; (iii) compute $\tilde{F}^{-1}(i/(n + 1))$ for i = 1, 2, …, n, where $\tilde{F}^{-1}$ denotes the inverse function of (4) evaluated at the refitted estimates; (iv) repeat steps (i)-(iii) 10 000 times, giving 10 000 values for $\tilde{F}^{-1}(i/(n + 1))$; (v) compute the empirical distribution function of the 10 000 values in step (iv), denoting it by $G_i$; (vi) plot $G_i^{-1}(0.025)$ and $G_i^{-1}(0.975)$, the 95 percent simulated confidence interval, versus $x_{(i)}$ for i = 1, 2, …, n.
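The simulation steps for the probability plot can be sketched in plain Python as below. This is a sketch under stated assumptions and not the authors' code. To stay self-contained, the refit in step (ii) uses L-moment estimates (Hosking's approximation) as a lightweight stand-in for maximum likelihood; in practice a maximum likelihood routine would be passed through the `fit` argument. All function names, the seed and the example parameter values are hypothetical.

```python
import math
import random


def gev_cdf(x, mu, sigma, xi):
    """GEV distribution function; assumes xi != 0."""
    z = 1.0 + xi * (x - mu) / sigma
    if z <= 0.0:
        return 0.0 if xi > 0 else 1.0
    t = -math.log(z) / xi  # log of z ** (-1 / xi)
    return 0.0 if t > 700.0 else math.exp(-math.exp(t))


def gev_sample(n, mu, sigma, xi, rng):
    """Draw n GEV values by inverting the distribution function."""
    out = []
    while len(out) < n:
        u = rng.random()
        if u > 0.0:  # avoid log(0)
            out.append(mu + sigma / xi * ((-math.log(u)) ** (-xi) - 1.0))
    return out


def lmom_fit(data):
    """L-moment estimates (mu, sigma, xi): a stand-in for maximum likelihood."""
    x = sorted(data)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    l1, l2, l3 = b0, 2 * b1 - b0, 6 * b2 - 6 * b1 + b0
    c = 2.0 / (3.0 + l3 / l2) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c * c  # Hosking's shape, k = -xi
    g = math.gamma(1.0 + k)
    sigma = l2 * k / ((1.0 - 2.0 ** (-k)) * g)
    mu = l1 - sigma * (1.0 - g) / k
    return mu, sigma, -k


def probability_plot_band(data, fitted, n_boot=1000, level=0.95,
                          fit=lmom_fit, seed=1):
    """Simulated interval for the fitted probability at each ordered value."""
    rng = random.Random(seed)
    xs = sorted(data)
    n = len(xs)
    sims = [[] for _ in range(n)]
    for _ in range(n_boot):
        sample = gev_sample(n, *fitted, rng)      # step (i)
        params = fit(sample)                      # step (ii)
        for i, x in enumerate(xs):                # step (iii)
            sims[i].append(gev_cdf(x, *params))
    tail = (1.0 - level) / 2.0
    band = []
    for values in sims:                           # empirical percentiles
        values.sort()
        lo = values[int(tail * (n_boot - 1))]
        hi = values[int((1.0 - tail) * (n_boot - 1))]
        band.append((lo, hi))
    return band


# Hypothetical example: simulated "observations", not station data.
data = gev_sample(40, 36.0, 1.0, -0.2, random.Random(0))
fitted = lmom_fit(data)
band = probability_plot_band(data, fitted, n_boot=200)
```

The quantile plot intervals follow the same pattern, with the refitted quantile function evaluated at i/(n + 1) in place of the refitted distribution function evaluated at the ordered data.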
The closer the plotted points (in figures 2 and 3) are to the diagonal lines the better the fit. The plotted points must lie within the simulated confidence intervals for the fit to be considered adequate. The plotted points in figures 2 and 3 lie within these intervals. Hence, the fit of the GEV distribution for monthly maximum temperature from the twelve locations is adequate.
Having checked the goodness of fit, we computed (3) for every station and a range of values of T. Plots of $x_T$ versus T = 2, …, 1000 are shown in figure 4. As expected, the return level estimates increase with the return period. The return level estimates are largest for Matam, second largest for Kedougou, third largest for Tambacounda, fourth largest for Podor, fifth largest for Linguere, sixth largest for Kolda, seventh largest for Diourbel, eighth largest for Kaolack, ninth largest for Saint Louis, tenth largest for Ziguinchor and eleventh largest for Cap Skirring. The return level estimates are smallest for Dakar. This ordering of locations with respect to return levels is consistent with the spatial distribution of the mean annual temperature given in figure 20 of Fall et al (2006). The fact that Kolda has larger return levels than Ziguinchor is confirmed by table 4 in Thiam and Singh (2002), which gives values of mean, minimum, maximum, standard deviation and coefficient of variation for mean annual temperature for the two locations. Also, the fact that return levels are high for Matam and Tambacounda is consistent with figure 10 in Stisen et al (2007).
The largest estimates for Matam can be explained by its geographical location: Matam lies in a part of Senegal that is influenced by the deserts of Mali and Mauritania. The smallest estimates for Saint Louis, Ziguinchor, Cap Skirring and Dakar can be explained by their proximity to the sea.
Finally, we investigated whether there are significant trends in the monthly maximum temperature for each station. We fitted (1) with the location parameter μ = a + b × (Year − 2001), where b is the trend parameter. Comparing the fit of this model with the earlier fit of the GEV distribution shows whether the trend is significant. We also fitted other models depending on Year − 2001, but they did not provide significantly better fits. The methodology used for fitting models like μ = a + b × (Year − 2001) is described in Chapter 6 of Coles (2001). Table 3 lists the station names, the parameter estimates of a and b, and p-values showing significance of the trend. We see that only three of the stations exhibit significant trends. All three stations exhibit positive trends. These trends may be due to climate change or other factors. The positive trends in these locations are consistent with mean annual temperature trends shown in figure 25 of Fall et al (2006). The smallest of the trends is for Dakar, which is in a coastal area, unlike Kedougou and Tambacounda. The coastal area having the least increase is consistent with Djaman et al (2017).
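One common way to compare the trend model with the stationary fit is a likelihood ratio (deviance) test, as in Chapter 6 of Coles (2001). The sketch below assumes that this is the comparison behind the p-values; the text does not spell out the test. The function names and the log-likelihood values are hypothetical.

```python
import math


def location(year, a, b):
    """Time-varying location parameter: mu = a + b * (Year - 2001)."""
    return a + b * (year - 2001)


def trend_lr_test(loglik_stationary, loglik_trend):
    """Deviance and p-value for adding one trend parameter.

    Under the null hypothesis b = 0 the deviance is approximately
    chi-squared with 1 degree of freedom, whose upper tail probability
    equals erfc(sqrt(deviance / 2)).
    """
    deviance = max(2.0 * (loglik_trend - loglik_stationary), 0.0)
    p_value = math.erfc(math.sqrt(deviance / 2.0))
    return deviance, p_value


# Hypothetical maximized log-likelihoods, for illustration only.
deviance, p_value = trend_lr_test(-52.3, -49.1)
```

A small p-value would indicate that allowing the location parameter to change with the year improves the fit by more than chance alone would explain.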

Conclusions
This paper has provided the first statistical analysis of maximum temperature in Senegal involving data from twelve stations. The generalized extreme value distribution was shown to provide an adequate fit (as assessed by probability plots and quantile plots) to data from each station.
The hottest areas with respect to return levels are Matam, Kedougou and Tambacounda. The coolest areas with respect to return levels are Ziguinchor, Cap Skirring and Dakar. Three of the stations (Dakar, Kedougou and Tambacounda) exhibited significant positive trends in maximum temperature. The remaining stations did not exhibit significant trends.
The results presented in this paper can inform positive actions by the Government of Senegal: for example, vegetables and other commodities less reliant on rain can be planted in the coolest areas, and electricity production through solar energy can be increased in the hottest areas.

Data availability statement
The data that support the findings of this study are available upon reasonable request from the authors.