The application of spatial empirical Bayesian smoothing method in spatial analysis of bacillary dysentery: A case study in Yudu County, Jiangxi Province

Bacillary dysentery (BD) has become a major public health threat to sustainable human development. The primary objective of this paper is to validate the effectiveness of the spatial empirical Bayesian smoothing (SEBS) method in the spatial analysis of BD in Yudu County, Jiangxi Province. We calculated BD incidence with the SEBS method and compared the raw and smoothed incidence data. Spatial distribution maps and global spatial autocorrelation analysis were used to explore the effect of the SEBS method on BD analysis. The results showed that the SEBS method provided a stable incidence estimate for epidemic research. The method could serve as an effective tool for studying the spatial distribution of BD, especially in town-level studies.


Introduction
Bacillary dysentery (BD) is an intestinal infectious disease caused by bacteria of the genus Shigella and is listed as a class-B notifiable infectious disease in China [1,2]. The incidence of BD ranks among the top five of the class-A and class-B notifiable infectious diseases in China, seriously affecting people's health and quality of life [3]. Millions of people worldwide are also infected with BD every year [4]. From a global perspective, BD is a crucial public health problem that hinders sustainable human development in the 21st century [5,6].
With the assistance of geographic information systems and spatial statistics, the spatial distribution characteristics of a disease can be mapped in order to monitor and analyse the spatial pattern of its incidence [7,8]. However, when the population or the number of infected cases in the study area is small, as is common in town-scale studies, raw incidence calculated directly from cases and population tends to fluctuate, making it difficult to identify real regional differences. Spatial empirical Bayesian smoothing (SEBS) is a method for adjusting or smoothing spatial variables; it can reduce the influence of unreliable information such as small samples or extreme values and thereby yield a more stable incidence estimate in epidemic research [9][10][11].
Using the SEBS method, this paper analysed the spatial distribution pattern of BD in Yudu County, Jiangxi Province, in 2017. We calculated the raw and SEBS-based incidence of BD and compared their spatial distribution and autocorrelation, aiming to show that the SEBS method is more effective for the spatial analysis of BD incidence, especially at small spatial scales.
IOP Conf. Series: Earth and Environmental Science 568 (2020) 012009, IOP Publishing, doi:10.1088/1755-1315/568/1/012009

Study Area
Yudu County is located in the south of Jiangxi Province, extending from 25°35′8″ to 26°20′53″ N and from 115°11′ to 115°49′ E. The county covers an area of 2893 km² and includes 23 towns and villages (Figure 1). It lies in the subtropical humid monsoon climate zone, with a warm and humid climate, four distinct seasons, and sufficient light and rainfall [12].

Data
The BD case data for 2017 used in this paper were retrieved from the Center for Disease Control and Prevention of Yudu County. Cases were recorded daily, locations were accurate to the township (street) level, and the other recorded attributes included gender, age, and occupation. Demographic data came from the 2017 China county statistical yearbook. The data were analysed at town level, and the incidence of BD was calculated as the number of new cases per 100,000 population in one year.
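The incidence definition above can be illustrated with a minimal Python sketch. The town names and counts below are hypothetical placeholders chosen for illustration, not data from this study, and the helper name `incidence_per_100k` is likewise an assumption.

```python
def incidence_per_100k(cases, population):
    """Annual incidence: new cases per 100,000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * 100_000


# Hypothetical (cases, population) pairs, for illustration only.
towns = {
    "Town A": (12, 45_000),
    "Town B": (0, 8_000),
    "Town C": (3, 2_500),
}

rates = {name: incidence_per_100k(c, p) for name, (c, p) in towns.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.2f} per 100,000")
```

In this toy example, the smallest town has the highest rate on the basis of only three cases, which is the kind of instability that motivates smoothing.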

Methodology
The spatial empirical Bayesian smoothing method defines a neighborhood for each spatial area and smooths each area's incidence using the size of the neighborhood, the incidence within it, and the mean and variance of incidence across the neighborhood. The incidence of the disease can then be recalculated. The SEBS estimate is defined as follows [13]:

$$\hat{z}(u_i) = \lambda(u_i)\, z(u_i) + \left[1 - \lambda(u_i)\right] m^{*}(u_i)$$

where $u_i$ represents study area $i$, $\hat{z}(u_i)$ is the smoothed incidence, $\lambda(u_i)$ is the Bayesian shrinkage factor, $z(u_i)$ is the raw incidence, and $m^{*}(u_i)$ is the population-weighted average incidence in the neighborhood. Smoothing reduces the spatial differences in incidence caused by a small population base and makes the results more consistent with the first law of geography. The raw and smoothed incidence data were first displayed as two town-level incidence maps to visualize and compare the spatial distribution of BD incidence.
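The smoothing step can be sketched in Python as follows. The paper does not state how the shrinkage factor is estimated, so this sketch assumes the method-of-moments estimator commonly used for local empirical Bayes rate smoothing; the function name `sebs_smooth`, the neighbor lists, and the counts are all hypothetical.

```python
def sebs_smooth(cases, pop, neighbors):
    """Local empirical Bayes smoothing of rates (method-of-moments sketch).

    cases, pop : lists of case counts and populations, one entry per area
    neighbors  : dict mapping area index -> list of neighboring area indices
    Returns smoothed rates as proportions (multiply by 100,000 for incidence).
    """
    smoothed = []
    for i in range(len(cases)):
        # Neighborhood of area i: the area itself plus its neighbors.
        hood = [i] + [j for j in neighbors[i] if j != i]
        n_tot = sum(pop[j] for j in hood)
        # Population-weighted mean rate in the neighborhood, m*(u_i).
        m = sum(cases[j] for j in hood) / n_tot
        n_bar = n_tot / len(hood)
        # Moment estimate of the between-area variance (assumed estimator),
        # truncated at zero when the estimate is negative.
        var = sum(pop[j] * (cases[j] / pop[j] - m) ** 2 for j in hood) / n_tot
        var = max(var - m / n_bar, 0.0)
        # Shrinkage factor lambda(u_i): near 1 for large populations,
        # near 0 for small ones.
        denom = var + m / pop[i]
        lam = var / denom if denom > 0 else 0.0
        z = cases[i] / pop[i]  # raw rate z(u_i)
        smoothed.append(lam * z + (1.0 - lam) * m)
    return smoothed


# Hypothetical data: four areas along a line, each adjacent to the next.
cases = [0, 5, 2, 40]
pop = [500, 20_000, 15_000, 30_000]
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

smoothed = sebs_smooth(cases, pop, neighbors)
for raw_rate, s in zip((c / p for c, p in zip(cases, pop)), smoothed):
    print(f"raw {raw_rate * 1e5:8.2f}  smoothed {s * 1e5:8.2f}  (per 100,000)")
```

In this toy example the first area, which has a very small population and zero cases, is pulled towards its neighborhood mean, while the populous last area stays close to its raw rate.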
Spatial autocorrelation analysis was then used to explore the spatial association of BD incidence. The global Moran's I index was computed for both the raw and the smoothed incidence data in order to compare the two datasets and to test the effect of the SEBS method. The global Moran's I ranges between -1 and 1. A positive value indicates positive spatial association, that is, nearby areas have similar incidence. Conversely, a negative value indicates that nearby areas tend to have dissimilar incidence [14,15]. A value approaching 0 indicates a random spatial distribution of BD incidence.
The annual incidence of each of the 23 townships in Yudu County was calculated from its population base. The raw and SEBS-based spatial distributions of BD incidence are compared in Figure 2. Figures 2(a) and 2(b) show the spatial distribution of raw and SEBS-based incidence in 2017, respectively. In both maps, high incidence was concentrated in Pangushan, Jingshi Township and Licun Township, and these townships deserve particular attention. The incidence across the 23 townships in Figure 2(b) was smoother than in Figure 2(a): annual incidence ranged from 0.00 to 138.63 in (a), compared with 0.74 to 133.81 in (b). This result suggests that the SEBS method reduced the impact of missing cases on the incidence rate. Meanwhile, for towns with a small population or few infected cases, the SEBS method smoothed the incidence towards the local average.

Spatial autocorrelation analysis
The global Moran's I index was used to estimate the spatial autocorrelation of BD incidence in Yudu. The spatial autocorrelation results for raw and SEBS-based BD incidence are listed in Table 1. Overall, the global Moran's I for BD incidence in 2017 was positive, suggesting clustering of similar incidence rates at town level in Yudu. The Moran's I value for SEBS-based incidence was larger, implying stronger spatial correlation of smoothed incidence across Yudu County.
The major difference between the two methods was the level of statistical significance: the global Moran's I under the SEBS method was significant at the 0.05 level, whereas that from the raw data was significant only at the 0.1 level. This result further supports the effectiveness of the SEBS method from a statistical point of view.
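For readers who wish to reproduce this kind of comparison, the global Moran's I and a significance test can be sketched in Python as follows. The paper does not state which inference procedure was used, so the one-sided permutation test here is an assumption, as are the function names, the binary contiguity weights, and the toy values.

```python
import random


def morans_i(values, weights):
    """Global Moran's I.

    values  : list of attribute values (e.g. incidence), not all equal
    weights : dict mapping (i, j) -> spatial weight w_ij for i != j
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(weights.values())  # sum of all spatial weights
    num = sum(w * dev[i] * dev[j] for (i, j), w in weights.items())
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


def permutation_p_value(values, weights, n_perm=999, seed=0):
    """One-sided pseudo p-value for positive autocorrelation (assumed test)."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # randomly reassign values to areas
        if morans_i(shuffled, weights) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (n_perm + 1)


# Hypothetical example: four areas along a line, binary contiguity weights.
w = {(0, 1): 1, (1, 0): 1, (1, 2): 1, (2, 1): 1, (2, 3): 1, (3, 2): 1}
values = [1.0, 2.0, 3.0, 4.0]

i_obs, p = permutation_p_value(values, w)
print(f"Moran's I = {i_obs:.3f}, pseudo p-value = {p:.3f}")
```

Running the same functions on the raw and the smoothed incidence with an identical weights matrix would give the two statistics being compared. With only a few areas, the pseudo p-value is coarse and should be interpreted cautiously.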

Discussion
Analyzing the spatial characteristics of BD is of great theoretical and practical significance for monitoring, preventing and controlling the epidemic. In this research, the spatial empirical Bayesian smoothing approach was used to analyse the spatial distribution pattern of BD in Yudu County, Jiangxi Province, China. According to the demographic data and BD case data, both population and infected cases were unevenly distributed across the region. Small case counts and populations may result in unstable incidence. Furthermore, the numbers reported in each township may omit some cases, which further increases the instability of incidence. To address these issues, SEBS was applied to recalculate the incidence of BD.
Here, we compared the spatial distribution maps and global Moran's I of raw and SEBS-based BD incidence. The SEBS-based incidence map reduced the effect of small populations and case counts and dampened the impact of spatial outliers, making the incidence map smoother. This finding is consistent with previous studies [16][17][18].
Global Moran's I was chosen to estimate spatial autocorrelation because it measures spatial dependence at the global scale [19,20]. By relating the similarity of attribute values to the spatial weights between areas, it reflects the distribution pattern of BD incidence. The comparison of Moran's I values indicates that the clustering trend of BD is more significant after smoothing and that the spatial dependence of SEBS-based incidence is more convincing under the statistical significance test. Overall, the SEBS method should be considered in spatial epidemiology studies, especially regional-scale studies.
Owing to limited data availability, this study examined the effect of the SEBS method in Yudu County for a single year. More data are needed to test the effectiveness of the SEBS method further, and we call for future research on the SEBS method at different spatial and temporal scales.
This paper incorporated the SEBS method into the spatial analysis of BD at town level. The comparison of distribution maps and spatial autocorrelation analysis showed that SEBS-based incidence was more reasonable than raw incidence in a town-level study. The findings suggest that the SEBS method is recommended when the number of infected cases or the population of the study area is small, especially in town-scale research. In addition, the results can provide a valuable reference for further epidemiological investigation and disease prevention.