Poisson regression application for sociodemographic aspects of Acute Respiratory Infections (ARI)

Acute respiratory infection (ARI) is an infectious disease caused by bacteria or viruses that attack the respiratory organs. ARI in toddlers is referred to as pneumonia. This research aims to determine the Poisson and negative binomial regression models, and the significant factors affecting ARI in toddlers in East Nusa Tenggara (NTT) province, based on sociodemographic aspects. Poisson regression is used to model ARI cases in toddlers in East Nusa Tenggara province based on sociodemographic aspects, and negative binomial regression is used to overcome overdispersion in the Poisson regression. The results showed that the Poisson regression model was overdispersed and therefore not suitable; the suitable model for ARI cases in toddlers in East Nusa Tenggara based on sociodemographic aspects is the negative binomial regression model. The data processing yielded the negative binomial regression model $\mu_i = \exp(3.5260 + 0.021 X_2)$. The result of this study indicates that the significant sociodemographic aspect affecting ARI cases in toddlers in NTT is poverty ($X_2$).


Introduction
One of the main problems facing the world in the health sector is infectious disease, whether driven by the causative agents themselves or by lifestyle. Acute respiratory infection (ARI) is a contagious disease. The burden of ARI has tended to increase in the last few decades, both globally and nationally. Nationally, the prevalence of ARI in Indonesia increased from 23% in 2010 to 65.27% in 2016 [1]. ARI has become the leading killer of toddlers in the world. This disease has become a public health problem in both developed and developing countries [2].
ARI in toddlers is referred to as pneumonia and is caused by influenza A virus, adenovirus, and parainfluenza virus. ARI is a disease that often occurs in children. The incidence in the toddler age group is estimated to be 96.7% in developing countries. Most cases occurred in India (43 million), China (21 million), and Pakistan (10 million), as well as Bangladesh, Indonesia, and Nigeria with 6 million episodes each [2].
Based on the 2018 basic health research, East Nusa Tenggara (NTT) ranks 4th out of 34 provinces for this problem, with 7.5% of ARI patients [1]. Among the 22 districts in NTT with reported ARI prevalence rates, Kupang District has the highest rate, at 61.07%. ARI also remains the main cause of infant and toddler mortality in NTT, accounting for 80% to 90% [3].
Research by M. Habibi Syahidi shows that ARI in toddlers is caused by various factors, including environmental factors, individual child factors, and behavioral factors. These factors are related to population characteristics. Sociodemography is the scientific study of the structure and processes of the population in an area, where structural changes in the population are also influenced by social processes and social changes in society. Demography does not study the population as individuals, but as a group [4]. The three sociodemographic aspects studied relate to the factors causing ARI: population density is part of the environmental factors, poverty is part of the behavioral factors, and malnutrition status is part of the individual child factors.
This research aims to determine the Poisson and negative binomial regression models, and the significant factors affecting ARI in toddlers in East Nusa Tenggara province, based on sociodemographic aspects. Poisson regression is used to model ARI cases in toddlers in East Nusa Tenggara province based on sociodemographic aspects, and negative binomial regression is used to overcome overdispersion in the Poisson regression.
Poisson regression is a nonlinear regression analysis within the Generalized Linear Model (GLM) framework. It is used when the dependent variable consists of count data and the independent variables are categorical, interval, or count data. Poisson regression must satisfy the equidispersion assumption, that is, the mean and the variance of the dependent variable are equal. In practice, the variance and the mean of the data often differ. Overdispersion is the condition in which the variance is greater than the mean. If the equidispersion assumption is violated in this study, with the variance greater than the mean, the overdispersion is resolved with a negative binomial regression model [5].
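As a rough illustration of the equidispersion assumption, the sketch below computes the ratio of the sample variance to the sample mean of a count variable. A ratio near 1 is consistent with equidispersion, while a ratio well above 1 suggests overdispersion. The function name `dispersion_index` and the example counts are hypothetical and are not the study's data; this raw ratio is only a first look and does not replace the model-based test.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Return sample variance divided by sample mean for a list of counts.

    Under a Poisson distribution the mean and variance are equal,
    so this ratio should be close to 1.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; the dispersion index is undefined")
    return variance(counts) / m


# Hypothetical counts for illustration only (NOT the ARI data from the study).
example_counts = [2, 3, 1, 4, 2, 30, 3, 2]
index = dispersion_index(example_counts)
print(f"variance / mean = {index:.2f}")
if index > 1:
    print("variance exceeds the mean: possible overdispersion")
```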

Research methods
The variables in this research are:
a. Dependent variable (Y): the number of ARI cases in toddlers in East Nusa Tenggara Province in 2018.
b. Independent variables (X): the causes of ARI in toddlers in East Nusa Tenggara Province based on sociodemographic aspects, namely population density ($X_1$), poverty ($X_2$), and malnutrition status ($X_3$).

Kolmogorov-Smirnov test
The Kolmogorov-Smirnov test was conducted to test whether the dependent variable Y, the number of ARI cases, follows a Poisson distribution, with the hypotheses:
$H_0$: the data follow a Poisson distribution
$H_1$: the data do not follow a Poisson distribution
The test statistic is $D = \max |F_0(x) - S_n(x)|$, where $F_0(x)$ is the cumulative distribution function under $H_0$ and $S_n(x)$ is the empirical cumulative distribution function of the sample. $H_0$ is rejected if the statistic exceeds its table value (equivalently, if the p-value is below $\alpha$). The Kolmogorov-Smirnov test results are presented in Table 1. The Kolmogorov-Smirnov value is 0.281 at a significance level of $\alpha$ = 5%, so $H_0$ is not rejected and it can be concluded that the number of ARI cases in toddlers in East Nusa Tenggara Province follows the Poisson distribution.

The Poisson regression model indicates that ARI cases in toddlers in NTT Province are influenced by the sociodemographic aspects, namely population density ($X_1$), poverty ($X_2$), and malnutrition status ($X_3$), each with a p-value less than 0.05.
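The Kolmogorov-Smirnov statistic used in this section, the largest absolute gap between the hypothesized and the empirical cumulative distribution functions, can be sketched as follows. This is a minimal illustration, assuming the Poisson mean is estimated by the sample mean when none is given; the function name `ks_statistic_poisson` and the example counts are hypothetical and are not the study's data. Because the Poisson distribution is discrete and the mean is estimated from the same sample, standard Kolmogorov-Smirnov table values are only approximate in this setting.

```python
import math


def ks_statistic_poisson(counts, lam=None):
    """Return D = max |F0(k) - Sn(k)| over k = 0 .. max(counts).

    F0 is the Poisson CDF with mean `lam` (sample mean if not given);
    Sn is the empirical CDF of `counts`. Both are step functions that
    only change at integers, so checking every integer is sufficient.
    Note: math.exp(-lam) underflows to 0 for very large lam (about 745+).
    """
    n = len(counts)
    if lam is None:
        lam = sum(counts) / n
    term = math.exp(-lam)  # P(X = 0)
    f0 = 0.0
    d = 0.0
    for k in range(max(counts) + 1):
        if k > 0:
            term *= lam / k  # P(X = k) from P(X = k - 1)
        f0 += term
        s_n = sum(1 for c in counts if c <= k) / n
        d = max(d, abs(f0 - s_n))
    return d


# Hypothetical counts for illustration only (NOT the ARI data from the study).
example_counts = [3, 5, 4, 6, 2, 5, 4, 7, 3, 4]
print(f"D = {ks_statistic_poisson(example_counts):.3f}")
```

The resulting D would then be compared with the table value for the sample size and significance level in use.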

Poisson overdispersion test
The Poisson overdispersion test in Table 3 shows that the deviance and Pearson chi-square values are greater than 1. This result indicates that the Poisson regression model does not meet the equidispersion assumption, that is, it is overdispersed, so the Poisson regression model is not suitable for modeling ARI cases in toddlers in NTT based on sociodemographic aspects. Therefore, a further analysis was carried out to overcome the overdispersion using the negative binomial regression model.
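The overdispersion check described above is commonly carried out by dividing the Pearson chi-square and the deviance of the fitted Poisson model by the residual degrees of freedom and comparing each ratio with 1. The sketch below illustrates that calculation under this assumption; the paper does not state exactly how the values in Table 3 were computed. The function names, observed counts, and fitted means are hypothetical and are not the study's data.

```python
import math


def pearson_dispersion(y, mu, n_params):
    """Pearson chi-square divided by residual degrees of freedom."""
    df = len(y) - n_params
    if df <= 0:
        raise ValueError("need more observations than parameters")
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi2 / df


def deviance_dispersion(y, mu, n_params):
    """Poisson deviance divided by residual degrees of freedom."""
    df = len(y) - n_params
    if df <= 0:
        raise ValueError("need more observations than parameters")
    dev = 0.0
    for yi, mi in zip(y, mu):
        # The y * log(y / mu) term is taken as 0 when y == 0.
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        dev += 2.0 * (log_term - (yi - mi))
    return dev / df


# Hypothetical observed counts and fitted Poisson means (NOT the study's data).
y_obs = [12, 40, 7, 55, 20, 3]
mu_fit = [20.0, 25.0, 15.0, 30.0, 22.0, 10.0]
# n_params=4 stands for an intercept plus three predictors.
print(f"Pearson / df  = {pearson_dispersion(y_obs, mu_fit, n_params=4):.2f}")
print(f"Deviance / df = {deviance_dispersion(y_obs, mu_fit, n_params=4):.2f}")
```

Ratios well above 1, as in this made-up example, point to overdispersion.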

Negative binomial regression models
Negative binomial regression is typically used when there is overdispersion in a Poisson regression model. The general negative binomial regression model for ARI cases in toddlers in East Nusa Tenggara Province based on sociodemographic aspects is:
$\mu = \exp(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3)$
where:
$\beta_0$: intercept, the parameter when $X_1$, $X_2$, and $X_3$ are 0 (constant)
$\beta_1$: parameter of variable $X_1$
$\beta_2$: parameter of variable $X_2$
$\beta_3$: parameter of variable $X_3$

Table 4 presents the results of the negative binomial regression analysis, which show that ARI cases in toddlers in NTT based on sociodemographic aspects are significantly influenced only by the poverty factor ($X_2$) (p-value < 0.05), with the model $\mu = \exp(3.5260 + 0.021 X_2)$. Because the model is log-linear, each increase of one thousand people in the number of poor multiplies the expected number of ARI cases in toddlers in a district by exp(0.021) = 1.0212220516, an increase of about 2.1%. Read as roughly one additional case per district, this corresponds to around 22 additional cases across NTT if poverty increases by a thousand people. These 22 cases need not be evenly distributed: some districts may have no additional ARI cases while others have more than one.
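Since the fitted model is on the log scale, the coefficient of $X_2$ acts multiplicatively on the expected count. The sketch below evaluates the fitted equation from this section, using the reported coefficients 3.5260 and 0.021 and taking $X_2$ in thousands of people as stated in the text. The function names and the example poverty values are hypothetical and chosen for illustration only.

```python
import math

# Coefficients reported for the fitted negative binomial model.
BETA_0 = 3.5260  # intercept
BETA_2 = 0.021   # coefficient of poverty X2 (thousands of people)


def expected_cases(x2):
    """Expected ARI cases for poverty level x2: exp(b0 + b2 * x2)."""
    return math.exp(BETA_0 + BETA_2 * x2)


def rate_ratio(delta=1.0):
    """Multiplicative change in expected cases when x2 rises by delta."""
    return math.exp(BETA_2 * delta)


rr = rate_ratio()
print(f"rate ratio per 1,000 more poor people: {rr:.10f}")
print(f"percentage change: {(rr - 1) * 100:.2f}%")

# Hypothetical poverty levels (thousands of people), for illustration only.
for x2 in (10, 11):
    print(f"x2 = {x2}: expected cases = {expected_cases(x2):.2f}")
```

The ratio of the expected counts at two poverty levels one unit apart equals the rate ratio whatever the starting level, which is why the effect is better described as a percentage change than as a fixed number of cases.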

Poverty and ARI cases in toddlers in NTT
Poverty in East Nusa Tenggara Province is measured using the poverty line, which consists of two components: the Food Poverty Line and the Non-Food Poverty Line. Based on the 2018 poverty profile, the number of poor people in East Nusa Tenggara in March 2018 was 1,142.17 thousand (21.35 percent), an increase of around 7,430 people compared with September 2017, when the poor population amounted to 1,134.74 thousand (21.38 percent) [6]. In absolute terms, poverty in NTT has increased.
East Nusa Tenggara is a province with strong cultural traditions that contribute to the level of poverty, a condition referred to as cultural poverty [7]. By place of residence, from September 2017 to March 2018 the number of poor people in rural areas increased by 4,510, from 1,015.70 thousand to 1,020.21 thousand, and the number in urban areas increased by 2,910, from 119.04 thousand to 121.95 thousand [6].
Cultural poverty contributes to low levels of education and limited public knowledge about health and hygiene. Many toddlers are in direct contact with kitchen smoke in homes without proper ventilation, or with outdoor dust, both of which can infect the toddler's respiratory tract. Home cleanliness is a risk factor for ARI in toddlers. An unhealthy house can harm the health of its occupants and increase the risk of various diseases [8].
Based on WHO data (2000), more than two million poor people in the world still depend on biomass (wood, charcoal, animal dung, coconut dregs) and coal to meet their household energy needs. These materials increase indoor air pollution beyond the applicable international air quality standards, exposing children who live in poverty in their daily lives. This exposure increases the risk of diseases such as acute respiratory infections and lung cancer.

Conclusion
The best model for ARI cases in toddlers in East Nusa Tenggara province based on sociodemographic aspects is the negative binomial regression model, and the significant factor affecting ARI is poverty. The negative binomial regression model that is formed is:
$\mu = \exp(3.5260 + 0.021 X_2)$ (4)