Parameter estimation and hypothesis testing of geographically and temporally weighted bivariate Poisson inverse Gaussian regression model

Poisson regression is one of the standard methods for modelling a count response and its predictors. It strictly assumes that the mean and variance of the response variable are equal (equidispersion). Nonetheless, count data often violate this assumption because the variance is larger than the mean (overdispersion). If overdispersion is ignored, standard errors are underestimated, which leads to incorrect conclusions in statistical tests. A suitable method for modelling this kind of data therefore needs to be developed. One alternative model for overcoming overdispersion in a bivariate response is the Bivariate Poisson Inverse Gaussian Regression (BPIGR) model. The BPIGR model produces a single global model for all locations. However, each location and time has different geographic, social, cultural, and economic conditions, so a Geographically and Temporally Weighted Bivariate Poisson Inverse Gaussian Regression (GTWBPIGR) model is needed. The spatial-temporal weighting function in GTWBPIGR generates a different local model for each location in each period, so the GTWBPIGR model handles overdispersion while producing local models for each period and location. The parameters of the GTWBPIGR model are estimated with the Maximum Likelihood Estimation (MLE) method, solved by Newton-Raphson iteration. The test statistic for simultaneous hypothesis testing is obtained with the Maximum Likelihood Ratio Test (MLRT) approach and, for large n, follows a chi-square distribution. The test statistic for partial testing is the Z statistic.


Introduction
The Poisson distribution is a discrete distribution whose random variable takes non-negative integer values, so it is a natural choice for modelling count data. It is determined by a single parameter that defines both the mean and the variance of the distribution. Poisson regression therefore assumes that the mean and variance of the response variable are equal (equidispersion). However, the mean and variance of count data are often not the same [1]; often, the variance is greater than the mean (overdispersion). A mixed Poisson model accommodates this by letting the Poisson mean depend on a random mixing variable. In the Poisson Inverse Gaussian case the mixing variable follows an Inverse Gaussian distribution with density $g(v; \tau) = (2\pi\tau v^{3})^{-1/2} \exp\{-(v-1)^{2}/(2\tau v)\}$, $v > 0$. The BPIG distribution is based on this mixture of the Inverse Gaussian and Poisson distributions, and its joint probability function is referred to as equation (3). In the BPIGR model, $\mathbf{x}_{i}$ is the vector of predictor variables for the i-th observation and $\boldsymbol{\beta}_{j}$ is the vector of regression coefficients for the j-th response variable, with the same dimension as $\mathbf{x}_{i}$.
The parameters of the BPIGR model are estimated by the MLE method with Newton-Raphson numerical iteration, while the test statistic for simultaneous hypothesis testing is obtained by the MLRT method.
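To make the link between the mixture and overdispersion concrete, the following sketch writes out the mixture representation commonly used for the Poisson Inverse Gaussian family together with the moments it implies. The symbols $v_i$ (mixing variable), $\tau$ (dispersion parameter) and $\mu_{ji}$ (mean of the j-th response) are assumed notation, and the parameterization with a unit-mean Inverse Gaussian is an assumption rather than a form taken from this paper.

```latex
\begin{align*}
Y_{ji} \mid v_i &\sim \mathrm{Poisson}(\mu_{ji} v_i), \quad j = 1, 2, \\
v_i &\sim \mathrm{IG}(1, \tau), \quad E(v_i) = 1, \quad \mathrm{Var}(v_i) = \tau, \\
E(Y_{ji}) &= \mu_{ji}, \\
\mathrm{Var}(Y_{ji}) &= \mu_{ji} + \tau \mu_{ji}^{2}, \\
\mathrm{Cov}(Y_{1i}, Y_{2i}) &= \tau \mu_{1i} \mu_{2i}.
\end{align*}
```

Under these assumptions the variance exceeds the mean whenever $\tau > 0$, and the shared mixing variable induces a positive covariance between the two responses.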

Geographically and Temporally Weighted Bivariate Poisson Inverse Gaussian Regression Models
The GTWBPIGR model extends the GWBPIGR model by taking the temporal aspect into account. Based on the BPIG probability function in equation (3), the joint probability function of $Y_{1il}$ and $Y_{2il}$ is given in equation (4), where l denotes the l-th period and $\mathbf{x}_{il}$ is the vector of predictor variables for the i-th observation in the l-th period, with i = 1, 2, ..., n; l = 1, 2, ..., L; and j = 1, 2. $\boldsymbol{\beta}_{jl}(u_i, v_i)$ are the vectors of regression coefficients for the j-th response variable in the l-th period, estimated with a spatial-temporal weighting matrix, and $(u_i, v_i)$ are the latitude and longitude of the i-th location.
The parameters of the GTWBPIGR model are estimated by the MLE method in stages, period by period, with each stage including the observations of the previous periods. Estimation in period 1 uses the n observations of period 1. Estimation in period 2 uses the n observations of period 1 and the n observations of period 2. In general, estimation in period L uses the observations of periods 1, 2, ..., L together. The spatial-temporal weighting function in the GTWBPIGR model is denoted $w_{ii^{*}l}$ [5]. For period l = 1, the spatial-temporal weights are computed from the same Euclidean distance as the GWR weights, because only the n observations of that period are used to obtain the weights for the i-th location. For period l = 2, the weights for the i-th location are computed from the data of periods 1 and 2, a total of 2n observations. The same procedure applies to three periods and so on up to L periods.
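The staged use of data can be sketched in a few lines. This is only an illustration of the pooling rule described above; the function name `pooled_data_by_stage` and the toy labels are hypothetical and not part of the paper.

```python
def pooled_data_by_stage(data_by_period):
    """For each period l, pool the observations of periods 1..l."""
    stages = []
    pooled = []
    for observations in data_by_period:
        # Build a new list so that earlier stages are left unchanged.
        pooled = pooled + list(observations)
        stages.append(pooled)
    return stages


# Toy example: n = 3 observation labels in each of L = 3 periods.
periods = [["a1", "b1", "c1"], ["a2", "b2", "c2"], ["a3", "b3", "c3"]]
stages = pooled_data_by_stage(periods)
print([len(stage) for stage in stages])  # stage sizes n, 2n, 3n
```

Each stage would then be passed to the weighting and estimation steps for its period.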
The spatial-temporal weights are computed from the Euclidean distance between observations, which accounts for both the location and the time of observation.
The spatial-temporal weights are then obtained with the adaptive bisquare kernel function, $w_{ii^{*}l} = (1 - (d_{ii^{*}l}/h_{il})^{2})^{2}$ for $d_{ii^{*}l} \le h_{il}$ and $w_{ii^{*}l} = 0$ otherwise, where $d_{ii^{*}l}$ is the Euclidean distance and $h_{il}$ is the bandwidth. The optimum bandwidth can be selected with the Generalized Cross-Validation (GCV) method, and the bandwidth that minimizes the GCV value can be found with the golden section search technique [7]. For simultaneous hypothesis testing, the test statistic of the GTWBPIGR model is obtained with the Maximum Likelihood Ratio Test (MLRT) approach and, for large n, follows a chi-square distribution. The test statistic for partial testing is the Z statistic.
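As an illustration of a distance that combines location and time, the sketch below uses a form that is common in the geographically and temporally weighted regression literature: a scaled sum of squared spatial and temporal differences. The scale factors `lam` and `mu`, the function name `st_distance`, and the formula itself are assumptions; the paper's own equation is not reproduced here.

```python
import math


def st_distance(u_i, v_i, t_i, u_k, v_k, t_k, lam=1.0, mu=1.0):
    """Spatial-temporal Euclidean distance between observations i and k.

    lam and mu are assumed scale factors that balance spatial and
    temporal units; they are not taken from the paper.
    """
    spatial_sq = (u_i - u_k) ** 2 + (v_i - v_k) ** 2
    temporal_sq = (t_i - t_k) ** 2
    return math.sqrt(lam * spatial_sq + mu * temporal_sq)


# Same period: the distance reduces to the ordinary spatial distance.
print(st_distance(0.0, 0.0, 1, 3.0, 4.0, 1))
```

When both observations come from the same period the temporal term vanishes, which matches the statement that period 1 uses the same distance as GWR.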
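The kernel and the bandwidth search can be sketched as follows. The bisquare form and the rule that an adaptive bandwidth equals the distance to the k-th nearest observation are standard conventions assumed here, and the quadratic objective only stands in for a GCV function, which would require fitting the model at each candidate bandwidth. All function names are hypothetical.

```python
import math


def adaptive_bisquare_weights(distances, k):
    """Bisquare weights for one focal location.

    The bandwidth h is the distance to the k-th nearest observation
    (an assumed convention for the adaptive kernel).
    """
    h = sorted(distances)[k - 1]
    if h == 0.0:
        return [1.0 if d == 0.0 else 0.0 for d in distances]
    return [(1.0 - (d / h) ** 2) ** 2 if d <= h else 0.0 for d in distances]


def golden_section_min(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function f on [lo, hi] by golden section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while (b - a) > tol:
        if fc < fd:
            # The minimum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # The minimum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2.0


weights = adaptive_bisquare_weights([0.0, 1.0, 2.0, 3.0], k=3)
best_h = golden_section_min(lambda h: (h - 2.0) ** 2, 0.0, 5.0)  # toy stand-in for GCV(h)
```

Golden section search needs only function values, which suits a criterion such as GCV that has no convenient derivative with respect to the bandwidth.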

Parameter Estimation of GTWBPIGR Model
The MLE method starts from n random samples of the response variables $Y_{1il}$ and $Y_{2il}$, with the model form given in equation (4); their joint probability function is equation (5). Parameter estimation uses data from the previous periods: period L uses the information from the n samples of period 1, the n samples of period 2, and so on up to the n samples of period L [8]. The likelihood function for period L is built from equation (5), and the log-likelihood function follows from the likelihood in equation (6). Weighting each observation's contribution by the spatial-temporal weight $w_{ii^{*}l}$ gives the local log-likelihood function $Q_{L}^{*}$ used to estimate the parameters at location i.
To estimate the parameters of the GTWBPIGR model, $Q_{L}^{*}$ is differentiated with respect to each parameter, using the substitution in equation (10). The same procedure is applied to each vector of regression coefficients, and the first derivative of $Q_{L}^{*}$ with respect to the remaining parameter is obtained likewise. Setting the first derivatives in equations (11)-(13) to zero does not give a closed-form solution, so the estimates are computed iteratively with the Newton-Raphson algorithm, where $\boldsymbol{\theta}$ denotes the vector of all parameters:
Step 1. Determine the initial value of the parameters, $\hat{\boldsymbol{\theta}}^{(0)}$.
Step 2. Determine the gradient vector $\mathbf{g}(\boldsymbol{\theta})$ of first derivatives of $Q_{L}^{*}$.
Step 3. Determine the Hessian matrix $\mathbf{H}(\boldsymbol{\theta})$ of second derivatives of $Q_{L}^{*}$.
Step 4. Run the Newton-Raphson iteration $\hat{\boldsymbol{\theta}}^{(m+1)} = \hat{\boldsymbol{\theta}}^{(m)} - \mathbf{H}^{-1}(\hat{\boldsymbol{\theta}}^{(m)}) \mathbf{g}(\hat{\boldsymbol{\theta}}^{(m)})$.
Step 5. Stop the iteration when $\|\hat{\boldsymbol{\theta}}^{(m+1)} - \hat{\boldsymbol{\theta}}^{(m)}\| \le \varepsilon$, where $\varepsilon$ is a very small positive value; the final iterate is the estimator of each parameter.
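The Newton-Raphson steps can be illustrated on a much simpler objective. The sketch below maximizes a locally weighted Poisson log-likelihood with an intercept and one slope; this is a stand-in chosen because its gradient and Hessian are short, not the GTWBPIGR objective $Q_{L}^{*}$. The function name and the toy data are hypothetical.

```python
import math


def newton_raphson_weighted_poisson(x, y, w, tol=1e-8, max_iter=50):
    """Maximize sum_i w_i * (y_i * eta_i - exp(eta_i)) with eta_i = b0 + b1 * x_i."""
    # Step 1: initial values (log of the weighted mean response, zero slope).
    b0 = math.log(sum(wi * yi for wi, yi in zip(w, y)) / sum(w))
    b1 = 0.0
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Step 2: gradient vector of the weighted log-likelihood.
        g0 = sum(wi * (yi - mi) for wi, yi, mi in zip(w, y, mu))
        g1 = sum(wi * (yi - mi) * xi for wi, yi, mi, xi in zip(w, y, mu, x))
        # Step 3: Hessian matrix (2 x 2, symmetric, negative definite).
        h00 = -sum(wi * mi for wi, mi in zip(w, mu))
        h01 = -sum(wi * mi * xi for wi, mi, xi in zip(w, mu, x))
        h11 = -sum(wi * mi * xi * xi for wi, mi, xi in zip(w, mu, x))
        det = h00 * h11 - h01 * h01
        # Step 4: update theta <- theta - H^{-1} g, using the explicit 2 x 2 inverse.
        s0 = (h11 * g0 - h01 * g1) / det
        s1 = (-h01 * g0 + h00 * g1) / det
        b0, b1 = b0 - s0, b1 - s1
        # Step 5: stop when the change in the estimates is very small.
        if max(abs(s0), abs(s1)) < tol:
            break
    return b0, b1


# Toy data with a binary predictor and equal weights.
b0_hat, b1_hat = newton_raphson_weighted_poisson(
    [0.0, 0.0, 1.0, 1.0], [1.0, 3.0, 4.0, 8.0], [1.0, 1.0, 1.0, 1.0]
)
```

With a binary predictor the fitted means equal the weighted group means, which gives a simple check on the result. In the full model the same loop would run once per location and period, with the weights supplied by the spatial-temporal kernel.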

Hypothesis Testing of GTWBPIGR Model
Hypothesis testing of the GTWBPIGR model uses the Maximum Likelihood Ratio Test (MLRT) method, both simultaneously and partially [9]. Like parameter estimation, hypothesis testing is carried out in stages, period by period, for periods 1 to L. Simultaneous hypothesis testing determines whether the regression parameters in the model are jointly significant. The null hypothesis $H_0$ states that all regression coefficients of the predictors are zero, and the alternative $H_1$ states that at least one of them is nonzero. The test compares the maximized likelihood $L(\hat{\omega})$ of the model under $H_0$ with the maximized likelihood $L(\hat{\Omega})$ of the full model through the statistic $G^{2} = -2 \ln(L(\hat{\omega})/L(\hat{\Omega}))$, which follows a chi-square distribution for large n. If the test rejects $H_0$ at the 5% significance level, the conclusion is that the predictor variables jointly affect the response variables.
If the simultaneous test rejects $H_0$, the next step is to test the parameters partially to determine which of them significantly affect the model. The test statistic Z approximately follows the standard normal distribution [11]. The hypotheses are $H_0: \beta_{jkl}(u_i, v_i) = 0$ versus $H_1: \beta_{jkl}(u_i, v_i) \neq 0$, with j = 1, 2, k indexing the predictors, and l = 1, 2, ..., L. The test statistic is $Z = \hat{\beta}_{jkl}(u_i, v_i) / se(\hat{\beta}_{jkl}(u_i, v_i))$, where the standard error is the square root of the corresponding principal diagonal element of the variance-covariance matrix of the estimator, obtained from the inverse of the negative Hessian matrix. The rejection region is $|Z| > Z_{\alpha/2}$, where $\alpha$ is the level of significance.
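A minimal sketch of the simultaneous test is given below, assuming the usual likelihood ratio form in which twice the difference between the maximized log-likelihoods of the full and restricted models is compared with a chi-square distribution. The degrees of freedom are taken as an input because the paper's value is not reproduced here, and the log-likelihood numbers are made up for illustration. The chi-square tail probability uses a series for the regularized incomplete gamma function so that only the standard library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom."""
    if x <= 0.0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= z / (a + n)
        total += term
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def likelihood_ratio_test(loglik_null, loglik_full, df, alpha=0.05):
    """Return the statistic G^2, its p-value, and the reject decision."""
    g2 = 2.0 * (loglik_full - loglik_null)
    p_value = chi2_sf(g2, df)
    return g2, p_value, p_value < alpha


# Hypothetical maximized log-likelihoods under H0 and under the full model.
g2, p_value, reject = likelihood_ratio_test(-120.0, -112.0, df=4)
```

Rejecting when the p-value is below $\alpha$ is equivalent to rejecting when $G^{2}$ exceeds the chi-square critical value.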
The remaining parameters of the model are tested partially in the same way. The null hypothesis states that the parameter equals zero, and the test statistic is the parameter estimate divided by its standard error, where the standard error is obtained from the corresponding principal diagonal element of the variance-covariance matrix. The rejection region is $|Z| > Z_{\alpha/2}$, where $\alpha$ is the level of significance.
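The partial test can be sketched as a two-sided Z test. The function below assumes that an estimate and its standard error are already available, for example from the diagonal of the variance-covariance matrix; the function name and the numbers in the example are hypothetical.

```python
from statistics import NormalDist


def partial_z_test(estimate, std_error, alpha=0.05):
    """Two-sided Z test of H0: parameter = 0."""
    z = estimate / std_error
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # Z_{alpha/2}
    return z, p_value, abs(z) > critical


# Hypothetical coefficient estimate and standard error.
z, p_value, reject = partial_z_test(0.5, 0.2)
```

The decision `abs(z) > critical` corresponds to the rejection region $|Z| > Z_{\alpha/2}$.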