Characteristics of parameter mixed geographically weighted regression model: global (a-group) and local (b-group)

Mixed Geographically Weighted Regression (MGWR) combines a linear regression model with a Geographically Weighted Regression (GWR) model: it treats some coefficients as constant and lets others vary spatially. The MGWR model is more flexible than the GWR model because it accounts both for characteristics shared by all locations and for characteristics specific to each location. These characteristics are represented by the global and local variables in the model. Global variables produce global (a-group) parameters that have the same effect at all observation locations, and local variables produce local (b-group) parameters whose effects differ between observation locations. This article reviews the characteristics of the global (a-group) and local (b-group) parameters in the MGWR model. The research shows that the two groups of parameters are estimated by different methods: global parameters by the method of least squares (LS) and local parameters by the method of weighted least squares (WLS). The estimation methods differ because the local parameters are influenced by the characteristics of the observation locations, so that the assumption of no heteroscedasticity cannot be satisfied.


Introduction
In linear regression, relationships are modeled using linear predictor functions whose unknown model parameters are estimated from the data [1]. Such models are called linear regression models [2]. A linear regression model is used to determine the influence of predictor variables on a response variable. The parameter estimators of a regression model take the same value at every observation location; in other words, the parameter estimators are global. The geographical location of an observation, however, can strongly influence linear regression modeling. This is in accordance with Tobler's first law of geography, as cited in [3], that everything is related to everything else, but near things are more related than distant things. Such correlation is known as spatial correlation [4,5].
The authors of [6] developed a spatial statistical model called geographically weighted regression (GWR) from the linear regression model by adding geographical factors, so that each location has its own parameter values. The GWR model addresses the drawback that a linear regression model yields the same parameter values for all locations. It is a spatial model that considers the characteristics of each location. The weakness of the GWR model is that, because it takes the distinct characteristics of each location into account, characteristics shared by all locations are ignored or even eliminated. Although the GWR model is developed from regression, in several cases of spatial diversity the spatial variation of some coefficients may be insignificant or negligible. For that reason, [4] developed the mixed geographically weighted regression (MGWR) model.
A Mixed Geographically Weighted Regression (MGWR) model is a combination of linear regression and the GWR. It is a regression model in which some coefficients of the independent variables are constant while others vary spatially [6]. The combination is obtained after testing for spatial variability has been carried out. The MGWR is appropriate for data that are influenced by both local and global variables. The GWR and the MGWR differ in their spatial variability. All contributing factors in the GWR have spatial variability and therefore yield local parameters, whose influence is specific to each location [4,6,7]. In the MGWR, by contrast, some contributing factors have spatial variability and others do not. The factors without spatial variability generate a global parameter, that is, a parameter with the same influence on all spatial units, while those with spatial variability produce a local parameter. Weights are used to express the influence of the spatial units [9].
The MGWR model is more flexible than the GWR model because it takes into account both the distinct characteristics of each location and the characteristics that locations share. These characteristics are represented by the variables in the model: global variables, which yield global parameters exerting the same influence at all locations, and local variables, which yield local parameters whose influence differs between locations [9,10,11,12]. According to [13], applying the MGWR model with bi-square weights to poverty rates gives an adequately high goodness of fit. Several other applied studies of the MGWR and GWR models were carried out by [14].
The linear regression model is also called the global regression model, and its parameters are estimated using the method of least squares [9]. This method is therefore applied to global parameter estimation. It is not applied to local parameter estimation, because the spatial aspects or geographical conditions of the observations can lead to spatial heterogeneity, in which regression parameters vary between locations (so-called spatial non-stationarity of the regression parameters) [15]. Moreover, if the method of least squares is applied to estimate such parameters, the assumption of homogeneous error variance is difficult to fulfill. For that reason, the method of weighted least squares is used to estimate local parameters. In this article, the global and local parameters of the MGWR model are estimated using the method of least squares and the method of weighted least squares, respectively.

Research method
The present research is theory-based and examines the estimation of the a-group global parameters and the b-group local parameters of the MGWR model using the methods of least squares and weighted least squares. The main procedures were deriving the MGWR model and estimating its parameters. These procedures involved elaborating the MGWR model; constructing its vector-matrix form; forming the linear model for the a-group global parameters and deriving the formula for the global parameter a from it; forming the linear model for the b-group local parameters and deriving the formula for the local parameter b from it; and determining the estimators of parameters a and b. Parameter a is estimated using the method of least squares, and parameter b using the method of weighted least squares.

The geographically weighted regression (GWR) model
The GWR model is developed from the linear regression model. Unlike the linear regression model, which depends only on the independent variables, the GWR model also accounts for location, in this case the coordinates of the location points, and therefore yields a separate estimate for each location point [8]. The GWR is a nonparametric model. The general formula of the GWR model is

$$y_i = \beta_0(u_i, v_i) + \sum_{k=1}^{p} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i, \qquad i = 1, 2, \ldots, n, \tag{1}$$

where $y_i$ is the response variable of the $i$-th observation, $(u_i, v_i)$ are the coordinates of the $i$-th location, $\beta_k(u_i, v_i)$, $k = 1, 2, \ldots, p$, is the parameter for the $i$-th location and the $k$-th variable, $x_{ik}$ is the $i$-th value of $x_k$, and $\varepsilon_i$ is the residual of the GWR model, assumed to be normally distributed.
The parameters of the GWR model are estimated using weighted least squares (WLS), with a separate weight matrix for each location. The parameter estimator is

$$\hat{\beta}(u_i, v_i) = \left(X^T W(u_i, v_i)\, X\right)^{-1} X^T W(u_i, v_i)\, y,$$

where $y$ is the $n \times 1$ vector of the dependent variable, $X$ is the $n \times (p + 1)$ matrix of independent variables, $\hat{\beta}(u_i, v_i)$ is the $(p + 1) \times 1$ vector of parameter estimates of the GWR model for the $i$-th observation, and $W(u_i, v_i)$ is the $n \times n$ weight matrix.
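The local WLS estimator can be illustrated with a minimal sketch. It assumes a Gaussian distance-decay kernel with a fixed bandwidth for the weight matrix, which is one common choice and is not prescribed by the text at this point. The function names `gaussian_weights` and `gwr_fit` and the synthetic data are illustrative assumptions.

```python
import numpy as np

def gaussian_weights(coords, i, bandwidth):
    """Diagonal weight matrix W(u_i, v_i) from a Gaussian kernel (assumed choice)."""
    d = np.linalg.norm(coords - coords[i], axis=1)  # distances to location i
    return np.diag(np.exp(-0.5 * (d / bandwidth) ** 2))

def gwr_fit(X, y, coords, bandwidth):
    """Return an n x (p+1) array whose i-th row is beta_hat(u_i, v_i)."""
    n = X.shape[0]
    betas = np.empty((n, X.shape[1]))
    for i in range(n):
        W = gaussian_weights(coords, i, bandwidth)
        # Solve (X^T W X) beta = X^T W y instead of forming the inverse.
        betas[i] = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return betas

# Synthetic data (hypothetical): the slope drifts with the u coordinate.
rng = np.random.default_rng(0)
n = 50
coords = rng.uniform(0, 10, size=(n, 2))
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = 1.0 + (0.5 + 0.2 * coords[:, 0]) * X[:, 1] + rng.normal(scale=0.1, size=n)

betas = gwr_fit(X, y, coords, bandwidth=2.0)
print(betas[:3])  # local intercept and slope at the first three locations
```

With a very large bandwidth all weights approach one, and every row of `betas` approaches the ordinary least squares estimate, which is a convenient sanity check.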

The MGWR model
The MGWR model is developed from the GWR model (1). The two models differ in their parameters. All parameters of the GWR vary with geography, so parameters that are not influenced by geography are lost and a different model is obtained for each location. To model global parameters, which are not influenced by geography, together with local parameters, the MGWR model can be used:

$$y_i = \sum_{k=1}^{q} a_k\, x_{ik} + b_0(u_i, v_i) + \sum_{k=q+1}^{p} b_k(u_i, v_i)\, x_{ik} + \varepsilon_i, \qquad i = 1, 2, \ldots, n, \tag{2}$$

where $a_k$, $k = 1, 2, \ldots, q$, are the global parameters and $b_k(u_i, v_i)$, $k = q + 1, \ldots, p$, are the local parameters, with $b_0(u_i, v_i)$ the local intercept. For $k \le q$, $x_{ik}$ is an independent variable related to a global parameter, and for $k > q$ it is an independent variable related to a local parameter. The MGWR model is thus constructed by dividing the independent variables into two groups, namely the a-group with global parameters and the b-group with local parameters [4].

The method of least squares
Gauss' method of least squares was first used to predict the location of Ceres. On January 1, 1801, the Italian astronomer Giuseppe Piazzi discovered Ceres and tracked its position for forty days before it was lost in the glare of the sun. Based on these data, astronomers wanted to determine the location of Ceres after it emerged from behind the sun without having to solve Kepler's complicated nonlinear equations of planetary motion. The predictions that allowed the Hungarian astronomer Franz Xaver von Zach to relocate Ceres were those made by Gauss using least squares analysis [16,17]. Gauss did not publish the method of least squares until 1809. In 1822, Gauss showed that the least squares approach to regression analysis is optimal, in the sense that in a linear regression model in which the errors have expectation zero, are uncorrelated, and have equal variances, the best linear unbiased estimator of the coefficients is the ordinary least squares estimator.
The method of least squares is a standard approach for approximately solving a system of equations that has more equations than unknowns. "Least squares" means that the overall solution minimizes the sum of the squares of the errors made in each equation. Least squares problems fall into two categories, linear and nonlinear, and both occur in regression models. The method aims to determine parameter estimates that best fit a function $f(x)$ to a dataset $\{x_1, x_2, \ldots, x_n\}$. It takes the best-fitting curve to be the one with the smallest sum of squared deviations from the data. When the errors have expectation zero, are uncorrelated, and have equal variances, the LS estimator is the Best Linear Unbiased Estimator (BLUE).
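As a small illustration of the idea, the sketch below fits a straight line to five made-up data points by least squares with NumPy and checks the defining property that the residuals are orthogonal to the columns of the design matrix. The data values are hypothetical.

```python
import numpy as np

# Hypothetical data: five (x, y) pairs lying roughly on a line.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

# Design matrix with an intercept column: five equations, two unknowns.
X = np.column_stack([np.ones_like(x), x])

# Least squares solution of the overdetermined system X beta = y.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

residuals = y - X @ beta_hat
sse = residuals @ residuals  # the minimized error sum of squares

print("intercept, slope:", beta_hat)
print("error sum of squares:", sse)
# At the minimum the normal equations hold: X^T (y - X beta_hat) = 0.
print("X^T residuals:", X.T @ residuals)
```

Any other choice of intercept and slope gives a larger error sum of squares than `sse`.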
The presence of heteroscedasticity does not necessarily invalidate the regression model. If LS regression is carried out in the presence of heteroscedasticity, the parameter estimates remain unbiased but fluctuate more strongly. In other words, if the model is re-estimated with additional data or a different sample, the estimates vary widely around their mean. Because the estimated coefficients oscillate with large amplitude, the estimation error of any single fitted model can change considerably, and the estimators are therefore not efficient. In the long run, however, the mean estimation error equals that of a model without heteroscedasticity. A good estimation model requires that the coefficient estimates be unbiased and that the point estimates vary within a narrow range; this is the concept of an unbiased and efficient estimator. When heteroscedasticity exists, a good alternative is the method of weighted least squares (WLS). The WLS neutralizes the consequences of the violated assumption while retaining the unbiasedness and consistency of the least squares estimator. Because the least squares estimator stays unbiased and consistent under heteroscedasticity and loses only efficiency, the WLS is more appropriate than ordinary least squares whenever the efficiency of the estimators matters. The WLS is a special case of generalized least squares. It is a form of least squares estimation designed to deal with heteroscedasticity, and it therefore maintains efficiency without losing unbiasedness or consistency.
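The efficiency argument can be checked numerically. The following sketch is a small Monte Carlo experiment under assumed conditions: the error standard deviation is taken to be proportional to x, and the WLS weights are the inverse error variances, which are assumed known. All numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 100, 2000
x = np.linspace(1.0, 10.0, n)
X = np.column_stack([np.ones(n), x])
sigma = 0.5 * x               # error s.d. grows with x (heteroscedastic, assumed form)
w = 1.0 / sigma**2            # WLS weights: inverse error variances (assumed known)
true_beta = np.array([2.0, 3.0])

XtW = X.T * w                 # same as X.T @ diag(w)
ols = np.empty((reps, 2))
wls = np.empty((reps, 2))
for r in range(reps):
    y = X @ true_beta + rng.normal(scale=sigma)
    ols[r] = np.linalg.solve(X.T @ X, X.T @ y)   # ordinary least squares
    wls[r] = np.linalg.solve(XtW @ X, XtW @ y)   # weighted least squares

print("mean OLS:", ols.mean(axis=0), " variance OLS:", ols.var(axis=0))
print("mean WLS:", wls.mean(axis=0), " variance WLS:", wls.var(axis=0))
```

Both sets of estimates should average close to the true coefficients, while the WLS estimates should show the smaller sampling variance, which is the sense in which WLS restores efficiency.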

a-group global parameter estimation
The a-group global parameters are estimated by rewriting the MGWR model as the a-group model. In the a-group model, the local parameters and the independent variables related to them are collected into a single geographically weighted term, which is treated as an independent variable.
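The MGWR model in equation (2) is expressed in vector-matrix form:

$$y = X_a a + m + \varepsilon, \tag{3}$$

in which $y$ is the $n \times 1$ vector of the dependent variable, $X_a$ is the $n \times q$ matrix of a-group variables, $a$ is the $q \times 1$ vector of a-group parameters, $\varepsilon$ is the $n \times 1$ residual vector, and $m$ is the $n \times 1$ vector of geographically weighted terms with elements $m_i = b_0(u_i, v_i) + \sum_{k=q+1}^{p} b_k(u_i, v_i)\, x_{ik}$, $i = 1, 2, \ldots, n$. The geographically weighted terms are treated as independent variables by subtracting the a-group term from both sides of equation (3):

$$y - X_a a = m + \varepsilon. \tag{4}$$

Equation (4) has the form of a GWR model if $a$ is assumed to be known, so $m$ can be estimated using the GWR. In the GWR, the fitted values can be expressed as $\hat{y} = S y$, where $S$ is the hat matrix

$$S = \begin{pmatrix} x_{b1}^T \left(X_b^T W(u_1, v_1) X_b\right)^{-1} X_b^T W(u_1, v_1) \\ x_{b2}^T \left(X_b^T W(u_2, v_2) X_b\right)^{-1} X_b^T W(u_2, v_2) \\ \vdots \\ x_{bn}^T \left(X_b^T W(u_n, v_n) X_b\right)^{-1} X_b^T W(u_n, v_n) \end{pmatrix},$$

with $X_b$ the matrix of b-group variables and $x_{bi}^T$ its $i$-th row. Therefore $m$ can be estimated as

$$\hat{m} = S\,(y - X_a a). \tag{5}$$

Substituting equation (5) into equation (4) gives $y - X_a a = S\,(y - X_a a) + \varepsilon$, that is,

$$(I - S)\, y = (I - S)\, X_a a + \varepsilon, \tag{6}$$

where $I$ is the $n \times n$ identity matrix. Writing $(I - S)\, y = \tilde{y}$ and $(I - S)\, X_a = \tilde{X}_a$, equation (6) becomes

$$\tilde{y} = \tilde{X}_a a + \varepsilon. \tag{7}$$

Equation (7) is a linear regression equation, and therefore parameter $a$ can be estimated using the method of least squares, as follows. The basic idea of the method of least squares is to work with the errors of the linear regression equation. The error of equation (7) is

$$\varepsilon = \tilde{y} - \tilde{X}_a a. \tag{8}$$

The error in equation (8) is computed for each of the $n$ observations, which gives the error sum of squares $\sum_{i=1}^{n} \varepsilon_i^2 = \varepsilon^T \varepsilon$. Based on equation (8),

$$\varepsilon^T \varepsilon = (\tilde{y} - \tilde{X}_a a)^T (\tilde{y} - \tilde{X}_a a) = \tilde{y}^T \tilde{y} - 2\, a^T \tilde{X}_a^T \tilde{y} + a^T \tilde{X}_a^T \tilde{X}_a a. \tag{9}$$

The method of least squares requires the error sum of squares to be minimum, which is fulfilled when the derivative of equation (9) with respect to $a$ vanishes, i.e. $\partial (\varepsilon^T \varepsilon) / \partial a = 0$.
For that reason, parameter $a$ is estimated by differentiating equation (9), which gives $-2\, \tilde{X}_a^T \tilde{y} + 2\, \tilde{X}_a^T \tilde{X}_a a = 0$, so that

$$\hat{a} = \left(\tilde{X}_a^T \tilde{X}_a\right)^{-1} \tilde{X}_a^T \tilde{y} = \left[X_a^T (I - S)^T (I - S) X_a\right]^{-1} X_a^T (I - S)^T (I - S)\, y.$$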
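The estimation of parameter a can be sketched numerically: build the GWR hat matrix S from the b-group variables, transform y and the a-group matrix by (I - S), and apply ordinary least squares to the transformed regression. The sketch assumes a Gaussian kernel with a fixed bandwidth for the weights; the function names `gwr_hat_matrix` and `estimate_global` and the synthetic data are illustrative assumptions, not part of the original derivation.

```python
import numpy as np

def gwr_hat_matrix(Xb, coords, bandwidth):
    """Hat matrix S with rows x_bi^T (Xb^T W_i Xb)^{-1} Xb^T W_i."""
    n = Xb.shape[0]
    S = np.empty((n, n))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)  # Gaussian kernel (assumed)
        XtW = Xb.T * w                           # Xb^T W_i for diagonal W_i
        S[i] = Xb[i] @ np.linalg.solve(XtW @ Xb, XtW)
    return S

def estimate_global(Xa, Xb, y, coords, bandwidth):
    """Least squares estimate of a from (I - S) y = (I - S) Xa a + error."""
    S = gwr_hat_matrix(Xb, coords, bandwidth)
    M = np.eye(len(y)) - S
    Xa_t, y_t = M @ Xa, M @ y                    # transformed regressors and response
    a_hat, *_ = np.linalg.lstsq(Xa_t, y_t, rcond=None)
    return a_hat, S

# Synthetic data (hypothetical): one global variable with coefficient 2,
# plus a local intercept and one local slope that drift across space.
rng = np.random.default_rng(2)
n = 200
coords = rng.uniform(0, 10, size=(n, 2))
Xa = rng.normal(size=(n, 1))
Xb = np.column_stack([np.ones(n), rng.normal(size=n)])
b_true = np.column_stack([1.0 + 0.1 * coords[:, 0], 0.5 + 0.2 * coords[:, 1]])
y = 2.0 * Xa[:, 0] + np.sum(Xb * b_true, axis=1) + rng.normal(scale=0.1, size=n)

a_hat, S = estimate_global(Xa, Xb, y, coords, bandwidth=1.5)
print("estimated global coefficient:", a_hat)
```

Using `np.linalg.lstsq` on the transformed variables is algebraically the same as evaluating the closed-form normal-equation solution, but avoids forming an explicit inverse.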

b-group local parameter estimation
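The b-group local parameters are estimated by rewriting the MGWR model as the b-group model. In the b-group model, the global parameters and the independent variables related to them are moved to the side of the dependent variable. This yields a GWR model, so parameter $b$ is estimated using weighted least squares. The MGWR model in equation (2), written in matrix form for location $i$, is

$$y = X_a a + X_b\, b(u_i, v_i) + \varepsilon,$$

where $X_b$ is the $n \times (1 + p - q)$ matrix of b-group variables (a local intercept and the $p - q$ local variables) and $b(u_i, v_i)$ is the $(1 + p - q) \times 1$ vector of b-group parameters. Following the pattern of the GWR model, with $a$ replaced by its estimate $\hat{a}$ and $y^{*} = y - X_a \hat{a}$, this becomes

$$y^{*} = X_b\, b(u_i, v_i) + \varepsilon. \tag{10}$$

Therefore, $b$ can be estimated with the WLS method. Unlike in least squares (LS), the error sum of squares to be minimized in the WLS is weighted by the matrix $W(u_i, v_i)$. The parameter is estimated as follows. From equation (10), $\varepsilon = y^{*} - X_b\, b(u_i, v_i)$, so the weighted error sum of squares is

$$\varepsilon^T W(u_i, v_i)\, \varepsilon = \left(y^{*} - X_b\, b(u_i, v_i)\right)^T W(u_i, v_i) \left(y^{*} - X_b\, b(u_i, v_i)\right). \tag{11}$$

The weighted residual sum of squares is minimum if $\partial \left(\varepsilon^T W(u_i, v_i)\, \varepsilon\right) / \partial b(u_i, v_i) = 0$ is fulfilled, and therefore $-2\, X_b^T W(u_i, v_i)\, y^{*} + 2\, X_b^T W(u_i, v_i) X_b\, b(u_i, v_i) = 0$. Estimating parameter $b$ by minimizing equation (11) using the method of WLS results in

$$\hat{b}(u_i, v_i) = \left(X_b^T W(u_i, v_i) X_b\right)^{-1} X_b^T W(u_i, v_i) \left(y - X_a \hat{a}\right).$$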
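The local step can be sketched in the same spirit: subtract the estimated global part from y and run a WLS fit at every location. The sketch assumes a Gaussian kernel with a fixed bandwidth; the function name `estimate_local` and the synthetic data are illustrative assumptions. To keep the example short, the true global coefficient is used as a stand-in for the least squares estimate of a.

```python
import numpy as np

def estimate_local(Xa, Xb, y, a_hat, coords, bandwidth):
    """WLS estimate b_hat(u_i, v_i) at every location, given a global estimate a_hat."""
    y_star = y - Xa @ a_hat                      # remove the global part
    n, k = Xb.shape
    b_hat = np.empty((n, k))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)  # Gaussian kernel weights (assumed)
        XtW = Xb.T * w                           # Xb^T W_i for diagonal W_i
        b_hat[i] = np.linalg.solve(XtW @ Xb, XtW @ y_star)
    return b_hat

# Synthetic data (hypothetical): local intercept and slope drift across space.
rng = np.random.default_rng(3)
n = 200
coords = rng.uniform(0, 10, size=(n, 2))
Xa = rng.normal(size=(n, 1))
Xb = np.column_stack([np.ones(n), rng.normal(size=n)])
b_true = np.column_stack([1.0 + 0.1 * coords[:, 0], 0.5 + 0.2 * coords[:, 1]])
a_true = np.array([2.0])
y = Xa @ a_true + np.sum(Xb * b_true, axis=1) + rng.normal(scale=0.1, size=n)

# Stand-in for the least squares estimate of a: the true value, for illustration only.
b_hat = estimate_local(Xa, Xb, y, a_true, coords, bandwidth=1.5)
print("mean absolute error per local coefficient:", np.abs(b_hat - b_true).mean(axis=0))
```

In practice `a_hat` would come from the global estimation step, and the bandwidth would be chosen by a criterion such as cross-validation rather than fixed by hand.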

Conclusion
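The research concludes that the MGWR model (2) contains global (a-group) parameters, which are estimated using the method of least squares, and local (b-group) parameters, which are estimated using the method of weighted least squares.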