Generalized predictive recursion maximum likelihood for robust mixture regression

In applications of econometric models, the error distribution is unknown and is not easy to specify in the likelihood function. In some situations the errors may follow a mixture distribution, in which case traditional estimation methods are likely to yield biased results. In this study, the mixture distribution of the error term is taken into account, and a generalized semiparametric estimator is presented and applied to the regression model. We use a simulation experiment and a real data application to check the performance of this estimator in the regression model. In the real data analysis, its performance is compared with that of the conventional least squares method.


Introduction
The linear regression model, which has received considerable attention in many financial applications, can be written as $Y_t = X_t \beta + \varepsilon_t$, where $Y_t$ is a $T \times 1$ vector of dependent variables, $X_t$ is a $T \times K$ matrix of independent variables, $\beta$ is a $K \times 1$ vector of coefficients, and $\varepsilon_t$ is a $T \times 1$ vector of errors with a density function. In conventional estimation, one assumes that the errors have a normal density with mean zero and variance $\sigma^2$.
In the literature, the regression model has been estimated by various classical estimators such as Conditional Least Squares (CLS), Maximum Likelihood (ML), and Bayesian estimation (see [1,2]). These classical estimators assume that the error has a normal density and is free from nuisance parameters. Under these assumptions, the classical methods provide the optimal solution for the estimated parameters when the error density happens to be normal. The normal distribution is used mainly for its simplicity of derivation and for the desirable properties of the resulting estimates, which are consistent, efficient, and asymptotically normal under mild conditions (for more details, see [3,4]). Nevertheless, if the error density is non-normal, these classical estimators will not provide reliable and efficient results [5]. In theory, the normal distribution is symmetric, with no skewness and no excess kurtosis; these properties may not hold in the real world, especially for financial time series data, which often exhibit heavy tails and asymmetry. Thus, when the error density is non-normal, alternative estimators that do not assume normality, such as nonparametric Maximum Likelihood or nonparametric Bayes, can be adopted. However, the recent work of [6] notes that these estimators are complicated, which makes such problems difficult to solve. They mention that numerical optimization of ML is not stable and is computationally expensive, while nonparametric Bayes is also computationally demanding, since each marginal likelihood evaluation requires its own MCMC run and optimization requires several such evaluations. Therefore, [6] developed and estimated a general scale mixture density for the error distribution with the Predictive Recursion marginal likelihood $L_{PR}(\cdot)$. To carry out the optimization, [6] developed a hybrid Predictive Recursion-Expectation Maximization (PR-EM) algorithm and applied it in the regression framework.
This method was then applied by [8] to estimate the threshold model of [7]; they confirmed that it outperforms the LS and Bayesian methods when the error has a normal scale mixture distribution. However, if the error follows another mixture distribution, such as a student-t, skewed normal, or skewed student-t scale mixture, we expect the PR-EM based on the normal mixture density to lose accuracy. This paper attempts to fill this gap in the literature. Generally speaking, we generalize the PR-EM algorithm by replacing the normal distribution with a student-t, skewed normal, or skewed student-t distribution when constructing the mixture density. The estimation thus becomes flexible enough to handle a wider range of mixture distributions that have not been studied in the earlier literature.
The main contribution of this study is to extend the PR-EM algorithm of [6] to student-t, skewed normal, and skewed student-t mixtures. To our knowledge, this study is the first attempt to estimate a regression model using PR-EM based on the student-t, skewed normal, or skewed student-t distribution. As a consequence, the model is flexible enough to handle an error term that follows a general scale mixture of various parametric distributions with an unknown mixing distribution.
The rest of this paper proceeds as follows. Section 2 briefly presents the estimation approach for the regression model. Section 3 presents a simulation experiment to assess and illustrate the extended PR-EM algorithm and to compare it with the conventional PR-EM. Section 4 applies the proposed method to real data, and conclusions are given in Section 5.

Constructing Predictive Recursion marginal likelihood for regression model
The study applies the PR algorithm to the error term to obtain the mixture density, from which the PR marginal likelihood for the regression parameters is constructed. The density function in equation (2) contains both a parametric distribution function $F(\cdot)$ and an unspecified mixing distribution. Because our study aims to propose a more general form of PR-EM, $F(\cdot)$ can be taken to be a student-t, skewed normal, or skewed student-t distribution in place of the normal distribution. To estimate the parameters, one can use numerical optimization to maximize the likelihood function; the optimal parameter set is obtained when the likelihood reaches its maximum. Nevertheless, the study employs the hybrid PR-EM algorithm of [6] as the optimization method for estimating the model, since it has been shown to be efficient.

Hybrid PR-EM algorithm
Reference [6] proposed a hybrid PR-EM algorithm that takes advantage of the latent scale parameter structure $U_1, \ldots, U_T$ in the mixture model. To fit this density function, we apply the PR algorithm, a recursive estimator of the mixing distribution in a nonparametric mixture model, to estimate the mixing density (see [10]). Taking the logarithm of equation (2) and then integrating out $U_t$ with respect to its density yields equation (4), which involves both the parameter estimate from the previous iteration and the parameter to be estimated in the current iteration. In practice, equation (4) can be rewritten in terms of a weight $w_t$ that depends on the mixing distribution of the error term. In sum, this study employs the PR-EM algorithm to maximize $Q_1 + Q_2$, which corresponds to a weighted least squares minimization. The PR-EM algorithm is carried out in the following steps:
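As a rough illustration of how predictive recursion and a weighted least squares update can be combined, the sketch below runs PR over the regression residuals on a grid of scale values and then reweights the regression. It is not the step list of [6]: the grid discretization, the uniform starting density, the PR weight sequence $(i+2)^{-\gamma}$ with $\gamma = 0.67$, the weight formula $E[U^{-2} \mid \varepsilon_t]$ (which is specific to a normal kernel), and all function names are assumptions made for this example. A student-t kernel is included only to show where a different $F(\cdot)$ would enter the likelihood.

```python
import math
import numpy as np

def normal_kernel(e, u):
    """Density of N(0, u^2) evaluated at residual e for scale(s) u."""
    return np.exp(-0.5 * (e / u) ** 2) / (u * math.sqrt(2.0 * math.pi))

def student_t_kernel(e, u, df=5.0):
    """Density of a Student-t with df degrees of freedom and scale u (df is illustrative)."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    return math.exp(log_c) / u * (1.0 + (e / u) ** 2 / df) ** (-(df + 1) / 2)

def predictive_recursion(resid, u_grid, kernel=normal_kernel, gamma=0.67):
    """One pass of PR over the residuals; returns (log PR marginal likelihood, mixing density)."""
    du = u_grid[1] - u_grid[0]                           # uniform grid spacing (assumed)
    f = np.full(u_grid.size, 1.0 / (u_grid.size * du))   # uniform starting density on the grid
    loglik = 0.0
    for i, e in enumerate(resid):
        k = kernel(e, u_grid)
        m = np.sum(k * f) * du                           # one-step-ahead predictive density of e
        loglik += math.log(m)
        w = (i + 2.0) ** (-gamma)                        # decaying PR weight
        f = (1.0 - w) * f + w * k * f / m                # PR update; f still integrates to 1
    return loglik, f

def em_weights(resid, u_grid, f):
    """E[1/U^2 | residual] under mixing density f, valid for the normal kernel only."""
    k = normal_kernel(resid[:, None], u_grid[None, :]) * f[None, :]
    return (k / u_grid ** 2).sum(axis=1) / k.sum(axis=1)

def weighted_least_squares(y, X, w):
    """Solve the weighted normal equations (X'WX) beta = X'Wy."""
    Xw = X * w[:, None]
    return np.linalg.solve(X.T @ Xw, Xw.T @ y)

def pr_em(y, X, u_grid, n_iter=20):
    """Alternate PR estimation of the mixing density with a WLS update of beta."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]          # start from ordinary least squares
    for _ in range(n_iter):
        resid = y - X @ beta
        _, f = predictive_recursion(resid, u_grid)
        beta = weighted_least_squares(y, X, em_weights(resid, u_grid, f))
    loglik, f = predictive_recursion(y - X @ beta, u_grid)
    return beta, loglik, f
```

A call such as `pr_em(y, X, np.linspace(0.2, 6.0, 60))` returns the coefficient estimate, the log PR marginal likelihood at that estimate, and the estimated mixing density on the grid. With equal weights the update reduces to ordinary least squares, which is one way to see why the procedure is described as a weighted least squares minimization.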

Simulation study
To assess the accuracy of our generalized PR-EM, we conduct a simulation experiment. The experiment also compares the performance of our estimator under a variety of mixture error distributions with that of the conventional PR-EM of [6] and the traditional least squares (LS) method. The simulation considers a simple linear regression. The simulated data are generated from this model, with the error term $\varepsilon_t$ drawn independently from each of the mixture distributions considered, and replications of each method are obtained to make the comparison. The computations are carried out in R on a desktop computer with a 1.7 GHz Intel Core i3 processor and 4 GB of RAM.
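The structure of such an experiment can be sketched as follows. This is only an illustration in Python (the study itself used R): the coefficient values, the two-point normal scale mixture for the errors, the number of replications, and the function names are all assumptions made for the example and are not the settings of the study. Any estimator that takes `(X, y)` and returns a coefficient vector can be passed in place of least squares.

```python
import numpy as np

def simulate_once(rng, T, beta, scales=(0.5, 3.0)):
    """Draw one data set from y = X beta + eps with normal scale-mixture errors.

    The two scale values are illustrative placeholders.
    """
    X = np.column_stack([np.ones(T), rng.normal(size=T)])
    u = rng.choice(scales, size=T)          # latent scale for each observation
    eps = u * rng.normal(size=T)            # error = scale * standard normal
    return X, X @ beta + eps

def ls_estimate(X, y):
    """Ordinary least squares estimate of beta."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def monte_carlo(estimator, T, beta, n_rep=200, seed=0):
    """Bias and MSE of an estimator over n_rep simulated data sets of size T."""
    rng = np.random.default_rng(seed)
    est = np.array([estimator(*simulate_once(rng, T, beta)) for _ in range(n_rep)])
    bias = est.mean(axis=0) - beta
    mse = ((est - beta) ** 2).mean(axis=0)
    return bias, mse
```

Running `monte_carlo(ls_estimate, T, beta)` for increasing `T` gives the kind of bias and MSE summary by sample size that the comparison between methods relies on.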
According to Tables 1 and 2, the overall pattern is similar in all scenarios. When the estimator is correctly specified, the PR-EM based on the correct distribution always performs best and provides the best fit. In addition, the MSE and bias tend toward zero as the sample size T increases, indicating that the estimates based on the generalized PR-EM have good asymptotic properties. In terms of computation, PR-EM is more costly than LS; however, it can greatly improve the parameter estimates and thereby yield more accurate and reliable results.

Real data analysis
In this section, we study the effect of Chinese domestic factors, namely Gross Domestic Product (GDP), monetary base (M1), and domestic credit (DC), on exchange market pressure (EMP). We collect quarterly data from 1999 Q1 through 2017 Q1 from the Federal Reserve Economic Data (FRED) database provided by the Federal Reserve Bank of St. Louis. We consider a regression model of exchange market pressure on these factors. The data are transformed into growth rates in order to avoid nonstationarity and spurious regression. We then fit the model using PR-EM based on the normal, student-t, and skewed student-t mixture distributions, as described in Section 2. To compare them, we consider the Bayesian information criterion (BIC), the Mean Square Error (MSE), and the value of the estimated log-likelihood function. Table 3 presents the results. The skewed student-t mixture provides the lowest MSE and BIC, indicating that it is the best mixture distribution. In addition, we compare the best-fitting PR-EM method with the least squares (LS) method; the result shows that PR-EM based on the skewed student-t distribution outperforms LS in terms of a lower MSE.
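For reference, the two comparison criteria can be computed as in the short sketch below. It uses the standard definition BIC = k ln(n) - 2 ln(L), where k is the number of estimated parameters, n the number of observations, and ln(L) the maximized log-likelihood; the function names are hypothetical, and how k is counted for a PR-EM fit is an assumption left to the user.

```python
import math
import numpy as np

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

def mse(y, fitted):
    """Mean squared error between observed and fitted values."""
    y = np.asarray(y, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    return float(np.mean((y - fitted) ** 2))
```

With the same data and the same number of parameters, a model with a higher log-likelihood has a lower BIC, so ranking the candidate mixture distributions by BIC and by MSE gives two complementary views of fit.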

Conclusions
Our proposed method generalizes the recent work of [6]. We generalize the PR-EM algorithm by introducing various distributions, namely the student-t, skewed normal, and skewed student-t distributions, to construct the mixture density in PR-EM. The estimation thus becomes flexible enough to capture a wider range of mixture distributions that have not been studied in the earlier literature. This robust estimator also accommodates asymmetry and heavy tails simultaneously, allowing researchers in different areas to analyze data in a flexible way. We conduct a simulation study in which PR-EM is assessed by the average MSE of the parameters in a linear regression framework. The numerical study shows that the PR-EM algorithm performs best, and has good finite sample performance, when it is based on the correct model. Finally, we apply this approach to study the effect of Chinese domestic factors, namely Gross Domestic Product (GDP), monetary base (M1), and domestic credit (DC), on exchange market pressure (EMP), and find that the PR-EM algorithm based on the skewed student-t distribution performs better than the other mixing distributions and LS.