Autocorrelated process control: Geometric Brownian Motion approach versus Box-Jenkins approach

The existence of autocorrelation significantly affects the performance and accuracy of process control if it is not handled carefully. When dealing with an autocorrelated process, the Box-Jenkins method is usually preferred because of its popularity. However, the computation of the Box-Jenkins method is complicated and challenging, which makes it time-consuming. Therefore, an alternative method known as Geometric Brownian Motion (GBM) is introduced to monitor the autocorrelated process. A real case of furnace temperature data is used to compare the performance of the Box-Jenkins and GBM methods in monitoring an autocorrelated process. Both methods give the same results in terms of model accuracy and process monitoring. Yet GBM is superior to the Box-Jenkins method because of its simplicity and practicality, with shorter computational time.


Introduction
Modern manufacturing industries have changed considerably since the late 1990s, when 'surpass customer expectation' became a philosophy of quality. Industries believe that staying competitive requires not only high-quality processes and products but also creative, innovative and useful products with pleasing, unexpected features. However, those criteria are not static and will change with time and demand. In practice, the fundamental way to improve the quality of processes and products is to reduce process variability: the smaller the process variability, the higher the quality.
The control charts developed by Shewhart are the most popular and are widely applied in industry for monitoring and improving manufacturing processes, under the assumption that observations are uncorrelated. However, that assumption is not always valid, and successive observations often show autocorrelation. [1] mentioned that autocorrelation is one of the most challenging issues in manufacturing systems. [2] and [3] observed that the presence of autocorrelation among observations leads to false alarms and misleading conclusions about the control state of the process. It is therefore necessary to eliminate the autocorrelation before using any statistical process control method to detect assignable causes [4]. If autocorrelation is not handled carefully, it will affect the quality of the products and cause huge losses for manufacturing industries. There are two major approaches to dealing with autocorrelated process data in process control: i) residual-based approaches and ii) methods that modify the control limits to adjust for autocorrelation. Dealing with autocorrelated process data is the main focus of this paper.
To control autocorrelation in a process, the Box-Jenkins method is usually considered first because of its popularity, as can be seen in [5], [6], [7], [8], [9], [10] and [3]. The Box-Jenkins method combines three components: the autoregressive (AR) component, the integration filter (I) and the moving average (MA) component. However, Box-Jenkins has its own disadvantages. First, the method needs time series data with a minimum of 50 observations, which is sometimes difficult to obtain [11]. Besides that, the Box-Jenkins method is more complicated to apply [12] and time-consuming [13], which makes it an unsuitable choice for practitioners who need fast results and prompt action. [6] mentioned that a suitable transformation is sufficient, instead of sophisticated time series modelling. Hence, an alternative method, Geometric Brownian Motion (GBM), is proposed to overcome this problem. GBM is preferred because it is relatively easy to implement and does not need much data to predict the future [14]. Applications of GBM can be seen in [15], [16], [14], and [17]. The objectives of this research are to develop GBM and Box-Jenkins models and to monitor the process quality using both methods. The accuracy of both models is also compared, measured by the mean absolute percentage error. The next section discusses the methodology of the Box-Jenkins and GBM methods. Section 3 presents the models obtained and the process monitoring. The conclusion in Section 4 closes the discussion.

Methodology
In this research, a set of univariate manufacturing data, namely furnace temperature data, was used. A ceramic furnace is an oven that can heat to above 1000 °C (1832 °F). A total of eighty observations of the furnace temperature were recorded. These observations are consecutive hourly temperature readings of a thermocouple placed inside a large ceramics furnace. This eighty-hour time segment represents a stable period that the process engineers used as a baseline for good process performance.

Box-Jenkins Method
Box-Jenkins analysis refers to a systematic method of identifying, fitting, checking, and using autoregressive integrated moving average (ARIMA) time series models. This practice can be seen in the literature, for example in [18] and [19]. The general non-seasonal time series model is denoted ARIMA(p,d,q):

$$\phi_p(B)\,(1 - B)^d x_t = \theta_q(B)\, e_t \qquad (1)$$

with $\phi_p(B) = 1 - \phi_1 B - \dots - \phi_p B^p$ and $\theta_q(B) = 1 - \theta_1 B - \dots - \theta_q B^q$, where $B$ is the backshift operator, $\phi_i$ and $\theta_i$ are coefficients, d is the degree of differencing involved, and p and q are the orders of the autoregressive and moving average polynomials, respectively. The Box-Jenkins method consists of three steps: identification, estimation, and diagnostic checking. Model identification selects the value of d and then the values of p and q in the ARIMA(p,d,q) model. The Mann-Kendall test [20] and the Dickey-Fuller test [21] are used to test the behavior of the data. The Mann-Kendall test is used to identify trend behavior, while the Dickey-Fuller test is used to verify stationarity.
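The trend check in the identification step can be illustrated with a minimal sketch of the Mann-Kendall test. It uses the usual normal approximation and assumes no tied values; the function name `mann_kendall` and the sample readings are hypothetical and are not the furnace data used in this paper.

```python
import math
from statistics import NormalDist


def mann_kendall(series):
    """Return (S, Z, two-sided p-value) for the Mann-Kendall trend test.

    Assumption: no tied values, so the simple variance formula applies.
    """
    n = len(series)
    # S is the number of increasing pairs minus the number of decreasing pairs.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    # Continuity-corrected normal approximation of S.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return s, z, p_value


# Hypothetical readings for illustration only.
readings = [1578.7, 1579.3, 1578.9, 1580.1, 1579.6, 1580.4, 1579.9, 1580.8]
s, z, p = mann_kendall(readings)
print(f"S = {s}, Z = {z:.3f}, p = {p:.3f}")
```

A p-value above the chosen alpha (0.05 in this paper) is read as no significant monotonic trend.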
After testing, the autocorrelation function (ACF) plot and the partial autocorrelation function (PACF) plot are used to choose a suitable model for the data by identifying the patterns in the plots. The following table shows how to identify the model [22]. The next step is estimation, in which the parameters of the Box-Jenkins model are estimated. The final step is diagnostic checking, which is based on studying the autocorrelation plots of the residuals. In addition, the Ljung-Box test is conducted using Minitab to determine whether the errors of the ARIMA model are correlated. Besides that, the mean absolute percentage error (MAPE) is used to measure the accuracy of the model. Moreover, the Akaike information criterion (AIC) is computed using a Microsoft Excel add-in to find the best model. These steps are applied iteratively until the third step no longer produces any improvement in the model.
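The diagnostic quantities mentioned above can be sketched in a few lines. This is an illustration under stated assumptions, not the Minitab or Excel add-in computation: `sample_acf`, `ljung_box_q` and `gaussian_aic` are hypothetical helper names, and the AIC uses the Gaussian least-squares form n ln(RSS/n) + 2k, which may differ from the add-in's value by a constant.

```python
import math


def sample_acf(values, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of a series."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_k r_k^2 / (n - k)."""
    n = len(residuals)
    acf = sample_acf(residuals, max_lag)
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(acf, start=1))


def gaussian_aic(residuals, n_params):
    """AIC up to an additive constant, assuming Gaussian errors."""
    n = len(residuals)
    rss = sum(e * e for e in residuals)
    return n * math.log(rss / n) + 2 * n_params


# Hypothetical residuals for illustration only.
resid = [0.4, -0.2, 0.1, -0.5, 0.3, 0.2, -0.1, -0.3, 0.5, -0.4]
print("Q(3) =", round(ljung_box_q(resid, 3), 3))
print("AIC  =", round(gaussian_aic(resid, n_params=3), 3))
```

For residuals of an ARIMA(p,d,q) fit, Q is usually compared with a chi-square quantile with (number of lags - p - q) degrees of freedom; among candidate models, the one with the smallest AIC is preferred.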

Geometric Brownian Motion (GBM)
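GBM is an extension of Brownian motion [14]. The fundamental details of GBM can be found in [23]. A process $x_t$ follows GBM if it satisfies the stochastic differential equation

$$dx_t = \mu x_t\,dt + \sigma x_t\,dW_t, \qquad dW_t = \varepsilon_t \sqrt{dt} \qquad (2)$$

where i. $W_t$ is the Wiener process, ii. $\mu$ and $\sigma$ are the drift parameter and volatility parameter respectively, iii. $\varepsilon_t$ is independent and identically distributed with a standard normal distribution.

Let $x_0$ be an initial value; the solution of equation (2) can be written as

$$x_t = x_0 \exp\!\left[\left(\mu - \tfrac{\sigma^2}{2}\right)t + \sigma W_t\right] \qquad (3)$$

Hence, for each $t$, $x_t$ has a lognormal distribution. For dependent data, a log-return transformation is needed, as shown below:

$$R_t = \ln\!\left(\frac{x_t}{x_{t-1}}\right) \qquad (4)$$

$$R_n = c + \phi R_{n-1} + e_n \qquad (5)$$

Eq (5) specifies the logarithmic return at time n in terms of the logarithmic return one time period earlier; this is the so-called AR(1) model, that is, an autoregressive model of order p = 1, with error term $e_n$. Under this model, the predicted value of $x_t$ is computed from the fitted regression, where $\hat{c}$ and $\hat{\phi}$ are the maximum likelihood estimates in the regression model in Eq (5).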
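A minimal sketch of the log-return regression follows. It assumes that Eq (5) is fitted by ordinary least squares, which coincides with the conditional maximum likelihood estimates when the errors are normal, and that the one-step prediction is obtained by exponentiating the predicted log-return. The function names and this reading of the prediction formula are assumptions made for illustration.

```python
import math


def log_returns(series):
    """R_t = ln(x_t / x_{t-1}) for consecutive positive observations."""
    return [math.log(b / a) for a, b in zip(series, series[1:])]


def fit_ar1(returns):
    """Least-squares estimates (c, phi) of R_t = c + phi * R_{t-1} + e_t."""
    prev, curr = returns[:-1], returns[1:]
    m = len(curr)
    mean_prev = sum(prev) / m
    mean_curr = sum(curr) / m
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    sxy = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    phi = sxy / sxx
    return mean_curr - phi * mean_prev, phi


def predict_next(series, c_hat, phi_hat):
    """Assumed one-step prediction: x_t * exp(c_hat + phi_hat * R_t)."""
    last_return = math.log(series[-1] / series[-2])
    return series[-1] * math.exp(c_hat + phi_hat * last_return)


# Hypothetical temperature readings for illustration only.
temps = [1578.7, 1579.3, 1578.9, 1580.1, 1579.6, 1580.4, 1579.9, 1580.8]
c_hat, phi_hat = fit_ar1(log_returns(temps))
print("c_hat =", c_hat, "phi_hat =", phi_hat)
print("next value =", round(predict_next(temps, c_hat, phi_hat), 2))
```

Only a logarithm and a simple linear regression are involved, which is the sense in which the GBM approach requires less computation than an iterative ARIMA search.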
The construction of a GBM model starts by checking whether the data set satisfies the GBM properties [23]: i) the log-returns are independent, and ii) the log-returns are normally distributed. The Durbin-Watson test and the Anderson-Darling test are used to check (i) and (ii), respectively. There is no need to monitor the data using complicated models if these two assumptions are fulfilled by the time series data [24].
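The two test statistics can be computed directly from their textbook definitions, as in the sketch below. It assumes that the Durbin-Watson statistic is applied to a residual series supplied by the user (how the residuals are formed is not specified here) and that the Anderson-Darling statistic uses the sample mean and standard deviation. The function names and sample values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def durbin_watson(residuals):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2); values near 2 suggest no lag-1 correlation."""
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e * e for e in residuals)
    return num / den


def anderson_darling(sample):
    """A^2 statistic for normality, with mean and std. deviation estimated from the sample."""
    n = len(sample)
    mu, sigma = mean(sample), stdev(sample)
    std_normal = NormalDist()
    cdf = [std_normal.cdf((x - mu) / sigma) for x in sorted(sample)]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
        for i in range(n)
    )
    return -n - total / n


# Hypothetical log-returns for illustration only.
sample = [0.0004, -0.0003, 0.0008, -0.0003, 0.0005, -0.0003, 0.0006, -0.0007, 0.0002, 0.0001]
centered = [r - mean(sample) for r in sample]
print("DW  =", round(durbin_watson(centered), 4))
print("A^2 =", round(anderson_darling(sample), 4))
```

The Durbin-Watson value is then compared with the tabulated bounds, and $A^2$ with the critical value for the chosen significance level.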

Monitoring Process Control
A control chart is a graphical display of a quality characteristic whose meaning most people can understand easily. An autocorrelated process can be monitored with a control chart of the residuals $e_i$, obtained from

$$e_i = y_i - \hat{y}_i \qquad (7)$$

where $y_i$ and $\hat{y}_i$ represent the observed and predicted values of the response variable for each observation [25]. The two horizontal lines in the chart are called the upper control limit (UCL) and the lower control limit (LCL). The process is considered in control as long as the values of $e_i$ plot within these two limits, and no further action is taken. The equations of both control limits are shown below:

$$\mathrm{UCL} = \bar{x} + 3\sigma, \qquad \mathrm{LCL} = \bar{x} - 3\sigma$$

where $\bar{x}$ is the mean of the residuals and $\sigma$ is the standard deviation of the residuals.
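The residual chart described above can be sketched as follows. The sketch assumes the sample standard deviation (n - 1 denominator) is used for $\sigma$, which is not stated in the text; the function name `residual_chart` and the data are hypothetical.

```python
from statistics import mean, stdev


def residual_chart(observed, predicted, k=3.0):
    """Compute residuals e_i = y_i - y_hat_i and k-sigma control limits."""
    residuals = [y - y_hat for y, y_hat in zip(observed, predicted)]
    center = mean(residuals)
    sigma = stdev(residuals)  # assumption: sample standard deviation
    ucl = center + k * sigma
    lcl = center - k * sigma
    # Indices of residuals that fall outside the control limits.
    out_of_control = [i for i, e in enumerate(residuals) if e > ucl or e < lcl]
    return {
        "residuals": residuals,
        "center": center,
        "ucl": ucl,
        "lcl": lcl,
        "out_of_control": out_of_control,
    }


# Hypothetical observed and predicted values for illustration only.
observed = [1579.3, 1578.9, 1580.1, 1579.6, 1580.4, 1579.9]
predicted = [1579.0, 1579.2, 1579.8, 1579.9, 1580.0, 1580.2]
chart = residual_chart(observed, predicted)
print("CL =", round(chart["center"], 4))
print("UCL =", round(chart["ucl"], 4), "LCL =", round(chart["lcl"], 4))
print("out-of-control points:", chart["out_of_control"])
```

An empty list of out-of-control points corresponds to the in-control state.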

Box-Jenkins Method
In order to find a suitable ARIMA model, it is necessary to identify the behavior of the data. A time series plot is used to evaluate the patterns and behavior of the data, as can be seen in Figure 1. Based on visual inspection of Figure 1, a trend appears to be present and the data are suspected to be non-stationary. However, the p-value obtained from the Mann-Kendall test is 0.129, which is greater than the alpha value of 0.05 and indicates that there is no significant trend in the series. Meanwhile, the Dickey-Fuller test gives a p-value of 0.056, which is greater than the alpha value of 0.05, so the null hypothesis of a unit root cannot be rejected. Hence, it can be concluded that the furnace temperature data are not stationary. Because the mean is unstable, differencing is needed to stabilize the data. The time series plot after first differencing is shown in Figure 2 and visually reveals that the data are distributed randomly around zero. Next, the autocorrelation function (ACF) plot and the partial autocorrelation function (PACF) plot are used to determine a suitable ARIMA model. Many candidate models can be selected to monitor and control the process; these models are listed in Table 3.
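The differencing step and the idea behind the Dickey-Fuller test can be illustrated with the sketch below. It is a simplified version with an intercept and no lagged difference terms, and it returns only the test statistic; the p-value reported above comes from dedicated software, and the statistic must be compared with Dickey-Fuller critical values rather than Student-t values. The function names and data are hypothetical.

```python
import math


def difference(series, order=1):
    """Apply first differencing `order` times."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def dickey_fuller_t(series):
    """t-statistic of gamma in: (x_t - x_{t-1}) = a + gamma * x_{t-1} + u_t."""
    lagged = list(series[:-1])
    delta = difference(series)
    m = len(delta)
    mean_x = sum(lagged) / m
    mean_y = sum(delta) / m
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, delta))
    gamma = sxy / sxx
    intercept = mean_y - gamma * mean_x
    rss = sum((y - intercept - gamma * x) ** 2 for x, y in zip(lagged, delta))
    std_err = math.sqrt(rss / (m - 2) / sxx)
    return gamma / std_err


# Hypothetical readings for illustration only.
temps = [1578.7, 1579.3, 1578.9, 1580.1, 1579.6, 1580.4, 1579.9, 1580.8, 1580.2, 1581.0]
print("DF statistic (levels):     ", round(dickey_fuller_t(temps), 3))
print("DF statistic (differenced):", round(dickey_fuller_t(difference(temps)), 3))
```

A strongly negative statistic supports stationarity; a value close to zero is consistent with a unit root, in which case the series is differenced and tested again.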
In the fitted ARIMA(2,1,1) model, all the numerical values are coefficients, $x_t$ is the predicted value and $e_t$ is the residual for period t. To determine whether any autocorrelation remains in the residuals of the ARIMA(2,1,1) model, ACF and PACF plots of the residuals for the furnace temperature are constructed in Figure 3 and Figure 4 respectively.

Geometric Brownian Motion
To check the two assumptions of GBM, a normality test and an independence test were conducted. Based on Figure 5, the computed value of the Anderson-Darling statistic, $A^2$, is 0.2737, which is less than the critical value 0.754 ($\alpha$ = 0.05). Hence, the sample follows a normal distribution. The pattern of the Q-Q plot in Figure 5 also indicates normality, which satisfies the normality assumption of GBM. The scatter plot in Figure 6 strongly indicates a linear trend, with a correlation coefficient r of 0.7103. This result is supported by the Durbin-Watson test of independence, whose value is 1.5015, less than both $D_U = 1.662$ and $D_L = 1.611$ at the 5% significance level. The values of $D_U$ and $D_L$ can be obtained from https://www3.nd.edu/~wevans1/econ30331/Durbin_Watson_tables.pdf. Hence, the observations are correlated with each other and dependent.

Therefore, log-returns need to be computed to transform the furnace temperature data into an independent form. After the log-returns were computed, the run chart in Figure 7 was constructed; it shows that all the points lie close to zero and that the pattern is stationary. The normality and independence of the log-returns are tested using a Q-Q plot and a scatter plot respectively. Based on Figure 8, the Q-Q plot shows that the log-returns are normal, since the p-value is greater than the significance level 0.05. The presence of autocorrelation in the data is checked again using the lag-1 scatter plot presented in Figure 9. The random pattern of this scatter plot clearly indicates that the log-returns are independent. The Durbin-Watson test is also conducted to confirm that the log-returns are independent. The Durbin-Watson statistic is 1.8952, which is greater than $D_U = 1.662$ and $D_L = 1.611$ at the 5% significance level. This means that no correlation exists. To find the fitted model, we calculate the parameter estimates $\hat{c}$ and $\hat{\phi}$ as in Eq (5).
Accordingly, the fitted model is given in Eq (11). After the fitted model has been formed, its accuracy needs to be measured. The accuracy of the fitted GBM model is measured using MAPE, which is 0.0197%. This MAPE value is considered highly accurate based on Table 2. Hence, the GBM model is suitable for monitoring the process.
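The Durbin-Watson decisions reported above follow the usual bounds rule for positive autocorrelation, which can be written as a short sketch. The function name `durbin_watson_decision` is hypothetical; the statistics and bounds are the values quoted in this section.

```python
def durbin_watson_decision(dw, d_lower, d_upper):
    """Classify a Durbin-Watson statistic using the tabulated bounds.

    One-sided rule for positive autocorrelation:
    dw < d_lower -> autocorrelation, dw > d_upper -> none, otherwise inconclusive.
    """
    if dw < d_lower:
        return "positive autocorrelation"
    if dw > d_upper:
        return "no evidence of positive autocorrelation"
    return "inconclusive"


D_L, D_U = 1.611, 1.662  # bounds quoted in the text, 5% significance level

print(durbin_watson_decision(1.5015, D_L, D_U))  # raw data -> positive autocorrelation
print(durbin_watson_decision(1.8952, D_L, D_U))  # log-returns -> no evidence of positive autocorrelation
```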
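For reference, MAPE can be computed as in the sketch below, using the standard definition (mean of the absolute percentage errors). The function name `mape` and the numbers are hypothetical and are not the values behind the results reported here.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical observed and predicted temperatures for illustration only.
actual = [1579.3, 1578.9, 1580.1, 1579.6]
predicted = [1579.0, 1579.2, 1579.8, 1579.9]
print(f"MAPE = {mape(actual, predicted):.4f}%")
```

Because furnace temperatures are large relative to their hour-to-hour changes, very small MAPE values are expected for any reasonable one-step model.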

Monitoring process of two methods
The fitted models for the Box-Jenkins and GBM methods are used to compute residuals as in Eq (7) in order to plot control charts. Table 4 contains the information needed to construct the residual charts for ARIMA(2,1,1) and GBM. Once the control limits and center line are computed, the residual chart for the ARIMA model is plotted in Figure 10. In Figure 10, there is no out-of-control signal because all the points lie between the upper and lower limits; the process is in control. For GBM, all the points also lie within the control limits, so the process is in control, as can be seen in Figure 11. The residual control chart for the GBM method shows the same pattern as that of the Box-Jenkins method.

Conclusion
Based on the results, the furnace temperature data fit an ARIMA(2,1,1) model with a MAPE of 0.0189%, and also fit a GBM model with a MAPE of 0.0197%. The results show that there is practically no difference between the two methods in terms of accuracy and residual monitoring of the process. However, in the manufacturing industry, where people work with a tremendous number of tasks and a large amount of data to be monitored, dealing with a complicated method is impractical. Although some people prefer time series modelling, it is still difficult to apply for a person who lacks forecasting knowledge. The computational process shows that the Box-Jenkins method is difficult to apply, since many candidate models can be selected. Besides, the Box-Jenkins method is time-consuming because too many steps are required. In comparison, the GBM method is easier to implement and saves time, because it needs no special statistical skills other than the logarithmic transformation and parameter estimation for a regression. Moreover, it requires only a short computation time. In conclusion, the GBM method is easier to implement, saves time and maintains high accuracy when compared with the Box-Jenkins method. Thus, a simple mathematical model such as GBM is needed to help practitioners take prompt action.