Forecasting patient admission in orthopedic clinic at a hospital in Kuantan using autoregressive integrated moving average (ARIMA) models

This study empirically identifies the best ARIMA model for forecasting monthly patient demand at an orthopedic clinic. Monthly time series data routinely collected at the Orthopedic clinic from January 2013 until June 2018 were used for this purpose. The stationarity of the series was first assessed from ACF and PACF plots, and the fitted model was then checked using the Ljung-Box-Pierce Q-statistic. The monthly series was found to be stationary. To select the best ARIMA model, the data were split into two periods, an estimation period and a validation period, and the model with the smallest MAPE was considered the best. On this basis, ARIMA (1, 0, 0) was found to be the best model for forecasting the Orthopedic clinic series. The out-of-sample forecasts from the ARIMA (1, 0, 0) model indicate that monthly orthopedic patient demand fluctuates between a low of 294 and a high of 299 patients who could receive treatment at the clinic in a month.


Introduction
As the economy and society continue to develop, effective hospital management has become increasingly important. Hospitals face many problems, such as the aging of the domestic population, pandemic diseases like H1N1, and rising operational and maintenance costs. The healthcare sector in Malaysia has expanded over the past few years. The private hospital sector is expected to continue to grow as a result of private insurance benefits and increased per capita income [1]. In addition, the aging population, the growing wealthy class in Malaysia, the variety of medical insurance products available and an increasingly health-conscious society are leading to strong market growth in the private hospital sector [2].
Another factor that boosts the expansion of private hospitals is medical tourism. Malaysia has been rated as the world's fourth best medical tourism destination [3]. Malaysia attracts medical tourists because of the availability of good healthcare facilities at low cost, together with its favorable exchange rate, political and economic stability and high rate of literacy. Furthermore, the Malaysian government supports medical tourism as part of the country's current 5-year development blueprint by incorporating it in the Eleventh Malaysia Plan, which covers 2016 to 2020 [4].
Hence, to be able to plan effectively, it is vital for a private hospital to anticipate future demand. A private hospital's ability to predict hospital demand can have a significant impact on both customer service and financial performance [5]. Accurate prediction would facilitate, at the micro level, the scheduling of nursing and support personnel, for example, and at the macro level, financial and strategic planning for the hospital. Forecasting can also be a great help in controlling the hospital's resources, thereby improving efficiency, reducing costs and increasing profitability.
However, demand for a hospital service may fluctuate unexpectedly. A gradual increase in the number of patients puts pressure on a hospital's available staff and strains its facility capacity. This capacity crisis has resulted in significant operational gaps, including an increase in patients waiting to be admitted into wards and long waiting times for laboratory, radiology, and other diagnostic services. This situation can overwhelm medical staff and affect the quality of their diagnosis and prognosis, resulting in increased work dissatisfaction [5]. If this variance can be reduced, the hospital can reach higher efficiency. Projecting demand will allow hospital management to plan for these changes and prevent unintended mistakes. This research aims to establish parsimonious and effective models that can forecast the number of patients attending on a monthly basis, thereby helping the hospital's management to plan better.

Literature review
Forecasting the number of patients receiving treatment in hospitals has been an important research topic for decades. Predictive models are an essential part of the forecasting process. The model-building process should take into account the relationships between individual variables, and the system as a whole must be checked to ensure the model's forecasting capability. This research used the ARIMA (autoregressive integrated moving average) model to predict patient admission because it extracts richer information from a time series than other similar methods [6]. The findings of relevant studies are briefly described below.
Forecasting methods have been used to predict patient admissions to Hospital Medicine (HM) in the United States, and the approaches were found to predict HM admissions accurately [7]. Similarly, the ARIMA model was found to be appropriate for forecasting emergency patient admissions at a hospital in Athens [8]. Meanwhile, a study comparing ARIMA and artificial intelligence methods for predicting routine hospital admissions due to circulatory and respiratory problems in Madrid suggested that ARIMA approaches can solve the problem adequately [9].
Reis and Mandl developed ARIMA models to provide a foundation for real-time surveillance and bioterrorism detection at Children's Hospital Boston [10]. They forecast overall hospital visits and respiratory-related patient visits at the pediatric emergency department. The results showed that overall visits and respiratory visits are best fitted by ARIMA(2,0,1) and ARIMA(1,0,1) processes respectively. The mean absolute percentage error (MAPE) of the overall visits model was 9.37% and that of the respiratory model was 27.54%. The ARIMA model of overall visits was able to forecast seven-day-long abnormal visit patterns.
A hospital can plan elective admissions properly by accurately predicting emergency patient admissions. To this end, Jones et al. developed several ARIMA models of daily hospital bed occupancy from emergency admissions and of the number of emergency admissions at Bromley Hospitals NHS Trust [11]. Their models are able to forecast bed occupancy with good accuracy. The study by Champion et al. supports this finding [12]. They developed ARIMA models to forecast monthly emergency patient arrivals at the emergency department of a hospital in regional Victoria, Australia. They found that the ARIMA (0, 1, 1) model is fairly adaptable to this type of data. Thus, they concluded that time series analysis is a useful tool for forecasting emergency department admissions. Abraham et al. further confirmed the findings [13]. They compared several ARIMA models for forecasting daily inpatient admissions and occupancy at Royal Melbourne Hospital, Victoria, Australia. They found that the models are able to forecast emergency occupancy up to seven days ahead with reasonable accuracy.
In epidemic disease forecasting, Promprou et al. applied ARIMA models to forecast dengue hemorrhagic fever (DHF) cases in Southern Thailand [14]. They established an ARIMA (1, 0, 1) model and found that the models are very useful in forecasting DHF in Southern Thailand. Li et al. support this finding, suggesting that ARIMA models fit the fluctuations in the incidence of hemorrhagic fever with renal syndrome in China [15]. Meanwhile, Permanasari et al. predicted the number of malaria cases in humans using the Seasonal Autoregressive Integrated Moving Average (SARIMA) model and found that the forecasting model is able to provide a good prediction of the incidence [16]. As such, they suggested that the models offer the potential for improving the planning, control and prevention of the diseases and can be utilized by the health authority.

Data
For the purpose of time series modelling in this study, data from the Orthopedic clinic of a private hospital in Kuantan were used. The data are counts of patients receiving treatment at the orthopedic clinic and were obtained from the admission record books of the clinic. The first 60 monthly observations (January 2013 to December 2017) were used as training data on which the models were estimated. The subsequent 6 observations (January to June 2018) were kept as a hold-out sample to verify the accuracy of the forecasting models.
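The 60/6 split described above can be sketched in a few lines of Python. This is only an illustration: the function name `split_series` and the placeholder values are assumptions made here, and the numbers are not the clinic's records.

```python
def split_series(values, n_train):
    """Split a chronologically ordered series into training and hold-out parts."""
    if not 0 < n_train < len(values):
        raise ValueError("n_train must be between 1 and len(values) - 1")
    # No shuffling: the hold-out sample must come after the training period.
    return values[:n_train], values[n_train:]


# Placeholder counts only: 66 made-up monthly values standing in for
# January 2013 to June 2018. They are NOT the clinic's data.
monthly_counts = [290 + (i * 7) % 15 for i in range(66)]

train, holdout = split_series(monthly_counts, n_train=60)
print(len(train), len(holdout))  # 60 6
```

Keeping the hold-out months strictly after the training months matters for time series, because a random split would let the model see observations from the period it is later asked to forecast.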

Methodology
The data from the clinic are collected in time order, so modelling with time series techniques is possible. In particular, this study uses the Box-Jenkins approach to autoregressive integrated moving average, or ARIMA (p, d, q), models, where p is the order of the autoregressive (AR) component, d is the degree of differencing or integration (I) and q is the order of the moving-average (MA) component. All three are based on the simple concept of random disturbances or shocks. ARIMA models are usually formulated on the premise of constant variance in the error term [11]. For an ARIMA (p, d, q) model, the AR(p) part must be stationary and the MA(q) part must be invertible [17,18]. ARIMA (p, d, q) models are widely applied in empirical studies in the health care industry [10,15,19].
A brief description of the time series models and definitions used in this study is given in the following paragraphs. In practice, most time series are non-stationary. According to Bowerman et al., the Box-Jenkins methodology requires the time series used in forecasting to be stationary, that is, its statistical properties, such as the mean, must be constant through time [17]. As such, a non-stationary time series must be transformed into a stationary one. The transformation can be done through differencing, which is the "integration" part of the model. Once the series is stationary, the non-seasonal mixed autoregressive moving-average model of orders p and q can be expressed as
$$Z_t = \phi_1 Z_{t-1} + \dots + \phi_p Z_{t-p} + a_t - \theta_1 a_{t-1} - \dots - \theta_q a_{t-q}$$
where $Z_t$ is a realization of the time series, $\phi_1, \dots, \phi_p$ and $\theta_1, \dots, \theta_q$ are parameters of the model and $a_t$ is an independent and identically distributed (IID) error term with a mean of zero and constant variance.
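To make the differencing step concrete, the following minimal Python sketch applies d rounds of first differencing to a short made-up series. The function name `difference` and the example values are illustrative assumptions; they are not taken from the study.

```python
def difference(series, d=1):
    """Apply first differencing d times: w_t = z_t - z_{t-1}."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out


# A made-up series with a linear trend (not the clinic data).
trend = [10, 12, 14, 16, 18]

print(difference(trend, d=1))  # [2, 2, 2, 2]
print(difference(trend, d=2))  # [0, 0, 0]
```

One round of differencing removes the linear trend here, and each round shortens the series by one observation. With d = 0 the series is returned unchanged.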
The AR (p) model is the non-seasonal autoregressive model of order p. In an autoregressive (AR) process, each value in a series at time t depends only on its previous values and on random noise. It is a linear function of the previous value or values, so to make a forecast one needs to know the p previous values. AR assumes that the future values of an examined variable may be approximated and forecast by its own previous values. That is, its past behavior may carry important information about its near-future dynamics [20]. In a first-order autoregressive process, only the single preceding value is used; in a second-order process, the two preceding values are used, and so on. These processes are commonly indicated by the notation AR (n) or ARIMA (n, 0, 0), where the number in parentheses indicates the order.
The MA (q) model is the non-seasonal moving-average model of order q. The moving-average (MA) component of an ARIMA model tries to predict future values of the series based on deviations from the series mean observed for previous values. In a moving-average process, each value is determined by the weighted average of the current disturbance and one or more previous disturbances. The order of the moving-average process specifies how many previous disturbances are averaged into the new value. In the standard notation, an MA (n) or ARIMA (0, 0, n) process uses n previous disturbances along with the current one.
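As an illustration of how a first-order autoregressive process turns its single preceding value into forecasts, the sketch below iterates an AR (1) model written in mean-centred form, z_t - mean = phi * (z_{t-1} - mean) + a_t. The mean-centred form, the function name `ar1_forecasts` and all numbers are assumptions chosen for easy arithmetic; they are not estimates from this study.

```python
def ar1_forecasts(last_value, mean, phi, horizon):
    """Forecast an AR(1) process h = 1..horizon steps ahead.

    Future shocks are replaced by their expected value of zero, so the
    h-step forecast is mean + phi**h * (last_value - mean).
    """
    forecasts = []
    deviation = last_value - mean
    for _ in range(horizon):
        deviation *= phi  # each step keeps only a fraction phi of the deviation
        forecasts.append(mean + deviation)
    return forecasts


# Illustrative values only: series mean 300, last observation 280, phi = 0.5.
print(ar1_forecasts(last_value=280, mean=300, phi=0.5, horizon=4))
# [290.0, 295.0, 297.5, 298.75]
```

When phi lies strictly between -1 and 1, the deviation from the mean shrinks at every step, so forecasts from a stationary AR (1) model start near the last observed value and settle toward the series mean as the horizon grows.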
The Box-Jenkins model-building technique is a method of finding a suitable ARIMA model that adequately represents the data set. The approach is iterative and may pass through the process several times before a suitable model is found. That is, the methodology arrives at ARIMA models through a three-stage iterative procedure of model identification, model estimation, and diagnostic checking or model validation, and then uses the models for forecasting [see 17,18,21].
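The estimation stage can be sketched for the simplest case, an AR (1) model with a constant, fitted by ordinary least squares of each value on its predecessor. The paper does not state which estimation method or software was used, so this is an assumed stand-in for illustration; the function name `fit_ar1` and the check series are hypothetical.

```python
def fit_ar1(series):
    """Estimate z_t = c + phi * z_{t-1} + a_t by ordinary least squares."""
    x = series[:-1]  # lagged values z_{t-1}
    y = series[1:]   # current values z_t
    n = len(x)
    if n < 2:
        raise ValueError("need at least three observations")
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("series is constant; phi is not identified")
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    phi = sxy / sxx
    c = y_mean - phi * x_mean
    # Residuals feed the diagnostic-checking stage.
    residuals = [yi - (c + phi * xi) for xi, yi in zip(x, y)]
    return c, phi, residuals


# Noise-free check series built from z_t = 10 + 0.5 * z_{t-1} (made up).
check = [100.0]
for _ in range(5):
    check.append(10.0 + 0.5 * check[-1])

c, phi, residuals = fit_ar1(check)
print(round(c, 6), round(phi, 6))  # 10.0 0.5
```

The residuals returned by the fit are what the diagnostic stage examines: if the model is adequate, they should behave like white noise.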
The common method used to measure the accuracy of the forecast is the mean absolute percentage error (MAPE). A MAPE of 0% indicates a perfect fit of the model to the training data [10,21]. The mean absolute percentage error is calculated as
$$\mathrm{MAPE} = \frac{100}{n} \sum_{t=1}^{n} \left| \frac{A_t - F_t}{A_t} \right|$$
where $A_t$ is the actual value in period t, $F_t$ is the forecast value in period t, and n is the number of periods used in the calculation.
Figure 1 shows that the original values of the time series fluctuate around a constant mean, so the series can be considered stationary. The plot also shows no seasonality and no trend in the data. Furthermore, the ACF of the series in figure 2 dies down fairly quickly after lag two in a damped sine-wave fashion, which again indicates that the series is stationary. Since the series is stationary without differencing, the degree of integration is set to zero (d = 0). By comparing the correlograms in figures 2 and 3, we can tentatively determine the order of the ARIMA model. Figure 3 shows that the PACF cuts off fairly quickly after lag one, in contrast to the ACF in figure 2. Therefore, a pure autoregressive model of order p is a possible ARIMA model. The single spike in the PACF strongly suggests that an AR (1) model is appropriate for monthly orthopedic patient demand. Consequently, AR (1) was fitted to the available training data and tested as the candidate ARIMA (p, d, q) model.
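The correlograms used for identification can be computed directly. The sketch below gives the sample ACF and derives the PACF from it with the Durbin-Levinson recursion. The function names and the example series are illustrative assumptions, and the band of 2 divided by the square root of n is the usual approximate 95% limit for white noise rather than a value quoted from the study.

```python
import math


def acf(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]


def pacf(series, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(series, max_lag)  # r[k - 1] holds the lag-k autocorrelation
    prev = []                 # coefficients of the AR(k - 1) fit
    out = []
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = r[0]
        else:
            num = r[k - 1] - sum(prev[j] * r[k - 2 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * r[j] for j in range(k - 1))
            phi_kk = num / den
        # Update the AR(k) coefficients from the AR(k - 1) ones.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
    return out


# Made-up series for illustration (not the clinic data).
example = [5, 7, 6, 9, 8, 10, 7, 6, 8, 9]
bound = 2 / math.sqrt(len(example))  # approximate 95% white-noise limit

for lag, (a, p) in enumerate(zip(acf(example, 3), pacf(example, 3)), start=1):
    print(lag, round(a, 3), round(p, 3), abs(p) > bound)
```

For an AR (1) process one would expect the PACF to show a single value outside the band at lag one and the ACF to decay gradually, which is the pattern the identification step looks for.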

Results
From the analysis, the ARIMA (1, 0, 0) model (first-order autoregressive) was found to give appropriate results for representing monthly orthopedic demand. Table 1 shows the estimated parameters of the ARIMA (1, 0, 0) model of monthly orthopedic demand. From Table 1, and following Bowerman et al., the stationarity of the estimated model can be evaluated [17]. That is, for an AR (1) or ARIMA (1, 0, 0) model, the coefficient $\phi_1$ must be less than 1 in absolute value.
Table 1 shows that $\phi_1$ is 0.48505, which is less than 1 in absolute value. As such, the model fulfills the stationarity condition and is considered stable. Furthermore, the coefficient $\phi_1$ has a t-ratio greater than 2 with a p-value below 5 percent (p < .05), so it is statistically significant.
Figure 6: ACF of error from ARIMA (1, 0, 0) model
Diagnostic checking was performed to verify the adequacy of the ARIMA (1, 0, 0) model. Figures 4 and 5 show the correlograms of the ACF and PACF of the residuals from the ARIMA (1, 0, 0) model respectively. Figure 6 shows the autocorrelation plot of the residuals of the ARIMA (1, 0, 0) model. From figures 4 and 5, we can conclude that the ACF and PACF of the residuals lie within the boundary lines, indicating that overall the residual series approximates zero-mean white noise. From figure 6, we can observe that the autocorrelations of the residuals are small, and thus the Ljung-Box Q statistic values are relatively small. The residuals of the ARIMA (1, 0, 0) model also have smaller standard errors and larger p-values (i.e., p > .05) corresponding to the Ljung-Box Q statistic [see 17]. As such, we can conclude that the ARIMA (1, 0, 0) model is adequate and can be used in forecasting.
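The Ljung-Box statistic used in the diagnostic check follows the standard definition Q = n(n + 2) times the sum over lags k = 1..m of r_k squared divided by (n - k), where r_k is the lag-k autocorrelation of the residuals. The sketch below is a minimal pure-Python version; the function names and the example residuals are assumptions for illustration and are not the residuals of the fitted model.

```python
def residual_acf(residuals, max_lag):
    """Sample autocorrelations of the residuals at lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q = n (n + 2) * sum_{k=1}^{m} r_k^2 / (n - k)."""
    n = len(residuals)
    r = residual_acf(residuals, max_lag)
    return n * (n + 2) * sum(rk ** 2 / (n - k)
                             for k, rk in enumerate(r, start=1))


# Made-up, strongly alternating residuals: clearly not white noise.
print(ljung_box_q([1, -1, 1, -1], max_lag=1))  # 4.5
```

Under the hypothesis that the residuals are white noise, Q is compared with a chi-square distribution whose degrees of freedom are the number of lags minus the number of estimated AR and MA parameters. A small Q, and hence a large p-value, supports model adequacy.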

Forecasting monthly patient demand at the Orthopaedic clinic
The main objective of this study is forecasting, that is, predicting future values of a time series. The time series models developed are used to forecast future demand at the clinic. However, the accuracy of the forecasting models must be evaluated over a certain period in order to identify the best model, i.e., the model with as little error as possible. The measure commonly employed is the mean absolute percentage error (MAPE), which is used here as the measure of quality of fit. Table 2 shows the actual monthly patient demand at the Orthopaedic clinic from January 2018 to June 2018, the forecast patient demand from July 2018 to June 2019, and the corresponding forecast errors for the period from January 2018 to June 2018. From table 2, the absolute percentage errors are high for May 2018 and June 2018. The forecasts of monthly orthopedic patient demand rise and then level off toward a constant value, with a low of 294 and a high of 299 patients who could receive treatment per month. The table also shows that the ARIMA (1, 0, 0) model yielded a MAPE of 11.103 percent.
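The MAPE calculation applied to the hold-out months can be sketched as follows. The function name `mape` and the three value pairs are illustrative assumptions; they are not the entries of table 2.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("actual and forecast must be non-empty and equal in length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Made-up values: absolute percentage errors of 10%, 5% and 0%.
actual = [100, 200, 400]
forecast = [110, 190, 400]
print(round(mape(actual, forecast), 2))  # 5.0
```

Because each error is divided by the actual value, a month with an unusually low actual count can dominate the average, which is worth keeping in mind when a few months show high percentage errors.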

Discussion and conclusion
This study considered a number of ARIMA models for estimating and forecasting patient demand at an orthopedic clinic. The models were developed from monthly time series data, and the best model was then selected. Still, it is essential that the forecasts be updated as and when more data become available. Forecasting models clearly act as useful components of the healthcare system. However, a model should not be judged solely on technical criteria such as MAPE; its usefulness in real situations, such as its ability to predict patient attendance at the clinic with practical accuracy, should also be given due consideration.
ARIMA models are a useful tool for analyzing time series data and for using the fitted model in forecasting. Forecasting offers the potential for improving planning in hospital services. The ARIMA