Assessment of Prediction Models of Confirmed, Recovered and Deceased cases due to COVID-19

A pandemic is a situation in which a disease spreads geographically and affects an entire country or the whole world. When an epidemic becomes a pandemic, it becomes a question of our survival. COVID-19 has become a pandemic, as we all know, and calls for thorough, in-depth research. The death toll is still mounting. The pandemic can cause significant economic, social and political disruption. It is therefore necessary to understand its impact at its place of origin, so that we can analyze its potential and its rate of spread. To do this, we apply machine learning algorithms and regression concepts for prediction. In the present work we build prediction models of Confirmed, Recovered and Deceased cases using the K-Nearest Neighbour Regressor and the Gradient Boosting Regressor. The models perform very well in predicting all three kinds of cases, with R-squared values very near to 1.


Introduction
COVID-19 is a curse upon the nation, and it has created a pandemic situation across the whole world. Our lives have changed because the Earth has been infected by the virus behind COVID-19, a novel coronavirus. A virus, which is merely something between living and non-living, has become a powerful enemy of humanity and a challenge to its existence [1][2]. The whole human race is looking for a weapon to combat this invisible enemy.
The pandemic caused by COVID-19 (coronavirus disease 2019) is among the most devastating pandemics the world has faced so far. The virus is suspected to have originated in a small animal market in Wuhan, China, and was at first called the Wuhan virus. A number of patients were admitted to hospitals in Wuhan with pneumonia-like symptoms but did not respond to the usual treatment. Within a few days the virus was found to be extremely contagious, and it spread rapidly in Wuhan, in the Hubei province of China, in December 2019 [3][4]. As of 15 August 2020 it has spread to 213 countries all over the world, with more than 21 million total cases and 763,387 deaths.
After genome sequencing of respiratory tract samples from infected patients, the virus was isolated as a novel betacoronavirus and later identified as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Before the SARS-CoV-2 pandemic there had been two other notable coronavirus outbreaks [5] [7][8][9].
This global outbreak is being studied by scientists from different fields, and a number of methodologies for analyzing and predicting the evolution of the pandemic are being discussed in different forums [10][11][12]. Some mathematical models have been built to forecast transmission dynamics [13][14][15]. In this paper we build prediction models for the confirmed, recovered and deceased cases.

Methodology and Implementation
Here we predict the Confirmed (affected), Recovered and Deceased cases over the full span of the COVID-19 pandemic covered by our data: 247 days, from 31 December 2019 to 31 August 2020. The original dataset is shown in Table 1. It is arranged by total number of confirmed, recovered and deceased cases, and a sample of it is shown in Table 2.
Figure 1. Visualisation of Confirmed, Recovered and Deceased cases
Figure 1 shows how the Confirmed, Recovered and Deceased cases evolve over the days of the pandemic. Figure 2 shows the correlation among the attributes of the dataset. Figure 3 shows the ground truth, that is, the original confirmed cases of COVID-19. Figure 4 shows the confirmed cases predicted by the K-Nearest Neighbour Regressor, and Figure 5 shows the confirmed cases predicted by the Gradient Boosting Regressor. Figure 9, Figure 10 and Figure 11 show, respectively, the ground truth, the K-Nearest Neighbour Regressor prediction and the Gradient Boosting Regressor prediction for the deceased cases.
Figure 11. Deceased: Prediction using Gradient Boosting Regressor
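As an illustration of the two regressors named above, the following is a minimal pure-Python sketch rather than the implementation used in this work. The paper does not state the library, hyperparameters or train/test split, so everything here is an assumption: the helper names (knn_predict, fit_stump, fit_gradient_boosting, gb_predict) are hypothetical, the data are synthetic (the square of the day index stands in for a cumulative case count), every fifth day is held out for testing, and the settings k=3, 200 boosting rounds and a learning rate of 0.1 are arbitrary.

```python
def knn_predict(train_x, train_y, query_x, k=3):
    """K-nearest-neighbour regression: average the targets of the k closest days."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query_x))[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def fit_stump(xs, residuals):
    """Find the single-split regression stump with the lowest squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def fit_gradient_boosting(xs, ys, n_rounds=200, learning_rate=0.1):
    """Gradient boosting for squared error: each stump is fitted to the current residuals."""
    base = sum(ys) / len(ys)          # initial prediction: the mean target
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        preds = [p + learning_rate * (left_mean if x <= threshold else right_mean)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps


def gb_predict(model, x):
    """Sum the base value and the shrunken contribution of every stump."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(l if x <= t else r for t, l, r in stumps)


# Synthetic stand-in data (NOT the paper's dataset): day index -> cumulative count.
days = list(range(1, 61))
cases = [d * d for d in days]

# Assumed split: hold out every fifth day for testing.
test_idx = set(range(4, 60, 5))
train_x = [d for i, d in enumerate(days) if i not in test_idx]
train_y = [c for i, c in enumerate(cases) if i not in test_idx]
test_x = [d for i, d in enumerate(days) if i in test_idx]
test_y = [c for i, c in enumerate(cases) if i in test_idx]

knn_preds = [knn_predict(train_x, train_y, x, k=3) for x in test_x]
gb_model = fit_gradient_boosting(train_x, train_y)
gb_preds = [gb_predict(gb_model, x) for x in test_x]

for x, y, p_knn, p_gb in zip(test_x, test_y, knn_preds, gb_preds):
    print(f"day {x}: actual={y}, knn={p_knn:.1f}, gb={p_gb:.1f}")
```

One property worth keeping in mind when reading the prediction figures: both methods interpolate between observed days. A neighbour average or a sum of stumps stays constant for days beyond the last training day, so neither model can extend a rising trend into the future.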

Result and Discussion
To measure the performance of our models we calculated two relevant metrics, MSE and R-squared.
MSE (Mean Squared Error) is the average of the squared differences between the predicted and the actual observations.
R-squared (coefficient of determination) measures how well the predicted values fit the original values. Values between 0 and 1 can be interpreted as the percentage of the variance explained, and the higher the value, the better the model.
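The above metrics can be expressed, in their standard form, as
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$
where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, $\bar{y}$ is the mean of the actual values and $n$ is the number of observations.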
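As a minimal sketch, the two metrics described above can be computed directly from their standard definitions. The function names and the four sample values below are illustrative assumptions and are not taken from the dataset used in this work.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Illustrative values only (hypothetical, not from the paper's data).
actual = [100, 150, 210, 280]
predicted = [110, 140, 200, 290]

print("MSE:", mean_squared_error(actual, predicted))   # 100.0
print("R-squared:", round(r_squared(actual, predicted), 4))  # 0.9779
```

Here every prediction is off by 10, so the MSE is 100, while R-squared stays close to 1 because that error is small relative to the spread of the actual values. MSE depends on the scale of the counts, whereas R-squared does not, which is why the two metrics are reported together.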

Conclusion
In the present work we have built prediction models of Confirmed, Recovered and Deceased cases using the K-Nearest Neighbour Regressor and the Gradient Boosting Regressor. The models perform very well in predicting all three kinds of cases, with R-squared values very near to 1. In future, forecasting based on a mathematical model such as Susceptible-Exposed-Infectious-Recovered (SEIR) can be carried out, and we will also make predictions using other machine learning models.