Classification of Imbalanced Leukocytes Dataset Using ANN-based Deep Learning

The classification of imbalanced data is a major challenge for machine learning (ML) algorithms, especially in medical data analysis. In this paper, a deep learning algorithm, an advanced form of the artificial neural network (ANN), is used to classify five types of white blood cells (WBCs). Several image preprocessing techniques and algorithms are applied to isolate the WBCs and to segment the nucleus from the cytoplasm. Geometric, statistical and colour features are extracted, and principal component analysis is applied to select the optimal features. The classification process is repeated several times to tune the algorithm parameters and to find the best pattern match in the training data until the best classification accuracy is achieved. The multi-class classification results show an accuracy of more than 94% for the five types of WBCs. We evaluate the classification model using the geometric mean, Cohen’s Kappa, the receiver operating characteristic curve, the root mean squared error, the relative absolute error and cross-validation. In terms of these metrics, the model achieves high accuracy and can perform multi-class classification of imbalanced datasets.


INTRODUCTION
Imbalanced classes are a common problem in machine learning (ML) classification: the classes are unequally represented. Classification of imbalanced datasets is an inevitable problem in intelligent medical diagnosis because medical data have limited samples and high-dimensional features. This condition degrades the classification performance of the model and leads to inaccurate guidance for the diagnosis of diseases. Classification of imbalanced datasets is a challenging predictive modelling task because of the severely skewed class distribution, which causes inferior performance in traditional ML models and evaluation metrics that assume a balanced class distribution (Zhang, 2018), (Wang, 2016), (Ali, 2015). Predictive modelling is further complicated because most ML classification algorithms are designed under the assumption of an equal number of examples for each class (Korfiatis, 2019) (Sun, 2007).
A classification problem may be only slightly skewed, that is, it has a slight imbalance. Alternatively, it may have a severe imbalance, in which one class contains a very large number of examples and another contains only tens of examples in a given training dataset (Géron, 2019), (Vanhoeyveld, 2018). This condition leads to poor predictive performance, specifically for the minority class. Classification errors matter more for the minority class than for the majority class; in this sense, the minority class is more important than the majority class (Yousefi, 2019) (Lemaître, 2017). Class imbalance occurs when one class is highly represented in the dataset compared with the others. Datasets fall into two categories: balanced and imbalanced. In imbalanced datasets, instances are divided into two sets: majority instances, which are the most frequent, and minority instances, which are less frequent (Akbani, 2004) (Salehinejad, 2018). Imbalanced data deserve attention for two reasons: the accuracy paradox and algorithm bias towards the highly represented class.
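The accuracy paradox can be illustrated with a minimal Python sketch. The class counts below are hypothetical and are not taken from the WBC dataset, and the "classifier" is a deliberately trivial one that predicts the majority class for every sample.

```python
from collections import Counter

def majority_baseline_report(labels):
    """Score a trivial classifier that always predicts the most frequent class."""
    counts = Counter(labels)
    majority_class, majority_count = counts.most_common(1)[0]
    accuracy = majority_count / len(labels)
    # Recall is 1.0 for the majority class and 0.0 for every other class,
    # because no sample is ever assigned to a minority class.
    recall = {cls: (1.0 if cls == majority_class else 0.0) for cls in counts}
    return majority_class, accuracy, recall

# Hypothetical skewed label set: 950 majority samples and 50 minority samples.
labels = ["majority"] * 950 + ["minority"] * 50
predicted_class, accuracy, recall = majority_baseline_report(labels)
print(predicted_class, accuracy, recall)
```

Under these assumed counts the accuracy is high even though no minority sample is ever recognised, which is why accuracy alone can mislead on skewed data.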
Leukocytes play an important role in the immune system by protecting the body from infectious diseases. The total number of WBCs and the counts of the different WBC types play a crucial role in clinical diagnoses and tests: they reflect hidden infection within the body and alert the haematologist. In medicine, WBCs are also used as an indicator of various diseases, such as leukaemia, immunological disorders and certain types of cancer (Fan, 2019), (Yildirim, 2019), (AL-Dulaimi, 2018). WBCs can be categorised into five types: eosinophils, lymphocytes, neutrophils, monocytes and basophils. Neutrophils are the most abundant, making up 50%-70% of WBCs, and are responsible for defending against bacterial or fungal infection. Eosinophils make up around 2%-4% of WBCs and act in response to allergies and parasitic infection. Lymphocytes account for 25%-45% of WBCs and are responsible for the specific recognition of foreign agents and their subsequent removal from the host. Monocytes account for 3%-8% of WBCs and are effective in the direct destruction of pathogens and the clean-up of debris from infection sites. Basophils are the least common type of WBC (Negm, 2018) (Gonzalez Viejo, 2019), (AL-Dulaimi, 2018), (Doan, 1954).
Currently, ANN-based deep learning represents an optimal choice for medical image applications, such as medical image detection and classification. ANNs achieve good results provided that enough data are available for training the deep neural network. They also require high-quality computational resources to train and test on medical data of the required size. In many cases, a limited dataset may negatively affect the training of deep neural networks from scratch. In such a scenario, transfer learning can be used to leverage the power of the applied algorithm and reduce the computational cost (Sahlol, 2020) (Nguyen, 2018) (LeCun, 2015). In this approach, the deep neural network is first pre-trained on a large and varied general image dataset and then applied to the selected task. These predictive modelling problems are challenging because a sufficiently representative number of examples of each class is required for a model to learn the problem. The difficulty is evident when the number of examples per class is imbalanced, that is, skewed towards one or a few classes, with very few examples of the other classes. Problems of this type are referred to as imbalanced multi-class classification problems, and they require careful design of the evaluation metric and test harness and a careful choice of ML models (Das, 2020) (Madasamy, 2017).
A literature review was conducted on addressing class imbalance with deep learning algorithms. In the 1990s, Anand et al. used back-propagation neural networks to study the effects of class imbalance. They pointed to the difference in the length of the gradient component between the minority and majority classes, showing that the minority class has the shorter one (Khan, 2017). Essentially, the majority class dominates the net gradient, which is responsible for updating the model weights, as Ben noted in his 2019 work. This condition reduces the error of the majority group very rapidly in the early iterations but often increases the error of the minority group, which causes the network to get stuck in a slow mode of convergence (Anand R, 1993). Khan et al. (2015) presented an effective cost-sensitive deep learning approach that, during the training phase, simultaneously learns the network weight parameters and the class misclassification costs. Johnson et al. studied the problems of deep learning with class imbalance to improve understanding of the efficacy of deep learning on class-imbalanced datasets (Johnson, 2019). Methods for handling class imbalance in ML can be classified into three groups: algorithm-level methods, data-level techniques and hybrid methods (Polat, 2018) (L, 2016). Data-level techniques try to reduce the level of imbalance through various data sampling approaches. Algorithm-level approaches are generally implemented with a cost or weight scheme, which involves adjusting the underlying learner or its outputs to decrease the bias towards the majority classes. Hybrid methods strategically combine sampling and algorithmic techniques (Ali H. S., 2019) (Li, 2017).
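As an illustration of a data-level technique, the following Python sketch performs random oversampling, which duplicates minority-class samples until every class matches the size of the largest class. It is one simple sampling approach chosen for illustration, not necessarily the one used in the cited works; the function name `random_oversample` and the toy labels are assumptions.

```python
import random
from collections import Counter, defaultdict

def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of smaller classes up to the largest class size."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels

# Hypothetical toy data: integer sample IDs with three unevenly sized classes.
samples = list(range(12))
labels = ["a"] * 8 + ["b"] * 3 + ["c"] * 1
balanced_samples, balanced_labels = random_oversample(samples, labels)
print(Counter(balanced_labels))
```

Oversampling should be applied to the training portion only, because duplicated samples that leak into the test portion inflate the measured performance.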
The most common method used to evaluate classification performance is the accuracy rate. However, for imbalanced datasets, accuracy is not an effective measure because it does not distinguish the numbers of correctly classified samples of the different classes. This inadequacy may lead to mistaken conclusions (Tyagi, 2020), (Fernández, 2013). In imbalanced domains, the confusion matrix has been adopted as a significant evaluation tool because it measures the classification performance on the positive and negative classes independently (Thabtah, 2020), (García, 2010).
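A short Python sketch shows how a confusion matrix exposes what the accuracy rate hides. The labels and predictions are hypothetical, and the helper names are assumptions introduced for illustration.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes and columns are predicted classes."""
    index = {cls: i for i, cls in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, prediction in zip(y_true, y_pred):
        matrix[index[truth]][index[prediction]] += 1
    return matrix

def per_class_recall(matrix):
    """Fraction of each true class that was classified correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]

def accuracy(matrix):
    """Overall fraction of correctly classified samples."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)

# Hypothetical two-class example: 8 majority samples and 2 minority samples.
classes = ["neutrophil", "basophil"]
y_true = ["neutrophil"] * 8 + ["basophil"] * 2
y_pred = ["neutrophil"] * 8 + ["neutrophil", "basophil"]
matrix = confusion_matrix(y_true, y_pred, classes)
print(matrix, accuracy(matrix), per_class_recall(matrix))
```

In this toy case the overall accuracy is 0.9, yet only half of the minority samples are recognised, which the per-class recall makes visible.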
This work aims to apply ANN-based deep learning to the multi-class classification of an imbalanced dataset of normal and abnormal WBCs. Loss functions in different forms have been suggested to achieve high classification accuracy by making the learning algorithm more sensitive to the minority class. Presently, the most widely adopted loss function in deep learning algorithms is the mean squared error.
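To make the idea of a minority-sensitive loss concrete, the sketch below compares the plain mean squared error with a sample-weighted variant. The weighting scheme, the weights and the toy values are assumptions for illustration only; the text does not specify which modified loss, if any, was used.

```python
def mse(targets, outputs):
    """Plain mean squared error over all samples."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)

def weighted_mse(targets, outputs, weights):
    """Mean squared error in which each sample's error is scaled by its weight."""
    weighted_sum = sum(w * (t - o) ** 2 for t, o, w in zip(targets, outputs, weights))
    return weighted_sum / sum(weights)

# Hypothetical case: four majority samples (target 0) predicted well,
# and one minority sample (target 1) predicted poorly.
targets = [0.0, 0.0, 0.0, 0.0, 1.0]
outputs = [0.1, 0.1, 0.1, 0.1, 0.4]
weights = [1.0, 1.0, 1.0, 1.0, 4.0]  # assumed: the minority sample counts four times

print(mse(targets, outputs))                    # about 0.08
print(weighted_mse(targets, outputs, weights))  # about 0.185
```

With the assumed weights, the poorly predicted minority sample contributes far more to the loss, so a learner that minimises it is pushed to fit the minority class better.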

Materials and Methodology
We propose a classification scheme with four main stages, as shown in Figure 1. The first stage comprises data collection and preprocessing, in which the images are preprocessed to segment the WBCs from the whole image. After the WBCs are segmented, feature extraction and selection are performed in the second stage. In the third stage, an ANN is applied as the ML algorithm, using the Waikato Environment for Knowledge Analysis (Weka) software, to classify the five types of WBCs. In the fourth stage, the classification performance is evaluated using different evaluation metrics.

Preprocessing
Our dataset is composed of 180 blood smear images; each image includes at least one WBC, as shown in Figure 2. The images were obtained from the Department of Laboratories at Al-Hilla Teaching Hospital under the approval of the consultant pathologist Dr. Ali Zaki Naji. Image preprocessing consists of several steps, each involving one or more algorithms and techniques, to isolate the individual WBCs, as shown in Figure 2. The segmentation process follows [40] and aims to isolate the nucleus from the cytoplasm so that its features can be extracted. Segmentation is the most important step in processing medical images [41]. Failure of this step affects the sensitivity level, and sensitivity lost in this step cannot be recovered in later steps [42]. The main processes are gap filling, noise cleaning, removal of small black and white cut-outs using seed filling, image resizing using bilinear interpolation and edge smoothing using a mean filter. A total of 1000 individual WBCs are obtained. Each image has a size of 354 × 362. The images have been hand-labelled by a pathologist and are collected from an existing dataset.
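The mean filter used for edge smoothing can be sketched in a few lines of plain Python. This is a minimal illustration on a greyscale image stored as a nested list; the window radius and the border handling (averaging only the in-bounds neighbours) are assumptions, since the text does not state them.

```python
def mean_filter(image, radius=1):
    """Replace each pixel with the mean of its (2*radius+1) x (2*radius+1) neighbourhood."""
    height, width = len(image), len(image[0])
    smoothed = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            # Collect the in-bounds neighbours; the window shrinks at the borders.
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(height, r + radius + 1))
                for cc in range(max(0, c - radius), min(width, c + radius + 1))
            ]
            smoothed[r][c] = sum(window) / len(window)
    return smoothed

# Hypothetical 3 x 3 image containing a single bright pixel (isolated noise).
noisy = [
    [0, 0, 0],
    [0, 9, 0],
    [0, 0, 0],
]
for row in mean_filter(noisy):
    print(row)
```

The isolated bright pixel is spread over its neighbours, which is the smoothing effect that removes small irregularities along cell edges.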

Identification of WBCs
The process yields 1000 WBCs across five classes, with a different number of cells in each class, as shown in Table 1. WBCs contain a nucleus and cytoplasm and can be categorised into the five classes mentioned earlier.
Most existing methods identify the nuclei first because they are more noticeable than the other components. The nucleus of each type has a unique shape, which is its most important feature [43] [44]. The nucleus of neutrophils has a 'U' shape or a curved rod before segmentation. Basophils are the smallest circulating granulocytes, and their granules are large and very numerous; these granules take up a large amount of basic dye and have a deep blue-purple colour. Eosinophils are easily distinguished in stained smears by their massive granules, and the eosinophil nucleus often has two lobes connected by a band of nuclear material. Monocytes are the largest type of WBC, but they have only a single nucleus that is rarely or barely lobed. Folds can be observed in the nucleus, which can take different shapes, such as round, lobular and kidney-like. The nucleus of lymphocytes is slightly oval or round and stains dark [45] [46].

Feature Extraction
Feature extraction determines a set of features or characteristics that represent the significant information required for the classification task [47]. Feature extraction plays an essential role in the performance of an automatic WBC classification system. The main features extracted from the WBCs are the number of nuclei, the colour of the cytoplasm and the entire cell, as shown in Table 1. Three types of features, namely, geometrical, statistical and colour features, are extracted for classification. Most existing methods extract geometrical features, including WBC size, nucleus cell and nucleus shape [48]. Textural features comprise statistical features, such as momentum and contrast. Colour features use the variance and the histogram. A total of 34 of these features are extracted in this study from three-colour (RGB) and grey images.
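The following Python sketch illustrates how simple geometrical and colour features of the kind described above might be computed from a segmented cell. The specific features (area, bounding-box size, extent, per-channel mean and variance), the function names and the toy inputs are assumptions for illustration; they are not the 34 features used in this study.

```python
from statistics import mean, pvariance

def geometric_features(mask):
    """Compute size features from a non-empty binary mask (2D list of 0/1)."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    area = len(coords)
    # Extent: fraction of the bounding box that the region fills.
    return {"area": area, "bbox_height": height, "bbox_width": width,
            "extent": area / (height * width)}

def colour_features(pixels):
    """Compute the mean and population variance of each RGB channel."""
    features = {}
    for name, channel in zip("rgb", zip(*pixels)):
        features[f"mean_{name}"] = mean(channel)
        features[f"var_{name}"] = pvariance(channel)
    return features

# Hypothetical segmented nucleus mask and two RGB pixels from the cell region.
mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
pixels = [(10, 20, 30), (20, 20, 50)]
print(geometric_features(mask))
print(colour_features(pixels))
```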

Classification Task
The sampling process is applied after the preprocessing stage, and the deep neural classification model is then implemented. One run of 10-fold cross-validation is executed for each experiment to ensure that the model learns from all observations. A 'balanced' class-weight hyper-parameter is used as the algorithm-level technique because of its simplicity and its common application in credit risk. This research applies ANN algorithms with several 3  [50]. Thus, training ANN for learning is not ideal. SPSS version 20 is used to clean the dataset, yielding a chi-square statistic of 10.73 and a significance (Sig.) of 0.030. Thus, the assumption is satisfied. Thereafter, feature reduction is performed using principal component analysis (PCA) to remove poor-quality features. As a result, only four components, namely, the size of WBCs, the shape and number of nuclei and the colour of cytoplasm (bluish and reddish), are input to the ANN algorithm. The classification task is performed using Weka software. It requires adjusting several parameters, including the number of clusters, the clustering seed and the ridge parameter for the linear regression. These parameters are determined experimentally. The ridge parameter is tested and has a value of 1 × 10⁻⁸.
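A 'balanced' class weight can be sketched as follows. The formula below, weight = n_samples / (n_classes × class count), is one widely used heuristic (it is the one behind scikit-learn's `class_weight='balanced'` option); whether this study used exactly this formula is not stated, and the class counts are hypothetical rather than those of Table 1.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class inversely to its frequency: n_samples / (n_classes * count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Hypothetical class counts for illustration only.
labels = ["neutrophil"] * 600 + ["lymphocyte"] * 300 + ["basophil"] * 100
for cls, weight in balanced_class_weights(labels).items():
    print(f"{cls}: {weight:.3f}")
```

Rare classes receive proportionally larger weights, so an error on a minority sample costs the learner more than an error on a majority sample.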

RESULTS
Using the ANN-based deep learning algorithm on the imbalanced dataset, the five types of WBCs are classified with an accuracy of more than 94% after several adjustments to the parameters, namely, the number of clusters, the clustering seed and the ridge parameter, as shown in Table 3. The best accuracy is achieved after 100 attempts to balance the parameters of the ANN algorithm, as shown in Figure 3 and Table 2. Figure 4 shows the correct and incorrect classifications of the WBC types.

Evaluation of the Algorithm Model
The performance of the classification model is evaluated using the main types of classifier evaluation metrics, namely, the geometric mean (GM) of true rates, threshold metrics and probability metrics. Threshold metrics quantify classification prediction errors. In this study, the relative absolute error (RAE) and the root mean squared error (RMSE) are used for evaluation.
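The two error measures can be sketched in Python using their standard definitions: RMSE is the square root of the mean squared difference between predicted and actual values, and RAE is the total absolute error divided by the total absolute error of a predictor that always outputs the mean of the actual values. The toy values are hypothetical, and how the evaluation software applies these measures to class predictions is not shown here.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error."""
    squared = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def relative_absolute_error(actual, predicted):
    """Total absolute error relative to always predicting the mean of the actual values."""
    mean_actual = sum(actual) / len(actual)
    model_error = sum(abs(p - a) for a, p in zip(actual, predicted))
    baseline_error = sum(abs(a - mean_actual) for a in actual)
    return model_error / baseline_error

# Hypothetical targets and predicted scores.
actual = [1.0, 0.0, 0.0, 1.0]
predicted = [0.9, 0.2, 0.1, 0.6]
print(rmse(actual, predicted))                     # about 0.2345
print(relative_absolute_error(actual, predicted))  # about 0.4
```

Tools such as Weka typically report RAE as a percentage, which corresponds to multiplying the ratio above by 100.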
The receiver operating characteristic (ROC) curve is a ranking metric that summarises the behaviour of a model by calculating the false and true positive rates. Kappa is a fundamental measure of the performance of a classification system, particularly on imbalanced datasets. It compares the classifier's performance with the agreement expected by chance under the target distribution. Table 4 shows the evaluation metrics of the ANN classification model. Figure 5 shows the ROC curves for the classification of WBC types. GM, ROC and Kappa can be defined by the following equations.
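For reference, the rates underlying the ROC curve and one common multi-class form of GM can be written as below. These are standard textbook definitions given under the assumption that they match the ones intended here; TP, FP, TN and FN denote true positives, false positives, true negatives and false negatives for one class against the rest, k is the number of classes and R_i is the true rate (recall) of class i.

```latex
% Standard definitions, assumed to match the intended ones.
\[
\mathrm{TPR} = \frac{TP}{TP + FN}, \qquad
\mathrm{FPR} = \frac{FP}{FP + TN}
\]
% Geometric mean of the per-class true rates R_1, ..., R_k.
\[
\mathrm{GM} = \left( \prod_{i=1}^{k} R_i \right)^{1/k}
\]
```

The ROC curve plots TPR against FPR as the decision threshold varies. Because GM is a product of per-class rates, it drops sharply if any single class is poorly recognised.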

Kappa Statistic
$$\kappa = \frac{\Pr(a) - \Pr(e)}{1 - \Pr(e)}$$
where Pr(a) represents the actual observed agreement and Pr(e) represents the chance agreement.
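The metrics in this section can be computed as in the following Python sketch. Kappa uses the observed agreement Pr(a) and the chance agreement Pr(e) as (Pr(a) − Pr(e)) / (1 − Pr(e)). GM is taken here as the geometric mean of the per-class recalls, which is an assumption about the exact form used. The ROC points are computed for one class against the rest. All inputs are hypothetical.

```python
import math

def cohen_kappa(matrix):
    """Cohen's Kappa from a confusion matrix (rows: true class, columns: predicted)."""
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total      # Pr(a)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2  # Pr(e)
    return (observed - expected) / (1 - expected)

def geometric_mean_recall(matrix):
    """Geometric mean of per-class recalls (assumed multi-class form of GM)."""
    recalls = [row[i] / sum(row) for i, row in enumerate(matrix)]
    return math.prod(recalls) ** (1 / len(recalls))

def roc_points(labels, scores):
    """(FPR, TPR) points for binary labels (1 = positive), sweeping the threshold downwards."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points, tp, fp, i = [(0.0, 0.0)], 0, 0, 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoidal area under the ROC points."""
    return sum((x2 - x1) * (y1 + y2) / 2 for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical confusion matrix and one-vs-rest scores.
matrix = [[45, 5], [10, 40]]
print(cohen_kappa(matrix), geometric_mean_recall(matrix))
points = roc_points([1, 1, 0, 1, 0, 0], [0.9, 0.8, 0.7, 0.6, 0.4, 0.2])
print(points, area_under_curve(points))
```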

Conclusion
Classification of WBCs is an important task in medical diagnosis because it can signal the presence of diseases, especially cancerous diseases. Each type of WBC is associated with diseases that differ from those of the others. Therefore, pathologists need to distinguish amongst the types of WBCs to diagnose diseases. This research classifies five types of WBCs from imbalanced data samples using ANN-based deep learning as the ML classifier. The results show that the classification task performs effectively with 10-fold cross-validation and tuning of the parameters, namely, the number of clusters and the clustering seed. The average classification accuracy for the five types is more than 94%. The analysis of statistical features indicates a strong negative correlation between the number of nuclei and the size of WBCs. A negative relationship also exists between the number of nuclei and the colour of the cytoplasm (bluish and reddish). The classification achieves a GM of 0.806. The Kappa value of 0.92 shows that the model can perform multi-class classification of imbalanced class problems effectively. The RMSE of 0.275 shows that the model predicts the data with reasonable accuracy, with an acceptable RAE of 9.50.