Pre-Diabetic Retinopathy identification using hybrid Genetic Algorithm-Neural Network classifier

Diabetic retinopathy (DR) is one of the most prevalent complications of diabetes, causing blurred vision and visual degeneration among adults of working age. Microaneurysms (MAs) are among the first symptoms of DR. A Genetic Algorithm-Artificial Neural Network (GA-NN) technique is developed for early diagnosis of DR. The proposed framework has five steps. Image pre-processing is performed using an r-polynomial transformation. In the candidate extraction step, the K-means algorithm is used to segment blood vessels, and candidate patches are generated. Shape attributes are derived from the image with blood vessels excluded, and GLCM and LBP features are derived separately from the patches. GA-NN classifiers are employed to classify each feature set independently. The final decision system assigns the MA or non-MA class label by plurality voting over the classifiers. The methodology was tested on two databases, e-ophtha MA and DIARETDB1, and achieved areas under the receiver operating characteristic (ROC) curve (AUCs) of 0.89 and 0.87, respectively.


Introduction
Diabetic Retinopathy (DR) is a disease of diabetic patients that damages the blood vessels in the eye and may lead to impaired vision. It is the most common form of retinal damage in people with diabetes and can impact almost half a billion people worldwide [1]. In India, there is one ophthalmologist per 107,000 people; the ratio is 1:9,000 in urban areas, while in rural areas there is only one ophthalmologist per 608,000 people [2]. By 2045, the number of diabetics in India is projected to triple, and a third of them are expected to be DR patients [3]. DR screening systems face multiple challenges, including deployment, administration, grader availability and financial sustainability. In view of these, computer-aided diagnostic methods are needed to screen such a large population and to aid doctors in the detection, understanding, and assessment of retinal anomalies [4] [5].
The retina of the human eye includes the fovea, macula, blood vessels and optic disc (OD). In diagnosing the disease, the ophthalmologist looks for many signs, such as microaneurysms (MA), retinal haemorrhages (HEM), exudates (EX), cotton-wool spots (CWS), abnormal new vessels, venous beading, dilations, and segmentations. Diabetic retinopathy is described as either nonproliferative diabetic retinopathy (NPDR) or proliferative diabetic retinopathy (PDR).
The MA, a small, circular red spot on the retina caused by leakage from tiny blood vessels in the eye, is the first clinical manifestation of DR. The development of MAs, which is caused by focal dilation of blood vessels (BVs), is the disease's first major symptom. Once the MAs rupture, dot and blot hemorrhages emerge. Over time, lipids and proteins that leak from the blood form yellow deposits known as exudates. The clinician grades the condition into three levels (mild, moderate, and severe) based on the lesions found in the retina and their extent [6]. Early diagnosis of microaneurysms reduces the occurrence of late complications of DR.

Related Work
MA detection can be achieved with a variety of techniques. In the first phase, image pre-processing may aid identification by eliminating uneven lighting or reducing noise. In the candidate extraction phase, all potential MA candidates are found. Finally, each candidate is classified based on its features [8].
Behdad et al. [14] suggested a multi-scale orientation gradient weighting approach to isolate MA candidates, which uses iterative thresholds based on pixel gradient weights. Local Convergence Index Filter, intensity, and shape features are extracted and fed to a RUSBoost classifier.
A system for detecting MAs in fundus images was proposed by Bo Wu et al. [8]. CLAHE is used as a pre-processing step, followed by peak-value detection and region-growing techniques for candidate extraction. A total of 27 attributes are derived from the profile and the local attributes. Finally, a KNN classifier is used for class prediction.
A method for MA detection based on Singular Spectrum Analysis was suggested by Su Wang et al. [13]. The method includes a multi-layer filtering system that helps isolate the MA candidates. Eleven attributes are then derived directly from the filtered candidates based on their cross-sectional profiles. Finally, a KNN classifier is used for labeling. Sandra Morales et al. [15] propose using a Local Binary Pattern (LBP) texture descriptor to classify fundus images as DR, AMD, or normal. A total of 144 distinguishing features are used, without the need for any vessel segmentation. External CV and the Weka software are used for feature selection and classification.
Usman et al. [9] propose a three-stage detection method for MAs. In the first stage, pre-processing, blood vessel segmentation is carried out on the basis of multi-layered thresholds and a Gabor filter. In the second stage, numerous low-level features are extracted. Finally, to increase overall precision, a hybrid classifier composed of m-Mediods, GMM, and SVM is used.
Shailesh Kumar et al. [10] developed a mechanism to suppress false positives by segmenting the blood vessels and the optic disc. The fovea is also localized. Seven different characteristics based on structure and intensity are obtained for red lesion candidates, and a radial basis function neural network is used as the classifier. In [16], a bagging ensemble classifier is recommended for MA classification. The authors propose three stages. In the image pre-processing stage, a morphological closing operator is used to separate the vessels, and region growing is used to update the MA candidates. Then 70 features are extracted. Finally, the classifier is applied.
Detection of lesions by ant colony optimization is proposed in [11], where Frangi filters are used for blood vessel segmentation. A mathematical morphology-based method for the detection of MAs is proposed in [12], in which both the blood vessels and the optic disc are eliminated by a bottom-hat transformation. Smitha et al. [17] proposed detecting DR using cross-sectional profiles, with an ANN as the classifier.
The main contributions of this paper can be summed up in two points: (i) Most MA recognition methods first segment the retinal features, including blood vessels, and then extract candidates. The suggested technique takes a hybrid approach, in which blood vessels are segmented by K-means while candidate patches are extracted without segmentation. This is because textural information around the candidates cannot be retrieved after segmentation. (ii) A decision-making model that integrates GA-NN classifiers and computes the final prediction by voting consensus.
This article is arranged as follows: Section 3 presents the proposed procedure for identifying microaneurysms. The following section summarizes the experimental findings, along with the results of specific performance assessments. Finally, Section 5 draws conclusions and discusses prospective work.
Fig. 2 displays the flowchart of the approach discussed in this article. To achieve better MA contrast, the green channel of the input RGB image is used. In the first stage, image pre-processing methods are applied to render MAs more visible. The second stage is candidate extraction: the K-means approach is used to segment blood vessels, and, in addition, MA and non-MA patches are obtained from the pre-processed image. Patches are extracted in order to avoid false negatives caused by MAs being lost during blood vessel segmentation. The third stage is feature extraction: shape features are derived from the segmented MA candidates, and the GLCM and LBP characteristics are derived separately from the MA and non-MA patches. These individual feature sets are assigned to three GA-NN classifiers. The final decision is based on the majority result of the GA-NN classifiers.

Image Pre-Processing
Fundus images often suffer from uneven intensity, poor contrast and background clutter. The prime objective of this pre-processing step is to eliminate these imperfections. First, the green plane of the original colour image is selected because it offers the greatest visual contrast between the red lesions and the background. Since uneven background illumination could mask probable lesions, an r-polynomial transformation of pixel intensity is applied [18]. A Gaussian filter of width 5 and standard deviation 1 is then applied to reduce noise. The resulting pre-processed image is shown in Fig. 4.
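The green-plane selection and Gaussian smoothing described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the image is stored as rows of (R, G, B) tuples, interprets "width 5" as a 5 × 5 separable kernel, and replicates edge pixels at the border. The r-polynomial transformation is omitted because its formula is not given in this text. All function names are hypothetical.

```python
import math


def green_channel(rgb_image):
    """Return the green plane of an image stored as rows of (R, G, B) tuples."""
    return [[pixel[1] for pixel in row] for row in rgb_image]


def gaussian_kernel(width=5, sigma=1.0):
    """1-D Gaussian kernel, normalised so that its weights sum to 1."""
    half = width // 2
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_1d(values, kernel):
    """Convolve one row or column with the kernel, replicating edge pixels."""
    half = len(kernel) // 2
    last = len(values) - 1
    out = []
    for i in range(len(values)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - half, 0), last)  # clamp index at the border
            acc += w * values[j]
        out.append(acc)
    return out


def gaussian_smooth(image, width=5, sigma=1.0):
    """Separable Gaussian blur: filter every row, then every column."""
    kernel = gaussian_kernel(width, sigma)
    rows = [smooth_1d(row, kernel) for row in image]
    cols = [smooth_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]
```

A call such as `gaussian_smooth(green_channel(rgb_image))` would then produce the denoised green plane. A production pipeline would more likely rely on an image-processing library; plain lists are used here only to keep the sketch dependency-free.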

Segmentation of blood vessels:
In this step, the vascular structure is isolated from the background using a clustering method. K-means is an unsupervised clustering algorithm [19] that is used to separate the region of interest from the rest of the image [20] [21]. It partitions the data into K clusters, each represented by a centroid. The algorithm consists of five steps:
Step 1: Select the number of clusters K.
Step 2: Pick K random data points as the initial centroids.
Step 3: Assign every point to its nearest cluster centroid, using the Euclidean distance to determine cluster membership.
Step 4: Recalculate the centroid of each newly formed cluster as the mean of the points assigned to it.
Step 5: Repeat steps 3 and 4 until no centroid moves.
The criterion minimized by this procedure is the sum of squared Euclidean distances between each point and the centroid of its cluster. Fig. 5 shows the segmentation output of the K-means algorithm.
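The five steps can be sketched as below for scalar pixel intensities. This is an illustrative sketch under stated assumptions, not the authors' code: the data are a flat list of grey levels, the initial centroids are drawn from the distinct values in the data, and the names `kmeans_1d`, `max_iter` and `seed` are hypothetical.

```python
import random


def kmeans_1d(values, k, max_iter=100, seed=0):
    """Cluster scalar intensities into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Step 2: pick k distinct data values as initial centroids
    # (assumes the data contain at least k distinct values).
    centroids = rng.sample(sorted(set(values)), k)
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Step 3: assign each point to the nearest centroid
        # (the Euclidean distance reduces to |v - c| for scalars).
        labels = [min(range(k), key=lambda j: abs(v - centroids[j]))
                  for v in values]
        # Step 4: recompute each centroid as the mean of its members;
        # an empty cluster keeps its previous centroid.
        new_centroids = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centroids.append(sum(members) / len(members)
                                 if members else centroids[j])
        # Step 5: stop once no centroid moves.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels
```

For vessel segmentation, the flattened pre-processed image would be passed as `values`, and the cluster with the darkest centroid could be taken as the vessel class; that last choice is an assumption, since the text does not say how the vessel cluster is selected.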

A2 Perimeter: Length of the boundary that encircles the candidate region.
A3 Eccentricity: Ratio of the distance between the foci of the ellipse to the length of its major axis. It equals zero for a circular region.
A4 Extent: Ratio of the MA candidate pixels to the pixels in the overall bounding box.
A5 MajorAxisLength: Length of the major axis of the ellipse, in pixels.
A6 MinorAxisLength: Length of the minor axis of the ellipse, in pixels.
A7 Orientation: Angle between the x-axis and the major axis of the ellipse.
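The ellipse-based attributes listed above can be illustrated with second-order central moments of a binary candidate mask. The sketch below is one common way to obtain such descriptors and is not taken from the paper: it assumes the mask is a list of rows of 0/1 values containing a single candidate, uses the ellipse with the same second moments as the region, and omits the perimeter. The function name and dictionary keys are hypothetical.

```python
import math


def shape_features(mask):
    """Extent and ellipse descriptors of the foreground pixels in a 0/1 mask."""
    pixels = [(x, y) for y, row in enumerate(mask)
              for x, v in enumerate(row) if v]
    if not pixels:
        raise ValueError("mask contains no foreground pixels")
    n = len(pixels)
    xs = [p[0] for p in pixels]
    ys = [p[1] for p in pixels]
    # Extent: region pixels divided by bounding-box pixels.
    box_area = (max(xs) - min(xs) + 1) * (max(ys) - min(ys) + 1)
    # Second-order central moments of the pixel coordinates.
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    mu_xx = sum((x - mean_x) ** 2 for x in xs) / n
    mu_yy = sum((y - mean_y) ** 2 for y in ys) / n
    mu_xy = sum((x - mean_x) * (y - mean_y) for x, y in pixels) / n
    # Eigenvalues of the 2x2 covariance matrix give the ellipse axes.
    spread = math.sqrt((mu_xx - mu_yy) ** 2 + 4.0 * mu_xy ** 2)
    lam_major = (mu_xx + mu_yy + spread) / 2.0
    lam_minor = max((mu_xx + mu_yy - spread) / 2.0, 0.0)
    ecc = math.sqrt(1.0 - lam_minor / lam_major) if lam_major > 0 else 0.0
    return {
        "extent": n / box_area,
        "eccentricity": ecc,
        "major_axis_length": 4.0 * math.sqrt(lam_major),
        "minor_axis_length": 4.0 * math.sqrt(lam_minor),
        # Angle of the major axis relative to the x-axis (image rows point down).
        "orientation_deg": math.degrees(0.5 * math.atan2(2.0 * mu_xy,
                                                         mu_xx - mu_yy)),
    }
```

Library implementations of region properties may add small pixel-size corrections to the moments, so their axis lengths can differ slightly from this sketch.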

Vascular Elimination:
After segmentation, morphological closing with a disc-shaped structuring element is performed. It fills the holes formed by the circular dark structures that are MA candidates, and the result is transformed into a binary image. The resulting MA candidates are shown in Fig. 6.

Candidate Patch Extraction:
To avoid false negatives and to analyze the texture characteristics surrounding the lesions, patches were produced from the pre-processed image only, i.e., without blood vessel segmentation. Each MA lesion is represented by a 128 × 128 square window centred on its annotated coordinates. Non-MA patches were extracted at random locations across the image.
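A possible form of this patch extraction is sketched below. It rests on assumptions the text does not state: the image is a list of rows, a window that would cross the image border is shifted back inside the image, and random non-MA windows are rejected when they contain an annotated MA centre. Both function names are hypothetical.

```python
import random


def extract_patch(image, center_row, center_col, size=128):
    """Crop a size x size window centred on the given coordinate.

    Near the border the window is shifted so that it stays inside the image.
    """
    height, width = len(image), len(image[0])
    if height < size or width < size:
        raise ValueError("image is smaller than the patch size")
    top = min(max(center_row - size // 2, 0), height - size)
    left = min(max(center_col - size // 2, 0), width - size)
    return [row[left:left + size] for row in image[top:top + size]]


def random_non_ma_patches(image, ma_centers, count, size=128, seed=0):
    """Draw up to `count` random windows that contain no annotated MA centre."""
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    half = size // 2
    patches = []
    attempts = 0
    # Cap the number of attempts so the loop ends even on crowded images.
    while len(patches) < count and attempts < 100 * count:
        attempts += 1
        row = rng.randrange(half, height - half + 1)
        col = rng.randrange(half, width - half + 1)
        top, left = row - half, col - half
        overlaps = any(top <= r < top + size and left <= c < left + size
                       for r, c in ma_centers)
        if not overlaps:
            patches.append(extract_patch(image, row, col, size))
    return patches
```

With the default `size=128`, `extract_patch(image, r, c)` corresponds to the 128 × 128 window around an annotated MA at row `r` and column `c`.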

Feature Extraction
Seven shape-based features were derived from the lesion candidates in the segmented image. The Gray Level Co-occurrence Matrix (GLCM) [22] and Local Binary Pattern (LBP) [23] features were extracted separately from the patches. A description of these features is provided in Table 1.
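As an illustration of the two texture descriptors, the sketch below computes a basic 3 × 3 LBP histogram and a normalised GLCM with three statistics derived from it. The neighbourhood size, the pixel offset, and the choice of contrast, energy and homogeneity are assumptions made for the example; the text does not specify which LBP variant or GLCM statistics were used. All function names are hypothetical.

```python
def lbp_histogram(image):
    """256-bin histogram of basic 3x3 LBP codes over the interior pixels."""
    # Eight neighbours, visited clockwise from the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # A neighbour at least as bright as the centre sets its bit.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    return hist


def glcm(image, levels, dr=0, dc=1):
    """Normalised co-occurrence matrix for the pixel offset (dr, dc).

    Pixel values must be integers in range(levels).
    """
    height, width = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(height):
        for c in range(width):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < height and 0 <= c2 < width:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]


def glcm_statistics(p):
    """Contrast, energy and homogeneity of a normalised GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
    }
```

In practice the grey levels of a patch would be quantised to a small number of levels before the GLCM is built, and the LBP histogram would usually be normalised by the number of pixels.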

Classification
The global search capacity of the GA is exploited to evolve the initial weights and biases of the neural network. The strategy has two distinct phases: GA-assisted adaptation of the initial weights and biases in the first phase, and fine-tuning driven by the back-propagation algorithm in the second. The GA-NN implemented for this classification problem is inspired by the work of [24]. The flowchart of the GA-based weight configuration is shown in Fig. 7.
The GA is a population-based global search procedure. It starts from a population of chromosomes whose genes are the randomly initialized weights and biases of the neural network. Each chromosome's fitness is measured using the fitness function, here the root mean square error (RMSE) between the actual and expected outputs. Three stochastic evolutionary operators are applied during each iteration of the algorithm: selection, crossover, and mutation. Initially, 100 random chromosomes are generated. The fitness function is then computed as the RMSE of the untrained neural network's output. Parents are chosen using roulette wheel selection. Two-point crossover is applied with Pc=1, which means that recombination is performed on each iteration. A Gaussian mutation operator with a mutation probability of Pm=0.2 is employed.
Each neural network consists of an input layer, one hidden layer and an output layer. The first GA-NN classifier has 7 input neurons, 5 hidden neurons and one output neuron, which decides whether the candidate is an MA or a non-MA. The number of inputs of the other GA-NN classifiers varies with the number of attributes. Network training, which updates the weights and biases, is based on Levenberg-Marquardt optimization. The fitness of each mutated child is measured as the RMSE of the NN output. The RMSE of the NN initialized with random weights and biases is also calculated. If the difference between the two RMSE values is less than 0.01, the iteration stops; otherwise control returns to the GA for another round of iteration.
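The three evolutionary operators can be sketched as follows. The sketch uses the stated Pc=1 and Pm=0.2, but several details are assumptions because the text does not give them: roulette slices are taken proportional to the inverse of the RMSE (so that lower error means higher selection probability), Pm is applied per gene, and the mutation noise has standard deviation 0.1. All function names are hypothetical.

```python
import random


def roulette_wheel_select(population, errors, rng):
    """Pick one chromosome; a lower error gets a larger slice of the wheel."""
    weights = [1.0 / (e + 1e-12) for e in errors]
    pick = rng.uniform(0.0, sum(weights))
    running = 0.0
    for chromosome, weight in zip(population, weights):
        running += weight
        if pick <= running:
            return chromosome
    return population[-1]  # guard against floating-point round-off


def two_point_crossover(parent1, parent2, rng):
    """Swap the gene segment between two random cut points."""
    a, b = sorted(rng.sample(range(1, len(parent1)), 2))
    child1 = parent1[:a] + parent2[a:b] + parent1[b:]
    child2 = parent2[:a] + parent1[a:b] + parent2[b:]
    return child1, child2


def gaussian_mutation(chromosome, rng, pm=0.2, sigma=0.1):
    """Add zero-mean Gaussian noise to each gene with probability pm."""
    return [g + rng.gauss(0.0, sigma) if rng.random() < pm else g
            for g in chromosome]


def next_generation(population, errors, rng, pc=1.0, pm=0.2):
    """Build a new population of the same size from the current one."""
    children = []
    while len(children) < len(population):
        p1 = roulette_wheel_select(population, errors, rng)
        p2 = roulette_wheel_select(population, errors, rng)
        if rng.random() < pc:  # with pc = 1 recombination always happens
            c1, c2 = two_point_crossover(p1, p2, rng)
        else:
            c1, c2 = list(p1), list(p2)
        children.append(gaussian_mutation(c1, rng, pm))
        children.append(gaussian_mutation(c2, rng, pm))
    return children[:len(population)]
```

Here `errors` would hold the RMSE of the network decoded from each chromosome, and `rng` is a `random.Random` instance so that runs are reproducible.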
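For the 7-5-1 network, a chromosome that encodes every weight and bias has 7·5 + 5 + 5·1 + 1 = 46 genes. The sketch below decodes such a chromosome, evaluates the network and computes the RMSE fitness and the 0.01 stopping test. The gene ordering, the tanh hidden activation and the logistic output are assumptions, since the text does not specify them; the function names are hypothetical.

```python
import math

N_IN, N_HID, N_OUT = 7, 5, 1
# 35 input-to-hidden weights + 5 hidden biases + 5 output weights + 1 output bias
CHROMOSOME_LENGTH = N_IN * N_HID + N_HID + N_HID * N_OUT + N_OUT


def decode(chromosome):
    """Split a flat gene list into weights and biases (assumed ordering)."""
    if len(chromosome) != CHROMOSOME_LENGTH:
        raise ValueError("unexpected chromosome length")
    pos = N_IN * N_HID
    w_hidden = [chromosome[h * N_IN:(h + 1) * N_IN] for h in range(N_HID)]
    b_hidden = chromosome[pos:pos + N_HID]
    pos += N_HID
    w_out = chromosome[pos:pos + N_HID]
    b_out = chromosome[pos + N_HID]
    return w_hidden, b_hidden, w_out, b_out


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def forward(chromosome, features):
    """Network output in (0, 1) for one 7-element feature vector."""
    w_hidden, b_hidden, w_out, b_out = decode(chromosome)
    hidden = [math.tanh(sum(w * x for w, x in zip(ws, features)) + b)
              for ws, b in zip(w_hidden, b_hidden)]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def rmse(chromosome, samples, targets):
    """Root mean square error between network outputs and target labels."""
    squared = [(forward(chromosome, x) - t) ** 2
               for x, t in zip(samples, targets)]
    return math.sqrt(sum(squared) / len(squared))


def should_stop(rmse_a, rmse_b, tolerance=0.01):
    """Stopping test: the two RMSE values differ by less than the tolerance."""
    return abs(rmse_a - rmse_b) < tolerance
```

The Levenberg-Marquardt fine-tuning phase is not reproduced here; this sketch covers only the fitness evaluation used by the GA.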

Decision Logic
The three GA-NN classifiers are each trained and evaluated on one of these individual feature sets. To boost the classification accuracy, a final consensus stage is included. This module uses the plurality decision of the classifiers to reach the final prediction.
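The voting stage can be sketched as below, assuming each classifier emits a label such as "MA" or "non-MA" for every candidate. The function names and label strings are illustrative and not taken from the paper.

```python
from collections import Counter


def plurality_vote(labels):
    """Return the label predicted by the largest number of classifiers."""
    return Counter(labels).most_common(1)[0][0]


def vote_over_samples(shape_preds, glcm_preds, lbp_preds):
    """Combine the per-sample predictions of the three GA-NN classifiers."""
    return [plurality_vote(votes)
            for votes in zip(shape_preds, glcm_preds, lbp_preds)]
```

With three classifiers and two classes a tie cannot occur, so the plurality label is always a strict majority.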

Experiment and Results
Two public datasets, e-ophtha MA [25] and DIARETDB1 [26], were employed to validate the proposed classification method.
The resolution of images in the e-ophtha database varies from 1440 × 960 pixels to 2544 × 1696 pixels, with a field of view (FOV) of 45 degrees. There are 233 normal images and 148 images with MA. All images are annotated by one ophthalmologist and reviewed by another. In the ground truth (GT), MA pixels are coloured white. The 148 images with MA are split into 103 images for training and 45 for evaluation. 345 patches were extracted at labelled MA locations, while 400 patches were extracted at random locations not marked as MA. The DIARETDB1 archive contains 89 colour fundus images, each with a resolution of 1500 × 1152 pixels and a FOV of 50 degrees. Four experts independently labelled MAs in the DIARETDB1 database. 59 images are used for training and 30 for testing. For both lesions and non-lesions, 250 patches have been extracted.
The Receiver Operating Characteristic (ROC) curve and its associated Area Under the Curve (AUC) are used to evaluate the model's performance. The x-axis of the ROC curve is the false positive rate (FPR), the proportion of all negative samples that are classified as positive. The y-axis is the sensitivity, also known as the true positive rate (TPR), the proportion of all positive samples that are classified as positive. The ROC curves of the final decision classifier, i.e., the decision logic, for the two datasets are shown in Figures 8 and 9. On both the e-ophtha MA and DIARETDB1 test sets, the proposed MA detection performed well, with AUCs of 0.89 and 0.87, respectively.
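To make the metric concrete, the sketch below builds an ROC curve by sweeping a decision threshold over classifier scores and integrates it with the trapezoidal rule. It assumes labels of 1 for MA and 0 for non-MA and a score that increases with MA likelihood; the function names are hypothetical and this is not the authors' evaluation code.

```python
def roc_curve(scores, labels):
    """Return (fpr, tpr) lists obtained by lowering the threshold step by step."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with equal scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / negatives)  # FPR = FP / (FP + TN)
        tpr.append(tp / positives)  # TPR = TP / (TP + FN)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2.0
               for i in range(1, len(fpr)))
```

An AUC of 1.0 corresponds to perfect separation of MA and non-MA samples, while 0.5 corresponds to random ranking.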

Conclusion
This paper outlines a process for detecting MAs in RGB fundus images. The CAD process consists of multiple phases: image processing to enhance the visibility of the lesions, candidate extraction, feature extraction, classification, and a final prediction unit that makes the decision. A polynomial contrast enhancement technique is used for image pre-processing, rather than the traditional CLAHE procedure. In the candidate extraction phase, blood vessels are eliminated to avoid correlated structures. Since MAs are small structures, segmentation is likely to remove them, so the image patches are collected directly from the pre-processed image. A comprehensive feature vector is created for each candidate, consisting of three feature descriptors: shape features, GLCM, and LBP. Each feature set is used to train and validate one of three distinct GA-NN classifiers. To ensure the consistency of the classification system, a final assessment unit is used. This module uses the consensus decision of the classifiers to generate the final prediction. The proposed detection algorithm performed well on the e-ophtha MA and DIARETDB1 datasets, yielding AUCs of 0.89 and 0.87, respectively.
Future work involves exploring new features such as the Local Ternary Pattern (LTP) and other texture features for distinguishing MAs from false positives (FPs), as well as implementing a hybrid classifier for classification.