Hybrid UNet Architecture based on Residual Learning of Fundus Images for Retinal Vessel Segmentation

This paper presents a new segmentation technique for retinal blood vessels in fundus images. The technique aims at extracting thin vessels and reducing the intensity difference between thick and thin vessels. We propose a modified UNet model that incorporates ResNet blocks and performs structured prediction. We generate visualizations of blood vessels from retinal fundus images for two loss functions, cross-entropy loss and Dice loss, where the network classifies several pixels simultaneously. The results show higher accuracy from this more expressive UNet variant, which outperforms past algorithms for retinal vessel segmentation. The benefits of the approach are demonstrated empirically.


Introduction
The study and analysis of retinal vessel segmentation (RVS) is the basis of medical applications related to the early scrutiny of retinal diseases that can lead to vision loss. Early diagnosis can prevent vision loss, whose main causes are diabetic retinopathy and maculopathy. Computerized image analysis is applied to a variety of medical applications, and advanced machine-learning techniques are now applied across many domains of medical imaging. These approaches outperform current practice based on classical image-processing procedures. The main advantage of applying leading-edge machine learning to classification and segmentation is the ability to train on huge databases, which makes such approaches more effective than previously implemented techniques. Deep convolutional neural network (D-CNN) frameworks were devised for diverse imaging tasks on ordinary images. In retinal imaging, D-CNNs are used to detect vessels, but their efficiency subsides as the arteries and veins become narrower. This experiment targets vessel extraction from retinal fundus images using a modified UNet architecture. Openly accessible fundus-image datasets (DRIVE, STARE, CHASEDB1 and HRF) can be used to evaluate retinal vessel segmentation algorithms. This work focuses on the Digital Retinal Images for Vessel Extraction (DRIVE) dataset.

Related Work
Small changes in thin retinal vessels that cannot be seen by the naked eye can only be detected using computer algorithms, and these findings can be translated into something beneficial to ophthalmologists. To this end we surveyed research papers describing recent techniques and developments in retinal imaging. Lei Zhou et al. [1] proposed methods to improve the dense CRF model for fundus retinal vessel segmentation; their technique outstrips the others when assessed by F1-score, Matthews correlation coefficient and G-mean, achieving higher network performance. X. R. Gao et al. [18] addressed the robustness of segmented micro-vessels, noting that the capability of geometric-transformation modelling still comes from hand-crafted features and data augmentation, and that the up-sampling layers of the decoder cannot recover the details lost during encoding. C. Qingcui et al. [19] proposed Local Adaptive Gamma Correction, conducting gamma matching according to the different pixel features of blood vessels and background so as to suppress uneven illumination and central-line reflection and to correct the contrast of different regions in retinal images. J. Staal et al. [20] presented automatic ridge-based vessel segmentation of retinal images for diabetic-retinopathy screening; they partitioned images into patches, computed a feature vector for each, and used a kNN classifier, working on the Utrecht and Hoover databases with the goal of classifying every image pixel as vessel or non-vessel.

Challenges in Retinal Imaging
Retinal vessel segmentation (RVS) is the primary step preceding the processing and identification of eye diseases. Much research has been conducted on RVS, but automatic classification of the segmented vessels has received less attention. Several challenges make RVS difficult. The first is the low contrast of fundus images: different blood vessels have different contrast with the background, thicker vessels having higher contrast than thinner ones. The second is inhomogeneous lighting of the background, caused by the imaging process itself. Veins are generally thicker and darker than arteries, yet in many cases it is difficult to distinguish arteries from veins. In low-contrast retinal images the vessels in peripheral areas appear blackish because of the overshadowing effect of heterogeneous illumination. Under such lighting conditions arteries and veins look nearly identical, which leads to misclassification.
Mohammad et al. [11] discussed the constraints and conventional proposals of deep learning for the segmentation of various organs. To improve segmentation accuracy, deep learning models should be able to tackle sophisticated structures, and achieving this accuracy requires a generous amount of labelled instances for training. Collecting massive annotated datasets of medical images, and annotating them, is an expensive and tedious task. Several solutions have been proposed for the limited-annotation problem. The most common is data augmentation. Transfer learning is an alternative; in contrast to augmentation it imparts an explicit solution but requires more parameters. Another approach is patch-wise training, in which the image is decomposed into several overlapping or random patches; higher accuracy is witnessed with overlapping patches and lower accuracy with random patching. Random patching also suffers from class imbalance and is not advisable for small-organ segmentation. A hybrid approach to these issues is to use unsupervised learning methods to derive additional precise annotations from weakly labelled data, which can then be used to train the network. In 3D cases, where fully annotated data is not always available, sparsely labelled data must be used; the weights of unlabelled samples are set to zero so that the network learns only from the labelled data.
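As an illustration of the patch-wise training idea discussed above, here is a minimal sketch of overlapping patch extraction. The function name, patch size and stride are our own illustrative assumptions, not values the paper specifies:

```python
import numpy as np

def extract_patches(image, patch_size=48, stride=24):
    """Slide a window over the image; stride < patch_size gives overlapping patches."""
    h, w = image.shape[:2]
    patches = []
    for top in range(0, h - patch_size + 1, stride):
        for left in range(0, w - patch_size + 1, stride):
            patches.append(image[top:top + patch_size, left:left + patch_size])
    return np.stack(patches)
```

With the stride equal to half the patch size, each interior pixel is covered by several patches, which is one reason overlapping patching tends to outperform random patching.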
In medical imaging the anatomy of the region of interest (ROI), which occupies only a small portion of the image, is of great importance in anomaly or lesion detection. Hence most of the extracted patches belong to the background region, and training with such a dataset biases the network towards the background and traps it in local minima. Sample reweighting, in which higher weight is applied to foreground patches during training, is one solution; automatic sample reweighting can be achieved with a Dice loss layer based on the Dice coefficient. Still, its effectiveness is limited under extreme class imbalance, so a mechanism that draws an approximately balanced number of patches from background and foreground should be in place. It is also noted that the presence of haemorrhages in the retina makes blood vessel segmentation more challenging.

Proposed Methodology
The entire process is treated as two separate stages: training and testing. We apply thresholding to the labelled images, mapping pixel levels in the range 127 to 255 to 1 and levels below 127 to 0, and convert them into single-channel grayscale images. All images and their labels are normalized and rescaled to 512 × 512. The residual UNet model has an encoder-decoder architecture. Input images are processed by a Conv2D layer followed by two residual blocks, as shown in Figure 1, which serve as hierarchical feature extractors producing more representative feature maps. A ResNet block consists of batch normalization in conjunction with a pair of subsequent convolutional layers. Batch normalization plays the role of a regularizer that accelerates discriminative feature learning and enables training speed-ups from higher learning rates. An activation layer is applied after each Conv2D module in order to add non-linearity.
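The label thresholding and normalization steps just described can be sketched in a few lines of NumPy. The function names, and the nearest-neighbour choice for the 512 × 512 rescale, are our own illustrative assumptions:

```python
import numpy as np

def binarize_label(label):
    # pixel levels in [127, 255] -> 1, levels below 127 -> 0
    return (np.asarray(label) >= 127).astype(np.uint8)

def normalize(image):
    # scale intensities into [0, 1]
    img = np.asarray(image, dtype=np.float32)
    return (img - img.min()) / (img.max() - img.min() + 1e-8)

def resize_nearest(image, size=(512, 512)):
    # nearest-neighbour rescale to the network input size
    h, w = image.shape[:2]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return image[rows][:, cols]
```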
The batch-normalization layer output is channelled through to the concluding convolutional layer of the ResNet block, so the block acts as a feedback nexus that utilizes the residue from the preceding layer. A drop-out layer is added after each max-pooling layer.
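A conceptual sketch of the ResNet block described above, in plain NumPy with matrix multiplication standing in for Conv2D; this is a toy illustration of the batch-normalization and skip-connection structure, not the actual network code:

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # normalize each feature over the batch dimension (regularizing effect)
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, w1, w2):
    # BN -> "conv" -> ReLU -> "conv", then add the block input back in:
    # the layers learn only a residual correction to x
    h = batch_norm(x)
    h = relu(h @ w1)
    h = h @ w2
    return relu(h + x)  # skip connection
```

The identity shortcut is what lets gradients flow past the convolutional pair, which is the property the paper relies on when stacking residual blocks in the encoder.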

Results
We used the publicly available DRIVE retinal image database for training and evaluated our model on the DRIVE, HRF and CHASEDB1 datasets. The model was trained with binary cross-entropy and Dice coefficient loss functions for 100 epochs. We also experimented on the HRF and CHASEDB1 datasets, as shown in Figures 3 and 4 respectively, and tested images from HRF and CHASEDB1 with the model trained on DRIVE. We evaluated the sensitivity and specificity of our method for the DRIVE, HRF and CHASEDB1 datasets. As training progresses, training accuracy increases and training loss decreases. The training accuracy and loss curves for CHASEDB1 are shown in Figures 5(a) and (b), and those for HRF in Figures 5(c) and (d). We investigated our scheme with respect to sensitivity (Se) and specificity (Sp), as shown in Table I. Being simultaneously highly sensitive and highly specific is often impossible; generally there is a trade-off. Extensive testing shows that certain subjects are apparently healthy, certain are apparently unhealthy, and certain lie between the two, so criteria must be set for positive and negative results. The confusion matrix, also known as the error matrix, is frequently used to depict the behaviour of a model on a test dataset for which the actual values are known.
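The two training objectives named above can be written down directly. Below is a minimal NumPy sketch of binary cross-entropy and Dice loss; the `smooth` stabilizer term is a common convention and our own assumption, not a detail the paper states:

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    p = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return -np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))

def dice_loss(y_true, y_pred, smooth=1.0):
    # 1 - Dice coefficient; measures overlap between prediction and label
    intersection = np.sum(y_true * y_pred)
    dice = (2.0 * intersection + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)
    return 1.0 - dice
```

Dice loss optimizes overlap directly and is less affected by the vessel/background class imbalance than pixel-wise cross-entropy, which is one motivation for comparing the two.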
The HRF and CHASEDB1 datasets were tested with the model trained on DRIVE and their performance indicators recorded, so we can state that the algorithm works well on unseen data. We also trained on the DRIVE, HRF and CHASEDB1 datasets separately and recorded their performance indicators. Figures 2(a), (b) and (c) show a test image from the DRIVE dataset, its ground-truth image, and the resulting segmentation map, respectively.

Performance Evaluation
The intended computation is assessed with regard to sensitivity (Se), specificity (Sp), accuracy (Acc), F1-score, G-mean (G) and Matthews correlation coefficient (MCC), as shown in Table I. Sensitivity measures the potential of the model to accurately predict vessels, and specificity its potential to accurately predict non-vessels. G-mean is the square root of the product of the class-wise sensitivities; this indicator maximizes the accuracy of each class while keeping the accuracies balanced. Positive predictive value (PPV) is the probability that correctly discovered vessels are actually vessels. The Matthews correlation coefficient (MCC) is the correlation between the predicted output and the ground truth.
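All of the indicators above derive from the confusion-matrix counts. A small NumPy helper (our own sketch, using the standard textbook formulas) makes the definitions concrete:

```python
import numpy as np

def segmentation_metrics(y_true, y_pred):
    """Compute Se, Sp, Acc, PPV, F1, G-mean and MCC from binary masks."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = np.sum(y_true & y_pred)
    tn = np.sum(~y_true & ~y_pred)
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    se = tp / (tp + fn)                      # sensitivity (vessel recall)
    sp = tn / (tn + fp)                      # specificity (non-vessel recall)
    acc = (tp + tn) / (tp + tn + fp + fn)
    ppv = tp / (tp + fp)                     # positive predictive value
    f1 = 2 * ppv * se / (ppv + se)
    gmean = np.sqrt(se * sp)                 # root of product of class-wise sensitivities
    mcc = (tp * tn - fp * fn) / np.sqrt(
        float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return dict(Se=se, Sp=sp, Acc=acc, PPV=ppv, F1=f1, G=gmean, MCC=mcc)
```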
It is recorded that the intended computation is superior to the other contemporary schemes, even though retinal vessel segmentation in fundus images is especially challenging. The algorithm is capable of detecting micro-vessels in low-contrast images as well as blood vessels with central reflex. The findings of other researchers are compared with our hybrid method in Table I, where our method is seen to be superior to the others. The algorithm correctly detects vessels with Se of 0.7543, 0.7233 and 0.7403 for the DRIVE, HRF and CHASEDB1 datasets respectively, making it useful for practical applications. It predicts non-vessels with Sp of 0.9927, 0.9871 and 0.9894 respectively. The accuracy recorded is 0.9778, 0.9698 and 0.9839 respectively. The PPV recorded is 0.8956, 0.8266 and 0.8391 respectively. The F1-score is 0.8285, 0.7715 and 0.7867 respectively. The G-mean score recorded is 0.8622, 0.8450 and 0.8558 respectively. The correlation (MCC) between the segmentation output and ground truth is recorded as 0.8685, 0.8364 and 0.8445 respectively.

Conclusion and Future Scope
This research article introduces a modified UNet-based computation for retinal blood vessel segmentation of fundus images. The findings are demonstrated experimentally on the openly accessible datasets mentioned in Section I, and the method outperforms state-of-the-art algorithms. The algorithm also works in challenging situations such as low-contrast imaging of microvasculature and vessels in the presence of abnormality. We explored these experiments on unseen retinal data: we trained on the DRIVE dataset and tested HRF and CHASEDB1 against that trained model, and we also trained on the three datasets separately, recording their performance indicators in Table I. The accuracy recorded with the proposed method for the DRIVE, HRF and CHASEDB1 datasets is 0.9778, 0.9698 and 0.9839 respectively.
The proposed algorithm can further be exploited in other medical applications for segmenting vessel-like structures. Effective and true candidate lesion detection depends on the accuracy of retinal vessel segmentation, which may be used in practical research or medical treatment. Ophthalmologists will benefit as deep learning algorithms can be used for automatic retinal vessel segmentation and classification, helping them take correct decisions.