Recognition of Tuberculosis Through Image Modalities

This project describes a process for recognizing lung tuberculosis through several stages: filtering, lung boundary detection, extraction of tuberculosis regions, and performance evaluation. Lung tuberculosis is a bacterial infection caused by Mycobacterium tuberculosis (M. tuberculosis) and is a major cause of death around the world. The project aims to detect tuberculosis at minimal cost. First, X-ray images of infected persons are acquired and given as input. Preprocessing is then applied: a filter removes unnecessary noise and helps to obtain clearer images. The preprocessed output is given as input to lung boundary detection, and the detected boundaries are fused to obtain the Lung Boundary Detection (LBD) result. From the LBD result, features such as area and major and minor axes are extracted. Through these steps we determine whether TB is present in the X-ray.


Introduction
In the modern era many tasks are automated, and machines can increasingly work without human intervention. Health care is essential to human life. In our project we detect lung tuberculosis using X-ray images [1]. Tuberculosis is a deadly disease that causes a large number of deaths around the world. Pulmonary tuberculosis is a bacterial infection that causes more deaths than other infectious diseases [2]. It is caused by Mycobacterium tuberculosis, also known as TB (tubercle bacillus). A recent survey reports that tuberculosis affects about 25-30% of the global population, with around 3 million new cases. Tuberculosis cases are reported worldwide, including in the United States of America. The disease is fatal when left untreated. Its symptoms are chronic cough, prolonged fever, cough with bloody mucus, and sudden weight loss [3]. Smoking is a risk factor. There are several types of tuberculosis: active tuberculosis, latent tuberculosis, and miliary tuberculosis. Active tuberculosis spreads easily through the air and is transmitted from person to person. Miliary tuberculosis occurs when the bacteria find their way into the bloodstream [4]; the bacteria then spread easily through the body and attack various organs. In latent tuberculosis the infection is very difficult to detect because the patient has no symptoms. This project develops an application for the detection of pulmonary tuberculosis, using Python (machine learning) as the project tool. The output is obtained using the VGG algorithm [5]. The remaining details are discussed below.

Literature Review
Tuberculosis can be detected with different diagnostic methods. In sputum microscopy, TB cells are identified and counted in microscopic images, but the chest X-ray technique gives a higher accuracy level; moreover, counting cells under the microscope adds complexity [6]. Images of both tuberculosis-affected and non-infected patients are obtained from the database so that the results can be compared.
With the aid of image-processing tools, the relevant information can be extracted from the lung region of the chest X-ray (CXR) image. Image processing involves various procedures; lung boundary detection uses thresholding [7]. Thresholding segments the image by separating white pixels from black pixels, so the regions can be clearly delineated. Edge detection techniques can readily detect the outer boundary of the lung [8]; pixel counts are then computed from the pixels on the boundaries and in the septum region.
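The boundary step described above can be illustrated with a minimal sketch. It assumes a binary lung mask is already available (for example from thresholding) and marks as boundary every foreground pixel that touches the background. This is a simple morphological boundary, not the specific edge detector of [8], and the function name mask_boundary is hypothetical.

```python
import numpy as np


def mask_boundary(mask):
    """Return the foreground pixels that touch the background (4-neighbourhood)."""
    mask = np.asarray(mask, dtype=bool)
    # Pad with background so that pixels on the image border count as boundary.
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    # A pixel is interior when all four direct neighbours are also foreground.
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    return mask & ~interior
```

Counting the True entries of the mask gives the region area in pixels, and counting the True entries of the returned array gives the number of boundary pixels.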

Methodology
• Image Acquisition
• Pre-Processing
• Feature Extraction
• Classification

Image Acquisition
The chest X-ray dataset used in this work is open source and publicly available. It contains two folders: normal chest X-rays and chest X-rays with tuberculosis [9].
The dataset has 3500 normal chest X-rays and 3500 tuberculosis chest X-rays. The tuberculosis images are collected from the NLM dataset, the Belarus dataset, the Japanese dataset, the TB dataset, and the RSNA CXR dataset. Figure 1 and Figure 2 show a normal CXR image and a tuberculosis CXR image, respectively.
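A two-folder layout like this can be turned into labelled samples with a few lines of Python. The sketch below is illustrative only: the folder names "Normal" and "Tuberculosis", the accepted file suffixes, and the function name list_labelled_images are assumptions, not details taken from the dataset.

```python
from pathlib import Path

# Hypothetical folder names; the actual names in the dataset may differ.
CLASS_FOLDERS = {"Normal": 0, "Tuberculosis": 1}
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def list_labelled_images(root):
    """Return (path, label) pairs for every image found under the class folders."""
    samples = []
    for folder, label in CLASS_FOLDERS.items():
        class_dir = Path(root) / folder
        if not class_dir.is_dir():
            continue  # skip a missing class folder instead of failing
        for path in sorted(class_dir.iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, label))
    return samples
```

With 3500 images in each folder, the returned list would hold 7000 pairs, with label 0 for normal and label 1 for tuberculosis.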

Pre-Processing
Preprocessing is required to avoid poor classifier performance. The dataset is heterogeneous: the images differ in size (width and height), so they have to be resized [10]. In the first step each image is resized to 224 × 224. The image is then converted to grayscale and noise is removed. Figure 3 shows the chest X-ray dataset with labels.
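The three preprocessing steps can be sketched with NumPy alone. This is a minimal illustration under stated assumptions: the text does not name the resampling method or the noise filter, so nearest-neighbour resizing and a 3 × 3 median filter are used here as stand-ins, and all function names are hypothetical.

```python
import numpy as np


def resize_nearest(image, size=224):
    """Resize an H x W (x C) array to size x size by nearest-neighbour sampling."""
    height, width = image.shape[:2]
    rows = np.arange(size) * height // size
    cols = np.arange(size) * width // size
    return image[rows[:, None], cols]


def to_grayscale(image):
    """Collapse an RGB image to one channel using the ITU-R BT.601 luma weights."""
    if image.ndim == 2:
        return image.astype(np.float32)
    weights = np.array([0.299, 0.587, 0.114])
    return (image[..., :3] @ weights).astype(np.float32)


def median_filter3(image):
    """Apply a 3 x 3 median filter; borders are handled by replicating edge pixels."""
    padded = np.pad(image, 1, mode="edge")
    h, w = image.shape
    windows = np.stack([padded[r:r + h, c:c + w]
                        for r in range(3) for c in range(3)])
    return np.median(windows, axis=0)


def preprocess(image):
    """Resize to 224 x 224, convert to grayscale, then suppress impulse noise."""
    return median_filter3(to_grayscale(resize_nearest(image)))
```

A median filter is a common choice for isolated bright or dark pixels because it replaces each pixel with the middle value of its neighbourhood; a Gaussian or bilateral filter would be an equally plausible reading of "noise is removed".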

Feature Extraction
For feature extraction, the VGG16 convolutional neural network is used. The VGG architecture has 16 layers. The input is of fixed size 224 × 224, and the image is passed through a stack of convolution layers. The network consists of convolution layers, max pooling layers, and fully connected layers [11]. It has few hyperparameters and a simple architecture. The VGG model uses a filter size of 3 × 3, with a stride of 1 in the convolution layers and a stride of 2 in the max pooling layers [12].
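The effect of these filter sizes and strides on the feature map can be checked with simple arithmetic. The sketch below traces the spatial size through the convolutional part of the standard VGG16 configuration; the padding of 1 pixel and the 2 × 2 pooling window are assumptions taken from the usual VGG16 definition rather than from the text above, and the function names are hypothetical.

```python
# Standard VGG16 layout: numbers are output channels of 3 x 3 convolutions,
# "M" marks a max pooling layer.
VGG16_LAYERS = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
                512, 512, 512, "M", 512, 512, 512, "M"]


def conv_output_size(size, kernel=3, stride=1, padding=1):
    """Spatial output size of a convolution (padding=1 is an assumption)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output_size(size, kernel=2, stride=2):
    """Spatial output size of a max pooling layer (2 x 2 window is an assumption)."""
    return (size - kernel) // stride + 1


def trace_feature_map(size=224):
    """Return (spatial size, channels) after each layer of the convolutional stack."""
    trace = []
    channels = 3
    for layer in VGG16_LAYERS:
        if layer == "M":
            size = pool_output_size(size)
        else:
            size = conv_output_size(size)
            channels = layer
        trace.append((size, channels))
    return trace
```

Under these assumptions each convolution keeps the spatial size and each pooling layer halves it, so a 224 × 224 input shrinks to 7 × 7 with 512 channels before the fully connected layers. The 13 convolution layers in the list plus 3 fully connected layers give the 16 weight layers of the name.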

Classification
In our work the trained VGG model is saved with Keras [13]. The last step is to predict whether the image shows tuberculosis or is normal. The saved VGG model is used for the prediction; the prediction output contains two values, the first for Normal and the second for Tuberculosis [14]. The class with the greater value is displayed as the result. Figure 4 and Figure 5 show the graphs of model accuracy and model loss.
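The decision rule on the two output values can be written as a small helper. This is a sketch only: the function name decide is hypothetical, and the scores are assumed to arrive as a two-element sequence in the order given in the text (Normal first, Tuberculosis second).

```python
CLASS_NAMES = ("Normal", "Tuberculosis")  # order follows the text: Normal first


def decide(scores):
    """Return the label with the greater score, together with that score."""
    if len(scores) != len(CLASS_NAMES):
        raise ValueError("expected one score per class")
    index = max(range(len(scores)), key=lambda i: scores[i])
    return CLASS_NAMES[index], scores[index]
```

In a Keras workflow the scores would typically be one row of the array returned by the saved model's predict method for a preprocessed image batch; the exact loading and prediction code depends on how the model was saved.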

Conclusion
In this project we described the detection of pulmonary tuberculosis from X-ray images using Python. Unwanted noise is removed from the images with filtering techniques in the initial stage, and the image contrast is adjusted using CLAHE (Contrast Limited Adaptive Histogram Equalization). Next, the lung regions are detected by segmentation. For segmentation we use Otsu's threshold because it adaptively defines a global threshold value. Feature extraction follows, in which features such as area and major and minor axes are calculated. Finally, an SVM (Support Vector Machine) classifier is applied; it uses a technique called the kernel to transform the data and find a suitable boundary between the output classes. With this approach we obtain an accuracy above 90%.
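Otsu's method picks the global threshold that maximises the variance between the two pixel classes it creates. The following is a minimal NumPy sketch of that idea for 8-bit grayscale images; it is not the project's own implementation, and the function name otsu_threshold is hypothetical.

```python
import numpy as np


def otsu_threshold(image):
    """Return the 8-bit grey level that maximises the between-class variance."""
    hist = np.bincount(image.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)

    weight_low = np.cumsum(prob)           # probability of pixels <= t
    weight_high = 1.0 - weight_low         # probability of pixels > t
    cum_mean = np.cumsum(prob * levels)    # cumulative first moment up to t
    total_mean = cum_mean[-1]

    # Between-class variance for every candidate threshold t.
    denom = weight_low * weight_high
    with np.errstate(divide="ignore", invalid="ignore"):
        variance = (total_mean * weight_low - cum_mean) ** 2 / denom
    variance[denom == 0] = 0.0             # thresholds that leave one class empty
    return int(np.argmax(variance))
```

A binary mask then follows from comparing the image with the returned value, for example image > otsu_threshold(image); which side of the threshold corresponds to the lung region depends on the polarity of the X-ray.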