Pathological grading of Hepatocellular Carcinomas in MRI using a LASSO algorithm

To investigate the ability of a radiomics signature to predict the preoperative pathological grade of hepatocellular carcinoma (HCC), non-contrast MRI images were analysed and radiomics features were used to predict the clinical outcome. The study included 170 consecutive patients (training dataset: n = 125; validation dataset: n = 45). Variable selection via LASSO combined with logistic regression was used to select the most predictive radiomics features for pathological grading. Cross-validation with receiver operating characteristic (ROC) analysis was performed, and the area under the ROC curve (AUC) was used as the performance metric. Overall, the radiomics features showed statistically significant correlations with pathological grading. Combining T1WI and T2WI data improved prediction performance, giving an AUC of 0.829 in the training dataset and 0.742 in the validation dataset. The radiomics signature was thus developed and validated as a significant predictor of HCC pathological grade, and it may serve as a complementary tool for preoperative tumour grading in HCC.


Background
Hepatocellular carcinoma (HCC) is one of the most frequently diagnosed cancers in the world; in China its incidence and mortality rank fourth and second, respectively [1]. It is usually treatable if diagnosed early. Because it can assess the characteristics of liver tumours noninvasively, MRI is often used in clinical practice for diagnosis and treatment guidance [2]. MRI has great potential to guide therapy because it provides a comprehensive view of the entire tumour and can be used on an ongoing basis to monitor the development and progression of the disease or its response to therapy. Further, MRI is already often repeated during treatment in routine practice. However, radiologists cannot extract all of the valuable information in MRI images by visual inspection alone; computer-aided image analysis is needed to uncover further information related to clinical outcomes. Pathological grading in oncology is closely related to cancer diagnosis, prognosis, and treatment planning. For HCC, traditional pathological grading depends on aspiration biopsy and surgical pathology. This invasive way of characterising tumour heterogeneity is not suitable for all cancer patients before treatment. Lambin [3] proposed the concept of radiomics in 2012. It is an emerging field that converts image data into a high-dimensional mineable feature space using a large number of automatically extracted data-characterization algorithms. These image features can capture subtle differences between tumours, and radiomics has shown great advantages in phenotyping, treatment selection and prognosis analysis, making it a research hotspot in clinical medicine and biomedical engineering [4][5]. Figure 1 depicts the block diagram of the entire process.
A growing number of radiomics studies address tumour phenotyping and classification. Kumar [6] extended the concept and defined radiomics as the high-throughput extraction and analysis of a large number of advanced quantitative image features from CT, MRI and PET (positron emission tomography) images. Doroshow [7], writing in Nature Reviews Clinical Oncology, pointed out that radiomics is one of the future directions of translational medicine. In 2014, Aerts [2] published a study in Nature Communications in which 440 quantitative image features, covering grey-level distribution, shape and texture, were extracted from the CT data of 1019 patients with lung cancer or head and neck cancer. These image features reflect tumour heterogeneity and are related to the pathological type, T stage and gene expression pattern of the tumour.
Image segmentation is a key step in radiomics. Precise segmentation of the lesion area is very important for subsequent feature extraction and model construction. Many image segmentation algorithms are limited in practice because of the diversity and complexity of medical images of different tissues, and no uniform, fully satisfactory segmentation algorithm exists in medical imaging today. The main approaches are manual segmentation, human-computer interactive segmentation and automatic segmentation. Although automatic segmentation can improve efficiency, its accuracy is not yet satisfactory. Manual segmentation is often used as the gold standard. However, it suffers from high inter-observer variability, is inefficient, and is poorly reproducible.
Radiomics analysis requires accurate feature extraction techniques to obtain many kinds of features, which are then associated with the clinical phenotypes of the tumour by bioinformatics methods. Automatic feature extraction algorithms yield massive, high-dimensional feature sets, but when the sample size is small, fitting a model to such data runs into the curse of dimensionality. The model over-fits, its results cannot be applied to other data samples, and its generalization ability is very poor. It is therefore necessary to find appropriate dimensionality reduction methods that retain the features with maximum relevance and minimum redundancy. Fan [8] proposed that variable selection should meet the following requirements: (1) accuracy of the prediction model; (2) interpretability of the model; (3) stability of the model, that is, small changes in the data set do not lead to large changes in the model; (4) avoidance of bias in hypothesis testing; (5) control of the computational complexity. At present there are many approaches to reducing dimension, such as principal component analysis (PCA), random forest (RF), support vector machine (SVM) and clustering, but these methods achieve only some of these goals. Ridge regression deals well with multicollinearity among variables, but it cannot provide a sparse model because it does not set any coefficient exactly to zero. LASSO stands for Least Absolute Shrinkage and Selection Operator. The method, proposed by Tibshirani [9], uses the sum of the absolute values of the model coefficients as a penalty to shrink the coefficients, and can therefore provide a sparse solution. In this way LASSO overcomes the shortcomings of traditional variable selection methods.
With the increase of computing power, LASSO has been used more and more in high-throughput data fields, such as gene, protein and image data.
LASSO is an innovative variable selection method for regression. Variable selection in regression is extremely important when a large collection of possible covariates is available, from which we hope to select a parsimonious set for efficient prediction of a response variable. LASSO minimizes the residual sum of squares subject to the sum of the absolute values of the coefficients being less than a constant. LASSO not only helps to improve prediction accuracy when dealing with multicollinear data, but also has several desirable properties such as interpretability and numerical stability.
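The difference between ridge and LASSO shrinkage can be illustrated with a minimal Python sketch. It assumes an orthonormal design (X^T X = I), for which both estimators have closed forms: ridge, minimizing 0.5*||y - Xb||^2 + 0.5*lam*||b||^2, rescales every OLS coefficient, whereas LASSO, minimizing 0.5*||y - Xb||^2 + lam*||b||_1, soft-thresholds it. The coefficient values and the penalty `lam` are hypothetical and are not taken from this study.

```python
import math

def ridge_shrink(beta_ols, lam):
    """Ridge solution under an orthonormal design: uniform rescaling."""
    return [b / (1.0 + lam) for b in beta_ols]

def lasso_shrink(beta_ols, lam):
    """LASSO solution under an orthonormal design: soft-thresholding."""
    shrunk = []
    for b in beta_ols:
        magnitude = max(abs(b) - lam, 0.0)
        # Coefficients whose magnitude is below lam become exactly zero.
        shrunk.append(0.0 if magnitude == 0.0 else math.copysign(magnitude, b))
    return shrunk

beta_ols = [3.0, -1.5, 0.4, -0.2]   # hypothetical OLS coefficients
lam = 0.5                           # hypothetical penalty weight

ridge = ridge_shrink(beta_ols, lam)
lasso = lasso_shrink(beta_ols, lam)
print("ridge:", ridge)
print("lasso:", lasso)
```

In this toy case ridge keeps all four coefficients non-zero, while LASSO removes the two smallest ones, which is the behaviour that makes it usable for feature selection.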

A novel method for liver MR image pathological grading based on LASSO
The proposed pathological grading method consists of five steps: image acquisition and preprocessing, feature extraction, feature selection using LASSO, classification based on the Rad-score, and classification based on a clinical model combined with the radiomics signature. The method is described in the following sections.
Analysis Workflow 1. Pathological grading based on LASSO.
Step 1: Image acquisition.
Step 2: Image preprocessing with a Laplacian of Gaussian spatial band-pass filter.
Step 3: Extraction of the lesion area (ROI) from the MR images by manual segmentation.
Step 4: Extraction of high-dimensional features from the ROI.
Step 5: Division of all samples into two parts, a training dataset and a test dataset.
Step 6: Feature selection with the LASSO regression algorithm on the training dataset.
Step 7: Construction of a classification model using the features selected in Step 6 as the signature.
Step 8: Verification of the performance of the classification model from Step 7 on the test dataset.
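The division of the samples in Step 5 can be sketched in Python as follows. The text does not state how patients were assigned to the two datasets, so the seeded random assignment below is an assumption, and the patient identifiers are hypothetical. Only the sizes (170 patients, 125 for training) come from the study.

```python
import random

def split_patients(patient_ids, n_train, seed=0):
    """Randomly assign patients to a training and a test (validation) set."""
    shuffled = list(patient_ids)
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    return sorted(shuffled[:n_train]), sorted(shuffled[n_train:])

patient_ids = range(1, 171)                 # 170 hypothetical patient identifiers
train_ids, test_ids = split_patients(patient_ids, n_train=125)
print(len(train_ids), len(test_ids))
```

Feature selection and model fitting (Steps 6 and 7) would then use `train_ids` only, so that the test set stays untouched until Step 8.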

Image preprocessing
To selectively extract features of diverse sizes and intensity variations, each MRI slice was filtered with a Laplacian of Gaussian spatial band-pass filter (∇²G). This produced a series of images ranging from fine to coarse texture that highlighted and enhanced tumour features at different anatomic spatial scales. Five filter values were used (0, no filtration; 1.0, fine textures; 1.5 and 2.0, medium textures; 2.5, coarse textures) [10]. The distribution of ∇²G is given by the following expression:

$$\nabla^2 G(x, y) = -\frac{1}{\pi\sigma^4}\left(1 - \frac{x^2 + y^2}{2\sigma^2}\right) e^{-\frac{x^2 + y^2}{2\sigma^2}} \qquad (1)$$

where x, y are the spatial coordinates of a pixel and σ is the value of the filter parameter.
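As a rough illustration, the filter kernel can be sampled on a pixel grid with the Python sketch below. It assumes the standard Laplacian of Gaussian form, ∇²G(x, y) = −(1/(πσ⁴))(1 − (x² + y²)/(2σ²)) exp(−(x² + y²)/(2σ²)). The function name `log_kernel` and the truncation of the kernel at three times the filter value are illustrative assumptions, not details given in the text.

```python
import math

def log_kernel(sigma, radius=None):
    """Sample the 2-D Laplacian of Gaussian on a (2r+1) x (2r+1) grid."""
    if sigma <= 0:
        raise ValueError("sigma must be positive; a filter value of 0 means no filtration")
    if radius is None:
        radius = int(math.ceil(3 * sigma))   # assumed truncation at 3 sigma
    kernel = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            q = (x * x + y * y) / (2.0 * sigma ** 2)
            row.append(-(1.0 - q) * math.exp(-q) / (math.pi * sigma ** 4))
        kernel.append(row)
    return kernel

# The four non-zero filter values listed in the text.
for sigma in (1.0, 1.5, 2.0, 2.5):
    k = log_kernel(sigma)
    centre = k[len(k) // 2][len(k) // 2]
    print(sigma, len(k), round(centre, 4))
```

Convolving a slice with each kernel would give one filtered image per filter value. Larger values widen the central lobe of the kernel and therefore respond to coarser texture.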

Feature extraction
In this stage, a large number of features were extracted from the MR images for use in classification. The images were processed slice by slice, and features were extracted with a voxel-overlapping scheme.
In this work, we evaluated a total of 328 MR image features, divided into four groups: (I) tumour intensity, (II) shape, (III) texture and (IV) wavelet features. Group 1 quantifies tumour intensity characteristics using first-order statistics calculated from the histogram of all tumour voxel intensity values. Group 2 consists of features based on the shape of the tumour (for example, sphericity or compactness). Group 3 consists of textural features that quantify intratumour heterogeneity, that is, differences in texture observable within the tumour volume. These features are calculated in all three-dimensional directions within the tumour volume, thereby taking into account the spatial location of each voxel relative to its surrounding voxels. Group 4 calculates the intensity and textural features from wavelet decompositions of the original image, thereby focusing the features on different frequency ranges within the tumour volume (figure 2). All features are shown in table 1.

Figure 2. Schematic of the undecimated three-dimensional wavelet transform applied to each MR image. The original image X is decomposed into 8 decompositions by directional low-pass (i.e. scaling) and high-pass (i.e. wavelet) filtering: X_LLL, X_LLH, X_LHL, X_LHH, X_HLL, X_HLH, X_HHL, X_HHH.

Feature selection using LASSO
Consider a response y_i and p predictors x_{i1}, ..., x_{ip} observed for samples i = 1, ..., N. The ordinary least squares (OLS) regression method finds the unbiased linear combination of the x_j's that minimizes the residual sum of squares. However, if p is large or the predictors are highly correlated (multicollinearity), OLS may yield estimates with large variance, which reduces the accuracy of the prediction. Widely known remedies for this problem are ridge regression and subset selection. As an alternative to these techniques, Robert Tibshirani presented the LASSO, which minimizes the residual sum of squares subject to the sum of the absolute values of the coefficients being less than a constant:

$$\hat{\beta}^{L} = \arg\min_{\beta} \sum_{i=1}^{N}\Big(y_i - \sum_{j=1}^{p}\beta_j x_{ij}\Big)^2 \quad \text{subject to} \quad \sum_{j=1}^{p}|\beta_j| \le t .$$

Equivalently, in Lagrangian form,

$$\hat{\beta}^{L} = \arg\min_{\beta}\Big\{\sum_{i=1}^{N}\Big(y_i - \sum_{j=1}^{p}\beta_j x_{ij}\Big)^2 + \lambda\sum_{j=1}^{p}|\beta_j|\Big\}, \qquad \lambda > 0 .$$

The relation between λ and the LASSO parameter t is one-to-one. Due to the nature of the constraint, LASSO tends to set some coefficients exactly to zero. Compared with OLS, whose estimated coefficient β̂ is an unbiased estimator of β, both ridge regression and LASSO accept a little bias in order to reduce the variance of the predicted values and improve the overall prediction accuracy [11].
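How the L1 penalty produces exact zeros can be seen in a small coordinate-descent sketch in Python. It assumes the penalized form of the problem, minimizing the residual sum of squares plus λ times the sum of absolute coefficients, with no intercept. The solver, the toy design matrix and the value of `lam` are illustrative assumptions; the study itself used R, and this is not its implementation.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold; return exactly 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_sweeps=200):
    """Minimise sum_i (y_i - sum_j b_j x_ij)^2 + lam * sum_j |b_j|."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            # Correlation of column j with the residual that excludes feature j.
            rho = 0.0
            for i in range(n):
                partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
            beta[j] = soft_threshold(rho, lam / 2.0) / col_sq[j]
    return beta

# Hypothetical toy data with orthogonal columns: y = 2*x0 + 0.1*x1, x2 is irrelevant.
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [2.1, 1.9, -1.9, -2.1]

beta_ols = lasso_coordinate_descent(X, y, lam=0.0)    # no penalty: least squares
beta_lasso = lasso_coordinate_descent(X, y, lam=2.0)  # weak features are dropped
print(beta_ols)
print(beta_lasso)
```

With the penalty switched on, the strong coefficient is shrunk and the two weak ones are set to zero, which illustrates the bias-for-variance trade described above.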
CV methods and estimate of the LASSO parameter
The tuning parameter t is called the LASSO parameter, also known as the absolute bound:

$$t = \sum_{j=1}^{p} |\hat{\beta}_j^{L}| .$$

Here we define another parameter, s, as the relative bound:

$$s = \frac{t}{\sum_{j=1}^{p} |\hat{\beta}_j^{OLS}|} .$$
The relative bound can be seen as a normalized version of the LASSO parameter. N-fold cross-validation (CV) [12] can be used to compute the best s. Cross-validation is a general procedure that can be applied to estimate tuning parameters in a wide variety of problems. The residual sum of squares (RSS) is a biased estimate of prediction error because the same data are used for model fitting and model evaluation. CV reduces this bias by splitting the data into two subsamples: a training (calibration) sample for model fitting and a test (validation) sample for model evaluation. The idea behind cross-validation is to recycle data by switching the roles of the training and test samples.
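The switching of roles can be made concrete with a short Python sketch of a k-fold partition. The helper name `k_fold_indices`, the shuffling and the seed are illustrative assumptions. The sample count of 125 matches the size of the training dataset in this study.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[f::k] for f in range(k)]   # k nearly equal-sized folds
    for f in range(k):
        test_idx = sorted(folds[f])
        train_idx = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train_idx, test_idx

splits = list(k_fold_indices(125, k=10))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

For each candidate value of the tuning parameter, a model would be fitted on `train_idx` and scored on `test_idx` in every fold, and the fold errors averaged to give the cross-validated prediction error.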
The optimal s is denoted by ŝ. The prediction error of the LASSO procedure can be estimated by ten-fold cross-validation. The LASSO is indexed in terms of s, and the prediction error is estimated over a grid of values of s from 0 to 1 inclusive. We wish to predict with small variance, and thus wish to choose the constraint s as small as we can. The value ŝ that achieves the minimum estimated prediction error is selected.
The histology grading data were retrieved from the archived clinical histology reports, in which the histological grades of the HCCs were noted. Low-grade tumours correspond to Edmondson grades I, I-II and II, and high-grade tumours to Edmondson grades II-III, III, III-IV and IV. Figure 3 shows MR images of HCC.

Results
All feature extraction algorithms were implemented in Matlab 2014. In total, 328 features were extracted from T1WI and 328 features from T2WI. The LASSO algorithm was used to select the most useful features. The radiomics score (Rad-score) of each patient was calculated as a linear combination of the selected features weighted by their respective coefficients. Ten-fold cross-validation was applied to the training dataset to select the optimal model for the pathological grading of HCC. The results of HCC histology grading in the training dataset and the validation (test) dataset are shown in table 2. Feature selection using the LASSO logistic model is shown in figure 4. The predictive performance for the discrimination of HCC histology grade, presented as ROC curves, is shown in figure 5. All LASSO programs were implemented in R.
Figure 3. Axial T1WI MRI slices of hepatocellular carcinoma: left, high-grade tumour; right, low-grade tumour.
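The Rad-score calculation amounts to a weighted sum, as in the minimal Python sketch below. The feature names, coefficient values and the optional intercept are hypothetical placeholders, since the selected features and their coefficients are not listed in the text.

```python
def rad_score(features, coefficients, intercept=0.0):
    """Linear combination of the selected features and their coefficients."""
    return intercept + sum(coefficients[name] * features[name] for name in coefficients)

# Hypothetical LASSO-selected features and coefficients (not values from the study).
coefficients = {"t1_entropy": 0.8, "t2_skewness": -0.5}

# One hypothetical patient; features with no coefficient were dropped by LASSO.
patient = {"t1_entropy": 1.5, "t2_skewness": 0.4, "t1_energy": 2.0}

score = rad_score(patient, coefficients, intercept=-0.25)
print(round(score, 3))
```

Only features with non-zero LASSO coefficients enter the sum, so the signature stays small even though 328 features per sequence are extracted.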

Discussion
The radiomics model based on T1WI achieved an AUC of 0.812 in the training dataset and 0.712 in the validation dataset (figure 5a). The radiomics model based on T2WI achieved an AUC of 0.804 in the training dataset and 0.722 in the validation dataset (figure 5b). The radiomics model based on combined T1WI and T2WI achieved the highest AUC: 0.829 in the training dataset and 0.742 in the validation dataset (figure 5c). The median radiomics signature differed significantly between high-grade and low-grade patients in both the training dataset (p < 0.0001) and the validation dataset (p < 0.01). The radiomics signatures based on T1WI and on T2WI both performed well in discriminating high-grade from low-grade patients.
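For readers less familiar with the metric, the AUC can be computed directly from scores and labels, as in this minimal Python sketch. It uses the rank interpretation of the AUC: the probability that a randomly chosen high-grade case receives a higher score than a randomly chosen low-grade case, with ties counted as one half. The scores and labels are hypothetical and are not data from the study.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(positives) * len(negatives))

# Hypothetical Rad-scores; label 1 = high grade, 0 = low grade.
scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two grades.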
Compared with basic MR image traits, radiomics features capture more information about intratumour heterogeneity objectively and quantitatively at low cost, correlate with underlying gene-expression patterns, and may help predict clinical outcomes. In our study, a small number of features were chosen from the 328 candidate radiomics features via the LASSO method to build a radiomics signature (Rad-score), which proved to be an independent predictor of HCC histopathological grade.