2.5 Feature extraction
We extracted radiomic features using the open-source software IBEX (Figure 1) [30], including shape features, first-order features, and texture features. Shape features describe tumor geometry, such as volume and surface area. First-order features were statistical descriptors of the image intensity, the intensity histogram, and the gradient orientation histogram. Texture features were calculated from the neighborhood intensity difference matrix, the gray-level co-occurrence matrix, and the gray-level run-length matrix; values computed along different 3D directions were averaged to obtain the final feature values. Before extracting texture features, all images were rescaled to 100 gray levels to avoid generating sparse matrices; before extracting first-order features, images were rescaled to 256 gray levels. No filter was applied to the images.
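To make the texture-extraction step concrete, the following is a minimal Python sketch (not the IBEX implementation, which the study actually used) of rescaling an image to a fixed number of gray levels and building a gray-level co-occurrence matrix (GLCM) for one offset, from which a texture feature such as contrast follows:

```python
# Illustrative sketch only: gray-level rescaling followed by a GLCM texture
# feature. IBEX computes many more features and averages over 3D directions.

def rescale(image, levels):
    """Map raw intensities onto integer gray levels 0..levels-1."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    span = (hi - lo) or 1
    return [[min(levels - 1, int((v - lo) / span * levels)) for v in row]
            for row in image]

def glcm(image, levels, dr, dc):
    """Count co-occurrences of gray-level pairs at pixel offset (dr, dc)."""
    m = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                m[image[r][c]][image[r2][c2]] += 1
    return m

def contrast(m):
    """GLCM contrast: (i-j)^2 weighted by the normalized co-occurrence counts."""
    total = sum(sum(row) for row in m) or 1
    return sum((i - j) ** 2 * m[i][j]
               for i in range(len(m)) for j in range(len(m))) / total

img = [[0.0, 0.5, 1.0],
       [0.0, 0.5, 1.0],
       [0.0, 0.5, 1.0]]
q = rescale(img, 4)       # a small number of levels keeps the matrix dense
g = glcm(q, 4, 0, 1)      # horizontal neighbor pairs
feature_value = contrast(g)
```

Rescaling to a modest number of gray levels before counting co-occurrences is exactly what keeps the GLCM from becoming sparse, as noted above.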
2.6 Feature selection and endpoint prediction
All patients were allocated to the training and validation sets (3:1 ratio) using proportional random sampling to avoid an unbalanced class distribution between the two sets. All clinicopathological data and radiomic features were normalized with a robust scaling method, which excludes outliers when computing the mean and standard deviation and then scales each variable with these statistics. A four-step procedure was used to select predictive features and build prediction models. First, radiomic features with an inter-rater intra-class correlation coefficient (ICC) > 0.80 were retained. Second, the Lilliefors test was used to assess whether the data came from a normal distribution, and the difference between the two patient groups was then evaluated with a two-sample two-sided t-test or a Wilcoxon rank-sum test, depending on normality. The difference in pelvic lymph node status was evaluated with the chi-square test. Radiomic features with P < 0.05 were selected; if fewer than 20 features met this criterion, P < 0.1 was used instead. The selected radiomic features and all clinicopathological parameters served as candidate features for the next step. Third, we classified the two patient groups and selected the best predictors using a sequential backward elimination-support vector machine (SBE-SVM) algorithm. This method initially trained and tested a linear-kernel SVM on all features with five-fold cross-validation on the training set, then sequentially removed one feature at a time; if removing a feature improved the prediction accuracy or left it unchanged, that feature was permanently discarded. A soft-margin SVM, which is less sensitive to outliers, was used for modelling to reduce overfitting.
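The normalization step can be sketched as follows. This is a minimal Python illustration (the study used MATLAB), and the 1.5 × IQR rule for flagging outliers is an assumption for the sketch, since the exact outlier criterion is not specified above:

```python
# Sketch of robust scaling as described in the text: outliers are excluded
# before the mean and standard deviation are computed, then every value is
# scaled with those statistics. The 1.5*IQR fence is an assumed outlier rule.
import statistics

def robust_scale(values, k=1.5):
    q = statistics.quantiles(values, n=4, method='inclusive')
    q1, q3 = q[0], q[2]
    iqr = q3 - q1
    inliers = [v for v in values if q1 - k * iqr <= v <= q3 + k * iqr]
    mu = statistics.mean(inliers)
    sd = statistics.stdev(inliers) or 1.0   # guard against zero spread
    return [(v - mu) / sd for v in values]  # all values scaled, outliers kept

feature = [1.0, 2.0, 3.0, 4.0, 100.0]       # one extreme outlier
scaled = robust_scale(feature)
```

Note that the outlier influences neither the mean nor the standard deviation, yet it is still scaled and retained in the output, so no samples are lost.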
The SBE-SVM model weighs each feature's contribution to the classification task and yields an optimal combination of features; it has shown good performance in previous studies [31-33]. Finally, the resulting SVM model was used to predict the classes of patients in the validation set. Note that performance on the training set was evaluated with five-fold cross-validation.
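The elimination logic can be sketched generically. This Python illustration mirrors the behavior of backward sequential selection (the study used MATLAB's `sequentialfs`); the toy scoring function below is a hypothetical stand-in for the five-fold cross-validated linear-SVM accuracy:

```python
# Sketch of sequential backward elimination: starting from all features,
# repeatedly drop any feature whose removal improves the score or leaves it
# unchanged, until no feature can be removed without hurting performance.

def backward_eliminate(features, score):
    selected = list(features)
    best = score(selected)
    improved = True
    while improved and len(selected) > 1:
        improved = False
        for f in list(selected):
            trial = [x for x in selected if x != f]
            s = score(trial)
            if s >= best:              # equal or better without this feature
                selected, best = trial, s
                improved = True
                break                  # restart scan over the reduced set
    return selected, best

# Hypothetical scorer: only features 'a' and 'c' carry signal; the rest
# slightly penalize the score, mimicking noisy predictors.
useful = {'a', 'c'}
def toy_score(subset):
    return len(useful & set(subset)) - 0.1 * len(set(subset) - useful)

subset, score_val = backward_eliminate(['a', 'b', 'c', 'd'], toy_score)
```

Because a feature is dropped even when the score merely stays the same, the procedure favors the smallest feature set that preserves accuracy, which is the behavior described above.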
A receiver operating characteristic (ROC) curve was plotted from the actual labels and the model-predicted scores, and the area under the curve (AUC) was calculated as the primary metric of model performance. We also calculated accuracy, sensitivity, specificity, and F1-score as auxiliary metrics (definitions are given in the supplementary materials).
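For completeness, these metrics can be computed directly from labels and scores. A minimal Python sketch (the study used MATLAB), with AUC obtained via the rank-based (Mann-Whitney) formulation rather than by integrating the plotted curve:

```python
# Sketch of the evaluation metrics: AUC from predicted scores, plus accuracy,
# sensitivity, specificity, and F1-score from thresholded predictions. The
# 0.5 threshold is an assumption for illustration.

def auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def metrics(labels, scores, threshold=0.5):
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(y == 1 and p == 1 for y, p in zip(labels, pred))
    tn = sum(y == 0 and p == 0 for y, p in zip(labels, pred))
    fp = sum(y == 0 and p == 1 for y, p in zip(labels, pred))
    fn = sum(y == 1 and p == 0 for y, p in zip(labels, pred))
    acc = (tp + tn) / len(labels)
    sens = tp / (tp + fn)          # recall on the positive class
    spec = tn / (tn + fp)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return acc, sens, spec, f1

y = [1, 1, 1, 0, 0, 0]
s = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
roc_auc = auc(y, s)
acc, sens, spec, f1 = metrics(y, s)
```

The rank-based formulation gives the same value as the area under the empirical ROC curve, which is why it serves as a compact check on the plotted result.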
We established three prediction models: a clinical model built using only clinicopathological parameters, a radiomic model built using only radiomic features, and a combined clinical-radiomic model built using both.
All analyses were performed in MATLAB 2018a. The SBE-SVM algorithm was implemented with the MATLAB functions ‘sequentialfs’ and ‘fitcsvm’. The computational code is available upon request from the corresponding author.