3.3 Prediction of OS
The radiomic model performed best in predicting OS [balanced accuracies: 78.69% and 85.75%; AUC (95% CI): 0.82 (0.69-0.94) and 0.92 (0.82-1.00) in the training and validation sets, respectively]. In comparison, using tumor volume alone to predict OS yielded balanced accuracies of 50% and 50% in the training and validation sets, respectively.
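For reference, balanced accuracy is the mean of sensitivity and specificity, so a value of 50% corresponds to chance-level classification (for example, a model that assigns every patient to the same class). A minimal Python sketch of the metric follows; the labels are hypothetical toy values, not study data, and the function name is illustrative only.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = event, 0 = no event)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = sum(1 for t in y_true if t == 0)
    sensitivity = tp / n_pos  # fraction of events correctly identified
    specificity = tn / n_neg  # fraction of non-events correctly identified
    return (sensitivity + specificity) / 2


# Hypothetical labels: 2 events, 4 non-events.
y_true = [1, 1, 0, 0, 0, 0]

# A model that assigns every patient to the "no event" class:
# sensitivity = 0, specificity = 1, balanced accuracy = 0.5.
constant_pred = [0, 0, 0, 0, 0, 0]
chance_level = balanced_accuracy(y_true, constant_pred)

# A model that finds one of the two events and makes no false alarms:
# sensitivity = 0.5, specificity = 1, balanced accuracy = 0.75.
partial_pred = [1, 0, 0, 0, 0, 0]
better = balanced_accuracy(y_true, partial_pred)
```

Unlike plain accuracy, this metric does not reward a model for simply predicting the majority class, which matters when events are much rarer than non-events.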
The scores predicted by the models were used to split patients into high-risk and low-risk groups, using the threshold value defined by the ROC curve of the training set. The radiomic scores significantly stratified patients into high-risk and low-risk groups (P = 7.16 × 10⁻¹⁰), whereas the clinical scores did not (P = 0.29) (Figure 3) [34].
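The thresholding step can be sketched as follows. The text does not state which ROC criterion defined the threshold; this sketch assumes the Youden index (the cut-off maximizing sensitivity + specificity - 1), a common choice, and assumes that higher scores indicate higher risk. All function names and scores are hypothetical, not taken from the study.

```python
def youden_threshold(scores, labels):
    """Return the score cut-off that maximizes Youden's J on the given data.

    Assumption: label 1 marks the event and higher scores mean higher risk.
    """
    n_pos = sum(1 for l in labels if l == 1)
    n_neg = sum(1 for l in labels if l == 0)
    best_j, best_t = -1.0, None
    for t in sorted(set(scores)):
        # Patients with score >= t are called high-risk at this cut-off.
        tp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 0)
        j = tp / n_pos - fp / n_neg  # sensitivity - (1 - specificity)
        if j > best_j:
            best_j, best_t = j, t
    return best_t


def assign_risk_groups(scores, threshold):
    """Label each patient 'high' or 'low' risk using a fixed threshold."""
    return ["high" if s >= threshold else "low" for s in scores]


# Hypothetical training scores and outcomes.
train_scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.8, 0.9]
train_labels = [0, 0, 0, 1, 0, 1, 1]
threshold = youden_threshold(train_scores, train_labels)

# The threshold is fixed on the training set and reused unchanged
# on the validation set, so no validation outcome informs the cut-off.
validation_scores = [0.3, 0.5, 0.4]
groups = assign_risk_groups(validation_scores, threshold)
```

The resulting groups would then be compared with a survival test such as the log-rank test, which is not shown here.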
The clinical model showed only limited predictive power for OS, and combining clinicopathological parameters with radiomic features did not improve performance (Table 3 and Figure 2b).
3.3 Prediction of hypohemoglobin
We identified 9 clinicopathological parameters and 7 pelvis radiomic features as the best predictors of hypohemoglobin [balanced accuracies: 62.42% and 70.96%; AUC (95% CI): 0.65 (0.57-0.73) and 0.74 (0.62-0.87) in the training and validation sets, respectively]. The clinical model outperformed the radiomic model in predicting hypohemoglobin (Table 3 and Figure 2c).
3.4 Prediction of severe leucopenia
The clinical model and the radiomic model each showed only limited predictive power for severe leucopenia on their own [balanced accuracies: 56.42% and 55.08%; AUC (95% CI): 0.56 (0.41-0.71) and 0.57 (0.43-0.72), respectively, in the validation set]. Combining radiomic features with clinicopathological parameters improved the prediction performance [balanced accuracy: 69.93%; AUC (95% CI): 0.64 (0.48-0.79) in the validation set] (Table 3 and Figure 2d).
Figure 4 shows boxplots and the data distribution of the predicted scores for the four clinical endpoints.
We repeated the whole process using data from 159 patients treated with the same therapy. Similar performance was observed in the prediction of the four clinical endpoints (Table S3 and Figures S2-S4).
Because the 95% CIs were wide and overlapping, they cannot by themselves establish a statistically significant difference between any two models.
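To illustrate why intervals are wide at modest sample sizes, the sketch below computes an AUC and a 95% CI on toy data. How the CIs reported here were actually obtained is not stated in this section; the percentile bootstrap used below is an assumption made for illustration, and the scores, labels, and function names are hypothetical.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a random event case outscores a random non-event case."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, not the paper's)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients with replacement
        lab = [labels[i] for i in idx]
        if 0 < sum(lab) < n:  # skip resamples that contain only one class
            stats.append(auc([scores[i] for i in idx], lab))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical scores for 12 patients (label 1 = event).
toy_scores = [0.2, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1]

point_estimate = auc(toy_scores, toy_labels)
ci_lower, ci_upper = bootstrap_auc_ci(toy_scores, toy_labels)
```

With so few patients the interval spans a large part of the possible range. A direct comparison of two models would need a paired procedure on the same patients, such as bootstrapping the difference in AUC, rather than a visual check of whether two intervals overlap.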