Personalizing colorectal treatment strategies through machine learning
The effect of screening with flexible sigmoidoscopy has been demonstrated in the PLCO trial: significant reductions were observed in the incidence of both distal colorectal cancer (479 cases in the intervention group vs. 669 cases in the usual-care group; relative risk, 0.71; 95% CI, 0.64 to 0.80; P<0.001) and proximal colorectal cancer (512 cases vs. 595 cases; relative risk, 0.86; 95% CI, 0.76 to 0.97; P=0.01). There were also fewer deaths in the screening arm of the trial: 2.9 deaths from colorectal cancer per 10,000 person-years in the intervention group (252 deaths), as compared with 3.9 per 10,000 person-years in the usual-care group (341 deaths), which represents a 26% reduction (relative risk, 0.74; 95% CI, 0.63 to 0.87; P<0.001). Mortality from distal colorectal cancer was reduced by 50% (87 deaths in the intervention group vs. 175 in the usual-care group; relative risk, 0.50; 95% CI, 0.38 to 0.64; P<0.001); mortality from proximal colorectal cancer was unaffected (143 and 147 deaths, respectively; relative risk, 0.97; 95% CI, 0.77 to 1.22; P=0.81).
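As a quick check on the mortality figures quoted above, the two reported rates can be turned into a rate ratio and a relative reduction. This is only a minimal sketch using the numbers in the text; the function name is illustrative and the confidence interval is not reproduced, since it requires the trial's person-year totals.

```python
def relative_reduction(rate_intervention, rate_control):
    """Return (rate ratio, relative reduction) for two event rates."""
    ratio = rate_intervention / rate_control
    return ratio, 1.0 - ratio


# Colorectal cancer deaths per 10,000 person-years, as quoted in the text.
ratio, reduction = relative_reduction(2.9, 3.9)
print(f"rate ratio = {ratio:.2f}, relative reduction = {reduction:.0%}")
```

The result agrees with the reported relative risk of 0.74 and the 26% reduction.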
In this project, we intend to use a gradient-boosted decision tree algorithm (XGBoost) to build a phenotype profile of the patients who are most likely to benefit from colorectal cancer (CRC) screening. Machine learning techniques can be leveraged to uncover clinical entities and relationships that have not yet been explored. We will use these techniques to train two predictive models, one for cancer-specific survival (CSS) and one for overall survival (OS). Class imbalance correction techniques will be applied before training the models.
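The text does not say which class imbalance correction will be used. One common option with XGBoost is to reweight the positive class through its `scale_pos_weight` parameter, set to the ratio of negative to positive examples. The sketch below assumes the survival outcome has been simplified to a binary label (1 = event, 0 = no event); the helper name and the toy labels are hypothetical and are not PLCO data.

```python
from collections import Counter


def xgb_params_for_imbalance(labels):
    """Build an XGBoost parameter dict that upweights the positive class.

    `labels` is a sequence of 0/1 outcomes (assumed encoding: 1 = event).
    """
    counts = Counter(labels)
    n_pos, n_neg = counts[1], counts[0]
    if n_pos == 0:
        raise ValueError("no positive examples in labels")
    return {
        "objective": "binary:logistic",
        "eval_metric": "auc",
        # Common heuristic: number of negatives / number of positives.
        "scale_pos_weight": n_neg / n_pos,
    }


# Toy labels (hypothetical): 2 events among 10 patients.
params = xgb_params_for_imbalance([0, 0, 0, 0, 1, 0, 0, 1, 0, 0])
print(params["scale_pos_weight"])
```

A dictionary of this kind could then be passed to `xgboost.train(params, dtrain)` or unpacked into `xgboost.XGBClassifier(**params)`. Resampling approaches would be an alternative, and a model that handles censoring explicitly would need a different objective.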
In a second step, we will also compare the performance of models trained on all patients with that of models trained on the subset of patients included in the screening arm.
Beyond prediction, we intend to provide interpretable models that explain how they arrive at their predictions, both at the level of the whole population and at the level of the individual patient.
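The comparison metric is not specified in the text. As one possible illustration, the sketch below scores two models on the same held-out patients with the area under the ROC curve (AUC), computed directly as the probability that a randomly chosen positive case is ranked above a randomly chosen negative one. The labels and predicted risks are hypothetical, and the variable names are assumptions.

```python
def roc_auc(y_true, scores):
    """AUC: fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical held-out labels and predicted risks from the two models.
y_test = [0, 0, 1, 0, 1, 1]
scores_all = [0.1, 0.4, 0.35, 0.2, 0.8, 0.7]     # model trained on all patients
scores_screen = [0.2, 0.3, 0.6, 0.1, 0.9, 0.5]   # model trained on screening arm

auc_all = roc_auc(y_test, scores_all)
auc_screen = roc_auc(y_test, scores_screen)
print(f"all patients: {auc_all:.3f}, screening arm: {auc_screen:.3f}")
```

Evaluating both models on the same test set keeps the comparison fair. In practice a bootstrap or cross-validation would be needed to judge whether a difference in AUC is meaningful.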
The main outcome of the project will be artificial intelligence models that can:
- predict benefit from CRC screening,
- predict CSS and OS after CRC diagnosis.
The models will be interpretable, meaning that we will provide the clinical features that explain each prediction. This is crucial to build a model that is clinically actionable.
Lei Xing, PhD, Jacob Haimson Professor of Medical Physics and Director of Medical Physics Division of Radiation Oncology Department, Stanford University