Data-efficient and interpretable machine learning framework for lung cancer screening
Lung cancer is the leading cancer killer of both men and women in the U.S., with more than half of patients dying within one year of diagnosis. It is therefore critical to detect the disease at an earlier stage, when it is more likely to be curable. Early detection of lung cancer by low-dose computed tomography (LDCT) screening can reduce mortality by 14 to 20 percent among high-risk populations. To limit the clinical, financial, and psychological harms of missed diagnoses and unnecessary follow-up procedures, effective screening programs require highly sensitive and specific medical image analysis techniques.
The overall objective of this proposal is to engineer, in partnership with physicians, Machine Learning (ML)-powered software that addresses two critical technical barriers to clinical deployment and physician adoption of automated decision-support tools for image-based cancer screening: (1) lack of tailoring to the unique characteristics of medical data, and (2) lack of interpretability for the trust-seeking clinician. To achieve this objective, we plan to pursue the following two aims:
Aim 1. To develop efficient model architectures that achieve accurate predictions with far fewer parameters and smaller training datasets than state-of-the-art models.
- State-of-the-art ML models require a staggering amount of data and computational power. These requirements hinder reproducibility at external centers and limit accuracy on real-world datasets, on diverse populations that include minority subgroups, and on images combining high resolution with small regions of interest, such as large LDCT volumes containing small nodules.
- In contrast to existing ML models, which learn from images represented densely as arrays of many regularly spaced pixels, we will build efficient models that learn from compressed representations of images as networks of fewer, grouped pixels. The desired outcome is computationally efficient models with a high ratio of accuracy to parameter count and training data size.
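As a toy illustration of this compressed-representation idea (a sketch, not the proposed method), the snippet below groups an image's pixels into patches and builds a small region-adjacency network with NumPy; the fixed block size and mean-intensity node features are illustrative assumptions, standing in for a learned or superpixel-based grouping.

```python
import numpy as np

def to_region_graph(image, block=8):
    """Compress an image into a network of grouped pixels.

    Each `block x block` patch becomes one node whose feature is its
    mean intensity; edges connect spatially adjacent patches.
    """
    h, w = image.shape
    gh, gw = h // block, w // block
    # Node features: mean intensity of each patch (grouped pixels).
    patches = image[:gh * block, :gw * block].reshape(gh, block, gw, block)
    nodes = patches.mean(axis=(1, 3)).ravel()
    # Edges: 4-neighborhood adjacency between patches.
    edges = []
    for i in range(gh):
        for j in range(gw):
            if i + 1 < gh:
                edges.append((i * gw + j, (i + 1) * gw + j))
            if j + 1 < gw:
                edges.append((i * gw + j, i * gw + j + 1))
    return nodes, edges

img = np.arange(64 * 64, dtype=float).reshape(64, 64)
nodes, edges = to_region_graph(img, block=8)
print(len(nodes), len(edges))  # 64 nodes stand in for 4096 pixels
```

A model operating on the 64 nodes and their adjacency, rather than the 4096 raw pixels, needs correspondingly fewer parameters, which is the efficiency ratio the aim targets.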
Aim 2. To build trustable models whose predictions can be confidently validated or rejected by physicians.
- Existing “black-box” ML models output only raw predictions. Because such predictions cannot be trusted for high-stakes medical decisions, they require an additional, redundant expert assessment. The lack of transparency also inhibits troubleshooting of reliability and fairness issues and prevents physicians from learning new, clinically useful patterns from the model.
- In contrast to existing ML methods that optimize a single diagnostic prediction, we will augment the model outputs with physician-understandable explanations and quantified uncertainty estimates for each prediction. The desired outcome is improved performance of a physician using the interpretable ML model compared with a physician using a conventional “black-box” model.
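One common way to quantify predictive uncertainty, shown here only as a minimal sketch of the idea, is to aggregate an ensemble of models: the spread of their predictions flags cases that should be deferred to the physician. The stand-in "models", the noise scale, and the 0.1 review threshold below are all hypothetical assumptions, not parameters of the proposed system.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_with_uncertainty(models, x):
    """Return the ensemble's mean malignancy probability and a
    spread-based (standard deviation) uncertainty estimate."""
    preds = np.array([m(x) for m in models])
    return preds.mean(), preds.std()

# Hypothetical stand-in ensemble: each member perturbs a base score.
base = 0.7
models = [lambda x, e=e: float(np.clip(base + e, 0.0, 1.0))
          for e in rng.normal(0.0, 0.05, size=10)]

mean_p, uncertainty = predict_with_uncertainty(models, x=None)
# High-uncertainty predictions are flagged for physician review
# rather than acted on automatically.
needs_review = bool(uncertainty > 0.1)
print(round(mean_p, 2), needs_review)
```

The design choice is that the model never silently commits on ambiguous cases: the uncertainty estimate, alongside an explanation, gives the physician grounds to confidently validate or reject each prediction.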
We plan to use the NLST dataset to develop, test, and validate the computational methods described in the objectives above. We are requesting access to all datasets (1-2, 4, 6, 8-15) except those strictly related to Chest X-rays (3: Chest X-Ray Screening, 5: Chest X-Ray Abnormalities, 7: Chest X-ray Comparison Read Abnormalities). To demonstrate the performance of our method, we ask for the complete dataset.
1) Jean-Emmanuel Bibault, MD, PhD - Laboratory of Artificial Intelligence in Medicine and Biomedical Physics, Stanford University
2) Varun Vasudevan - Laboratory of Artificial Intelligence in Medicine and Biomedical Physics, Stanford University
3) Lei Xing, PhD, DABR - Laboratory of Artificial Intelligence in Medicine and Biomedical Physics, Stanford University