Cell-Free DNA Methylome for Lung Cancer Early Detection
Lung cancer is the leading cause of cancer death, which underlines the importance of early detection. Cell-free DNA (cfDNA) holds great promise for identifying tumors at an early stage without invasive procedures. Because mutation-based detection of circulating tumor DNA has low sensitivity, we investigated whether cfDNA methylome profiles can be used for early detection of lung cancer, based on 70 lung cancer cases and 86 healthy controls. Samples from the lung cancer cases were collected at the time of diagnosis. We identified specific methylation aberrations that are detectable in circulating cfDNA, with a predictive accuracy of 97% in the validation set, and the results were comparable for early-stage and late-stage lung cancers (Shen et al., Nature 2018). In addition, we used cfMeDIP-seq to profile 189 plasma samples from 7 different tumor types and healthy controls (discovery set) and another 199 plasma samples from 3 tumor types and healthy controls (validation set). Density clustering by cfDNA methylation status based on t-Distributed Stochastic Neighbor Embedding (t-SNE) revealed distinct clustering of the 388 plasma samples according to cancer type.
We propose to validate these novel findings in PLCO using approximately 700 pre-diagnostic samples from lung cancer patients, collected within 5 years of diagnosis, along with matched controls. The approach holds great promise for identifying tumors at an early stage without invasive procedures, and validation in samples collected before cancer diagnosis will be vital to assess its actual clinical utility.
After extracting cfDNA from plasma, we propose to run the cfMeDIP-seq assay and QC as previously described. To identify differentially methylated regions (DMRs), we will use the Benjamini-Hochberg false-discovery rate to control for multiple comparisons. To build the risk prediction model incorporating DMRs and clinico-epidemiological information, we will apply regularized elastic net regression to reduce dimensionality and account for correlation among predictors. To avoid model overfitting, we will use cross-validation to determine the best inclusion threshold based on the optimal signal-to-noise ratio, in addition to the bootstrap method (internal validation). The model’s discrimination will be measured by the area under the receiver operating characteristic curve (AUC), and model calibration will be assessed by evaluating how much the slope of the calibration line deviates from the ideal of 1. For the subset of participants with sequential samples available at different time points, the timing at which the tumor-specific cfDNA methylation markers become detectable will be assessed prospectively. Subject to sample availability, we propose to investigate 100 lung cancer cases for the timing of detectability at 5 years, 2 years and 6 months before diagnosis.
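As an illustration of the multiple-comparison step only, the following is a minimal sketch of the Benjamini-Hochberg step-up procedure in plain Python. It is not the study's analysis pipeline: the function name `benjamini_hochberg`, the `alpha` level of 0.05, and the example p-values are assumptions made for this example, and in practice each p-value would come from a per-region test of differential methylation.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per hypothesis: True if rejected at FDR level alpha.

    Hypothetical helper for illustration; alpha=0.05 is an assumed default.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Step-up rule: find the largest rank k (1-based) such that
    # p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank

    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Made-up p-values standing in for per-region tests.
    example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(example))
```

Because the rule is step-up, a region can be rejected even if its own p-value exceeds its rank threshold, as long as a larger-ranked p-value meets its threshold.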
1. To perform the cfDNA methylome analysis based on cfMeDIP-seq on the pre-diagnostic samples of lung cancer cases (estimated N~700) and matched controls within the cohort. Controls will be matched to cases in a 1:1 ratio on age, sex, age at cohort entry and follow-up time.
We will first perform the analysis on 100 nested case-control pairs with samples collected within 1 year of diagnosis, to assess the suitability of this assay for PLCO samples before continuing with the rest of the samples.
2. To assess the predictive performance of cfDNA methylome for lung cancer early detection in the PLCO study. The predictive model will integrate epidemiological information from PLCO including smoking, family history and medical history.
3. To assess when cfDNA methylation markers become detectable, using the subset of participants for whom sequential samples are available.
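For the predictive performance named in Aim 2, discrimination is summarized by the AUC, which equals the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control. The sketch below computes it directly from that definition in plain Python. The function name `auc`, the 0/1 label coding, and the example scores are assumptions for illustration and are not taken from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve from pairwise case-control comparisons.

    scores: predicted risks; labels: 1 for case, 0 for control (assumed coding).
    Tied scores count as half a concordant pair.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    concordant = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                concordant += 1.0
            elif c == k:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


if __name__ == "__main__":
    # Made-up predicted risks for two cases and two controls.
    print(auc([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0]))
```

The pairwise form is quadratic in sample size, which is acceptable at the scale of a few hundred matched pairs; a rank-based formula gives the same value more efficiently for larger sets.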
Rayjean Hung (Sinai Health System)
Neal Freedman (National Cancer Institute)
Geoffrey Liu (Princess Margaret Cancer Centre)