Mapping the progression of COPD to Lung Cancer
We propose an analysis of the PLCO study cohort. We will stratify the cohort by individual cancer diagnosis and by relevant covariates, including age, sex, BMI, and smoking status. Regression analyses will be used to quantify the relationship between air pollution measures and cancer outcomes, and to test for differences between males and females, whilst controlling for relevant covariates. We will also explore whether deep learning can be used to model progression from pre-existing illness to cancer (e.g. a neural network that maps the progression of COPD to lung cancer). For those cancer types that are associated with air pollution, we will investigate the mechanistic basis for this relationship by interrogating participant medical histories, blood biomarkers, environmental exposures, and genomic data to determine whether specific factors mitigate or enhance the air pollution-induced risk of developing a given cancer.
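As a rough illustration of the regression step, the sketch below fits a logistic regression with an exposure-by-sex interaction term on simulated data. It is not the study's analysis code: the variable names (pm25, female, age, smoker), the effect sizes, and the choice of a logistic model fitted by Newton-Raphson are all assumptions made for the example, and no PLCO data are used.

```python
import numpy as np

def fit_logistic(X, y, n_iter=50, tol=1e-8):
    """Fit a logistic regression by Newton-Raphson (IRLS).

    X: (n, p) design matrix that already contains an intercept column.
    y: (n,) binary outcome (1 = cancer diagnosis, 0 = no diagnosis).
    Returns the coefficient vector and its standard errors.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))   # fitted probabilities
        w = p * (1.0 - p)                     # IRLS weights
        hessian = X.T @ (X * w[:, None])      # observed information matrix
        step = np.linalg.solve(hessian, X.T @ (y - p))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    # Standard errors from the inverse information at the final estimate.
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    w = p * (1.0 - p)
    cov = np.linalg.inv(X.T @ (X * w[:, None]))
    return beta, np.sqrt(np.diag(cov))

# Simulated cohort; every variable and effect size here is hypothetical.
rng = np.random.default_rng(0)
n = 20000
pm25 = rng.normal(0.0, 1.0, n)      # standardised air pollution exposure
female = rng.integers(0, 2, n)      # 1 = female, 0 = male
age = rng.normal(0.0, 1.0, n)       # standardised age
smoker = rng.integers(0, 2, n)      # 1 = ever smoker, 0 = never smoker

names = ["intercept", "pm25", "female", "age", "smoker", "pm25:female"]
true_beta = np.array([-3.0, 0.3, 0.1, 0.4, 0.8, 0.2])

# The last column is the exposure-by-sex interaction term.
X = np.column_stack([np.ones(n), pm25, female, age, smoker, pm25 * female])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ true_beta)))

beta, se = fit_logistic(X, y)
odds_ratios = np.exp(beta)

for name, b, s in zip(names, beta, se):
    print(f"{name:12s} log-odds={b:+.3f}  SE={s:.3f}  OR={np.exp(b):.2f}")
```

In this setup the pm25:female coefficient estimates how the log-odds effect of exposure differs between females and males after adjustment for age and smoking. A time-to-event model would be a natural alternative where follow-up times are available.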
Aims:
• Elucidate the relationships between environmental exposures and cancer outcomes.
• Assess whether these relationships differ between males and females, and by smoking status.
• Investigate interactions between environmental and host factors (for example genotypes) and how they affect cancer outcomes.
• Link insights from survey, health outcomes, and genetic data with blood biomarker data to yield mechanistic insights into how air pollution and sexual dimorphism might predispose individuals to cancer.
Methods
Study Design
The study combines cross-sectional and longitudinal designs to elucidate the relationships between environmental exposures and cancer outcomes, assess differences by sex and smoking status, and investigate interactions between environmental exposures, host factors (e.g. genotypes), and cancer outcomes. The study integrates survey data, health outcomes, genetic data, and blood biomarker data to provide a comprehensive analysis. Where possible, we will harmonise our findings with those from other biobanks (e.g. UK Biobank).
Participants
The study includes participants from the [Name of Biobank], a comprehensive biobank with diverse demographic representation. Participants will be stratified by sex and smoking status to facilitate subgroup analyses.
Data Sources (where applicable)
1. Survey Data:
- A structured survey capturing environmental exposures, lifestyle factors, and demographic information.
2. Health Outcomes Data:
- Comprehensive health records, including cancer diagnoses and relevant clinical information.
3. Genetic Data:
- Genotyping data providing information on host factors (genotypes).
4. Blood Biomarker Data:
- Comprehensive blood biomarker data, linking molecular information with health outcomes.
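One way the four data sources above could be linked is by joining them on a shared participant identifier. The sketch below uses pandas with small made-up tables; the column names (participant_id, lung_cancer, risk_allele_count, crp_mg_per_l) are hypothetical and do not reflect the cohort's actual data dictionary.

```python
import pandas as pd

# Toy stand-ins for the four data sources; all values are invented.
survey = pd.DataFrame({
    "participant_id": [1, 2, 3, 4],
    "sex": ["F", "M", "F", "M"],
    "smoking_status": ["never", "current", "former", "never"],
})
outcomes = pd.DataFrame({
    "participant_id": [1, 2, 3, 4],
    "lung_cancer": [0, 1, 0, 0],
})
genotypes = pd.DataFrame({
    "participant_id": [1, 2, 4],          # participant 3 was not genotyped
    "risk_allele_count": [0, 2, 1],
})
biomarkers = pd.DataFrame({
    "participant_id": [1, 2, 3],          # participant 4 has no blood sample
    "crp_mg_per_l": [1.2, 8.5, 3.1],
})

# Left-join each source onto the survey table. validate="one_to_one" raises
# an error if a participant appears more than once in either table.
linked = survey
for table in (outcomes, genotypes, biomarkers):
    linked = linked.merge(table, on="participant_id", how="left",
                          validate="one_to_one")

# Participants with every data type available (complete-case subset).
complete = linked.dropna()

# Subgroup sizes for the planned stratification by sex and smoking status.
counts = linked.groupby(["sex", "smoking_status"]).size()

print(linked)
print(counts)
```

Left joins keep every surveyed participant and leave missing values where a data type is unavailable, which makes the "where applicable" coverage of each source explicit before any complete-case restriction is applied.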
Charles Swanton - Francis Crick Institute
Kevin Litchfield - UCL Cancer Institute
Marcellus Augustine - Francis Crick Institute
Cian Murphy - Francis Crick Institute