Principal Investigator
Name
Burcu Darst
Degrees
Ph.D.
Institution
University of Southern California
Position Title
Postdoctoral Scholar - Research Associate
About this CDAS Project
Study
PLCO
Project ID
PLCO-555
Initial CDAS Request Approval
Nov 26, 2019
Title
Integrating Genomics and Metabolomics to Develop Predictive Models of Prostate Cancer in Multiethnic Men
Summary
Prostate cancer (PCa) is the second leading cause of cancer death among American men, and men of African ancestry have the highest PCa incidence and mortality rates. While the causes of this disparity are unknown, there is evidence that genetics is a contributing factor (Haiman et al., Nat Gen, 2007). Given the high heritability of PCa and that metabolomics is suggested to be biologically relevant to PCa progression (Kelly et al., CEBP, 2016), a systems biology approach could further elucidate genomic complexities and provide biological insights into the mechanisms of PCa, which are not well understood.

Recent multi-omics methods, such as TWAS and PrediXcan, which integrate genome-wide association studies (GWAS) and transcriptomics, have identified genetic risk not captured by standard methods and have elucidated novel biological mechanisms of complex diseases. Metabolomics is suggested to be biologically relevant to PCa progression, and we have identified metabolites with strong genetic influences (Darst et al., Aging, 2019) that mediate genetic effects on health outcomes (Darst et al., Gen Epi, 2019). This suggests that integrating GWAS and metabolomics could capture novel genetic variation in PCa, provide novel insights into the biological mechanisms contributing to this disease, and improve predictive models of PCa.

The objective of this research is to elucidate biologically relevant genomic and metabolomic mechanisms and improve predictive models of PCa across ethnic and racial populations. Specifically, we are proposing to develop models that are predictive of metabolite levels across multiethnic populations. These predicted metabolite levels, or genetically-regulated metabolite levels, will facilitate future investigations of causal mechanisms and further our understanding of the biological underpinnings of many complex diseases. Genomic and metabolomic data will be integrated to 1) investigate metabolic mediation and 2) differentiate PCa risk groups according to combined genomic and metabolomic profiles. Results are expected to identify potentially causal PCa mechanisms and biologically distinct subgroups of individuals at high risk of PCa.

This proposed investigation will utilize GWAS data and metabolomic data from N=448 African American Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial participants, along with GWAS and metabolomic data from N=738 African American Multiethnic Cohort (MEC) participants. Metabolomics data from both studies were generated together on the same platform at Metabolon for a study conducted by Demetrius Albanes. These data will be used to investigate African-specific associations between genetic variants and metabolites in order to develop predictive models of metabolite levels, which will be used to impute metabolite levels into other large GWAS datasets (i.e., the Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome; PRACTICAL, N=237,381 multiethnic men) and investigate potentially causal associations between genetically-imputed metabolites and PCa. These data will also be used to identify subgroups of individuals at high risk of PCa, combining GWAS and metabolomic data using a structured latent variable approach (described briefly in Specific Aims).
Aims

Aim 1. Identify genetically-imputed metabolites that could be causally associated with prostate cancer across multiethnic populations.

The goal of this aim is to develop a multiethnic metabolomics imputation panel to facilitate the identification of metabolites causally associated with PCa. To accomplish this, we will 1) develop a large multiethnic metabolomics imputation panel using studies with both metabolites and GWAS data, including PLCO, MEC, and the Consortium of METabolomics Studies (COMETS; we are collaborating with COMETS investigators who are performing a GWAS meta-analysis on metabolites and have agreed to share summary statistics for the purpose of this investigation), 2) use this panel to impute metabolomics data into PRACTICAL, and 3) test for associations between imputed metabolites and PCa status.

MEC/PLCO GWAS summary statistics will be generated by performing a GWAS on each metabolite. GWAS will be performed separately by study, using a separate linear regression model for each metabolite (outcome) and each genetic variant (predictor), controlling for age, principal components, batch, and sample storage time. After meta-analyzing results across studies and ethnic populations, summary statistics will be used to develop a polygenic score for each metabolite. Each metabolite will correspond to a unique list of variants and weights used to construct its polygenic score (a weighted sum of the number of alleles). Polygenic scores will be applied to GWAS data from the PRACTICAL consortium to predict, or impute, metabolite levels across all participants. Next, we will perform a metabolome-wide association study (MWAS) to assess whether imputed metabolites are predictive of PCa.
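The per-variant regression and the weighted allele sum described above can be illustrated with a minimal sketch. It is not the project's pipeline: the regression below omits the covariates (age, principal components, batch, storage time) and the meta-analysis step, and the variant IDs, allele counts, and metabolite values are hypothetical.

```python
from statistics import mean

def ols_slope(allele_counts, metabolite_levels):
    """Slope from a simple linear regression of metabolite level on allele count.

    Simplification: no covariate adjustment, unlike the models in the text.
    """
    mx, my = mean(allele_counts), mean(metabolite_levels)
    sxx = sum((x - mx) ** 2 for x in allele_counts)
    sxy = sum((x - mx) * (y - my) for x, y in zip(allele_counts, metabolite_levels))
    return sxy / sxx

def polygenic_score(participant_alleles, weights):
    """Weighted sum of allele counts; variants missing for a participant count as 0."""
    return sum(w * participant_alleles.get(variant, 0) for variant, w in weights.items())

# Hypothetical training data: allele counts (0, 1, or 2) per variant, one metabolite.
genotypes = {"variant_A": [0, 1, 2, 1, 0, 2], "variant_B": [2, 2, 1, 0, 1, 0]}
metabolite = [1.0, 1.4, 2.1, 1.6, 0.9, 2.0]

# One weight per variant for this metabolite.
weights = {v: ols_slope(counts, metabolite) for v, counts in genotypes.items()}

# Impute the metabolite level (up to an intercept) for a new, genotype-only participant.
imputed = polygenic_score({"variant_A": 2, "variant_B": 1}, weights)
print(weights, imputed)
```

In practice the weights would come from the meta-analyzed summary statistics, and variant selection would need to account for linkage disequilibrium, which this sketch ignores.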

Aim 2. Integrate genomics and metabolomics to investigate mediation and identify subgroups of individuals at high risk of PCa.

The goals of this aim are to use a sample of African American men to 1) identify metabolites that mediate the effects of genetic factors on PCa and 2) identify subgroups of individuals at high risk of PCa using integrative analyses of genomics and metabolomics. We will initially focus on a polygenic risk score (PRS) we developed in a recent multiethnic GWAS meta-analysis (manuscript submitted) as the primary genetic factor of interest.

To determine whether a metabolite mediates the relationship between the PRS and PCa, logistic regression models will be used to assess the causal mediation effect of the PRS on PCa through a metabolite. This will be repeated for each metabolite. Mediation P-values will be evaluated using the quasi-Bayesian Monte Carlo method with 1,000 simulations and adjusted for multiple testing. Based on these findings, follow-up mediation analyses will be performed on select genes and variants to identify potential drivers of mediation effects.
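As a rough illustration of simulation-based mediation inference, the sketch below draws the two path coefficients from normal approximations to their sampling distributions and summarizes the simulated indirect effect. It is a simplified linear product-of-coefficients version, not the logistic-outcome causal mediation estimator named above. The coefficient values, standard errors, and the Bonferroni correction are assumptions chosen for illustration; the text does not specify the multiple-testing method.

```python
import random

def monte_carlo_indirect(a, se_a, b, se_b, n_sims=1000, seed=0):
    """Simulate the indirect effect a*b (PRS -> metabolite -> PCa).

    a: estimated effect of the PRS on the metabolite (hypothetical input)
    b: estimated effect of the metabolite on PCa, adjusted for the PRS
    Returns the point estimate, a 95% percentile interval, and a two-sided p-value.
    """
    rng = random.Random(seed)
    draws = sorted(rng.gauss(a, se_a) * rng.gauss(b, se_b) for _ in range(n_sims))
    lower = draws[int(0.025 * n_sims)]
    upper = draws[int(0.975 * n_sims) - 1]
    # Two-sided p-value: twice the smaller share of draws on either side of zero.
    frac_nonpositive = sum(d <= 0 for d in draws) / n_sims
    p_value = min(1.0, 2 * min(frac_nonpositive, 1 - frac_nonpositive))
    return a * b, (lower, upper), p_value

def bonferroni(p_values):
    """Assumed correction: multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical path estimates for one metabolite.
estimate, interval, p = monte_carlo_indirect(a=0.30, se_a=0.05, b=0.40, se_b=0.05)
print(estimate, interval, p)
```

Repeating the call for each metabolite and passing the resulting p-values to `bonferroni` mirrors the per-metabolite loop described in the text.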

Subgroups of individuals at high risk of PCa will be identified by integrating genomics and metabolomics using the structured latent variable approach Latent Unknown Clustering Integrating Multi-Omics Data with Phenotypic Traits (LUCID) (Peng, et al., Bioinformatics 2019). Latent subgroups represent causal mechanisms underlying the effects of measured genomics, metabolomics, and PCa status. Subgroups are defined in a multivariate normal model, M~MVN(Sθ,Σ), where M represents the metabolite profile, S represents latent subgroup membership, θ represents the mean difference of metabolites by subgroup, and Σ represents the correlation of metabolite profiles by subgroup.
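To make the notation in the multivariate normal model concrete, the sketch below computes the mean vector Sθ for a one-hot subgroup indicator and evaluates a Gaussian log-density. It does not reproduce LUCID, which estimates latent subgroups jointly from genomic, metabolomic, and outcome data. The sketch treats each row of θ as the metabolite means of one subgroup and assumes a diagonal Σ for simplicity; all numbers are hypothetical.

```python
import math

def subgroup_mean(s, theta):
    """Return S·θ, where s is a one-hot subgroup indicator and theta[k]
    holds the metabolite means for subgroup k (an interpretive assumption)."""
    n_metabolites = len(theta[0])
    return [sum(s[k] * theta[k][j] for k in range(len(s))) for j in range(n_metabolites)]

def diag_mvn_logpdf(m, mean, variances):
    """Log-density of a multivariate normal with diagonal covariance (assumed)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - mu) ** 2 / v)
        for x, mu, v in zip(m, mean, variances)
    )

# Two hypothetical subgroups, two metabolites.
theta = [[0.0, 0.0], [1.0, -0.5]]
variances = [1.0, 1.0]
observed = [0.9, -0.4]

# Compare how well each subgroup explains the observed metabolite profile.
for k in range(len(theta)):
    s = [1 if i == k else 0 for i in range(len(theta))]
    print(k, diag_mvn_logpdf(observed, subgroup_mean(s, theta), variances))
```

The subgroup with the larger log-density is the better fit for that profile; a full analysis would also weight each subgroup by its genotype-dependent membership probability.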

Collaborators

Christopher Haiman, Sc.D., University of Southern California
David Conti, Ph.D., University of Southern California
Demetrius Albanes, M.D., National Cancer Institute, NIH