Principal Investigator
Name
Melody Eide
Degrees
MD, MPH
Institution
Henry Ford Health System
Position Title
physician
About this CDAS Project
Study
PLCO
Project ID
PLCO-105
Initial CDAS Request Approval
Sep 9, 2014
Title
HFHS Electronic Medical Records compared to HFHS PLCO data
Summary
The increasing adoption of electronic medical records (EMR) in the US offers new opportunities and
challenges for the investigation of cancer. The Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer
Screening Trial was a National Cancer Institute (NCI)-funded trial, designed to test the effectiveness
of screening, that enrolled more than 150,000 participants at multiple sites, including the Henry Ford
Health System (HFHS). Approximately 25,000 HFHS patients, whose care has been
documented in the HFHS EMR, participated in the PLCO trial locally. This provides an opportunity to
investigate the capacity of the EMR to quantify variables necessary to study cancer
prevention. The HFHS PLCO data could serve as a bridge between previous large prospective cancer
prevention trials and future EMR-based investigations by clarifying what information is present in,
extractable from, and imputable from the EMR. The aims of this project are to (1) examine
machine-learning-based techniques for identifying cancer risk and prevention factors from the EMR and
(2) compare HFHS patients' PLCO data against the information extracted electronically from their EMRs.
Aims

The Henry Ford Health System was an active PLCO trial site and has maintained an EMR for more
than 20 years. It is unclear how PLCO questionnaire responses compare to EMR data for HFHS
patients who participated in the trial. Recognizing that the PLCO study could serve as a
bridge between previous prevention trials and future EMR-based studies, this project will focus on the
following research objectives:
(1) To assess natural language processing (NLP) techniques for identifying cancer risk and related factors
from the EMR. We hypothesize that existing general-purpose information extraction techniques can be
adapted to process EMR data stored within the Henry Ford Health System (HFHS), using lung cancer and
melanoma as examples.
(2) To compare risk factor information extracted from the HFHS EMR with previous PLCO patient survey
responses. We hypothesize that EMR information will correlate highly with survey data.
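
The comparison in the second objective could be sketched as follows. This is a minimal illustration only, not the project's method: the smoking-status labels are invented, the function name cohen_kappa is hypothetical, and percent agreement plus Cohen's kappa are assumed here as one common way to compare two categorical sources for the same patients (the project text itself speaks only of "correlation").

```python
from collections import Counter


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: share of patients with identical labels in both sources.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: computed from each source's marginal label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sources use one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical smoking-status labels for the same six patients (not real data).
survey = ["never", "former", "current", "never", "former", "never"]
emr = ["never", "former", "former", "never", "former", "current"]

agreement = sum(s == e for s, e in zip(survey, emr)) / len(survey)
kappa = cohen_kappa(survey, emr)
print(f"agreement={agreement:.2f}, kappa={kappa:.2f}")
```

Kappa is used in this sketch because raw agreement can look high by chance when one category dominates; a real analysis would choose its agreement or correlation measure per variable type.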

Collaborators

Alexander Kotov, PhD, Wayne State University