
The following PLCO Lung datasets are available for delivery on CDAS. For each dataset, a Data Dictionary that describes the data is publicly available. To obtain the actual data in SAS or CSV format, you must begin a data-only request. Data will be delivered once the project is approved and data transfer agreements are completed.

To learn more about the Lung data collected as part of the study, visit the PLCO Data Collected documentation page.


Datasets and Data Dictionaries

1. The Lung dataset is a comprehensive dataset that contains nearly all the PLCO study data available for lung cancer screening, incidence, and mortality analyses. The dataset contains one record for each of the approximately 155,000 participants in the PLCO trial.
   Data Dictionary (PDF - 592.2 KB)
2. The Lung Screening dataset (~236,000, one record per year of screening) contains additional information from chest x-ray cancer screens. This includes results, detailed findings, reasons for inadequate exams, and additional findings that were not suspicious for cancer.
   Data Dictionary (PDF - 270.8 KB)
3. The Lung Screening Abnormalities dataset (~138,000, one record per abnormality) contains information for each abnormality found during the x-ray screen. This includes the location and type.
   Data Dictionary (PDF - 192.9 KB)
4. The Lung Diagnostic Procedures dataset (~63,000, one record per procedure) contains information about the diagnostic procedures prompted by a positive lung cancer screen, as well as diagnostic/staging procedures associated with any lung cancers diagnosed during the 13 years of follow-up.
   Data Dictionary (PDF - 195.9 KB)
5. The Lung Medical Complications dataset (~1,400, one record per medical complication) contains information about the medical complications caused by diagnostic workup for lung cancer.
   Data Dictionary (PDF - 179.5 KB)
6. The Lung Treatments dataset (~6,000, one record per treatment procedure) contains specifics of the initial treatment following the diagnosis of lung cancer.
   Data Dictionary (PDF - 176.5 KB)
7. The Lung Pathology Image Linkage dataset (~1,500, one record per image) contains identifiers necessary to link slide images with participants. This data is only provided for projects receiving H&E stained pathology images.
   Data Dictionary (PDF - 167.3 KB)
8. The Lung X-Ray Image Standard 25K dataset (25,000, one record per person in the standard selection) contains variables reporting each participant's x-ray image availability. This data is only provided for projects receiving x-ray images.
   Data Dictionary (PDF - 166.8 KB)
9. The Lung X-Ray Image Linkage dataset (~89,000, one record per image) contains identifiers necessary to link x-ray images with participants' screens. This data is only provided for projects receiving x-ray images.
   Data Dictionary (PDF - 166.7 KB)
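
The datasets above differ in granularity: the Lung dataset has one record per participant, while the Lung Screening dataset has one record per year of screening. As a rough illustration of how a one-record-per-participant file might be combined with a one-record-per-screen file once CSV data has been delivered, the following Python sketch groups screening rows under each participant. The column names (participant_id, arm, study_year, result) and the sample rows are hypothetical and made up for this example; the actual variable names are defined in each Data Dictionary.

```python
import csv
import io
from collections import defaultdict

# Hypothetical miniature extracts standing in for delivered CSV files.
# Column names and values are illustrative assumptions, not taken from
# the PLCO Data Dictionaries.
PERSON_CSV = """participant_id,arm
A001,intervention
A002,intervention
A003,control
"""

SCREEN_CSV = """participant_id,study_year,result
A001,0,negative
A001,1,positive
A002,0,negative
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per record."""
    return list(csv.DictReader(io.StringIO(text)))


def screens_by_participant(screen_rows):
    """Group per-screen records under their participant identifier."""
    grouped = defaultdict(list)
    for row in screen_rows:
        grouped[row["participant_id"]].append(row)
    return grouped


def attach_screens(person_rows, screen_rows):
    """Return one dict per participant with a list of that person's screens.

    Participants with no screening records keep an empty list, so the
    result still has exactly one entry per participant.
    """
    grouped = screens_by_participant(screen_rows)
    return [
        {**person, "screens": grouped.get(person["participant_id"], [])}
        for person in person_rows
    ]


linked = attach_screens(read_rows(PERSON_CSV), read_rows(SCREEN_CSV))
for record in linked:
    print(record["participant_id"], len(record["screens"]))
```

Keeping the participant-level file as the base of the join preserves one entry per participant, including people with no screening records. The same pattern would apply to the other one-record-per-event datasets, assuming they share a participant identifier.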

User Guides and Other Files

User Guides explain how to use the data contained in these datasets.

For PLCO:
PLCO User Guide (PDF - 360.0 KB)

Data-Collection Forms

The following forms were used to collect data that is now available in the datasets listed above. They are provided in PDF format.

Baseline Questionnaire - Female
BQF1
BQF2
BQF3 - Scanned (1.7 MB)
Baseline Questionnaire - Male
BQM1
BQM2
BQM3 - Scanned (1.6 MB)
Chest X-Ray Screening Exam Form
XRY1
XRY2 - Scanned (474.1 KB)
Lung Diagnostic Evaluation Form
DEL
DEL2
DEL3 - Scanned (2.4 MB)
Lung Treatment Information Form
TIL1
TIL2 - Scanned (1.1 MB)
Annual Study Update Form
ASU - Scanned (82.2 KB)