What is the Recurrent Adenoma Dataset?

The Recurrent Adenoma Dataset is a dataset for recurrent colorectal adenoma analyses. It contains approximately 2000 PLCO participants who had at least one adenoma and at least one subsequent colonoscopy. The first adenoma is referred to as the baseline adenoma. Detection of an adenoma after baseline indicates recurrence. About half of these participants were assessed as having recurrent adenoma.

Much of the data was not part of the PLCO data collection protocol. Please read the information on this page prior to accessing this dataset.

What Data is in This Dataset?

The Recurrent Adenoma Dataset is narrowly focused on detection and timing of recurrent adenoma.

This dataset contains variables which describe the location (left or right colon) and characteristics of the baseline adenoma and the count and findings of colonoscopies after baseline. For participants with recurrent adenoma, the dataset includes the location (left or right colon) and characteristics of recurrent adenoma and the time from baseline adenoma to recurrent adenoma. For participants without recurrent adenoma, the dataset includes the time to last colonoscopy.
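The two timing variables described above map naturally onto a time-to-event layout. The sketch below shows one common way to combine them, treating participants without recurrence as censored at their last colonoscopy. This is an analysis assumption, not a method prescribed by the dataset, and the function and parameter names (`follow_up`, `days_to_recurrence`, `days_to_last_colonoscopy`) are hypothetical placeholders rather than actual variable names from the data dictionary.

```python
from typing import Optional, Tuple


def follow_up(recurrent: bool,
              days_to_recurrence: Optional[int],
              days_to_last_colonoscopy: Optional[int]) -> Tuple[int, int]:
    """Return (time, event) for one participant.

    event = 1: recurrent adenoma observed; time runs from baseline
               adenoma to recurrent adenoma.
    event = 0: no recurrence observed; time runs from baseline
               adenoma to the last colonoscopy (assumed censoring).
    All argument names are hypothetical, not dataset variable names.
    """
    if recurrent:
        if days_to_recurrence is None:
            raise ValueError("recurrent participant is missing time to recurrence")
        return days_to_recurrence, 1
    if days_to_last_colonoscopy is None:
        raise ValueError("non-recurrent participant is missing time to last colonoscopy")
    return days_to_last_colonoscopy, 0


# Made-up values for illustration only.
print(follow_up(True, 1200, None))
print(follow_up(False, None, 2100))
```

The resulting (time, event) pairs are the usual input to survival-analysis routines. Check the data dictionary for the real variable names and time units before adapting this.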

Nearly all data for the baseline adenoma comes from the diagnostic colonoscopy in the baseline year of the PLCO screening trial. This was part of the PLCO data collection protocol.

Nearly all data for recurrent adenoma comes from the Study of Colonoscopy Utilization (SCU), which retrospectively surveyed the frequency and yield of surveillance colonoscopies among selected participants (N~5400) from the intervention arm of the PLCO trial. It is recommended you become familiar with this ancillary study prior to using the Recurrent Adenoma Dataset.

For colorectal data not found in the Recurrent Adenoma Dataset, you may merge this dataset with the Colorectal dataset also found on this site.
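As a minimal sketch of such a merge, the example below performs a left join that keeps every Recurrent Adenoma record and attaches matching Colorectal fields. It assumes both files share a participant identifier, here called `plco_id`. That key name, the helper `left_join`, and all field names and values in the sample rows are illustrative assumptions, so confirm the actual identifier in the data dictionaries.

```python
from typing import Dict, List


def left_join(adenoma_rows: List[Dict], colorectal_rows: List[Dict],
              key: str = "plco_id") -> List[Dict]:
    """Left-join colorectal fields onto recurrent adenoma records.

    Every adenoma row is kept. If a field name exists in both rows,
    the value from the adenoma row wins.
    """
    # Index the colorectal rows by participant identifier.
    by_id = {row[key]: row for row in colorectal_rows}
    merged = []
    for row in adenoma_rows:
        extra = by_id.get(row[key], {})  # empty if there is no match
        merged.append({**extra, **row})
    return merged


# Hypothetical rows; field names and values are made up.
adenoma = [{"plco_id": "A-001", "recurrent": 1},
           {"plco_id": "A-002", "recurrent": 0}]
colorectal = [{"plco_id": "A-001", "age": 62},
              {"plco_id": "A-003", "age": 70}]

for record in left_join(adenoma, colorectal):
    print(record)
```

A left join is used here because the Recurrent Adenoma Dataset defines the analysis population; colorectal records for participants outside it are dropped.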

Who is in the Recurrent Adenoma Dataset?

The Recurrent Adenoma Dataset contains ~2000 PLCO participants. All participants were from the intervention arm of the trial and had a baseline screening flexible sigmoidoscopy (FSG) that was positive for lesions. In diagnostic follow-up for this screen, they all had at least one adenoma detected and surgically removed. In addition, all must have had at least one colonoscopy at a later date to assess whether adenomas were again detected as of that date, so that recurrent adenoma status can be determined.

Approximately 94% of participants in this dataset were also in the SCU ancillary study.

The remaining 6% of participants in this dataset were not in SCU, but did have recurrent adenoma observed following the second PLCO screening round.

By design, participants with colorectal cancer at baseline were not included in the Recurrent Adenoma Dataset.

How Was Recurrent Adenoma Status Determined?

Recurrent adenoma status was almost exclusively determined from data collected by the SCU ancillary study.

SCU queried ~5400 PLCO participants, intentionally oversampling participants with adenoma at baseline. The Recurrent Adenoma Dataset contains the subset of these SCU participants who had surveillance colonoscopy. A participant's recurrent status was determined from surveillance colonoscopy records: either the participant had recurrent adenoma, or no recurrent adenoma was observed on any of their surveillance colonoscopies.

SCU is the source of >80% of the recurrent adenomas in the dataset.

The remaining <20% of recurrent adenomas were observed following the second PLCO screening round. Note that some of these cases are among participants who were also in SCU.

SCU is the source of 100% of non-cases (i.e., participants without recurrence), because it is not possible to determine non-case status from the PLCO screening data alone.