Observer variability for Lung-RADS categorisation of lung cancer screening CTs: impact on patient management.
van Riel SJ, Jacobs C, Scholten ET, Wittenberg R, Winkler Wille MM, de Hoop B, Sprengers R, Mets OM, Geurts B, Prokop M, Schaefer-Prokop C, van Ginneken B
OBJECTIVES: Lung-RADS represents a categorical system published by the American College of Radiology to standardise management in lung cancer screening. The purpose of the study was to quantify how well readers agree in assigning Lung-RADS categories to screening CTs; secondary goals were to assess causes of disagreement and evaluate its impact on patient management.
METHODS: For the observer study, 80 baseline and 80 follow-up scans were randomly selected from the National Lung Screening Trial (NLST) so that all Lung-RADS categories were equally represented. Agreement of seven observers was analysed using Cohen's kappa statistics. Discrepancies were correlated with patient management, test performance and diagnosis of malignancy within the scan year.
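As an illustration of the agreement analysis described above, the following is a minimal sketch of mean pairwise Cohen's kappa across several readers. It assumes unweighted kappa (the abstract does not state whether a weighted variant was used), and the function names, reader labels and toy ratings are hypothetical and not taken from the study.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: fraction of scans given the same category by both readers.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of the two readers' marginal category frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers used one identical category throughout
    return (observed - expected) / (1 - expected)


def mean_pairwise_kappa(ratings):
    """Average kappa over all reader pairs; ratings maps reader -> list of labels."""
    kappas = [
        cohen_kappa(ratings[r1], ratings[r2])
        for r1, r2 in combinations(sorted(ratings), 2)
    ]
    return sum(kappas) / len(kappas)


# Hypothetical toy ratings (Lung-RADS-like labels), not study data.
toy_ratings = {
    "reader_a": ["1", "2", "3", "4A", "4B", "2"],
    "reader_b": ["1", "2", "3", "4A", "4A", "2"],
    "reader_c": ["1", "2", "4A", "4A", "4B", "3"],
}
print(mean_pairwise_kappa(toy_ratings))
```

With seven readers, `combinations` yields 21 reader pairs, and the mean of their kappa values corresponds to the kind of summary statistic reported in the results.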
RESULTS: Pairwise interobserver agreement was substantial (mean kappa 0.67, 95% CI 0.58-0.77). Lung-RADS category disagreement was seen in 29% (971/3360) of reading pairs, resulting in different patient management in 8% (278/3360). Out of the 91 reading pairs that referred to scans with a tumour diagnosis within 1 year, discrepancies in only two would have resulted in a substantial management change.
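The reading-pair counts above can be checked with a few lines of arithmetic. This sketch assumes that every one of the seven observers read all 160 scans, which is an inference from the reported total of 3360 reading pairs rather than something the abstract states directly; the variable names are illustrative.

```python
from math import comb

readers = 7
scans = 80 + 80  # baseline plus follow-up scans

# Each scan contributes one reading pair per unordered pair of readers.
reading_pairs = comb(readers, 2) * scans

category_disagreements = 971  # reported in the results
management_differences = 278  # reported in the results

pct_disagreement = 100 * category_disagreements / reading_pairs
pct_management = 100 * management_differences / reading_pairs

print(reading_pairs)
print(f"{pct_disagreement:.1f}% category disagreement")
print(f"{pct_management:.1f}% different management")
```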
CONCLUSIONS: Assignment of lung cancer screening CT scans to Lung-RADS categories achieves substantial interobserver agreement. Impact of disagreement on categorisation of malignant nodules was low.
KEY POINTS: • Lung-RADS categorisation of low-dose lung screening CTs achieved substantial interobserver agreement. • Major cause for disagreement was assigning a different nodule as risk-dominant. • Disagreement led to a different follow-up time in 8% of reading pairs.