J Clin Oncol. 2023 Apr 20;41(12):2191-2200.
doi: 10.1200/JCO.22.01345. Epub 2023 Jan 12.

Sybil: A Validated Deep Learning Model to Predict Future Lung Cancer Risk From a Single Low-Dose Chest Computed Tomography

Peter G Mikhael et al. J Clin Oncol.

Abstract

Purpose: Low-dose computed tomography (LDCT) for lung cancer screening is effective, although most eligible people are not being screened. Tools that provide personalized future cancer risk assessment could focus approaches toward those most likely to benefit. We hypothesized that a deep learning model assessing the entire volumetric LDCT data could be built to predict individual risk without requiring additional demographic or clinical data.

Methods: We developed a model called Sybil using LDCTs from the National Lung Screening Trial (NLST). Sybil requires only one LDCT and does not require clinical data or radiologist annotations; it can run in real time in the background on a radiology reading station. Sybil was validated on three independent data sets: a heldout set of 6,282 LDCTs from NLST participants, 8,821 LDCTs from Massachusetts General Hospital (MGH), and 12,280 LDCTs from Chang Gung Memorial Hospital (CGMH, which included people with a range of smoking history including nonsmokers).

Results: Sybil achieved areas under the receiver operating characteristic curve (AUCs) for lung cancer prediction at 1 year of 0.92 (95% CI, 0.88 to 0.95) on NLST, 0.86 (95% CI, 0.82 to 0.90) on MGH, and 0.94 (95% CI, 0.91 to 1.00) on CGMH external validation sets. Concordance indices over 6 years were 0.75 (95% CI, 0.72 to 0.78), 0.81 (95% CI, 0.77 to 0.85), and 0.80 (95% CI, 0.75 to 0.86) for NLST, MGH, and CGMH, respectively.

Conclusion: Sybil can accurately predict an individual's future lung cancer risk from a single LDCT scan to further enable personalized screening. Future study is required to understand Sybil's clinical applications. Our model and annotations are publicly available.

[Media: see text].
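
The Results report two kinds of metric: a 1-year AUC and a 6-year concordance index. The sketch below shows, on toy data, how such quantities are commonly computed. It is an illustration only: the function names are hypothetical, the C-index shown is a plain Harrell-style pairwise estimate, and the abstract does not state which estimator or software the authors used.

```python
from itertools import combinations


def auc(scores, labels):
    """Probability that a random positive case is scored above a random
    negative case (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def c_index(risks, times, events):
    """Harrell-style concordance (an assumed estimator): among comparable
    pairs, the fraction in which the subject diagnosed earlier was given
    the higher predicted risk."""
    concordant, comparable = 0.0, 0
    for i, j in combinations(range(len(risks)), 2):
        # Order the pair so that subject a has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # Comparable only if the earlier subject had an event and times differ.
        if not events[a] or times[a] == times[b]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (invented for illustration, not from the study).
toy_auc = auc([0.9, 0.8, 0.3, 0.2, 0.1], [1, 0, 1, 0, 0])
toy_c = c_index([0.5, 0.6, 0.4, 0.1], [1, 3, 2, 6], [1, 1, 0, 0])
```

On this toy input, five of the six positive-negative pairs are ranked correctly, and three of the four comparable survival pairs are concordant.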


Conflict of interest statement

The following represents disclosure information provided by authors of this manuscript. All relationships are considered compensated unless otherwise noted. Relationships are self-held unless noted. I = Immediate Family Member, Inst = My Institution. Relationships may not relate to the subject matter of this manuscript. For more information about ASCO's conflict of interest policy, please refer to www.asco.org/rwc or ascopubs.org/jco/authors/author-center.

Open Payments is a public database containing information reported by companies about payments made to US-licensed physicians (Open Payments).

Figures

FIG 1.
(A) Annotation of lung cancers in Sybil training. For NLST participants who were diagnosed with lung cancer within 1 year of an LDCT examination, thoracic radiologists drew two-dimensional bounding boxes (purple) on every image showing the lesion, generating a 3D volume of each cancer to assist with model training. Each image below shows a different cancer from the NLST data set. (B) Data set construction flowcharts. Disposition of patients, LDCT examinations, and individual series within LDCTs from the data sets received from the NLST (left), MGH (center), and CGMH (right). Red font indicates a data filtration step. CGMH, Chang Gung Memorial Hospital; LDCT, low-dose chest computed tomography; MGH, Massachusetts General Hospital; NLST, National Lung Screening Trial.
FIG 2.
Receiver operating characteristic curves displaying Sybil's ability to predict future lung cancer over 6 years following a single low-dose computed tomography from the (A) NLST, (B) MGH, and (C) CGMH test sets. CIs for each curve can be found in Table 1. AUC, area under the curve; C-index, concordance index; CGMH, Chang Gung Memorial Hospital; MGH, Massachusetts General Hospital; NLST, National Lung Screening Trial.
FIG 3.
Examples of screening scans with negative clinical interpretations (Lung-RADS 1 or 2) and high Sybil risk scores from participants who subsequently developed lung cancer. Paired sets of images from four separate subjects from the National Lung Screening Trial and Massachusetts General Hospital cohorts illustrating Sybil's potential in predicting future lung cancer. Clinical (preoperative) or pathologic (postoperative) stages are provided using American Joint Committee on Cancer version 8. (A) A 69-year-old man with a 99 pack-year smoking history and LDCT without visible nodules in the right upper lobe (circle; Lung-RADS score 2, Sybil risk 75th percentile). (B) Two years later (after unchanged interval scan at 1 year), a new spiculated solid nodule appeared (arrow), and resection confirmed a 2.2-cm poorly differentiated squamous cancer (pT1cN0M0, stage IA3). (C) A 67-year-old man with a 30 pack-year smoking history and LDCT with a 7-mm solid nodule in the lingula next to the heart (arrow), which was missed because of human error (Lung-RADS score 2, Sybil risk 62nd percentile). (D) One year later, a 1.5-cm solid spiculated nodule was appreciated (arrow), and mediastinal sampling confirmed adenocarcinoma (cT1bN2M0, stage IIIA). (E) A 73-year-old man with an 80 pack-year smoking history and LDCT with a new solid nodule < 6 mm in the left upper lobe, that is, below the size threshold, which would have triggered a 6-month interval scan (Lung-RADS score 2, Sybil risk 65th percentile). (F) Two years later, after missing the recommended annual screen, a solid spiculated nodule was noted (arrow), and resection confirmed a 1.8-cm moderately differentiated squamous cell cancer (pT1bN0M0, stage IA2). (G) A 74-year-old man with a 30 pack-year smoking history and LDCT showing an ill-defined cystic airspace in the left apex (arrow; Lung-RADS score 2, Sybil risk 69th percentile). Cyst-associated lung cancers are among the most difficult to recognize early. (H) Two years later, the lesion (arrow) had increased in size and resection confirmed a 2.1-cm moderately differentiated adenocarcinoma (invasive size 1.3 cm; pT1bN0M0, stage IA2). LDCT, low-dose computed tomography; Lung-RADS, Lung Imaging Reporting and Data Systems.
FIG A1.
Architecture of Sybil. We first extract features from the input LDCT volume via a pretrained 3D ResNet-18 encoder. These features are used to compute a global feature vector for the volume through a max pooling layer and an attention-guided pooling layer. The resulting vectors are concatenated and passed through a hazard layer to produce a cumulative probability of developing lung cancer within 6 years. We trained the same architecture five times, and Sybil is the ensemble of these five models, whose risk predictions are averaged. Bounding box annotations of visible cancer nodules were used to guide the model's attention during training but are not used during testing. LDCT, low-dose computed tomography.
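
The pooling, hazard, and ensembling steps in this caption can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the published implementation: the encoder output is taken as a given list of per-location feature vectors, all function and parameter names are hypothetical, and the hazard layer is assumed to add non-negative yearly increments to a base score so that cumulative risk never decreases from year to year (the caption does not specify its exact form).

```python
import math


def dot(w, x):
    return sum(a * b for a, b in zip(w, x))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def max_pool(feats):
    """Channel-wise maximum over all spatial locations."""
    return [max(channel) for channel in zip(*feats)]


def attention_pool(feats, w_att):
    """Softmax-weighted average of location features; w_att is a
    hypothetical learned scoring vector."""
    scores = [dot(w_att, f) for f in feats]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = [sum(a * f[c] for a, f in zip(weights, feats))
              for c in range(len(feats[0]))]
    return pooled, weights


def hazard_layer(vec, w_base, w_years, b_base=0.0):
    """Assumed form: base score plus a running sum of non-negative yearly
    increments, squashed to a probability for each year."""
    running = b_base + dot(w_base, vec)
    probs = []
    for w in w_years:
        running += max(0.0, dot(w, vec))  # clamp keeps risk monotone
        probs.append(sigmoid(running))
    return probs


def predict(feats, w_att, w_base, w_years):
    """One ensemble member: pool two ways, concatenate, apply hazard layer."""
    pooled_att, _ = attention_pool(feats, w_att)
    vec = max_pool(feats) + pooled_att  # list concatenation
    return hazard_layer(vec, w_base, w_years)


def ensemble(predictions):
    """Average the per-year risks of several independently trained members."""
    return [sum(year) / len(year) for year in zip(*predictions)]
```

Passing six weight vectors in `w_years` yields six cumulative probabilities, one per follow-up year, and `ensemble` averages the outputs of several calls to `predict` made with different parameter sets.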
FIG A2.
Sybil's accuracy in predicting clinical risk factors. Predictions on the basis of low-dose chest computed tomography images compared with the majority baseline. Error bars represent bootstrapped 95% CIs. COPD, chronic obstructive pulmonary disease; LDCT, low-dose computed tomography.
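
As a rough illustration of the two ideas named in this caption, the sketch below computes a majority-class baseline accuracy and a percentile bootstrap 95% CI for a classifier's accuracy. The function names, the resample count, and the percentile method are assumptions made for the example; the caption does not describe the authors' exact bootstrap procedure.

```python
import random
from collections import Counter


def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most common label."""
    return Counter(labels).most_common(1)[0][1] / len(labels)


def bootstrap_ci(preds, labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for accuracy (assumed method): resample
    cases with replacement and take empirical quantiles of the statistic."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(accuracy([preds[i] for i in idx],
                              [labels[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper
```

A model adds information about a risk factor only to the extent that its accuracy, with its interval, sits above the majority baseline for that factor.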


