Sci Rep. 2023 Apr 11;13(1):5853.
doi: 10.1038/s41598-023-32979-6.

A study on the differential of solid lung adenocarcinoma and tuberculous granuloma nodules in CT images by Radiomics machine learning

Huibin Tan et al. Sci Rep.

Abstract

This study assessed the classification performance of a texture-feature machine learning method in distinguishing solid lung adenocarcinoma (SADC) from tuberculous granulomatous nodules (TGN) that appear as solid nodules (SN) on non-enhanced CT images. A total of 200 patients with SADC or TGN who underwent non-enhanced thoracic CT between January 2012 and October 2019 were included. From the lesions on these images, 490 texture feature values in 6 categories were extracted for machine learning. The classification prediction model was built with the best-performing classifier, selected according to the degree of fit of its learning curve during machine learning, and the effectiveness of the model was then tested and verified. For comparison, a logistic regression model was built from clinical data (demographic data plus the CT parameters and CT signs of solitary nodules). The area under the curve (AUC) was 0.82 for the prediction model based on clinical CT data and 0.65 for the model based only on CT parameters and CT signs, compared with 0.870 for the model based on radiomics features. The machine learning prediction model developed here can improve the differentiation of SADC from TGN presenting as SN and provide appropriate support for treatment decisions.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Research path map. Blue path: the routine clinical CT diagnosis process. Features were identified on CT images, and differential diagnosis models were built to evaluate CT features and clinical CT diagnosis. Green path: the machine learning process. Texture features were extracted from CT images with IBEX software and reduced by LASSO regression. The best classifier was selected from the learning curves, and the machine learning classification model was then built and tested.
Figure 2
Image segmentation diagram. Nodule segmentation: on the image displayed at the set window level and window width, the anchor points of the segmentation line are placed continuously along the boundary between the nodule and the lung. The anchor size is the software default, and there are no gaps between anchor points.
Figure 3
Machine learning flow chart. After LASSO regression, the samples were randomly divided into a training group and a test group at a ratio of 8:2. Twelve types of classifiers were trained, with cross-validation performed five times during learning. Learning curves were then drawn to obtain the training score and validation score of each classifier. Based on these scores, a classifier with a good fit and no overfitting was selected, and the machine learning classification model was built and tested.
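The split and selection steps in this caption can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the authors' code: the stratified split, the five-fold reading of the cross-validation, the `max_gap` threshold, the class balance, and the classifier names and scores (`clf_a`, `clf_b`, `clf_c`) are all hypothetical placeholders.

```python
import random

def split_train_test(labels, test_fraction=0.2, seed=0):
    """Stratified random split (assumption); returns (train_idx, test_idx)."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)

def k_fold_indices(indices, k=5, seed=0):
    """Yield (fit_idx, validation_idx) pairs for k-fold cross-validation."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        fit = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield fit, folds[i]

def pick_classifier(scores, max_gap=0.05):
    """scores maps name -> (training_score, validation_score).

    Drop classifiers whose training score exceeds the validation score by
    more than max_gap (a crude overfitting check), then return the name
    with the highest validation score, or None if nothing qualifies.
    """
    kept = {n: s for n, s in scores.items() if s[0] - s[1] <= max_gap}
    if not kept:
        return None
    return max(kept, key=lambda n: kept[n][1])

# Made-up labels for 200 samples; the real class balance is not given here.
labels = [0] * 120 + [1] * 80
train_idx, test_idx = split_train_test(labels)
folds = list(k_fold_indices(train_idx, k=5))

# Placeholder scores, not results from the study.
toy_scores = {"clf_a": (1.00, 0.78), "clf_b": (0.88, 0.86), "clf_c": (0.90, 0.80)}
chosen = pick_classifier(toy_scores)
```

With these placeholder scores, `clf_a` and `clf_c` are rejected because their training scores sit well above their validation scores, which is the pattern a learning curve shows for an overfitted model.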
Figure 4
Flow chart of case selection and exclusion. 200 patients were selected according to the inclusion criteria.
Figure 5
(a–h) Axial, sagittal and coronal CT images and pathological sections (HE, 400×, light microscope). A 65-year-old man underwent chest CT during a "physical examination", which revealed space-occupying lesions in the right lung. The postoperative pathological diagnosis was tuberculous granulomatous nodule. CT images showed superficial loculated nodules with rough edges, short and hard "burr", thin line-like adhesion to the pleura, and small blood vessels passing through the nodules (a–c). Pathological sections showed that the alveolar structure had been completely lost and that multinucleated giant cells and epithelioid cells had accumulated in a disorderly pattern (d). A 43-year-old woman with coronary heart disease was found to have a space-occupying lesion in the left lung during coronary CTA. The postoperative pathological diagnosis was invasive adenocarcinoma. CT images showed loculated, irregular nodules with smooth edges, long and soft "burr", pleural adhesion, and vascular and bronchial convergence (e–g). Pathological sections showed tumor cells with large nuclei and scant cytoplasm accumulating along the alveolar wall (h).
Figure 6
LASSO regression dimensionality reduction. Texture feature selection using the least absolute shrinkage and selection operator (LASSO) binary logistic regression model. (a) Tuning parameter (λ) selection in the LASSO model used ten-fold cross-validation via minimum criteria. The area under the receiver operating characteristic curve (AUC) was plotted versus log(λ). Dotted vertical lines were drawn at the optimal values by using the minimum criteria and the 1 standard error of the minimum criteria (the 1-SE criteria). A λ value of 0.009, with log(λ) = −4.709, was chosen (1-SE criteria) according to ten-fold cross-validation. (b) LASSO coefficient profiles of the 490 texture features. A coefficient profile plot was produced against the log(λ) sequence. A vertical line was drawn at the value selected using ten-fold cross-validation, where the optimal λ resulted in 23 nonzero coefficients.
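The 1-SE criterion named in this caption can be illustrated with a short sketch. It assumes the usual convention for a score where higher is better (such as AUC): take the largest penalty whose mean cross-validated score is within one standard error of the best score. The grid of penalty values, mean AUCs and standard errors below is invented for illustration and is not taken from the study. The last line only checks that the natural logarithm of 0.009 is about −4.71.

```python
import math

def select_lambda_1se(lambdas, mean_auc, se_auc):
    """Return (lambda_min, lambda_1se) for a higher-is-better CV score.

    lambda_min maximizes the mean score; lambda_1se is the largest
    (most regularizing) lambda whose mean score is still within one
    standard error of that maximum.
    """
    best = max(range(len(lambdas)), key=lambda i: mean_auc[i])
    threshold = mean_auc[best] - se_auc[best]
    eligible = [lam for lam, auc in zip(lambdas, mean_auc) if auc >= threshold]
    return lambdas[best], max(eligible)

# Hypothetical cross-validation summary (not the study's values).
toy_lambdas = [0.005, 0.01, 0.02, 0.04, 0.08]
toy_mean_auc = [0.83, 0.86, 0.85, 0.80, 0.70]
toy_se_auc = [0.02, 0.02, 0.02, 0.02, 0.02]

lambda_min, lambda_1se = select_lambda_1se(toy_lambdas, toy_mean_auc, toy_se_auc)

# Natural log of the penalty value quoted in the caption.
log_lambda = math.log(0.009)
```

Choosing the larger penalty trades a small loss in cross-validated AUC for a sparser model, which is why the 1-SE choice typically keeps fewer nonzero coefficients than the minimum-criteria choice.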
Figure 7
(a–c) Confusion matrix and ROC curve of the classification results for CT features alone and for biological characteristics combined with CT features.
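For readers unfamiliar with the two summaries shown in this figure, the sketch below computes a binary confusion matrix and the area under the ROC curve from predicted scores. It is a minimal standard-library illustration: the 0.5 decision threshold, the labels and the scores are made-up assumptions, and the AUC is computed with the rank (Mann-Whitney) formulation instead of by tracing the curve.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels coded as 0 and 1."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return tn, fp, fn, tp

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 1 = positive class, 0 = negative class (illustrative only).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
auc = roc_auc(y_true, scores)
```

The confusion matrix depends on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, so the two are usually reported together.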
Figure 8
(a–c) Confusion matrix and ROC curve of the classification results of the naive Bayesian classifier in the training group and the test group.
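As background on the classifier named in this caption, here is a minimal naive Bayes sketch in standard-library Python. The text does not say which variant was used, so the Gaussian likelihood is an assumption, as are the class name `GaussianNaiveBayes`, the variance smoothing constant and the two-feature toy data.

```python
import math
from collections import defaultdict

class GaussianNaiveBayes:
    """Naive Bayes with an independent Gaussian per feature and class."""

    def fit(self, X, y):
        by_class = defaultdict(list)
        for row, label in zip(X, y):
            by_class[label].append(row)
        self.params = {}
        for label, rows in by_class.items():
            stats = []
            for col in zip(*rows):  # iterate over feature columns
                mean = sum(col) / len(col)
                # Small constant keeps the variance strictly positive.
                var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
                stats.append((mean, var))
            log_prior = math.log(len(rows) / len(y))
            self.params[label] = (log_prior, stats)
        return self

    def log_joint(self, row):
        """Log prior plus summed per-feature Gaussian log likelihoods."""
        out = {}
        for label, (log_prior, stats) in self.params.items():
            total = log_prior
            for v, (mean, var) in zip(row, stats):
                total += (-0.5 * math.log(2 * math.pi * var)
                          - (v - mean) ** 2 / (2 * var))
            out[label] = total
        return out

    def predict(self, X):
        return [max(s, key=s.get) for s in map(self.log_joint, X)]

# Toy two-feature data (illustrative only, not texture features).
X_train = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2],
           [3.0, 0.5], [3.2, 0.7], [2.8, 0.3]]
y_train = [0, 0, 0, 1, 1, 1]

model = GaussianNaiveBayes().fit(X_train, y_train)
predictions = model.predict([[1.1, 2.1], [3.1, 0.4]])
```

Because each feature contributes an independent term, the model has few parameters per class, which is one reason naive Bayes tends to show a small gap between training and validation scores on modest sample sizes.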
