Oncotarget. 2017 Sep 15;8(50):87494-87511.
doi: 10.18632/oncotarget.20903. eCollection 2017 Oct 20.

Identifying and analyzing different cancer subtypes using RNA-seq data of blood platelets

Yu-Hang Zhang et al. Oncotarget.

Abstract

Detection and diagnosis of cancer are especially important for early prevention and effective treatment. Traditional methods of cancer detection are usually time-consuming and expensive. Liquid biopsy, a recently proposed noninvasive detection approach, can improve the accuracy and decrease the cost of detection by using a personalized expression profile. However, few studies have analyzed this type of data, although such analyses could lead to more effective methods for detecting different cancer subtypes. In this study, we applied several machine learning algorithms to RNA-seq data of blood platelets from patients who had one of six cancer subtypes (breast cancer, colorectal cancer, glioblastoma, hepatobiliary cancer, lung cancer and pancreatic cancer) and from healthy persons. Each sample was encoded by its quantitative gene expression profile. The profiles were then analyzed with the maximum relevance minimum redundancy (mRMR) method, which produced two ranked feature lists of genes: the mRMR feature list and the MaxRel feature list. The incremental feature selection (IFS) method was applied to the mRMR feature list to extract the optimal feature subset, on which a support vector machine (SVM) classifier achieved the best performance in distinguishing the cancer subtypes and healthy controls. Ten-fold cross-validation of the resulting optimal classification model yielded an overall accuracy of 0.751. In addition, we extracted the top eighteen features (genes) of the MaxRel feature list, including TTN, RHOH, RPS20 and TRBC2, and analyzed them in detail. The results indicated that these genes could be important biomarkers for discriminating different cancer subtypes and healthy controls.

Keywords: RNA-seq data; cancer detection; liquid biopsy; maximum relevance minimum redundancy; support vector machine.
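
The difference between the two feature lists named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes expression values have already been discretized, uses the common "relevance minus mean redundancy" form of the mRMR criterion, and runs on invented toy data. The names `gene_a`, `gene_b`, `gene_c`, `FEATURES` and `LABELS` are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def maxrel_ranking(features, labels):
    """MaxRel list: rank features by relevance I(feature; label) only."""
    relevance = {name: mutual_information(col, labels) for name, col in features.items()}
    return sorted(relevance, key=relevance.get, reverse=True)


def mrmr_ranking(features, labels):
    """mRMR list: greedily add the feature with the highest
    relevance minus mean redundancy with the features already selected."""
    relevance = {name: mutual_information(col, labels) for name, col in features.items()}
    selected, remaining = [], set(features)
    while remaining:
        scores = {}
        for name in sorted(remaining):  # sorted for deterministic tie-breaking
            redundancy = 0.0
            if selected:
                redundancy = sum(
                    mutual_information(features[name], features[s]) for s in selected
                ) / len(selected)
            scores[name] = relevance[name] - redundancy
        best = max(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (invented): 12 samples, 4 classes, 3 discretized "genes".
LABELS = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]
FEATURES = {
    "gene_a": [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1],  # separates classes {0,1} from {2,3}
    "gene_b": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1],  # near-copy of gene_a (redundant)
    "gene_c": [0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0],  # weaker, but complementary to gene_a
}

print("MaxRel:", maxrel_ranking(FEATURES, LABELS))
print("mRMR:  ", mrmr_ranking(FEATURES, LABELS))
```

On this toy data, MaxRel places the redundant `gene_b` ahead of `gene_c`, whereas mRMR promotes `gene_c` because it adds information that `gene_a` does not already carry.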

Conflict of interest statement

No potential conflicts of interest were disclosed.

Figures

Figure 1. IFS curves for the results obtained in the first stage of the IFS method
The Y-axis represents the overall accuracy, and the X-axis represents the number of features used for classification. The high overall accuracies (no less than 0.740) all cluster between 2000 and 2200 features.

Figure 2. IFS curves for the results obtained in the second stage of the IFS method
The Y-axis represents the overall accuracy, and the X-axis represents the number of features used for classification. The highest overall accuracy was 0.751 when 2047 features were used.

Figure 3. The performance of the optimal classification model evaluated by ten-fold cross-validation

Figure 4. The heat map of all samples using the eighteen important genes

Figure 5. The eighteen important genes found in the MaxRel feature list were clustered into three groups

Figure 6. The flow chart of constructing the mRMR feature list in the mRMR method
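
The incremental feature selection summarized in Figures 1 and 2 can be sketched as a loop that evaluates the top-k features of a ranked list for increasing k and keeps the k with the highest cross-validated accuracy. The sketch below is only an illustration under stated assumptions: a nearest-centroid classifier stands in for the support vector machine used in the study so that the example needs no third-party library, the folds are assigned by sample index rather than at random, and the data, the ranking `RANKED` and all function names are hypothetical.

```python
def nearest_centroid_predict(train_X, train_y, test_X):
    """Stand-in classifier (NOT an SVM): assign each test row to the closest class mean."""
    groups = {}
    for row, label in zip(train_X, train_y):
        groups.setdefault(label, []).append(row)
    centroids = {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }

    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    return [
        min(sorted(centroids), key=lambda lab: sq_dist(row, centroids[lab]))
        for row in test_X
    ]


def cv_accuracy(X, y, n_folds):
    """Overall accuracy under k-fold cross-validation (sample i goes to fold i % n_folds)."""
    n = len(X)
    correct = 0
    for fold in range(n_folds):
        test_idx = [i for i in range(n) if i % n_folds == fold]
        train_idx = [i for i in range(n) if i % n_folds != fold]
        preds = nearest_centroid_predict(
            [X[i] for i in train_idx], [y[i] for i in train_idx], [X[i] for i in test_idx]
        )
        correct += sum(p == y[i] for p, i in zip(preds, test_idx))
    return correct / n


def incremental_feature_selection(X, y, ranked, sizes, n_folds):
    """Evaluate the top-k ranked features for each k in `sizes`; return (best k, IFS curve)."""
    curve = {}
    for k in sizes:
        cols = ranked[:k]
        subset = [[row[j] for j in cols] for row in X]
        curve[k] = cv_accuracy(subset, y, n_folds)
    best_k = max(curve, key=curve.get)  # first maximum, i.e. smallest k if sizes ascend
    return best_k, curve


# Toy data (invented): 6 samples, 2 classes, 3 features; column 2 is noise.
X = [
    [0.0, 0.5, 3.0],
    [1.0, 0.5, 0.0],
    [0.2, 0.5, 0.0],
    [0.8, 0.5, 3.0],
    [0.6, 0.0, 0.0],
    [0.4, 1.0, 3.0],
]
Y = [0, 1, 0, 1, 0, 1]
RANKED = [0, 1, 2]  # hypothetical feature ranking, e.g. taken from an mRMR list

BEST_K, CURVE = incremental_feature_selection(X, Y, RANKED, sizes=[1, 2, 3], n_folds=3)
print(BEST_K, CURVE)
```

A two-stage search like the one the captions describe could call `incremental_feature_selection` twice: first with a coarse grid of sizes to locate a promising interval (2000 to 2200 features in Figure 1), then with every size inside that interval to find the peak (2047 features in Figure 2). The step size of the coarse grid is not stated here, so any value chosen for it would be an assumption.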
