PreMSIm: An R package for predicting microsatellite instability from the expression profiling of a gene panel in cancer
- PMID: 32257050
- PMCID: PMC7113609
- DOI: 10.1016/j.csbj.2020.03.007
Abstract
Microsatellite instability (MSI) is a genomic property of cancers with defective DNA mismatch repair and is a useful marker for cancer diagnosis and treatment in diverse cancer types. In particular, MSI has been associated with an active response to immune checkpoint blockade therapy in cancer. Most computational methods for predicting MSI are based on DNA sequencing data, and only a few are based on mRNA expression data. Using the RNA-Seq pan-cancer datasets for three cancer cohorts (colon, gastric, and endometrial cancers) from The Cancer Genome Atlas (TCGA) program, we developed an algorithm (PreMSIm) for predicting MSI from the expression profiling of a 15-gene panel in cancer. We demonstrated that PreMSIm predicted MSI with high performance in most cases using both RNA-Seq and microarray gene expression datasets. Moreover, PreMSIm displayed superior or comparable performance versus other DNA- or mRNA-based methods. We conclude that PreMSIm has the potential to provide an alternative approach for identifying MSI in cancer.
Keywords: ACC, adrenocortical carcinoma; AUC, area under the curve; Algorithm; BLCA, bladder urothelial carcinoma; BRCA, breast invasive carcinoma; CESC, cervical squamous cell carcinoma and endocervical adenocarcinoma; CHOL, cholangiocarcinoma; COAD, colon adenocarcinoma; CV, cross validation; Cancer; Classification; DLBC, lymphoid neoplasm diffuse large B-cell lymphoma; ESCA, esophageal carcinoma; GBM, glioblastoma multiforme; GEO, Gene Expression Omnibus; GO, gene ontology; Gene expression profiling; HNSC, head and neck squamous cell carcinoma; KICH, kidney chromophobe; KIRC, kidney renal clear cell carcinoma; KIRP, kidney renal papillary cell carcinoma; LAML, acute myeloid leukemia; LGG, brain lower grade glioma; LIHC, liver hepatocellular carcinoma; LUAD, lung adenocarcinoma; LUSC, lung squamous cell carcinoma; MESO, mesothelioma; MSI, microsatellite instability; MSS, microsatellite stability; Machine learning; Microsatellite instability; OV, ovarian serous cystadenocarcinoma; PAAD, pancreatic adenocarcinoma; PCPG, pheochromocytoma and paraganglioma; PPI, protein-protein interaction; PRAD, prostate adenocarcinoma; READ, rectum adenocarcinoma; RF, random forest; ROC, receiver operating characteristic; SARC, sarcoma; SKCM, skin cutaneous melanoma; STAD, stomach adenocarcinoma; SVM, support vector machine; TCGA, The Cancer Genome Atlas; TGCT, testicular germ cell tumors; THCA, thyroid carcinoma; THYM, thymoma; UCEC, uterine corpus endometrial carcinoma; UCS, uterine carcinosarcoma; UVM, uveal melanoma; XGBoost, extreme gradient boosting; k-NN, k-nearest neighbor.
© 2020 The Authors.
Conflict of interest statement
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.