J Integr Bioinform. 2017 Jul 7;14(2):20170013.
doi: 10.1515/jib-2017-0013.

Prediction of Primary Tumors in Cancers of Unknown Primary

Dan Søndergaard et al. J Integr Bioinform.

Abstract

A cancer of unknown primary (CUP) is a metastatic cancer for which standard diagnostic tests fail to identify the location of the primary tumor. CUPs account for 3-5% of cancer cases. Using molecular data to determine the location of the primary tumor in such cases can help doctors make the right treatment choice and thus improve the clinical outcome. In this paper, we present a new method for predicting the location of the primary tumor using gene expression data: locating cancers of unknown primary (LoCUP). The method models the data as a mixture of normal and tumor cells and thus allows correct classification even in impure samples, where the tumor biopsy is contaminated by a large fraction of normal cells. We find that our method provides a significant increase in classification accuracy (95.8% over 90.8%) on simulated low-purity metastatic samples and shows potential on a small dataset of real metastasis samples with known origin.
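The mixture idea in the abstract can be illustrated with a small sketch. It assumes that each candidate tissue has a normal centroid and a tumor centroid in expression space, and that an impure sample lies near the line segment between them, at a position given by its purity alpha. The function names, the least-squares purity estimate, the squared-error score, and the toy centroids are all illustrative assumptions; this is not the LoCUP implementation, whose estimator is not described in the abstract.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def project_purity(x: Vector, normal: Vector, tumor: Vector) -> float:
    """Least-squares purity alpha in [0, 1] for x ~ alpha*tumor + (1 - alpha)*normal."""
    direction = [t - n for t, n in zip(tumor, normal)]
    denom = sum(d * d for d in direction)
    if denom == 0.0:
        return 0.0  # centroids coincide, so purity is not identifiable
    alpha = sum((xi - ni) * di for xi, ni, di in zip(x, normal, direction)) / denom
    return min(1.0, max(0.0, alpha))  # clamp to a valid mixing proportion


def squared_residual(x: Vector, normal: Vector, tumor: Vector, alpha: float) -> float:
    """Squared distance between x and the mixture point implied by alpha."""
    return sum(
        (xi - (alpha * ti + (1.0 - alpha) * ni)) ** 2
        for xi, ni, ti in zip(x, normal, tumor)
    )


def classify(
    x: Vector, centroids: Dict[str, Tuple[Vector, Vector]]
) -> Tuple[str, float, float]:
    """Return (tissue, estimated purity, residual) for the best-fitting tissue."""
    best = None
    for tissue, (normal, tumor) in centroids.items():
        alpha = project_purity(x, normal, tumor)
        res = squared_residual(x, normal, tumor, alpha)
        if best is None or res < best[2]:
            best = (tissue, alpha, res)
    if best is None:
        raise ValueError("no candidate tissues given")
    return best


# Hypothetical 2-dimensional centroids: (normal centroid, tumor centroid).
toy_centroids = {
    "lung": ([0.0, 0.0], [10.0, 0.0]),
    "colon": ([0.0, 5.0], [0.0, 15.0]),
}

# A sample built as 30% lung tumor and 70% lung normal tissue.
sample = [3.0, 0.0]
tissue, alpha, res = classify(sample, toy_centroids)
print(tissue, round(alpha, 3), round(res, 3))
```

A classifier that compares the sample only with the tumor centroids ignores the pull towards normal tissue; scoring against the whole normal-to-tumor segment is what lets a low-purity sample still match its tissue of origin in this toy setting.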

Keywords: cancer of unknown origin; classification; precision medicine; transcriptomics.

Conflict of interest statement

Authors state no conflict of interest. All authors have read the journal’s Publication ethics and publication malpractice statement available at the journal’s website and hereby confirm that they comply with all its parts applicable to the present scientific work.

Figures

Figure 1: Example of the model in a 2-dimensional space using simulated data. The normal tissue samples belong to an ordinary normal distribution with “Normal Centroid” as mean. The tumor samples produce an elongated shape because impurity drags them towards the normal tissue centroid.
Figure 2: Distribution of estimated purities and the fitted beta distributions. Estimated shape parameters are shown as Beta(β1k, β2k). We observe that a beta distribution is a good fit for the tumor purity estimates.
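As a rough illustration of fitting a beta distribution to purity estimates, the sketch below uses the method of moments. The source does not say how the shape parameters were estimated, so the choice of estimator, the function name, and the example purities are assumptions made only for illustration.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def fit_beta_moments(purities: Sequence[float]) -> Tuple[float, float]:
    """Method-of-moments estimates of the two beta shape parameters.

    Requires values in (0, 1) whose sample variance is below m * (1 - m),
    where m is the sample mean; otherwise no valid beta fit exists.
    """
    m = mean(purities)
    v = variance(purities)  # unbiased sample variance
    if not 0.0 < m < 1.0 or v <= 0.0 or v >= m * (1.0 - m):
        raise ValueError("moments are incompatible with a beta distribution")
    common = m * (1.0 - m) / v - 1.0
    return m * common, (1.0 - m) * common


# Hypothetical purity estimates for one tissue type.
example_purities = [0.2, 0.4, 0.6, 0.8]
shape_a, shape_b = fit_beta_moments(example_purities)
print(shape_a, shape_b)
```

The fitted mean shape_a / (shape_a + shape_b) equals the sample mean by construction, which gives a quick sanity check on the estimates.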
Figure 3: Relationship between the datasets used for test (grid search and cross-validation) and validation. Simulated datasets are shown in blue. Datasets derived during cross-validation (only a single fold is shown) are shown in red.
Figure 4: Accuracy for the LoCUP and MLRR methods binned by the true α of the simulated samples in D4. The number of samples in each bin is shown in bold. Our method outperforms MLRR on samples where α ∈ (0.2, 0.7), that is, low-purity samples.
