BMC Bioinformatics. 2021 Oct 2;22(1):473.
doi: 10.1186/s12859-021-04381-4.

DECONbench: a benchmarking platform dedicated to deconvolution methods for tumor heterogeneity quantification

Clémentine Decamps et al. BMC Bioinformatics. 2021.

Abstract

Background: Quantification of tumor heterogeneity is essential to better understand cancer progression and to adapt therapeutic treatments to patient specificities. Bioinformatic tools that estimate the different cell populations from single-omic datasets, such as bulk transcriptome or methylome samples, have recently been developed, including reference-based and reference-free methods. Improved methods using multi-omic datasets have yet to be developed, and the community needs systematic tools to compare these algorithms on controlled data.

Results: We present DECONbench, a standardized unbiased benchmarking resource, applied to the evaluation of computational methods quantifying cell-type heterogeneity in cancer. DECONbench includes gold standard simulated benchmark datasets, consisting of transcriptome and methylome profiles mimicking pancreatic adenocarcinoma molecular heterogeneity, and a set of baseline deconvolution methods (reference-free algorithms inferring cell-type proportions). DECONbench performs a systematic performance evaluation of each new methodological contribution and provides the possibility to publicly share source code and scoring.

Conclusion: DECONbench allows continuous submission of new methods in a user-friendly fashion. Each new contribution is automatically compared to the reference baseline methods, which enables crowdsourced benchmarking. DECONbench is designed to serve as a reference platform for benchmarking deconvolution methods in the evaluation of cancer heterogeneity. We believe it will help improve benchmarking practices in the biomedical and life science communities. DECONbench is hosted on the open source Codalab competition platform. It is freely available at: https://competitions.codalab.org/competitions/27453

Keywords: Benchmarking platform; Cancer; Cellular heterogeneity; DNA methylation; Deconvolution; Omics integration; Transcriptome.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Overview of the DECONbench platform. The platform provides a set of 8 baseline deconvolution methods and benchmark datasets consisting of paired methylome and transcriptome profiles of in silico mixtures from pancreatic tumors. The platform reports the performance of each method on a leaderboard and provides plots for deeper evaluation. New methods are automatically compared to the existing ones.
Fig. 2
Benchmark dataset construction. (a) Five cell populations present in pancreatic tumors were considered. (b) Raw transcriptome and methylome profiles of these cell populations were extracted from various sources (PDX model, tissues or isolated cells). (c) Raw cell-type profile matrices were preprocessed together (feature filtering, normalization, signal transformation, sample aggregation) to avoid any batch effect. After preprocessing, the transcriptomic data consist of log2-transformed expression counts for 21,566 genes, and the methylome data of beta-values for 772,316 EPIC array CpG sites. (d) Cell-type proportions were simulated in silico from Dirichlet distributions, based on realistic proportions defined by an expert anatomopathologist (J. Cros). (e) Paired methylome and transcriptome profiles of in silico mixtures from pancreatic tumors were obtained as D = T × A, where T holds the cell-type profiles (matrix of size M × K, with M the number of features and K = 5 the number of cell types) and A holds the cell-type proportions per patient (matrix of size K × N, with N = 30 the number of samples). A is common to both omics. One training set (D_MET and D_RNA), obtained from one realization of A, is accessible to users. The algorithms are compared on 10 test sets (obtained from 10 other realizations of A) that are hidden on the platform.
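The construction in panels (d) and (e) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the profile matrices are random toy data, the feature count M = 100 is arbitrary, and the Dirichlet concentration vector `alpha` is a hypothetical placeholder for the expert-defined proportions. Whether the mixing is applied before or after the log2 transformation of expression data is not stated in this caption, so the sketch simply mixes whatever matrix it is given.

```python
import numpy as np

def simulate_mixture(T, alpha, n_samples=30, seed=0):
    """Return (D, A) with D = T @ A.

    T     : (M, K) matrix of cell-type profiles.
    alpha : length-K Dirichlet concentration vector (hypothetical values).
    A     : (K, n_samples) matrix whose columns are proportions summing to 1.
    """
    rng = np.random.default_rng(seed)
    # rng.dirichlet returns shape (n_samples, K); transpose to K x N.
    A = rng.dirichlet(alpha, size=n_samples).T
    D = T @ A
    return D, A

# Toy dimensions: only K = 5 and N = 30 come from the caption.
M, K, N = 100, 5, 30
toy_rng = np.random.default_rng(1)
T_rna = toy_rng.random((M, K))   # stand-in for transcriptome profiles
T_met = toy_rng.random((M, K))   # stand-in for methylome beta-values in [0, 1]

alpha = np.ones(K)               # assumption: uniform Dirichlet prior
D_rna, A = simulate_mixture(T_rna, alpha, n_samples=N, seed=42)
D_met = T_met @ A                # the same A is shared by both omics
```

Reusing a single A for both omics is what makes the simulated methylome and transcriptome samples paired.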
Fig. 3
Baseline algorithm performances: Boxplots of ‘MAE A’ obtained for the reference algorithms (‘MAE A’: Mean Absolute Error on the estimated A, the matrix of cell proportions per patient). Each reference algorithm was run 5 times on each simulated dataset, for 10 independent simulations (50 measures were used to construct each boxplot). Each color represents the type of data used to infer A (‘DNAm’ for the D_MET training set, ‘RNA’ for the D_RNA training set and ‘Both’ for methods integrating D_MET and D_RNA). Details of the algorithms can be found in the "Methods" section.
Fig. 4
DECONbench graphical outputs. (a) Boxplots of the Mean Absolute Errors (MAE) of the estimated A matrices (i.e. proportion matrices) for each method that uses the transcriptome only (yellow), the methylome only (blue) or both omics (green). Boxplots of the baseline methods and other existing methods are shown in white, whereas the user's method is shown in red. (b) A heatmap of each A-matrix estimate is generated. The cell populations are in rows and the samples in columns. (c) A heatmap of the absolute error of each proportion estimate is generated. The cell populations are in rows and the samples in columns.
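The ‘MAE A’ score and the per-entry absolute errors shown in the heatmaps can be sketched as follows. This is an illustrative Python sketch, not the platform's scoring code: the function name `mae_A` is hypothetical, and it assumes that the rows of the estimated matrix are already matched to the true cell types, a step that reference-free methods require and that is not described here.

```python
import numpy as np

def mae_A(A_true, A_est):
    """Mean Absolute Error between true and estimated proportion matrices.

    Both inputs are (K, N): cell populations in rows, samples in columns.
    Assumes rows of A_est are already aligned with rows of A_true.
    Returns the scalar MAE and the (K, N) matrix of absolute errors.
    """
    A_true = np.asarray(A_true, dtype=float)
    A_est = np.asarray(A_est, dtype=float)
    if A_true.shape != A_est.shape:
        raise ValueError("A_true and A_est must have the same shape")
    abs_err = np.abs(A_true - A_est)
    return abs_err.mean(), abs_err

# Toy example with K = 2 cell types and N = 2 samples.
A_true = np.array([[0.5, 0.2],
                   [0.5, 0.8]])
A_est = np.array([[0.4, 0.3],
                  [0.6, 0.7]])
mae, abs_err = mae_A(A_true, A_est)
```

In this toy case every entry is off by about 0.1, so the scalar score is also about 0.1. The `abs_err` matrix has the same layout as the heatmap in panel (c).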
