Optimal deconvolution of transcriptional profiling data using quadratic programming with application to complex clinical blood samples
- PMID: 22110609
- PMCID: PMC3217948
- DOI: 10.1371/journal.pone.0027156
Abstract
Large-scale molecular profiling technologies have assisted the identification of disease biomarkers and facilitated the basic understanding of cellular processes. However, samples collected from human subjects in clinical trials possess a level of complexity, arising from multiple cell types, that can obscure the analysis of data derived from them. Failure to identify, quantify, and incorporate sources of heterogeneity into an analysis can have widespread and detrimental effects on subsequent statistical studies.
We describe an approach that builds upon a linear latent variable model, in which expression levels from mixed cell populations are modeled as the weighted average of expression from different cell types. We solve these equations using quadratic programming, which efficiently identifies the globally optimal solution while preserving non-negativity of the cell fractions. We applied our method to various existing platforms to estimate proportions of different pure cell or tissue types and gene expression profiles of distinct phenotypes, with a focus on complex samples collected in clinical trials. We tested our method on several well-controlled benchmark data sets with known mixing fractions of pure cell or tissue types, and on mRNA expression profiling data from samples collected in a clinical trial. Close agreement between predicted and actual mixing fractions was observed. In addition, our method was able to predict mixing fractions for more than ten species of circulating cells and to provide accurate estimates for relatively rare cell types (<10% of the total population). Furthermore, changes in leukocyte trafficking associated with Fingolimod (FTY720) treatment were identified that were consistent with previous results generated by both cell counts and flow cytometry.
These data suggest that our method can solve one of the open questions regarding the analysis of complex transcriptional data: namely, how to identify the optimal mixing fractions in a given experiment.
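The linear mixing model described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the signature matrix, the mixture, and the function names below are made up for illustration, the sum-to-one constraint on the fractions is an assumption added alongside the non-negativity constraint the abstract mentions, and a simple projected gradient loop stands in for a dedicated quadratic programming solver. The problem being minimized is the convex least-squares objective ||S w - m||^2, where S holds one expression profile per cell type, m is the mixed sample, and w is the vector of mixing fractions.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) == 1}."""
    sorted_desc = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, value in enumerate(sorted_desc, start=1):
        cumulative += value
        candidate = (cumulative - 1.0) / j
        # Keep the shift from the last index where the entry stays positive.
        if value - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def estimate_fractions(signatures, mixture, iterations=2000):
    """Minimize ||S w - m||^2 over the simplex by projected gradient descent.

    signatures: list of rows, one per gene, each with one value per cell type.
    mixture:    list with one expression value per gene for the mixed sample.
    """
    n_genes = len(signatures)
    n_types = len(signatures[0])
    # The gradient's Lipschitz constant is at most 2 * ||S||_F^2,
    # so this step size is conservative but safe.
    frobenius_sq = sum(x * x for row in signatures for x in row)
    step = 1.0 / (2.0 * frobenius_sq)
    w = [1.0 / n_types] * n_types  # start from equal fractions
    for _ in range(iterations):
        residual = [
            sum(s * wk for s, wk in zip(row, w)) - m
            for row, m in zip(signatures, mixture)
        ]
        gradient = [
            2.0 * sum(signatures[g][k] * residual[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        w = project_to_simplex([wk - step * gk for wk, gk in zip(w, gradient)])
    return w


# Hypothetical toy data: 4 genes (rows) x 3 cell types (columns).
SIGNATURES = [
    [10.0, 1.0, 1.0],
    [1.0, 8.0, 2.0],
    [2.0, 1.0, 9.0],
    [5.0, 5.0, 5.0],
]
TRUE_FRACTIONS = [0.6, 0.3, 0.1]
# Noise-free mixture built as the weighted average of the cell-type profiles.
MIXTURE = [sum(s * f for s, f in zip(row, TRUE_FRACTIONS)) for row in SIGNATURES]

estimated = estimate_fractions(SIGNATURES, MIXTURE)
print([round(x, 3) for x in estimated])
```

Because the toy mixture is noise-free and the signature columns are linearly independent, the estimated fractions should closely match the ones used to build the mixture. With real data, a dedicated quadratic programming solver would typically replace the hand-written loop, and the choice of signature genes would matter far more than the solver.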