BMC Bioinformatics. 2016 Jun 6;17 Suppl 5(Suppl 5):205.
doi: 10.1186/s12859-016-1046-1.

Inferring differentially expressed pathways using kernel maximum mean discrepancy-based test

Esteban Vegas et al. BMC Bioinformatics.

Abstract

Background: Pathway expression is multivariate in nature. Thus, from a statistical perspective, detecting differentially expressed pathways between two conditions requires methods for inferring differences between mean vectors. The maximum mean discrepancy (MMD) is the basis of a statistical test that determines whether two samples are drawn from the same distribution, and the kernel method greatly simplifies its implementation.
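As an illustration of the kind of test described above, the following is a minimal sketch of a kernel MMD two-sample test with a permutation null distribution. It is not the authors' implementation: the Gaussian (RBF) kernel, the default bandwidth `gamma = 1 / n_features`, the biased estimator, and all function names are assumptions made for the example. The default of 2499 permutations mirrors the number of repetitions reported in the caption of Fig. 1.

```python
import numpy as np


def rbf_kernel(X, Y, gamma):
    """Gaussian kernel matrix k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = (
        (X ** 2).sum(axis=1)[:, None]
        + (Y ** 2).sum(axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def mmd2_from_gram(K, idx_x, idx_y):
    """Biased estimate of the squared MMD from a precomputed Gram matrix."""
    k_xx = K[np.ix_(idx_x, idx_x)].mean()
    k_yy = K[np.ix_(idx_y, idx_y)].mean()
    k_xy = K[np.ix_(idx_x, idx_y)].mean()
    return k_xx + k_yy - 2.0 * k_xy


def mmd_permutation_test(X, Y, gamma=None, n_perm=2499, seed=0):
    """Return (observed MMD^2, permutation p-value, null samples)."""
    rng = np.random.default_rng(seed)
    Z = np.vstack([X, Y])
    n_x = len(X)
    if gamma is None:
        gamma = 1.0 / Z.shape[1]  # assumed default bandwidth, not from the paper
    K = rbf_kernel(Z, Z, gamma)  # computed once, reused for every permutation

    idx = np.arange(len(Z))
    observed = mmd2_from_gram(K, idx[:n_x], idx[n_x:])

    null = np.empty(n_perm)
    for b in range(n_perm):
        perm = rng.permutation(len(Z))  # shuffle the group labels
        null[b] = mmd2_from_gram(K, perm[:n_x], perm[n_x:])

    # The +1 terms count the observed statistic as one of the permutations.
    p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
    return observed, p_value, null
```

Here `X` and `Y` would be samples-by-genes matrices for the two conditions. Because only the Gram matrix over samples is used, the number of genes may exceed the number of samples.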

Results: An MMD-based test successfully detected the differential expression between two conditions, specifically the expression of a set of genes involved in certain fatty acid metabolic pathways. Furthermore, we exploited the ability of the kernel method to integrate data and successfully added hepatic fatty acid levels to the test procedure.
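One common way to integrate two data sources with kernels is to compute one Gram matrix per source over the same samples and take a non-negative weighted sum, which is again a valid kernel. The sketch below assumes that scheme; the abstract does not state how the authors combined gene expression with fatty acid levels, so the RBF kernel, the equal weights, and the function names are hypothetical choices for illustration.

```python
import numpy as np


def rbf_gram(X, gamma=None):
    """Gaussian Gram matrix over the rows (samples) of X."""
    X = np.asarray(X, dtype=float)
    if gamma is None:
        gamma = 1.0 / X.shape[1]  # assumed default bandwidth
    sq_norms = (X ** 2).sum(axis=1)
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def combine_kernels(grams, weights=None):
    """Non-negative weighted sum of Gram matrices built on the same samples."""
    grams = [np.asarray(K, dtype=float) for K in grams]
    if weights is None:
        weights = np.full(len(grams), 1.0 / len(grams))  # equal weights
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("weights must be non-negative to keep a valid kernel")
    return sum(w * K for w, K in zip(weights, grams))


def mmd2_from_gram(K, in_first_group):
    """Biased squared MMD between two groups given a boolean membership mask."""
    a = np.asarray(in_first_group, dtype=bool)
    b = ~a
    return (
        K[np.ix_(a, a)].mean()
        + K[np.ix_(b, b)].mean()
        - 2.0 * K[np.ix_(a, b)].mean()
    )
```

With a samples-by-genes matrix `expr` and a samples-by-fatty-acids matrix `fatty` (both hypothetical names), `combine_kernels([rbf_gram(expr), rbf_gram(fatty)])` gives a single Gram matrix that can be passed to the same MMD statistic as a single-source kernel.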

Conclusion: MMD is a non-parametric test that gains several advantages when combined with the kernelization of data: 1) the number of variables can be greater than the sample size; 2) omics data can be integrated; 3) it can be applied not only to vectors, but also to strings, sequences and other common structured data types arising in molecular biology.
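To make the third point concrete, the sketch below shows one well-known kernel on sequences, the k-mer spectrum kernel, which counts shared substrings of length k. It is offered only as an example of a string kernel; the abstract does not say which structured-data kernels the authors have in mind, and the function names and the default `k=3` are assumptions.

```python
from collections import Counter

import numpy as np


def kmer_counts(seq, k):
    """Count every contiguous substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(s, t, k=3):
    """Inner product of the k-mer count vectors of two sequences."""
    counts_s, counts_t = kmer_counts(s, k), kmer_counts(t, k)
    shared = counts_s.keys() & counts_t.keys()
    return sum(counts_s[w] * counts_t[w] for w in shared)


def spectrum_gram(seqs, k=3):
    """Symmetric Gram matrix of the spectrum kernel over a list of sequences."""
    n = len(seqs)
    K = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            K[i, j] = K[j, i] = spectrum_kernel(seqs[i], seqs[j], k)
    return K
```

A Gram matrix built this way can replace a vector-based one in a kernel two-sample statistic, since the statistic depends on the data only through kernel evaluations.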

Keywords: Kernel maximum mean test; Kernel-based methods; Omics data integration.

Figures

Fig. 1
Empirical distribution of kernel MMD under the null hypothesis. The observed value of the test statistic is indicated by an arrow. The number of repetitions is 2499.
Fig. 2
Kernel PCA of gene expression. The wt samples are shown in black and the ppar samples in red. Diets are represented as follows: (ref) diet by the letter x; (coc) diet by circles; (sun) diet by diamonds; (lin) diet by plus signs; and (fish) diet by triangles.
Fig. 3
Kernel PCA of gene expression, showing the 16 genes that correspond to the fatty acid catabolism pathway. All genes point in approximately the same direction (black vector), to the left, except the ACOTH gene.
Fig. 4
Heatmaps. Expression of the fatty acid catabolism pathway in the wt and ppar genotypes, from gene expression alone (left) and from gene expression and fatty acids (right). Mice from 1 to 20 are wt and from 20 to 40 are ppar.
Fig. 5
Heatmaps. Expression of the fatty acid catabolism pathway under the sun and fish diets, from gene expression alone (left) and from gene expression and fatty acids (right). Mice 2, 3, 13, 15, 23, 25, 34 and 40 were fed the sun diet and the others the fish diet.
