Nat Methods. 2018 Dec;15(12):1049-1052.
doi: 10.1038/s41592-018-0218-5. Epub 2018 Nov 26.

Interpretation of an individual functional genomics experiment guided by massive public data

Young-Suk Lee et al. Nat Methods. 2018 Dec.

Abstract

A key unmet challenge in interpreting omics experiments is inferring biological meaning in the context of public functional genomics data. We developed a computational framework, Your Evidence Tailored Integration (YETI; http://yeti.princeton.edu/ ), which creates specialized functional interaction maps from large public datasets relevant to an individual omics experiment. Using this tailored integration, we predicted and experimentally confirmed an unexpected divergence in viral replication after seasonal or pandemic human influenza virus infection.

Figures

Fig. 1 | Overview of YETI.
YETI leverages the available public data compendium and learns global data-compendium-validated functional interactions that provide insight and predictions relevant to the user dataset.
Fig. 2 | Evaluation of network accuracy and relevance.
a, Schematic of trade-offs in accuracy in recovering true functional relationships (functional accuracy) and the relevance of these relationships to a specific dataset (dataset specificity). b, Comparison of dataset specificity score (DSS) and functional accuracy score (FAS) of the three network approaches for tumor datasets from the Pan-Cancer Analysis project. The different network performance assessments for the same tumor type are connected. For clarity, a representative subset of the tumor types is shown. The inset compares the performance for all Pan-Cancer tumor types (mean ± s.e.m., n = 13 Pan-Cancer tumor datasets; several s.e.m. indicators are smaller than the inset markers). c, Box plots of distributions of the DSS for generic functional networks and for YETI networks for 362 omics datasets from GEO. In each box plot, the center line represents the median, the lower and upper hinges indicate the first and third quartiles, the upper whisker extends to the largest value no more than 1.5× the interquartile range (IQR) above the upper hinge, and the lower whisker extends to the smallest value no more than 1.5× the IQR below the lower hinge. d, Box plots of distributions of the FAS for coexpression networks and for YETI networks for 362 omics datasets from GEO. Box plot elements are defined as in c. Significance was assessed by one-tailed paired t test.
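The box plot elements described in panel c can be computed as in the minimal sketch below. It is an illustration only, not the authors' plotting code: the function name `box_plot_stats`, the "inclusive" quartile convention, and the toy values are all assumptions, and plotting tools differ slightly in how they compute the hinges.

```python
import statistics


def box_plot_stats(values):
    """Return median, hinges, whiskers, and points lying beyond the whiskers."""
    data = sorted(values)
    # Quartile conventions vary between tools; "inclusive" is an assumption here.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme data points still inside the fences.
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    return {
        "lower_whisker": inside[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "upper_whisker": inside[-1],
        "outliers": [x for x in data if x < lower_fence or x > upper_fence],
    }


# Made-up toy values for illustration; these are not scores from the paper.
toy_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
stats = box_plot_stats(toy_scores)
```

In this toy input the value 30 lies more than 1.5× the IQR above the upper hinge, so the upper whisker stops at 9 and 30 is reported separately.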
Fig. 3 | YETI maps the specific functional landscapes of human dendritic cells after seasonal or pandemic influenza virus infection.
a,b, Source networks selected from the seasonal virus dataset (a) or the pandemic virus dataset (b) that were grouped with the “response to virus” source network. Apoptosis-related source networks are highlighted in orange, and the “viral genome replication” source network is in red. c, Network neighbors of inhibitors of apoptosis (IAPs: BIRC2, BIRC3, and XIAP) and RIPK1 and RIPK3 in the YETI network from the seasonal virus dataset (edge weight threshold: 0.3). The edge colors indicate the interaction weights of the YETI network. Gold edges represent known functional interactions. d, Testing the YETI-based hypothesis by determining infectious virus titer produced in dendritic cells after infection with the seasonal (NC/99) and pandemic (Cal/09) influenza virus strains. Black diamonds represent the virus titer level for each independent experiment. Statistical comparison between seasonal and pandemic virus titer levels was performed by one-tailed t test based on n = 3 biologically independent samples 8 h and 24 h post-infection (P = 0.019 and P = 2.8 × 10⁻⁴, respectively). Data are summarized by mean and s.e.m.
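The neighbor selection in panel c (seed genes plus an edge weight threshold of 0.3) can be illustrated with the sketch below. It is a hypothetical example, not the YETI implementation: the function name, the edge dictionary layout, the placeholder genes `GENE_A` and `GENE_B`, and all weights are invented, and treating the threshold as inclusive (weight ≥ 0.3) is an assumption.

```python
def neighbors_above_threshold(edges, seeds, threshold=0.3):
    """Map each seed gene to its neighbors whose edge weight is >= threshold."""
    seeds = set(seeds)
    neighbors = {seed: [] for seed in seeds}
    for (gene_a, gene_b), weight in edges.items():
        if weight < threshold:
            continue  # drop weak edges
        # Edges are undirected, so check both endpoints.
        if gene_a in seeds:
            neighbors[gene_a].append((gene_b, weight))
        if gene_b in seeds:
            neighbors[gene_b].append((gene_a, weight))
    for seed in neighbors:
        # Strongest interactions first.
        neighbors[seed].sort(key=lambda pair: pair[1], reverse=True)
    return neighbors


# Invented weights for illustration; GENE_A and GENE_B are placeholders.
toy_edges = {
    ("BIRC2", "GENE_A"): 0.45,
    ("BIRC2", "GENE_B"): 0.12,
    ("RIPK1", "RIPK3"): 0.60,
    ("XIAP", "GENE_A"): 0.30,
    ("GENE_A", "GENE_B"): 0.90,
}
seed_genes = ["BIRC2", "BIRC3", "XIAP", "RIPK1", "RIPK3"]
result = neighbors_above_threshold(toy_edges, seed_genes)
```

Edges between two non-seed genes are ignored here even when their weight is high, because only the neighborhoods of the seed genes are of interest.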

Comment in

  • Olga Troyanskaya.
    Marx V. Nat Methods. 2018 Dec;15(12):987. doi: 10.1038/s41592-018-0226-5. PMID: 30504864.

