Nat Biotechnol. 2017 May 9;35(5):406-409.
doi: 10.1038/nbt.3790.

Discovering and linking public omics data sets using the Omics Discovery Index

Yasset Perez-Riverol et al. Nat Biotechnol.
No abstract available

Figures

Figure 1
Omics Discovery Index: data standardization, annotation, index and presentation. (a) The datasets stored in public repositories are converted to a common data representation including all metadata and biological entities. The OmicsDI XML files are validated using the OmicsDI XML validator. (b) The OmicsDI XML files are then annotated using public services and databases like UniProt, ChEBI, and PubMed, and the metadata is enriched using the Annotator service. The EBI search engine generates the indexes including other related resources such as PubMed, UniProt, Ensembl and ChEBI. (c) Different clients can use the OmicsDI API to retrieve data from the resource including the web interface and the ddiR package.
Figure 2
Distributions of OmicsDI datasets. (a) Distribution of datasets per omics type and organism category including model organisms, non-model organisms (excluding human) and human. (b) The dataset view showing the other related omics datasets, including the ontology highlighting option to extract the most relevant terms in the metadata. (c) Pearson-correlation plot between the metadata similarity score and the biological similarity score, across transcriptomics (T), proteomics (P) and metabolomics (M) datasets. (d) The shared molecules box shows all datasets with a biological similarity score of more than 0.5, with a slider allowing a user to increase the cutoff value (here set to 0.81).
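
The correlation in panel (c) and the cutoff behaviour in panel (d) can be illustrated with a minimal Python sketch. It is not OmicsDI code: the function names `pearson` and `related_datasets`, the dataset identifiers, and all score values are hypothetical, and how OmicsDI actually computes the metadata and biological similarity scores is not described in this record. The sketch only shows a standard Pearson coefficient between two score lists and a filter that keeps datasets whose score is strictly greater than a cutoff.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def related_datasets(scores, cutoff=0.5):
    """Keep datasets scoring strictly above the cutoff, highest score first.

    `scores` maps a dataset identifier to its similarity score. The default
    cutoff of 0.5 mirrors the caption; raising it mimics moving the slider.
    """
    kept = [(ds, s) for ds, s in scores.items() if s > cutoff]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical scores, invented purely for illustration.
    metadata_scores = [0.10, 0.35, 0.60, 0.80]
    biological_scores = [0.20, 0.30, 0.70, 0.90]
    print(pearson(metadata_scores, biological_scores))

    shared = {"DS_A": 0.90, "DS_B": 0.50, "DS_C": 0.82, "DS_D": 0.60}
    print(related_datasets(shared))        # default cutoff of 0.5
    print(related_datasets(shared, 0.81))  # stricter cutoff, as in panel (d)
```

With the default cutoff, a dataset scoring exactly 0.5 is excluded, which matches the caption's "more than 0.5" wording; raising the cutoff to 0.81 narrows the list further.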
