Nat Genet. 2012 Jan 27;44(2):127-30.
doi: 10.1038/ng.1089.

Developing predictive molecular maps of human disease through community-based modeling

Jonathan M J Derry et al. Nat Genet.

Abstract

The inability to identify the molecular causes of disease has led to a disappointing rate of development of new medicines. By combining the power of community-based modeling with broad access to large datasets on a platform that promotes reproducible analyses, we can work toward more predictive molecular maps that can deliver better therapeutics.

Conflict of interest statement

COMPETING FINANCIAL INTERESTS

The authors declare no competing financial interests.

Figures

Figure 1. Synapse platform architecture
Synapse uses a set of web services to provide access to the data repository, which comprises a federated collection of curated, adjusted and analyzed datasets, models and code. Synapse may also reference restricted data stored in external databases, such as dbGAP or The Cancer Genome Atlas (TCGA). All resources managed by Synapse can be referenced as objects using a URL according to linked data principles. This approach allows for the storage of data and metadata using persistence mechanisms that are appropriate for each data modality while abstracting clients away from the details of how data and services are obtained. Integration with ontology services and support for a rich query language occurs on the Synapse backend, allowing multiple clients (for example, R and the web client) to run similar queries across hosted data. Versioning of data, workflows and tools allows for the documentation of details on how individual models were generated, and enables these models to be reproduced. Storage of the data repository and services in the cloud allows for scalability, access and the potential to use high performance computing facilities directly from Synapse.
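
The idea of referencing versioned resources by URL can be illustrated with a minimal sketch. This is not the Synapse API: the `Repository` class, the `BASE_URL` value and the URL layout are all hypothetical, and an in-memory dictionary stands in for the real storage and web services.

```python
from typing import Dict, List

# Hypothetical base address; the real service URLs are not given in the text.
BASE_URL = "https://example.org/repo"


class Repository:
    """Toy in-memory store in which every stored version gets its own URL."""

    def __init__(self) -> None:
        # Maps an object identifier to the ordered list of its versions.
        self._objects: Dict[str, List[dict]] = {}

    def store(self, object_id: str, content: dict) -> str:
        """Save a new version and return a URL that identifies it."""
        versions = self._objects.setdefault(object_id, [])
        versions.append(content)
        # Version numbers are 1-based and never reused.
        return f"{BASE_URL}/{object_id}/version/{len(versions)}"

    def resolve(self, url: str) -> dict:
        """Return exactly the version that the URL refers to."""
        if not url.startswith(BASE_URL + "/"):
            raise ValueError(f"not a repository URL: {url}")
        object_id, _, version = url[len(BASE_URL) + 1:].split("/")
        return self._objects[object_id][int(version) - 1]


repo = Repository()
first = repo.store("expression_matrix", {"samples": 10})
second = repo.store("expression_matrix", {"samples": 12})
```

Because `first` keeps resolving to the original content after `second` is stored, an analysis that records the URL it used can be rerun later against the same data.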
Figure 2. The process of data acquisition, curation, adjustment, reformatting and modeling
Data flows into the repository from a number of different sources (examples are shown). Individual datasets typically contain different types of data and are submitted in various formats. Curation involves reformatting the data into a common tab-delimited text matrix format. This curated standard format is available for download and allows for the development of workflows for common manipulations (for example, adjustments for technical covariates, such as gene expression array batch). The ‘curated and adjusted’ dataset is also available for download. Data analysts or modelers may use the curated data or the curated and adjusted data for downstream analyses; the key feature is that the version of the dataset that is used for an analysis, as well as the underlying code and workflow, is stored. Allowing different types of users to interact with the data at different points in the process has advantages. For example, providing tools that curate a dataset into a standard format makes curation easier for the user and opens up tools for downstream quality control and analysis.
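
The curation and adjustment steps described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' workflow: the function names are hypothetical, the input is assumed to be a complete set of (feature, sample, value) records, and per-batch mean-centering is used as a simple stand-in for whatever covariate adjustment a real pipeline would apply.

```python
import csv
import hashlib
import io
from collections import defaultdict


def curate(records):
    """Reformat (feature, sample, value) records into a tab-delimited matrix."""
    samples = sorted({s for _, s, _ in records})
    features = sorted({f for f, _, _ in records})
    lookup = {(f, s): v for f, s, v in records}
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["feature"] + samples)
    for f in features:
        # Assumes every feature has a value for every sample.
        writer.writerow([f] + [lookup[(f, s)] for s in samples])
    return buf.getvalue()


def adjust_for_batch(matrix_text, batch_of):
    """Subtract each batch's mean from its samples, one feature at a time."""
    rows = list(csv.reader(io.StringIO(matrix_text), delimiter="\t"))
    header, body = rows[0], rows[1:]
    samples = header[1:]
    out = [header]
    for row in body:
        values = [float(v) for v in row[1:]]
        by_batch = defaultdict(list)
        for s, v in zip(samples, values):
            by_batch[batch_of[s]].append(v)
        means = {b: sum(vs) / len(vs) for b, vs in by_batch.items()}
        adjusted = [v - means[batch_of[s]] for s, v in zip(samples, values)]
        out.append([row[0]] + adjusted)
    buf = io.StringIO()
    csv.writer(buf, delimiter="\t", lineterminator="\n").writerows(out)
    return buf.getvalue()


def fingerprint(text):
    """Content hash that an analysis can record to pin the dataset version."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


records = [("g1", "s1", 1.0), ("g1", "s2", 3.0),
           ("g1", "s3", 10.0), ("g1", "s4", 14.0)]
batches = {"s1": "a", "s2": "a", "s3": "b", "s4": "b"}
curated = curate(records)
adjusted = adjust_for_batch(curated, batches)
provenance = {"curated": fingerprint(curated), "adjusted": fingerprint(adjusted)}
```

Both the curated and the adjusted matrices stay available, and recording their fingerprints alongside the analysis code is one simple way to document which version of the data a model was built from.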

