Sci Stat Database Manag. 2020 Jul:2020:4.
doi: 10.1145/3400903.3400908. Epub 2020 Jul 30.

Towards Co-Evolution of Data-Centric Ecosystems

Robert Schuler et al. Sci Stat Database Manag. 2020 Jul.

Abstract

Database evolution is a notoriously difficult task, and it is exacerbated by the necessity to evolve database-dependent applications. As science becomes increasingly dependent on sophisticated data management, the need to evolve an array of database-driven systems will only intensify. In this paper, we present an architecture for data-centric ecosystems that allows the components to seamlessly co-evolve by centralizing the models and mappings at the data service and pushing model-adaptive interactions to the database clients. Boundary objects fill the gap where applications are unable to adapt and need a stable interface to interact with the components of the ecosystem. Finally, evolution of the ecosystem is enabled via integrated schema modification and model management operations. We present use cases from actual experiences that demonstrate the utility of our approach.

Keywords: application-database co-evolution; model management; schema evolution; software ecosystems.
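
As a rough illustration of the model-adaptive interaction described in the abstract, the following Python sketch shows a client that decides what to present from mappings stored with the model, rather than from hard-coded column names. The Table class, the layout of the annotations, and the context names are hypothetical; they are not taken from Deriva or from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """A table description as a data service might publish it (hypothetical)."""
    name: str
    columns: list[str]
    # Hypothetical annotation: context name -> ordered columns to present.
    annotations: dict[str, list[str]] = field(default_factory=dict)


def visible_columns(table: Table, context: str) -> list[str]:
    """Return the columns a client should present in the given context.

    Falls back to every column when the service supplies no mapping.
    """
    mapped = table.annotations.get(context)
    if mapped is None:
        return list(table.columns)
    # Skip annotated columns that no longer exist after a schema change.
    return [c for c in mapped if c in table.columns]


def render_row(table: Table, row: dict, context: str) -> dict:
    """Project a row onto the columns selected for the context."""
    return {c: row.get(c) for c in visible_columns(table, context)}
```

Because the client reads the mapping at run time, a schema change that drops or adds a column changes what is rendered without changing the client code, which is the kind of co-evolution the abstract argues for.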

Figures

Figure 1: A typical data-driven ecosystem based on Deriva including (clockwise from top-right) web clients, command-line clients, export bundles, and visualization.
Figure 2: Example ER model (left) with schema annotations (middle) specifying model mappings for a model-adaptive interactive application (top-right) and a model-neutral command-line client (bottom-right).
Figure 3: Conceptual structure of expressions used for mapping entities from one context to another.
Figure 4: Anatomy of a boundary object.
Figure 5: Overview of the three primary patterns of database-client interaction.
Figure 6: Conceptual steps enacted by model-adaptive database clients.
Figure 7: Conceptual steps enacted by model-neutral database clients.
Figure 8: Decoupling and mediation of model-bound clients from the data service via a model-neutral service layer for producing and consuming boundary objects.
Figure 9: Architecture diagram highlighting extensions for integrating MMO capabilities.
Figure 10: Major schema evolution events in the FaceBase Data Hub: initial schema (S0), revised for greater experimental details (S1), and evolved for better reproducibility (S2).
Figure 11: Illustration of a bioinformatics pipeline supported by “bag” (boundary object) collections in the decoupled interaction model.
Listing 1: Rule definitions to translate assignment (logical) expressions into Create, Alter (and Rename), and Drop (physical) operators.
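
The body of Listing 1 is not reproduced on this page. As a loose illustration of the idea in its caption, the Python sketch below compares a current schema with a desired one and emits Create, Alter, Rename, and Drop operations. The function name, the dictionary representation of a schema, and the tuple format of the operations are assumptions made for this example; they are not the paper's rule definitions.

```python
def plan_operations(current, desired, renames=None):
    """Derive physical operations that turn `current` into `desired`.

    current, desired: dict mapping table name -> list of column names.
    renames: optional dict mapping old table name -> new table name.
    """
    # Keep only renames whose source and target tables both exist.
    valid = {old: new for old, new in (renames or {}).items()
             if old in current and new in desired}
    ops = [("RENAME", old, new) for old, new in valid.items()]

    # View the current schema under its post-rename table names.
    renamed = {valid.get(name, name): cols for name, cols in current.items()}

    for name, new_cols in desired.items():
        if name not in renamed:
            ops.append(("CREATE", name, tuple(new_cols)))
            continue
        old_cols = renamed[name]
        added = [c for c in new_cols if c not in old_cols]
        dropped = [c for c in old_cols if c not in new_cols]
        if added or dropped:
            ops.append(("ALTER", name, {"add": added, "drop": dropped}))

    # Anything left in the current schema but absent from the target is dropped.
    for name in renamed:
        if name not in desired:
            ops.append(("DROP", name))
    return ops
```

For example, calling plan_operations with a current schema containing a "sample" table and a desired schema containing "specimen", together with renames={"sample": "specimen"}, yields a single rename rather than a drop followed by a create, so the table's existing rows are kept.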
