Genetics. 2024 May 7;227(1):iyae049.
doi: 10.1093/genetics/iyae049.

Updates to the Alliance of Genome Resources central infrastructure

Alliance of Genome Resources Consortium.

Abstract

The Alliance of Genome Resources (Alliance) is an extensible coalition of knowledgebases focused on the genetics and genomics of intensively studied model organisms. The Alliance is organized as individual knowledge centers with strong connections to their research communities and a centralized software infrastructure, discussed here. Model organisms currently represented in the Alliance are budding yeast, Caenorhabditis elegans, Drosophila, zebrafish, frog, laboratory mouse, and laboratory rat; the Gene Ontology Consortium is also represented. The project is in a rapid development phase to harmonize knowledge, store it, analyze it, and present it to the community through a web portal, direct downloads, and application programming interfaces (APIs). Here, we focus on developments over the last 2 years. Specifically, we added and enhanced tools for browsing the genome (JBrowse), downloading sequences, mining complex data (AllianceMine), visualizing pathways, full-text searching of the literature (Textpresso), and sequence similarity searching (SequenceServer). We enhanced existing interactive data tables and added an interactive table of paralogs to complement our representation of orthology. To support individual model organism communities, we implemented species-specific "landing pages" and will add disease-specific portals soon; in addition, we support a common community forum implemented in Discourse software. We describe our progress toward a central persistent database to support curation, the data modeling that underpins harmonization, and progress toward a state-of-the-art literature curation system with integrated artificial intelligence and machine learning (AI/ML).

Keywords: Caenorhabditis elegans; Drosophila; data integration; database; knowledgebase; mouse; software; text mining; yeast; zebrafish.


Conflict of interest statement

Conflicts of interest: The author(s) declare no conflicts of interest.

Figures

Fig. 1. MOD landing pages at the Alliance portal. The pages share a common look and feel while allowing community-specific content.
Fig. 2. Paralog table for C. elegans hlh-25. The table presents a ranking of paralogs for the hlh-25 gene, based on a weighted scoring algorithm that incorporates sequence conservation metrics. It lists the gene symbols, provides the alignment length in amino acids, and quantifies the similarity and identity percentages of genes paralogous to hlh-25. The methodology count, indicating the number of algorithms supporting the paralogous relationship, is also included. In this ranking, hlh-27 is identified as the primary paralog due to its high similarity and identity scores, despite being recognized by fewer methods than hlh-28.
Fig. 3. Sequence detail widget. Chosen views of a specific gene are readily available for copying as plain text or with highlights. Shown: the 5′ region of the human PLAA gene.
Fig. 4. Screenshot of results from the Alliance SequenceServer BLAST tool. The results have been enhanced relative to the default SequenceServer results page by the addition of links to Alliance JBrowse and to the corresponding gene page (in this case C. elegans abi-1) at the Alliance website for each BLAST hit.
Fig. 5. Output of a BLAST search. After a user clicks on the JBrowse link for a BLAST hit, they are directed to the web service where they will see a track for the BLAST hit and how the hit aligns with other tracks.
Fig. 6. AllianceMine example. Using a simple template, a disease ontology (DO) term, in this case “autism,” is chosen, and all genes associated with this DO term are returned in a downloadable table.
Fig. 7. Alliance pathway viewer. The pathway widget displays gene products (rectangles with gene names) and chemicals (rectangles with chemical abbreviations) and the flow of information and material between them (relations). These relations, shown in the legend, indicate direct or indirect regulation that can be positive, negative, or of unknown effect direction. For metabolites that mediate the information flow between gene products, distinct shading distinguishes metabolites that are the inputs or outputs of a reaction.
Fig. 8. Evolution of data flow. Graphical summary showing the design of short-term infrastructure initially deployed to support rapid delivery of unified data to the community and the planned production system. Red, data quartermasters at MODs; yellow, data; brown, database; green, transformations; blue, user interface.
Fig. 9. Alliance curation tool. Screenshot of the Alliance curation tool interface showing an example of curated annotations of AGMs managed in the persistent store.
Fig. 10. Textpresso for SGD literature at the Alliance (http://sgd-textpresso.alliancegenome.org/tpc/search).
Fig. 11. Swagger interface for the Alliance APIs.
Fig. 12. Alliance community forum home page.
Fig. 13. Mockup of an expression detail page. This example shows one of the current features of WormBase (single-cell data from 2 studies) displayed on what will be part of an Alliance gene expression detail page.
Fig. 14. Mockup of the AD portal showing the home page and the data access page. These views illustrate the type of information that will be available with a disease focus.
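
The Fig. 2 caption describes paralogs ranked by a weighted score over sequence conservation metrics, shown alongside the number of supporting methods. The abstract does not give the actual weights or formula, so the following is only a minimal sketch of how such a ranking could work. The weights, the normalization, the `Paralog` type, the inclusion of the method count in the score, and the placeholder gene names and values are all assumptions made for illustration, not the Alliance's algorithm or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paralog:
    symbol: str
    alignment_length: int  # aligned amino acids
    similarity: float      # percent, 0-100
    identity: float        # percent, 0-100
    method_count: int      # number of algorithms supporting the pair


# Hypothetical weights (assumption): conservation metrics dominate,
# alignment length and method count contribute less.
WEIGHTS = {"similarity": 0.4, "identity": 0.4, "length": 0.1, "methods": 0.1}


def score(p: Paralog, max_length: int, max_methods: int) -> float:
    """Weighted sum of metrics, each scaled to the range 0-1."""
    return (
        WEIGHTS["similarity"] * p.similarity / 100
        + WEIGHTS["identity"] * p.identity / 100
        + WEIGHTS["length"] * p.alignment_length / max_length
        + WEIGHTS["methods"] * p.method_count / max_methods
    )


def rank_paralogs(paralogs: list[Paralog]) -> list[Paralog]:
    """Return paralogs sorted from highest to lowest weighted score."""
    if not paralogs:
        return []
    # "or 1" avoids division by zero when every value is 0.
    max_length = max(p.alignment_length for p in paralogs) or 1
    max_methods = max(p.method_count for p in paralogs) or 1
    return sorted(
        paralogs,
        key=lambda p: score(p, max_length, max_methods),
        reverse=True,
    )


# Placeholder entries with made-up values, not taken from the paper.
EXAMPLE = [
    Paralog("paralog-B", 255, 70.0, 60.0, 5),
    Paralog("paralog-A", 260, 90.0, 85.0, 3),
]

ranked = rank_paralogs(EXAMPLE)
print([p.symbol for p in ranked])
```

With these illustrative weights, a candidate with higher similarity and identity can rank above one supported by more methods, which is the pattern the caption reports for hlh-27 and hlh-28.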


