[Preprint]. 2023 Nov 22:2023.11.20.567935.
doi: 10.1101/2023.11.20.567935.

Updates to the Alliance of Genome Resources Central Infrastructure

Alliance of Genome Resources Consortium

Alliance of Genome Resources Consortium. bioRxiv.


Abstract

The Alliance of Genome Resources (Alliance) is an extensible coalition of knowledgebases focused on the genetics and genomics of intensively studied model organisms. The Alliance is organized as individual knowledge centers with strong connections to their research communities and a centralized software infrastructure, discussed here. Model organisms currently represented in the Alliance are budding yeast, C. elegans, Drosophila, zebrafish, frog, laboratory mouse, and laboratory rat; the Gene Ontology Consortium is also a member. The project is in a rapid development phase to harmonize knowledge, store it, analyze it, and present it to the community through a web portal, direct downloads, and APIs. Here we focus on developments over the last two years. Specifically, we added and enhanced tools for browsing the genome (JBrowse), downloading sequences, mining complex data (AllianceMine), visualizing pathways, full-text searching of the literature (Textpresso), and sequence similarity searching (SequenceServer). We enhanced existing interactive data tables and added an interactive table of paralogs to complement our representation of orthology. To support individual model organism communities, we implemented species-specific "landing pages" and will add disease-specific portals soon; in addition, we support a common community forum implemented in Discourse. We describe our progress towards a central persistent database to support curation, the data modeling that underpins harmonization, and progress towards a state-of-the-art literature curation system with integrated Artificial Intelligence and Machine Learning (AI/ML).


Figures

Figure 1. MOD landing pages at the Alliance Portal.
A common look and feel that allows community-specific content.
Figure 2. Paralog table for C. elegans hlh-25.
The table presents a ranking of paralogs for the hlh-25 gene, based on a weighted scoring algorithm that incorporates sequence conservation metrics. It lists the gene symbols, provides the alignment length in amino acids, and quantifies the similarity and identity percentages of genes paralogous to hlh-25. The methodology count, indicating the number of algorithms supporting the paralogous relationship, is also included. In this ranking, hlh-27 is identified as the primary paralog due to its high similarity and identity scores, despite being recognized by fewer methods than hlh-28.
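The ranking idea in this caption can be illustrated with a minimal sketch. The weights, the normalization, and the numeric values below are assumptions made purely for illustration; the caption does not give the Alliance's actual scoring formula or the real hlh-25 figures. The sketch only shows how a weighted score over conservation metrics can place a paralog with fewer supporting methods above one with more.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paralog:
    symbol: str
    alignment_length: int  # aligned amino acids
    similarity_pct: float  # percent similarity to the query gene
    identity_pct: float    # percent identity to the query gene
    method_count: int      # number of algorithms supporting the relationship


# Hypothetical weights, not taken from the source.
WEIGHTS = {"identity": 0.4, "similarity": 0.3, "length": 0.2, "methods": 0.1}


def score(p: Paralog, max_length: int, max_methods: int) -> float:
    """Weighted sum of metrics, each scaled to the range 0..1."""
    return (
        WEIGHTS["identity"] * p.identity_pct / 100
        + WEIGHTS["similarity"] * p.similarity_pct / 100
        + WEIGHTS["length"] * p.alignment_length / max_length
        + WEIGHTS["methods"] * p.method_count / max_methods
    )


def rank_paralogs(paralogs: list[Paralog]) -> list[Paralog]:
    """Return paralogs sorted from highest to lowest weighted score."""
    if not paralogs:
        return []
    # "or 1" guards against division by zero when every value is 0.
    max_length = max(p.alignment_length for p in paralogs) or 1
    max_methods = max(p.method_count for p in paralogs) or 1
    return sorted(
        paralogs,
        key=lambda p: score(p, max_length, max_methods),
        reverse=True,
    )


# Made-up example values; only the gene symbols come from the caption.
candidates = [
    Paralog("hlh-28", alignment_length=250, similarity_pct=70.0,
            identity_pct=60.0, method_count=5),
    Paralog("hlh-27", alignment_length=260, similarity_pct=95.0,
            identity_pct=90.0, method_count=3),
]
ranked = rank_paralogs(candidates)
print([p.symbol for p in ranked])
```

With these illustrative numbers, hlh-27 ranks first because its identity and similarity terms outweigh the larger method count of hlh-28, which mirrors the behavior the caption describes.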
Figure 3. Sequence detail widget.
Chosen views of a specific gene are readily available for copying as plain text or with highlights. Shown is the 5’ region of the human PLAA gene.
Figure 4. Screenshot of results from the Alliance SequenceServer BLAST tool.
The results have been enhanced relative to the default SequenceServer results page by the addition of links to Alliance JBrowse and to the corresponding gene page (in this case, C. elegans abi-1) at the Alliance website for each BLAST hit.
Figure 5. Output of a BLAST search.
After a user clicks on the JBrowse link for a BLAST hit, they are directed to the web service, where they will see a track for the BLAST hit and how the hit aligns with other tracks.
Figure 6. AllianceMine example.
Using a simple template, a disease ontology (DO) term is chosen, and all genes associated with this DO term are returned in a downloadable table.
Figure 7. Alliance Pathway Viewer.
The pathway widget displays gene products (light purple rectangles), protein complexes (light grey rectangles), and chemicals (light blue rectangles), along with the flow of information and material between them (relations). These relations, shown in the legend, indicate direct or indirect regulation that can be positive, negative, or of unknown direction of effect.
Figure 8. Evolution of Data Flow.
Graphical summary showing the design of the short-term infrastructure initially deployed to support rapid delivery of unified data to the community, and of the planned production system. Red, data quartermasters at MODs; Yellow, data; Brown, database; Green, transformations; Blue, user interface.
Figure 9.
Screenshot of the Alliance curation tool interface showing an example of curated annotations of Affected Genomic Models managed in the persistent store.
Figure 10.
Textpresso for SGD literature at the Alliance. (http://sgdtextpresso.alliancegenome.org/tpc/search)
Figure 11.
Swagger interface for the Alliance APIs.
Figure 12.
Example of API output.
Figure 13. Mockup of the Alzheimer’s Disease Portal showing the Home page and the Data access page.
These views illustrate the type of information that will be available with a disease focus.
Figure 14.
Alliance community forum home page.
Figure 15. Mockup of an Expression Detail page.
This example shows one of the current features of WormBase – single cell data from two studies – displayed on what will be part of an Alliance Gene Expression detail page.
