[Preprint]. 2025 Oct 14:2025.10.13.682095.
doi: 10.1101/2025.10.13.682095.

From data to publication in a browser with BRC-Analytics: Evolutionary dynamics of coding overlaps in measles virus

Anton Nekrutenko et al. bioRxiv.

Abstract

The analytical landscape of pathogen research is often fragmented, hindering transparency and reproducibility due to diverse genomic data sources, numerous software tools, and suboptimal integration methods. Here we introduce BRC-analytics, a novel browser-based environment that unifies authoritative sources of genomic data with community-curated best analysis practices on a freely accessible public computational infrastructure. We demonstrate its capabilities by analyzing the evolutionary dynamics within the P/V/C locus of the measles virus, a complex system involving overlapping coding regions and RNA editing. Our analysis, conducted entirely within BRC-analytics, reveals asymmetric evolution of the locus's reading frames under distinct selective pressures. BRC-analytics streamlines the entire research process, from data collection and primary analysis (e.g., variant calling) to interpretation (e.g., using integrated JupyterLite notebooks and LLMs) and publication, into a single web browser session. This eliminates the need for local installations and manual data transfers, implicitly tracking provenance and ensuring reproducibility. The platform's goal is to provide true data-to-publication functionality, making advanced pathogen genomics accessible to a broader research community regardless of their computational expertise or infrastructure access.

Figures

Figure 1.
Schematic representation of overlapping reading frames within the P/V/C locus of measles virus (MeV; upper pane). C (middle pane) and V (lower pane) are shifted +1 and +2 relative to the P frame, respectively. The P protein is translated from the primary mRNA transcript, while the V and C proteins are produced from the same locus through different mechanisms. The C protein, a small basic protein, is expressed by an alternative translational initiation mechanism. It is translated from an overlapping open reading frame (in +1 phase relative to P) that begins 19 nucleotides downstream of the P/V start codon. This strategy allows for the independent expression of the C protein from the same mRNA transcript that codes for P and V. The V protein is produced through cotranscriptional RNA editing. This process involves the viral polymerase inserting a non-templated G residue at a specific site within the P/V mRNA transcript. This pseudo-templated insertion triggers a +2 ribosomal frameshift, leading to the translation of a new V ORF. The mechanism, often referred to as “polymerase stuttering,” occurs when the viral RNA-dependent RNA polymerase repeatedly reads a single template cytosine within a short G run that is part of a larger polypurine tract.
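The frame relationships described in this caption can be illustrated with a small sketch. It is not taken from the preprint: the sequence `toy` is a made-up 18-nucleotide string (not MeV), written in the DNA alphabet for convenience, and `translate` and `edited` are hypothetical names. The sketch shows that reading the same nucleotides from offsets 0, +1, and +2 gives different peptides, and that inserting one G after a G run moves the downstream codons of frame 0 onto what was the +2 frame.

```python
from itertools import product

# Standard genetic code, built in TCAG order (first, second, third base).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame=0):
    """Translate seq starting `frame` nucleotides downstream; drop partial codons."""
    seq = seq[frame:]
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# Hypothetical toy sequence with a GGG run at 0-based positions 9-11.
toy = "ATGGCTAAAGGGCACTCA"

p_like = translate(toy, 0)   # reference frame
c_like = translate(toy, 1)   # +1 frame, as for an overlapping ORF
plus2 = translate(toy, 2)    # +2 frame of the unedited sequence

# Mimic the non-templated G insertion immediately after the G run.
edited = toy[:12] + "G" + toy[12:]
v_like = translate(edited, 0)

print(p_like)  # MAKGHS
print(c_like)  # WLKGT
print(v_like)  # MAKGAL: shared start, then codons from the old +2 frame
```

In this toy case the edited product shares its first four residues with the frame-0 product and then continues with `AL`, the same residues that end the +2 translation of the unedited sequence.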
Figure 2.
Correspondence of codon positions in +1 and +2 overlaps.
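One way to work out such a correspondence is sketched below. It assumes the convention from the Figure 1 caption, namely that a frame labelled +k starts k nucleotides downstream of the reference frame; the function name `codon_position` is hypothetical and the figure itself may present the mapping differently.

```python
def codon_position(index, frame):
    """Return the codon position (1, 2, or 3) of a 0-based nucleotide index
    when read in a frame that starts `frame` nucleotides downstream."""
    return (index - frame) % 3 + 1


# For each codon position in the reference frame (frame 0), report the
# codon position the same nucleotide occupies in the +1 and +2 frames.
mapping = {
    shift: {codon_position(i, 0): codon_position(i, shift) for i in range(3)}
    for shift in (1, 2)
}

for shift, table in mapping.items():
    print(f"+{shift} overlap: {table}")
# +1 overlap: {1: 3, 2: 1, 3: 2}
# +2 overlap: {1: 2, 2: 3, 3: 1}
```

Under this convention, a third (often degenerate) position in the reference frame falls on a second position in the +1 frame and on a first position in the +2 frame, which is why a change that is silent in one frame will often alter the amino acid in the other.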
Figure 3.
Main components of BRC-analytics.
Figure 4.
Schematic of the variant calling workflow used in this study (the workflow is available from IWC).
Figure 5.
Analysis setup for post-processing of variant calls.
Figure 5.
Nucleotide changes within P/C (top) and P/V (bottom) overlaps. Each substitution is represented by two ticks, one for each reading frame. Green = synonymous in that frame; red = non-synonymous in that frame. The length of each tick represents the alternative allele frequency spread, that is, the difference between the minimum and maximum values. Opacity reflects the number of samples (from min = 2 to max = 225) in which the change is found. A live version of this notebook can be viewed here.
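The two quantities encoded in this figure, the per-frame effect of a substitution and the allele frequency spread, can be sketched as follows. This is an illustration under stated assumptions rather than the notebook's code: `toy` is a made-up sequence (not MeV), positions are 0-based, and `classify` and `af_spread` are hypothetical helper names.

```python
from itertools import product

# Standard genetic code, built in TCAG order (first, second, third base).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def classify(seq, pos, alt, frame):
    """Label a substitution at 0-based `pos` as synonymous or non-synonymous
    for the reading frame that starts `frame` nucleotides downstream."""
    start = frame + ((pos - frame) // 3) * 3
    if pos < frame or start + 3 > len(seq):
        raise ValueError("position is not covered by a full codon in this frame")
    ref_codon = seq[start:start + 3]
    offset = pos - start
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    same = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return "synonymous" if same else "non-synonymous"


def af_spread(freqs):
    """Difference between the largest and smallest alternative allele frequency."""
    return max(freqs) - min(freqs)


toy = "ATGGCTAAAGGGCACTCA"  # hypothetical sequence

# One T>C change at position 5, evaluated in two overlapping frames.
effects = {frame: classify(toy, 5, "C", frame) for frame in (0, 1)}
print(effects)  # {0: 'synonymous', 1: 'non-synonymous'}

# Spread of made-up frequencies for that change across three samples.
print(af_spread([0.25, 0.5, 1.0]))  # 0.75
```

Here the same change turns GCT into GCC (alanine in both) in frame 0 but CTA into CCA (leucine to proline) in frame +1, so it would be drawn as one green and one red tick.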
Figure 6.
Substitutions within the P/V/C locus in each sample. Samples (X-axis) are sorted by collection date from 01/22/2018 (left) to 05/07/2024 (right). The rightmost blue sample had no collection date associated with it. Genomic coordinates of variants are on the Y-axis. Blue = Canada; red = USA; orange = Romania. The size of each circle is proportional to the alternative allele frequency at that site. A live version of this notebook can be viewed here.
Figure 7.
Tracing substitutions within codons 105 (left) and 111 (right) of the P/V reading frames through the phylogenetic tree (unrooted) of the analyzed samples. Imputed states at internal nodes are also shown. The number of samples shown here is smaller than the total number of samples in our analysis because identical sequences were excluded from the analysis. Sequences with an excessive number of unresolved (N) characters were also excluded.
