BMC Bioinformatics. 2020 Oct 2;21(1):431.
doi: 10.1186/s12859-020-03760-7.

Real-time audio and visual display of the Coronavirus genome

Mark D Temple

Abstract

Background: This paper describes a web-based tool that uses a combination of sonification and an animated display to inquire into the SARS-CoV-2 genome. The audio data is generated in real time from a variety of RNA motifs that are known to be important in the functioning of RNA. Additionally, metadata relating to RNA translation and transcription has been used to shape the auditory and visual displays. Together these tools provide a unique approach to further understand the metabolism of the viral RNA genome. The audio provides a further means of representing the function of the RNA, in addition to traditional written and visual approaches.

Results: Sonification of the SARS-CoV-2 genomic RNA sequence results in a complex auditory stream composed of up to 12 individual audio tracks. Each auditory motive is derived from the actual RNA sequence or from metadata. This approach has been used to represent transcription or translation of the viral RNA genome. The display highlights the real-time interaction of functional RNA elements. The sonification of codons derived from all three reading frames of the viral RNA sequence, in combination with sonified metadata, provides the framework for this display. Functional RNA motifs such as transcription regulatory sequences and stem loop regions have also been sonified. Using the tool, audio can be generated in real time from either genomic or sub-genomic representations of the RNA. Given the large size of the viral genome, a collection of interactive buttons has been provided to navigate to regions of interest, such as cleavage regions in the polyprotein, untranslated regions or each gene. These tools are available through an internet browser and the user can interact with the data display in real time.
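As an illustration of reading codons from all three frames, the following is a minimal Python sketch, not the tool's own code. It splits an RNA string into codons for each frame and collects runs from a start codon (AUG) through to the next in-frame stop codon. The function names and the tuple layout are hypothetical. The stop codon set is that of the standard genetic code.

```python
# Stop codons of the standard genetic code (RNA alphabet).
STOP_CODONS = {"UAA", "UAG", "UGA"}


def codons_in_frame(rna, frame):
    """Return (position, codon) pairs for one reading frame (0, 1 or 2)."""
    return [(i, rna[i:i + 3]) for i in range(frame, len(rna) - 2, 3)]


def open_reading_frames(rna):
    """Collect codon runs from AUG through to the next in-frame stop codon.

    Returns a list of (frame, start_position, codons) tuples.
    Runs that never reach a stop codon are discarded in this sketch.
    """
    orfs = []
    for frame in range(3):
        current = None  # [start_position, codons] while inside a run
        for pos, codon in codons_in_frame(rna, frame):
            if current is None:
                if codon == "AUG":
                    current = [pos, [codon]]
            else:
                current[1].append(codon)
                if codon in STOP_CODONS:
                    orfs.append((frame, current[0], current[1]))
                    current = None
    return orfs
```

In a sonification setting, each tuple could be mapped to a sequence of notes, with the frame index selecting a separate sub-track.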

Conclusion: The auditory display, in combination with real-time animation of the process of translation and transcription, provides a unique insight into the large body of evidence describing the metabolism of the RNA genome. Furthermore, the tool has been used as an algorithm-based audio generator. These audio tracks can be listened to by the general community without reference to the visual display to encourage further inquiry into the science.

Keywords: Auditory display; COVID-19; Molecular animation; RNA sequence; SARS-CoV-2; Sonification.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
The animated display. Panel A shows the sliding window of the animated display in translation mode. Key features of the animated display are labelled, such as the translated peptide sequences and the frame in which they occur. Start and stop codons are highlighted in green and red, respectively. The location of the audio play-head is represented to coincide with the peptidyl-transferase centre of the ribosome. The sonified audio is generated as the SARS-CoV-2 genome sequence passes through the play-head. The direction in which the ribosome moves relative to the RNA sequence is indicated. Panel B shows the animated display in transcription mode. The newly synthesised minus RNA strand is shown below the genome sequence with the 3′ extended nucleotide shown in the play-head. The direction in which the replicase protein complex moves in relation to the genome sequence is indicated.
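The transcription mode in Panel B can be pictured with a short Python sketch. It is an assumption-laden illustration rather than the tool's implementation: the function names are hypothetical, and it relies only on standard RNA base pairing and on synthesis proceeding from the 3′ end of the template, so that each step adds one nucleotide to the 3′ end of the minus strand.

```python
# Watson-Crick base pairing for RNA.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def minus_strand_steps(genome):
    """Yield (template_position, added_nucleotide) for each synthesis step.

    The template (plus strand) is read from its 3' end towards its 5' end,
    so positions are visited from the last index down to 0.
    """
    for pos in range(len(genome) - 1, -1, -1):
        yield pos, COMPLEMENT[genome[pos]]


def minus_strand(genome):
    """Return the complete minus strand written 5' to 3'."""
    return "".join(nt for _, nt in minus_strand_steps(genome))
```

Each yielded pair corresponds to one position of a play-head: the template position being read and the nucleotide just added to the growing minus strand.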
Fig. 2
Multitrack wave files representing a portion of an auditory display. These tracks play in unison to generate the auditory display and each represents approximately 80 nucleotides beginning at nucleotide position 65. This sequence is located in the 5′ untranslated region and includes a TRS region and a uORF. Each audio stream was generated from a different algorithm. Only nucleotides that gave rise to audio are shown (the entire nucleotide sequence is shown in track 2). In track 1, each nucleotide generates a note for every beat unless it is a repeat of the previous nucleotide, in which case the length of the note is extended. In track 2, each di-nucleotide generates a note every second beat. In tracks 3 and 4, audio from the GC track is only triggered when the GC ratio changes by an increment of 0.1. Each change in the GC ratio is indicated by a plus (+) or minus (−) symbol on the wave files. In track 5, only codon sequences beginning with a start codon (AUG) are shown through to the next stop codon (e.g. UAA). Isolated stop codons also give rise to a note. This track is a compilation of audio from three sub-tracks, each representing a different reading frame, and notes in this track are panned left, centre or right, respectively. Track 6 represents the audio generated from metadata that indicates the location of a TRS region. Additionally, the consensus sequence within this region is coloured purple in the visual display. Track 7 represents audio generated by the occurrence of three nucleotides of the same type. Other data tracks are not represented since no audio was generated in these during processing of this sequence of the genome. Additionally, the amino acid sequence of the ORF is shown in the codon track 5.
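Two of the track rules described in this caption can be sketched in Python as follows. This is a hedged illustration only. The track 1 rule (one note per beat, with repeats extending the previous note) is taken from the caption. For the GC track, the caption does not state how the ratio is computed, so the trailing window of 10 nucleotides and the quantisation of the ratio into steps of 0.1 are assumptions, as are all function and parameter names.

```python
def nucleotide_notes(rna):
    """Track 1 style events: one note per beat, repeats extend the note.

    Returns a list of (nucleotide, start_beat, length_in_beats).
    """
    notes = []
    for beat, nt in enumerate(rna):
        if notes and notes[-1][0] == nt:
            prev_nt, start, length = notes[-1]
            notes[-1] = (prev_nt, start, length + 1)  # extend previous note
        else:
            notes.append((nt, beat, 1))
    return notes


def gc_triggers(rna, window=10):
    """GC track style events, under an assumed trailing window.

    The GC ratio of the last `window` nucleotides is quantised into
    steps of 0.1 using integer arithmetic (avoiding float comparison).
    A trigger is emitted whenever the quantised level changes.
    Returns a list of (position, "+" or "-").
    """
    triggers = []
    last_level = None
    for end in range(window, len(rna) + 1):
        chunk = rna[end - window:end]
        gc_count = sum(1 for nt in chunk if nt in "GC")
        level = (gc_count * 10) // window  # ratio in tenths, rounded down
        if last_level is not None and level != last_level:
            triggers.append((end - 1, "+" if level > last_level else "-"))
        last_level = level
    return triggers
```

The "+" and "-" markers mirror the symbols the caption describes on the wave files. A different window size or trigger rule would change where the events fall.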
Fig. 3
Alignment of the raw stereo waveforms. Two stereo waveforms are shown that depict the audio from examples 1 and 2. The vertical cursor indicates the transition across the TRS1 consensus sequence. Panel A depicts the audio from the ‘UTR to Surface Glycoprotein’ example and panel B depicts that from the ‘Untranslated ends’ example. To the left of the cursor the stereo waveforms are identical leading up to the TRS1 region. To the right of the cursor the waveforms diverge. Panel A represents translation of a template produced through discontinuous transcription, whereas panel B represents translation of contiguous genome sequence.

