Elife. 2023 Jan 23:12:e82762.
doi: 10.7554/eLife.82762.

Ribosomal RNA (rRNA) sequences from 33 globally distributed mosquito species for improved metagenomics and species identification

Cassandra Koh et al. Elife.

Abstract

Total RNA sequencing (RNA-seq) is an important tool in the study of mosquitoes and the RNA viruses they vector as it allows assessment of both host and viral RNA in specimens. However, there are two main constraints. First, as with many other species, abundant mosquito ribosomal RNA (rRNA) serves as the predominant template from which sequences are generated, meaning that the desired host and viral templates are sequenced far less. Second, mosquito specimens captured in the field must be correctly identified, in some cases to the sub-species level. Here, we generate mosquito rRNA datasets which will substantially mitigate both of these problems. We describe a strategy to assemble novel rRNA sequences from mosquito specimens and produce an unprecedented dataset of 234 full-length 28S and 18S rRNA sequences of 33 medically important species from countries with known histories of mosquito-borne virus circulation (Cambodia, the Central African Republic, Madagascar, and French Guiana). These sequences will allow both physical and computational removal of rRNA from specimens during RNA-seq protocols. We also assess the utility of rRNA sequences for molecular taxonomy and compare phylogenies constructed using rRNA sequences versus those created using the gold standard for molecular species identification of specimens, the mitochondrial cytochrome c oxidase I (COI) gene. We find that rRNA- and COI-derived phylogenetic trees are incongruent and that 28S and concatenated 28S+18S rRNA phylogenies reflect evolutionary relationships that are more aligned with contemporary mosquito systematics. This significant expansion to the current rRNA reference library for mosquitoes will improve mosquito RNA-seq metagenomics by permitting the optimization of species-specific rRNA depletion protocols for a broader range of species and streamlining species identification by rRNA sequence and phylogenetics.

Keywords: RNA-seq; infectious disease; metagenomics; microbiology; molecular taxonomy; mosquito; ribosomal RNA; surveillance.

Conflict of interest statement

CK, LF, HB, CN, SB, PD, NG, RG, JD, MS: No competing interests declared.

Figures

Figure 1. Percentage of rRNA reads in mosquito total RNA sequencing (RNA-seq) data after depletion using probes antisense to Aedes aegypti sequences.
Pools of five individual mosquitoes from genera Aedes (Ae), Culex (Cx), Mansonia (Ma), and Anopheles (An) were ribodepleted by probe hybridisation followed by RNase H digestion according to the protocol by Morlan et al., 2012. Y-axis depicts percentages of remaining rRNA reads calculated as the number of rRNA reads over total reads per sample pool. Depletion efficiency decreases with taxonomic distance from Ae. aegypti underlining the need for reference sequences for species of interest.
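The y-axis calculation described in this caption (rRNA reads over total reads per sample pool) can be sketched as below. The pool names and read counts are hypothetical placeholders chosen for illustration; they are not data from the study.

```python
def rrna_percentage(rrna_reads: int, total_reads: int) -> float:
    """Percentage of reads in a sample pool that are still rRNA after depletion."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * rrna_reads / total_reads


# Hypothetical (rRNA reads, total reads) per pool; NOT values from the study.
pools = {
    "Ae_pool": (50_000, 1_000_000),
    "An_pool": (600_000, 1_000_000),
}

percentages = {name: rrna_percentage(r, t) for name, (r, t) in pools.items()}

for name, pct in percentages.items():
    print(f"{name}: {pct:.1f}% rRNA reads remaining")
```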
Figure 2. Novel mosquito rRNA sequences were obtained using a unique reads filtering method.
(A) Schematic of sequencing and bioinformatics analyses performed in this study to obtain full-length 18S and 28S rRNA sequences as well as cytochrome c oxidase I (COI) DNA sequences. Nucleic acids were isolated from mosquito specimens for next-generation (for rRNA) or Sanger (for COI) sequencing. Two in-house libraries were created from the SILVA rRNA gene database: Insecta and Non-Insecta, which comprise 8,585 sequences and 558,185 sequences, respectively. Following BLASTn analyses against these two libraries, each RNA-sequencing (RNA-seq) read is assigned a ratio of BLASTn scores to describe its relative nucleotide similarity to insect rRNA sequences. Based on these ratios of scores, RNA-seq reads can then be filtered to remove non-mosquito reads prior to assembly with SPAdes to give full-length 18S and 28S rRNA sequences. Image created with https://biorender.com/. (B) Based on their ratio of scores, reads can be segregated into four categories, as shown on this ratio of scores versus number of reads plot for the representative specimen ‘CF S27’: (i) reads with hits only in the Insecta library (shaded in green), (ii) reads with a higher score against the Insecta library (shaded in blue), (iii) reads with a higher score against the Non-Insecta library (shaded in yellow), and (iv) reads with no hits in the Insecta library (shaded in red). We applied a conservative threshold at 0.8, indicated by the black horizontal line, where only reads at or above this threshold are used in the assembly with SPAdes. For this given specimen, 175,671 reads (96.3% of total reads) passed the ≥0.8 cut-off, 325 reads (0.18% of total reads) had ratios of scores <0.8, while 6,423 reads (3.52%) did not have hits against the Insecta library.
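The read-filtering step in panel B can be sketched as follows. The caption does not give the formula for the ratio of scores, so this sketch assumes a normalised form, Insecta score divided by the sum of the Insecta and Non-Insecta scores. Under that assumption, Insecta-only reads get a ratio of 1.0 and reads scoring higher against Non-Insecta fall below 0.5. The function names are hypothetical; only the 0.8 threshold and the read counts come from the caption.

```python
from typing import Optional

THRESHOLD = 0.8  # cut-off stated in the caption


def score_ratio(insecta: Optional[float], non_insecta: Optional[float]) -> Optional[float]:
    """ASSUMED ratio of scores: insecta / (insecta + non_insecta).

    None stands for "no BLASTn hit in that library". Reads without an
    Insecta hit get no ratio (category iv in the caption).
    """
    if insecta is None:
        return None
    if non_insecta is None:
        return 1.0  # hits only in the Insecta library (category i)
    return insecta / (insecta + non_insecta)


def keep_read(insecta: Optional[float], non_insecta: Optional[float],
              threshold: float = THRESHOLD) -> bool:
    """True if the read would be passed on to assembly under the assumed ratio."""
    ratio = score_ratio(insecta, non_insecta)
    return ratio is not None and ratio >= threshold


# Read counts reported in the caption for specimen 'CF S27'.
passed, below, no_insecta_hit = 175_671, 325, 6_423
total = passed + below + no_insecta_hit

pct_passed = 100 * passed / total
pct_below = 100 * below / total
pct_no_hit = 100 * no_insecta_hit / total
print(f"total reads: {total}")
print(f"passed: {pct_passed:.1f}%  below: {pct_below:.2f}%  no Insecta hit: {pct_no_hit:.2f}%")
```

The three counts sum to 182,419 reads, and the percentages recomputed from that total agree with the rounded values quoted in the caption.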
Figure 3. 28S sequences generated from this study clustered with conspecifics or congenerics from existing GenBank records.
A rooted phylogenetic tree based on full-length 28S sequences (3,900 bp) from this study and from GenBank was inferred using the maximum-likelihood method and constructed to scale in MEGA X (Kumar et al., 2018) using an unknown Horreolanus species found among our samples as an outgroup. Values at each node indicate bootstrap support (%) from 500 replications. Sequences from GenBank are annotated with filled circles and their accession numbers are shown. For sequences from this study, each specimen label contains information on taxonomy, origin (in two-letter country codes), and specimen ID number. Some specimens produced up to two consensus 28S sequences; this is indicated by the numbers 1 or 2 at the beginning of the specimen label. Specimen genera are indicated by colour: Culex in coral, Anopheles in purple, Aedes in dark blue, Mansonia in dark green, Culiseta in maroon, Limatus in light green, Coquillettidia in light blue, Psorophora in yellow, Mimomyia in teal, Uranotaenia in pink, and Eretmapodites in brown. Scale bar at 0.05 is shown.
Figure 3—figure supplement 1. Interspecific and intersubgeneric distances within the genus Anopheles indicate a greater degree of divergence than those within any other genera of family Culicidae.
The phylogenetic tree presented in Figure 3 based on 28S sequences from this study and from GenBank (annotated with filled circles) is depicted here in radial format to illustrate how the branch lengths separating Anopheline taxa are longer relative to other members of family Culicidae. An unknown Horreolanus species found among our samples serves as an outgroup. For sequences from this study, each specimen label contains information on taxonomy, origin (in two-letter country codes), and specimen ID number. Some specimens produced up to two consensus 28S sequences; this is indicated by the numbers 1 or 2 at the beginning of the specimen label. Specimen genera are indicated by colour: Culex in coral, Anopheles in purple, Aedes in dark blue, Mansonia in dark green, Limatus in light green, Coquillettidia in light blue, Psorophora in yellow, Mimomyia in teal, Uranotaenia in pink, and Eretmapodites in brown. Scale bar at 0.05 is shown.
Figure 3—figure supplement 2. Sequence conservation among 169 28S rRNA sequences obtained from this study and from GenBank combined.
Multiple sequence alignment was performed on 28S rRNA sequences, 3,900 bp in length. Each bar represents a 25 bp sliding window of the 28S rRNA sequence alignment where the y-axis values are the lowest percentage nucleotide identity found.
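One possible reading of the windowed conservation measure is sketched below. It makes three assumptions: per-column identity is the percentage of sequences sharing the most common character in that column, gaps count as ordinary characters, and each bar reports the lowest per-column identity within its window. The caption does not state the window step, so the sketch uses non-overlapping windows by default. The toy alignment and function names are hypothetical.

```python
from collections import Counter
from typing import List


def column_identity(column: str) -> float:
    """Percentage of sequences sharing the most common character in one column."""
    counts = Counter(column)
    return 100.0 * counts.most_common(1)[0][1] / len(column)


def window_min_identity(alignment: List[str], window: int = 25, step: int = 25) -> List[float]:
    """Lowest per-column identity within each window of the alignment."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    per_column = [
        column_identity("".join(seq[i] for seq in alignment)) for i in range(length)
    ]
    return [
        min(per_column[start:start + window])
        for start in range(0, length - window + 1, step)
    ]


# Toy alignment of four 8-column sequences (illustrative only).
toy = ["ACGTACGT", "ACGTACGA", "ACGTTCGT", "ACGAACGA"]
profile = window_min_identity(toy, window=4, step=4)
print(profile)
```

Setting `step=1` would give fully overlapping windows instead, if that is closer to how the figure was produced.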
Figure 4. Concatenating 28S and 18S rRNA sequences produces phylogenetic relationships that are concordant with classical Culicidae systematics with higher bootstrap support than 28S sequences alone.
This phylogenetic tree based on concatenated 28S+18S rRNA sequences (3,900+1,900 bp) generated from this study was inferred using the maximum-likelihood method and constructed to scale using MEGA X (Kumar et al., 2018) using an unknown Horreolanus species found among our samples as an outgroup. Values at each node indicate bootstrap support (%) from 500 replications. Each specimen label contains information on taxonomy, origin (as indicated in two-letter country codes), and specimen ID number. Some specimens produced up to two consensus 28S+18S rRNA sequences; this is indicated by the numbers 1 or 2 at the beginning of the specimen label. Specimen genera are indicated by colour: Culex in coral, Anopheles in purple, Aedes in dark blue, Mansonia in dark green, Limatus in light green, Coquillettidia in light blue, Psorophora in yellow, Mimomyia in teal, Uranotaenia in pink, and Eretmapodites in brown. Scale bar at 0.05 is shown.
Figure 4—figure supplement 1. Phylogenetic tree based on 28S rRNA sequences generated from this study (3,900 bp).
This tree was inferred using the maximum-likelihood method and constructed to scale in MEGA X (Kumar et al., 2018) using an unknown Horreolanus species found among our samples as an outgroup. Values at each node indicate bootstrap support (%) from 500 replications. For sequences from this study, each specimen label contains information on taxonomy, origin (in two-letter country codes), and specimen ID number. Some specimens produced up to two consensus 28S rRNA sequences; this is indicated by the numbers 1 or 2 at the beginning of the specimen label. Specimen genera are indicated by colour: Culex in coral, Anopheles in purple, Aedes in dark blue, Mansonia in dark green, Limatus in light green, Coquillettidia in light blue, Psorophora in yellow, Mimomyia in teal, Uranotaenia in pink, and Eretmapodites in brown. Scale bar at 0.05 is shown.
Figure 4—figure supplement 2. Phylogenetic tree based on 18S rRNA sequences (1,900 bp).
This tree was inferred using the maximum-likelihood method and constructed to scale in MEGA X (Kumar et al., 2018) using an unknown Horreolanus species found among our samples as an outgroup. Values at each node indicate bootstrap support (%) from 500 replications. For sequences from this study, each specimen label contains information on taxonomy, origin (in two-letter country codes), and specimen ID number. One 18S rRNA sequence was obtained for each specimen. Specimen genera are indicated by colour: Culex in coral, Anopheles in purple, Aedes in dark blue, Mansonia in dark green, Limatus in light green, Coquillettidia in light blue, Psorophora in yellow, Mimomyia in teal, Uranotaenia in pink, and Eretmapodites in brown. Scale bar at 0.05 is shown.
Figure 5. Cytochrome c oxidase I (COI) sequences cluster by species but show phylogenetic relationships that contrast those derived from rRNA trees.
A phylogenetic tree based on COI sequences (621–699 bp) was inferred using the maximum-likelihood method and constructed to scale using MEGA X (Kumar et al., 2018) with three water mite species to serve as outgroups. Outgroup sequences obtained from GenBank are annotated with filled circles and their accession numbers are shown. Values at each node indicate bootstrap support (%) from 500 replications. Each specimen label contains information on taxonomy, origin (as indicated in two-letter country codes), and specimen ID. Specimen genera are indicated by colour: Culex in coral, Anopheles in purple, Aedes in dark blue, Mansonia in dark green, Limatus in light green, Coquillettidia in light blue, Psorophora in yellow, Mimomyia in teal, Uranotaenia in pink, and Eretmapodites in brown. Scale bar at 0.05 is shown.

Update of

  • doi: 10.1101/2022.02.01.478639

References

    1. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. Journal of Molecular Biology. 1990;215:403–410. doi: 10.1016/S0022-2836(05)80360-2.
    2. Arctander P. Comparison of a mitochondrial gene and a corresponding nuclear pseudogene. Proceedings. Biological Sciences. 1995;262:13–19. doi: 10.1098/rspb.1995.0170.
    3. Arunachalam N, Samuel PP, Hiriyan J, Thenmozhi V, Gajanana A. Japanese encephalitis in Kerala, south India: can Mansonia (Diptera: Culicidae) play a supplemental role in transmission? Journal of Medical Entomology. 2004;41:456–461. doi: 10.1603/0022-2585-41.3.456.
    4. Aspen S, Savage HM. Polymerase chain reaction assay identifies North American members of the Culex pipiens complex based on nucleotide sequence differences in the acetylcholinesterase gene Ace.2. Journal of the American Mosquito Control Association. 2003;19:323–328.
    5. Auerswald H, Maquart PO, Chevalier V, Boyer S. Mosquito vector competence for Japanese encephalitis virus. Viruses. 2021;13:1154. doi: 10.3390/v13061154.
