Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2018 Oct 16:7:e36495.
doi: 10.7554/eLife.36495.

Firefly genomes illuminate parallel origins of bioluminescence in beetles

Affiliations

Firefly genomes illuminate parallel origins of bioluminescence in beetles

Timothy R Fallon et al. Elife. .

Abstract

Fireflies and their luminous courtships have inspired centuries of scientific study. Today firefly luciferase is widely used in biotechnology, but the evolutionary origin of bioluminescence within beetles remains unclear. To shed light on this long-standing question, we sequenced the genomes of two firefly species that diverged over 100 million-years-ago: the North American Photinus pyralis and Japanese Aquatica lateralis. To compare bioluminescent origins, we also sequenced the genome of a related click beetle, the Caribbean Ignelater luminosus, with bioluminescent biochemistry near-identical to fireflies, but anatomically unique light organs, suggesting the intriguing hypothesis of parallel gains of bioluminescence. Our analyses support independent gains of bioluminescence in fireflies and click beetles, and provide new insights into the genes, chemical defenses, and symbionts that evolved alongside their luminous lifestyle.

Keywords: Aquatica lateralis; Ignelater luminosus; Photinus pyralis; bioluminescence; firefly; genetics; genomics; luciferase.

PubMed Disclaimer

Conflict of interest statement

TF, SL, CC, MB, GM, AB, MB, HD, IW, JD, AS, CS, KS, DH, RS, DN, SL, SS, SB, AL, YO, JW No competing interests declared

Figures

Figure 1.
Figure 1.. Geographic and phylogenetic context of the Big Dipper firefly, Photinus pyralis.
(A) P. pyralis males emitting their characteristic swooping ‘J’ patrol flashes over a field in Homer Lake, Illinois. Females cue in on these species-specific flash patterns and respond with their own species-specific flash (Lloyd, 1966). Photo credit: Alex Wild. Inset: male and female P. pyralis in early stages of mating. Photo credit: Terry Priest. (B) Cladogram depicting the hypothetical phylogenetic relationship between P. pyralis and related bioluminescent and non-bioluminescent taxa with Tribolium castaneum and Drosophila melanogaster as outgroups. Numbers at nodes give approximate dates of divergence in millions of years ago (mya) (Misof et al., 2014; Mckenna et al., 2015). Right: Dorsal and ventral photos of adult male specimens. Note the well-developed ventral light organs on the true abdominal segments 6 and 7 of P. pyralis and A. lateralis. In contrast, the luminescent click beetle, I. luminosus, has paired dorsal light organs at the base of its prothorax (arrowhead) and a lantern on the anterior surface of the ventral abdomen (not visible). (C) Empirical range of P. pyralis in North America, extrapolated from 541 reported sightings (Appendix 1.2). Collection sites of individuals used for genome assembly are denoted with circles and location codes. Cross hatches represent areas which likely have P. pyralis, but were not sampled. Diagonal hashes represent Ontario, Canada.
Figure 2.
Figure 2.. Photinus pyralis genome assembly and analysis.
(A) Assembled Ppyr1.3 linkage groups with annotation of the location of known luminescence-related genes, combined with Hi-C linkage density maps. Linkage group 3a (box with black arrow) corresponds to the X chromosome (Appendix 1.6.4.1). (B) Fluorescence in situ hybridization (FISH) on mitotic chromosomes of a P. pyralis larvae. The telomeric repeats TTAGG (green) localize to the ends of chromosomes stained with DAPI (blue). 20 paired chromosomes indicates that this individual was an XX female (Appendix 1.13). (C) Genome schematic of P. pyralis mitochondrial genome (mtDNA). Like other firefly mtDNAs, it has a tandem repetitive unit (TRU) (Appendix 1.8). (D) mCG is enriched across gene bodies of P. pyralis and shows methylation levels that are at least two times higher than other holometabolous insects (Appendix 1.12). (E) Orthogroup (OGs) clustering analysis of genes with Orthofinder (Emms and Kelly, 2015) shows a high degree of overlap of the P. pyralis, A. lateralis, and I. luminosus genesets with the geneset of Tribolium castaneum. Numbers within curved brackets (colored by species) represent gene count from specific species within the shared orthogroups. Numbers with square brackets (black color) represent total gene count amongst shared orthogroups. OGs = orthogroups, *=Not fully filtered to single isoform per gene. See Appendix 4.2.1 for more detail. Intermediate scripts and species-specific overlaps are available as Figure 2—source data 1. (F) Assembly statistics for presented genomes. *=Tribolium castaneum model beetle genome assembly (Tribolium Genome Sequencing Consortium et al., 2008) **=Genome size estimated by FC: flow cytometry. P. pyralis n = 5 females (SEM) I. luminosus n = 5 males (SEM), A. lateralis n = 3 technical-replicates of one female (SD). ***=Complete (C), and Duplicated (D), percentages for the Endopterygota BUSCO (Simão et al., 2015) profile (Appendix 1.4, 2.4, 3.4, 4.1).
Figure 3.
Figure 3.. A genomic view of luciferase evolution.
(A) The reaction scheme of firefly luciferase is related to that of fatty acyl-CoA synthetases. (B) Model for genomic evolution of firefly luciferases. Ranging from genome structures of luciferase loci in extant fireflies (top), to inferred genomic structures in ancestral species (bottom). Arrow (left) represents ascending time. Not all adjacent genes within the same clade are shown. (C) Maximum likelihood tree of luciferase homologs. Grey circles above gene names indicate the presence of peroxisomal targeting signal 1 (PTS1). Color gradients indicate the transcript per million (TPM) values of whole body in each sex/stage (grey to blue) and in the prothorax or abdominal lantern (grey to orange to green). Tree and annotation visualized using iTOL (Letunic and Bork, 2016). Prothorax and abdominal lantern expression values for I. luminosus are from whole prothorax plus head, and metathorax plus the two most anterior abdominal segments. Fluc = firefly luciferases, Eluc = elaterid luciferases, R/PLuc = rhagophthalmid/phengodid luciferases. (Appendix 4.3.2) Gene tree, gene accession numbers, annotation, and expression values are available as Figure 3—source data 1. (D) Synteny analysis of beetle luciferase homologs. Nine of the 14 A. lateralis PACS/ACS genes closely flank AlatLuc1 on scaffold 228, while 4 of the 13 P. pyralis PACS/ACS genes are close neighbors of PpyrLuc1 on LG1, with a further seven genes 2.4 Mbp and 39.1 Mbp away on the same linkage-group. Although the Luc1 loci in P. pyralis and A. lateralis are evidently derived from a common ancestor, the relative positions of the most closely related flanking PACS/ACS genes have diverged between the two species. IlumLuc was captured on a separate scaffold (Ilumi1.2_Scaffold13255) from its most most closely related PACSs (IlumPACS8, IlumPACS9) on Ilumi1.2_Scaffold9864, although three more distantly related PACS genes (IlumiPACS1, IlumiPACS2, IlumiPACS4) are co-localized with IlumLuc. In contrast, a different scaffold (Ilumi1.2_Scaffold9654) shows orthology to the firefly Luc1 locus. The full Ilumi1.2_Scaffold13255 was produced by a manual evidence-supported merge of two scaffolds (Appendix 3.5.4). Genes with a PTS1 are indicated by a dark outline, except for the genes with white interiors, which instead represent non-PACS/ACS genes without an identified homolog in the other scaffolds. Co-orthologous genes are labeled in the same color in the phylogenetic tree and are connected with corresponding color bands in synteny diagram. Genes and genomic regions are to scale (Scale bar = 25 Kbp). Gaps excluded from the figure are shown with dotted lines and are annotated with their length in square brackets. Scaffold ends are shown with rough black bars. MGST = Microsomal glutathione S-transferase, IMP = Inositol monophosphatase, PRNT = Polyribonucleotide nucleotidyltransferase. Figure produced with GenomeTools ‘sketch’ (v1.5.9) (Gremme et al., 2013). Figure production scripts available as Figure 3—source data 2.
Figure 4.
Figure 4.. Parallel evolution of elaterid and firefly luciferase.
(A) Ancestral state reconstruction recovers at least two gains of luciferase activity in bioluminescent beetles. Luciferase activity (top right figure key; black: luciferase activity, white: no luciferase activity, shaded: undetermined) was annotated on extant firefly luciferase homologs via literature review or inference via direct orthology. The ancestral states of luciferase activity within the putative ancestral nodes were then reconstructed with an unordered parsimony framework and a maximum likelihood (ML) framework (bottom left figure key; Appendix 4.3.3). Two gains (‘G’) of luciferase activity, annotated with black arrows and yellow stars, are hypothesized. These hypothesized gains occurred once in a gene within the common ancestor of fireflies, rhagophthalmid, and phengodid beetles, and once in a gene within the common ancestor of bioluminescent elaterid beetles. Scale bar is substitutions per site. Numbers adjacent to nodes represents node support. NEXUS and newick files available as Figure 4—source data 1 (B) Molecular adaptation analysis supports independent neofunctionalization of click beetle luciferase. We tested the molecular adaptation of elaterid luciferase using the adaptive branch-site REL test for episodic diversification (aBSREL) method (Smith et al., 2015) (Appendix 4.3.4). The branch leading to the common ancestor of elaterid luciferases (red star) was one of three branches (red and blue stars) recovered with significant (p<0.01) evidence of positive selection, with 35% of sites showing strong directional selection (ω or max dN/dS = 3.98), which we interpret as signal of the initial neofunctionalization of elaterid ancestral luciferase (EAncLuc) from an ancestor without luciferase activity. As the selected branches with blue stars are red-shifted elaterid luciferases (Oba et al., 2010a; Stolz et al., 2003), they may represent the post-neofunctionalization selection of a few key sites via sexual selection of emission colors. Specific sites identified as under selection using Mixed Effect Model of Evolution (MEME) and Phylogenetic Analysis by Maximum Likelihood (PAML) methods are described in Appendix 4.3.4. The tree and results from the full adaptive model are shown. Branch length, with the exception of the PpyrLuc1 branch which was shortened, reflects the number of substitutions per site. Numbers adjacent to nodes represents node support. Figure was produced with iTOL (Letunic and Bork, 2016). Gene tree, metadata, and coding nucleotide multiple sequence alignment available as Figure 4—source data 2.
Figure 5.
Figure 5.. Comparative analyses of firefly lantern expression highlight likely metabolic adaptations to bioluminescence.
Enzymes which are highly expressed (HE), differentially expressed (DE), and annotated as enzymes via InterProScan are shown in the Venn diagrams for their respective species. Those genes in the intersection of the two sets which are within the same orthogroup (OGs) as determined by OrthoFinder are shown in the table. Many-to-one orthology relationships are represented by bold orthogroups and blank cells. See Appendix 4.2.2 for more detail. *=genes of previously described function. Underlying expression quantification and Venn analysis available on FigShare: (DOI: 10.6084/m9.figshare.5715151)
Figure 6.
Figure 6.. An expansion in the CYP303-P450 family correlates with lucibufagin content.
(A) Hypothesized lucibufagin biosynthetic pathway, starting from cholesterol. (B) LC-HRAM-MS multi-ion-chromatograms (MIC) showing the summation of exact mass traces for the [M + H]+ of 11 lucibufagin chemical formulas ± 5 ppm, calibrated for run-specific systematic m/z error (Appendix 4—table 9). Y-axis upper limit for P. pyralis adult hemolymph and larval body extract is 1000x larger than other traces. Arrows (blue/teal) indicate features with high MS2 spectral similarity to known lucibufagins. Sporadic peaks in A. lateralis body, and I. luminosus thorax traces are not abundant, preventing MS2 spectral acquisition and comparison, but do not match the m/z and RT of P. pyralis lucibufagins (Appendix 4.6). (C) Maximum likelihood tree of CYP303 family cytochrome P450 enzymes from P. pyralis, A. lateralis, T. castaneum, and D. melanogaster. P. pyralis shows a unique CYP303 family expansion, whereas the other species only have a single CYP303. Circles represent node bootstrap support >60%. Branch length measures substitutions per site. Pseudogenes are annotated with the greek letter Ψ (Appendix 1.10.1; 4.2.4). (D) Genomic loci for P. pyralis CYP303 family genes. These genes are found in multiple gene clusters on LG9, supporting origin via tandem duplication. Introns >4 kbp are shown.
Appendix 1—figure 1.
Appendix 1—figure 1.. Detailed geographic distribution map for P. pyralis.
P. pyralis sightings (red circles show county centroided reports) in the United States and Ontario, Canada (diagonal hashes). The World Wildlife Fund Terrestrial Ecoregions (Olson et al., 2001; World Wildlife Fund, 2017) are also shown (colored shapes). The P. pyralis sighting dataset shown is identical to that used to prepare Figure 1B.
Appendix 1—figure 2.
Appendix 1—figure 2.. P. pyralis aedeagus (male genitalia).
(A) Ventral and (B) side view of a P. pyralis aedeagus dissected from specimens collected on the same date and locality as those used for PacBio sequencing. Note the strongly sclerotized paired ventro-basal processes (‘mickey mouse ears’) emerging from the median process, characteristic of P. pyralis (Green, 1956).
Appendix 1—figure 3.
Appendix 1—figure 3.. Luminescence of P. pyralis eggs.
(A) Photograph under ambient light of ~1 day post-deposition P. pyralis eggs. (B) Photograph of self-luminescence of ~1 day post-deposition P. pyralis eggs. Both photographs taken with a NightOwl LB98 cooled CCD luminescence imager (Berthold Technologies, USA). Luminescence was not visible to the dark-adapted eye.
Appendix 1—figure 4.
Appendix 1—figure 4.. Gregarious predation of young P. pyralis larvae on a live Lumbricus terrestris.
Both P. pyralis larvae (red arrows), and Enchytraeus albidus (yellow arrows), were observed to feed on the paralyzed earthworms.
Appendix 1—figure 5.
Appendix 1—figure 5.. Gregarious predation of 3rd-4th instar P. pyralis larvae on a live Lumbricus terrestris.
Appendix 1—figure 6.
Appendix 1—figure 6.. Genome scope kmer analysis of the P. pyralis short read library.
(A) Linear and (B) log plot of a kmer spectral genome composition analysis of the ‘8369’ P. pyralis Illumina short-read library from a single P. pyralis XO adult male (Appendix 1.5.1; Appendix 4—table 1) with jellyfish (v2.2.9; parameters: -C -k 35) (Marçais and Kingsford, 2011) and GenomeScope (v1.0; parameters: Kmer length = 35, Read length = 100, Max kmer coverage = 1000) (Vurture et al., 2017). len = inferred haploid genome length, uniq = percentage non-repetitive sequence, het = overall rate of genome heterozygosity, kcov = mean kmer coverage for heterozygous bases, err = error rate of the reads, dup: average rate of read duplications. These results are consistent with the genome size of a XO male, when possible systematic error of kmer spectral analysis and flow cytometry genome size estimates is considered. The heterozygosity is somewhat low when compared to some other arthropods.
Appendix 1—figure 7.
Appendix 1—figure 7.. PFGE of P. pyralis HMW DNA used for PacBio sequencing.
Lane 1 was used for further library prep and sequencing, Lanes 2–5 represent separate batches of P. pyralis HMW DNA that was not used for PacBio sequencing. Lane 1 was used as it had the highest DNA yield, and an equivalent DNA size distribution to the other samples.
Appendix 1—figure 8.
Appendix 1—figure 8.. Subread length distribution for P. pyralis PacBio RSII sequencing.
Figure produced with SMRTPortal (v2.3.0.140936, Pacific Biosciences, 2017) by aligning all PacBio reads from data from the 61 SMRT cells against Ppyr1.3 using the RS_Resequencing.1 protocol with default parameters. Subread length unit is basepair (bp).
Appendix 1—figure 9.
Appendix 1—figure 9.. Blobplot of Illumina short-insert reads aligned against the Ppyr1.2 reference.
Coverage shown represents mean coverage of reads from the Illumina short-insert library (Sample name 8369; Appendix 4—table 1), aligned against Ppyr1.2 using Bowtie2 with parameters (--local). Scaffolds were taxonomically annotated as described in Appendix 1.6.4.2.
Appendix 1—figure 10.
Appendix 1—figure 10.. Blobplot of P. pyralis PacBio reads aligned against Ppyr1.2.
Coverage shows represents mean coverage of reads from the PacBio library (Sample name 1611; Appendix 4—table 1). The reads were aligned using SMRTPortal v2.3.0.140893 with the ‘RS_Resequencing.1’ protocol with default parameters. Scaffolds were taxonomically annotated as described in Appendix 1.6.4.2.
Appendix 1—figure 11.
Appendix 1—figure 11.. Venn diagram representation of blobtools taxonomic annotation filtering approach for Ppyr1.2 scaffolds.
(A) The blue set represents scaffolds which have >10.0 coverage in both Illumina and PacBio libraries. (B) The red set represents scaffolds which had either genes on repeats (non simple or low-complexity) annotated. (C) The green set represents scaffolds with suspicious taxonomic assignment (Non ‘Arthropod’ or ‘no-hit’). Outside A, B, and C, represents low-coverage, unannotated scaffolds. Ppyr1.3 consists of the intersection of A and B, minus the intersection of C. All linkage groups (LG1-LG10) were annotated as ‘Arthropod’ by blobtools, and captured in the intersection between A and B but not set C.
Appendix 1—figure 12.
Appendix 1—figure 12.. Mitochondrial genome of P. pyralis.
The mitochondrial genome of P. pyralis was assembled and annotated as described. Note the firefly specific tandem-repeat-unit (TRU) region. Figure produced with Circos (Krzywinski et al., 2009).
Appendix 1—figure 13.
Appendix 1—figure 13.. P. pyralis P450 gene phylogenetic tree.
Neighbor-joining phylogenetic tree of 165 cytochrome P450s from P. pyralis. Four pseudogenes and one short sequence were removed. The P450 clans have colored spokes (CYP2 clan brown, CYP3 clan green, CYP4 clan red, Mito clan blue). Shading highlights different families and family clusters within the CYP3 clan. The tree was made using Clustal Omega at EBI (European Bioinformatics Institute, 2017) with default settings. The resulting multiple sequence alignment is available on FigShare (DOI: 10.6084/m9.figshare.5697643). The tree was drawn with FigTree v1.3.1 using midpoint rooting.
Appendix 2—figure 1.
Appendix 2—figure 1.. Genome scope kmer analysis of the A. lateralis short-insert genomic library.
(A) Linear and (B) log plot of a kmer spectral genome composition analysis of the ‘FFGPE_PE200’ A. lateralis Illumina short-insert library (Appendix 2.5; Appendix 4—table 1) with jellyfish (v2.2.9; parameters: -C -k 35) (Marçais and Kingsford, 2011) and GenomeScope (v1.0; parameters: Kmer length = 35, Read length = 100, Max kmer coverage = 1000) (Vurture et al., 2017). len = inferred haploid genome length, uniq = percentage non-repetitive sequence, het = overall rate of genome heterozygosity, kcov = mean kmer coverage for heterozygous bases, err = error rate of the reads, dup: average rate of read duplications. These results are consistent when considering the possible systematic error of kmer spectral analysis and flow cytometry genome size estimates. The heterozygosity is lower than that measured for P. pyralis, possibly reflecting the long-term laboratory rearing in reduced population sizes of A. lateralis strain Ikeya-Y90.
Appendix 2—figure 2.
Appendix 2—figure 2.. Blobplot of A. lateralis Illumina reads aligned against Alat1.2.
Coverage shown represents mean coverage of reads from the Illumina short-insert library (Sample name FFGPE_PE200; Appendix 4—table 1), aligned against Alat1.2 using Bowtie2. Scaffolds were taxonomically annotated as described in Appendix 2.5.2.
Appendix 3—figure 1.
Appendix 3—figure 1.. I. luminosus aedeagus (male genitalia).
(A) Dorsal and (B) ventral view of an Ignelater luminosus aedeagus, dissected from the same batch of specimens used for linked-read sequencing and genome assembly. The species identity of this specimen was confirmed as I. luminosus by comparison of the aedeagus to the keys of Costa and Rosa (Costa, 1975; Rosa, 2007; Rosa, 2010).
Appendix 3—figure 2.
Appendix 3—figure 2.. Genome scope kmer analysis of the I. luminosus linked-read genomic library.
(A) Linear and (B) log plot of a kmer spectral genome composition analysis of the ‘1610_IlumiHiSeqX’ I. luminosus Illumina linked-read library (Appendix 2.5; Appendix 4—table 1) with jellyfish (v2.2.9; parameters: -C -k 35) (Marçais and Kingsford, 2011) and GenomeScope (v1.0; parameters: Kmer length = 35, Read length = 138, Max kmer coverage = 1000) (Vurture et al., 2017). Before analysis, 10x Chromium barcodes were trimmed off Read1 using cutadapt (v1.8; parameters: -u 23) (Martin, 2011). vlen = inferred haploid genome length, uniq = percentage non-repetitive sequence, het = overall rate of genome heterozygosity, kcov = mean kmer coverage for heterozygous bases, err = error rate of the reads, dup: average rate of read duplications. These results are consistent when considering the possible systematic error of kmer spectral analysis and flow cytometry genome size estimates. The heterozygosity is higher than that measured for P. pyralis and A. lateralis. The read error rate for this library is also significantly higher than the P. pyralis and A. lateralis results, possibly highlighting the difference in raw read error rate between HiSeq2500 and HiSeqX sequencing, or is possibly an artifact of the Chromium library.
Appendix 3—figure 3.
Appendix 3—figure 3.. Blobtools plot of Ilumi1.0.
Coverage shown represents mean coverage of reads from the HiSeqX Chromium library sequencing (Sample name 1610_IlumiHiSeqX; Appendix 4—table 1), aligned against Ilumi1.0 using Bowtie2 with parameters (--local). Scaffolds were taxonomically annotated as described in Appendix 3.5.2.
Appendix 3—figure 4.
Appendix 3—figure 4.. Self alignment of the Ilumi1.1_Scaffold13255 right-edge extending long MinION read.
Alignment performed in in Gepard (Krumsiek et al., 2007). Note the large (10 kbp+) tandem repetitive region.
Appendix 3—figure 5.
Appendix 3—figure 5.. Diagram of manual scaffold merges between Ilumi1.1 and Ilumi1.2.
Diagram of the manual merge of Ilumi1.1_Scaffold13255 with Ilumi1.1_Scaffold11560 between I. luminosus genome assembly versions Ilumi1.1 and Ilumi1.2. This merge was supported by: (1) The putative missing first exon of IlumPACS4 being present on the right edge of Ilumi1.2_Scaffold11560. (2) The right edge of Ilumi1.1_Scaffold13255, and the right edge of Ilumi1.1_Scaffold11560, having anti-parallel versions of a homologous complex tandem repeat. See Figure 3 in the maintext for explanation of presented genes.
Appendix 3—figure 6.
Appendix 3—figure 6.. Mitochondrial genome of I. luminosus.
The mitochondrial genome of I. luminosus was assembled and annotated as described. in the Appendix 3.10. Figure produced with Circos (Krzywinski et al., 2009).
Appendix 4—figure 1.
Appendix 4—figure 1.. Venn diagram of P. pyralis, A. lateralis, I. luminosus, T. castaneum, and D. melanogaster orthogroup relationships.
Orthogroups were calculated between the PPYR_OGS1.1, AQULA_OGS1.0, ILUMI_OGS1.2, genesets, and the T. casteneum and D. melanogaster filtered Uniprot reference proteomes using OrthoFinder(Emms and Kelly, 2015). See Appendix 4.2.1 for description of clustering method. OGs = Orthogroups, OGS = Official gene set, *=Not completely filtered to single peptide per gene. Figure produced with InteractiVenn (Heberle et al., 2015). Intermediate scripts and species specific overlaps are available as Figure 2—source data 1.
Appendix 4—figure 2.
Appendix 4—figure 2.. DNA and tRNA methyltransferase gene phylogeny.
Levels and patterns of mCG in P. pyralis are corroborated by the presence of de novo and maintenance DNMTs (DNMT3 and DNMT1, respectively). Notably, P. pyralis possesses two copies of DNMT1, and 3 copies of DNMT3, in contrast to a single copy of DNMT1 and DNMT3 in the firefly Aquatica lateralis. The evolutionary history was inferred by using the Maximum Likelihood method with the LG + G (five gamma categories) (Le and Gascuel, 2008). Evolutionary analyses were conducted in MEGA7 (Kumar et al., 2016). Size of circles at nodes corresponds to bootstrap support (100 bootstrap replicates). Branch lengths are in amino acid substitutions per site. T. castaneum = Tribolium castaneum, D. melanogaster = Drosophila melanogaster, N. vespilloides = Nicrophorus vespilloides. The multiple sequence alignment and phylogenetic topology are available on FigShare (10.6084/m9.figshare.6531311).
Appendix 4—figure 3.
Appendix 4—figure 3.. Detection of DNA methylation using CpG[O/E].
Appendix 4—figure 4.
Appendix 4—figure 4.. Intron-exon structure of beetle luciferases.
(A) Intron-exon structure of P. pyralis and A. lateralis Luc1 and Luc2 from Ppyr1.3 and Alat1.3, and IlumLuc from Ilumi1.2. Between fireflies and click-beetles, the structure of the luciferase genes are globally similar, with seven exons, similar intron lengths, and identical splice junction locations (Appendix 4—figure 5). The intron-exon structure of IlumLuc is consistent with the reported intron-exon structure of Pyrophorus plagiophthalamus luciferase (Velez and Feder, 2006).
Appendix 4—figure 5.
Appendix 4—figure 5.. Multiple sequence alignment of firefly luciferase genes.
MAFFT (Katoh and Standley, 2013) L-INS-i multiple sequence alignment of luciferase gene nucleotide sequences from PpyrOGS1.1 and AlatOGS1.0 demonstrates the location of intron-exon junctions (bolded blue text) is completely conserved amongst the four luciferases. Exonic sequence is capitalized, whereas intronic sequence is lowercase.
Appendix 4—figure 6.
Appendix 4—figure 6.. Preliminary maximum likelihood phylogeny of luciferase homologs.
A preliminary maximum likelihood tree was reconstructed from a 385 amino acid multiple sequence alignment, generated via a BLASTP and orthoDB search using P. pyralis luciferase as query (e-value: 1.0 × 10−60). Members of the clade that includes both known firefly luciferase and CG6178 of D. melanogaster (bold) are defined as luciferase co-orthologous genes (highlighted in gray), and were selected and used for the independent maximum likelihood analysis in Figure 3C (Appendix 4.3.2). Branch length represents substitutions per site. Genes found from this study are indicated in blue. Lampyridae Luc1-type and Luc2-type luciferases are highlighted in yellow-green and green. Rhagophthalmidae and Phengodidae luciferases are highlighted in lime-green. Elateridae luciferases are highlighted in yellow. Genbank accession numbers of luciferase orthologs genes are indicated after the species name. OrthoDB taxon and protein IDs of luciferase co-orthologs are indicated after species name. Bootstrap values are indicated on the nodes. The genes from Coleoptera are indicated as purple strip. Grey closed circles indicate genes that have PTS1.
Appendix 4—figure 7.
Appendix 4—figure 7.. Amino acid variation at sites recovered in selection analysis.
Amino acid variation of extant Elaterid luciferases (Clade D ‘Eluc’ subset; Figure 3) at all sites recovered via both the MEME and PAML-BEB selection analysis (Appendix 4—table 5). Site numbering relative to IlumLuc. Figure produced with seqkit (Shen et al., 2016) and WebLogo(v3.6.0) (Crooks et al., 2004).
Appendix 4—figure 8.
Appendix 4—figure 8.. Maximum likelihood gene tree of the combined adenylyl-sulfate kinase and sulfate adenylyltransferase (ASKSA) orthogroup.
Peptide sequences from P. pyralis, A. lateralis, I. luminosus, T. castaneum, and D. melanogaster were clustered (orthogroup # 698), multiple sequence aligned, and refactored into a species rooted maximum likelihood tree, via the OrthoFinder pipeline (Appendix 4.2.1). As this is a genome-wide analysis where bootstrap replicates would be computationally prohibitive, no bootstrap replicates were performed to evaluate the support of the tree topology. PTS1 sequences were predicted from the peptide sequence using the PTS1 predictor server (Neuberger et al., 2017). Figure produced with iTOL (Letunic and Bork, 2016).
Appendix 4—figure 9.
Appendix 4—figure 9.. ML tree and gene expression levels of opsin genes.
Appendix 4—figure 10.
Appendix 4—figure 10.. Positive mode MS1 total-ion-chromatogram (TIC) of P.pyralis adult hemolymph LC-HRAM-MS data.
Figure produced using MZmine2 (Pluskal et al., 2010).
Appendix 4—figure 11.
Appendix 4—figure 11.. Negative mode MS1 total-ion-chromatogram (TIC) of P. pyralis adult hemolymph LC-HRAM-MS data.
Figure produced using MZmine2 (Pluskal et al., 2010).
Appendix 4—figure 12.
Appendix 4—figure 12.. Positive mode MS1 total-ion-chromatogram (TIC) of P. pyralis larval whole body minus two posterior segments LC-HRAM-MS data.
Figure produced using MZmine2 (Pluskal et al., 2010).
Appendix 4—figure 13.
Appendix 4—figure 13.. Negative mode MS1 total-ion-chromatogram (TIC) of P. pyralis larval whole body minus two posterior segments LC-HRAM-MS data.
Figure produced using MZmine2 (Pluskal et al., 2010).
Appendix 4—figure 14.
Appendix 4—figure 14.. Positive mode MS1 total-ion-chromatogram (TIC) of A. lateralis adult hemolymph LC-HRAM-MS data.
Figure produced using MZmine2 (Pluskal et al., 2010).
Appendix 4—figure 15.
Appendix 4—figure 15.. Negative mode MS1 total-ion-chromatogram (TIC) of A. lateralis adult hemolymph LC-HRAM-MS data.
Figure produced using MZmine2 (Pluskal et al., 2010).
Appendix 4—figure 16.
Appendix 4—figure 16.. Positive mode MS1 total-ion-chromatogram (TIC) of A. lateralis larval whole body LC-HRAM-MS data.
Figure produced using MZmine2 (Pluskal et al., 2010).
Appendix 4—figure 17.
Appendix 4—figure 17.. Negative mode MS1 total-ion-chromatogram (TIC) of A. lateralis larval whole body extract LC-HRAM-MS data.
Figure produced using MZmine2 (Pluskal et al., 2010).
Appendix 4—figure 18.
Appendix 4—figure 18.. Positive mode MS1 total-ion-chromatogram (TIC) of I. luminosus mesothorax +abdomen extract LC-HRAM-MS data.
Figure produced using MZmine2 (Pluskal et al., 2010).
Appendix 4—figure 19.
Appendix 4—figure 19.. Negative mode MS1 total-ion-chromatogram (TIC) of I.luminosus mesothorax + abdomen extract LC-HRAM-MS data.
Figure produced using MZmine2 (Pluskal et al., 2010).
Appendix 4—figure 20.
Appendix 4—figure 20.. Positive mode MS2 spectra of (A) diacetylated lucibufagin [M + H]+ and (B) dipropylated lucibufagin [M + H]+.
Appendix 4—figure 21.
Appendix 4—figure 21.. MS2 spectral similarity network for P.pyralis adult hemolymph lucibufagins.
(A) MS2 similarity network produced with the MZmine2 MS2 similarity search module. Nodes represent MS2 spectra from the initial dataset, whereas edges represent an MS2 similarity match between two MS2 spectra. Thickness/label of the edge represents the number of ions matched between the two MS2 spectra. (B) Table of matched ions between diacetylated lucibufagin (m/z: 533.2385 RT:15.1), and core (unacetylated) lucibufagin (m/z: 449.2171 RT:10.8 min). MS1 adducts and complexes of the presented ions were manually removed.
Appendix 5—figure 1.
Appendix 5—figure 1.. Mitochondrial genome of Apocephalus antennatus.
The mitochondrial genome of A. antennatus was assembled and annotated as described in the Appendix 5.2, and taxonomically identified as described in Appendix 5.3. Figure produced with Circos (Krzywinski et al., 2009).
Appendix 5—figure 2.
Appendix 5—figure 2.. Photinus pyralis viruses and endogenous viral-like elements.
(A) Phylogenetic tree based in MAFFT alignments of predicted replicases of Orthomyxoviridae (OMV) ICTV accepted viruses (green stars), new Photinus pyralis viruses (underlined) and tentative OMV-like virus species (black stars). ICTV recognized OMV genera: Quaranjavirus (orange), Thogotovirus (purple), Issavirus (turquoise), Influenzavirus A-D (green). Silhouettes correspond to host species. Asterisk denote FastTree consensus support >0.5. Question marks depict viruses with unidentified or unconfirmed host. (B) Phylogenetic tree of OMV proposed and recognized species in the context of all ssRNA (-) virus species, based on MAFFT alignments of refseq replicases. Photinus pyralis viruses are portrayed by black stars. (C) Phylogenetic tree of ICTV recognized OMV species and PpyrOMLV1 and 2. Numbers indicate FastTree consensus support. (D) Genetic distances of concatenated gene products of OMV depicted as circoletto diagrams. Proteins are oriented clockwise in N-HA-PB1-PB2-PA order when available. Sequence similarity is expressed as ribbons ranging from blue (low) to red (high). (E) Genomic architecture, predicted gene products and structural and functional domains of PpyrOLMV1 and 2. (F) Virus genomic noncoding termini analyses of PpyrOLMV1 and 2 in the context of ICTV OMV. The 3’ and 5’ end, A and U rich respectively, partially complementary sequences are associated to tentative panhandle polymerase binding and replication activity, typical of OMV. (G) 3D renders of the heterotrimeric polymerase of PpyrOMLV1 based on Swiss-Expasy generated models using as template the Influenza A virus polymerase structure. Structure comparisons were made with the MatchAlign tool of the Chimera suite, and solved in PyMOL. (H) Conserved functional motifs of PpyrOLMV1 and 2 PB1 and related viruses. Motif I-III are essential for replicase activity of viral polymerase. (I) Dynamic and prevalent virus derived RNA levels of the corresponding PpyrOMLV1 and 2 genome segments, determined in 24 RNA libraries of diverse individuals/developmental stages/tissues and geographic origins. RNA levels are expressed as normalized TPM, heatmaps were generated by Shinyheatmap. Values range from low (green) to high (red). (J) Firefly EVEs (FEVEs) identified in the P. pyralis genome assembly mapped to the corresponding pseudo-molecules. A 15 Kbp region flanking nucleoprotein like FEVES are depicted, enriched in transposable elements. Representative products of a putative PB2 FEVE are aligned to the corresponding protein of PpyrOMLV 2.
Appendix 5—figure 3.
Appendix 5—figure 3.. Pairwise identity of OMLV viral proteins amongst identified OMLV viruses.

References

    1. Aaskov J, Buzacott K, Thu HM, Lowry K, Holmes EC. Long-term transmission of defective RNA viruses in humans and Aedes mosquitoes. Science. 2006;311:236–238. doi: 10.1126/science.1115030. - DOI - PubMed
    1. Aiewsakun P, Katzourakis A. Endogenous viruses: connecting recent and ancient viral evolution. Virology. 2015;479-480:26–37. doi: 10.1016/j.virol.2015.02.011. - DOI - PubMed
    1. Al-Wathiqui N, Fallon TR, South A, Weng JK, Lewis SM. Molecular characterization of firefly nuptial gifts: a multi-omics approach sheds light on postcopulatory sexual selection. Scientific Reports. 2016;6:38556. doi: 10.1038/srep38556. - DOI - PMC - PubMed
    1. Amaral DT, Silva JR, Viviani VR. Transcriptional comparison of the photogenic and non-photogenic tissues of Phrixothrix hirtus (Coleoptera: Phengodidae) and non-luminescent Chauliognathus flavipes (Coleoptera: Cantharidae) give insights on the origin of lanterns in railroad worms. Gene Reports. 2017;7:78–86. doi: 10.1016/j.genrep.2017.02.004. - DOI
    1. Anderson CR, Casals J. Dhori virus, a new agent isolated from Hyalomma dromedarii in India. The Indian Journal of Medical Research. 1973;61:1416–1420. - PubMed

Publication types