Proc Natl Acad Sci U S A. 2019 Mar 5;116(10):4416-4425.
doi: 10.1073/pnas.1810031116. Epub 2019 Feb 20.

Lateral transfers of large DNA fragments spread functional genes among grasses

Luke T Dunning et al. Proc Natl Acad Sci U S A. 2019.

Abstract

A fundamental tenet of multicellular eukaryotic evolution is that vertical inheritance is paramount, with natural selection acting on genetic variants transferred from parents to offspring. This lineal process means that an organism's adaptive potential can be restricted by its evolutionary history, the amount of standing genetic variation, and its mutation rate. Lateral gene transfer (LGT) theoretically provides a mechanism to bypass many of these limitations, but the evolutionary importance and frequency of this process in multicellular eukaryotes, such as plants, remain debated. We address this issue by assembling a chromosome-level genome for the grass Alloteropsis semialata, a species surmised to exhibit two LGTs, and screen it for other grass-to-grass LGTs using genomic data from 146 other grass species. Through stringent phylogenomic analyses, we discovered 57 additional LGTs in the A. semialata nuclear genome, involving at least nine different donor species. The LGTs are clustered in 23 laterally acquired genomic fragments that are up to 170 kb long and have accumulated during the diversification of Alloteropsis. The majority of the 59 LGTs in A. semialata are expressed, and we show that they have added functions to the recipient genome. Functional LGTs were further detected in the genomes of five other grass species, demonstrating that this process is likely widespread in this globally important group of plants. LGT therefore appears to represent a potent evolutionary force capable of spreading functional genes among distantly related grass species.

Keywords: Poaceae; adaptation; genome; horizontal gene transfer; phylogenetics.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Overview of the analytical pipeline. (A) For each of the steps used to identify lateral gene transfers in the reference genome of A. semialata, the number of candidates retained/discarded is indicated. (B) The purpose of each set of analyses conducted on the unambiguous LGTs is indicated.
Fig. 2.
Multigene coalescence species tree. The relationships are based on 200 single-copy genes extracted from complete genomes (bold species names) and transcriptomes of 37 grass species. The pie charts show the proportion of quartets supporting the species tree topology (blue) and the two alternative topologies (red and orange, respectively). Posterior probabilities supporting values ≥0.50 are shown near nodes, and branch lengths are in coalescent units, with null terminal branches and dashed lines connecting to species names. The main clades of grasses are delimited on the Right. The position of the A. semialata reference genome is highlighted in yellow, the groups of donors in red, and the clade that contains Themeda is indicated. Red arrows represent the transfers of fragments from each identified donor into A. semialata, with those reported before (32) indicated with blue dashes. And., Andropogoneae; Cench., Cenchrinae; Chlor., Chloridoideae; Melin., Melinidinae; Panici., Panicinae; and Pasp., Paspaleae.
Fig. 3.
Phylogenetic evidence for LGT in the reference A. semialata genome. The gene ASEM_AUS1_12633 from A. semialata was laterally acquired from an Australian T. triandra. The maximum likelihood phylogeny inferred from third positions of codons (Dataset S3) is shown, with the LGT in A. semialata in red and the native orthologs in blue. The region of the phylogeny containing the LGT [red dashed rectangle (Upper)] is expanded (Lower). Bootstrap support values ≥75% are shown (Lower) or denoted as asterisks (Upper), and the main taxonomic groups are delimited on the Right as in Fig. 2. Chlor., Chloridoideae.
Fig. 4.
Phylogenetic and genomic distribution of LGTs in Alloteropsis. (A) The distribution of the 59 laterally acquired genes (primary and secondary) across the coalescence phylogenetic tree of Alloteropsis (extracted from SI Appendix, Fig. S4) is shown. Each tip represents a different population, with the reference genome denoted with a star, and the Philippines population indicated with a boldface P on the Right. Boxes on the phylogeny delimit geographic origins within A. semialata. The LGTs are organized into the 23 acquired genomic fragments, which are labeled at the Bottom, and the primary candidates within each fragment are indicated by an asterisk (†, gene duplicate from fragment C). The fragments are sorted by the approximate order of their acquisition, with those spread most widely across the A. semialata phylogeny on the Left. Within each fragment, genes are ordered based on their position in the A. semialata reference genome. For populations with no corresponding expression data, only presence is shown (light gray). For populations with matched expression data, LGTs present but not expressed are shown in dark gray; those expressed at ≥1 rpkm are shown in yellow; and those expressed at a higher level than the native ortholog are shown in orange (see Dataset S1 for details). Gene presence was inferred based on the number of reads mapping to the coding region in the reference genome (see Dataset S1 for details). The positions of (B) native orthologs and (C) the LGTs detected in the genome of A. semialata (in white) are compared with that of orthologs from S. italica (gray). The nine chromosomes are numbered (1–9) and oriented based on their synteny. Genes in syntenic positions are connected by black lines, while those in distinct genomic locations are connected by red lines (see SI Appendix, Fig. S2 for details of synteny analyses). (D) The donors of the 23 LGT fragments, shown with colors and letters as in A and C, are listed with the main group and subgroup, as in Fig. 2 and in SI Appendix, Fig. S4.
Fig. 5.
Phylogenetic tree of Panicoideae genes encoding phosphoenolpyruvate carboxykinase (PCK). The maximum likelihood phylogeny was inferred from gene regions extending from exon 2 to exon 10, and including introns. The three LGTs are highlighted in red, and their corresponding native copies are highlighted in blue. Sequences were either extracted from complete genomes (or transcriptome for Cymbopogon flexuosus) or retrieved from GenBank (accession nos. shown). Bootstrap supports ≥50% are indicated near nodes and the A. semialata reference genome sequences are shown in bold. Branch lengths are given in expected substitutions per site. The main groups are delimited on the Right, as in Fig. 2.
Fig. 6.
Genomic context of LGTs in the A. semialata reference genome. For four LGT fragments, a 0.5-Mb genomic region is shown with high-confidence protein-coding genes indicated at the Top of each panel, in red for primary LGT candidates, orange for secondary LGT candidates, and green for native genes (see Fig. 1 for definitions of primary and secondary LGT candidates). For each fragment, the mapping coverage is shown for the closest relative to the donor in the dataset (in blue), the reference genome AUS1, and three conspecifics or congeners with the three-letter prefix for the A. semialata identifiers based on their country of origin. The coverage is shown on a logarithmic scale, with the dotted red lines indicating the coverage expected for single-copy DNA and the gray lines for the coverage expected for a five-copy DNA segment. Black/blue bars represent mapping quality ≥20, while gray bars have mapping quality <20 and include reads that map in multiple locations, indicative of repeats. Valid read alignments have a nucleotide identity of ≥90%. All read lengths are 250 bp except for Iseilema membranaceum (151 bp), Miscanthus sinensis (100 bp), and S. italica (95 bp). The size of the laterally acquired region is indicated by a black bar at the Top for fragments A and C, but its delimitation is ambiguous for fragments E and N. See Dataset S4 for details.

References

    1. Barrett RD, Schluter D. Adaptation from standing genetic variation. Trends Ecol Evol. 2008;23:38–44.
    2. Blount ZD, Barrick JE, Davidson CJ, Lenski RE. Genomic analysis of a key innovation in an experimental Escherichia coli population. Nature. 2012;489:513–518.
    3. Keeling PJ, Palmer JD. Horizontal gene transfer in eukaryotic evolution. Nat Rev Genet. 2008;9:605–618.
    4. Boothby TC, et al. Evidence for extensive horizontal gene transfer from the draft genome of a tardigrade. Proc Natl Acad Sci USA. 2015;112:15976–15981.
    5. Crisp A, Boschetti C, Perry M, Tunnacliffe A, Micklem G. Expression of multiple horizontally acquired genes is a hallmark of both vertebrate and invertebrate genomes. Genome Biol. 2015;16:50.
