PLoS Genet. 2021 Mar 18;17(3):e1009389.
doi: 10.1371/journal.pgen.1009389. eCollection 2021 Mar.

Gene disruption by structural mutations drives selection in US rice breeding over the last century

Justin N Vaughn et al. PLoS Genet. 2021.

Abstract

The genetic basis of general plant vigor is of major interest to food producers, yet the trait is recalcitrant to genetic mapping because of the number of loci involved, their small effects, and linkage. Observations of heterosis in many crops suggest that recessive, malfunctioning versions of genes are a major cause of poor performance, yet we have little information on the mutational spectrum underlying these disruptions. To address this question, we generated a long-read assembly of a tropical japonica rice (Oryza sativa) variety, Carolina Gold, which allowed us to identify structural mutations (>50 bp) and orient them with respect to their ancestral state using the outgroup, Oryza glaberrima. Supporting prior work, we find substantial genome expansion in the sativa branch. While transposable elements (TEs) account for the largest share of size variation, the majority of events are not directly TE-mediated. Tandem duplications are the most common source of insertions and are highly enriched among 50-200 bp mutations. To explore the relative impact of various mutational classes on crop fitness, we then track these structural events over the last century of US rice improvement using 101 resequenced varieties. Within this material, a pattern of temporary hybridization between medium and long-grain varieties was followed by recent divergence. During this long-term selection, structural mutations that impact gene exons have been removed at a greater rate than intronic indels and single-nucleotide mutations. These results support the use of ab initio estimates of mutational burden, based on structural data, as an orthogonal predictor in genomic selection.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Insertions and deletions between temperate and tropical japonica references.
A) Schematic illustrating the insertion/deletion orientation method for indel characterization. Any mutation found only in O. glaberrima is ambiguous and ignored. B) Distribution of gap coverage values across all events analyzed. As indicated, insertions and deletions are defined as mutations with a gap coverage of >95% or <5%, respectively. C) Boxplot depicting the log-transformed size distribution of events, broken out by type and by the variety, CarGold (cg) or Nipponbare (nip), in which each event was derived. D) Scatterplot showing each insertion, sorted by length, and its impact on the cumulative length of all insertions. TEs contributing to rapid changes in total inserted sequence are indicated by the orange window.
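The thresholds in panel B can be read as a simple decision rule. The sketch below is illustrative only and is not the authors' code: the function name `classify_event`, the representation of gap coverage as a fraction between 0 and 1, and the "unclassified" label for intermediate values are all assumptions. How gap coverage itself is computed is not described in the caption and is not modeled here.

```python
def classify_event(gap_coverage, high=0.95, low=0.05):
    """Label a structural event from its gap coverage.

    gap_coverage is assumed to be a fraction in [0, 1] (hypothetical
    representation). Per the Fig 1B caption, values above 95% are called
    insertions and values below 5% are called deletions. The label for
    everything in between is an assumption of this sketch.
    """
    if not 0.0 <= gap_coverage <= 1.0:
        raise ValueError("gap_coverage must be a fraction between 0 and 1")
    if gap_coverage > high:
        return "insertion"
    if gap_coverage < low:
        return "deletion"
    return "unclassified"


if __name__ == "__main__":
    for value in (0.99, 0.50, 0.01):
        print(value, classify_event(value))
```

Both comparisons are strict, matching the ">95%" and "<5%" wording, so an event at exactly 95% or 5% stays unclassified in this sketch.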
Fig 2. Tandem bias in insertions versus deletions between Nipponbare and CarGold.
Indel size is plotted against the d metric for each insertion and deletion satisfying our alignment and inference criteria. Upper corner insets show only indels <125 bp. Lower corner insets give two examples of how d is calculated; alignments between ancestral and derived (indicated by red or green) sequences are depicted as simplified dotplots.
Fig 3. The genetic structure of US rice breeding varieties.
A) The population structure of sequenced US germplasm samples in this study. The dendrogram represents hierarchical clustering based on a kinship matrix derived from the low LD marker set (see Materials and Methods). Bar plots, based on the same markers, indicate the proportion of genetic content of each individual that can be assigned to a set of known rice subpopulations. Colored dots represent the general location/program from which material was derived. B) Heatmap of the centered identity-by-state (IBS) between each variety in the analysis and all other varieties, along both the column and row axes. Columns are clustered based on similarity. Rows are ordered explicitly by date of release. The red squares connect a variety’s clustering position with its release-date position. Seed type is indicated along the top (clustering) axis. “Short grain” is shown in maroon.
Fig 4. The rate of change in functional variant classes across a century of plant breeding.
Rate estimates are reported only for mutations that have occurred in CarGold, not Nipponbare (see Text). A) Cumulative frequency of rates for each variant class. The farther a class is shifted to the left of 0, the more rapidly it has declined on average. B) Plot and linear fit of release date versus the total count of alleles that disrupt an exon in each variety for each grain type. C) Relationship between change in haplotypes scored by number of exonic indels and the variance in this score (see Fig 5). The y-axis represents, for each LD block, the regression coefficient of a linear model between haplotype score and the release date of the variety containing that haplotype. The x-axis represents the variance for those haplotype scores. In B and C, the shading around the regression line represents the 95% confidence intervals in combined intercept and slope estimates.
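The two axes of panel C can be illustrated with a small calculation. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `block_trend` is hypothetical, the haplotype score is taken as the response and release date as the predictor (the caption does not state the direction), ordinary least squares is assumed for the linear model, and the sample variance is assumed for the x-axis.

```python
from statistics import mean, variance


def block_trend(release_years, haplotype_scores):
    """Return (slope, score_variance) for one LD block.

    release_years: release year of each variety (assumed predictor).
    haplotype_scores: number of exonic indels on the haplotype that each
        variety carries in this block (assumed response).
    The slope is the ordinary least-squares coefficient of score on year;
    the variance is the sample variance of the scores.
    """
    if len(release_years) != len(haplotype_scores):
        raise ValueError("inputs must have the same length")
    if len(release_years) < 2:
        raise ValueError("at least two varieties are required")
    x_bar = mean(release_years)
    y_bar = mean(haplotype_scores)
    sxx = sum((x - x_bar) ** 2 for x in release_years)
    if sxx == 0:
        raise ValueError("release years must not all be identical")
    sxy = sum(
        (x - x_bar) * (y - y_bar)
        for x, y in zip(release_years, haplotype_scores)
    )
    return sxy / sxx, variance(haplotype_scores)


if __name__ == "__main__":
    # Made-up toy values, purely to show the call shape.
    slope, var = block_trend([1900, 1950, 2000], [4, 2, 0])
    print(slope, var)
```

A negative slope for a block means that haplotypes carrying more exonic indels became less common in later releases, which is the direction of change the caption describes as decline.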
Fig 5. The select-ome of US rice breeding.
Chromosome 1 is shown as a representative example. The chromosome is divided into LD blocks, and haplotypes within those blocks are colored by frequency, with red marking the most frequent, across the three time periods defined on the far left. The semi-dwarf locus, a known target of selection, is labeled. Sparse regions represent very small LD blocks. All chromosomes are plotted in S8 Fig. See https://gbru-ars.shinyapps.io/HaploStrata/ for fully interactive plots.
