Genome Res. 2019 May;29(5):870-880.
doi: 10.1101/gr.241240.118. Epub 2019 Apr 16.

Structural variants in 3000 rice genomes

Roven Rommel Fuentes et al. Genome Res. 2019 May.

Abstract

Investigation of large structural variants (SVs) is a challenging yet important task in understanding trait differences in highly repetitive genomes. Combining different bioinformatic approaches for SV detection, we analyzed whole-genome sequencing data from 3000 rice genomes and identified 63 million individual SV calls that grouped into 1.5 million allelic variants. We found enrichment of long SVs in promoters and an excess of shorter variants in 5' UTRs. Across the rice genomes, we identified regions of high SV frequency enriched in stress response genes. We demonstrated how SVs may help in finding causative variants in genome-wide association analysis. These new insights into rice genome biology are valuable for understanding the effects SVs have on gene function, with the prospect of identifying novel agronomically important alleles that can be utilized to improve cultivated rice.
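The abstract does not state how the 63 million individual calls were grouped into 1.5 million allelic variants. As a minimal sketch of one common way to do this kind of grouping, the example below merges calls of the same type on the same chromosome when they share at least 50% reciprocal overlap. The `SVCall` record, the greedy strategy, the 50% threshold, and the toy coordinates are all assumptions made for illustration, not the authors' method or data.

```python
from collections import namedtuple

# Hypothetical record for one SV call in one sample (not the paper's data model).
SVCall = namedtuple("SVCall", ["sample", "chrom", "start", "end", "svtype"])


def reciprocal_overlap(a, b):
    """Shared span divided by the longer of the two intervals.

    This equals the smaller of the two per-interval overlap fractions.
    """
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return shared / max(a.end - a.start, b.end - b.start)


def cluster_calls(calls, min_overlap=0.5):
    """Greedily group calls into clusters (candidate allelic variants).

    A call joins the first cluster whose representative (its first member)
    has the same chromosome and SV type and meets the overlap threshold.
    The 0.5 default is an assumed value, not taken from the paper.
    """
    clusters = []
    for call in sorted(calls, key=lambda c: (c.chrom, c.svtype, c.start, c.end)):
        for cluster in clusters:
            rep = cluster[0]
            if (rep.chrom == call.chrom and rep.svtype == call.svtype
                    and reciprocal_overlap(rep, call) >= min_overlap):
                cluster.append(call)
                break
        else:
            clusters.append([call])
    return clusters


# Toy calls with made-up coordinates.
calls = [
    SVCall("s1", "chr1", 1000, 2000, "DEL"),
    SVCall("s2", "chr1", 1010, 1990, "DEL"),
    SVCall("s3", "chr1", 1000, 5000, "DEL"),
    SVCall("s4", "chr2", 1000, 2000, "DEL"),
    SVCall("s5", "chr1", 1000, 2000, "DUP"),
]
clusters = cluster_calls(calls)
for cluster in clusters:
    rep = cluster[0]
    samples = sorted({c.sample for c in cluster})
    print(rep.chrom, rep.svtype, rep.start, rep.end, "samples:", samples)
```

In this sketch the number of distinct samples in a cluster plays the role of the cluster frequency. The greedy pass is quadratic in the worst case, so a real analysis at this scale would need an indexed or sweep-based approach.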

Figures

Figure 1.
Distribution and classification of SVs. (A) Frequency of observations per SV cluster. Only 562 high-coverage samples were used for insertion detection. (B) Distribution of variant sizes by SV type. (C) Classification of variants in each peak (cluster frequency > 10 samples). (D) Frequencies of events with 98% sequence identity to known or potentially active TEs in rice.
Figure 2.
Structure analysis based on selected CNVs and assuming K = [2, …, 9] subpopulations.
Figure 3.
SVs in genome features. (A) Enrichment/depletion of deletions (green) and insertions (orange) in various genomic regions. As expected, genic regions have fewer SVs than intergenic ones, with CDSs and exons being the most conserved regions. (B) Distribution of deletion and insertion clusters near the transcription start site (TSS). Although the total number of SNPs is much larger than the number of SV clusters, SVs affect more positions. The bump at about −366 bp, just upstream of the core promoter, is explained by longer SVs associated with transposons. (C) Distribution of the number of deletions in the vicinity of the start and end of transcription and translation (Supplemental Fig. S16). (D) P-values of the independence tests between predicted TFBS and deletions. Strong anti-correlation is observed at the TSS and ∼100 bp upstream. The distribution of P-values shows that in the core promoter area ([TSS-200, TSS]), deletions and TFBS are not independent.
Figure 4.
Deleted genes in variety groups. (A) Percentage of deleted genes in each variety group. (B) Number of deleted genes (frequency ≥ 5) that are unique or shared between variety groups. Note that the lower number of deleted genes in Japonica can be explained by the bias introduced by using the Nipponbare genome as the reference.
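The caption of Figure 3D reports P-values from independence tests between predicted TFBS and deletions but does not name the test. As a hedged illustration only, the sketch below runs a Pearson chi-square test on a 2x2 table of positions (inside or outside a predicted TFBS, covered or not covered by a deletion). The choice of test, the function name, and the counts are assumptions for illustration and are not taken from the paper.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square test of independence for a 2x2 table.

    table = [[a, b], [c, d]] with rows = in TFBS / not in TFBS and
    columns = deleted / not deleted. Assumes every row and column
    total is nonzero. Returns (chi2, p_value) for 1 degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up counts: deletions are rarer inside predicted TFBS positions.
observed = [[10, 90],
            [40, 60]]
chi2, p_value = chi_square_independence(observed)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}")
```

A small P-value here only says that the two annotations are not independent; the direction of the association (depletion of deletions inside TFBS in this toy table) has to be read from the counts themselves.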
