Nat Commun. 2020 Jul 7;11(1):3403.
doi: 10.1038/s41467-020-17195-4.

Discovery and population genomics of structural variation in a songbird genus

Matthias H Weissensteiner et al. Nat Commun. 2020.

Abstract

Structural variation (SV) constitutes an important type of genetic mutation, providing raw material for evolution. Here, we uncover the genome-wide spectrum of intra- and interspecific SV segregating in natural populations of seven songbird species in the genus Corvus. Combining short-read (N = 127) and long-read re-sequencing (N = 31), as well as optical mapping (N = 16), we apply both assembly-based and read mapping-based approaches to detect SV and characterize a total of 220,452 insertions, deletions and inversions. We exploit sampling across wide phylogenetic timescales to validate SV genotypes and assess the contribution of SV to evolutionary processes in an avian model of incipient speciation. We reveal an evolutionarily young (~530,000 years) cis-acting 2.25-kb LTR retrotransposon insertion that reduces expression of the NDP gene, with consequences for premating isolation. Our results attest to the wealth and evolutionary significance of SV segregating in natural populations and highlight the need for reliable SV genotyping.

Conflict of interest statement

K.-J.F. is an employee of BioNano Genomics (San Diego, CA). All other authors declare no competing interests.

Figures

Fig. 1. Sampling setup and assembly-based structural and single-nucleotide variation.
a Phylogeny of sampled species in the genus Corvus (after, see ref. ). Numbers in columns give the number of individuals used for short-read sequencing (SR), long-read sequencing (LR), and optical mapping (OM). b Density histogram showing the abundance of genetic variation within single individuals. Counts of variants per 1 Mb window are based on comparing the two haplotypes of each assembly. The upper panel shows structural variation (SV) densities, the lower panel shows densities for SNPs. Crow drawings in 1A by Alice Meröndun, published under a CC BY-SA 4.0 license (https://creativecommons.org/licenses/by-sa/4.0/deed.en). No changes were made to the original drawings. Source data are provided as a Source Data file.
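The per-window counting described for panel b can be illustrated with a minimal sketch. It assumes variants are available as (chromosome, position) pairs with 0-based positions; the function name `window_counts` and the toy positions are hypothetical and are not taken from the study.

```python
from collections import Counter

WINDOW_SIZE = 1_000_000  # 1 Mb windows, as stated in the caption


def window_counts(variants, window_size=WINDOW_SIZE):
    """Count variants per (chromosome, window index).

    `variants` is an iterable of (chromosome, 0-based position) pairs.
    Windows are non-overlapping; window k covers positions
    [k * window_size, (k + 1) * window_size).
    """
    counts = Counter()
    for chrom, pos in variants:
        counts[(chrom, pos // window_size)] += 1
    return counts


# Hypothetical toy positions, not data from the study.
toy_variants = [
    ("chr1", 10),
    ("chr1", 999_999),
    ("chr1", 1_000_000),
    ("chr2", 2_500_000),
]
counts = window_counts(toy_variants)
for (chrom, window), n in sorted(counts.items()):
    print(chrom, window, n)
```

A histogram of the resulting counts gives the kind of density distribution the panel describes; windows without any variant do not appear in the counter and would need to be added as zeros from the chromosome lengths.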
Fig. 2. Phylogenetic filtering of read mapping-based structural variants.
a Example genotype plots of LR-based variants according to phylogenetically informed filtering. Given the large divergence time of 13 million years between the crow and jackdaw lineages, the proportion of polymorphisms shared by descent is negligible; variants polymorphic in both clades therefore likely constitute false positives or hypermutable sites (left). Variants segregating exclusively in the jackdaw or crow clade (middle and right), however, comply with the infinite sites model and were retained accordingly. Plotted are genotypes of one representative chromosome (chromosome 18), with genotypes of variants in different colors, where each row corresponds to one individual (N = 8 individuals jackdaw clade and N = 24 individuals crow clade). Note that, due to the tolerance of a certain number of misgenotyped variants per clade, some variants are present in both clades. b Excluded versus retained variants in relation to SV class and chromosomal distribution. Excluded variants are enriched for deletions (LMM, p < 10^-16) and c are most abundant at chromosome ends, coinciding with d, an increased repeat density. Source data are provided as a Source Data file.
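The logic of the phylogenetic filter can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: genotypes are coded as the number of non-reference alleles (0, 1, 2; `None` for missing), a clade counts as polymorphic only if more than `tolerance` individuals carry each allele, and the tolerance value of 1 is hypothetical because the caption does not state the number used.

```python
def is_polymorphic(genotypes, tolerance=1):
    """Return True if both alleles are seen in more than `tolerance` individuals.

    Genotypes are counts of the non-reference allele (0, 1 or 2);
    None marks a missing call and is ignored. The tolerance absorbs
    a small number of presumed genotyping errors (assumed value).
    """
    called = [g for g in genotypes if g is not None]
    n_with_alt = sum(1 for g in called if g > 0)
    n_with_ref = sum(1 for g in called if g < 2)
    return n_with_alt > tolerance and n_with_ref > tolerance


def retain_variant(jackdaw_gts, crow_gts, tolerance=1):
    """Keep a variant unless it appears polymorphic in both clades."""
    both = is_polymorphic(jackdaw_gts, tolerance) and is_polymorphic(crow_gts, tolerance)
    return not both


# Hypothetical toy genotypes, not data from the study.
shared = retain_variant([0, 1, 2, 1], [0, 0, 1, 1, 2])        # polymorphic in both clades
crow_only = retain_variant([0, 0, 0, 0], [0, 1, 1, 2, 0])     # segregates only in crows
one_miscall = retain_variant([0, 0, 0, 1], [0, 1, 1, 2, 0])   # single jackdaw carrier tolerated
print(shared, crow_only, one_miscall)
```

The third toy case shows why some retained variants can still appear in both clades: a single carrier in the other clade falls within the tolerance and does not trigger exclusion.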
Fig. 3. Characterization and allele frequencies of SV.
a Length distributions of deletions and insertions shorter than 10 kb identified with LR (upper panel) and OM (lower panel) data. Pronounced peaks at 0.9 and 2.4 kb in the LR variants and at 2.4 and 6.5 kb in the OM variants likely stem from an overrepresentation of specific repeats. Indeed, the five most common repeats found in insertions and deletions include LTR retrotransposons with consensus sequence lengths of 670, 1315, and 2072 bp. b Content of insertion and deletion sequences. About half of all variants were assigned to a known repeat family, of which transposable elements from the LTR retrotransposon subclass were most common, followed by simple repeats (including microsatellites) and low complexity repeats. c Folded allele frequency spectra of structural variants. Upper and lower panels correspond to the jackdaw and crow clade, respectively. The five left panels depict the minor allele frequencies of insertions and deletions, and the rightmost panel that of inversions. Source data are provided as a Source Data file.
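A folded allele frequency spectrum, as in panel c, tallies variants by the count of their rarer allele, so no ancestral state is needed. The sketch below assumes diploid genotypes coded as the number of non-reference alleles (0, 1, 2; `None` for missing); the function names and toy matrix are hypothetical.

```python
from collections import Counter


def minor_allele_count(genotypes):
    """Return the count of the rarer allele among called diploid genotypes."""
    called = [g for g in genotypes if g is not None]
    alt = sum(called)            # copies of the non-reference allele
    total = 2 * len(called)      # two allele copies per diploid individual
    return min(alt, total - alt)


def folded_sfs(genotype_matrix):
    """Map each minor allele count to the number of variants showing it.

    Each row of `genotype_matrix` holds the genotypes of one variant.
    """
    return Counter(minor_allele_count(row) for row in genotype_matrix)


# Hypothetical toy matrix: four variants genotyped in four individuals.
toy_matrix = [
    [0, 1, 0, 0],  # 1 of 8 alleles non-reference -> minor count 1
    [2, 2, 1, 2],  # 7 of 8 alleles non-reference -> minor count 1
    [1, 1, 1, 1],  # 4 of 8 alleles non-reference -> minor count 4
    [0, 0, 0, 0],  # monomorphic -> minor count 0
]
sfs = folded_sfs(toy_matrix)
print(sorted(sfs.items()))
```

Dividing each count by the number of called allele copies gives minor allele frequencies; with missing data the denominators differ between variants, which a real analysis would have to handle, for example by down-projection.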
Fig. 4. SV-based population structure and LTR retrotransposon insertion upstream of the NDP gene.
a Principal component analysis based on SV genotypes. On the left, all individuals were analyzed together. Individuals of the crow clade are tightly clustered and separate from the jackdaw clade along PC1; individuals of the two jackdaw species separate along PC2. The middle panel displays the results of PCA conducted exclusively for individuals from the crow clade, clearly separating American crows from the Eurasian species. On the right, only the European populations of the crow clade are included, showing marked separation of the Spanish carrion crow population. b A 2.25-kb LTR retrotransposon insertion into the crow lineage (black bar: ancestral state, gray bar: derived, reference allele) belongs to the endogenous retrovirus K (ERVK) subfamily corCorLTRK1b and is located 20 kb upstream of the NDP gene. A highly conserved non-coding region (pink arrow) is present in close proximity (2.8 kb) to the insertion in the 3′ flanking sequence. This region, which is conserved between chicken, human, and crow (Supplementary Fig. 8), is likely a regulatory element which may be affected by the nearby LTR retrotransposon insertion. Located in the 5′ region of the insertion is a region that is copy number variable in pigeons and associated with plumage pattern variation. c Genotypes of the LTR insertion in short-read (SR) and long-read (LR) data. In both datasets, the LTR element insertion (blue) is fixed in all hooded crow populations. Species and populations with a black plumage are either polymorphic (light green) or fixed for the ancestral state, non-insertion (green). d Gene expression of NDP in body skin. Normalized gene counts of 18 individuals are significantly associated with the insertion genotypes (LMM, p = 0.002); boxplot center lines show medians and whiskers extend to 1.5 times the interquartile range (n = 19). Source data are provided as a Source Data file.
