Nat Commun. 2020 Jun 12;11(1):2990.
doi: 10.1038/s41467-020-16736-1.

Functional annotation of rare structural variation in the human brain

Lide Han et al. Nat Commun. 2020.

Abstract

Structural variants (SVs) contribute to many disorders, yet functionally annotating them remains a major challenge. Here, we integrate SVs with RNA-sequencing data from human post-mortem brains to quantify their dosage and regulatory effects. We show that genic and regulatory SVs exist at significantly lower frequencies than intergenic SVs. The functional impact of copy number variants (CNVs) stems from both the proportion of genic and regulatory content altered and the loss-of-function intolerance of the gene. We train a linear model to predict the expression effects of rare CNVs and use it to annotate the regulatory disruption of CNVs from 14,891 independent genome-sequenced individuals. Pathogenic deletions implicated in neurodevelopmental disorders show significantly more extreme regulatory disruption scores and, if rank ordered by these scores, would be prioritized higher than when using frequency or length alone. This work shows the deleteriousness of regulatory SVs, particularly those altering CTCF sites, and provides a simple approach for functionally annotating the regulatory consequences of CNVs.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Details of CMC SV dataset.
Characterization of the high-confidence rare (<0.5%) SV dataset stratified by (a) type of SV, (b) allele frequency, and (c) length (log10-scaled), colored by type of SV. SV types include Alu (Alu), complex (CPX), translocation (CTX), deletion (DEL), duplication (DUP), insertion (INS), inversion (INV), long interspersed nuclear element-1 (LINE1), and SINE-VNTR-Alu (SVA), a composite of short interspersed nuclear element, variable number tandem repeat, and Alu sequences.
Fig. 2. Genic and regulatory SVs occur at significantly lower frequencies.
Proportion of variants seen only once in the sample, with bootstrapped 95% confidence intervals, stratified by annotation overlap: any annotation, allowing multiple annotations per variant (CMC); only a single annotation (CMC unique); and any annotation in gnomAD SV.
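The bootstrapped interval in this caption can be illustrated with a minimal sketch. It assumes a percentile bootstrap over variants within one annotation class; the caption does not state which bootstrap variant the authors used, and the function names, resample count, and allele counts below are hypothetical.

```python
import random

def singleton_proportion(allele_counts):
    """Fraction of variants observed exactly once in the sample."""
    return sum(1 for c in allele_counts if c == 1) / len(allele_counts)

def bootstrap_ci(allele_counts, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the singleton proportion (assumed method)."""
    rng = random.Random(seed)
    n = len(allele_counts)
    stats = sorted(
        singleton_proportion([allele_counts[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical allele counts for the variants in one annotation class
counts = [1, 1, 2, 1, 3, 1, 1, 5, 1, 2]
point = singleton_proportion(counts)
low, high = bootstrap_ci(counts)
```

Running the same procedure per annotation class would give one point estimate and interval per stratum, which is the kind of comparison the figure presents.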
Fig. 3. Genic SVs induce observable changes in expression.
Expression presented as a z-score for (a) all CNVs that overlap any proportion of the exonic sequence of a gene, (b) CNVs that delete or duplicate 100% of the exonic sequence of a gene, and (c) all inversions with any gene overlap (green) compared to all other SVs (gray). Deletions are red and duplications are blue. The dashed lines mark z-scores of 2 and −2.
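As a rough illustration of the z-score scale used here, the sketch below standardizes one gene's expression across samples and flags values beyond the ±2 lines. The sample names, expression values, and the choice of the sample standard deviation are assumptions for illustration, not details taken from the paper.

```python
from statistics import mean, stdev

def z_scores(expression):
    """Standardize one gene's expression values across samples."""
    mu = mean(expression.values())
    sd = stdev(expression.values())  # sample standard deviation (assumed)
    return {sample: (value - mu) / sd for sample, value in expression.items()}

# Hypothetical normalized expression of one gene; "carrier" has a full-gene deletion
expression = {
    "s1": 10.1, "s2": 9.9, "s3": 10.0, "s4": 10.2, "s5": 9.8,
    "s6": 10.0, "s7": 10.1, "s8": 9.9, "s9": 10.0, "carrier": 7.0,
}
z = z_scores(expression)

# Samples falling outside the dashed lines at z = 2 and z = -2
outliers = [sample for sample, score in z.items() if abs(score) > 2]
```

In this toy data only the deletion carrier falls below −2, mirroring the direction of effect the caption describes for deletions.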
Fig. 4. Genes intolerant to variation are less likely to be affected by genic or regulatory SVs.
Each plot stratifies genes by either the LoF intolerance metric or the CNV intolerance metric, split into quintiles (20% bins) ordered left to right from least to most intolerant genes, and by deletion (red) and duplication (blue). The plots show the effect of this stratification on (a) the proportion of the exonic sequence that is affected, showing mean and standard deviation, (b) the deviation from the expected 20% of CNVs that alter exonic sequence, (c) the deviation from expected for noncoding CNVs that alter promoters, and (d) the deviation from expected for noncoding CNVs that alter enhancers.
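The quintile logic can be sketched as follows: if CNVs hit genes without regard to intolerance, each 20% bin should receive about 20% of the affected genes, so the quantity of interest is the observed share minus 0.20. The gene names and scores are hypothetical, and the sketch assumes a score where higher means more intolerant.

```python
def quintile_bins(scores):
    """Assign each gene to a bin 0..4 by rank (0 = least intolerant)."""
    ranked = sorted(scores, key=scores.get)
    n = len(ranked)
    return {gene: min(4, i * 5 // n) for i, gene in enumerate(ranked)}

def deviation_from_expected(bins, affected_genes):
    """Observed share of affected genes per quintile minus the expected 20%."""
    counts = [0] * 5
    for gene in affected_genes:
        counts[bins[gene]] += 1
    total = sum(counts)
    return [c / total - 0.20 for c in counts]

# Hypothetical intolerance scores (higher = more intolerant, an assumption)
scores = {f"g{i}": i / 10 for i in range(10)}
bins = quintile_bins(scores)

# Hypothetical genes whose exons are altered by a deletion
affected = ["g0", "g1", "g2", "g4", "g9"]
deviation = deviation_from_expected(bins, affected)
```

A positive entry means that quintile is hit more often than expected and a negative entry means it is depleted; by construction the five deviations sum to zero.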
Fig. 5. Transcriptional consequences of rare CNVs can be significantly predicted.
SV expression prediction performance and associated R² from building the same linear model using different training and test datasets: (a) CMC into CMC_HBCC, (b) CMC_HBCC into CMC, (c) CMC into CMC, and (d) CMC_HBCC into CMC_HBCC. The best-fit line with confidence interval was produced using generalized additive model smoothing.
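The cross-cohort evaluation can be illustrated with a deliberately simplified sketch: fit a line on one cohort, predict in the other, and report R². Several things here are assumptions rather than the authors' method: the model uses a single hypothetical feature (proportion of exonic sequence deleted) where the paper's model combines more annotations, "A into B" is read as training on A and testing on B, R² is computed as 1 minus the ratio of residual to total sum of squares, and all values are made up.

```python
def fit_line(x, y):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical training cohort: exonic proportion deleted vs. expression z-score
train_x = [0.0, 0.25, 0.5, 0.75, 1.0]
train_y = [0.1, -0.6, -1.1, -1.4, -2.0]

# Hypothetical independent test cohort
test_x = [0.1, 0.4, 0.9]
test_y = [-0.2, -0.8, -1.9]

slope, intercept = fit_line(train_x, train_y)
predicted = [slope * x + intercept for x in test_x]
r2 = r_squared(test_y, predicted)
```

Swapping the roles of the two cohorts, or training and testing within one cohort, would correspond to the other panels under the same reading.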
Fig. 6. Regulatory disruption scores prioritize pathogenic CNVs better than standard annotations.
Number of pathogenic variants, defined by 50% overlap with a known pathogenic variant in ClinGen (84 deletions and 84 duplications), identified by rank ordering (a) deletions and (b) duplications by length (yellow), number of genes deleted (green), number of intolerant genes deleted (purple), allele frequency (red), and regulatory disruption (blue). Where multiple variants had the same value, the order was random.
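The ranking comparison can be sketched in two steps: label a variant as pathogenic when it overlaps a known pathogenic interval by at least 50%, then rank variants by a score with ties broken at random and count the labeled variants near the top. The caption does not say whether the 50% overlap is reciprocal; this sketch assumes it is. Coordinates, scores, and names are hypothetical, and all intervals are taken to lie on the same chromosome.

```python
import random

def reciprocal_overlap(a, b):
    """Smaller of the two covered fractions for half-open intervals (start, end)."""
    shared = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    return min(shared / (a[1] - a[0]), shared / (b[1] - b[0]))

def rank_with_random_ties(variants, key, seed=0):
    """Sort descending by key; shuffling first makes tie order random."""
    shuffled = list(variants)
    random.Random(seed).shuffle(shuffled)
    # sorted() is stable, so tied variants keep their shuffled order
    return sorted(shuffled, key=key, reverse=True)

# Hypothetical known pathogenic interval and candidate deletions
known = [(100, 200)]
variants = [
    {"name": "A", "span": (110, 210), "score": 0.9},
    {"name": "B", "span": (150, 400), "score": 0.5},
    {"name": "C", "span": (0, 1000), "score": 0.5},
]
for v in variants:
    v["pathogenic"] = any(reciprocal_overlap(v["span"], k) >= 0.5 for k in known)

ranked = rank_with_random_ties(variants, key=lambda v: v["score"])
top_k = 1
hits = sum(v["pathogenic"] for v in ranked[:top_k])
```

Repeating the count with a different `key` (length, allele frequency, or a regulatory disruption score) would give the per-annotation curves that the figure compares.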
