Utilizing ExAC to assess the hidden contribution of variants of unknown significance to Sanfilippo Type B incidence
- PMID: 29979746
- PMCID: PMC6034809
- DOI: 10.1371/journal.pone.0200008
Abstract
Given the large and expanding quantity of publicly available sequencing data, it should be possible to extract incidence information for monogenic diseases from allele frequencies, provided one knows which mutations are causal. We tested this idea on a rare, monogenic, lysosomal storage disorder, Sanfilippo Type B (Mucopolysaccharidosis type IIIB). Sanfilippo Type B is caused by mutations in the gene encoding α-N-acetylglucosaminidase (NAGLU). The ExAC dataset, which comprises roughly 60,000 individual exomes, contains 189 NAGLU missense variants. Only 24 of these 189 variants were known to be pathogenic; the remaining 165 were variants of unknown significance (VUS), whose potential contribution to disease had not been characterized. To address this problem, we measured the enzymatic activities of 164 NAGLU missense VUS in the ExAC dataset and developed a statistical framework for estimating disease incidence with associated confidence intervals. We found that 25% of VUS decreased the activity of NAGLU to levels consistent with Sanfilippo Type B pathogenic alleles. A substantial fraction of Sanfilippo Type B incidence (67%) could be accounted for by novel mutations not previously identified in patients, illustrating the utility of combining functional activity data for VUS with population-wide allele frequency data in estimating disease incidence.
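The general idea of deriving incidence from allele frequencies can be sketched as follows. This is a minimal illustration, not the statistical framework developed in the paper: it assumes Hardy-Weinberg equilibrium and autosomal recessive inheritance, so that incidence is approximately the square of the summed pathogenic allele frequency. The function name `recessive_incidence`, the allele counts, and the single shared allele total are all hypothetical; real per-site coverage varies, and no confidence interval is computed here.

```python
def recessive_incidence(allele_counts, total_alleles):
    """Approximate incidence of a recessive disease under Hardy-Weinberg
    equilibrium: the square of the combined pathogenic allele frequency."""
    q = sum(allele_counts) / total_alleles
    return q * q


# Hypothetical allele counts (NOT data from the paper).
known_pathogenic = [3, 1, 2]   # alleles already reported in patients
damaging_vus = [5, 4, 3]       # VUS with patient-like loss of activity
total_alleles = 120_000        # assumes ~60,000 exomes x 2 alleles each

incidence_known = recessive_incidence(known_pathogenic, total_alleles)
incidence_all = recessive_incidence(known_pathogenic + damaging_vus, total_alleles)

# Share of the estimated incidence that is missed when only
# previously known pathogenic alleles are counted.
fraction_from_new = 1 - incidence_known / incidence_all

print(f"incidence (known alleles only): {incidence_known:.3e}")
print(f"incidence (known + damaging VUS): {incidence_all:.3e}")
print(f"fraction attributable to newly classified alleles: {fraction_from_new:.2f}")
```

Because incidence scales with the square of the summed allele frequency, adding functionally damaging VUS to the pathogenic set can raise the estimate far more than their raw allele counts alone would suggest.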
Conflict of interest statement
All authors are full-time employees of BioMarin Pharmaceutical Inc. This does not alter our adherence to PLOS ONE policies on sharing data and materials.