Genome reference and sequence variation in the large repetitive central exon of human MUC5AC
- PMID: 24010879
- PMCID: PMC3930937
- DOI: 10.1165/rcmb.2013-0235OC
Abstract
Despite modern sequencing efforts, the difficulty of assembling highly repetitive sequences has prevented resolution of human genome gaps, including some in the coding regions of genes with important biological functions. One such gene, MUC5AC, encodes a large, secreted mucin, which is one of the two major secreted mucins in human airways. The MUC5AC region contains a gap in the human genome reference (hg19) across the large, highly repetitive, and complex central exon. This exon is predicted to contain imperfect tandem repeat sequences and multiple conserved cysteine-rich (CysD) domains. To resolve the MUC5AC genomic gap, we used high-fidelity long PCR followed by single molecule real-time (SMRT) sequencing. This technology yielded long sequence reads and robust coverage that allowed for de novo sequence assembly spanning the entire repetitive region. Furthermore, we used SMRT sequencing of PCR amplicons covering the central exon to identify genetic variation in four individuals. The results demonstrated the presence of segmental duplications of CysD domains, insertions/deletions (indels) of tandem repeats, and single nucleotide variants. Additional studies demonstrated that one of the identified tandem repeat insertions is tagged by nonexonic single nucleotide polymorphisms. Taken together, these data illustrate the utility of long SMRT sequencing reads for de novo assembly of large repetitive sequences to fill the gaps in the human genome. Characterization of the MUC5AC gene and the sequence variation in the central exon will facilitate genetic and functional studies for this critical airway mucin.
