Forensic Sci Int Genet. 2018 Jul:35:97-106.
doi: 10.1016/j.fsigen.2018.03.012. Epub 2018 Apr 12.

A phylogenetic framework facilitates Y-STR variant discovery and classification via massively parallel sequencing

Tunde I Huszar et al. Forensic Sci Int Genet. 2018 Jul.

Abstract

Short tandem repeats on the male-specific region of the Y chromosome (Y-STRs) are permanently linked as haplotypes, and therefore Y-STR sequence diversity can be considered within the robust framework of a phylogeny of haplogroups defined by single nucleotide polymorphisms (SNPs). Here we use massively parallel sequencing (MPS) to analyse the 23 Y-STRs in Promega's prototype PowerSeq™ Auto/Mito/Y System kit (containing the markers of the PowerPlex® Y23 [PPY23] System) in a set of 100 diverse Y chromosomes whose phylogenetic relationships are known from previous megabase-scale resequencing. Including allele duplications and alleles resulting from likely somatic mutation, we characterised 2311 alleles, demonstrating 99.83% concordance with capillary electrophoresis (CE) data on the same sample set. The set contains 267 distinct sequence-based alleles (an increase of 58% compared to the 169 detectable by CE), including 60 novel Y-STR variants phased with their flanking sequences which have not been reported previously to our knowledge. Variation includes 46 distinct alleles containing non-reference variants of SNPs/indels in both repeat and flanking regions, and 145 distinct alleles containing repeat pattern variants (RPV). For DYS385a,b, DYS481 and DYS390 we observed repeat count variation in short flanking segments previously considered invariable, and suggest new MPS-based structural designations based on these. We considered the observed variation in the context of the Y phylogeny: several specific haplogroup associations were observed for SNPs and indels, reflecting the low mutation rates of such variant types; however, RPVs showed less phylogenetic coherence and more recurrence, reflecting their relatively high mutation rates. 
In conclusion, our study reveals considerable additional diversity at the Y-STRs of the PPY23 set via MPS analysis, demonstrates high concordance with CE data, facilitates nomenclature standardisation, and places Y-STR sequence variants in their phylogenetic context.
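The reported gain in resolution can be checked from the two allele counts given in the abstract (267 sequence-based alleles against 169 distinguishable by CE). The snippet below is only an illustrative arithmetic check; the function name `percent_increase` is an assumption made for this sketch and does not come from the study.

```python
def percent_increase(new: int, old: int) -> float:
    """Relative increase of `new` over `old`, expressed in percent."""
    return 100.0 * (new - old) / old


# Counts taken from the abstract.
sequence_based_alleles = 267  # distinct alleles resolved by MPS
ce_alleles = 169              # distinct alleles detectable by CE

increase = percent_increase(sequence_based_alleles, ce_alleles)
print(f"{increase:.0f}% more distinct alleles by MPS than by CE")
```

The difference of 98 alleles relative to the 169 CE alleles rounds to the 58% quoted in the abstract.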

Keywords: Massively parallel sequencing; PPY23; PowerSeq system; Repeat pattern variation (RPV); Single nucleotide polymorphism (SNP); Y-STRs.

Figures

Fig. 1
Observed SNPs and indels in their phylogenetic context. The phylogenetic tree to the left represents the relationships among 100 diverse Y chromosomes, based on 13,261 high-confidence Y-SNPs previously described [11]. Y-chromosome haplogroups are given in their shorthand formats (Table S1) to the right of the tree. Y-STR names are listed above. Variants are shaded in grey and represented by filled circles if internal to the repeat array, or unfilled diamonds if in the flanking region. Variants are described below, by rs# where available, or otherwise as ‘SNP’ or ‘indel’ (Table S3). Note that ‘multiple SNPs’ internal to DYS635 (which we regard as an RPV − see text) are found in 85/100 samples because the GRCh38 reference assembly carries the same derived state as superhaplogroup P, and hence all deeper-rooting clades bearing the ancestral state are considered as ‘alternative’ rather than ‘reference’ variants. Note that rs370750300 and rs375658920 are listed elsewhere as DYS481-associated SNPs, and thus included in the figure; however, we regard these as an RPV (see text).
Fig. 2
Examples of observed RPVs in their phylogenetic contexts. A phylogenetic tree is shown to the left, as in Fig. 1. a) Allele structures for DYS635 in all 100 samples. Repeat unit sequences are shown above, and boxes below contain the number of repeat units in each block, coloured by heat-map from blue (shortest) to red (longest). Invariant blocks are not coloured. SNPs and indels are highlighted by green and orange boxes respectively. Bars on the right mark features specifically mentioned in the text, and are coloured black for monophyletic, or grey for polyphyletic examples. The reference sequence allele structure (‘ref.’) in GRCh38 chrY is shown below. To fully appreciate the colours of the heat-map, please consult the online version of the figure. b) Allele structures for DYS389II; c) Allele structures for DYS481.
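The block-wise allele structures described in this caption (a repeat unit plus the number of units in each block) can be illustrated with a minimal sketch. It assumes a fixed 4-bp repeat unit and a sequence that starts in frame; the functions `collapse_repeats` and `bracket` and both toy sequences are hypothetical, written for illustration only, and are not alleles or software from the study.

```python
from typing import List, Tuple


def collapse_repeats(seq: str, unit_len: int = 4) -> List[Tuple[str, int]]:
    """Cut `seq` into consecutive units of `unit_len` bases and
    run-length encode identical neighbouring units into blocks."""
    blocks: List[Tuple[str, int]] = []
    for i in range(0, len(seq), unit_len):
        unit = seq[i:i + unit_len]
        if blocks and blocks[-1][0] == unit:
            # Same unit as the previous block: extend that block.
            blocks[-1] = (unit, blocks[-1][1] + 1)
        else:
            # New unit: start a new block.
            blocks.append((unit, 1))
    return blocks


def bracket(blocks: List[Tuple[str, int]]) -> str:
    """Format blocks in bracketed form, e.g. [TCTA]4 [TGTA]2."""
    return " ".join(f"[{unit}]{count}" for unit, count in blocks)


# Two toy alleles (hypothetical): same length, different repeat pattern.
allele_a = "TCTA" * 4 + "TGTA" * 2 + "TCTA" * 2
allele_b = "TCTA" * 6 + "TGTA" * 2

print(len(allele_a), bracket(collapse_repeats(allele_a)))
print(len(allele_b), bracket(collapse_repeats(allele_b)))
```

Both toy alleles contain eight repeat units and are 32 bp long, so a length-based method such as CE would not separate them, whereas their sequence-level block structures differ. This is the kind of difference the abstract calls a repeat pattern variant.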

References

    1. Butler J.M. Fundamentals of Forensic DNA Typing. Academic Press; Cambridge MA: 2009.
    2. Gettings K.B., Kiesler K.M., Faith S.A., Montano E., Baker C.H., Young B.A., Guerrieri R.A., Vallone P.M. Sequence variation of 22 autosomal STR loci detected by next generation sequencing. Forensic Sci. Int. Genet. 2016;21:15–21.
    3. Nachman M.W., Crowell S.L. Estimate of the mutation rate per nucleotide in humans. Genetics. 2000;156:297–304.
    4. Weber J.L., Wong C. Mutation of human short tandem repeats. Hum. Mol. Genet. 1993;3:1123–1128.
    5. Jobling M.A., Pandya A., Tyler-Smith C. The Y chromosome in forensic analysis and paternity testing. Int. J. Legal Med. 1997;110:118–124.
