2019 Dec 20;20(Suppl 24):671.
doi: 10.1186/s12859-019-3246-y.

A unified STR profiling system across multiple species with whole genome sequencing data

Yilin Liu et al. BMC Bioinformatics.

Abstract

Background: Short tandem repeats (STRs) serve as genetic markers in forensic casework because they are highly polymorphic in eukaryotic genomes. A variety of STR profiling systems have been developed for individual species, including human, dog, cat, and cattle. Maintaining these systems in parallel can be costly. These mammals share many highly similar regions across their genomes. With the large amount of whole-genome data now available for these species, it is possible to develop a unified STR profiling system. In this study, our objective is to propose and develop a unified set of STR loci that can be applied simultaneously to multiple species.

Results: To find a unified STR set, we collected whole genome sequencing data for the species of interest and mapped them to the human reference genome. We then extracted the STR loci shared across the species. From these loci, we propose an algorithm that selects a subset of loci by optimizing the combined power of discrimination. Our results show that the unified set of loci has a high combined power of discrimination (> 1 - 10^-9) for both the individual species and the mixed population, and a low random-match probability (< 10^-7) for all the species involved, indicating that the identified set of STR loci can be applied to multiple species.
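As a rough illustration of the forensic parameter mentioned above, the sketch below computes the power of discrimination (PD) of a single locus and the combined PD of several loci, using the standard textbook definitions: PD is one minus the sum of squared genotype frequencies, and loci are treated as independent when combined. The function names and the toy genotypes are assumptions made for this example; they are not taken from the paper.

```python
from collections import Counter
from math import prod


def power_of_discrimination(genotypes):
    """PD of one locus: 1 - sum of squared observed genotype frequencies.

    `genotypes` is a list of hashable genotype labels, e.g. sorted
    tuples of STR repeat counts such as (10, 11). (Hypothetical input format.)
    """
    counts = Counter(genotypes)
    n = sum(counts.values())
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def combined_pd(pds):
    """Combined PD over loci assumed independent: 1 - prod(1 - PD_l)."""
    return 1.0 - prod(1.0 - pd for pd in pds)


# Toy data: four individuals typed at one locus.
locus_genotypes = [(10, 11), (10, 11), (11, 12), (12, 12)]
pd_single = power_of_discrimination(locus_genotypes)  # frequencies 0.5, 0.25, 0.25
pd_two_loci = combined_pd([0.5, 0.5])
```

For thresholds as close to one as 1 - 10^-9, it is usually safer to track log10(1 - PD) per locus and sum the logarithms than to multiply the raw values.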

Conclusions: We identified a set of STR loci that is shared by multiple species. This implies that a unified STR profiling system is feasible for these species in forensic settings. The system can be applied to individual identification or paternity testing in each of ten common species: Sus scrofa (pig), Bos taurus (cattle), Capra hircus (goat), Equus caballus (horse), Canis lupus familiaris (dog), Felis catus (cat), Ovis aries (sheep), Oryctolagus cuniculus (rabbit), Bos grunniens (yak), and Homo sapiens (human). Our loci selection algorithm employs a greedy approach. The algorithm can generate loci under different forensic parameters and for a specific combination of species.
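The abstract does not spell out the greedy procedure, so the following is only a minimal sketch of one plausible greedy selection, not the authors' algorithm. It assumes a hypothetical table of per-locus, per-species PD values and, at each step, adds the locus that most improves the worst-off species, stopping once every species reaches a target combined PD. The objective, the table layout, and all names are assumptions for illustration.

```python
import math


def greedy_select(pd_table, species, target_log10=-9.0):
    """Greedy loci selection (illustrative sketch, not the paper's method).

    pd_table: {locus: {species: PD}} with 0 <= PD < 1 (hypothetical layout).
    target_log10: stop once log10(1 - combined PD) <= this for every species.
    Returns (chosen loci in order, final log10 residual per species).
    """
    # residual[s] = log10 of prod(1 - PD) over chosen loci, i.e. log10(1 - CPD)
    residual = {s: 0.0 for s in species}
    remaining = set(pd_table)
    chosen = []

    def log_miss(locus, s):
        # Clamp to avoid log10(0) if a PD of exactly 1 is supplied.
        return math.log10(max(1.0 - pd_table[locus][s], 1e-300))

    def worst_after(locus):
        # Worst (largest) residual across species if this locus were added.
        return max(residual[s] + log_miss(locus, s) for s in species)

    while remaining and max(residual.values()) > target_log10:
        # sorted() makes tie-breaking deterministic.
        best = min(sorted(remaining), key=worst_after)
        if worst_after(best) >= max(residual.values()):
            break  # no remaining locus helps the worst-off species
        for s in species:
            residual[s] += log_miss(best, s)
        chosen.append(best)
        remaining.remove(best)
    return chosen, residual


# Toy example with made-up PD values for two species.
toy_table = {
    "A": {"human": 0.90, "dog": 0.90},
    "B": {"human": 0.99, "dog": 0.50},
    "C": {"human": 0.50, "dog": 0.99},
}
chosen, residual = greedy_select(toy_table, ["human", "dog"], target_log10=-2.0)
```

Restricting `species` to a subset corresponds to generating loci for a specific combination of species, and changing `target_log10` corresponds to a different forensic threshold.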

Keywords: Individual identification; Short tandem repeats; Whole genome sequencing.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1. Impact of sample size on allele detection and major forensic parameters (PD, HE, MPF)
Fig. 2. The distribution of call rate
Fig. 3. Distribution of PD of remaining loci (ηl≥0.5)
Fig. 4. (𝕃) achieved by different number of loci
Fig. 5. Number of loci generated for different number of species
Fig. 6. Box plot for common logarithms of RMPs on 10,000 simulated individuals with CODIS and loci selected with proposed method
Fig. 7. Normalized probability distributions of common logarithm of CPI in trio paternity testing in human

