PLoS One. 2009 Dec 16;4(12):e8327.

doi: 10.1371/journal.pone.0008327.

An information gap in DNA evidence interpretation

Mark W Perlin¹, Alexander Sinelnikov

PMID: 20020039
PMCID: PMC2791197
DOI: 10.1371/journal.pone.0008327

Mark W Perlin et al. PLoS One. 2009.

Affiliation

¹ Cybergenetics, Pittsburgh, Pennsylvania, United States of America. perlin@cybgen.com

Abstract

Forensic DNA evidence often contains mixtures of multiple contributors, or is present in low template amounts. The resulting data signals may appear to be relatively uninformative when interpreted using qualitative inclusion-based methods. However, these same data can yield greater identification information when interpreted by computer using quantitative data-modeling methods. This study applies both qualitative and quantitative interpretation methods to a well-characterized DNA mixture and dilution data set, and compares the inferred match information. The results show that qualitative interpretation loses identification power at low culprit DNA quantities (below 100 pg), but that quantitative methods produce useful information down into the 10 pg range. Thus there is a ten-fold information gap that separates the qualitative and quantitative DNA mixture interpretation approaches. With low quantities of culprit DNA (10 pg to 100 pg), computer-based quantitative interpretation provides greater match sensitivity.

Conflict of interest statement

Competing Interests: Dr. Mark Perlin is a shareholder, officer and employee of Cybergenetics in Pittsburgh, PA, a company that develops genetic technology for computer interpretation of DNA evidence. Cybergenetics manufactures the TrueAllele® Casework system, which is one of the methods described in the paper. Dr. Alex Sinelnikov is an employee of Genetica in Cincinnati, OH, a company that conducts genetic testing. Dr. Sinelnikov was an employee of Cybergenetics at the time he worked on this study.

Figures

**Figure 1. STR data can be modeled by linear superposition of genotype patterns.**
(a) The DNA sequencer data signal shows the Penta D STR locus for a 0.25 ng 30% culprit DNA sample C3. (b) Linearly combining the victim and culprit genotype values in respective 70% and 30% proportions forms a model of the observed allele peak height pattern.
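The superposition idea in this caption can be illustrated with a minimal sketch. The allele labels, the two genotypes, and the total signal of 1000 units are hypothetical values chosen for illustration; they are not taken from the paper, and the function names are invented here.

```python
def genotype_vector(allele_pair, alleles):
    """Count the copies (0, 1 or 2) of each allele that a genotype carries."""
    return [allele_pair.count(a) for a in alleles]


def superpose(genotypes, weights, total_signal=1.0):
    """Weighted sum of genotype vectors, scaled to a total peak-height signal."""
    n = len(genotypes[0])
    pattern = [0.0] * n
    for g, w in zip(genotypes, weights):
        for i in range(n):
            # A genotype has two allele copies, so halve each count to keep
            # the weights interpretable as fractions of the total signal.
            pattern[i] += w * g[i] / 2.0
    return [total_signal * p for p in pattern]


alleles = [9, 10, 12, 13]                      # hypothetical allele labels
victim = genotype_vector((9, 12), alleles)     # hypothetical victim genotype
culprit = genotype_vector((10, 13), alleles)   # hypothetical culprit genotype

# 70% victim and 30% culprit, as in the caption; the signal scale is assumed.
expected = superpose([victim, culprit], [0.7, 0.3], total_signal=1000.0)
print(expected)
```

Under these assumptions the victim's two alleles each receive about 35% of the signal and the culprit's two alleles about 15% each, which is the kind of peak height pattern a model would compare against the observed data.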

**Figure 2. Different genotype combinations produce patterns that are compared with data to obtain a genotype probability.**
Linear combinations of the known victim contributor with three different unknown contributor allele pair candidates are shown at STR locus Penta D. The victim contribution is known to be 70%, and the culprit's is 30%. The allelic peak height pattern that best fits the observed data (Figure 1a) corresponds to the candidate in the rightmost column. The other two candidates produce patterns that fit the quantitative data peaks very poorly. Therefore, based on a multivariate normal likelihood function, the rightmost allele pair candidate would have the greatest probability of arising from the culprit genotype.
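A rough sketch of this comparison follows. It simplifies the multivariate normal likelihood to independent peaks with one shared standard deviation, and it assumes a uniform prior over the three candidates. The observed peak heights, allele labels, genotypes, and the value `sd=50.0` are all invented for illustration.

```python
import math

ALLELES = [9, 10, 12, 13]  # hypothetical allele labels


def expected_pattern(victim, candidate, w_victim=0.7, total=1000.0):
    """Peak heights predicted by mixing victim and candidate genotypes."""
    w_culprit = 1.0 - w_victim
    return [
        total * (w_victim * victim.count(a) + w_culprit * candidate.count(a)) / 2.0
        for a in ALLELES
    ]


def log_likelihood(observed, expected, sd=50.0):
    """Log of a normal likelihood with independent peaks and a shared sd."""
    norm = math.log(sd * math.sqrt(2.0 * math.pi))
    return sum(-0.5 * ((o - e) / sd) ** 2 - norm for o, e in zip(observed, expected))


def genotype_probabilities(observed, victim, candidates):
    """Normalize candidate likelihoods into probabilities (uniform prior assumed)."""
    logs = [log_likelihood(observed, expected_pattern(victim, c)) for c in candidates]
    top = max(logs)  # subtract the maximum before exponentiating, for stability
    weights = [math.exp(value - top) for value in logs]
    total = sum(weights)
    return {c: w / total for c, w in zip(candidates, weights)}


observed = [340.0, 160.0, 365.0, 135.0]        # made-up peak heights
victim = (9, 12)                               # hypothetical known genotype
candidates = [(10, 10), (13, 13), (10, 13)]    # hypothetical culprit candidates

probs = genotype_probabilities(observed, victim, candidates)
best = max(probs, key=probs.get)
print(best, probs)
```

With these made-up numbers the candidate whose predicted pattern lies closest to the observed peaks takes nearly all of the probability, while the two poorly fitting candidates receive almost none.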

**Figure 3. Some genotype probability distributions are more informative than others.**
The allele pair probabilities for six possible genotype candidates are shown for three different DNA mixture interpretation methods: (a) quantitative data modeling based on peak heights; (b) qualitative listing of genotype possibilities, filtered by consideration of an obligate culprit allele; and (c) unfiltered qualitative genotype listing that includes all allele pairs, based on “allele peaks” over threshold. A higher probability for the correct genotype value leads to greater match information.
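To see why a more concentrated genotype distribution gives more match information, the sketch below assumes one common formulation: the likelihood ratio at a locus is the inferred probability of the matching genotype divided by its population probability. The formulation is an assumption here rather than something this page states. The three inferred probabilities and the population frequency of 0.05 are hypothetical.

```python
import math


def log_lr(inferred_probability, population_probability):
    """log10 of the ratio of inferred to population genotype probability."""
    return math.log10(inferred_probability / population_probability)


population = 0.05            # hypothetical population genotype frequency

quantitative = 0.95          # hypothetical: most probability on one genotype
filtered = 1.0 / 3.0         # hypothetical: uniform over three listed pairs
unfiltered = 1.0 / 6.0       # uniform over six listed pairs

for name, q in [("quantitative", quantitative),
                ("filtered qualitative", filtered),
                ("unfiltered qualitative", unfiltered)]:
    print(name, round(log_lr(q, population), 3))
```

Spreading probability evenly over more candidate genotypes lowers the probability assigned to the correct one, and the log(LR) falls with it.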

**Figure 4. Qualitative genotype inference uses thresholds to discard data and produce a uniform genotype probability distribution.**
In qualitative genotype interpretation, a predetermined threshold is applied to the peak height data, retaining all peaks whose heights exceed the threshold, and discarding all other peaks. (a) This threshold operation transforms the quantitative peak height pattern into a qualitative all-or-none set of threshold-inferred alleles. (b) This data allele set can then be compared with victim (black) and candidate culprit (blue) genotype values in a match operation based on qualitative set inclusion. When accounting for the victim's genotype, all possible culprit allele pairs that combine with the victim's alleles to reproduce the data alleles are assigned equal positive probability.
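The threshold and set-inclusion steps can be sketched as follows. The peak heights, the threshold of 50 units, and the victim genotype are hypothetical, and the function names are invented for this illustration.

```python
from itertools import combinations_with_replacement


def threshold_alleles(peaks, threshold):
    """Keep only the alleles whose peak height exceeds the threshold."""
    return {allele for allele, height in peaks.items() if height > threshold}


def qualitative_genotypes(data_alleles, victim):
    """Assign equal probability to every culprit allele pair that, together
    with the victim's alleles, reproduces exactly the set of data alleles."""
    pairs = [
        pair
        for pair in combinations_with_replacement(sorted(data_alleles), 2)
        if set(victim) | set(pair) == data_alleles
    ]
    return {pair: 1.0 / len(pairs) for pair in pairs} if pairs else {}


peaks = {9: 400.0, 10: 150.0, 11: 30.0, 12: 420.0}   # made-up peak heights
victim = (9, 12)                                     # hypothetical known genotype

data_alleles = threshold_alleles(peaks, threshold=50.0)
candidates = qualitative_genotypes(data_alleles, victim)
print(sorted(data_alleles), candidates)
```

Once the threshold is applied, the peak heights no longer influence the result: every allele pair that explains the remaining allele set is treated as equally likely, however well or badly it fits the quantitative data.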

**Figure 5. Histograms show how match information increases when using quantitative interpretation methods.**
The within-case log(LR) match information differences between quantitative and qualitative interpretation methods. (a) The information improvement log(LR2/CPI) between the LR2 (quantitative interpretation) and CPI (qualitative interpretation) match statistics on the same cases, when there are two unknown contributor genotypes. (b) The information improvement log(LR1/CLR) between the LR1 (quantitative method) and CLR (qualitative method) match statistics on the same cases, when the victim genotype is known, and there is one unknown contributor genotype.

**Figure 6. Match information as a function of mixture weight and interpretation method.**
A scatter plot of log(LR) match information (y-axis) versus culprit mixture weight (x-axis). Each of the five culprit weights (10%, 30%, 50%, 70% and 90%) has four columns, one for each of the mixture interpretation method match statistics (CPI in red, CLR in purple, LR2 in green, and LR1 in blue). In each column, there are up to eight mixture cases; a case is not shown when its log(LR) is negative.

**Figure 7. Determining DNA mass detection sensitivity by linear regression of match information versus DNA quantity.**
Scatter plots showing log(LR) match information (y-axis) versus log(culprit DNA) (x-axis) for four different mixture interpretation methods: (a) LR1 (blue), (b) LR2 (green), (c) CLR (purple), and (d) CPI (red). For each method, the scatter plots show an increasing ramp function that levels off when the maximum match information has been attained. The left ramp component is fitted to a regression line. The point at which this line intersects the horizontal million-to-one match information level gives the sensitivity of the interpretation method, measured in picograms of culprit DNA.
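The regression step described in this caption can be sketched with ordinary least squares. The data points below are synthetic and lie exactly on a line, so they do not reproduce the paper's measurements. The sketch also assumes the points have already been restricted to the rising ramp, before the plateau.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sensitivity_pg(slope, intercept, target_log_lr=6.0):
    """DNA quantity (pg) at which the fitted line reaches the target log10(LR).

    A target of 6.0 corresponds to a million-to-one match level.
    """
    return 10.0 ** ((target_log_lr - intercept) / slope)


# Synthetic ramp data: log10(culprit DNA in pg) versus log10(LR).
log_dna = [1.0, 1.5, 2.0, 2.5]
log_lr = [5.0, 8.0, 11.0, 14.0]

slope, intercept = fit_line(log_dna, log_lr)
print(slope, intercept, sensitivity_pg(slope, intercept))
```

Solving the fitted line for the quantity at which log(LR) equals 6 gives a single picogram figure per method, which makes the four interpretation methods directly comparable.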

**Figure 8. The information gap in detection sensitivity between quantitative and qualitative DNA interpretation methods.**
Two log-log scatter plots are shown of LR match information (y-axis) versus culprit DNA quantity. There is an order of magnitude information gap between the more sensitive quantitative LR1 interpretation method (blue) and the less informative qualitative CPI method (red).

**Figure 9. Using a linear relationship to predict match information from DNA quantity in a criminal case.**
The linear regression of log(LR) match information (y-axis) versus log(culprit DNA) (x-axis) is used here as a calibration curve for the LR1 method. For a mass of 67 pg of culprit DNA, the regression line shows an expected LR of 10¹⁵. With the 12 STR loci actually examined in the homicide case (instead of the validation study's 15 STR loci), we expect the log(LR) match information to be reduced to 80% of that value, giving an LR of approximately 10¹².
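The locus adjustment in this caption is simple arithmetic on the log scale. The sketch below assumes that each STR locus contributes, on average, an equal share of log(LR) information; that assumption is implied by the proportional scaling but is not stated explicitly on this page. The function name is invented for illustration.

```python
def scale_log_lr_for_loci(log_lr, loci_used, loci_calibrated):
    """Rescale log10(LR) in proportion to the number of loci examined.

    Assumes every locus contributes an equal average share of information.
    """
    return log_lr * loci_used / loci_calibrated


calibrated_log_lr = 15.0   # expected log10(LR) at 67 pg, from the caption
adjusted = scale_log_lr_for_loci(calibrated_log_lr, loci_used=12, loci_calibrated=15)
print(adjusted)            # log10(LR) expected with 12 of the 15 loci
```

Twelve of fifteen loci is 80% of the calibrated information, so the exponent drops from 15 to 12 while the LR itself falls by a factor of a thousand.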
