Abstract
Deconvoluting mixture samples is one of the most challenging problems confronting forensic DNA laboratories, and considerable effort has gone into solutions for mixture interpretation. The probabilistic interpretation of Short Tandem Repeat (STR) profiles has increased the number of complex mixtures that can be analyzed. Even so, a portion of complex mixture profiles, particularly those with a high number of contributors, is still deemed uninterpretable. Novel forensic markers, such as Single Nucleotide Variants (SNVs) and microhaplotypes, have also been proposed to allow for better mixture interpretation. However, these markers have lower discrimination power than STRs and are not compatible with CODIS or other national DNA databanks worldwide. Short-read sequencing (SRS) technologies can facilitate mixture interpretation by identifying intra-allelic variation within STRs. Unfortunately, the short length of both the amplicons containing STR markers and the sequence reads limits the number of alleles that can be attained per STR. The latest long-read sequencing (LRS) technologies can overcome this limitation in samples in which larger DNA fragments (including both STRs and SNVs) with definitive phasing are available. Based on LRS technologies, this study developed a novel CODIS-compatible forensic marker, called a macrohaplotype, which combines a CODIS STR and its flanking variants to offer an extremely high number of haplotypes and hence very high discrimination power per marker. The macrohaplotype will substantially improve mixture interpretation capabilities. Based on publicly accessible data, a panel of 20 macrohaplotypes with sizes of ~8 kbp and maximal discrimination power was designed. The statistical evaluation demonstrates that these macrohaplotypes substantially outperform CODIS STRs for mixture interpretation, particularly for mixtures with a high number of contributors, as well as for other forensic applications.
Based on these results, efforts should be undertaken to build a complete workflow, both wet-lab and bioinformatics, to precisely call the variants and generate the macrohaplotypes based on the LRS technologies.
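The link between the number of haplotypes at a marker and its discrimination power can be illustrated with a minimal sketch. It is not the statistical evaluation used in this study; it assumes Hardy-Weinberg equilibrium and unrelated individuals, and the function names and allele frequencies are hypothetical values chosen only for illustration.

```python
import math
from itertools import combinations_with_replacement


def _check(freqs):
    # Frequencies must be non-negative and sum to 1 (within rounding error).
    if any(p < 0 for p in freqs) or not math.isclose(sum(freqs), 1.0, abs_tol=1e-9):
        raise ValueError("allele/haplotype frequencies must be >= 0 and sum to 1")


def genotype_match_probability(freqs):
    """Chance that two unrelated people share a genotype at one marker,
    assuming Hardy-Weinberg equilibrium: the sum of squared genotype frequencies."""
    _check(freqs)
    total = 0.0
    for i, j in combinations_with_replacement(range(len(freqs)), 2):
        # Homozygote frequency is p^2; heterozygote frequency is 2pq.
        g = freqs[i] ** 2 if i == j else 2.0 * freqs[i] * freqs[j]
        total += g * g
    return total


def discrimination_power(freqs):
    """Power of discrimination: 1 minus the genotype match probability."""
    return 1.0 - genotype_match_probability(freqs)


def effective_number_of_alleles(freqs):
    """Effective number of alleles, 1 / sum(p_i^2)."""
    _check(freqs)
    return 1.0 / sum(p * p for p in freqs)


if __name__ == "__main__":
    # Hypothetical markers: 10 equally frequent alleles vs. 100 equally
    # frequent haplotypes (illustrative numbers, not data from the study).
    str_like = [0.1] * 10
    macrohaplotype_like = [0.01] * 100
    for name, freqs in [("STR-like", str_like), ("macrohaplotype-like", macrohaplotype_like)]:
        print(name, discrimination_power(freqs), effective_number_of_alleles(freqs))
```

Under these assumptions, a marker with many haplotypes of moderate frequency yields a smaller genotype match probability than one with fewer alleles, which is the intuition behind combining an STR with its flanking variants into a single phased marker.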
Funding
This study was funded with internal funds of the Center for Human Identification at the University of North Texas Health Science Center.
Ethics declarations
Research involving human participants and/or animals
All data used in this study are publicly accessible and anonymized. No human participants or animals were involved in this study.
Informed consent
All data used in this study are publicly accessible and anonymized. No consent form was needed in this study.
Conflict of interest
A patent application by the University of North Texas Health Science Center is pending.
Highlights
• A novel CODIS-compatible forensic marker, the macrohaplotype, was developed
• A panel of 20 macrohaplotypes with sizes of ~8 kbp and extremely high discrimination power was designed
• These macrohaplotypes substantially outperform CODIS STRs alone for mixture interpretation
Cite this article
Ge, J., King, J., Mandape, S. et al. Enhanced mixture interpretation with macrohaplotypes based on long-read DNA sequencing. Int J Legal Med 135, 2189–2198 (2021). https://doi.org/10.1007/s00414-021-02679-9
DOI: https://doi.org/10.1007/s00414-021-02679-9