Statistics and Computing, Volume 25, Issue 3, pp 527–541

Computational aspects of DNA mixture analysis

Exact inference using auxiliary variables in a Bayesian network
  • Therese Graversen
  • Steffen Lauritzen
Abstract

Statistical analysis of DNA mixtures for forensic identification is known to pose computational challenges due to the enormous state space of possible DNA profiles. We describe a general method for computing the expectation of a product of discrete random variables using auxiliary variables and probability propagation in a Bayesian network. We propose a Bayesian network representation for genotypes, allowing computations to be performed locally involving only a few alleles at each step. Exploiting appropriate auxiliary variables in combination with this representation allows efficient computation of the likelihood function and prediction of genotypes of unknown contributors. Importantly, we exploit the computational structure to introduce a novel set of diagnostic tools for assessing the adequacy of the model for describing a particular dataset.
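As a rough illustration of why local computation helps with an expectation of a product of discrete random variables, the sketch below uses a toy Markov chain (the simplest Bayesian network) rather than the genotype network of the paper. It does not reproduce the authors' auxiliary-variable construction; it only folds each factor into a message passed along the chain and checks the result against brute-force enumeration. All names and numbers (`init`, `trans`, `factors`, both function names) are hypothetical and chosen for the example.

```python
from itertools import product


def expectation_by_propagation(init, trans, factors):
    """E[prod_k f_k(X_k)] for a Markov chain, computed by local message passing.

    init[x]       : P(X_1 = x)
    trans[x][y]   : P(X_{k+1} = y | X_k = x)
    factors[k][x] : f_k(x), the k-th term of the product
    """
    n_states = len(init)
    # Message over X_1: the prior weighted by the first factor.
    msg = [init[x] * factors[0][x] for x in range(n_states)]
    for f in factors[1:]:
        # Sum out the previous variable, then weight by the next factor.
        msg = [
            sum(msg[x] * trans[x][y] for x in range(n_states)) * f[y]
            for y in range(n_states)
        ]
    return sum(msg)


def expectation_by_enumeration(init, trans, factors):
    """Same quantity by summing over every joint configuration (exponential cost)."""
    n_states = len(init)
    total = 0.0
    for path in product(range(n_states), repeat=len(factors)):
        prob = init[path[0]]
        value = factors[0][path[0]]
        for k in range(1, len(path)):
            prob *= trans[path[k - 1]][path[k]]
            value *= factors[k][path[k]]
        total += prob * value
    return total


# Toy inputs, assumed purely for illustration.
init = [0.5, 0.3, 0.2]
trans = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
]
factors = [
    [1.0, 2.0, 0.5],
    [0.2, 1.5, 1.0],
    [3.0, 0.1, 1.0],
    [1.0, 1.0, 2.0],
]

local = expectation_by_propagation(init, trans, factors)
brute = expectation_by_enumeration(init, trans, factors)
print(abs(local - brute) < 1e-12)
```

The propagation version touches one pair of neighbouring variables at a time, so its cost grows linearly with the chain length, whereas enumeration grows exponentially. In a general network the same idea is carried out on a junction tree rather than a chain.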

Keywords

Bayesian network; Deconvolution; Genotype representation; Junction tree; Model diagnostics; Prequential monitor; Triangulation

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Department of Statistics, University of Oxford, Oxford, UK