Abstract
Digital forensics research comprises several stages. Once the data have been collected, the final goal is to obtain a model that predicts the output for unseen data. We focus on supervised machine learning techniques. This chapter presents an experimental study on a multi-class classification task with forensic data, covering several families of methods: decision trees, Bayesian classifiers, rule-based classifiers, artificial neural networks, and nearest-neighbor classifiers. The classifiers are evaluated with two performance measures, accuracy and Cohen’s kappa. The experimental design is a 4-fold cross-validation, repeated thirty times for non-deterministic algorithms so that the results, averaged over 120 runs, are reliable. A statistical analysis compares each pair of algorithms by means of t-tests on both accuracy and Cohen’s kappa.
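The evaluation protocol summarized in the abstract (4 folds, 30 repetitions, 120 runs, accuracy and Cohen’s kappa, t-tests between pairs of algorithms) can be illustrated with a minimal sketch. It is not the chapter's actual setup: the toy one-dimensional data, the 1-nearest-neighbor and majority-class classifiers, the unstratified fold split, and the choice of a paired t statistic are all assumptions made for illustration, and every function name is hypothetical.

```python
import math
import random
from collections import Counter
from statistics import mean, stdev


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)  # observed agreement
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Agreement expected by chance from the two label distributions.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return 0.0  # kappa is undefined here; 0.0 is an assumption of this sketch
    return (p_o - p_e) / (1.0 - p_e)


def repeated_kfold(n_samples, k=4, repetitions=30, seed=0):
    """Yield (train, test) index lists: k folds per repetition, reshuffled each time."""
    rng = random.Random(seed)
    for _ in range(repetitions):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]
        for i in range(k):
            train = [j for f, fold in enumerate(folds) if f != i for j in fold]
            yield train, folds[i]


def make_toy_data(seed=0, per_class=40):
    """Hypothetical 3-class data with one numeric feature (not the chapter's data)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in enumerate((0.0, 3.0, 6.0)):
        for _ in range(per_class):
            X.append(rng.gauss(center, 1.0))
            y.append(label)
    return X, y


def one_nn_predict(X_train, y_train, x):
    """Label of the single closest training point."""
    nearest = min(range(len(X_train)), key=lambda i: abs(X_train[i] - x))
    return y_train[nearest]


def evaluate(k=4, repetitions=30):
    """Return per-run (accuracy, kappa) pairs for two illustrative classifiers."""
    X, y = make_toy_data()
    results = {"1nn": [], "majority": []}
    for train, test in repeated_kfold(len(X), k, repetitions):
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        y_te = [y[i] for i in test]
        pred_nn = [one_nn_predict(X_tr, y_tr, X[i]) for i in test]
        pred_maj = [Counter(y_tr).most_common(1)[0][0]] * len(test)
        results["1nn"].append((accuracy(y_te, pred_nn), cohen_kappa(y_te, pred_nn)))
        results["majority"].append((accuracy(y_te, pred_maj), cohen_kappa(y_te, pred_maj)))
    return results


def paired_t_statistic(a, b):
    """t statistic of the per-run differences a[i] - b[i] (assumes nonzero variance)."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


if __name__ == "__main__":
    res = evaluate()
    acc_nn = [a for a, _ in res["1nn"]]
    acc_maj = [a for a, _ in res["majority"]]
    print("runs:", len(acc_nn))
    print("mean accuracy 1-NN:", round(mean(acc_nn), 3))
    print("mean kappa 1-NN:", round(mean(k for _, k in res["1nn"]), 3))
    print("paired t (accuracy, 1-NN vs majority):", round(paired_t_statistic(acc_nn, acc_maj), 2))
```

A classifier that always predicts the same class gets a kappa of zero however large that class is, which is one reason to report kappa alongside accuracy in multi-class problems. Runs from repeated cross-validation share training data and are therefore not independent, so a plain t statistic such as this one should be read with some caution.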
Keywords
- Digital forensics
- Glass evidence
- Data mining
- Supervised machine learning
- Classification model
© 2014 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Tallón-Ballesteros, A.J., Riquelme, J.C. (2014). Data Mining Methods Applied to a Digital Forensics Task for Supervised Machine Learning. In: Muda, A., Choo, YH., Abraham, A., N. Srihari, S. (eds) Computational Intelligence in Digital Forensics: Forensic Investigation and Applications. Studies in Computational Intelligence, vol 555. Springer, Cham. https://doi.org/10.1007/978-3-319-05885-6_17
DOI: https://doi.org/10.1007/978-3-319-05885-6_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05884-9
Online ISBN: 978-3-319-05885-6
eBook Packages: Engineering, Engineering (R0)