Crime Data Analysis Using Machine Learning Models

  • Conference paper
Applied Technologies (ICAT 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1755)

Abstract

Crime statistics in Ecuador show that the number of cases for several types of crime has increased in recent years. Although various state entities hold criminal data, analyses to predict new cases are not always carried out. This work proposes an analysis based on machine learning algorithms that extracts knowledge about the relationships between the variables that affect criminal acts. The results can serve as tools for the country's authorities and organizations to better control and prevent crime. Crime counts by province can be predicted with multiple regression and related machine learning techniques. Using monthly counts of different types of crime over several years, three algorithms are implemented: Multiple Linear Regression (MLR), Decision Tree Regression (DTR), and Random Forest Regression (RFR). The models are trained and tested to predict new crimes, in particular rapes, home burglaries, and personal thefts. R-squared, adjusted R-squared, and root mean square error (RMSE) are used to evaluate and compare the regression models. The RFR model achieves the best fit, with an adjusted R-squared of 0.965746 for home burglaries and 0.974088 for thefts, and it has the lowest RMSE for all three types of crime. For rape, the best adjusted R-squared, 0.929960, was obtained with the MLR model. The provinces most affected in absolute counts are Guayas and Pichincha, where crime remains at alarming levels.
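The three evaluation metrics named in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration of the standard textbook definitions, not the authors' code: the function names, the sample counts, and the choice of two predictors are all assumptions made for the example, and none of the numbers come from the paper.

```python
import math
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2: float, n_samples: int, n_features: int) -> float:
    """Penalise R-squared for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error, in the same units as the target."""
    squared_errors = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared_errors / len(y_true))


# Hypothetical monthly crime counts (made up for illustration only).
observed = [120.0, 135.0, 150.0, 160.0, 170.0, 185.0]
predicted = [118.0, 138.0, 149.0, 163.0, 168.0, 186.0]

r2 = r_squared(observed, predicted)
# n_features=2 is an assumed predictor count, e.g. month and province.
adj_r2 = adjusted_r_squared(r2, n_samples=len(observed), n_features=2)
error = rmse(observed, predicted)
print(f"R2={r2:.4f}  adjusted R2={adj_r2:.4f}  RMSE={error:.4f}")
```

Adjusted R-squared never exceeds R-squared when the model has at least one predictor, which makes it the fairer figure when comparing models with different numbers of input variables.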

Author information

Correspondence to Vivas Kumar.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Kumar, V. (2023). Crime Data Analysis Using Machine Learning Models. In: Botto-Tobar, M., Zambrano Vizuete, M., Montes León, S., Torres-Carrión, P., Durakovic, B. (eds) Applied Technologies. ICAT 2022. Communications in Computer and Information Science, vol 1755. Springer, Cham. https://doi.org/10.1007/978-3-031-24985-3_22

  • DOI: https://doi.org/10.1007/978-3-031-24985-3_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-24984-6

  • Online ISBN: 978-3-031-24985-3

  • eBook Packages: Computer Science, Computer Science (R0)
