
Importance of Missing Value Estimation in Feature Selection for Crime Analysis

  • Soubhik Rakshit (corresponding author)
  • Priyanka Das
  • Asit Kumar Das
Conference paper
Part of the Lecture Notes in Networks and Systems book series (LNNS, volume 19)

Abstract

Large datasets often contain missing values, which can degrade the performance of a decision-making system. This work reports an experiment on a crime dataset that contains missing values. The proposed methodology first estimates the missing values with a multiple regression model and then applies a graph-based approach to select the features most relevant to crime. Some of the features selected as important are ones that originally contained missing values. The selected features are then passed to several classification techniques, which helps determine the importance of estimating missing values rather than discarding the affected features in crime analysis. Compared with existing feature selection algorithms, the proposed method yields better classification accuracy, which shows the importance of the method.
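
As an illustration of the estimation step only, the following is a minimal sketch of filling missing values in one feature with a multiple linear regression fitted on the remaining features. It is not the authors' implementation: the function name `impute_by_regression`, the toy data, and the assumption that all predictor features are fully observed are hypothetical choices made for this example.

```python
import numpy as np


def impute_by_regression(X, target_col):
    """Fill NaNs in one column using multiple linear regression.

    Assumption (for this sketch only): every other column is fully
    observed, so it can serve as a predictor for the target column.
    """
    X = np.asarray(X, dtype=float).copy()
    missing = np.isnan(X[:, target_col])
    if not missing.any():
        return X

    # Design matrix: an intercept column plus all non-target features.
    predictors = np.delete(X, target_col, axis=1)
    A = np.column_stack([np.ones(len(X)), predictors])

    # Fit ordinary least squares on the rows where the target is observed.
    coef, *_ = np.linalg.lstsq(A[~missing], X[~missing, target_col], rcond=None)

    # Predict the target for the rows where it is missing.
    X[missing, target_col] = A[missing] @ coef
    return X


# Toy data (hypothetical): the third column follows 1 + 2*x1 + 3*x2,
# and its last entry is missing.
data = np.array([
    [1.0, 0.0, 3.0],
    [0.0, 1.0, 4.0],
    [1.0, 1.0, 6.0],
    [2.0, 1.0, 8.0],
    [2.0, 3.0, np.nan],
])
filled = impute_by_regression(data, target_col=2)
print(round(filled[4, 2], 6))  # close to 14.0 for this exactly linear toy data
```

In practice, a feature completed this way stays available to the subsequent feature selection step instead of being dropped, which is the trade-off the abstract highlights.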

Keywords

Crime records, Missing value estimation, Correlation coefficient, Feature selection, Multiple regression, Rough set theory

Copyright information

© Springer Nature Singapore Pte. Ltd. 2018

Authors and Affiliations

  • Soubhik Rakshit (1), corresponding author
  • Priyanka Das (1)
  • Asit Kumar Das (1)
  1. Department of Computer Science and Technology, Indian Institute of Engineering Science and Technology, Shibpur, Howrah, India
