Data mining application on aviation accident data for predicting topmost causes for accidents

Cluster Computing

Abstract

A safety management system is a vital component of the aviation industry. Aviation safety inspectors apply broad knowledge of the aviation industry, aviation safety, and the central laws, regulations, and strategies affecting aviation. In addition, they draw on intensive technical knowledge and skill in the operation and maintenance of aircraft. Data mining methods have also been applied successfully in aviation safety management systems, since the aviation industry accumulates large amounts of knowledge and data. This paper proposes a method that applies data mining techniques to the accident reports in the Federal Aviation Administration (FAA) accident/incident data system database, which contains accident records for all categories of civil aviation between 1919 and 2014. In this study, we investigate the application of several data mining methods to the accident reports in order to arrive at new inferences that could help aviation safety management. Moreover, correlation-based feature selection (CFS) with the Oscillating Search technique is used to select the prominent attributes that are potential factors in causing the largest number of aircraft accidents. The aim of this work is to identify the effective attributes in order to reduce the number of accidents in the aviation industry. The proposed approach, named "improved oscillated correlation feature selection (IOCFS)", is evaluated using conventional classifiers such as Naïve Bayes, support vector machine (SVM), artificial neural network (ANN), k-nearest neighbor (k-NN), Multiclass classifier, and decision tree (J48). The selected features are tested in terms of accuracy, running time, and reliability, the latter measured by true positive rate, false positive rate, precision, recall-measure, and ROC. Compared with the other conventional classifiers, the k-NN classifier with k = 5 gives the best results.
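
The idea behind correlation-based feature selection can be illustrated with a small sketch. It scores a feature subset with the standard CFS merit, k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-to-class correlation and r_ff is the mean feature-to-feature correlation. Several things here are assumptions made for illustration and are not taken from the paper: absolute Pearson correlation is used as the correlation measure, a plain greedy forward search stands in for the Oscillating Search technique, and the function names, the `tol` parameter, and the toy data are all hypothetical. This is not the IOCFS method itself.

```python
import math
from statistics import mean


def pearson(x, y):
    """Absolute Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no correlation information
    return abs(cov / (sx * sy))


def cfs_merit(columns, target, subset):
    """CFS merit of a subset: k * r_cf / sqrt(k + k * (k - 1) * r_ff)."""
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean correlation between each selected feature and the class.
    r_cf = mean(pearson(columns[i], target) for i in subset)
    # Mean pairwise correlation among the selected features (redundancy).
    pairs = [(i, j) for i in subset for j in subset if i < j]
    r_ff = mean(pearson(columns[i], columns[j]) for i, j in pairs) if pairs else 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def greedy_forward_cfs(columns, target, tol=1e-9):
    """Greedy forward search (an assumed stand-in, NOT oscillating search).

    Adds the feature that raises the merit most and stops when no
    candidate improves the merit by more than `tol`.
    """
    selected, best = [], 0.0
    remaining = set(range(len(columns)))
    while remaining:
        cand, cand_merit = max(
            ((i, cfs_merit(columns, target, selected + [i])) for i in sorted(remaining)),
            key=lambda pair: pair[1],
        )
        if cand_merit <= best + tol:
            break
        selected.append(cand)
        remaining.remove(cand)
        best = cand_merit
    return selected, best


# Hypothetical toy data: each inner list is one feature column.
columns = [
    [1, 2, 3, 4, 5, 6],     # feature 0: closely tracks the target
    [2, 4, 6, 8, 10, 12],   # feature 1: redundant, a scaled copy of feature 0
    [1, 0, 1, 0, 1, 0],     # feature 2: weakly related to the target
]
target = [1.2, 1.9, 3.1, 3.8, 5.2, 5.9]

selected, merit = greedy_forward_cfs(columns, target)
print("selected features:", selected, "merit:", round(merit, 4))
```

On this toy data the search keeps only one of the two redundant columns: adding the scaled copy leaves the merit essentially unchanged, and adding the weakly related column lowers it. That is the property CFS relies on, namely rewarding correlation with the class while penalizing correlation among the selected features.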

Author information

Corresponding author

Correspondence to S. Koteeswaran.

About this article

Cite this article

Koteeswaran, S., Malarvizhi, N., Kannan, E. et al. Data mining application on aviation accident data for predicting topmost causes for accidents. Cluster Comput 22 (Suppl 5), 11379–11399 (2019). https://doi.org/10.1007/s10586-017-1394-2

  • DOI: https://doi.org/10.1007/s10586-017-1394-2
