The Journal of Supercomputing, Volume 75, Issue 4, pp 2007–2026

Driving behaviors analysis based on feature selection and statistical approach: a preliminary study

  • Mu-Song Chen (corresponding author)
  • Chi-Pan Hwang
  • Tze-Yee Ho
  • Hsuan-Fu Wang
  • Chih-Min Shih
  • Hsing-Yu Chen
  • Wen Kai Liu


With the growing prevalence of Internet of Vehicles (IoV) technology, big data is increasingly promoted as a revolutionary development across a variety of applications. The data received from IoV are particularly valuable for analyzing driver behavior. In fleet management, for instance, administrators want fine-grained information about fleet usage, which is shaped by the usage patterns of individual drivers. In the vehicle insurance market, usage-based insurance and pay-as-you-drive schemes aim to adapt the insurance premium to individual driver behavior, or even to provide value-added services to policyholders. Such applications can be expected to make individual driving styles safer. Big data analysis is becoming indispensable for automatically discovering the knowledge contained in frequently occurring patterns and hidden rules, so it is essential to study how to make use of these large-scale data. This paper presents a preliminary study of driving behavior analysis based on feature selection and a statistical approach. Feature selection is an important and frequently used technique in data preprocessing for big data mining. As a dimensionality reduction technique, it chooses a small subset of significant features from the original data by removing irrelevant or redundant ones. For the collected vehicular data, the selection process identifies vehicle speed as the most significant feature. The statistical approach then computes the skewness and dispersion of the speed distribution as statistical features for driving behavior analysis. Finally, the established classification rules not only support data-driven services and big data analytics but also supply training samples for supervised machine learning algorithms. To validate the feasibility of the proposed method, data from over 150 drivers and more than 200,000 trips are verified in the simulation. As expected, the experimental results agree well with our observations.
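
To make the statistical step concrete, the following is a minimal sketch of how skewness and dispersion might be computed from the speed samples of one trip. The abstract does not give the exact definitions used in the paper, so this sketch assumes the moment-based (population) skewness and the coefficient of variation as the dispersion measure; the function name `speed_features` and the sample speeds are hypothetical.

```python
from statistics import fmean, pstdev


def speed_features(speeds):
    """Return (skewness, coefficient_of_variation) for a list of speed samples.

    Assumed definitions (not taken from the paper):
      skewness   = mean of cubed z-scores (population moment coefficient)
      dispersion = population standard deviation divided by the mean
    """
    if len(speeds) < 2:
        raise ValueError("need at least two speed samples")
    mean = fmean(speeds)
    std = pstdev(speeds, mu=mean)
    if mean == 0 or std == 0:
        raise ValueError("speed samples must vary and have a nonzero mean")
    # Positive skewness: mostly low speeds with a tail of high speeds.
    skewness = fmean([((v - mean) / std) ** 3 for v in speeds])
    dispersion = std / mean
    return skewness, dispersion


# Hypothetical trip: steady 30 km/h with one burst to 80 km/h.
skew, cv = speed_features([30, 30, 30, 30, 80])
print(f"skewness={skew:.3f}, dispersion={cv:.3f}")
```

For the hypothetical trip above, the mean is 40 and the standard deviation is 20, so the sketch gives a skewness of 1.5 and a dispersion of 0.5. How such values map to driving behavior classes depends on classification rules that the abstract does not specify.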


Keywords: Driving behaviors, IoV, Big data, Feature selection, Statistical approach



This study was conducted under the “Project for 4G Advanced Business and video services platform” of the Institute for Information Industry (III), which is subsidized by the Ministry of Economic Affairs of the Republic of China.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Mu-Song Chen (1, corresponding author)
  • Chi-Pan Hwang (2)
  • Tze-Yee Ho (3)
  • Hsuan-Fu Wang (4)
  • Chih-Min Shih (5)
  • Hsing-Yu Chen (5)
  • Wen Kai Liu (5)

  1. Department of Electrical Engineering, Da Yeh University, Changhua, Taiwan
  2. Department of Electronic Engineering, National Changhua University of Education, Changhua, Taiwan
  3. Department of Electrical Engineering, Feng Chia University, Taichung, Taiwan
  4. Department of Electrical and Energy Technology, Chung Chou University of Science and Technology, Changhua, Taiwan
  5. Institute for Information Industry, Taipei, Taiwan
