Abstract
To support rapid screening of drug addicts in modern police work, this study applies data mining to real samples from drug addicts and non-addicts in order to build a classification model based on pulse wave data. After the pulse wave data are pre-processed, a baseline random forest classifier is built; it achieves high accuracy but a relatively low recall and F1 score. To address this, an improved classification model is proposed that combines three strategies: first, cross-validation over multiple training and test splits to estimate the generalization error; second, down-sampling to balance the class distribution; and third, selection of model parameters by multi-criteria analysis. Evaluated by accuracy, precision, recall, and F1 score in experiments on different datasets, the improved random forest classification model shows superior and robust performance.
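The pattern described in the abstract, high accuracy alongside low recall and F1, is typical of imbalanced classes. The following is a minimal sketch of that effect and of two of the named remedies (down-sampling and multi-split cross-validation), using only the Python standard library. The labels, the 10/90 class ratio, the choice of the positive class as the minority, and all function names are illustrative assumptions; this is not the authors' implementation and no random forest is trained here.

```python
import random

def scores(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs),
            "precision": precision, "recall": recall, "f1": f1}

def downsample(labels, positive=1, seed=0):
    """Keep every minority sample plus an equal-sized random subset
    of the majority class; returns sorted sample indices."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == positive]
    neg = [i for i, y in enumerate(labels) if y != positive]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    return sorted(minority + rng.sample(majority, len(minority)))

def k_fold_indices(n, k=5, seed=0):
    """Shuffle 0..n-1 and return k (train, test) index pairs."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((sorted(train), sorted(folds[i])))
    return splits

# Hypothetical data: 10 positives, 90 negatives (assumed ratio).
y_true = [1] * 10 + [0] * 90
# A classifier that finds only 2 of the 10 positives.
y_pred = [1] * 2 + [0] * 98
baseline = scores(y_true, y_pred)

# Balance the classes, then build 5 train/test splits on the balanced set.
balanced_idx = downsample(y_true)
splits = k_fold_indices(len(balanced_idx), k=5)
```

In this made-up case the accuracy is 0.92 while the recall is only 0.2, which is why the abstract reports precision, recall, and F1 alongside accuracy. In a real pipeline the per-split scores would be averaged to estimate the generalization error, and candidate parameter settings would be compared across several of these metrics rather than on accuracy alone.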
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, T., Gu, H. (2019). A Classification Model for Drug Addicts Based on Improved Random Forests Algorithm. In: Sun, X., Pan, Z., Bertino, E. (eds) Artificial Intelligence and Security. ICAIS 2019. Lecture Notes in Computer Science, vol 11632. Springer, Cham. https://doi.org/10.1007/978-3-030-24274-9_5
DOI: https://doi.org/10.1007/978-3-030-24274-9_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-24273-2
Online ISBN: 978-3-030-24274-9
eBook Packages: Computer Science, Computer Science (R0)