An Association Rule Mining Method Based on Named Entity Recognition and Text Classification

  • Research Article-Computer Engineering and Computer Science
  • Arabian Journal for Science and Engineering

Abstract

Building a knowledge graph from massive text data to support in-depth association analysis and mining can help identify entities and inform decisions. Traditional Chinese named entity recognition methods have low accuracy, and traditional frequent itemset mining methods struggle to distinguish different types of categories and yield itemsets of low novelty. In this paper, we propose an association rule mining method based on named entity recognition and text classification (ARMTNER). First, the TextCNN model extracts the word vector information of the text data. Second, a bidirectional LSTM model extracts the contextual features of the text. Third, the neural network model automatically extracts the word features and the global features of the text for text classification. Finally, text sequence labeling and entity recognition are performed. Association rules are then used to mine frequent itemsets through a two-level classification: text classification and entity classification. Experimental results show that our method achieves an F1-score of 97.3% on a public dataset for Chinese named entity recognition and that the novelty of frequent itemsets increases by 0.279%.
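The two-level idea in the abstract (group texts by predicted category, then mine frequent itemsets over the entities recognized in each group) can be illustrated with a minimal sketch. The abstract does not state which mining algorithm is used, so the Apriori-style level-wise search below is an assumption, as are the function names `frequent_itemsets` and `mine_by_category`, the `"TYPE:entity"` string encoding, and the toy documents. It is not the authors' implementation.

```python
from collections import defaultdict
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for itemsets with support >= min_support.

    Apriori-style level-wise search (an assumption, not taken from the paper):
    candidate k-itemsets are unions of two frequent (k-1)-itemsets.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every distinct item as a singleton set.
    candidates = {frozenset([item]) for t in transactions for item in t}
    result = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        result.update(frequent)
        k += 1
        # Join step: keep unions that have exactly k items.
        candidates = {a | b for a, b in combinations(list(frequent), 2)
                      if len(a | b) == k}
    return result


def mine_by_category(documents, min_support):
    """Level 1: group by text category. Level 2: mine typed entities per group."""
    groups = defaultdict(list)
    for category, entities in documents:
        groups[category].append(entities)
    return {cat: frequent_itemsets(txns, min_support)
            for cat, txns in groups.items()}


# Hypothetical toy input: (predicted category, recognized "TYPE:entity" strings).
documents = [
    ("finance", {"ORG:BankA", "LOC:Beijing"}),
    ("finance", {"ORG:BankA", "LOC:Beijing", "PER:Li"}),
    ("finance", {"ORG:BankA", "PER:Li"}),
    ("sports", {"ORG:TeamX", "LOC:Shanghai"}),
    ("sports", {"ORG:TeamX", "LOC:Shanghai"}),
]
result = mine_by_category(documents, min_support=0.6)
```

In this sketch the text category partitions the transactions and the entity type is carried in each item's prefix, so itemsets from different categories are never mixed. In a full pipeline the categories and entities would come from the classification and recognition models rather than being written by hand.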

Acknowledgements

This research was supported by the Humanities and Social Sciences of the Ministry of Education of China (No. 19XJA910001) and the Postgraduate Innovation Fund project of Chongqing University of Technology (No. clgycx 20203114).

Author information

Corresponding author

Correspondence to Jiru Zhang.

About this article

Cite this article

He, B., Zhang, J. An Association Rule Mining Method Based on Named Entity Recognition and Text Classification. Arab J Sci Eng 48, 1503–1511 (2023). https://doi.org/10.1007/s13369-022-06870-x

  • DOI: https://doi.org/10.1007/s13369-022-06870-x
