Design and Implementation of an Efficient Web Crawling Using Neural Network

  • Ahmed Md. Tanvir
  • Yonghoon Kim
  • Mokdong Chung
Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 536)

Abstract

The number of internet users and the volume of internet usage grow day by day. Research on neural-network retrieval models for information retrieval and document classification is currently active, and various algorithms have been applied to identify and quantify the weights of words in documents. As information technology advances, it becomes necessary to understand the exact meaning of documents by analyzing their words with more advanced methods. In this paper, Word2vec is applied to specific keywords to identify word frequencies, semantic relationships, and directional text-ranks in a naturally integrated way. The neural network thus serves as the mechanism for verifying the semantic relationship between words and texts in a particular document. Our approach uses Word2vec to capture the semantic features between words in the selected text while naturally integrating word frequency and semantic relations.
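The idea of combining word frequency with semantic relatedness can be illustrated with a minimal sketch. This is not the authors' implementation: the vectors in `TOY_VECTORS` are hand-written stand-ins for trained Word2vec embeddings, and the scoring rule in the hypothetical `keyword_scores` function (relative frequency multiplied by mean cosine similarity to the other words in the text) is an assumption chosen only to show how the two signals might be merged.

```python
import math
from collections import Counter

# Hypothetical toy vectors standing in for trained Word2vec embeddings.
TOY_VECTORS = {
    "crawler": [0.9, 0.1, 0.0],
    "web": [0.8, 0.2, 0.1],
    "link": [0.7, 0.3, 0.0],
    "banana": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def keyword_scores(tokens, vectors):
    """Score each known word by its relative frequency times its mean
    cosine similarity to the other known words in the text.
    (An assumed scoring rule, for illustration only.)"""
    known = [t for t in tokens if t in vectors]  # skip out-of-vocabulary tokens
    counts = Counter(known)
    vocab = list(counts)
    scores = {}
    for word in vocab:
        others = [o for o in vocab if o != word]
        if others:
            mean_sim = sum(cosine(vectors[word], vectors[o]) for o in others) / len(others)
        else:
            mean_sim = 0.0
        scores[word] = (counts[word] / len(known)) * mean_sim
    return scores


tokens = "web crawler follows link web crawler banana".split()
scores = keyword_scores(tokens, TOY_VECTORS)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

With these toy inputs, the frequent and mutually similar words "web" and "crawler" rank highest, while "banana", which is rare and semantically distant from the rest, ranks last. In practice the vectors would come from a trained Word2vec model rather than a hard-coded table.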

Keywords

Web crawler · Word2vec · Hyperlink · Reinforcement learning · Q-learning

Notes

Acknowledgements

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF2017R1D1A1B03030033).

Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  • Ahmed Md. Tanvir (1)
  • Yonghoon Kim (1)
  • Mokdong Chung (1)
  1. Department of Computer Engineering, Pukyong National University, Busan, Korea
