Design and Implementation of an Efficient Web Crawling Using Neural Network
The number of internet users is growing day by day. Research on neural-network-based retrieval models is actively progressing, both for information retrieval and for document classification. Various algorithms have been applied to identify and quantify the weights of words in documents. As information technology advances, it becomes necessary to understand the exact meaning of documents by analyzing their words with more advanced methods. In this paper, Word2vec is applied to specific keywords so that word frequency, semantic relationships, and directional TextRank scores are fused naturally. The neural network thus serves as the mechanism for verifying the semantic relationships between words and texts in a particular document. Our approach uses Word2vec to capture the semantic features between words in the selected text while naturally integrating word frequency and semantic relations.
Keywords: Web crawler, Word2vec, Hyperlink, Reinforcement learning, Q-learning
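As a rough illustration of how word frequency, embedding similarity, and a TextRank-style ranking might be combined, the following minimal sketch scores words over a weighted co-occurrence graph. It is not the paper's implementation: the function name `rank_keywords`, the window size, the damping factor, and the hand-written two-dimensional vectors (stand-ins for trained Word2vec embeddings) are all assumptions made for this example.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_keywords(tokens, vectors, window=2, damping=0.85, iterations=50):
    """Rank in-vocabulary words with a frequency-biased, similarity-weighted walk."""
    words = sorted({t for t in tokens if t in vectors})
    freq = Counter(t for t in tokens if t in vectors)

    # Count co-occurrences of distinct in-vocabulary words within the window.
    cooc = Counter()
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            x = tokens[j]
            if w != x and w in vectors and x in vectors:
                cooc[(w, x)] += 1
                cooc[(x, w)] += 1

    # Edge weight = co-occurrence count * embedding similarity (clipped at 0).
    weight = {
        (a, b): n * max(cosine(vectors[a], vectors[b]), 0.0)
        for (a, b), n in cooc.items()
    }
    out_sum = {w: sum(weight.get((w, x), 0.0) for x in words) for w in words}

    # The restart distribution is proportional to word frequency.
    total = sum(freq.values())
    prior = {w: freq[w] / total for w in words}

    score = dict(prior)
    for _ in range(iterations):
        new_score = {}
        for w in words:
            incoming = sum(
                score[x] * weight.get((x, w), 0.0) / out_sum[x]
                for x in words
                if out_sum[x] > 0
            )
            new_score[w] = (1 - damping) * prior[w] + damping * incoming
        score = new_score
    return sorted(score.items(), key=lambda kv: kv[1], reverse=True)


# Toy input; the vectors are invented placeholders, not trained embeddings.
tokens = "crawler follows hyperlink crawler ranks hyperlink page crawler fetches page".split()
vectors = {
    "crawler": (1.0, 0.2),
    "hyperlink": (0.9, 0.4),
    "page": (0.3, 1.0),
}
ranked = rank_keywords(tokens, vectors)
for word, value in ranked:
    print(f"{word}: {value:.3f}")
```

In this sketch, words missing from `vectors` are simply skipped, and frequency enters only through the restart distribution. With real data, the vectors would come from a trained Word2vec model rather than being written by hand.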
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF2017R1D1A1B03030033).