Abstract
The World Wide Web is a powerful platform for storing and retrieving massive amounts of information. Because web data is unstructured and heterogeneous, searching it for information is time-consuming and cumbersome. Web mining is a popular data mining technique used to discover and extract useful information from web documents and services. Web data mining falls into three categories: web usage mining, web structure mining, and web content mining. Each category has its own methods, tools, and approaches for extracting data from the large volume of information on the web. This review paper describes the issues encountered when retrieving information from the web and the problems that arise in finding appropriate information. It also introduces techniques and approaches of web content mining for different types of data, and outlines various applications of web content mining.
Keywords
- Web mining
- Web content mining
- Web structure mining
- Web usage mining
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Shah, P., Pandit, H.B. (2022). A Review: Web Content Mining Techniques. In: Nanda, P., Verma, V.K., Srivastava, S., Gupta, R.K., Mazumdar, A.P. (eds) Data Engineering for Smart Systems. Lecture Notes in Networks and Systems, vol 238. Springer, Singapore. https://doi.org/10.1007/978-981-16-2641-8_15
DOI: https://doi.org/10.1007/978-981-16-2641-8_15
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-2640-1
Online ISBN: 978-981-16-2641-8
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)