A Study on Different Types of Web Crawlers

Chaitra, P. G.; Deepthi, V.; Vidyashree, K. P.; Rajini, S.

doi:10.1007/978-981-13-8618-3_80

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 989)

Abstract

The World Wide Web is a global information medium through which people explore information from around the world. A search engine is where internet users search for the content they need, and results are returned to them as web pages, images, or videos. Web crawlers emerged to support this task: they browse the web to gather and download pages relevant to user topics and store them in a large repository, which makes the search engine more efficient. Web crawlers are becoming more important, and their use is growing daily. This paper presents the various types of web crawlers and their architectures, and compares them.
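As a rough illustration of the gather, download, and store loop described in the abstract, the following is a minimal sketch of a breadth-first crawler. It is not taken from the paper. The names `crawl`, `LinkExtractor`, and `FAKE_WEB` are hypothetical, and the `fetch` argument is an assumed caller-supplied function, so the example runs against a small in-memory set of pages rather than the live web.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, max_pages=100):
    """Breadth-first crawl from `seed`; returns {url: html} as the repository.

    `fetch` is an assumed caller-supplied function that maps a URL to its
    HTML text, or returns None if the page cannot be retrieved.
    """
    frontier = deque([seed])  # URLs waiting to be visited
    seen = {seed}             # URLs already queued, to avoid revisits
    repository = {}           # downloaded pages, keyed by URL

    while frontier and len(repository) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue  # unreachable page: skip it
        repository[url] = html

        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)
    return repository


# Tiny in-memory "web" standing in for real HTTP requests (illustrative only).
FAKE_WEB = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/b">B</a>',
    "http://example.test/b": '<a href="/">home</a> <a href="/missing">x</a>',
}

pages = crawl("http://example.test/", FAKE_WEB.get)
print(sorted(pages))
```

A real crawler would replace `FAKE_WEB.get` with an HTTP client and would also need politeness delays, robots.txt handling, and a relevance filter for topic-focused crawling, all of which this sketch omits.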

Acknowledgements

The authors are grateful for the assistance provided by Accendere Knowledge Management Services Pvt. Ltd. in preparing the manuscript. We also thank our mentors and faculty members, who guided us throughout the research and helped us achieve the desired results.

Author information

Authors and Affiliations

Department of Information Science and Engineering, Vidyavardhaka College of Engineering, Mysuru, Karnataka, India
P. G. Chaitra, V. Deepthi, K. P. Vidyashree & S. Rajini

Corresponding author

Correspondence to P. G. Chaitra.

Editor information

Editors and Affiliations

Department of Electronics, University of Petroleum and Energy Studies, Dehradun, Uttarakhand, India
Sushabhan Choudhury
Department of Electronics, University of Petroleum and Energy Studies, Dehradun, Uttarakhand, India
Ranjan Mishra
Department of Electronics, University of Petroleum and Energy Studies, Dehradun, Uttarakhand, India
Raj Gaurav Mishra
Department of Electronics, University of Petroleum and Energy Studies, Dehradun, Uttarakhand, India
Adesh Kumar

Cite this paper

Chaitra, P.G., Deepthi, V., Vidyashree, K.P., Rajini, S. (2020). A Study on Different Types of Web Crawlers. In: Choudhury, S., Mishra, R., Mishra, R., Kumar, A. (eds) Intelligent Communication, Control and Devices. Advances in Intelligent Systems and Computing, vol 989. Springer, Singapore. https://doi.org/10.1007/978-981-13-8618-3_80

DOI: https://doi.org/10.1007/978-981-13-8618-3_80
Published: 28 August 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-8617-6
Online ISBN: 978-981-13-8618-3
eBook Packages: Intelligent Technologies and Robotics (R0)
