
A Study on Different Types of Web Crawlers

  • Conference paper
Intelligent Communication, Control and Devices

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 989)

Abstract

The World Wide Web is a global information medium through which people around the world explore information. A search engine is where internet users search for the content they need, and results are returned to them as websites, images, or videos. Web crawlers emerged to support this: they browse the web to gather and download pages relevant to user topics and store them in a large repository, which makes the search engine more efficient. Web crawlers are steadily becoming more important. This paper presents the various types of web crawlers and their architectures, and compares them.
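The gather, download, and store loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the architecture of any crawler surveyed in the paper: the names `crawl`, `LinkExtractor`, `fetch`, `is_relevant`, and `TOY_WEB` are hypothetical, the "web" is a small in-memory dictionary rather than real HTTP, and the breadth-first frontier and keyword relevance check are choices made here for simplicity.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href values of <a> tags found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, is_relevant, max_pages=100):
    """Breadth-first crawl from `seed`.

    `fetch(url)` returns the page text or None (assumed interface).
    Returns a repository {url: html} holding only the relevant pages.
    """
    frontier = deque([seed])   # URLs waiting to be downloaded
    seen = {seed}              # URLs already queued, to avoid revisits
    repository = {}            # stored pages that passed the relevance check
    fetched = 0
    while frontier and fetched < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:       # page missing or unreachable
            continue
        fetched += 1
        if is_relevant(html):
            repository[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return repository


# Hypothetical in-memory "web": page name -> page text with links.
TOY_WEB = {
    "a": '<a href="b">b</a><a href="c">c</a> web crawler survey',
    "b": '<a href="a">a</a> cooking recipes',
    "c": '<a href="d">d</a> focused crawler design',
}

pages = crawl("a", TOY_WEB.get, lambda html: "crawler" in html)
```

In this sketch links are followed from every downloaded page, including pages that fail the relevance check; a crawler that prunes such links would trade coverage for fewer downloads. A real crawler would also need URL normalization, politeness delays, and robots.txt handling, none of which are shown here.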



Acknowledgements

The authors express their gratitude for the assistance provided by Accendere Knowledge Management Services Pvt. Ltd. in preparing the manuscript. We also thank our mentors and faculty members, who guided us throughout the research and helped us achieve the desired results.

Author information


Corresponding author

Correspondence to P. G. Chaitra.


Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Chaitra, P.G., Deepthi, V., Vidyashree, K.P., Rajini, S. (2020). A Study on Different Types of Web Crawlers. In: Choudhury, S., Mishra, R., Mishra, R., Kumar, A. (eds) Intelligent Communication, Control and Devices. Advances in Intelligent Systems and Computing, vol 989. Springer, Singapore. https://doi.org/10.1007/978-981-13-8618-3_80
