In this chapter, we’ll cover the basic building blocks of web pages, HTML and CSS, and demonstrate how to scrape structured information from them using popular Python libraries such as Beautiful Soup and lxml. Later, we’ll build on this and tackle the issues involved in turning our scraper into a full-featured web crawler capable of fetching information from multiple web pages.
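To give a flavor of what scraping structured information looks like, here is a minimal sketch using Beautiful Soup. It is not taken from the chapter: the HTML snippet, the `book` and `price` class names, and the `extract_books` helper are all hypothetical, and the sketch uses Python's built-in `html.parser` backend rather than lxml so that it runs with only Beautiful Soup installed.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a page that would normally be fetched
# over HTTP; the structure and class names are assumptions for illustration.
SAMPLE_HTML = """
<html>
  <head><title>Sample catalog</title></head>
  <body>
    <ul id="books">
      <li class="book"><a href="/books/1">First Title</a> <span class="price">10.50</span></li>
      <li class="book"><a href="/books/2">Second Title</a> <span class="price">7.25</span></li>
    </ul>
  </body>
</html>
"""


def extract_books(html):
    """Return one dict per <li class="book"> element found in the HTML."""
    # "html.parser" is the standard-library backend; passing "lxml" instead
    # would also work if the lxml package is installed.
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for item in soup.find_all("li", class_="book"):
        link = item.find("a")
        price = item.find("span", class_="price")
        records.append(
            {
                "title": link.get_text(strip=True),
                "url": link["href"],
                "price": float(price.get_text(strip=True)),
            }
        )
    return records


if __name__ == "__main__":
    for record in extract_books(SAMPLE_HTML):
        print(record)
```

Each list item becomes a dictionary with a title, a relative URL, and a numeric price, which is the kind of structured record a scraper typically hands on to later processing. A crawler would extend this by following the extracted URLs to further pages.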
© 2020 Jay M. Patel
About this chapter
Cite this chapter
Patel, J.M. (2020). Web Scraping in Python Using Beautiful Soup Library. In: Getting Structured Data from the Internet. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-6576-5_2
Publisher Name: Apress, Berkeley, CA
Print ISBN: 978-1-4842-6575-8
Online ISBN: 978-1-4842-6576-5