Abstract
In this chapter, we’ll go through the basic building blocks of web pages, namely HTML and CSS, and demonstrate how to scrape structured information from them using popular Python libraries such as Beautiful Soup and lxml. Later, we’ll build on this and tackle the issues involved in turning our scraper into a full-featured web crawler capable of fetching information from multiple web pages.
Copyright information
© 2020 Jay M. Patel
About this chapter
Cite this chapter
Patel, J.M. (2020). Web Scraping in Python Using Beautiful Soup Library. In: Getting Structured Data from the Internet. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-6576-5_2
DOI: https://doi.org/10.1007/978-1-4842-6576-5_2
Publisher Name: Apress, Berkeley, CA
Print ISBN: 978-1-4842-6575-8
Online ISBN: 978-1-4842-6576-5
eBook Packages: Business and Management, Business and Management (R0), Apress Access Books