Overview
- Explains PySpark SQL and dataframes in detail
- Includes I/O operations using PySpark SQL with the most frequently used SQL and NoSQL databases
- Detailed discussion of data preprocessing using PySpark SQL
- Problem-solution approach to graph-based algorithms using Graphframes
Table of contents (9 chapters)
About this book
PySpark SQL Recipes starts with recipes on creating dataframes from different types of data sources, data aggregation and summarization, and exploratory data analysis using PySpark SQL. You’ll also discover how to solve problems in graph analysis using graphframes.
On completing this book, you’ll have ready-made code for all your PySpark SQL tasks, including creating dataframes using data from different file formats as well as from SQL or NoSQL databases.
What You Will Learn
- Understand PySpark SQL and its advanced features
- Use SQL and HiveQL with PySpark SQL
- Work with structured streaming
- Optimize PySpark SQL
- Master graphframes and graph processing
Who This Book Is For
Data scientists, Python programmers, and SQL programmers.
About the authors
Sundar Rajan Raman is an artificial intelligence practitioner currently working at Bank of America. He holds a Bachelor of Technology degree from the National Institute of Technology, India. A seasoned Java and J2EE programmer, he has worked on critical applications for companies such as AT&T, Singtel, and Deutsche Bank. He is also a seasoned big data architect. His current focus is on artificial intelligence, including machine learning and deep learning.
Bibliographic Information
Book Title: PySpark SQL Recipes
Book Subtitle: With HiveQL, Dataframe and Graphframes
Authors: Raju Kumar Mishra, Sundar Rajan Raman
DOI: https://doi.org/10.1007/978-1-4842-4335-0
Publisher: Apress, Berkeley, CA
eBook Packages: Professional and Applied Computing, Apress Access Books, Professional and Applied Computing (R0)
Copyright Information: Raju Kumar Mishra and Sundar Rajan Raman 2019
Softcover ISBN: 978-1-4842-4334-3 (Published: 19 March 2019)
eBook ISBN: 978-1-4842-4335-0 (Published: 18 March 2019)
Edition Number: 1
Number of Pages: XXIV, 323
Number of Illustrations: 57 b/w illustrations
Topics: Big Data, Open Source, Python, Programming Languages, Compilers, Interpreters