Beginning Apache Pig

Big Data Processing Made Easy

  • Balaswamy Vaddeman

About this book

Introduction

Learn to use Apache Pig to develop lightweight big data applications easily and quickly. This book shows you many optimization techniques and covers every context where Pig is used in big data analytics. Beginning Apache Pig shows that Pig is easy to learn and that big data applications take relatively little time to develop with it.

The book is divided into four parts: the complete features of Apache Pig; integration with other tools; solving complex business problems; and optimization of tools.

You'll discover topics such as MapReduce and why it cannot meet every business need; the features of Pig Latin, such as data types and the operations for loading, storing, joining, grouping, and ordering data; how Pig workflows can be created; submitting Pig jobs using Hue; and working with Oozie. You'll also see how to extend the framework by writing UDFs and custom load, store, and filter functions. Finally, you'll cover different optimization techniques, such as gathering statistics about a Pig script, joining strategies, parallelism, and the role of data formats in good performance.

What You Will Learn

• Use all the features of Apache Pig
• Integrate Apache Pig with other tools
• Extend Apache Pig
• Optimize Pig Latin code
• Solve different use cases for Pig Latin

Who This Book Is For

All levels of IT professionals: architects, big data enthusiasts, engineers, developers, and big data administrators

Keywords

Apache Pig, Grunt, Pig Latin, HCatalogue, HCatalog, Pig Jobs, Hue, Apache Falcon, Macros, Hadoop

Authors and affiliations

  • Balaswamy Vaddeman (1)
  1. Hyderabad, India

Bibliographic information

  • DOI: https://doi.org/10.1007/978-1-4842-2337-6
  • Copyright Information: Balaswamy Vaddeman 2016
  • Publisher Name: Apress, Berkeley, CA
  • eBook Packages: Professional and Applied Computing
  • Print ISBN: 978-1-4842-2336-9
  • Online ISBN: 978-1-4842-2337-6