Abstract
This chapter focuses on building random forests (RF) with PySpark for classification. We will look at how random forests work and how they make predictions. Before that, we need to understand the building block of a random forest: the decision tree (DT). A decision tree can itself be used for classification or regression, but random forests typically outperform single decision tree classifiers in accuracy, for reasons covered later in the chapter. We begin with decision trees.
Copyright information
© 2019 Pramod Singh
About this chapter
Cite this chapter
Singh, P. (2019). Random Forests. In: Machine Learning with PySpark. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-4131-8_6
DOI: https://doi.org/10.1007/978-1-4842-4131-8_6
Publisher Name: Apress, Berkeley, CA
Print ISBN: 978-1-4842-4130-1
Online ISBN: 978-1-4842-4131-8
eBook Packages: Professional and Applied Computing, Apress Access Books, Professional and Applied Computing (R0)