Applied Predictive Modeling

  • Max Kuhn
  • Kjell Johnson

Table of contents

  1. Front Matter
  2. General Strategies
  3. Regression Models
  4. Classification Models

About this book

Introduction

This text is intended for a broad audience as both an introduction to predictive models and a guide to applying them. Non-mathematical readers will appreciate the intuitive explanations of the techniques, while an emphasis on problem-solving with real data across a wide variety of applications will aid practitioners who wish to extend their expertise. Readers should have knowledge of basic statistical ideas, such as correlation and linear regression analysis. While the text generally avoids complex equations, a mathematical background is needed for advanced topics.

Dr. Kuhn is a Director of Non-Clinical Statistics at Pfizer Global R&D in Groton, Connecticut. He has been applying predictive models in the pharmaceutical and diagnostic industries for over 15 years and is the author of a number of R packages.

Dr. Johnson has more than a decade of statistical consulting and predictive modeling experience in pharmaceutical research and development. He is a co-founder of Arbor Analytics, a firm specializing in predictive modeling, and is a former Director of Statistics at Pfizer Global R&D. His scholarly work centers on the application and development of statistical methodology and learning algorithms.

Applied Predictive Modeling covers the overall predictive modeling process, beginning with the crucial steps of data preprocessing, data splitting, and the foundations of model tuning. The text then provides intuitive explanations of numerous common and modern regression and classification techniques, always with an emphasis on illustrating and solving real data problems. Addressing practical concerns extends beyond model fitting to topics such as handling class imbalance, selecting predictors, and pinpointing causes of poor model performance, all of which are problems that occur frequently in practice.
 
The text illustrates all parts of the modeling process through many hands-on, real-life examples, and every chapter contains extensive R code for each step of the process. The data sets and corresponding code are available in the book’s companion AppliedPredictiveModeling R package, which is freely available on CRAN.
 
This multi-purpose text can be used as an introduction to predictive models and the overall modeling process, as a practitioner’s reference handbook, or as a text for advanced undergraduate or graduate-level predictive modeling courses. To that end, each chapter contains problem sets that help solidify the covered concepts and use data available in the book’s R package.
 
Readers and students interested in implementing the methods should have some basic knowledge of R, and a handful of the more advanced topics require some mathematical knowledge.

Keywords

Model Non-Linear Predictive Models R Regression Models Regression Trees

Authors and affiliations

  • Max Kuhn (1)
  • Kjell Johnson (2)

  1. Division of Nonclinical Statistics, Pfizer Global Research and Development, Groton, USA
  2. Arbor Analytics, Saline, USA

Bibliographic information

  • DOI: https://doi.org/10.1007/978-1-4614-6849-3
  • Copyright Information: Springer Science+Business Media New York 2013
  • Publisher Name: Springer, New York, NY
  • eBook Packages: Mathematics and Statistics
  • Print ISBN: 978-1-4614-6848-6
  • Online ISBN: 978-1-4614-6849-3