# Probability and Statistics for Computer Science

• David Forsyth

### Table of Contents

1. Front Matter (pages i-xxiv)
2. Describing Datasets
   - Front Matter (page 1)
   - Chapters (pages 3-27 and 29-50)
3. Probability
   - Front Matter (page 51)
   - Chapters (pages 53-85, 87-114, and 115-137)
4. Inference
   - Front Matter (page 139)
   - Chapters (pages 141-157, 159-177, 179-196, and 197-222)
5. Tools
   - Front Matter (page 223)
   - Chapters (pages 225-252, 253-279, 281-304, 305-330, and 331-351)
6. Mathematical Bits and Pieces
   - Front Matter (page 353)

### Introduction

This textbook is aimed at computer science undergraduates in their late sophomore or early junior year. It supplies a comprehensive background in qualitative and quantitative data analysis, probability, random variables, and statistical methods, including machine learning.

With careful treatment of topics that meet the curricular needs of the course, Probability and Statistics for Computer Science features:

•   A treatment of random variables and expectations dealing primarily with the discrete case.

•   A practical treatment of simulation, showing how many interesting probabilities and expectations can be extracted, with particular emphasis on Markov chains.

•   A clear, crisp account of simple point inference strategies (maximum likelihood; Bayesian inference) in straightforward contexts, extended to cover some confidence intervals, samples and populations for random sampling with replacement, and the simplest hypothesis testing.

•   A chapter dealing with classification, explaining why it’s useful; how to train SVM classifiers with stochastic gradient descent; and how to use implementations of more advanced methods such as random forests and nearest neighbors.

•   A chapter dealing with regression, explaining how to set up, use and understand linear regression and nearest neighbors regression in practical problems.

•   A chapter dealing with principal components analysis, developing intuition carefully, and including numerous practical examples. There is a brief description of multivariate scaling via principal coordinate analysis.

•   A chapter dealing with clustering via agglomerative methods and k-means, showing how to build vector quantized features for complex signals.
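As a taste of the simulation approach described above, here is a minimal pure-Python sketch (the two-state chain and all names are illustrative, not taken from the book): it runs a Markov chain and estimates the long-run fraction of time spent in each state.

```python
import random

# Two-state chain with transition matrix P = [[0.9, 0.1], [0.5, 0.5]]:
# row i gives the probabilities of moving from state i to states 0 and 1.
P = {0: [0.9, 0.1], 1: [0.5, 0.5]}

def simulate(steps, seed=0):
    """Run the chain and return the fraction of time spent in each state."""
    rng = random.Random(seed)
    state, visits = 0, [0, 0]
    for _ in range(steps):
        visits[state] += 1
        state = 0 if rng.random() < P[state][0] else 1
    return [v / steps for v in visits]

print(simulate(100_000))
```

For this transition matrix the stationary distribution solves pi P = pi, giving pi = (5/6, 1/6), so the simulated fractions should come out close to (0.833, 0.167).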
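The inference material pairs maximum-likelihood estimates with confidence intervals; a minimal sketch for a Bernoulli parameter (function name and data are illustrative only) looks like this:

```python
import math

def mle_and_ci(successes, n, z=1.96):
    """Maximum-likelihood estimate p_hat = k/n for a Bernoulli parameter,
    with the normal-approximation 95% confidence interval
    p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, (p_hat - half, p_hat + half)

# 60 heads in 100 flips: MLE 0.6, interval roughly (0.504, 0.696).
p_hat, (lo, hi) = mle_and_ci(60, 100)
print(p_hat, lo, hi)
```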
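The classification bullet mentions training SVM classifiers with stochastic gradient descent. A Pegasos-style sketch of that idea on a toy 2D dataset (all data and names are made up for illustration; this is not the book's code) might look like:

```python
import random

def train_svm_sgd(data, epochs=100, lam=0.01, seed=0):
    """SGD on the regularized hinge loss
    lam/2 * ||w||^2 + mean(max(0, 1 - y * (w . x + b)))."""
    rng = random.Random(seed)
    w, b, t = [0.0, 0.0], 0.0, 0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            t += 1
            eta = 1.0 / (lam * t)          # decreasing step size
            margin = y * (w[0] * x[0] + w[1] * x[1] + b)
            # The regularizer shrinks w at every step; the hinge term
            # contributes only when the example is inside the margin.
            w = [wi - eta * lam * wi for wi in w]
            if margin < 1:
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                b += eta * y
    return w, b

# Toy separable data: label +1 when x0 + x1 > 0, else -1.
pts = [((1.0, 1.0), 1), ((2.0, 0.5), 1), ((-1.0, -1.0), -1), ((-0.5, -2.0), -1)]
w, b = train_svm_sgd(list(pts))
print(w, b)
```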
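Setting up linear regression in the one-variable case reduces to two closed-form expressions; a minimal sketch (names are illustrative) is:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b:
    a = cov(x, y) / var(x), b = mean(y) - a * mean(x)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

# Noiseless data on y = 2x + 1 recovers the line exactly.
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)  # 2.0 1.0
```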
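The core computation behind principal components analysis is finding the leading eigenvector of the data's covariance matrix. A pure-Python sketch for 2D data, using power iteration (an assumption for illustration; the book may use a different method), is:

```python
import math

def first_pc(points, iters=100):
    """First principal component of 2D data: the leading eigenvector of
    the covariance matrix, found here by power iteration."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centered = [(x - mx, y - my) for x, y in points]
    # Entries of the 2x2 covariance matrix.
    cxx = sum(x * x for x, _ in centered) / n
    cxy = sum(x * y for x, y in centered) / n
    cyy = sum(y * y for _, y in centered) / n
    v = (1.0, 0.0)
    for _ in range(iters):
        w = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
        norm = math.hypot(w[0], w[1])
        v = (w[0] / norm, w[1] / norm)
    return v

# Data stretched along the diagonal: the first PC is near (1, 1)/sqrt(2).
pc = first_pc([(0, 0), (1, 1.1), (2, 1.9), (3, 3.05)])
print(pc)
```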
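Finally, the k-means clustering named above alternates two steps: assign each point to its nearest center, then move each center to the mean of its assigned points. A one-dimensional sketch (illustrative only, not the book's implementation):

```python
def kmeans_1d(points, centers, iters=20):
    """Lloyd's algorithm in one dimension."""
    for _ in range(iters):
        clusters = {i: [] for i in range(len(centers))}
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Empty clusters keep their old center.
        centers = [sum(pts) / len(pts) if pts else centers[i]
                   for i, pts in clusters.items()]
    return sorted(centers)

# Two well-separated groups around 0 and 10.
print(kmeans_1d([0.0, 0.5, 1.0, 9.0, 10.0, 11.0], centers=[0.0, 1.0]))
# [0.5, 10.0]
```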

Illustrated throughout, each main chapter includes many worked examples and other pedagogical elements such as boxed Procedures, Definitions, Useful Facts, and Remember This (short tips). Problems and Programming Exercises appear at the end of each chapter, along with a summary of what the reader should know.

Instructor resources include a full set of model solutions for all problems, and an Instructor's Manual with accompanying presentation slides.

### Keywords

Summarizing 1D data · Boxplots · Datasets · Spatial data · Random variables · Conditional probability · Expected values · Discrete distributions · Continuous distributions · Confidence intervals · Clustering · Markov chains · Regression · Significance of evidence

### Authors and affiliations

• David Forsyth, Computer Science Department, University of Illinois at Urbana-Champaign, Urbana, USA

### Bibliographic information

• DOI https://doi.org/10.1007/978-3-319-64410-3
• Copyright Information Springer International Publishing AG 2018
• Publisher Name Springer, Cham
• eBook Packages Computer Science
• Print ISBN 978-3-319-64409-7
• Online ISBN 978-3-319-64410-3