When you browse your email, you can usually tell right away whether a message is spam. Still, you probably do not enjoy spending your time identifying spam and have come to rely on a filter to do that task for you, either deleting the spam automatically or filing it in a different mailbox. An email filter is based on a set of rules applied to each incoming message, tagging it as spam or “ham” (not spam). Such a filter is an example of a supervised classification algorithm. It is formulated by studying a training sample of email messages that have been manually classified as spam or ham. Information in the header and text of each message is converted into a set of numerical variables such as the size of the email, the domain of the sender, or the presence of the word “free.” These variables are used to define rules that determine whether an incoming message is spam or ham.
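The idea of learning a rule from hand-labelled messages can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method developed in the chapter: the four training messages are invented, the helper names (`extract_features`, `fit_single_rule`, `classify`) are hypothetical, and the "filter" is the simplest possible classifier, a single threshold on one variable chosen to minimise errors on the training sample.

```python
def extract_features(message):
    """Convert a message into numerical variables.

    `message` is assumed to be a dict with hypothetical 'sender' and 'body' keys.
    """
    body = message["body"]
    return {
        "size": len(body),                                   # size of the email in characters
        "has_free": int("free" in body.lower()),             # presence of the word "free"
        "sender_is_edu": int(message["sender"].lower().endswith(".edu")),  # sender domain
    }


def fit_single_rule(rows, labels):
    """Search all one-variable threshold rules; keep the one with the fewest training errors."""
    best = None
    for name in rows[0]:
        for threshold in sorted({row[name] for row in rows}):
            for spam_if_above in (True, False):
                predictions = [
                    "spam" if (row[name] >= threshold) == spam_if_above else "ham"
                    for row in rows
                ]
                errors = sum(p != y for p, y in zip(predictions, labels))
                if best is None or errors < best[0]:
                    best = (errors, name, threshold, spam_if_above)
    _, name, threshold, spam_if_above = best
    return name, threshold, spam_if_above


def classify(rule, message):
    """Apply a fitted rule to an incoming message."""
    name, threshold, spam_if_above = rule
    value = extract_features(message)[name]
    return "spam" if (value >= threshold) == spam_if_above else "ham"


# Invented training sample, manually labelled as spam or ham.
training = [
    ({"sender": "promo@deals.example.com", "body": "Claim your FREE prize now"}, "spam"),
    ({"sender": "colleague@stat.example.edu", "body": "Draft of chapter 4 attached, comments welcome"}, "ham"),
    ({"sender": "offers@shop.example.com", "body": "Free shipping on every order today"}, "spam"),
    ({"sender": "friend@mail.example.org", "body": "Lunch on Friday?"}, "ham"),
]

rows = [extract_features(message) for message, _ in training]
labels = [label for _, label in training]
rule = fit_single_rule(rows, labels)

incoming = {"sender": "unknown@example.com", "body": "Get a free upgrade"}
print(rule, classify(rule, incoming))
```

On this toy sample the search settles on the presence of "free" as the rule, because message size alone cannot separate the two classes. A practical filter would combine many variables and would be assessed on messages held out from training.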
© 2007 Springer Science+Business Media, LLC
(2007). Supervised Classification. In: Interactive and Dynamic Graphics for Data Analysis. Use R!. Springer, New York, NY. https://doi.org/10.1007/978-0-387-71762-3_4
DOI: https://doi.org/10.1007/978-0-387-71762-3_4
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-71761-6
Online ISBN: 978-0-387-71762-3
eBook Packages: Mathematics and Statistics; Mathematics and Statistics (R0)