
Outlier Detection in Categorical, Text and Mixed Attribute Data

  • Chapter
Outlier Analysis

Abstract

A significant number of attributes in real data sets are not numerical but categorical. For example, while demographic data may contain quantitative attributes such as age, many other attributes, such as sex and zip code, are categorical. Data collected from surveys often contain responses to multiple-choice questions, which are categorical. Similarly, many kinds of data, such as the names of people and entities, IP addresses, and URLs, are inherently discrete. In many cases, categorical and numerical data appear in the same data set as different attributes. This is referred to as mixed-attribute data. Mixed data is challenging to address because it is difficult to weight the importance of the different attributes appropriately.

Copyright information

© 2013 Springer Science+Business Media New York

About this chapter

Cite this chapter

Aggarwal, C.C. (2013). Outlier Detection in Categorical, Text and Mixed Attribute Data. In: Outlier Analysis. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6396-2_7

  • DOI: https://doi.org/10.1007/978-1-4614-6396-2_7

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-6395-5

  • Online ISBN: 978-1-4614-6396-2

  • eBook Packages: Computer Science, Computer Science (R0)
