Abstract
A significant number of attributes in real data sets are not numerical but categorical. For example, while demographic data may contain quantitative attributes such as age, many other attributes, such as sex and zip code, are categorical. Survey data often contains responses to multiple-choice questions, which are categorical. Similarly, many kinds of data, such as the names of people and entities, IP addresses, and URLs, are inherently discrete. In many cases, categorical and numerical data appear in the same data set as different attributes; such data is referred to as mixed-attribute data. Mixed data is challenging to handle because it is difficult to weight the importance of the different attributes appropriately.
Copyright information
© 2013 Springer Science+Business Media New York
About this chapter
Cite this chapter
Aggarwal, C.C. (2013). Outlier Detection in Categorical, Text and Mixed Attribute Data. In: Outlier Analysis. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6396-2_7
DOI: https://doi.org/10.1007/978-1-4614-6396-2_7
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-6395-5
Online ISBN: 978-1-4614-6396-2
eBook Packages: Computer Science, Computer Science (R0)