Abstract
In real-world applications, data usually contain errors and noise, need to be scaled and transformed, or need to be collected from different and possibly heterogeneous information sources. We distinguish between deterministic and stochastic errors. Deterministic errors can sometimes be corrected easily. Inliers and outliers may be identified and then removed or corrected. Inliers, outliers, and noise can be reduced by filtering. We distinguish many filtering methods that differ in effectiveness and computational complexity: moving statistical measures, discrete linear filters, finite impulse response filters, and infinite impulse response filters. Data features with different ranges often need to be standardized or transformed.
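As a rough illustration of the ideas named above, the following Python sketch applies a moving mean and a moving median (two moving statistical measures), a first-order exponential filter (chosen here as one simple example of an infinite impulse response filter), and z-score standardization to a short series with a single outlier. The function names, the window order q, the smoothing parameter eta, the shrinking windows at the edges, and the use of the population standard deviation are all assumptions made for this example; they are not taken from the chapter.

```python
from statistics import mean, median, pstdev


def moving_filter(x, q, stat=mean):
    """Apply `stat` over a centered window of odd order q.

    Assumption: windows are simply truncated at both ends of the series.
    """
    h = q // 2
    return [stat(x[max(0, k - h): k + h + 1]) for k in range(len(x))]


def exponential_filter(x, eta):
    """First-order IIR filter: y[k] = y[k-1] + eta * (x[k] - y[k-1]).

    Assumption: the filter is initialized with the first sample.
    """
    y = [x[0]]
    for xk in x[1:]:
        y.append(y[-1] + eta * (xk - y[-1]))
    return y


def standardize(x):
    """Z-score standardization: subtract the mean, divide by the
    (population) standard deviation."""
    m, s = mean(x), pstdev(x)
    return [(v - m) / s for v in x]


# Hypothetical series: constant level with one outlier at index 3.
signal = [1.0, 1.0, 1.0, 9.0, 1.0, 1.0, 1.0]

smoothed_mean = moving_filter(signal, 3, mean)      # spreads the outlier over its neighbors
smoothed_median = moving_filter(signal, 3, median)  # suppresses the isolated outlier
smoothed_exp = exponential_filter(signal, 0.5)      # outlier decays over later samples
z = standardize(signal)                             # zero mean, unit variance
```

In this toy case the moving median removes the isolated outlier entirely, the moving mean only spreads it across neighboring samples, and the exponential filter lets it decay gradually, which hints at why the methods differ in effectiveness.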
Copyright information
© 2020 Springer Fachmedien Wiesbaden GmbH, part of Springer Nature
About this chapter
Cite this chapter
Runkler, T.A. (2020). Data Preprocessing. In: Data Analytics. Springer Vieweg, Wiesbaden. https://doi.org/10.1007/978-3-658-29779-4_3
DOI: https://doi.org/10.1007/978-3-658-29779-4_3
Publisher Name: Springer Vieweg, Wiesbaden
Print ISBN: 978-3-658-29778-7
Online ISBN: 978-3-658-29779-4
eBook Packages: Computer Science and Engineering (German Language)