Summary
For many data approximation problems in metrology, there are a number of competing models which can potentially fit the observed data. A crucial task is to quantify the extent to which one model performs better than others, taking into account the influence of random effects associated with the data. For example, for a given data set, we can use a series of polynomials of various degrees to fit the data using a least squares criterion. The residual sum of squares is a measure of how well the model fits the data. However, it is generally necessary to balance goodness of fit against model complexity. We consider a number of criteria that aim to do this: the Akaike information criterion (AIC), the Bayesian/Schwarz information criterion (BIC), and the AIC with a correction for small sample sizes (AICc). In this paper, we compare the performance of these criteria for polynomial regression and show that, for the examples tested, the AICc criterion performs best. A second element of model selection is to determine, from a set of feature vectors, the subset that defines a model space most suitable for describing the observed response. Since there are 2^N possible model spaces defined by N feature vectors, for even a modest number of feature vectors it is necessary to reduce or prioritise the set of candidate models. The partial least squares (PLS) and least angle regression (LARS) algorithms can be used as model reduction tools. We describe these algorithms in the context of feature selection, show how they can be used with a model selection criterion such as AICc, and illustrate their performance using simulations and an application from human sensory perception.
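To make the selection procedure concrete, the sketch below fits polynomials of increasing degree by least squares and scores each fit with AIC, BIC and AICc. It is a minimal illustration under stated assumptions, not the authors' code: it assumes independent Gaussian errors (so minus twice the log-likelihood reduces to n log(RSS/n) up to an additive constant), and the function names and the synthetic test curve are ours.

import numpy as np

def polynomial_rss(x, y, degree):
    # Least-squares fit of a degree-d polynomial via a Vandermonde design matrix
    X = np.vander(x, degree + 1)
    coeffs, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ coeffs) ** 2)

def criteria(rss, n, k):
    # AIC, BIC and AICc for n data points and k estimated parameters,
    # assuming i.i.d. Gaussian errors with maximum-likelihood variance
    ll = n * np.log(rss / n)                     # -2 log-likelihood, up to a constant
    aic = ll + 2 * k
    bic = ll + k * np.log(n)
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)   # small-sample correction
    return aic, bic, aicc

# Synthetic test data: a cubic plus noise (an illustrative choice, not from the paper)
rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 25)
y = 1.0 - 2.0 * x + 0.5 * x**3 + rng.normal(scale=0.05, size=x.size)

for degree in range(1, 8):
    k = degree + 2                               # coefficients plus the noise variance
    rss = polynomial_rss(x, y, degree)
    aic, bic, aicc = criteria(rss, x.size, k)
    print(f"degree {degree}: AIC={aic:7.2f}  BIC={bic:7.2f}  AICc={aicc:7.2f}")
# Select the degree minimising AICc rather than RSS, since RSS always
# decreases as the degree (and hence the model complexity) grows.

The same scoring idea extends to feature selection: instead of enumerating all 2^N subsets, one can score the nested sequence of candidate models produced by LARS or PLS with a criterion such as AICc and retain the minimiser.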
References
H. Akaike: A new look at the statistical model identification. IEEE Transactions on Automatic Control 19, 1974, 716–723.
A. Bialek, A.B. Forbes, T. Goodman, R. Montgomery, M. Rides, G. van der Heijden, H. van der Voet, G. Polder, and K. Overvliet: Model development to predict perceived degree of naturalness. In: IMEKO XIX World Congress, 6–11 September 2009, Lisbon, 2009.
K.P. Burnham and D.R. Anderson: Model Selection and Multimodel Inferences: A Practical Information-Theoretic Approach. 2nd edition, Springer, New York, 2002.
H. Chipman, E.I. George, and R.E. McCulloch: The practical implementation of Bayesian model selection. In: IMS Lecture Notes – Monograph Series, Vol. 38, P. Lahiri (ed.), Institute of Mathematical Statistics, Beachwood, Ohio, 2001.
B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani: Least angle regression. Annals of Statistics 32(2), 2004, 407–499.
G.H. Golub and C.F. Van Loan: Matrix Computations. Johns Hopkins University Press, Baltimore, 3rd edition, 1996.
T.G. Goodman, R. Montgomery, A. Bialek, A. Forbes, M. Rides, A. Whitaker, K. Overvliet, F. McGlone, and G. van der Heijden: The measurement of naturalness (MONAT). In: Man, Science & Measurement. Proceedings of the 12th IMEKO TC1–TC7 joint symposium, Annecy, France, September 3–5, 2008.
R. Hoffman, V.I. Minkin, and B.K. Carpenter: Ockham’s razor and chemistry. International Journal for Philosophy of Chemistry 3, 1997, 3–28.
C.M. Hurvich and C. Tsai: Regression and time series model selection in small samples. Biometrika 76, 1989, 297–307.
K. Knight and W. Fu: Asymptotics for LASSO-type estimators. Annals of Statistics 28, 2000, 1356–1378.
H. Linhart and W. Zucchini: Model Selection. Wiley, New York, 1986.
D. Madigan and A.E. Raftery: Model selection and accounting for model uncertainty in graphical models using Occam’s window. Journal of the American Statistical Association 89, 1994, 1535–1546.
R. Manne: Analysis of two partial-least-squares algorithms for multivariate calibration. Chemometrics and Intelligent Laboratory Systems 2(1-3), August 1987, 187–198.
A.J. Miller: Subset Selection in Regression. Chapman & Hall, New York, 1990.
A.E. Raftery, D. Madigan, and J.A. Hoeting: Bayesian model averaging for linear regression. Journal of the American Statistical Association 92, 1997, 179–191.
G. Schwarz: Estimating the dimension of a model. Annals of Statistics 6, 1978, 461–464.
M. Sewell: Statistical inference (and what is wrong with classical statistics). In: The Social Construction of Statistics, S.M. Harding, T. and R. Thomas (eds.), Pluto Press, London, 2008.
R. Tibshirani: Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B 58, 1996, 267–288.
L. Wasserman: Bayesian model selection and model averaging. Journal of Mathematical Psychology 44, 2000, 92–107.
A. Zellner, H.A. Keuzenkamp, and M. McAleer: Simplicity, Inference and Modelling: Keeping it Sophisticated Simple. Cambridge University Press, 2001.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yang, XS., Forbes, A.B. (2011). Model and Feature Selection in Metrology Data Approximation. In: Georgoulis, E., Iske, A., Levesley, J. (eds) Approximation Algorithms for Complex Systems. Springer Proceedings in Mathematics, vol 3. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16876-5_14
DOI: https://doi.org/10.1007/978-3-642-16876-5_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16875-8
Online ISBN: 978-3-642-16876-5