Model and Feature Selection in Metrology Data Approximation

  • Conference paper

Approximation Algorithms for Complex Systems

Part of the book series: Springer Proceedings in Mathematics (PROM, volume 3)

Summary

For many data approximation problems in metrology, there are a number of competing models which can potentially fit the observed data. A crucial task is to quantify the extent to which one model performs better than others, taking into account the influence of random effects associated with the data. For example, for a given data set, we can use a series of polynomials of various degrees to fit the data using a least squares criterion. The residual sum of squares is a measure of how well the model fits the data. However, it is generally required to balance goodness of fit with minimising the model complexity. We consider a number of criteria that aim to do this: the Akaike information criterion (AIC), the Bayesian/Schwarz information criterion (BIC), and the AIC with a correction for small sample size (AICc). In this paper, we compare the performance of these criteria for polynomial regression and show that, for the examples tested, the AICc criterion performs best. A second element of model selection is to determine, from a set of feature vectors, the subset that defines a model space most suitable for describing the observed response. Since there are 2^N possible model spaces defined by N feature vectors, for even a modest number of feature vectors it is necessary to reduce or prioritise the number of candidate models. Partial least squares and the least angle regression algorithms can be used as model reduction tools. We describe these algorithms in the context of feature selection, show how they can be used with a model selection criterion such as AICc, and illustrate their performance in simulations and in an application from human sensory perception.
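
To make the first part of the summary concrete, here is a minimal Python sketch, not code from the paper: the synthetic cubic test data are invented, and the standard least-squares forms of the criteria are assumed. It fits polynomials of increasing degree and scores each fit with AIC, BIC and AICc:

```python
# A minimal sketch, not the authors' code: the synthetic cubic data and the
# standard least-squares forms of the criteria below are assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: a cubic signal with additive Gaussian noise.
n = 30
x = np.linspace(-1.0, 1.0, n)
y = 0.5 - x + 2.0 * x**3 + rng.normal(scale=0.1, size=n)

for degree in range(1, 8):
    coeffs = np.polyfit(x, y, degree)               # least-squares polynomial fit
    rss = np.sum((np.polyval(coeffs, x) - y) ** 2)  # residual sum of squares
    k = degree + 1          # fitted coefficients (some conventions add 1
                            # for the estimated noise variance)
    aic = n * np.log(rss / n) + 2 * k
    bic = n * np.log(rss / n) + k * np.log(n)
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)      # small-sample correction
    print(f"degree {degree}: RSS={rss:.4f}  AIC={aic:.1f}  "
          f"BIC={bic:.1f}  AICc={aicc:.1f}")
```

For small n the AICc penalty 2k(k+1)/(n-k-1) grows quickly with the model order, which is why it guards against overfitting more strongly than plain AIC on short data sets.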
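For the second part, one hedged illustration of pairing least angle regression (LARS) with AICc follows; scikit-learn's lars_path is used as a stand-in implementation, and the data are synthetic, since the paper does not prescribe either:

```python
# A hedged sketch of using LARS to prioritise feature subsets and AICc to
# choose among them; lars_path and the synthetic data are assumptions.
import numpy as np
from sklearn.linear_model import lars_path

rng = np.random.default_rng(1)

# Response depending on 3 of N = 10 candidate features (illustrative only).
n, N = 50, 10
X = rng.normal(size=(n, N))
y = X[:, 0] - 2.0 * X[:, 3] + 0.5 * X[:, 7] + rng.normal(scale=0.1, size=n)

# LARS records the order in which features enter the active set, reducing
# the 2**N possible subsets to at most N nested candidates.
_, order, _ = lars_path(X, y, method="lar")

best_aicc, best_subset = np.inf, None
for m in range(1, len(order) + 1):
    subset = list(order[:m])
    Xs = X[:, subset]
    coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)   # least-squares refit
    rss = np.sum((Xs @ coef - y) ** 2)
    k = m                   # fitted coefficients (no intercept in this sketch)
    aicc = n * np.log(rss / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)
    if aicc < best_aicc:
        best_aicc, best_subset = aicc, subset

print("AICc-selected features:", sorted(best_subset))
```

Because LARS supplies an entry order for the features, only the N nested subsets along its path need to be scored, rather than all 2^N possible subsets.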

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yang, X.S., Forbes, A.B. (2011). Model and Feature Selection in Metrology Data Approximation. In: Georgoulis, E., Iske, A., Levesley, J. (eds) Approximation Algorithms for Complex Systems. Springer Proceedings in Mathematics, vol 3. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16876-5_14
