Using Data Mining for Wine Quality Assessment
Certification and quality assessment are crucial issues within the wine industry. Currently, wine quality is mostly assessed by physicochemical (e.g., alcohol levels) and sensory (e.g., human expert evaluation) tests. In this paper, we propose a data mining approach to predict wine preferences that is based on easily available analytical tests at the certification step. We consider a large dataset of white vinho verde samples from the Minho region of Portugal. Wine quality is modeled under a regression approach, which preserves the order of the grades. Explanatory knowledge is given in terms of a sensitivity analysis, which measures how the response changes when a given input variable is varied through its domain. Three regression techniques were applied, under a computationally efficient procedure that performs simultaneous variable and model selection and that is guided by the sensitivity analysis. The support vector machine achieved promising results, outperforming the multiple regression and neural network methods. Such a model is useful for understanding how physicochemical tests affect sensory preferences. Moreover, it can support the wine expert evaluations and ultimately improve production.
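The sensitivity analysis described above can be illustrated with a minimal sketch. It is not the authors' implementation: holding the other inputs at their mean, using five evenly spaced levels, and scoring each input by the variance of the responses are all assumptions made here for illustration, and `toy_model`, its coefficients, and the sample rows are hypothetical.

```python
from statistics import mean, pvariance


def sensitivity(model, rows, levels=5):
    """Return one relative-importance score per input column.

    Each input is swept over `levels` evenly spaced values between its
    observed minimum and maximum while the other inputs are held at
    their mean (an assumption of this sketch). The score is the variance
    of the model responses, normalised so that all scores sum to 1.
    """
    n_inputs = len(rows[0])
    columns = [[row[j] for row in rows] for j in range(n_inputs)]
    baseline = [mean(col) for col in columns]

    variances = []
    for j, col in enumerate(columns):
        lo, hi = min(col), max(col)
        step = (hi - lo) / (levels - 1)
        responses = []
        for k in range(levels):
            point = list(baseline)
            point[j] = lo + k * step  # vary only input j
            responses.append(model(point))
        variances.append(pvariance(responses))

    total = sum(variances)
    return [v / total if total else 0.0 for v in variances]


def toy_model(x):
    """Hypothetical fitted model: ignores its third input entirely."""
    alcohol, sulphates, _unused = x
    return 0.5 * alcohol + 2.0 * sulphates


# Hypothetical samples: (alcohol, sulphates, an irrelevant measurement)
rows = [[9.0, 0.4, 1.0], [11.0, 0.6, 3.0], [13.0, 0.5, 2.0]]
scores = sensitivity(toy_model, rows)
```

In this toy setting the irrelevant third input receives a score of zero, which is the kind of signal a variable selection procedure could use to discard inputs. Any fitted regressor that maps a list of inputs to a number could be passed in place of `toy_model`.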
Keywords: Ordinal Regression, Sensitivity Analysis, Sensory Preferences, Support Vector Machines, Variable and Model Selection, Wine Science