On Combining Boosting with Rule-Induction for Automated Fruit Grading
The automation of post-harvest fruit grading is an industrial problem receiving considerable attention in computer vision and machine learning. Achieving high classification accuracy with automated systems in this domain is challenging because of the inherent variability in the visual appearance of fruit and of its quality-determining features. While accuracy is of paramount importance, the usability and interpretability of machine learning solutions for operators are also crucial, since many sophisticated algorithms involve numerous tunable parameters and often behave as “black boxes”. This research presents a generalizable machine learning solution that balances the need for high accuracy with usability by decomposing the problem into sub-tasks. A powerful boosting algorithm with low interpretability (AdaBoost.ECC) is employed to learn fruit-surface characteristics. The classification outputs of boosting then become inputs to rule-induction algorithms (RIPPER and FURIA), which generate human-interpretable rule sets that are amenable to review and revision by operators. Using seven datasets of different fruit varieties, the performance of the proposed method was compared against a manually calibrated commercial fruit-grading system. The results showed that the proposed system is able to match the accuracy of machines calibrated by domain experts with many years of experience, while producing simpler rule sets that are highly interpretable and usable and that support knowledge discovery.
Keywords: AdaBoost.ECC, Boosting, Classification, Decomposition strategies, Fruit grading, FURIA, Machine learning, RIPPER, Rule-induction
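The two-stage decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a plain binary AdaBoost over decision stumps stands in for AdaBoost.ECC, and a single-threshold rule learner stands in for RIPPER and FURIA. The feature names (darkness, roughness), the toy patches and fruits, and all function names are assumptions made up for illustration.

```python
import math

def stump_predict(x, f, thr, pol):
    """Stump output in {-1, +1}: pol if feature f >= thr, else -pol."""
    return pol if x[f] >= thr else -pol

def train_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({x[f] for x in X}):
            for pol in (1, -1):
                err = sum(wi for wi, x, yi in zip(w, X, y)
                          if stump_predict(x, f, thr, pol) != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best

def adaboost(X, y, rounds=5):
    """Stage 1 (stand-in): binary AdaBoost, labels in {-1, +1}."""
    w = [1.0 / len(X)] * len(X)
    model = []
    for _ in range(rounds):
        err, f, thr, pol = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, f, thr, pol))
        # Re-weight so misclassified patches matter more in the next round.
        w = [wi * math.exp(-alpha * yi * stump_predict(x, f, thr, pol))
             for wi, x, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    """Weighted vote of the stumps: +1 = defect patch, -1 = sound patch."""
    score = sum(a * stump_predict(x, f, thr, pol) for a, f, thr, pol in model)
    return 1 if score >= 0 else -1

def induce_rule(counts, grades):
    """Stage 2 (stand-in): choose t for 'IF defect_patches >= t THEN reject'."""
    candidates = sorted(set(counts)) + [max(counts) + 1]
    return min(candidates,
               key=lambda t: sum((c >= t) != (g == "reject")
                                 for c, g in zip(counts, grades)))

# Hypothetical training patches: (darkness, roughness), +1 = defect.
patches = [(0.8, 0.7), (0.9, 0.6), (0.7, 0.9), (0.85, 0.8),
           (0.2, 0.3), (0.1, 0.4), (0.3, 0.2), (0.25, 0.1)]
labels = [1, 1, 1, 1, -1, -1, -1, -1]
model = adaboost(patches, labels)

# Hypothetical graded fruits, each a list of surface patches.
fruits = [
    ([(0.2, 0.3), (0.15, 0.2), (0.1, 0.1)], "accept"),
    ([(0.9, 0.8), (0.2, 0.2), (0.3, 0.3)], "accept"),
    ([(0.8, 0.9), (0.75, 0.7), (0.2, 0.1)], "reject"),
    ([(0.9, 0.9), (0.85, 0.6), (0.95, 0.7)], "reject"),
]

# The boosted classifier's outputs become the rule learner's inputs.
counts = [sum(predict(model, p) == 1 for p in ps) for ps, _ in fruits]
threshold = induce_rule(counts, [g for _, g in fruits])
print(f"IF defect_patches >= {threshold} THEN reject ELSE accept")
```

The point of the split is that the opaque part (the boosted patch classifier) never has to be read by an operator, while the part that encodes the grading policy is a short rule that can be reviewed and adjusted by hand, for example by raising the threshold to tolerate more blemishes.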
The authors express their gratitude to Compac Sorting Ltd. for providing access to their datasets, grading maps and the necessary software for these experiments.