Abstract
This paper presents Active Class Selection (ACS), a new class of problems for multi-class supervised learning. When one can control the classes from which training data are generated, using feedback during learning to guide the generation of new training data yields better performance than learning from any a priori fixed class distribution. ACS is the process of iteratively selecting class proportions for data generation. In this paper we present several methods for ACS. In an empirical evaluation, we show that for a fixed number of training instances, methods based on increasing class stability outperform methods that seek to maximize class accuracy or that use random sampling. Finally, we present results of a deployed system for our motivating application: training an artificial nose to discriminate vapors.
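The iterative loop the abstract describes (select class proportions, generate data, retrain, observe feedback, repeat) can be sketched as follows. This is only an illustrative sketch, not the paper's method: `generate_samples`, the 1-D Gaussian stand-ins for vapor readings, the nearest-centroid learner, and the probe-based stability heuristic are all assumptions introduced here for the example.

```python
import random
from collections import Counter

random.seed(0)

# Hypothetical generator: in the deployed artificial-nose system this would
# trigger new vapor exposures for the requested class. Here we fake it with
# one 1-D Gaussian per class (means are arbitrary for illustration).
CLASS_MEANS = {"acetone": 0.0, "ethanol": 1.0, "toluene": 1.4}

def generate_samples(cls, n):
    return [(random.gauss(CLASS_MEANS[cls], 0.5), cls) for _ in range(n)]

def nearest_centroid_predict(train, x):
    # Tiny stand-in learner: predict the class whose training mean is closest.
    sums, counts = Counter(), Counter()
    for v, c in train:
        sums[c] += v
        counts[c] += 1
    return min(counts, key=lambda c: abs(sums[c] / counts[c] - x))

classes = list(CLASS_MEANS)
train = [s for c in classes for s in generate_samples(c, 5)]  # seed data
prev_labels = None
budget_per_round = 30

for rnd in range(5):
    # Crude stability signal: predictions on a fixed grid of probe points,
    # compared against the previous round's predictions.
    probes = [x / 10 for x in range(-10, 25)]
    labels = [nearest_centroid_predict(train, x) for x in probes]
    if prev_labels is None:
        weights = {c: 1.0 for c in classes}  # first round: uniform proportions
    else:
        changed = Counter()
        for old, new in zip(prev_labels, labels):
            if old != new:
                changed[new] += 1  # attribute the flip to the new label
        # Unstable classes (more flipped predictions) get more new data,
        # mirroring the "increase class stability" idea from the abstract.
        weights = {c: 1.0 + changed[c] for c in classes}
    prev_labels = labels
    total = sum(weights.values())
    for c in classes:
        train.extend(generate_samples(c, round(budget_per_round * weights[c] / total)))
```

The key design point the abstract argues for is visible in the loop: the class proportions for each round of data generation are chosen from feedback (here, prediction instability), rather than fixed in advance.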
© 2007 Springer-Verlag Berlin Heidelberg
Cite this paper
Lomasky, R., Brodley, C.E., Aernecke, M., Walt, D., Friedl, M. (2007). Active Class Selection. In: Kok, J.N., Koronacki, J., Mantaras, R.L.d., Matwin, S., Mladenič, D., Skowron, A. (eds) Machine Learning: ECML 2007. Lecture Notes in Computer Science, vol 4701. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74958-5_63
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74957-8
Online ISBN: 978-3-540-74958-5