
Single-pass active learning with conflict and ignorance

Original Paper · Published in Evolving Systems

Abstract

In this paper, we present a new methodology for conducting active learning in a single-pass on-line learning context. Single-pass active learning can be understood as an approach for reducing the annotation effort of users and operators in on-line classification problems, in which the true class labels of new incoming samples are usually unknown. This reduction is achieved by selecting the most informative samples, that is, those that contribute most to improving the predictive performance of incremental classifiers. Our approach builds upon certainty-based sample selection in connection with version-space reduction. Two new reliability concepts were investigated and developed in connection with evolving fuzzy classifiers: conflict and ignorance. Conflict models the extent to which a new query point lies in the conflict region between two or more classes and therefore reflects the certainty of the classifier’s prediction. Ignorance represents the distance of a new query point from the training samples seen so far; in extended form, it integrates the actual variability of the version space. The choice of model architecture used for on-line classification scenarios (evolving fuzzy classifier) is clearly motivated in the paper. Results on real-world binary and multi-class streaming classification data show that our single-pass active learning approach yields evolving classifiers whose performance is similar to that of classifiers using all samples for adaptation, while the annotation effort in terms of the number of class label requests is reduced by up to 90 %.



Notes

  1. http://archive.ics.uci.edu/ml/datasets/Spambase.

  2. Results on CD imprint data could not be included as the state-of-the-art active learning software crashed after some iterations on this data set.

References

  • Angelov P, Filev D, Kasabov N (2010a) Editorial to evolving systems. Evol Syst 1(1):1–2

  • Angelov P, Filev D, Kasabov N (2010b) Evolving intelligent systems—methodology and applications. Wiley, New York

  • Angelov P, Lughofer E, Zhou X (2008) Evolving fuzzy classifiers using different model architectures. Fuzzy Sets Syst 159(23):3160–3182

  • Angelov P, Zhou X (2008) Evolving fuzzy-rule-based classifiers from data streams. IEEE Trans Fuzzy Syst 16(6):1462–1475

  • Backer SD, Scheunders P (2001) Texture segmentation by frequency-sensitive elliptical competitive learning. Image Vis Comput 19(9–10):639–648

  • Bifet A, Holmes G, Kirkby R, Pfahringer B (2010) MOA: massive online analysis. J Mach Learn Res 11:1601–1604

  • Bifet A, Kirkby R (2011) Data stream mining—a practical approach. Technical report, Department of Computer Science, University of Waikato, New Zealand

  • Bordes A, Ertekin S, Weston J, Bottou L (2005) Fast kernel classifiers with online and active learning. J Mach Learn Res 6:1579–1619

  • Bouchachia A (2009) Incremental induction of classification fuzzy rules. In: IEEE workshop on evolving and self-developing intelligent systems (ESDIS) 2009. Nashville, USA, pp 32–39

  • Bouchachia A (2010) An evolving classification cascade with self-learning. Evol Syst 1(3):143–160

  • Bouchachia A, Mittermeir R (2006) Towards incremental fuzzy classifiers. Soft Comput 11(2):193–207

  • Chu W, Zinkevich M, Li L, Thomas A, Zheng B (2011) Unbiased online active learning in data streams. In: Proceedings of the KDD 2011. San Diego, California

  • Cohn D, Atlas L, Ladner R (1994) Improving generalization with active learning. Mach Learn 15(2):201–221

  • Cohn D, Ghahramani Z, Jordan M (1996) Active learning with statistical models. J Artif Intell Res 4(1):129–145

  • Condurache A (2002) A two-stage-classifier for defect classification in optical media inspection. In: Proceedings of the 16th international conference on pattern recognition (ICPR’02), vol 4. Quebec City, Canada, pp 373–376

  • Dagan I, Engelson S (1995) Committee-based sampling for training probabilistic classifiers. In: Proceedings of the 12th international conference on machine learning, pp 150–157

  • Diehl C, Cauwenberghs G (2003) SVM incremental learning, adaptation and optimization. In: Proceedings of the international joint conference on neural networks, vol 4. Boston, pp 2685–2690

  • Domingos P, Hulten G (2000) Mining high-speed data streams. In: Proceedings of the sixth ACM SIGKDD international conference on knowledge discovery and data mining. Boston, MA, pp 71–80

  • Domingos P, Hulten G (2001) Catching up with the data: research issues in mining data streams. In: Proceedings of the workshop on research issues in data mining and knowledge discovery. Santa Barbara, CA

  • Donmez P, Carbonell J (2008) Proactive learning: cost-sensitive active learning with multiple imperfect oracles. In: Proceedings of the CIKM 2008 conference. Napa Valley, California

  • Eitzinger C, Heidl W, Lughofer E, Raiser S, Smith J, Tahir M, Sannen D, van Brussel H (2010) Assessment of the influence of adaptive components in trainable surface inspection systems. Mach Vis Appl 21(5):613–626

  • Fukumizu K (2000) Statistical active learning in multilayer perceptrons. IEEE Trans Neural Netw 11(1):17–26

  • Fürnkranz J (2002) Round robin classification. J Mach Learn Res 2:721–747

  • Gacto M, Alcala R, Herrera F (2011) Interpretability of linguistic fuzzy rule-based systems: an overview of interpretability measures. Inf Sci 181(20):4340–4360

  • Gama J (2010) Knowledge discovery from data streams. Chapman and Hall/CRC, Boca Raton

  • Hartert L, Sayed-Mouchaweh M, Billaudel P (2010) A semi-supervised dynamic version of fuzzy k-nearest neighbors to monitor evolving systems. Evol Syst 1(1):3–15

  • Hisada M, Ozawa S, Zhang K, Kasabov N (2010) Incremental linear discriminant analysis for evolving feature spaces in multitask pattern recognition problems. Evol Syst 1(1):17–27

  • Hu W, Hu W, Xi N, Maybank S (2009) Unsupervised active learning based on hierarchical graph-theoretic clustering. IEEE Trans Syst Man Cybern Part B Cybern 39(5):1147–1161

  • Hühn J, Hüllermeier E (2009) FR3: a fuzzy rule learner for inducing reliable classifiers. IEEE Trans Fuzzy Syst 17(1):138–149

  • Hüllermeier E, Brinker K (2008) Learning valued preference structures for solving classification problems. Fuzzy Sets Syst 159(18):2337–2352

  • Ishibuchi H, Nakashima T (2001) Effect of rule weights in fuzzy rule-based classification systems. IEEE Trans Fuzzy Syst 9(4):506–515

  • Jackson P (1999) Introduction to expert systems. Addison Wesley Pub Co Inc., Edinburgh Gate

  • Kruse R, Gebhardt J, Palm R (1994) Fuzzy systems in computer science. Verlag Vieweg, Wiesbaden

  • Kuncheva L (2000) Fuzzy classifier design. Physica-Verlag, Heidelberg

  • Lemos A, Caminhas W, Gomide F (2012) Adaptive fault detection and diagnosis using an evolving fuzzy classifier. Inf Sci (in press). doi:10.1016/j.ins.2011.08.030

  • Leng G, McGinnity T, Prasad G (2005) An approach for on-line extraction of fuzzy rules using a self-organising fuzzy neural network. Fuzzy Sets Syst 150(2):211–243

  • Lewis D, Catlett J (1994) Heterogeneous uncertainty sampling for supervised learning. In: Proceedings of the 11th international conference on machine learning. New Brunswick, New Jersey, pp 148–156

  • Lughofer E (2011) All-pairs evolving fuzzy classifiers for on-line multi-class classification problems. In: Proceedings of the EUSFLAT 2011 conference. Elsevier, Aix-Les-Bains, France, pp 372–379

  • Lughofer E (2011) Evolving fuzzy systems—methodologies, advanced concepts and applications. Springer, Berlin

  • Lughofer E (2011) On-line incremental feature weighting in evolving fuzzy classifiers. Fuzzy Sets Syst 163(1):1–23

  • Lughofer E, Angelov P, Zhou X (2007) Evolving single- and multi-model fuzzy classifiers with FLEXFIS-Class. In: Proceedings of FUZZ-IEEE 2007. London, UK, pp 363–368

  • Lughofer E, Bouchot JL, Shaker A (2011) On-line elimination of local redundancies in evolving fuzzy systems. Evol Syst 2(3):165–187

  • Lughofer E, Smith JE, Caleb-Solly P, Tahir M, Eitzinger C, Sannen D, Nuttin M (2009) Human-machine interaction issues in quality control based on on-line image classification. IEEE Trans Syst Man Cybern Part A Syst Hum 39(5):960–971

  • Muslea I (2000) Active learning with multiple views. PhD thesis, University of Southern California

  • Nauck D, Kruse R (1998) NEFCLASS-X—a soft computing tool to build readable fuzzy classifiers. BT Technol J 16(3):180–190

  • Oza NC, Russell S (2001) Online bagging and boosting. In: Proceedings of the 8th international workshop on artificial intelligence and statistics (AISTATS 2001). Morgan Kaufmann, Key West, Florida, pp 105–112

  • Pang S, Ozawa S, Kasabov N (2005) Incremental linear discriminant analysis for classification of data streams. IEEE Trans Syst Man Cybern Part B Cybern 35(5):905–914

  • Roy N, McCallum A (2001) Toward optimal active learning through sampling estimation of error reduction. In: Proceedings of the 18th international conference on machine learning. Morgan Kaufmann, pp 441–448

  • Schölkopf B, Smola A (2002) Learning with kernels: support vector machines, regularization, optimization and beyond. MIT Press, London

  • Sculley D (2007) Online active learning methods for fast label-efficient spam filtering. In: Proceedings of the fourth conference on email and anti-spam. Mountain View, California

  • Settles B (2010) Active learning literature survey. Computer Sciences Technical Report 1648, University of Wisconsin–Madison

  • Shilton A, Palaniswami M, Ralph D, Tsoi AC (2005) Incremental training of support vector machines. IEEE Trans Neural Netw 16(1):114–131

  • Thompson C, Califf M, Mooney R (1999) Active learning for natural language parsing and information extraction. In: Proceedings of the 16th international conference on machine learning. Bled, Slovenia, pp 406–414

  • Tong S, Koller D (2001) Support vector machine active learning with applications to text classification. J Mach Learn Res 2:45–66

  • Tuia D, Volpi M, Copa L, Kanevski M, Muñoz-Marí J (2011) A survey of active learning algorithms for supervised remote sensing image classification. IEEE J Sel Topics Signal Process 5(3):606–617

  • Utgoff P (1989) Incremental induction of decision trees. Mach Learn 4(2):161–186

  • Vapnik V (1998) Statistical learning theory. Wiley, New York

  • Varma M, Zisserman A (2004) Unifying statistical texture classification frameworks. Image Vis Comput 22:1175–1183

  • Zvarova J (2006) Decision support systems in medicine. In: Zielinski K, Duplaga M, Ingram D (eds) Information technology solutions for healthcare. Springer, London, pp 182–204


Author information


Correspondence to Edwin Lughofer.

Additional information

This work was funded by the Austrian fund for promoting scientific research (FWF, contract number I328-N23, acronym IREFS). This publication reflects only the authors’ views.

Appendix

Here we provide the pseudo-code for our single-pass active learning scheme using evolving fuzzy classifiers with single and multi-model (all-pairs) architecture.

Algorithm 1 Single-Pass Active Learning for Binary Classification Problems

  1. Input: current evolved classifier \({\mathbb{C}}\) [rules in the form (4)] with C rules; class frequency matrix h (yields \(conf_{ij}\)); current accumulated accuracy of the classifier Acc(N − 1) (Acc(0) = 0); flag denoting whether ignorance is used (=1) or not (=0); thr as threshold for the conflict level; new input query sample \({\bf x}_N\) at time instance N.

  2. Predict the class label L of \({\bf x}_N\) after (9) using \({\mathbb{C}}\).

  3. Obtain the confidence = conflict degree \(conf_L \in [0.5,1]\) of the prediction after (7).

  4. IF flag == 1

     (a) Calculate the ignorance degree after (10).

     (b) IF ignorance condition (11) does not hold, THEN set \(ign_s = 1\), ELSE set \(ign_s = 0\).

  5. ELSE

     (a) \(ign_s = 0\)

  6. END IF

  7. IF \(ign_s == 1\) OR \(conf_L < thr\)

     (a) Get the real class label \(y_N\) of \({\bf x}_N\).

     (b) Update the accumulated one-step-ahead accuracy Acc(N) by (25).

     (c) UpdateClassifier(\({\mathbb{C}}, h, \{{\bf x}_N, y_N\}\)).

  8. (From time to time, evaluate the classifier with a separate test set.)

  9. Output: updated classifier \({\mathbb{C}}\) (the old classifier in case active learning suggests no update) and its current accuracy Acc(N).

For on-line simulations with batch stored data sets, obtaining the real class label in Step 7 (a) comes for free, as the real class labels are usually stored along with the feature vectors contained in the data sets (which is also the case for all data sets used in our experimental study). For real on-line applications and installations, the class labels often need to be obtained as feedback from operators. The condition in Step 7 reduces the number of such feedback requests as much as possible (as discussed throughout Sect. 4 and verified in Sect. 5). Step 8 is optional and serves to validate the active learning approach by inspecting the accuracy trend line over time on a separate test data set from the same application—as is done in Sect. 5.2.3 when comparing with a state-of-the-art batch active learning approach.
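The single-pass loop of Algorithm 1 can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the classifier interface (`predict_with_conflict`, `ignorance`, `update`, `ign_threshold`) and the stream representation (pairs of sample and label-request callback) are hypothetical stand-ins for the evolving fuzzy classifier and operator feedback.

```python
def single_pass_active_learning(stream, clf, thr=0.8, use_ignorance=True):
    """Process a data stream once, querying labels only for informative samples.

    stream: iterable of (x, get_label) pairs, where get_label() simulates
            an annotation request (operator feedback).
    clf:    incremental classifier exposing the assumed interface above.
    thr:    conflict threshold (Step 7, second condition).
    """
    n_queries, n_correct = 0, 0
    acc = 0.0
    for x, get_label in stream:
        # Steps 2-3: predict label and conflict-based confidence in [0.5, 1]
        label, conf = clf.predict_with_conflict(x)
        # Steps 4-6: ignorance flag -- sample lies far from all data seen so far
        ign = 1 if (use_ignorance and clf.ignorance(x) > clf.ign_threshold) else 0
        # Step 7: request the true label only when conflicting or ignorant
        if ign == 1 or conf < thr:
            y = get_label()
            n_queries += 1
            n_correct += int(label == y)
            acc = n_correct / n_queries   # accumulated one-step-ahead accuracy
            clf.update(x, y)              # incremental adaptation
    return clf, acc, n_queries
```

All other samples pass through without any label request, which is where the reduction in annotation effort comes from.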

Algorithm 2 Single-Pass Active Learning for Multi-Class Classification Problems

  1. Input: current evolved classifiers \({\mathbb{C}_{k,l}}, l > k,\) for each class pair (k, l) after (12), each with \(C_{k,l}\) rules in the form (4); class frequency matrices \(h_{k,l}\); current accumulated accuracy of the classifier Acc(N − 1) (Acc(0) = 0); flag denoting whether ignorance is used (=1) or not (=0); \(thr_{con}\) as threshold for the conflict level; new input query sample \({\bf x}_N\) at time instance N.

  2. FOR all class pairs (k, l) with l > k

     (a) Obtain the confidence = conflict degree \(conf_{k,l} \in [0.5,1]\) of the prediction after (7).

     (b) Down-weight \(conf_{k,l}\) to obtain the preference degree \(conf^*_{k,l}\) after (18).

  3. END FOR

  4. Predict the class label L of \({\bf x}_N\) after the scoring formula in (14) using \(conf^*_{k,l}\) (instead of \(conf_{k,l}\)).

  5. IF flag == 1

     (a) Calculate the ignorance degree of each classifier after (10).

     (b) IF ignorance condition (17) holds, THEN set \(ign_s = 1\), ELSE set \(ign_s = 0\).

  6. ELSE

     (a) \(ign_s = 0\)

  7. END IF

  8. IF \(ign_s == 1\) OR condition (16) holds (using \(conf^*_{k,l}\) in the first part and \(conf_{k,l}\) in the second part)

     (a) Get the real class label \(y_N\) of \({\bf x}_N\).

     (b) Update the accumulated one-step-ahead accuracy Acc(N) by (25).

     (c) For \(l < y_N\): UpdateClassifier(\({\mathbb{C}_{l,y_N}}, h_{l,y_N}, \{{\bf x}_N, 2\}\)).

     (d) For \(l > y_N\): UpdateClassifier(\({\mathbb{C}_{y_N,l}}, h_{y_N,l}, \{{\bf x}_N, 1\}\)).

  9. (From time to time, evaluate the classifier with a separate test set.)

  10. Output: updated classifiers \({\mathbb{C}_{k,l}}\) (the old classifiers in case active learning suggests no update) and the current overall accuracy Acc(N).

Please note that if \(y_N\) is the higher class label in the pair \((l, y_N)\) (Step 8 (c)), the binary classifier \({\mathbb{C}_{l,y_N}}\) is updated with class #2; otherwise, \({\mathbb{C}_{y_N,l}}\) is updated with class #1 (Step 8 (d)). Thus, only the upper right triangular matrix of binary classifiers is ever updated (the lower left matrix is assumed to produce reciprocal preference degrees conf).
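The update routing in Steps 8 (c)–(d) can be sketched as below. The dictionary of pairwise classifiers and their `update` signature are illustrative assumptions; only the upper triangle of pairs (k, l) with l > k is stored, matching the note above.

```python
def update_all_pairs(classifiers, x, y, n_classes):
    """Route a labeled sample (x, y) to every binary classifier involving class y.

    classifiers: dict mapping (k, l) with l > k to an object with update(x, c),
                 where c is the local binary class label (1 = first class of
                 the pair, 2 = second class of the pair).
    """
    for l in range(n_classes):
        if l < y:
            # y is the second (higher) class of the pair (l, y) -> binary class 2
            classifiers[(l, y)].update(x, 2)
        elif l > y:
            # y is the first (lower) class of the pair (y, l) -> binary class 1
            classifiers[(y, l)].update(x, 1)
```

Each annotation thus triggers updates of exactly n_classes − 1 binary classifiers, one per pair containing the true class.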

The routine ’UpdateClassifier’ is outlined in Algorithm 3.

Algorithm 3 Routine UpdateClassifier(\({\mathbb{C}}, h, \{{\bf x}, y\}\))

  1. Input: current classifier \({\mathbb{C}}\) with C rules; class frequency matrix h; new input query sample \({\bf x}\) with real class label y.

  2. IF \({\bf x}\) fulfills the rule evolution criterion

     (a) Evolve a new rule (usually, the center of the new rule is set to the current sample, \(c_C = {\bf x}\), and its spread to some initial value).

     (b) C = C + 1.

     (c) Append a row to the class frequency matrix with \(h_{C,y} = 1\) and \(h_{C,k} = 0\) for k ≠ y.

  3. ELSE

     (a) Obtain the index of the rule with maximal activation level, \(i^* = \operatorname{argmax}_{1 \le i \le C}\, \mu_i({\bf x})\).

     (b) Increment the corresponding class frequency matrix entry: \(h_{i^*,y} = h_{i^*,y} + 1\).

     (c) Update the rule(s) of the classifier \({\mathbb{C}}\).

  4. Output: updated classifier \({\mathbb{C}}\) and updated class frequency matrix h.

Regarding the concrete rule evolution criterion (Step 2) and the update of the current rule(s) of the classifier [Step 3 (c)], any evolving fuzzy classification method from the literature can be used; see Lughofer (2011) for a survey. In all our experiments in Sect. 5, we used the FLEXFIS-Class approach, which is based on incremental vector quantization employing a vigilance concept; see Lughofer (2011) for details.
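As a concrete illustration of Algorithm 3, the sketch below uses placeholder choices for the pieces the text leaves open: a distance-to-nearest-center vigilance test as the rule evolution criterion, Gaussian rule activations, and a simple center shift as the rule update. The actual FLEXFIS-Class formulas differ in detail; this only mirrors the control flow.

```python
import numpy as np

def update_classifier(centers, spreads, h, x, y, vigilance=1.0, init_spread=0.5):
    """One UpdateClassifier step (Algorithm 3, under the assumptions above).

    centers: (C, d) array of rule centers; spreads: (C,) array of rule widths;
    h: (C, K) class frequency matrix; x: new sample; y: its class index.
    """
    x = np.asarray(x, dtype=float)
    dists = np.linalg.norm(centers - x, axis=1)
    if dists.min() > vigilance:
        # Step 2: evolve a new rule centered at x, with a fresh frequency row
        centers = np.vstack([centers, x])
        spreads = np.append(spreads, init_spread)
        row = np.zeros(h.shape[1])
        row[y] = 1
        h = np.vstack([h, row])
    else:
        # Step 3 (a)-(b): rule with maximal activation wins; increment h[i*, y]
        act = np.exp(-0.5 * (dists / spreads) ** 2)
        i_star = int(np.argmax(act))
        h[i_star, y] += 1
        # Step 3 (c): placeholder rule update -- shift winning center toward x
        centers[i_star] += 0.1 * (x - centers[i_star])
    return centers, spreads, h
```

The class frequency matrix h is exactly what the conflict degrees \(conf_{ij}\) in Algorithms 1 and 2 are computed from.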


Cite this article

Lughofer, E. Single-pass active learning with conflict and ignorance. Evolving Systems 3, 251–271 (2012). https://doi.org/10.1007/s12530-012-9060-7
