Abstract
A data stream is an unbounded flow of data that arrives continuously at high speed. In a dynamic streaming environment, the data change over time as the stream evolves. This evolving nature inevitably gives rise to new concepts. A novel concept may be abnormal, such as fraud, a network intrusion, or a sudden fall. It may also be a new normal concept that the system has not seen or been trained on before. In this paper we propose, develop, and evaluate a technique for handling concept evolution in evolving data streams. The approach continuously monitors the movement of the streaming data to detect emerging changes. It is capable of detecting the emergence of novel concepts, whether they are normal or abnormal. It also applies continuous and active learning to assimilate the detected concepts in real time. We evaluate our approach in the activity recognition domain as an application of evolving data streams. Experiments on benchmark datasets showed that the technique detects new concepts efficiently and adapts continuously at low computational cost.
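To make the general idea of monitoring a stream for emerging concepts concrete, the following is a minimal sketch and not the technique proposed in the paper. It assumes each known concept is summarised by a single centroid, that a point farther than a fixed `radius` from every centroid is a novelty candidate, and that a group of `min_points` cohesive candidates can be treated as a new concept and assimilated. The class name `NoveltySketch`, its parameters, and the toy activity labels are all hypothetical.

```python
import math


class NoveltySketch:
    """Toy centroid-based novelty detector (illustrative only, not AnyNovel)."""

    def __init__(self, concepts, radius, min_points):
        # concepts: dict mapping label -> centroid tuple (assumed input format)
        self.concepts = dict(concepts)
        self.radius = radius          # assumed fixed boundary around each centroid
        self.min_points = min_points  # candidates needed to declare a new concept
        self.buffer = []              # points that fit no known concept yet
        self.novel_count = 0

    def _nearest(self, x):
        # Find the closest known concept and the distance to its centroid.
        label = min(self.concepts, key=lambda k: math.dist(x, self.concepts[k]))
        return label, math.dist(x, self.concepts[label])

    def observe(self, x):
        """Return the matching label, a new label, or None if undecided."""
        label, distance = self._nearest(x)
        if distance <= self.radius:
            return label
        # The point lies outside every known concept: hold it as a candidate.
        self.buffer.append(x)
        if len(self.buffer) < self.min_points:
            return None
        centroid = tuple(sum(c) / len(self.buffer) for c in zip(*self.buffer))
        if all(math.dist(p, centroid) <= self.radius for p in self.buffer):
            # Cohesive candidates: assimilate them as a new concept.
            # In an active-learning setting a true label would be requested here.
            self.novel_count += 1
            new_label = f"novel_{self.novel_count}"
            self.concepts[new_label] = centroid
            self.buffer.clear()
            return new_label
        # Candidates are scattered: drop the oldest and keep monitoring.
        self.buffer.pop(0)
        return None


detector = NoveltySketch(
    {"walking": (0.0, 0.0), "sitting": (5.0, 5.0)}, radius=1.0, min_points=3
)
stream = [(0.1, 0.2), (5.1, 4.9), (10.0, 0.0), (10.2, 0.1), (9.9, -0.1), (10.1, 0.0)]
results = [detector.observe(x) for x in stream]
print(results)
```

In this toy run the first two points match known concepts, the next two are buffered as undecided, the fifth completes a cohesive group that is registered as a new concept, and the sixth is then recognised as belonging to it. A fixed radius and a single centroid per concept are simplifications chosen for brevity.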
Cite this article
Abdallah, Z.S., Gaber, M.M., Srinivasan, B. et al. AnyNovel: detection of novel concepts in evolving data streams. Evolving Systems 7, 73–93 (2016). https://doi.org/10.1007/s12530-016-9147-7
DOI: https://doi.org/10.1007/s12530-016-9147-7