Encyclopedia of Big Data Technologies

Living Edition
Editors: Sherif Sakr, Albert Zomaya

Adaptive Windowing

  • Ricard Gavaldà
Living reference work entry
DOI: https://doi.org/10.1007/978-3-319-63962-8_194-1

Synonyms

Definitions

Adaptive windowing is a technique for the online analysis of data streams that handles changes in the distribution of the data. It builds on the standard idea of a sliding window over the data but, unlike other approaches, does not fix the window size a priori; instead, the size changes dynamically as a function of the data. At all times, the window is kept at the maximum length consistent with the assumption that the data it contains show no change.
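The definition can be illustrated with a minimal Python sketch. It is a simplified illustration under stated assumptions, not the entry's reference algorithm: the class name `AdaptiveWindow`, the parameter `delta`, and the Hoeffding-style threshold used to decide that "there is change" are choices made here for illustration, and the values are assumed to lie in [0, 1]. Each new item is appended, and the oldest items are dropped while some split of the window into an older and a newer part shows means that differ by more than the threshold.

```python
import math
from collections import deque


class AdaptiveWindow:
    """Simplified adaptive window over values assumed to lie in [0, 1]."""

    def __init__(self, delta=0.01):
        self.delta = delta     # hypothetical confidence parameter (assumption)
        self.window = deque()  # oldest item on the left, newest on the right

    def _has_change(self):
        """Return True if some old/new split has significantly different means."""
        n = len(self.window)
        if n < 2:
            return False
        total = sum(self.window)
        delta_prime = self.delta / n  # spread the confidence over the splits
        head_sum = 0.0
        for i, x in enumerate(self.window, start=1):
            if i == n:  # the newer part must keep at least one item
                break
            head_sum += x
            n0, n1 = i, n - i
            mean0 = head_sum / n0
            mean1 = (total - head_sum) / n1
            m = 1.0 / (1.0 / n0 + 1.0 / n1)  # harmonic-style effective size
            # Hoeffding-style threshold (an assumption of this sketch).
            eps = math.sqrt(math.log(4.0 / delta_prime) / (2.0 * m))
            if abs(mean0 - mean1) > eps:
                return True
        return False

    def add(self, x):
        """Append x, then shrink from the old end while change is detected.

        Returns True if the window was shrunk on this step.
        """
        self.window.append(x)
        shrunk = False
        while self._has_change():
            self.window.popleft()
            shrunk = True
        return shrunk

    def mean(self):
        """Mean of the items currently considered up to date."""
        return sum(self.window) / len(self.window) if self.window else 0.0


if __name__ == "__main__":
    w = AdaptiveWindow(delta=0.01)
    for value in [0.0] * 200 + [1.0] * 200:
        w.add(value)
    print(len(w.window), w.mean())
```

On a stream that switches from 0 to 1, the window grows while the data are stable and drops most of the old items shortly after the switch, so its mean tracks the new distribution. This sketch rescans the whole window on every step for clarity; a practical implementation would store the window in compressed form to avoid that cost.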

Context

Many modern sources of data are best viewed as data streams: potentially infinite sequences of data items that arrive one at a time, usually at high and uncontrollable speed. One wants to perform various analysis tasks on the stream in an online, rather than batch, fashion. Many of these tasks involve building models: creating a predictor, forming clusters, or discovering frequent patterns. The source of data may evolve over time, that is, its statistical properties...

Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  1. Universitat Politècnica de Catalunya, Barcelona, Spain

Section editors and affiliations

  • Alessandro Margara (1)
  • Tilmann Rabl (2)
  1. Politecnico di Milano
  2. Database Systems and Information Management Group, Technische Universität Berlin, Berlin, Germany