Machine Learning for Streaming Data: Overview, Applications and Challenges

  • Conference paper
Applied Advanced Analytics

Part of the book series: Springer Proceedings in Business and Economics (SPBE)

Abstract

This chapter gives a brief overview of machine learning for streaming data. It establishes the need for algorithms designed specifically for prediction tasks on data streams, explains why conventional batch learning methods are inadequate for such tasks, and then surveys applications in various business domains.



Author information

Corresponding author

Correspondence to Shikha Verma.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Verma, S. (2021). Machine Learning for Streaming Data: Overview, Applications and Challenges. In: Laha, A.K. (eds) Applied Advanced Analytics. Springer Proceedings in Business and Economics. Springer, Singapore. https://doi.org/10.1007/978-981-33-6656-5_1
