Abstract
This chapter gives a brief overview of machine learning for streaming data. It establishes the need for algorithms designed specifically for prediction tasks on data streams, explains why conventional batch learning methods are inadequate, and surveys applications in various business domains.
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Verma, S. (2021). Machine Learning for Streaming Data: Overview, Applications and Challenges. In: Laha, A.K. (eds) Applied Advanced Analytics. Springer Proceedings in Business and Economics. Springer, Singapore. https://doi.org/10.1007/978-981-33-6656-5_1
DOI: https://doi.org/10.1007/978-981-33-6656-5_1
Publisher Name: Springer, Singapore
Print ISBN: 978-981-33-6655-8
Online ISBN: 978-981-33-6656-5
eBook Packages: Business and Management (R0)