Data-driven decision support under concept drift in streamed big data

Data-driven decision-making (D³M) is often confronted by the problem of uncertainty or unknown dynamics in streaming data. To provide accurate decision solutions in real time, systems have to promptly address changes in the data distribution of streaming data, a phenomenon known as concept drift. Past data patterns may not be relevant to new data when a data stream experiences significant drift, so continuing to use models based on past data will lead to poor prediction and poor decision outcomes. This position paper discusses the basic framework and prevailing techniques in streaming type big data and concept drift for D³M. The study first establishes a technical framework for real-time D³M under concept drift and details the characteristics of high-volume streaming data. The main methodologies and approaches for detecting concept drift and supporting D³M are then highlighted. Lastly, further research directions, related methods and procedures for using streaming data to support decision-making in concept drift environments are identified. We hope the observations in this paper will help researchers and professionals to better understand the fundamentals and research directions of D³M in streamed big data environments.


Introduction
Organizational decision-making aims to find an optimal or the most satisfactory solution for a decision problem. These decision problems are of various types, from daily operational decisions to long-term strategic business decisions, and from an internal single decision to a multi-level or multi-organizational decision [41]. Different decision-making tasks may have different features and are therefore normally modeled in different forms, presented by different methods and solved by different decision-making techniques.
In general, organizational decision problems can be classified by their nature. The classic classification is based on a given problem's structure, i.e., structured, semi-structured and unstructured [34]. The last two are also called ill-structured. A structured decision problem can be described by classic mathematical models, such as linear programming or statistical methods. The procedures for obtaining its optimal solution are known as standard solution methods. An unstructured decision problem is fuzzy, uncertain and vague; there is no standard solution method for obtaining an optimal solution, or no such optimal solution exists. Semi-structured decision problems fall between structured and unstructured problems, having both structured and unstructured features, and reflect most real-world situations. Conventional decision support techniques perform well on structured decision problems but cannot solve ill-structured decision problems. Data-driven decision-making (D³M) techniques, also called machine-learning-based decision-making techniques, are more suitable for ill-structured decision problems and for decision-making in dynamic and complex situations.
In recent years, various data sources (datasets, data warehouses, databases, data streams, etc.) have become available to form a Big Data environment. Many decision problems, particularly ill-structured ones, can be solved well by findings obtained from data through data mining, data analysis and machine learning, that is, by D³M techniques [41]. Various D³M techniques, including models, methods, algorithms and software tools, have been developed through learning from big data. As a result, conventional decision-making or decision support systems (DSSs) have evolved in line with the increasing availability of data and computational power. Current D³M techniques are capable of generating decision options from data collected from databases or data warehouses, and of providing queries and management reports according to decision-makers' requirements. However, they are inadequate for supporting highly dynamic (rapidly changing) decision situations which require fast responses to change. A very recent survey [21] has pointed out that a dynamic environment with uncertainty (concept drift) is an inherent property of big streaming data. Unavoidable rapid changes in decision environments, e.g., new markets, new products and new customer behaviors, inevitably result in changes in the underlying data distribution in data streams. These changes are known as concept drift and may result in poor prediction and poor decision outcomes, as the pattern of past data does not conform to that of newly arrived data. How to maintain the effectiveness of a DSS under concept drift for big streaming data is a challenging research question, and developing a new generation of adaptive DSSs for real-time decision-making is an urgent requirement. In other words, self-learning and self-adaptive features are important characteristics for the next generation of D³M and D³M-based DSSs.
This paper presents our position on DSSs in the context of big streaming data containing concept drift. We present each challenge and discuss its implications. The rest of the paper is organized as follows. Section 2 summarizes the adaptive decision-making framework. Section 3 highlights the characteristics and the challenges of streamed big data. Section 4 analyzes the existing work on big streaming data and introduces essential work that has not yet been done. Section 5 presents our position on the future directions for real-time decision support under concept drift. Lastly, Sect. 6 presents our summary of this paper.

Adaptive decision-making framework
This section presents the general framework of adaptive decision-making.

Data-driven decision-making
Data-driven decision-making uses a variety of machine learning approaches for data analysis by characterizing a decision problem and ascertaining the connections between the problem variables (input, internal and output variables) without having explicit knowledge of the physical behavior of the decision model.
Adaptive decision-making to address the concept drift problem has gained considerable attention. Concept drift detection and adaptation is an effective strategy for improving the accuracy of decision-making in a dynamic data streaming environment. When a drift is detected, machine learning techniques are applied to adapt decision models to new concepts. The components of an adaptive data-driven decision-making framework are introduced in the next section.

An adaptive data-driven decision-making framework under concept drift
Data-driven decision support under concept drift in high-volume streaming data has three major components, as shown in Fig. 1. The first collects raw data from various sources and reformats them to unify the time frame and feature space, so that they can be used for modeling and for constructing the training data. The second detects and interprets the changes in data streams over time. If the most recently arrived data significantly conflicts with the historical data, a concept drift will be reported and an adaptation process will be triggered. The third component is adaptive decision-making. In this component, DSSs are actively updated according to the results of drift detection and understanding. The data-driven decision process under this framework is as follows. The drift detection part monitors the stream and notifies the system once a drift is identified. Drift understanding is then initiated to characterize the drift and propose possible solutions for resolving it. To help find a better solution, the system interprets the drift in terms of When the drift occurred, How significant the drift is, and Where the drift is located. Drift responses and adaptation depend on the type of DSS. For a model-based DSS, adaptive decision-making could react to drift by, e.g., updating an optimization model's parameters; for a knowledge-based DSS, it could react by, e.g., updating a knowledge base. In a data-driven DSS, adaptive decision-making involves, e.g., retraining a prediction model. In this paper, we focus only on data-driven DSSs.
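
The three-component process described above can be sketched as a small loop. This is a minimal illustration under stated assumptions rather than an implementation of the framework in Fig. 1: the names `DriftReport`, `detect_drift` and `run_pipeline`, the per-feature mean-shift test and the threshold value are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class DriftReport:
    """Hypothetical container for the When / How / Where of a drift."""
    when: int     # index of the batch in which drift was reported
    how: float    # severity: largest absolute shift in a feature mean
    where: int    # index of the feature with the largest shift


def detect_drift(reference, recent, threshold) -> Optional[tuple]:
    """Component II (assumed test): compare per-feature means of two batches."""
    shifts = [abs(mean(col_new) - mean(col_old))
              for col_old, col_new in zip(zip(*reference), zip(*recent))]
    worst = max(range(len(shifts)), key=shifts.__getitem__)
    return (shifts[worst], worst) if shifts[worst] > threshold else None


def run_pipeline(batches, threshold=1.0) -> List[DriftReport]:
    """Component I supplies batches; II detects and interprets; III adapts."""
    reports = []
    reference = batches[0]
    for t, batch in enumerate(batches[1:], start=1):
        found = detect_drift(reference, batch, threshold)
        if found is not None:
            how, where = found
            reports.append(DriftReport(when=t, how=how, where=where))
            reference = batch  # adaptation placeholder: adopt the new concept
    return reports


# Toy batches of (feature_0, feature_1) rows, already unified by Component I.
batches = [
    [(0.0, 5.0), (0.2, 5.1), (-0.1, 4.9)],
    [(0.1, 5.0), (0.0, 5.2), (0.1, 4.8)],
    [(0.1, 8.0), (0.0, 8.2), (-0.1, 7.9)],  # second feature shifts
]
reports = run_pipeline(batches)
```

In a data-driven DSS, the line that replaces `reference` would instead trigger retraining of the prediction model; the sketch keeps only the control flow.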

Streamed big data
This section discusses the characteristics of big streaming data and the challenges of learning under big streaming data.

Characteristics of streamed big data
Big data is an outcome of the current information explosion and is relevant to a diverse range of fields in the natural, life, social, and applied sciences, including physics, biology, medicine, economics and management [26]. Big data has been widely characterized by the three Vs [15]: a hugely increased Volume of data, a Variety of data sources and quality, and the high Velocity at which data is generated or obtained. Big data technology holds incredible promise for improving people's lives, accelerating scientific discovery and innovation, and instigating positive societal change [7]. Meanwhile, new challenges need to be overcome; these accompany the heterogeneity, incompleteness, scale, timeliness, privacy and process complexity of big data, and cover data acquisition, data storage, information extraction, and big data analysis [18]. Three further Vs are now recognized as big data analysis has developed: Veracity, which focuses on the unreliability inherent in data sources; Variability, which refers to variations in data flow rates; and Value, which refers to the issue of low value density [8,9,13].

Challenges in streamed big data
Eight big streaming data challenges were discussed in [17], covering the cycle of knowledge discovery from data. We consider these challenges from three aspects: (1) the development of new data mining skills for big streaming data; (2) the development of simpler, self-adaptive machine learning algorithms; and (3) the requirements of privacy and confidentiality for gaining the trust of users and society in the system.
As data evolves over time, the validity and reliability of historical data become questionable. Decision support for big streaming data has to consider these issues to perform accurate, up-to-date, real-time analysis; one example is the detection of highway flooding [28]. Although data streams, online learning, big data, and adaptation to concept drift have become important research topics during the last decade, a truly autonomous, self-maintaining, adaptive data mining system is still lacking [17]. The short lifespan of data prevents us from storing and accessing all historical data during each processing cycle; however, processing accuracy is strictly limited if the data can be accessed only once (the one-pass setting). This is critical when concept drift occurs, because good and bad data samples are treated equally when they are used to learn a new concept. Computing resources such as hardware and storage space have become more efficient and effective; it is therefore more practical to adopt a limited-storage assumption rather than a zero-storage assumption when discussing decision support for high-volume streaming data. In addition, previous decisions are no longer applicable when data evolve and have to be replaced according to the current situation. When to make a decision change and how to conduct that change are therefore two unsolved aspects of this problem, which become more difficult when multiple streams are involved.
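
The limited-storage assumption can be made concrete with a bounded buffer that keeps only the most recent samples. The sketch below is illustrative only; the class name `LimitedStorageBuffer` and the capacity are assumptions, and a real system would choose the window size from its memory budget and the expected drift rate.

```python
from collections import deque


class LimitedStorageBuffer:
    """Keeps only the most recent `capacity` samples of a stream."""

    def __init__(self, capacity: int):
        self.window = deque(maxlen=capacity)  # oldest samples are evicted
        self.seen = 0                         # total samples observed so far

    def add(self, sample: float) -> None:
        self.window.append(sample)
        self.seen += 1

    def mean(self) -> float:
        return sum(self.window) / len(self.window)


buffer = LimitedStorageBuffer(capacity=3)
for value in [1.0, 2.0, 3.0, 10.0, 11.0]:
    buffer.add(value)
```

Unlike a strict one-pass (zero-storage) learner, a model backed by such a buffer can revisit the retained samples when it needs to relearn a concept, at a memory cost that stays constant however long the stream runs.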

Concept drift in streamed big data
This section presents the definition of concept drift and how to detect, understand and react to it. Related real-world applications are also discussed.

Learning with concept drift
Learning with concept drift is an auxiliary research field of continuous learning, as discussed in [32], and has also been referred to as learning under a dynamic environment [12,37], or learning in a non-stationary environment [5]. The research objective is to identify whether the model learnt from historical data is still the one in the hypothesis set that demonstrates the best performance on current concepts, where a concept is a mapping from input space to labels or target values. Concept drift can be caused by changes in data distribution, or by training with misleading samples. Learning with high-volume streaming data requires particular attention to be paid to concept drift.
Concept drift can be categorized as sudden/abrupt drift, incremental drift, gradual drift or recurrent drift, according to When, How and Where: (1) When the drift occurs and how long it lasts; (2) How severe the drift is; and (3) Where the drift region is. These three criteria provide a three-dimensional perspective to describe concept drift. Drift adaptation strategies are thus specifically designed and applied to update models experiencing different types of drift.
Early concept drift studies mainly focused on drift point detection, addressing the When criterion by identifying when the empirical error exceeded the upper bound of an established model [2,11,29]. The corresponding adaptation methods either relearn the models or use ensemble algorithms to adapt to new concepts [3,33,35]. In recent years, drift point detection has been developed to cover more complicated cases, such as feature selection drift [4,39], region selection drift [19,20,25] and the detection of multi-layer drift [1,40]. These developments address the Where criterion. Some drift detection techniques have objectives similar to those of multivariate two-sample tests, which compare the similarity between two distributions according to the available samples [6,19]. A number of recent publications have considered the test statistics applied in two-sample tests as a measure for quantifying drift severity, addressing the issue of How (How severe the drift is). However, very few have proposed drift adaptation strategies that use the severity information to learn new concepts.
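
As one illustration of a two-sample test statistic used as a severity score, the sketch below computes the univariate two-sample Kolmogorov-Smirnov statistic. The choice of this statistic is an assumption made for the example; the works cited above use multivariate tests that may differ, and the function name `ks_statistic` and the toy samples are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between the
    two empirical cumulative distribution functions (0 = identical,
    1 = completely separated)."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


historical = [1.0, 2.0, 3.0, 4.0, 5.0]
slightly_shifted = [1.5, 2.5, 3.5, 4.5, 5.5]
new_concept = [11.0, 12.0, 13.0, 14.0, 15.0]

mild = ks_statistic(historical, slightly_shifted)
severe = ks_statistic(historical, new_concept)
```

Because the statistic is bounded between 0 and 1, it can be compared across windows, which is what makes it usable as a rough answer to How severe a drift is.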
Learning with concept drift has three steps: drift detection, understanding drift, and drift adaptation. We will discuss the challenges of each step in the paragraphs that follow.

Drift detection
A wide range of concept drift detection algorithms have been developed to identify inconsistency between historical data and newly available data. False-positive and false-negative rates are used to evaluate the performance of drift detection algorithms. A Type I error is a false positive, in which a drift is reported although none has occurred, and a Type II error is a false negative, in which a real drift goes undetected; a good detector keeps both low. In the case of high-volume streaming data, these criteria may be inadequate, since Velocity means that data arrives at a very fast pace and there may be insufficient time to collect labels or target values for drift detection. Drift detection algorithms must detect drift with a limited quantity of labeled samples, so solutions that achieve the desired drift detection accuracy with the fewest samples are preferable. In other words, the convergence rate of algorithms should also be considered as an evaluation metric. Although active learning has been applied to this problem [42], handling Velocity is still an open question.
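
These evaluation criteria can be computed from a list of known drift points and a list of reported detections. The sketch below is one possible scoring scheme, not a standard benchmark: the function name `evaluate_detections`, the matching rule and the `tolerance` parameter are assumptions. It reports the detection delay alongside the two error counts, since delay reflects how many samples a detector needed.

```python
def evaluate_detections(true_drifts, detections, tolerance):
    """Match each true drift point to the first unused detection that falls
    within `tolerance` steps after it; everything else counts as an error."""
    remaining = sorted(detections)
    delays = []
    missed = 0
    for drift in sorted(true_drifts):
        match = next(
            (d for d in remaining if drift <= d <= drift + tolerance), None
        )
        if match is None:
            missed += 1                       # false negative (Type II error)
        else:
            delays.append(match - drift)      # samples needed to react
            remaining.remove(match)
    return {
        "false_positives": len(remaining),    # unmatched alarms (Type I errors)
        "false_negatives": missed,
        "mean_delay": sum(delays) / len(delays) if delays else None,
    }


result = evaluate_detections(
    true_drifts=[100, 300, 500],
    detections=[112, 220, 308],
    tolerance=50,
)
```

In this toy run the alarm at step 220 matches no true drift and the drift at step 500 is never reported, so the detector is charged one error of each type.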

Drift understanding
Understanding drift is another key stage of learning under concept drift. It refers to retrieving information about the When, How, and Where of concept drift and is used to describe the status of concept drift. This information is learned and integrated after drift has been confirmed by drift detection methods or algorithms and is used as the input for knowledge adaptation. The need to understand drift has increasingly gained attention, as mentioned in [36], but very few concrete methods have been developed to quantify this information.
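
To make the three quantities tangible, the sketch below extracts a When, How and Where estimate from a window in which drift has already been confirmed. It is a deliberately naive heuristic assumed for illustration, not a method from the literature discussed here: it searches for the split point and feature with the largest change in mean. The name `describe_drift` and the `min_segment` parameter are hypothetical.

```python
from statistics import mean


def describe_drift(window, min_segment=2):
    """Assumed heuristic: pick the split of `window` (a list of feature
    vectors) that maximizes the largest per-feature mean shift."""
    best = {"when": None, "how": 0.0, "where": None}
    for split in range(min_segment, len(window) - min_segment + 1):
        before, after = window[:split], window[split:]
        columns = zip(zip(*before), zip(*after))
        for feature, (old, new) in enumerate(columns):
            shift = abs(mean(new) - mean(old))
            if shift > best["how"]:
                best = {"when": split, "how": shift, "where": feature}
    return best


# Feature 0 is stable; feature 1 jumps after the third sample.
window = [(1.0, 0.0), (1.0, 0.1), (1.0, 0.0),
          (1.0, 4.0), (1.0, 4.1), (1.0, 4.0)]
summary = describe_drift(window)
```

The resulting dictionary is the kind of input a knowledge adaptation step could consume, for example to retrain only on samples after `when` or to update only the part of a model that depends on feature `where`.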

Drift adaptation
Updating existing learning models according to the characteristics of the drift is critical to achieving consistently high performance [30]. This is called drift adaptation (or knowledge adaptation). Some adaptation methods explicitly rely on drift detection algorithms and adopt a variety of retraining strategies to better handle different types of drift [24]. Others, mainly decision-tree-based methods, may not include a global drift detection procedure but can partially update models according to changes in some leaf nodes based on the newly available data. Ensemble learning for streaming data with concept drift has also achieved remarkable results [16]; however, integrating concept drift adaptation into incremental learning is still a challenging problem. Making better use of How and Where drift information in high-volume streaming data learning, rather than only When, is the next step in boosting learning performance.
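
A detection-triggered retraining strategy can be sketched with a deliberately simple model. The example below is an assumption-laden toy, not one of the cited methods: the "model" is a running mean, drift is declared when the prediction error exceeds `error_limit` for a full window of consecutive samples, and adaptation consists of relearning from that window only. All names and parameter values are hypothetical.

```python
from collections import deque


class AdaptiveMeanPredictor:
    """Predicts the mean of the data it was trained on and retrains on a
    short recent window when its error stays high."""

    def __init__(self, window=3, error_limit=2.0):
        self.recent = deque(maxlen=window)  # limited storage of recent samples
        self.total, self.count = 0.0, 0     # state of the incremental model
        self.error_limit = error_limit
        self.retrain_points = []

    def predict(self):
        return self.total / self.count if self.count else 0.0

    def update(self, t, value):
        error = abs(value - self.predict())
        self.recent.append((value, error))
        drift = (len(self.recent) == self.recent.maxlen
                 and all(e > self.error_limit for _, e in self.recent))
        if drift:
            # Retraining strategy: discard the old concept, keep the window.
            values = [v for v, _ in self.recent]
            self.total, self.count = sum(values), len(values)
            self.recent.clear()
            self.retrain_points.append(t)
        else:
            self.total += value             # ordinary incremental update
            self.count += 1


model = AdaptiveMeanPredictor()
stream = [1.0, 1.2, 0.8, 1.0, 9.0, 9.2, 8.8, 9.0]
for t, value in enumerate(stream):
    model.update(t, value)
```

Without the retraining branch the running mean would stay contaminated by the old concept for a long time; with it, the prediction snaps to the new level as soon as the window confirms the change.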

Concept drift applications
Handling concept drift is highly important in real-world practice, for example, in traffic networks, telecommunications, and financial transactions. Machine learning tasks in these systems will inevitably encounter the problem of concept drift, and in some cases, the ability to handle concept drift will be the key factor in improving system performance.
A discussion of concept drift applications in industry can be found in [43]. Drift detection applications in this context refer to the industrial requirement to diagnose significant internal and external environmental changes in industry trends or customer preferences, such as using drift detection technology to identify changes in the preferences of users [14]. Similar tasks include fraud detection in finance, intrusion detection in computer security, mobile masquerade detection in telecommunications, topic changes in information document organization, and clinical studies in biomedicine. The aim of drift adaptation applications is to maintain a continuously effective evaluation and prediction system for industry. This may also involve using drift detection technologies to achieve greater accuracy. A real case example, in which a credit risk assessment framework for dynamic credit scoring was designed, is presented in [31]. Other real-world drift adaptation applications can be found in transportation traffic management, production and service monitoring, customer recommendation, bankruptcy prediction, and so on.
With the latest developments in technology, data streams have become larger in size and faster. The new challenges posed by high-volume streaming data require the development of more advanced concept drift applications. One such challenge is how to handle concept drift problems in the Internet of Things (IoT) [43], since the huge quantity of streaming data from the IoT requires deeper insight and better understanding of concept drift.

Real-time decision support under concept drift: future directions
This section presents possible future research directions of real-time decision support under concept drift.

Adaptive decision support systems under concept drift
Streaming data are a continuous record of events. The volume of data grows with its time stamps, which can be infinite in number. Nowadays, streaming data can track events for long periods at high frequency from mobile and/or embedded devices (e.g., sensors) [10]. Analyzing a data stream can thus continuously capture the potential risk of an event. If a potential risk is detected that may result in a significant decision-making failure, the existing decision-making results need to be immediately updated to prevent losses caused by old decisions. We refer to this as Adaptive Decision Support, and it has applications in environments such as the IoT, emergency management, industrial control systems and online decision-making. One example is situation awareness-based decision support systems, which can improve human decision-makers' performance and reduce error in dynamic environments [27].

Multi-stream decision support under concept drift
Huge amounts of streaming data are now generated by government and industry from multiple sources, such as sensors and marketing activities. They are known as multi-streams. Disruptive technologies and unique user experiences, e.g., new markets and new customer behaviors, have inevitably resulted in changes in the underlying data distribution in almost all streaming data. In addition, high-volume streaming data commonly have undiscovered correlations across data streams, and a drift in one stream may cause drift in other streams. A data-driven decision support system on a single stream could be highly related to decision support systems on other streams, so efficient learning methods on streaming data, such as identifying correlations between streams and constructing adaptive correlation networks, are urgently needed to support the timely prediction of drift and aid decision-making. In the finance industry, for example, the bid/offer rate in the inter-bank lending market always involves the behaviors of more than two banks. The rate needs to be determined based on the interrelationship between banks to benefit the banks involved. In telecommunications, smartphone producers compete with one another. The marketing strategy of one producer affects other producers' strategies, especially for large companies, such as Samsung and Apple. It is therefore important for Apple to analyze the marketing behaviors of Samsung in order to devise effective strategies and maximize profit. How to take advantage of this interrelationship for decision support to benefit these network groups individually or as a whole is a promising future research direction.
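
Identifying correlations between streams can be illustrated with a pairwise Pearson correlation over a shared window. This is only a starting point under simple assumptions (aligned time stamps, linear dependence, non-constant streams); the stream names, the threshold and the helper `correlated_pairs` are hypothetical, and an adaptive correlation network would recompute such edges as the streams evolve.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences.
    Assumes neither sequence is constant (non-zero variance)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlated_pairs(streams, threshold=0.8):
    """Return pairs of stream names whose |correlation| exceeds `threshold`."""
    names = sorted(streams)
    return [(a, b)
            for i, a in enumerate(names) for b in names[i + 1:]
            if abs(pearson(streams[a], streams[b])) > threshold]


# Toy windows from three hypothetical streams with aligned time stamps.
streams = {
    "bank_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "bank_b": [2.1, 3.9, 6.2, 8.0, 9.9],  # moves with bank_a
    "bank_c": [3.0, 1.0, 4.0, 1.0, 5.0],  # largely unrelated
}
edges = correlated_pairs(streams)
```

An edge between two streams suggests that a drift alarm on one is worth treating as an early warning for the other.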

Recommender systems under concept drift
Recommender systems have attracted great attention and achieved great success in the last decade [22,38]. Nevertheless, the dynamic characteristics of high-volume streaming data have not been adequately addressed. Current recommender systems treat user preferences as static, in spite of the fact that preferences change with increased expertise, personal experiences, or social popularity. The performance of recommender systems will be impaired in many aspects, such as accuracy, novelty and diversity, if these dynamic changes in user profiles, item analysis, or user preferences are not considered. Recommendation should consider the consistency of customer behaviors, customer interactions, and changes in customer preferences; adopting concept drift detection and reaction techniques is therefore a promising direction in recommender system research for both academia and industry.
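
The difference between a static and a drift-aware view of user preferences can be shown with a toy interaction history. Exponential time decay is used here purely as an assumed stand-in for a reaction technique; it is not proposed in this paper, and the function `decayed_preferences`, the `half_life` value and the data are hypothetical.

```python
from collections import Counter, defaultdict


def decayed_preferences(interactions, now, half_life):
    """Score each category by summing interaction weights that halve every
    `half_life` time units, so that recent behavior dominates."""
    scores = defaultdict(float)
    for timestamp, category in interactions:
        age = now - timestamp
        scores[category] += 0.5 ** (age / half_life)
    return dict(scores)


# Hypothetical user history as (day, category). Interest moves from
# "beginner" items towards "advanced" items as expertise grows.
history = [(0, "beginner"), (1, "beginner"), (2, "beginner"),
           (28, "advanced"), (30, "advanced")]

# Static view: every interaction counts equally.
static_top = Counter(c for _, c in history).most_common(1)[0][0]

# Drift-aware view: old interactions fade.
scores = decayed_preferences(history, now=30, half_life=7)
drift_aware_top = max(scores, key=scores.get)
```

The static count still favors the category the user has outgrown, while the decayed score follows the change in preference.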

Data-driven decision-making under uncertainty
A significant challenge of using large quantities of streaming data collected from different sources in different time frames is uncertainty [23]. Uncertainty in high-volume streaming data takes a number of different forms. We consider that four main layers are impacted by uncertainty issues in streaming data-driven decision support: the data layer, the stream layer, the concept drift detection layer, and the decision-making layer. The first two layers correspond to Component I in Fig. 1. Layers three and four correspond, respectively, to Component II and Component III. Uncertainty problems in the data layer concern data insufficiency [33], outdatedness [35], incompleteness, and ambiguity [3]. In the stream layer, uncertainty may exist in the relationship between streams, such as whether two streams convey the same information, and may also exist in the correlation of concept drift between streams, such as the likelihood of drift in one stream causing drift in other streams. In the concept drift detection layer, uncertainty may take the form of noise, false alarms caused by outliers, and newly emerging classes. Uncertainty issues also need to be considered in the generation of early warnings of drift. Lastly, in the decision-making layer, both the model adaptation and decision optimization processes may be subject to uncertainty issues, since there is no universal decision model to fit all situations. The research problem is to develop a general guidance framework for addressing uncertainty issues and to use uncertainty characteristics to aid decision support.

Summary
In this position paper, we propose a framework for adaptive D³M under concept drift in high-volume streaming data environments, elaborate the challenges and opportunities presented by big streaming data, introduce the three steps of learning under concept drift, and discuss future research directions for adaptive decision support.
This paper highlights the issue of real-time D³M and provides some fundamental knowledge and methodologies for researchers and practitioners in the decision support system area. We hope it can provide a good guideline on how to apply concept drift handling methodologies to help D³M techniques in big streaming data.

Fig. 1 A general framework of adaptive D³M