Data streams (streaming data) consist of continuously observed, unequally spaced, temporally evolving multidimensional data sequences that challenge our computational and/or inferential capabilities. In economics, data streams arise, among other settings, in electricity consumption monitoring, the exploration of Internet user behavior, and order book forecasting in high-frequency financial markets. In this paper, we point out and discuss several open problems in robust data stream analysis and propose three robust and conceptually very simple approaches in this context. We apply the proposals to real data sets on the activity of investors in the futures contracts market.
Additional information
The author acknowledges financial support from the Polish National Science Center grant UMO-2011/03/B/HS4/01138.
Proceedings of the XXXII International Seminar on Stability Problems for Stochastic Models, Trondheim, Norway, June 16–21, 2014
Cite this article
Kosiorowski, D. Dilemmas of Robust Analysis of Economic Data Streams*. J Math Sci 218, 167–181 (2016). https://doi.org/10.1007/s10958-016-3019-3
DOI: https://doi.org/10.1007/s10958-016-3019-3