A Sequential Regression Model for Big Data with Attributive Explanatory Variables

Zhang, Qing-Ting; Liu, Yuan; Zhou, Wen; Yang, Zhou-Wang

doi:10.1007/s40305-015-0109-8

Published: 14 December 2015

Volume 3, pages 475–488 (2015)

Journal of the Operations Research Society of China

Abstract

As applications of big data modeling and analysis expand in scope, computational efficiency faces growing challenges in both storage and speed. In many practical problems, large amounts of historical data are collected sequentially and used for online statistical modeling. For such sequential data, we propose a sequential linear regression method that extracts the essential information from historical data. This extracted information is then used to update the model according to a sequential estimation scheme. With this technique, earlier data no longer need to be stored, and the sequential update is efficient in both computation time and storage. A weighting strategy is applied to the current model to control the influence of data from different periods. Numerical experiments show that, compared with estimation methods that use the historical data, our approach is faster and requires less storage.
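The abstract does not say which summary of the historical data is retained or how the period weights are defined. As a rough illustration of the general idea only, and not the authors' algorithm, the sketch below assumes ordinary least squares, keeps just the weighted sums XᵀX and Xᵀy, and discounts earlier periods geometrically. The names `SequentialLeastSquares`, `decay`, and `ridge` are hypothetical.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented copy of [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


class SequentialLeastSquares:
    """Stores only p*p + p numbers, however many rows have been seen."""

    def __init__(self, n_features, decay=1.0, ridge=1e-8):
        self.p = n_features
        self.decay = decay  # assumed weight in (0, 1]; 1.0 treats all periods equally
        self.ridge = ridge  # small diagonal term so the system stays solvable
        self.xtx = [[0.0] * n_features for _ in range(n_features)]
        self.xty = [0.0] * n_features

    def update(self, X, y):
        """Fold one new batch (period) into the summaries, then discard it."""
        for i in range(self.p):
            self.xty[i] *= self.decay  # down-weight all earlier periods
            for j in range(self.p):
                self.xtx[i][j] *= self.decay
        for row, target in zip(X, y):
            for i in range(self.p):
                self.xty[i] += row[i] * target
                for j in range(self.p):
                    self.xtx[i][j] += row[i] * row[j]

    def coefficients(self):
        """Current estimate from the stored summaries alone."""
        A = [r[:] for r in self.xtx]
        for i in range(self.p):
            A[i][i] += self.ridge
        return solve_linear_system(A, self.xty)


# Two periods of data generated from y = 1 + 2x; each row is [intercept, x].
model = SequentialLeastSquares(n_features=2, decay=0.5)
model.update([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]], [1.0, 3.0, 5.0])
model.update([[1.0, 3.0], [1.0, 4.0]], [7.0, 9.0])
print(model.coefficients())
```

Because the example data lie exactly on a line, the estimate is close to [1.0, 2.0] for any `decay`; with noisy data, a smaller `decay` would let recent periods dominate the fit. Attributive (categorical) explanatory variables would enter this sketch as indicator columns of `X`, which is again an assumption rather than something the abstract specifies.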

Acknowledgments

We would like to thank the anonymous reviewers for their constructive and valuable comments.

Author information

Authors and Affiliations

University of Science and Technology of China, Hefei, 230026, China
Qing-Ting Zhang, Yuan Liu, Wen Zhou & Zhou-Wang Yang

Corresponding author

Correspondence to Zhou-Wang Yang.

Additional information

This work was supported by the National Natural Science Foundation of China (Nos. 11171322 and 11426236) and the Fundamental Research Funds for the Central Universities (WK0010000051).

About this article

Cite this article

Zhang, QT., Liu, Y., Zhou, W. et al. A Sequential Regression Model for Big Data with Attributive Explanatory Variables. J. Oper. Res. Soc. China 3, 475–488 (2015). https://doi.org/10.1007/s40305-015-0109-8

Received: 08 June 2015
Revised: 08 November 2015
Accepted: 17 November 2015
Published: 14 December 2015
Issue Date: December 2015
DOI: https://doi.org/10.1007/s40305-015-0109-8
