Statistical Challenges with Big Data in Management Science

Abstract

In the past few years, there has been an increasing awareness that the enormous amount of data being captured by both public and private organisations can be profitably used for decision making. Aided by low-cost computer hardware, fast processing speeds and advancements in data storage technologies, Big Data Analytics has emerged as a fast-growing field. However, the statistical challenges faced by statisticians and data scientists while doing analytics with Big Data have not been adequately discussed. In this paper, we discuss several statistical challenges that are encountered while analyzing Big Data for management decision making. These challenges give statisticians significant opportunities for developing new statistical methods. Two methods that hold promise in addressing some of the challenges with Big Data, Symbolic Data Analysis and Approximate Stream Regression, are discussed briefly with real-life examples. Two case studies of applications of analytics in management, one in marketing management and the other in human resource management, are also discussed.

Author information

Corresponding author

Correspondence to Arnab Laha.

Copyright information

© 2016 Springer India

About this chapter

Cite this chapter

Laha, A. (2016). Statistical Challenges with Big Data in Management Science. In: Pyne, S., Rao, B., Rao, S. (eds) Big Data Analytics. Springer, New Delhi. https://doi.org/10.1007/978-81-322-3628-3_3
