Abstract
In the past few years, there has been an increasing awareness that the enormous amount of data being captured by both public and private organisations can be profitably used for decision making. Aided by low-cost computer hardware, fast processing speeds and advancements in data storage technologies, Big Data Analytics has emerged as a fast-growing field. However, the statistical challenges faced by statisticians and data scientists while doing analytics with Big Data have not been adequately discussed. In this paper, we discuss several statistical challenges encountered while analyzing Big Data for management decision making. These challenges give statisticians significant opportunities for developing new statistical methods. Two methods that hold promise in addressing some of the challenges with Big Data, Symbolic Data Analysis and Approximate Stream Regression, are discussed briefly with real-life examples. Two case studies of applications of analytics in management, one in marketing management and the other in human resource management, are also discussed.
Copyright information
© 2016 Springer India
About this chapter
Cite this chapter
Laha, A. (2016). Statistical Challenges with Big Data in Management Science. In: Pyne, S., Rao, B., Rao, S. (eds) Big Data Analytics. Springer, New Delhi. https://doi.org/10.1007/978-81-322-3628-3_3
DOI: https://doi.org/10.1007/978-81-322-3628-3_3
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-3626-9
Online ISBN: 978-81-322-3628-3
eBook Packages: Mathematics and Statistics (R0)