Statistical Challenges with Big Data in Management Science

Laha, Arnab

doi:10.1007/978-81-322-3628-3_3

Chapter
First Online: 13 October 2016

Abstract

In the past few years, there has been increasing awareness that the enormous amount of data being captured by both public and private organisations can be profitably used for decision making. Aided by low-cost computer hardware, fast processing speeds and advancements in data storage technologies, Big Data Analytics has emerged as a fast-growing field. However, the statistical challenges that statisticians and data scientists face while doing analytics with Big Data have not been adequately discussed. In this paper, we discuss several statistical challenges that are encountered while analyzing Big Data for management decision making. These challenges give statisticians significant opportunities for developing new statistical methods. Two methods that hold promise in addressing some of the challenges with Big Data, Symbolic Data Analysis and Approximate Stream Regression, are discussed briefly with real-life examples. Two case studies of applications of analytics in management, one in marketing management and the other in human resource management, are also discussed.

Author information

Authors and Affiliations

Indian Institute of Management Ahmedabad, Ahmedabad, India
Arnab Laha

Corresponding author

Correspondence to Arnab Laha.

Editor information

Editors and Affiliations

Indian Institute of Public Health, Hyderabad, India
Saumyadipta Pyne
CRRao AIMSCS, University of Hyderabad Campus, Hyderabad, India
B.L.S. Prakasa Rao
CRRao AIMSCS, University of Hyderabad Campus, Hyderabad, India
S.B. Rao

About this chapter

Cite this chapter

Laha, A. (2016). Statistical Challenges with Big Data in Management Science. In: Pyne, S., Rao, B., Rao, S. (eds) Big Data Analytics. Springer, New Delhi. https://doi.org/10.1007/978-81-322-3628-3_3

DOI: https://doi.org/10.1007/978-81-322-3628-3_3
Published: 13 October 2016
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-3626-9
Online ISBN: 978-81-322-3628-3
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
