Abstract
The ability to generate insights and value from data has become an important asset for organizations. At the same time, the need for experts in analytics is increasing and the number of analytics applications is growing. Recently, a new trend has emerged: analytics-as-a-service platforms, which make it easier for both novice and expert users to apply analytics. In this study, the authors examine these new services by conducting a full-factorial experiment in which both inexperienced and experienced users take on an analytics task with an analytics-as-a-service technology. The results show that although experts in analytics still significantly outperform novices, these web-based platforms do offer an advantage to inexperienced users. Furthermore, the authors find that analytics-as-a-service does not offer the same benefits across different analytics tasks. Specifically, they observe better performance on supervised analytics tasks. Moreover, this study indicates that there are significant differences among novices. The most important distinction lies in the approach they take to the task. Novices who follow a more complex, although structured, workflow behave more similarly to experts and, thus, also perform better. The findings can aid managers in their hiring and training strategies with regard to both business users and data scientists. Moreover, they can guide managers in the development of an enterprise-wide analytics culture. Finally, the results can inform vendors about the design and development of these platforms.
Notes
See "Appendix D" (available online via http://springerlink.com).
See http://www.dataminingapps.com/wp-content/uploads/2015/09/Cluster-English.mp4 (unsupervised problem) and http://www.dataminingapps.com/wp-content/uploads/2015/09/churn-English.mp4 (supervised problem).
See "Appendix E" (available online via http://springerlink.com).
See "Appendix F" (available online via http://springerlink.com).
Acknowledgements
This work was supported by Colruyt Group and the Coca-Cola Company.
Additional information
Accepted after two revisions by Prof. Dr. Kliewer.
Cite this article
Lismont, J., Van Calster, T., Óskarsdóttir, M. et al. Closing the Gap Between Experts and Novices Using Analytics-as-a-Service: An Experimental Study. Bus Inf Syst Eng 61, 679–693 (2019). https://doi.org/10.1007/s12599-018-0539-z
DOI: https://doi.org/10.1007/s12599-018-0539-z