Abstract
This paper surveys the state of the art and future directions of one of the most important emerging technologies within business analytics (BA), namely prescriptive analytics (PSA). BA focuses on data-driven decision-making and consists of three phases: descriptive, predictive, and prescriptive analytics. While descriptive and predictive analytics allow us to analyze past events and predict future ones, respectively, they do not provide any direct support for decision-making. PSA fills this gap between data and decisions. We have observed an increasing interest in in-DBMS PSA systems in both research and industry. This paper therefore aims to provide a foundation for PSA as a separate field of study. To do this, we first describe the different phases of BA. We then survey classical analytics systems and identify their main limitations in supporting PSA, based on which we introduce the criteria and methodology used in our analysis. We next survey, categorize, and discuss the state of the art within emerging, so-called PSA\(^+\), systems, followed by a presentation of the main challenges and opportunities for next-generation PSA systems. Finally, the main findings are discussed and directions for future research are outlined.
Notes
At the time of publication [73], Tiresias had been tested only with PostgreSQL.
References
Aalst, W.M.P.V.D.: Process Mining—Discovery, Conformance and Enhancement of Business Processes. Springer, Berlin (2011)
Abbena, E., Salamon, S., Gray, A.: Modern Differential Geometry of Curves and Surfaces with Mathematica. Chapman and Hall/CRC, Boca Raton (2017)
Akdere, M., Çetintemel, U., Riondato, M., Upfal, E., Zdonik, S.B.: The case for predictive database systems: opportunities and challenges. CIDR 2011, 167–174 (2011)
Aref, M., ten Cate, B., Green, T.J., Kimelfeld, B., Olteanu, D., Pasalic, E., Veldhuizen, T.L., Washburn, G.: Design and implementation of the logicblox system. In: Proceedings of SIGMOD, pp. 1371–1382 (2015)
Basu, A.: Five pillars of prescriptive analytics success. Analyt. Mag. March-April (2013). http://analytics-magazine.org/executive-edge-five-pillars-of-prescriptiveanalytics-success/. Accessed 27 May 2019
Bertsimas, D., Kallus, N.: From predictive to prescriptive analytics. ArXiv e-prints (2014)
Bezanson, J., Edelman, A., Karpinski, S., Shah, V.B.: Julia: a fresh approach to numerical computing. SIAM Rev. 59(1), 65–98 (2017)
Bihis, M., Roychowdhury, S.: A generalized flow for multi-class and binary classification tasks: an azure ml approach. In: 2015 IEEE International Conference on Big Data, pp. 1728–1737 (2015)
Birge, J.R., Louveaux, F.: Introduction to Stochastic Programming. Springer, Berlin (2011)
Bixby, R.E.: Solving real-world linear programs: a decade and more of progress. Oper. Res. 50(1), 3–15 (2002)
Blockeel, H.: Data mining: from procedural to declarative approaches. New Gener. Comput. 33(2), 115–135 (2015)
Boehm, M., Evfimievski, A.V., Pansare, N., Reinwald, B.: Declarative machine learning—a classification of basic properties and types. CoRR abs/1605.05826 (2016)
Bonczek, R.H., Holsapple, C.W., Whinston, A.B.: Foundations of Decision Support Systems. Academic Press, London (2014)
Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)
Brown, P.G.: Overview of scidb: large scale array storage, processing and analysis. In: Proceedings of SIGMOD, pp. 963–968 (2010)
Brucato, M., Beltran, J.F., Abouzied, A., Meliou, A.: Scalable package queries in relational database systems. PVLDB 9(7), 576–587 (2016)
Burstein, F., Holsapple, C.: Handbook on Decision Support Systems 2: Variations. Springer, Berlin (2008)
Chasseur, C., Li, Y., Patel, J.M.: Enabling JSON document stores in relational systems. WebDB 13, 14–15 (2013)
Chaudhuri, S., Dayal, U.: An overview of data warehousing and OLAP technology. SIGMOD Rec. 26(1), 65–74 (1997)
Chen, D.S., Batson, R.G., Dang, Y.: Applied Integer Programming: Modeling and Solution. Wiley, New York (2010)
COIN-OR: COIN-OR: Computational infrastructure for operations research—open-source software for the operations research community. https://www.coin-or.org/ (2018). Accessed 22 Mar 2018
Crotty, A., Galakatos, A., Dursun, K., Kraska, T., Binnig, C., Çetintemel, U., Zdonik, S.: An architecture for compiling udf-centric workflows. PVLDB 8(12), 1466–1477 (2015)
Crotty, A., Galakatos, A., Dursun, K., Kraska, T., Çetintemel, U., Zdonik, S.B.: Tupleware: “big” data, big analytics, small clusters. In: CIDR 2015 (2015)
De Gooijer, J.G., Hyndman, R.J.: 25 years of time series forecasting. Int. J. Forecast. 22(3), 443–473 (2006)
Demsar, J.: Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 7, 1–30 (2006)
Desanctis, G., Gallupe, R.B.: A foundation for the study of group decision support systems. Manag. Sci. 33(5), 589–609 (1987)
Dietterich, T.G.: Approximate statistical tests for comparing supervised classification learning algorithms. Neural Comput. 10(7), 1895–1923 (1998)
Feng, X., Kumar, A., Recht, B., Ré, C.: Towards a unified architecture for in-RDBMS analytics. In: Proceedings of SIGMOD, pp. 325–336 (2012)
Fischer, U., Dannecker, L., Siksnys, L., Rosenthal, F., Böhm, M., Lehner, W.: Towards integrated data analytics: time series forecasting in DBMS. Datenbank-Spektrum 13(1), 45–53 (2013)
Fischer, U., Rosenthal, F., Lehner, W.: F2DB: the flash-forward database system. In: IEEE 28th ICDE 2012, pp. 1245–1248 (2012)
Frazzetto, D., Neupane, B., Pedersen, T.B., Nielsen, T.D.: Adaptive user-oriented direct load-control of residential flexible devices. In: Proceedings of e-Energy, pp. 1–11 (2018)
Gartner: Flipping to Digital Leadership, Insights from the 2015 Gartner CIO Agenda Report (2015). https://www.gartner.com/imagesrv/cio/pdf/cio_agenda_insights2015.pdf. Accessed 21 Aug 2018
Gartner: Gartner’s 2016 hype cycle for emerging technologies identifies three key trends that organizations must track to gain competitive advantage. https://www.gartner.com/newsroom/id/3412017 (2016). Accessed 22 Mar 2018
Getoor, L.: Introduction to Statistical Relational Learning. MIT Press, Cambridge (2007)
Ghoting, A., Krishnamurthy, R., Pednault, E.P.D., Reinwald, B., Sindhwani, V., Tatikonda, S., Tian, Y., Vaithyanathan, S.: SystemML: declarative machine learning on MapReduce. In: Proceedings of ICDE, pp. 231–242 (2011)
Gorunescu, F.: Data Mining—Concepts, Models and Techniques, Intelligent Systems Reference Library, vol. 12. Springer, Berlin (2011)
Goyal, A., Aprilia, E., Janssen, G., Kim, Y., Kumar, T., Mueller, R., Phan, D., Raman, A., Schuddebeurs, J.D., Xiong, J., Zhang, R.: Asset health management using predictive and prescriptive analytics for the electric power grid. IBM J. Res. Dev. 60(1), 1–4 (2016)
Green, T.J., Aref, M., Karvounarakis, G.: Logicblox, platform and language: a tutorial. In: Proceedings of Datalog, pp. 1–8 (2012)
Gröger, C., Schwarz, H., Mitschang, B.: Prescriptive analytics for recommendation-based business process optimization. In: International Conference on Business Information Systems, pp. 25–37 (2014)
Gurobi Optimization LLC: Gurobi Optimizer (2014). http://www.gurobi.com/products/gurobi-optimizer. Accessed 7 May 2019
Haas, P.J., Maglio, P.P., Selinger, P.G., Tan, W.C.: Data is dead... without what-if models. PVLDB 4(12), 1486–1489 (2011)
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The weka data mining software: an update. ACM SIGKDD Explor. Newslett. 11(1), 10–18 (2009)
Hellerstein, J.M., Ré, C., Schoppmann, F., Wang, D.Z., Fratkin, E., Gorajek, A., Ng, K.S., Welton, C., Feng, X., Li, K., Kumar, A.: The madlib analytics library or MAD skills, the SQL. PVLDB 5(12), 1700–1711 (2012)
High, R.: The Era of Cognitive Systems: An Inside Look at IBM Watson and How It Works. IBM Corporation, Redbooks (2012)
Holsapple, C.W., Lee-Post, A., Pakath, R.: A unified foundation for business analytics. Decis. Support Syst. 64, 130–141 (2014)
Hupfeld, D., Maccioni, R., Sesemann, R., Ravazzolo, D.: Fleet asset capacity analysis and revenue management optimization using advanced prescriptive analytics. J. Revenue Pricing Manag. 15(6), 516–522 (2016)
IBM: IBM DB2 database—database software: IBM analytics. https://www.ibm.com/analytics/us/en/db2/ (2018). Accessed 22 Mar 2018
IBM: Prescriptive analytics—IBM analytics. https://www.ibm.com/analytics/data-science/prescriptive-analytics (2018). Accessed 22 Mar 2018
Inmon, W.H.: Building the Data Warehouse. Wiley, New York (2005)
Jardine, D.A.: The ANSI/SPARC DBMS Model; Proceedings of the Second Share Working Conference on Data Base Management Systems, Montreal, Canada, April 26–30, 1976. Elsevier Science Inc., Amsterdam (1977)
Jarke, M., Lenzerini, M., Vassiliou, Y., Vassiliadis, P.: Fundamentals of Data Warehouses. Springer, Berlin (2013)
Kalinin, A., Cetintemel, U., Zdonik, S.: Searchlight: enabling integrated search and exploration over large multidimensional data. Proc. VLDB Endow. 8(10), 1094–1105 (2015)
Kaur, J., Mann, K.S.: AI based healthcare platform for real time, predictive and prescriptive analytics using reactive programming. J. Phys. Conf. Ser. 933, 012010 (2018)
Keen, P.G., Morton, M.S.S.: Decision Support Systems: An Organizational Perspective, vol. 35. Addison-Wesley, Reading (1978)
Khalefa, M.E., Fischer, U., Pedersen, T.B., Lehner, W.: Model-based integration of past and future in timetravel. Proc. VLDB Endow. 5(12), 1974–1977 (2012)
Kimball, R., Ross, M.: The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling. Wiley, New York (2011)
Kraska, T., Talwalkar, A., Duchi, J.C., Griffith, R., Franklin, M.J., Jordan, M.I.: Mlbase: a distributed machine-learning system. In: Proceedings of CIDR (2013)
Kumar, A., McCann, R., Naughton, J., Patel, J.M., Babros, T.E., Hunt, R.J., Koski, K., Strikwerda, J.C., Wade, B.A., Arnold, R.B., et al.: A survey of the existing landscape of ml systems. UW-Madison CS Tech. Rep. TR1827 (2015)
Kumar, A., McCann, R., Naughton, J.F., Patel, J.M.: Model selection management systems: the next frontier of advanced analytics. SIGMOD Rec. 44(4), 17–22 (2015)
Laborie, P., Rogerie, J., Shaw, P., Vilím, P.: IBM ILOG CP optimizer for scheduling. Constraints 23(2), 210–250 (2018)
Lattner, C., Adve, V.S.: LLVM: a compilation framework for lifelong program analysis and transformation. In: 2nd IEEE ACM CGO, pp. 75–88 (2004)
Linoff, G.S., Berry, M.J.: Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management. Wiley, New York (2011)
Luhn, H.P.: A business intelligence system. IBM J. Res. Dev. 2(4), 314–319 (1958)
Lustig, I., Dietrich, B., Johnson, C., Dziekan, C.: The analytics journey. Analyt. Mag. 3(6), 11–13 (2010)
Madsen, A.L., Jensen, F., Kjærulff, U., Lang, M.: The hugin tool for probabilistic graphical models. Int. J. Artif. Intell. Tools 14(3), 507–544 (2005)
Makhorin, A.: The GNU linear programming kit (GLPK). GNU Software Foundation (2015). https://www.gnu.org/software/glpk/. Accessed 7 May 2019
Makridakis, S., Wheelwright, S.C., Hyndman, R.J.: Forecasting Methods and Applications. Wiley, New York (2008)
Malinowski, E., Zimányi, E.: Advanced Data Warehouse Design—From Conventional to Spatial and Temporal Applications. Data-Centric Systems and Applications. Springer, Berlin (2008)
Mansinghka, V.K., Tibbetts, R., Baxter, J., Shafto, P., Eaves, B.: Bayesdb: a probabilistic programming system for querying the probable implications of data. CoRR abs/1512.05006 (2015)
Markl, V.: Breaking the chains: on declarative data analysis and data independence in the big data era. PVLDB 7(13), 1730–1733 (2014)
MathWorks: Matlab—mathworks. https://www.mathworks.com/products/matlab.html (2018). Accessed 22 Mar 2018
Meliou, A., Gatterbauer, W., Suciu, D.: Reverse data management. PVLDB 4(12), 1490–1493 (2011)
Meliou, A., Suciu, D.: Tiresias: the database oracle for how-to queries. In: Proceedings of SIGMOD, pp. 337–348 (2012)
Meng, X., Bradley, J., Yavuz, B., Sparks, E., Venkataraman, S., Liu, D., Freeman, J., Tsai, D., Amde, M., Owen, S., et al.: Mllib: machine learning in apache spark. J. Mach. Learn. Res. 17(1), 1235–1241 (2016)
Microsoft: Microsoft excel 2016, spreadsheet software, excel free trial. https://products.office.com/en-us/excel (2018). Accessed 22 Mar 2018
Nagabhushana, S.: Data Warehousing OLAP and Data Mining. New Age International, Chennai (2006)
Nechifor, S., Puiu, D., Tarnauca, B., Moldoveanu, F.: Prescriptive analytics based autonomic networking for urban streams services provisioning. In: 2015 IEEE 81st Vehicular Technology Conference (VTC Spring), pp. 1–5 (2015)
Neupane, B., Pedersen, T.B., Thiesson, B.: Utilizing device-level demand forecasting for flexibility markets. In: Proceedings of e-Energy, pp. 108–118 (2018)
Neupane, B., Šikšnys, L., Pedersen, T.B.: Generation and evaluation of flex-offers from flexible electrical devices. In: Proceedings of e-Energy, pp. 143–156 (2017)
Owen, S., Anil, R., Dunning, T., Friedman, E.: Mahout in action. Manning Publications Co, Shelter Island, NY (2011)
Power, D.J., Sharda, R., Burstein, F.: Decision Support Systems. Wiley, New York (2015)
Powers, C.A., Meyer, C.M., Roebuck, M.C., Vaziri, B.: Predictive modeling of total healthcare costs using pharmacy claims data: a comparison of alternative econometric cost modeling techniques. Med. Care 43(11), 1065–1072 (2005)
Pritchard, P.J., Pritchard, R.: MathCAD: A Tool for Engineering Problem Solving (BEST Series). McGraw-Hill Higher Education, New York (1998)
Ramakrishnan, R., Gehrke, J.: Database Management Systems, 3rd edn. McGraw-Hill, New York (2003)
Recht, B., Re, C., Wright, S., Niu, F.: Hogwild: a lock-free approach to parallelizing stochastic gradient descent. In: Proceedings of the 25th Annual Conference on Neural Information Processing Systems, pp. 693–701 (2011)
Richardson, M., Domingos, P.M.: Markov logic networks. Mach. Learn. 62(1–2), 107–136 (2006)
Rusitschka, S., Doblander, C., Goebel, C., Jacobsen, H.A.: Adaptive middleware for real-time prescriptive analytics in large scale power systems. In: Proceedings of Middleware, p. 5 (2013)
Russell, S.J., Norvig, P., Canny, J.F., Malik, J.M., Edwards, D.D.: Artificial Intelligence: A Modern Approach, vol. 2. Prentice Hall, Upper Saddle River (2003)
SAS: SAS business analytics—SAS. https://www.sas.com/en_us/solutions/business-analytics.html (2018). Accessed 22 Mar 2018
Sauter, V.L.: Decision Support Systems for Business Intelligence. Wiley, New York (2014)
Shim, J.P., Warkentin, M., Courtney, J.F., Power, D.J., Sharda, R., Carlsson, C.: Past, present, and future of decision support technology. Decis. Support Syst. 33(2), 111–126 (2002)
Siegel, E.: Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die. Wiley, New York (2013)
Šikšnys, L., Pedersen, T.B.: Prescriptive analytics. In: Encyclopedia of Database Systems, 2nd ed. Springer, New York, NY (2018). https://doi.org/10.1007/978-1-4614-8265-9_80624
Šikšnys, L., Pedersen, T.B.: Demonstrating solveDB: an SQL-based DBMS for optimization applications. In: Proceedings of ICDE, pp. 1367–1368 (2017)
Smet, G.D.: A decade of optaplanner. https://www.optaplanner.org/blog/2016/08/07/ADecadeOfOptaPlanner.html (2016). Accessed 01 Sept 2018
Soltanpoor, R., Sellis, T.: Prescriptive analytics for big data. In: Databases Theory and Applications—27th Australasian Database Conference, pp. 245–256 (2016)
Song, S., Kim, D.J., Hwang, M., Kim, J., Jeong, D., Lee, S., Jung, H., Sung, W.: Prescriptive analytics system for improving research power. In: 16th IEEE CSE, pp. 1144–1145 (2013)
Souza, G.C.: Supply chain analytics. Bus. Horiz. 57(5), 595–605 (2014)
Stackowiak, R., Rayman, J., Greenwald, R.: Oracle Data Warehousing and Business Intelligence SO. Wiley, New York (2007)
Steinhaus, S.: Comparison of mathematical programs for data analysis. http://www.cybertester.com/data/ncrunch4.pdf (2008). Accessed 24 Aug 2018
Šikšnys, L.: Towards prescriptive analytics in cyber-physical systems. Ph.D. thesis, Aalborg University and Dresden University of Technology (2015)
Šikšnys, L., Pedersen, T.B.: Dependency-based flexoffers: scalable management of flexible loads with dependencies. In: Proceedings of e-Energy, pp. 11:1–11:13 (2016)
Šikšnys, L., Pedersen, T.B.: Solvedb: integrating optimization problem solvers into SQL databases. In: Proceedings of SSDBM, pp. 14:1–14:12 (2016)
Šikšnys, L., Valsomatzis, E., Hose, K., Pedersen, T.B.: Aggregating and disaggregating flexibility objects. TKDE 27(11), 2893–2906 (2015)
Tang, Z., Maclennan, J.: Data Mining with SQL Server 2005. Wiley, New York (2005)
Valsomatzis, E., Pedersen, T.B., Abell, A., Hose, K.: Aggregating energy flexibilities under constraints. In: Proceedings of SmartGridComm, pp. 484–490 (2016)
Van Poucke, S., Thomeer, M., Heath, J., Vukicevic, M.: Are randomized controlled trials the (g)old standard? From clinical intelligence to prescriptive analytics. J. Med. Internet Res. 18(7), e185 (2016)
Vanderbei, R.J.: Linear Programming. Springer, Berlin (2014)
Waller, M.A., Fawcett, S.E.: Data science, predictive analytics, and big data: a revolution that will transform supply chain design and management. J. Bus. Logist. 34(2), 77–84 (2013)
Watkins, E.R.: Principles of the business rule approach: Ronald G. Ross, Addison-Wesley information technology series, february 2003, 256pp., price £30.99, ISBN 0-201-78893-4. Int. J. Inf. Manag. 24(2), 196–197 (2004)
Winston, W.L., Goldberg, J.B.: Operations Research: Applications and Algorithms, vol. 3. Thomson/Brooks/Cole, Belmont (2004)
Wu, P.J., Yang, C.K.: The green fleet optimization model for a low-carbon economy: a prescriptive analytics. ICASI 2017, 107–110 (2017)
Acknowledgements
This research was supported in part by the MADE-AAU Project, the DiCyPS Project funded by Innovation Fund Denmark, and the GOFLEX Project funded by the EC under the Horizon 2020 Program.
Cite this article
Frazzetto, D., Nielsen, T.D., Pedersen, T.B. et al. Prescriptive analytics: a survey of emerging trends and technologies. The VLDB Journal 28, 575–595 (2019). https://doi.org/10.1007/s00778-019-00539-y
DOI: https://doi.org/10.1007/s00778-019-00539-y