1 Introduction

The pervasive influence of digital technologies impacts value creation and value capture (Schwab 2017) as digital products become more the rule than the exception (Brynjolfsson and McAfee 2014). Given the transformational character of these digital products on many levels, the concept of Digital Transformation (DT) receives increasing attention in management research and practice. For our purposes, it helps to understand DT as generally the “disruptive implications of digital technologies” (Nambisan et al. 2019, p. 1). These implications appear at and across various levels, from the individual over the organizational to the societal level (Lepak et al. 2007; Nambisan et al. 2019). The transformation affects organizations as a whole and leads to changes in ways of performing work (Haverkort and Zimmermann 2017), organizing work, and even in the business models of companies (Lucas and Goh 2009; Schallmo et al. 2017).

However, research approaches are often highly specialized and restricted to their own domains, resulting in a rapidly growing number of publications each year with results from different disciplines and points of view in the field of DT. Because of these different research approaches and domains, the larger field of DT is very complex and hard to comprehend. Researchers do not even agree on a common definition of the term “digital transformation” (cf. Morakanyane et al. 2017), and it is often used interchangeably with terms like “digitization” and “digitalization”. This complexity leads to uncertainty regarding the topic, especially in practice, such that many firms struggle with the development, diffusion, and implementation of new technologies regarding digital transformation (Brynjolfsson and McAfee 2014), and consequently, great opportunities are wasted (Hirsch-Kreinsen 2015).

In order to improve our understanding of the possible implications of DT, it is critical to overcome these uncertainties and to further develop a common understanding of this field. There are already studies in the literature on the implications of DT in businesses (Kane et al. 2015; Matt et al. 2015), which can be used as a basis to foster understanding. Besides the many technology-driven studies, additional research approaches from a business perspective are needed (Hirsch-Kreinsen 2015). Changes can be observed in industry and industrial processes (Pisano and Shih 2012), as well as in areas like smart homes (Risteska Stojkoska and Trivodaliev 2017) or e-health (Ross et al. 2016). Therefore, the topic is of interest to many different disciplines, yet there is a lack of synergy. Cooperation among the disciplines of electrical engineering, business administration, computer science, business, and information systems engineering is a necessary feature of this phenomenon (Hirsch-Kreinsen 2015).

Our study aims at structuring existing research, identifying the major current trends, and thus offers an overview of recent research streams and topics in the area of DT from a business perspective. We contribute to the wide field of DT research by providing a theoretical background for subsequent research. Research areas are shown and possible gaps identified. This work may help researchers to identify similarities and differences within areas of DT research. Our findings may ease the comprehension of complementary conclusions from adjacent fields and foster an interdisciplinary understanding. In emerging topics, expertise is important, as is adaptive expertise, which describes the ability of researchers to understand and combine results and procedures from different fields (Boon et al. 2019). Thus, our results can be regarded as the first step towards this ability by showing a holistic approach to DT research. We appreciate a mutual interchange of findings from corresponding research streams in future.

There are many different ways to study the complex and immense field of DT from a business perspective. To bring these together, we use a citation network analysis (Boyack and Klavans 2010). Unlike other literature review approaches, the network analysis does not focus on a special field within DT research. It is less selective in the first instance and enables the inclusion of a broad literature base, allowing the diverse field to be structured. To gain a broad literature base, we use search terms combining DT with the focused business perspective. The resulting database is then used for the citation network analysis, which is executed with the tool Gephi and results in clusters representing different research streams. Finally, the most relevant clusters are examined qualitatively to give an overview of major trends and topics studied in these streams.

In the following, we develop the theoretical foundation for the research approach including the definition of digital transformation and a short introduction to our understanding of the business and technology perspective. Afterward, our method is introduced in detail. Results are presented in general, following an overview of the different clusters identified. Moreover, research gaps are shown. We conclude with a summary, limitations, and an outlook for further research.

2 Theoretical foundation

2.1 Digital transformation

The term “digital transformation” (DT) pervades the modern world. However, a generally valid definition for the concept of digital transformation does not yet exist. Some researchers focus on specific technologies to explain an “organizational shift to big data analytics” (Nwankpa and Roumani 2016, p. 4), while others focus on technology in general as the driver of radical change (Westerman et al. 2014). We want to underline, however, that DT does not merely refer to technological changes, but also to the impacts thereof on the organization itself (Hinings et al. 2018). It leads to “transformations of key business operations and affects products and processes, as well as organizational structures and management concepts” (Matt et al. 2015, p. 339). The changes that come along with the digitalization affect people, society, communication and the whole business (Gimpel and Röglinger 2015; Jung et al. 2018).

Many of the technologies that affect DT are not new. The innovation is about “combinations of information, computing, communication, and connectivity technologies” (Bharadwaj et al. 2013, p. 471). The major technological areas which enable DT are very diverse and traditionally called “general purpose technologies” (Hirsch-Kreinsen and ten Hompel 2017). These include, for example, cyber-physical systems (CPS), (industrial) internet of things (I/IoT), cloud computing (CC), big data (BD), artificial intelligence but also augmented and virtual reality (Cheng et al. 2016).

Yet, “organizations struggle with radical change to adopt novel digital institutional arrangements that are radical and transformational” (Hinings et al. 2018, p. 59). However, many researchers and practitioners see positive effects of the digitalization. They sense the manifold benefits that foster an increase in sales and productivity triggered by innovative forms of value creation and new ways of interaction with customers and suppliers (Downes and Nunes 2013; Matt et al. 2015; Parviainen et al. 2017). For example, the digital interconnection of machines will enable flexible small series (Spath et al. 2013) and improve the value creation process (Stock and Seliger 2016). Digital communication opportunities and virtual networks change the way of doing business and gaining competitive advantage (Parviainen et al. 2017). Moreover, researchers sense positive effects because DT triggers job growth, such as service occupations and robot development (Brynjolfsson and McAfee 2014).

In summary, the DT of business leads to three significant changes (Fitzgerald et al. 2014; Liere-Netheler et al. 2018): (1) digitally supported and cross-linked processes, (2) digitally enabled communication, and (3) new ways of value generation based on digital innovations or gained digital data. These major changes can be found worldwide and in all industries. Moreover, DT has spawned new business areas such as e-government, e-banking, e-marketing, e-tourism, and the highly innovative field of e-health, where two research areas (medicine and information systems) meld.

Despite the gains of DT, more and more researchers see negative effects of digitalization. A significant threat is impending job loss (Brynjolfsson and McAfee 2014). Digital processes and the increased use of robot technologies will lead to employee reductions, mainly in low-skilled jobs (Frey and Osborne 2017). Furthermore, risks such as cybersecurity menaces (Greengard 2016) or uncontrolled or errant data (Allcott and Gentzkow 2017) pose threats to businesses. Firms in all industries struggle with the heterogeneous landscape of interfaces and integration standards (Bley et al. 2016). Still, the general expectations towards DT are high. Researchers from different disciplines contribute to an ongoing evolution of DT, its risks, and future applications.

2.2 Business and technology perspective

As described in the previous section, DT is based on technological progress but implies a much broader focus that influences organizations as a whole. Research in technological areas like informatics and engineering is therefore very important. However, to drive the topic forward, business perspectives are necessary. As the discipline of information systems (IS) unites these views, we regard it as useful for our purpose. Since the development of information systems, their role in supporting management has become increasingly important. Gross and Solymossy (2016) outline three eras in the development of IS: from 1937 to 1962, storage of economic data in central administrations; from 1962 to 1987, adoption of computer hardware and software by companies; and from 1987 to 2012, usage in transactions with stakeholders. The current era, i.e., after 2012, is characterized by digital technologies influencing how companies are run (Fitzgerald et al. 2014). Companies use digital twins, business-to-machine communication, and data-driven business models to deliver value to customers. Looking at Porter’s value chain (Huggins and Izushi 2011), activities move closer together through the use of connected digital devices and information systems.

Within this paper, we will not focus on specific technologies. The aim is to take a holistic view of how the area of DT is evolving (Devaraj and Kohli 2003; Karimi and Walter 2015). Of course, we will use specific technological terms in our literature search to find relevant articles, but at the same time connect them to their usage within organizations. As different research fields arise within DT (see Sect. 2.1), the scope of this article is not restricted to particular applications; rather, it is restricted to a non-technological perspective. We aim at topics from a socio-technical view. This includes the acceptance, adoption, and use of technologies (Liere-Netheler et al. 2018).

3 Method

The importance and potential of reviews have increased across all academic disciplines (Schryen 2015). To gain an overall understanding, a literature review in the sense of a state of the art has many benefits. Researchers collect and understand what is already known in the specified field of interest. Furthermore, they can identify and name the research gaps. Moreover, a review is essential for the foundation of a proposed study (Levy and Ellis 2006) and can also help to generate ideas for practical problems (Okoli and Schabram 2010), thereby serving as the basis for any further research in a specific field (vom Brocke et al. 2015). According to Fink (2005), a literature review has to be systematic in its approach, explicit in its procedure, comprehensive in scope, and reproducible. The documentation of the research process has been identified as the crucial part of a successful review (vom Brocke et al. 2009), which is why we present our procedure in detail in the following.

We followed a three-step research approach similar to other research designs in the literature (Hausberg and Korreck 2018). An overview of the approach can be seen in Fig. 1. The outcome (out) of each step is used to perform the following step and is thus described as an input (in). The individual steps are explained in the remainder of this section.

Fig. 1 Research approach

3.1 Identification of literature

As a first step for our study, we identified the database for further analysis. To develop the search terms for our review, we first read articles from the field of interest with special regard to main titles and keywords. We searched from a holistic view, seeking research dealing with DT as an organizational change. With the help of the literature, we deduced a set of relevant buzzwords combining two research streams: digitalization and business research. As the goal was not to focus on a specific technology, we included different technologies within the search terms. Using the list of keywords, we conducted several search loops to adapt the relevant terms iteratively. After each loop, the top ten to twenty results by times cited were checked to make sure the search fit our research question. The final terms used can be seen in Table 1. The first column of the table includes synonymous concepts of digitalization like “Industrie 4.0” as well as technologies and inventions linked to DT. Many terms have connections to the field of Information Systems (IS) research and links to production systems. The right side of the table mainly presents business areas (e.g., controlling, logistics) and closely linked terms. By combining these two fields, we obtain research material dealing with the desired view of DT in business. We are aware that the search terms are theory- and technology-driven and less impact-driven. As DT is at an evolving stage, we expect the focus of past and current research on theory and technology development to be useful.

Table 1 Search terms

We used the ISI Web of Science (WoS) as the database for our search. The different combinations of terms were searched in titles, keywords, or abstracts by using the field ‘topic’. WoS is considered the most comprehensive database and is frequently used in management and IS research (Dahlander and Gann 2010; Schryen 2015; Mian et al. 2016; Albort-Morant and Ribeiro-Soriano 2016). We conducted the search in November 2017 and decided to limit the search period to the last 20 years because DT as understood in this article (described in the theoretical foundation) emerged as a topic in the 2000s. Nevertheless, we included research back to 1997 so as not to miss important groundwork. Before that time, digital technologies like the Internet had only just surfaced. To stay focussed on the business and technology perspective, we restricted the research areas to operations research & management science, business & economics, international relations, social sciences (other topics), communication, behavioural sciences, social issues, and sociology.
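The pairing of the two term lists described above can be sketched as follows. This is a minimal illustration, not the query actually submitted: the term lists are short placeholders drawn from examples in the text (Table 1 holds the real ones), the function name `build_topic_queries` is hypothetical, and the `TS=` topic tag with the `AND` operator is assumed to follow the WoS advanced-search syntax.

```python
from itertools import product


def build_topic_queries(digital_terms, business_terms):
    """Pair every digitalization term with every business term.

    Each pair becomes one topic query, so a record matches only if it
    mentions both a digital concept and a business area.
    """
    queries = []
    for digital, business in product(digital_terms, business_terms):
        # TS= is assumed to be the topic field tag (title, abstract, keywords).
        queries.append(f'TS=("{digital}" AND "{business}")')
    return queries


# Placeholder terms taken from examples in the text, not the full Table 1.
digital_terms = ["Industrie 4.0", "big data"]
business_terms = ["controlling", "logistics"]

queries = build_topic_queries(digital_terms, business_terms)
for query in queries:
    print(query)
```

Two terms per list yield four queries; with the full term lists the number of combinations grows multiplicatively, which is one reason an iterative refinement of the terms is useful.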

3.2 Citation network analysis

Today, literature reviews face the challenge of a fast-growing number of articles, the majority of which is available online (vom Brocke et al. 2015). An analysis with the help of tools makes the large amount of literature manageable. We used the freeware online tool hammer.nailsproject.org to conduct a bibliometric analysis and obtain the co-citation node-edge-files. We imported the data to the software Gephi 0.9.2 to carry out the citation network analysis and visualization of the co-citation network. Citation network analyses assume that with an increasing number of shared citations between two publications, the probability increases that the cited papers share a specialized language and specific worldview (Boyack and Klavans 2010). Based on this assumption, we can infer that nodes belonging to the same cluster within such a citation network treat the topic of interest from a similar perspective and with similar argumentative backgrounds and patterns.
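The idea behind a co-citation edge list can be sketched as follows. This is only an illustration under stated assumptions: the internal procedure of hammer.nailsproject.org is not described here, the toy reference lists are invented, and the function name `cocitation_edges` is hypothetical. Two references are treated as co-cited once for every article whose reference list contains both.

```python
from collections import Counter
from itertools import combinations


def cocitation_edges(reference_lists):
    """Count how often each pair of references is cited together.

    reference_lists maps a citing article to the references it cites.
    """
    edge_weights = Counter()
    for references in reference_lists.values():
        # Sort and deduplicate so each unordered pair has one canonical key.
        for pair in combinations(sorted(set(references)), 2):
            edge_weights[pair] += 1
    return edge_weights


# Hypothetical toy data: three citing articles and their reference lists.
reference_lists = {
    "article_A": ["ref_1", "ref_2", "ref_3"],
    "article_B": ["ref_1", "ref_2"],
    "article_C": ["ref_2", "ref_3"],
}

edges = cocitation_edges(reference_lists)
for (source, target), weight in sorted(edges.items()):
    print(source, target, weight)
```

The resulting weights could be written to an edge file and loaded into a network tool; a higher weight indicates that two references are more often used together and are therefore more likely to belong to the same research stream.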

In a subsequent step, we searched for duplicate entries, for example those caused by errors in the spelling of author names. In our final sample, we had 1876 articles citing an additional 71,368 references, leaving us with a total of 73,244 publications that constituted the nodes of our co-citation network. We filtered out all entries with fewer than two citations to make sure that all included articles were cited more than once, because we regard a single citation as rather random (Boyack and Klavans 2010). This is also in accordance with the goal of bringing together research with at least a few overlaps. In doing so, the network was reduced to a size of 7980 nodes (10.9% of the total network) with 3790 edges, a diameter of 5, and an average path length of 1.598.
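The filtering rule and the reported network sizes can be checked with a short sketch. The toy citation counts and the function name `filter_by_citations` are hypothetical; only the figures 1876, 71,368, and 7980 come from the text.

```python
def filter_by_citations(citation_counts, min_citations=2):
    """Keep only entries cited at least min_citations times."""
    return {
        node: count
        for node, count in citation_counts.items()
        if count >= min_citations
    }


# Hypothetical toy counts; real counts would come from the bibliometric export.
citation_counts = {"ref_1": 5, "ref_2": 2, "ref_3": 1, "ref_4": 1}
kept = filter_by_citations(citation_counts)
print(sorted(kept))

# Figures reported in the text.
articles, references = 1876, 71_368
total_nodes = articles + references
remaining_nodes = 7980
share = remaining_nodes / total_nodes
print(total_nodes, f"{share:.1%}")
```

With these figures the total is 73,244 nodes and the retained share rounds to 10.9%, in line with the values stated above.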

Based on this, we ran a cluster analysis that identified 226 clusters. However, only the top 22 clusters had a meaningful size, each including at least 1.1% of all nodes. We took these clusters as a starting point for our qualitative analysis. We visualize the network in Fig. 2, with the nodes color-coded according to their common research streams as identified through the cluster analysis. Each article in the analysis is assigned to one cluster.

Fig. 2 Co-citation network graph (largest connected component)

3.3 Qualitative analysis

To study the major topics at the interface between business and management research and the information systems literature, we sorted the clusters by size (total number of articles within each cluster) and focused on the top ten percent of clusters with the highest number of articles. Thus, for our qualitative analysis, we have a total of 22 clusters ranging from 2887 articles (cluster 1) to 841 articles (cluster 22).
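The size-based selection can be illustrated with a small sketch. The sizes of clusters 1 and 22 are taken from the text; the third cluster, the function name `select_largest_clusters`, and the choice of the full network of 73,244 nodes as the denominator are assumptions made for illustration.

```python
def select_largest_clusters(cluster_sizes, total_nodes, min_share):
    """Return (cluster_id, size, share) tuples above the threshold, largest first."""
    selected = []
    for cluster_id, size in cluster_sizes.items():
        share = size / total_nodes
        if share >= min_share:
            selected.append((cluster_id, size, share))
    return sorted(selected, key=lambda item: item[1], reverse=True)


# Sizes of clusters 1 and 22 are reported in the text; "cluster_x" is a
# hypothetical small cluster added to show the cut-off.
cluster_sizes = {"cluster_1": 2887, "cluster_22": 841, "cluster_x": 40}
total_nodes = 73_244  # assumption: shares refer to the full network

selected = select_largest_clusters(cluster_sizes, total_nodes, min_share=0.011)
for cluster_id, size, share in selected:
    print(cluster_id, size, f"{share:.2%}")

# Ten percent of 226 clusters, rounded down, gives the clusters analysed.
top_decile = 226 // 10
print(top_decile)
```

Under this assumption, cluster 22 with 841 articles lies just above the 1.1% threshold, and ten percent of 226 clusters rounded down gives the 22 clusters examined.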

To proceed with the qualitative reading, we checked which of the clustered articles are available within the ISI Web of Science (WoS). As a result, we conducted a qualitative reading of 728 articles. The qualitative reading followed a threefold approach. First, we examined all articles within each cluster by reading the title, the abstract, and the keywords, with the aim of categorizing the cluster within existing research on DT from a business and management research perspective. Second, using quantitative text mining tools, we took the titles and keywords of the articles and identified the most relevant keywords and topics within each cluster in order to designate the clusters by main topics and subtopics. The process of naming and defining the clusters took place in a two-stage evaluation by a heterogeneous team of five researchers. To name the clusters, each author first evaluated the cluster individually. Afterward, the individual evaluation results were merged and discussed jointly by the whole research group before the cluster designations were finalized and the clusters were named.
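The second step, identifying the most frequent keywords per cluster, can be sketched as a simple frequency count. The text mining tools actually used are not named, so this is only an assumed minimal version; the keyword lists and the function name `top_keywords` are hypothetical.

```python
from collections import Counter


def top_keywords(keyword_lists, n=3):
    """Return the n most frequent keywords across a cluster's articles."""
    counts = Counter()
    for keywords in keyword_lists:
        # Normalise case and whitespace; count each keyword once per article.
        counts.update({keyword.strip().lower() for keyword in keywords})
    return counts.most_common(n)


# Hypothetical keyword lists for three articles in one cluster.
cluster_keywords = [
    ["Big Data", "Marketing", "Machine Learning"],
    ["big data", "customer", "marketing"],
    ["Big data ", "social media"],
]

ranking = top_keywords(cluster_keywords, n=2)
print(ranking)
```

Normalising spelling variants before counting matters here, because author keywords are rarely written consistently across articles.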

In this process, we recognized some articles that did not fit the topic that constituted the theme of the cluster. This usually happens when articles represent fringe topics or when their citation pattern is at odds with the norm in a specific subfield. After filtering out papers without a clear relation to the research context of the designated cluster, we conducted the third step of our qualitative analysis: a detailed, qualitative reading of each remaining article. To evaluate clusters, different methods are known in the literature, which are classified into three groups: internal, external, and relative validation techniques. These methods are mainly based on distances between objects and are useful for evaluating the algorithms used (Arbelaitz et al. 2013). However, because our goal was to evaluate the consistency of topics within one cluster, we developed our own measure, the “Cluster Trust Index” (CTI), which we defined as the ratio of the number of articles used to further describe the cluster to the total number of articles in the cluster (see footnote 1). The CTI may provide an indication of the quality of the automated allocation to the clusters. In this last step, we gained deeper insights as we named the main research streams, pointed out the most used theories, presented the key methods and tools, and summarized the main results. Furthermore, we identified the most cited authors in each cluster and concluded with identified research gaps and suggested fields for further research.
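The CTI as defined above is a simple ratio, which can be sketched as follows. The counts in the example are invented, and the function and parameter names are hypothetical; which article count serves as the denominator is specified in the footnote, so the second argument is kept generic here.

```python
def cluster_trust_index(articles_used, articles_in_cluster):
    """Ratio of articles used to describe a cluster to all its articles."""
    if articles_in_cluster <= 0:
        raise ValueError("cluster must contain at least one article")
    if not 0 <= articles_used <= articles_in_cluster:
        raise ValueError("articles_used must lie between 0 and the cluster size")
    return articles_used / articles_in_cluster


# Hypothetical counts: 30 of 40 articles fit the cluster theme.
cti = cluster_trust_index(articles_used=30, articles_in_cluster=40)
print(f"CTI = {cti:.2f}")
```

A value close to 1 would suggest that nearly all articles assigned to a cluster match its theme, while a low value would suggest that the automated allocation mixed several topics.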

4 Research streams on digital transformation

The identification of the literature base with the help of Web of Science yielded 1876 hits. Most articles were published during the last five years, as seen in Fig. 3. We assume that attention to this research is still growing, as it has been rising since 2013. More than 300 papers were published in the journal “Expert Systems with Applications”, which focuses on technical solutions and intelligent systems applied in different contexts and is not limited to a specific area. Moreover, many articles were published in “Decision Support Systems” and the “European Journal of Operational Research”. Besides these journals with a business perspective, other journals with a more psychological view were found.

Fig. 3 Articles per year

The technologies investigated in the analyzed articles (recognized by keywords) can be seen in Fig. 4. Research on big data in particular has gained more and more attention during the last 5 years. As big data can be understood as a large amount of data (Chen 2014) as well as the technological challenges associated with these data (Madden 2012), many articles deal with this topic. The number of articles on cloud computing has also risen significantly since 2013. As the Internet of Things was described as a concept by Kevin Ashton in 2009 (Ashton 2009), research grew from that time. Artificial intelligence, machine learning, as well as augmented and virtual reality seem to be rather steady topics in research.

Fig. 4 Articles per technology per year

For the identification of clusters and overarching research streams, the cited references were included in the analysis. For the qualitative analysis, 22 clusters, which represent the most important topics in our database, were analyzed in depth. For an overview of the clusters, see Table 2. The clusters are introduced in the following sections by presenting the research streams identified; that is, we merged clusters dealing with similar research issues into one stream. In total, we introduce nine streams. The numbering of clusters is based on their size in terms of articles found (see # in Table 2). During the qualitative analysis, we identified two clusters that were excluded from further examination because they do not fit the intended business perspective. One of these was named “methods” as it mainly deals with research methods, especially in statistics and game theory. Moreover, many papers are technology-focussed as they deal with programming issues. We also did not investigate the cluster “health care” in further detail because it lacks a business perspective.

Table 2 Cluster with color coding, article count, and central keywords

The size of the clusters can be found in Table 2. “Total” includes articles from the base sample as well as references. The column “found” shows only the articles found during the Web of Science search. QA (qualitative analysis) is the number of articles that were analysed in depth in the third step. Lastly, the cluster trust index is used to evaluate the quality of the cluster-building process.

The ratio of the cluster sizes, measured by the number of articles, seems to be rather stable. A peak of articles can be found between 2011 and 2014 for the innovation and manufacturing clusters (see Fig. 5). The topics seem to decline afterwards in the field of DT research, which suggests that these fields are at a more advanced stage than the others from a research perspective. Research on innovation, especially, has been carried out extensively in the last 5 years. Analytics and society also have their highest number of articles in 2014. A growing interest in societal questions can be observed, as there are more articles in the last few years. Research interest in implications for whole societies is increasing, but this is still a less mature field of research, e.g., regarding changing labour markets due to increased automation of tasks. Knowledge management, tourism, and marketing seem to be rather steady areas of research. Regarding DT in finance, interest has decreased slightly, which indicates an advanced stage in this application field of digital technologies. As the total number of papers has grown significantly since 2006, there are no outstanding results before that time.

Fig. 5 Articles per research stream per year

In the following, the identified research streams are presented by highlighting important results and articles.

4.1 Finance

Within this research stream, three clusters were identified and named credit and risk management (cluster 1), artificial intelligence (AI) methods (cluster 10), and trading of investment certificates (cluster 16). The leading journal in this field is ‘Expert Systems with Applications’. Within the second cluster, the ‘European Journal of Operational Research’, and within the third cluster, ‘Quantitative Finance’, are additional sources with a high number of articles related to the field.

In the first cluster, three articles from ‘Expert Systems with Applications’ have been cited more than 150 times. Regarding in-degree, these articles stand out with values of six and five. Looking at betweenness centrality, the articles by Tsai and Wu (2008) and Min and Lee (2005) show values above 1000. They are also the most cited. As “the performance of multiple classifiers in bankruptcy prediction and credit scoring is not fully understood,” Tsai and Wu (2008) propose to compare a single classifier with multiple classifiers and diversified multiple classifiers by using them on three different datasets.
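The two network measures reported for each cluster can be illustrated on a toy graph. This is a sketch under assumptions: edges are taken to point from the citing to the cited publication, betweenness is computed unnormalised with Brandes' algorithm for unweighted directed graphs, and the exact settings used in Gephi are not stated in the text. All node names are hypothetical, and every node must appear as a key of the graph.

```python
from collections import deque


def in_degree(graph):
    """Count incoming edges per node; graph maps node -> list of successors."""
    degree = {node: 0 for node in graph}
    for successors in graph.values():
        for target in successors:
            degree[target] += 1
    return degree


def betweenness_centrality(graph):
    """Unnormalised betweenness for a directed, unweighted graph (Brandes)."""
    centrality = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search that counts shortest paths from the source.
        order = []
        predecessors = {node: [] for node in graph}
        path_count = {node: 0 for node in graph}
        distance = {node: -1 for node in graph}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbour in graph[node]:
                if distance[neighbour] < 0:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
                if distance[neighbour] == distance[node] + 1:
                    path_count[neighbour] += path_count[node]
                    predecessors[neighbour].append(node)
        # Accumulate dependencies from the farthest nodes back to the source.
        dependency = {node: 0.0 for node in graph}
        for node in reversed(order):
            for predecessor in predecessors[node]:
                ratio = path_count[predecessor] / path_count[node]
                dependency[predecessor] += ratio * (1 + dependency[node])
            if node != source:
                centrality[node] += dependency[node]
    return centrality


# Hypothetical toy graph: papers a and b cite c, and c cites d.
graph = {"a": ["c"], "b": ["c"], "c": ["d"], "d": []}
degrees = in_degree(graph)
betweenness = betweenness_centrality(graph)
print(degrees)
print(betweenness)
```

In this toy graph, paper c is cited twice and lies on both shortest paths that lead to d, so it has the highest value on both measures. In-degree thus reflects how often a publication is cited within the network, whereas betweenness reflects how strongly it connects otherwise separate parts of it.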

In the second cluster, two articles, one from the ‘European Journal of Operational Research’ and one from ‘Information & Management’, have more than 100 citations. Looking further at in-degree and betweenness centrality, the article from the ‘European Journal of Operational Research’ stands out with values of 11 and 1538, respectively. This article, written by Zhang et al. (1999), provides a general framework for better understanding artificial neural networks. The authors show the advantage of neural networks over logistic regression in the prediction of bankruptcy and in classification rate estimation, as well as their robustness towards variation in the sample.

In the third cluster, the four most cited articles have between 20 and 30 citations. All are from ‘Expert Systems with Applications’. Looking at betweenness centrality, two articles show values above 100. Booth et al. (2014) also have a high number of citations. In their work, they use seasonal effects and regularities in financial data to build an expert system based on random forest techniques for developing a trading strategy. The performance of the models is assessed by using data from the German Stock Exchange Index (DAX). In general, using seasonal effects has proven to produce superior results.

Compared to the other two clusters, this third cluster is smaller and its articles are newer. Specific algorithms still need to be applied in this area. Interestingly, Hsu et al. (2016) question the efficiency of financial markets. Views that financial economists have held on markets for decades, such as Smith’s invisible hand, might have to be adjusted. All in all, the field of finance has already seen significant changes and developments due to DT; in particular, forecasts that are useful for financial decisions can be made using algorithms. Technology enables the control of complex environments like financial markets. However, many unpredictable events still make forecasting difficult and lead to challenges for DT in the finance sector.

4.2 Marketing

The marketing stream focuses on three aspects: the use of virtual reality (VR) in marketing and sales (cluster 3), the possibilities to work with user-generated content to deduce sentiments and further data (cluster 5) and computer-assisted customer relationship management (cluster 19). For cluster 3, we dismissed topics regarding VR application for pedestrians and mere VR acceptance. The most cited article (288 times with betweenness centrality of 134) of cluster 3 is written by Coyle and Thorson (2001). This work deals with the perceptions towards websites and the influence of the characteristics vividness and interactivity. This work is closely tied to the work about the effects of different technologies on product ratings. Moreover, the ability to use reviews for further marketing and sales purposes is shown in this cluster (Singh et al. 2017; Ordenes et al. 2017; Sodero and Rabinovich 2017).

Cluster 19 is about customer relationship management (CRM) and technical implications using automated responses for service purposes. The analysis of the most used words within the keywords showed an accumulation of the fields of BD, user-generated content, and consumer. Cui et al. (2006) show the highest values of in-degree (3) and betweenness centrality (239) of cluster 19. The text deals with machine learning (ML) for direct marketing response to enable immediate response to customer inquiries.

The work of Das and Chen (2007) has the highest in-degree (12) in cluster 5 and a betweenness centrality of 1133. The authors developed a methodology for extracting small investor sentiment from stock message boards. The content analysis of cluster 5 shows that BD, customer, social, marketing, and ML are the most used words among its keywords. In general, cluster 5 deals with articles about user-generated content and text mining systems that are used to gain additional information from the data. The analysis of user- or customer-generated data via reviews and the fast reaction of enterprises play a vital role in this research stream. We identified several articles in all marketing clusters that focus on that topic and on response modelling (Kim et al. 2008). Furthermore, new technologies and opportunities like VR and AR enable new dimensions of online product presentation (Yim et al. 2017).

In summary, marketing activities are highly influenced by DT, which opens up new possibilities for understanding customer behavior and placing individually adapted advertising. This is possible due to the huge amount of data created by users or generated automatically. A further need for research in the field of VR and AR for marketing purposes is identified. These technologies should be developed and enhanced to create a more sensory atmosphere.

4.3 Innovation

The clusters of this stream deal with business model innovation (cluster 18), adoption and diffusion of innovations (cluster 2), impact on the process of innovation and organizational learning (cluster 12) as well as strategic aspects of innovation in terms of, for example, search orientation and capabilities (cluster 20).

Cluster 18 is closely related to the manufacturing clusters for it deals with the industrial internet of things (IIoT). However, rather than investigating primarily manufacturing aspects of IIoT, studies in this cluster investigate the relationship between business model innovation and DT in general as well as IIoT in particular. The article with the highest in-degree (4) and 50 citations examines the effects of business model innovations triggered by the DT on accounting (Bhimani and Willcocks 2014). Other articles deal more strictly with the implications of IIoT for business models (Arnold et al. 2016) and how the new business models of the digital era can be identified and developed (Pisano et al. 2015; Najmaei 2016). Of particular interest is the emergence of these new business models in the context of the DT through entrepreneurship (Guo et al. 2017), as well as their more sustainable nature (Gerlitz 2016; Prause and Atari 2017).

While the technological focus of cluster 18 was on IIoT, cloud computing (CC) is the subject of cluster 2. In fact, the study in this cluster with the highest in-degree (7) and over 290 citations investigates determinants of its adoption. Oliveira et al. (2014) find significant differences in the determining factors between manufacturing and service firms. While adoption in manufacturing is driven by the relative advantages and cost savings of CC, service firms are more reluctant to adopt it due to the complexity of CC and require more top management support. In terms of theoretical frameworks, the technology acceptance model (TAM) is the most applied in this cluster (Gangwar 2016). One of the earlier studies integrates the TAM with marketing theory in order to explain firm adoption behavior regarding radical innovations like CC (Bohling et al. 2013). However, some studies also investigate combinations of theories (e.g., TAM and media richness) and technologies (e.g., CC and augmented reality) (Lin and Chen 2015).

Cluster 12 covers managerial challenges of the DT. For example, Khanagha et al. (2013) study the impact of management innovation on the adoption of emerging technologies. They show, based on an in-depth case study, that management innovations can provide the required changes in organizational structures that enable the adoption of emerging core technologies. Most importantly, they argue that organizational routines that prevent early-stage experimentation with the new technology need to be overturned, as they can hinder knowledge accumulation. Other studies investigate the role of established management concepts like absorptive capacity (Lam et al. 2017; Trantopoulos et al. 2017) and ambidexterity (Khanagha et al. 2014). The managerial challenges during the innovation process most investigated by studies in this cluster are the changing opportunities and difficulties related to managing the customer and customer communities, in particular, managing customer co-creation and ideation (Hoornaert et al. 2017; Khanagha et al. 2017).

Cluster 20 also covers managerial challenges of the DT, but with a distinct focus on BD. The issues investigated regarding the relationship between management and BD range from human resources (Shah et al. 2017) through new product success (Xu et al. 2016) to firm performance and strategy (Akter et al. 2016; Mazzei and Noble 2017). The article with the highest in-degree (11) received 130 citations on Google Scholar at the time of analysis and uses the resource-based view of the firm to explain the outcome of BD usage for consumer analytics (Erevelles et al. 2016).

In summary, innovation is by nature an important research avenue to pursue with regard to digital transformation, because the transformation process itself has to be innovative to be successful. DT implies implementing and using new technologies in combination with a cultural change of the whole organization. Innovation literature can contribute to developing effective ways to apply and utilize DT.

4.4 Knowledge management

The cluster knowledge management (cluster 7) focuses on aspects of knowledge management and strategy in the realm of digitalization. The journal that occurs most often in this cluster is the ‘Journal of Knowledge Management’, which published one third of the articles; 57 percent of these were published in 2017. The most frequent technology-related keywords are big data and analytics, and the most frequent content-related keywords are knowledge management, intellectual capital, and performance. The article by Braganza et al. (2017) is the most cited article (in-degree = 2) with the highest betweenness centrality (168). They discuss the management of resources in BD initiatives and how to effectively introduce BD initiatives into companies.

We divided this cluster into two main areas as articles show tendencies towards (1) Knowledge Management as well as (2) Strategy.

(1) Knowledge Management is the primary topic focus of 13 articles. The major part of the cluster consists of articles focussing on digitalization in knowledge management. Among these papers, most (8) deal with BD and its use for knowledge management in companies. Half of the articles take a closer look at specific applications of BD in the realm of knowledge management. On the one hand, Fowler (2000) and Weber et al. (2001) focus more on use cases that involve AI and how it can “contribute to knowledge management solutions” (Weber et al. 2001, p. 17). On the other hand, Murray et al. (2016) as well as Uden and He (2017) take a look at IoT devices and how they can enhance knowledge management systems because of the data that are automatically generated. A strictly theoretical view can be found with Rothberg and Erickson (2017), who aim to bring together the existing theory from knowledge management, competitive intelligence and BD analytics. One article is quite critical of the use of BD and elucidates that “to describe it [BD in the context of knowledge management] as ‘revolutionary’ is premature” (Tian 2017, p. 113).

(2) Strategy is investigated by eight articles. The strategy topics can be divided into three subareas. Two articles focus heavily on decision making and how BD can be of use (Prescott 2014; O’Flaherty and Heavin 2015), while another two articles deal with text mining techniques and their impact on business strategy (Li et al. 2012; Zhang et al. 2016). Moreover, four articles investigate performance aspects of BD in relation to business strategy (Cleary and Quinn 2016; Tian 2017; Blackburn et al. 2017). This performance perspective includes papers that show how BD can help to improve the understanding of purchasing decisions (Tian 2017). It can also be seen how BD affects operation models (Roden et al. 2017), and whether BD might affect R&D Management (Blackburn et al. 2017), as well as “how the use of cloud-based accounting/finance infrastructure affects the business performance of small and medium-sized enterprises” (Cleary and Quinn 2016, p. 225).

Braganza et al. (2017) propose to utilize theories drawn from strategy and leadership fields. Deeper insights on how strategies are changing and still need to change are missing. Moreover, as business models are already studied in-depth regarding DT, concrete application scenarios would be useful.

4.5 Analytics and data management

Seventy percent of the articles in the Analytics and Data Management cluster were published in 2017. We further subclassify the publications into four major realms:

(1) Operations and supply chain management: In addition to BD and analytics, the enhancement of supply chain processes and, ultimately, of performance are important areas of study. Bag (2017) shows empirically the positive relationship between BD, predictive analytics, and supply chain performance. Rajesh (2016) presents and tests a prediction model to forecast supply chain resilience performance. For an extensive literature review, see Lamba and Singh (2017). Tan et al. (2015) propose an analytic infrastructure to assist firms in capturing the potential of supply chain innovation afforded by data. This is also the article with the second highest values for in-degree (12) and betweenness centrality (764). Ji et al. (2017) present an example of how BD in the food chain can be combined with Bayesian network and deduction graph models to guide production decisions.

The second significant research realm is in the context of (2) innovation and operations management. Furthermore, articles dealing with application and exploitation of BD to create competitive advantage and value in business are studied. For instance, Barton and Court (2012), also the most cited article in this cluster (in-degree: 26), present a practical perspective on how to improve companies’ performance with advanced analytics. Zhan et al. (2017) suggest how firms could use BD to facilitate product innovation processes. Moreover, Tan and Zhan (2017) present three principles related to BD which support new product development.

Another noteworthy topic is (3) analytics to improve decision-making in management. For example, Horita et al. (2017) present a framework that connects decision-making with data sources through an extended modelling notation and modelling process.

The last realm refers to (4) data analytic techniques and quality framework of data management systems. Zhang et al. (2015) discuss specific techniques for modelling BD and analytics in the context of computational efficiency. Others present explicit analytical modelling for designated business fields, such as quality control in manufacturing (He et al. 2016).

We conclude that “successfully introducing analytics requires substantial organizational transformation” (Dremel et al. 2017). Management decisions supported by BD analytics depend on the underlying data quality. With the highest values for in-degree (12) and betweenness centrality (3108), the article by Hazen et al. (2014) contributes to the data quality problem within the supply chain management context. Lamba and Singh (2017) see a lack of data analytics techniques and of works that suggest how BD can be implemented in practice. For future research directions, we suggest consulting, for example, Sivarajah et al. (2017). How to analyse and use data effectively is still a topic of growing interest in research and a big challenge for practice.

4.6 Manufacturing

The research stream manufacturing is represented by three sub-clusters that deal with the fields of cloud manufacturing, strategic implications for manufacturing and logistics.

Cluster 4 is quite diverse. We excluded specialized topics in the field of space science (Metzger 2016), mobile services (Qi et al. 2014) and football robots (Bi et al. 2017). Among representative works within this cluster, a visualization platform for IoT to control and monitor wireless sensor networks (Bi et al. 2016), resource allocation (Pillai and Rao 2016) and resource bundling (Guo et al. 2016) are examined. Moreover, strategic issues are discussed (Li et al. 2012; Guggenheim 2016). One particularly strategic article dealing with information architecture in the context of supply chain management (Xu 2011) has a very high betweenness centrality (number six and seven of the whole sample). Xu (2011) is also cited 124 times.

Cluster 17 has a focus on cloud-manufacturing (also the most mentioned keyword). The ‘International Journal of Computer Integrated Manufacturing’ focuses on topics in this area and is the publisher of most of the articles of the cluster. Cloud-manufacturing means that the principles of cloud computing are transferred to manufacturing concerns, so that related manufacturing resources are offered as services, which leads to a network for exchanging needed resources and products. This application of DT can optimize processes, as shown in an example of sheet metal processing (Helo and Hao 2017). Frameworks for building a cloud manufacturing solution (Cheng et al. 2016; Lu and Xu 2017) and the design of the network architecture (Škulj et al. 2015) are presented and discussed. Moreover, the communication between machines in different companies is a necessary condition to make cloud-manufacturing a success. Therefore, a scheduling model was developed to efficiently exploit distributed resources (Li et al. 2017).

Cluster 22 is the smallest of all clusters in the sample. It includes articles on manufacturing, with a focus limited to logistics topics. Most articles were published in the ‘International Journal of Production Research’. The most cited article of the cluster, with 43 citations, is also the one with the highest betweenness centrality. Reaidy et al. (2015) and Zhong et al. (2017) show that RFID technology is especially useful in warehouses to track resources and to connect objects. Advantages of the aforementioned communication technologies in smart logistics, such as higher safety, are shown (Trab et al. 2017). Moreover, applications of technologies are demonstrated, like the development of an algorithm to optimize truck docking (Miao et al. 2014).

Smart factories, as well as smart industry (Haverkort and Zimmermann 2017), are popular areas of research which are shaped by examples from practical applications. Machines, information systems and workers become more connected. The future factory is decentralized and can produce diverse products in a short time period. The topic of DT is getting more and more important for the manufacturing industry.

4.7 Supply chain management

Two of the identified clusters were allocated to the topic supply chain management (SCM). The importance of the topic was extraordinarily high in the years between 2010 and 2014 when more than 100 articles were published.

The clusters differ especially in their technological focus. These are supply chain and CC for cluster 15 as well as supply chain and BD for cluster 21. Cluster 15 deals with the adoption and usage of cloud computing, one of the central technologies in DT, in the context of supply chain management. Empirical results show a positive effect of the technology on supply chain integration (Bruque Cámara et al. 2015; Bruque-Cámara et al. 2016), which also leads to higher operational performance. This fostering effect on collaborations is also examined by other authors in different contexts like manufacturing and humanitarian organizations (Schniederjans and Hales 2016; Yu et al. 2017). The highest betweenness centrality and the highest total number of times cited can be observed for the article by Cegielski et al. (2012), which deals with the adoption of CC in supply chains. A few other technologies are also discussed in the context of SCM. O’Donnell et al. (2009) develop a genetic algorithm to reduce the bullwhip effect, and Cantor (2016) examines effects of work monitoring technologies. The author with the most articles in this cluster is Dara Schniederjans, who published four of the 20 papers.

Cluster 21 has a focus on the use of BD in SCM. Benefits like higher supply chain visibility and transparency, along with challenges like the balance between human and analytics-based management styles, are shown (Waller and Fawcett 2013; Dutta and Bose 2015; Kache and Seuring 2017). The article by Waller and Fawcett (2013) is cited 95 times in total, as the authors give a broad overview of BD in SCM and define critical terms in this area. Two very well-known authors in the area of DT also occur in this cluster with an article on BD impacts (McAfee and Brynjolfsson 2012). Their reputation is reflected in the in-degree of 75 and the total of 387 citations.

In sum, collaborations between firms in supply chains are identified as one primary driver of DT (Liere-Netheler et al. 2018) as borders between enterprises are known to blur (Lucke et al. 2008). This means that technologies should support this change in the supply chains. Two of the significant technologies which lead to more exchange of data are CC and BD. Wieland et al. (2016) identified BD and analytics as an overestimated research theme in the next 5 years which is in accordance with our findings. Topics like people dimensions, ethical issues, and integration are underestimated as DT also includes a cultural change in companies and the whole supply chain. Moreover, the exchange of data is still an open question. Security and legal aspects are especially unclear (Richey et al. 2016).

4.8 Society

Cluster 8 contains 23 articles. An article from Boyd and Crawford (2012) has the highest betweenness centrality (2727) and the highest in-degree (37). Besides keywords from the digital context (BD, algorithms, and technology), the most frequently used keywords were social, communication, governance and epistemology. Hence, we further sub-classify the articles into three major realms:

(1) Society and communication: Articles in this realm deal with topics like an ‘analytic culture’ (Gano 2015), data-driven urban geographical imaginaries and understandings (Lake 2017; Shelton 2017), ‘datafication’ of daily life (Madsen et al. 2016), and the monetization of user data (Doyle 2015). Other topics include data-journalism (Parasie 2015), data protection (MacDonnell 2015), impacts of socio-technical systems (Carolan 2017), or BD as communication with targeted audiences in a social and cultural context (Holtzhausen 2016). Furthermore, we find articles taking a technical communication perspective, in which BD is found to ignore the crucial roles of interpretation and communication (Frith 2017).

(2) Policy and international: Most of the articles in this realm take a critical view on digitalization (Chandler 2015). For example, Sanders and Sheptycki discuss stochastic governance, “defined as the governance of populations and territory using statistical representations based on the manipulation of BD” (2017, p. 2), working towards a critique of the moral economy of neo-liberalism. A considerable number of articles deal with the topic ‘algorithmic governance’/‘datafication-governance’ (e.g. Chandler 2015; Madsen et al. 2016; Rothe 2017). Rothe (2017), for example, highlights the role of visual technologies and discusses the construction of environmental security as a form of ontological politics.

(3) Philosophy and ethics: Lake (2017) integrates an epistemological view and discusses BD and urban governance in a democratic society based on an ontological approach. He concludes that BD leads to an atomistic behaviour in management and thus “undermines the contribution of urban complexity as a resource for governance […]” (Lake 2017, p. 1). Furthermore, we find articles that critique the efficacy of BD approaches (Lowrie 2017) and the hidden, positivist assumptions behind the movement (labelled techno-positivism; e.g., Gano 2015). Critiques of technological solutions and BD are also discussed, such as surveillance of the population (Heath-Kelly 2017). Furthermore, articles reflecting on how BD affects people as psychological beings are found (Raab 2015). The predicament of living in a networked world and being partly unable to sufficiently grasp the implications thereof is discussed epistemologically (Van Den Eede 2016).

In summary, the cluster provides multidisciplinary approaches on the impact of DT on society, and most of the articles engage with BD and digital technologies from critical positions. In the work of Madsen et al. (2016), we find a research agenda for future research on BD within international political sociology. An important field for further studies is the importance of theory-driven data production. From a societal point of view, DT needs to be considered as a possibility for advancement but also, and probably more important, risks need to be taken into account so that no people will be left behind.

4.9 Tourism

The cluster tourism deals with research articles in the cross-area of tourism and social media. Starting from the year 2000, there was a peak in 2012 (116 articles) whereas in 2016 only 28 articles were published. A content analysis showed that besides the tourism aspects (tourism, destination, marketing), the most frequently used keywords from the digital context were Facebook, social media and data analytics.
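The keyword-based content analysis used for each cluster can be illustrated with a minimal sketch. It assumes that each article contributes a list of author keywords and that a keyword is counted once per article. The keyword lists and the function name `keyword_frequencies` are hypothetical and are not taken from our sample or tooling.

```python
from collections import Counter

# Hypothetical author keywords per article (illustrative only, not sample data).
article_keywords = [
    ["tourism", "social media", "Facebook"],
    ["destination", "big data", "tourism"],
    ["social media", "marketing", "tourism"],
    ["Facebook", "social media", "data analytics"],
]


def keyword_frequencies(keyword_lists):
    """Count in how many articles each keyword occurs."""
    counts = Counter()
    for keywords in keyword_lists:
        # Lower-case and de-duplicate so each article counts a keyword once.
        counts.update({k.lower() for k in keywords})
    return counts


freq = keyword_frequencies(article_keywords)
# The two most frequent keywords; the order among ties is not guaranteed.
top_two = {keyword for keyword, _ in freq.most_common(2)}
print(sorted(top_two), freq["facebook"])
```

Ranking the resulting counts per cluster gives the kind of "most frequently used keywords" list reported for the streams, although the actual preprocessing (e.g., merging synonyms such as BD and big data) may differ.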

We identified only two journals that provided more than one source: ‘Journal of Destination Marketing & Management’ (5 articles) and the ‘Journal of Tourism Management’ (2 publications). Only one author contributed more than one article (Kwok and Yu 2013, 2016). Both articles deal with the consumer communication via Facebook. Furthermore, the article of Kwok and Yu (2013)—an analysis of restaurant business-to-consumer communications—was one of the most cited articles in this cluster. Only Fuchs et al. (2014) with six citations and Xiang et al. (2015) with seven citations provided a higher in-degree. The research is about BD analysis in the field of hotel guest experience.

We aligned the articles to dominant fields of interest: destination management (Fuchs et al. 2014; Raun et al. 2016) and geospatial data (Supak et al. 2015) to improve the touristic attractiveness of an area. A further sub-cluster is the research on the use of forums, customer recommendations and consumer-to-consumer communication. Dominant research focuses on text mining and how user-generated content influences the success of tourism organizations and the feelings of customers (Xiang et al. 2015; Ksiazek 2015; Kim et al. 2017). The last sub-cluster deals with the use of social media for marketing purposes in this field (Buhalis and Foerste 2015; Hornik 2016).

In summary, the influence of consumers and peers has increased due to DT. The digital (user-generated) data is increasingly used for analytical purposes, such as text mining and sentiment analysis. Surprisingly, trust plays no critical role in the field of user-generated content. We assume this topic is linked more closely to specific marketing research. Moreover, DT has led to a change of the whole industry, as a huge amount of purchasing activities has shifted from travel agencies to online booking.

5 Research agenda for DT

During the analysis of all research streams, two major research directions became apparent. On the one hand, we recognize individualization, with an increasing influence of individual interaction like customer-created content or individual production. On the other hand, we sense a shift towards widespread technology use where computer-controlled workflows impede human interaction, for example in smart production or automated decision support. Though we carefully, and by consensus of the involved researchers, named the clusters and streams by using keywords of related articles, we detect some research deficiencies in the areas of accounting and human resource management, as well as in sustainability in combination with the mentioned fields of interest. This does not necessarily mean that there is no research in these areas; rather, it indicates that the amount of research on these topics is relatively small in our sample. Hence, the topics are not yet closely connected in research. For example, research streams about the integration of human resource management and IT exist (Bondarouk and Ruël 2009). However, a deeper understanding of the consequences of e-human resource management for the human resource organization, more particularly an understanding of the phenomenon of e-human resource management and its multilevel consequences within and across organizations, is still lacking (Bondarouk and Ruël 2009). Recently, Gepp et al. (2018) reviewed existing research on BD in accounting and finance, supporting our finding that the research stream in auditing is still lagging behind. This indicates future research directions and, as Gepp et al. (2018) postulate, a greater alignment to practice.

Nearly all research recommendations of the defined clusters call for further investigations regarding the future application and impact of digital technologies. Some examples of research gaps, resulting from the analysis of the streams, are presented in Table 3. Further research in all clusters is required for all technologies associated with DT. We have explicitly identified the need for research in the area of big data analytics in the clusters of marketing, knowledge management, manufacturing and society. For example, a specific linking of data with other applications such as business data or social media, as well as the combination of machine-generated data and customer information, is still new and demanding. These could lead to major efficiency gains and might also simplify lives. To study how these gains can be achieved, more empirical research is required. Using in-depth case studies is an appropriate method because case studies can highlight best practices. Both opportunities and threats should be identified, defined and evaluated. Still, ethical questions that come along with the accessibility of semi-public or public data for researchers and other parties (e.g. industry, politics) are not yet sufficiently investigated. Research on the development of mathematical models for the application of BD and for machine learning to support decision making needs to be intensified.

Table 3 Research gaps in DT research streams

The use of blockchains is also an issue. Many possible use scenarios are still to be discovered and tested. A search in the Web of Science Core Collection with the keyword blockchain, restricted to the areas of business and management and to a time horizon of 2017 and before, shows 32 results. Of these, 17 results are not cited by other resources. “The Truth about Blockchain” (Iansiti and Lakhani 2017), published in HBR in 2017, is cited 41 times, which is the highest number of citations. This might be an indicator of the future importance of this topic in business research.

In general, we emphasize a demand for more case studies describing the benefits, values and weaknesses of DT implementations in all clusters. In order to align the applications of DT with traditional research, the basic models should be tested for their suitability for the new, changed world. Furthermore, researchers advise caution regarding the security and safety of the data produced and collected. Only the cluster society provided research about possible negative implications. We assume the proclaimed digital revolution is a slow process and certainly not over yet. The implications for culture and society will be enormous, so further work integrating the cultural, technological and business levels would be appreciated. Furthermore, long-term studies will show the real impact of the DT trend. Researchers may answer the major question for all clusters: How much of the enthusiasm is due to the novelty of the technology itself and how great are the long-term benefits? Moreover, the theorization of DT in general is not clear yet. First studies that collect different definitions are emerging (Morakanyane et al. 2017). However, we do not see a conceptualization that is used across disciplines. Besides the definitions, characteristics as well as frameworks on DT are necessary.

6 Conclusion and limitations

In sum, our study gives a holistic overview of topics in DT research. We aimed at identifying major research streams and possible gaps for further research. Nine main streams were discussed by giving an overall picture of the sample. Moreover, all relevant streams were presented in detail to get an overview of the fields. The study is based on a structured literature review, combined with a citation network analysis, which enables us to deal with a huge amount of literature. This work aims at a broad overview of recent research on DT in business. Many articles discuss the application of digital technologies to support or refine business (e.g., VR in tourism, marketing, and manufacturing). The three dominant areas in our database are finance, marketing and innovation management. The technological fields in focus in the articles are the internet of things, big data, cloud computing and artificial intelligence. Especially in the field of finance, new abilities to work with big data and analytics for trading and predicting markets shape the research field. Data management methods and the application of data analysis methods become more important, as they can be used for prediction and prognosis of, e.g., bankruptcy. In the field of the production industry, the topic of cloud manufacturing is gaining more and more attention.

We recognize that our study has limitations. By definition, a literature review rests on existing and accessible research studies. Although we conducted a thorough literature search through the ISI Web of Science (WoS) to identify all relevant articles according to our search terms, it cannot be ruled out that this literature review missed some articles indexed in other leading databases (e.g., Scopus and EBSCO). However, WoS is considered the most comprehensive database and is thus frequently used in management and IS research (Schryen 2015). Another limitation lies in the definition of the research objectives and selection terms. It is possible that our systematic literature review cannot cover exhaustively the vast field of research. This possibility is especially relevant as different technologies regarding DT are included in the study. Thus, the findings are limited to these technologies. However, having conducted several search loops in an iterative refinement of search terms, and having checked after each loop that the search stream fits our research question, we are quite confident that this research is robust, as every effort was made to mitigate error. Additionally, the qualitative analysis and cluster descriptions are based on the research team’s interpretation of the selected research articles. By conducting a two-step cluster evaluation process, first cross-checking articles independently and second reviewing clusters in an author team of five heterogeneous researchers, we addressed this embedded bias. Moreover, we use a citation network analysis. Compared to other literature review approaches, the network analysis does not focus on a special field within DT research. Thus, we were able to study the field of DT from a more holistic perspective and to provide implications from a broad literature base and an overview of the current state. Moreover, this study points to future directions in the field.
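The two network measures reported for the clusters, in-degree and betweenness centrality, can be illustrated with a minimal sketch on a toy citation network. The sketch assumes a directed graph in which an edge points from the citing to the cited article and computes unnormalized betweenness with the algorithm of Brandes; the article labels and function names are hypothetical, and the values need not match what a tool such as Gephi reports under different settings.

```python
from collections import deque

# Toy citation network (hypothetical): each pair is (citing article, cited article).
citations = [("D", "C"), ("E", "C"), ("C", "B"), ("B", "A")]


def in_degree(edges):
    """Count how often each article is cited within the sample."""
    counts = {}
    for citing, cited in edges:
        counts.setdefault(citing, 0)
        counts[cited] = counts.get(cited, 0) + 1
    return counts


def betweenness(edges):
    """Unnormalized betweenness centrality on a directed, unweighted graph."""
    nodes = sorted({n for edge in edges for n in edge})
    successors = {n: [] for n in nodes}
    for citing, cited in edges:
        successors[citing].append(cited)
    score = {n: 0.0 for n in nodes}
    for source in nodes:
        # Breadth-first search from source, counting shortest paths (sigma).
        dist = {source: 0}
        sigma = {n: 0 for n in nodes}
        sigma[source] = 1
        preds = {n: [] for n in nodes}
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in successors[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies, starting from the farthest nodes.
        delta = {n: 0.0 for n in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    return score


indeg = in_degree(citations)
btw = betweenness(citations)
print(indeg["C"], btw["C"], btw["B"])
```

In this toy network, article C is cited twice and lies on four shortest citation paths, so it has both the highest in-degree and the highest betweenness; this mirrors how highly connected articles were singled out in the cluster descriptions.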

Besides these limitations, we continuously reflected on the procedure during the research process, which resulted in two major questions: (1) How consolidated is the body of literature? (2) How do we consolidate the body of literature in an adequate research procedure?

(1) For the first question, we assume that many clusters arose from the business perspective. However, we also identified clusters with very little connection to management topics, such as health care (cluster 14). This cluster contains two management-related articles (Bental et al. 1999; Brown et al. 2015). Therefore, we excluded health care from an in-depth analysis. Other clusters focus on technology or the method (e.g., cluster 1). Therefore, an alternative means of analysis could be to focus on streams of technology instead of streams of business disciplines, or a combined analysis with a matrix approach. Moreover, our research approach is limited due to the search terms used.

(2) For the second question, we chose a combination of quantitative and qualitative approaches to arrive at an appropriate and representative number of articles. Discussions and rounds of consensus within the research team kept subjectivity to a minimum. For the selection of clusters, we decided on an absolute approach and selected the largest 10%. Alternative solutions could include relative approaches, like using k-means (Jain 2010) or other measurements. The cluster trust index showed that most clusters kept over 50 percent of the assigned articles after the manual qualitative analysis. For this reason, we consider the citation network analysis based on the tool Gephi a valuable procedure. In some way, our approach is an example of DT in research, as we worked with a digital-based dataset and presented an exemplary way to work with the rapidly growing amounts of research literature data. With our work, we hope to encourage researchers to recognize the threats, continue the research about DT in business, and examine the advantages of the digital change. Moreover, in showing a holistic approach to DT research, our results can be regarded as the first step to foster researchers’ adaptive expertise to understand and combine results and procedures from different fields (Boon et al. 2019). For future research, we encourage a mutual interchange of findings from corresponding research streams, as we showed with our study.
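The cluster selection and the cluster trust index can be sketched as follows. This is an illustration under stated assumptions, not the exact procedure: "largest 10%" is read here as the top tenth of clusters ranked by article count, and the trust index is read as the share of assigned articles that remain after the manual qualitative analysis. The cluster sizes, the function names, and the example counts are hypothetical.

```python
def largest_share(cluster_sizes, percent=10):
    """Return the ids of the largest `percent` of clusters by article count."""
    n = len(cluster_sizes)
    # Integer ceiling of n * percent / 100, with at least one cluster selected.
    k = max(1, (n * percent + 99) // 100)
    ranked = sorted(cluster_sizes, key=cluster_sizes.get, reverse=True)
    return ranked[:k]


def trust_index(assigned, kept):
    """Share of assigned articles kept after manual review (assumed definition)."""
    return kept / assigned


# Hypothetical sizes for 20 clusters: cluster i holds 5 + 3 * i articles.
sizes = {i: 5 + 3 * i for i in range(1, 21)}
selected = largest_share(sizes)

# Hypothetical review outcome: 26 of 40 assigned articles fit the cluster topic.
index = trust_index(assigned=40, kept=26)
print(selected, index)
```

Under this reading, a cluster with a trust index above 0.5 would count among those that "kept over 50 percent of the assigned articles"; a relative approach such as k-means would replace the fixed percentage threshold with a data-driven grouping of cluster sizes.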