1 Introduction

Innovation is crucial to the advancement of a society and an economy. Data-driven innovation (DDI), that is, producing innovative outputs from data, is among the latest forms of innovation (Deloitte, 2016). According to the Organisation for Economic Co-operation and Development (OECD), DDI “forms a key pillar in twenty-first century sources of growth” (OECD, 2015a). Given that data is the most important raw material for DDI, firms compete with each other for access to data (Hernaes, 2021). In general, bigger firms have more access to resources, including data (Karlsson, 2021). Therefore, small-to-medium firms demand that data be made public so that any firm can translate data into economic and social value (Khurshid et al., 2020). Along this line, especially in the last two decades, a demand has been observed for non-sensitive data (e.g., traffic or weather data) to be made freely available for everyone to use, reuse, and republish (Hossain et al., 2022). Consequently, governments of many countries have been releasing open data in reusable formats, free from copyright restrictions and monetary obligations on reuse or distribution (Hossain et al., 2016a, 2016b).

As the products and/or interpretations of data are sometimes more attractive than the original data (Serra, 2014), firms collate, combine, and enhance open data to produce DDI applications (Jetzek et al., 2014). Since open data came into existence, it has been expected that data-driven applications would emerge not only from large firms but also from small-to-medium start-up communities (Andersen & Pedersen, 2021). All firms, irrespective of their size, would develop DDI applications by using data from government (e.g., bureaus of statistics), business organisations (e.g., retailers), and individuals (e.g., consumer data) (DPMC, 2016). Unfortunately, “DDI is still at the early stage of development and adoption” (Luo, 2022, p. 5) and therefore its success is yet to be realised on a larger scale (Deloitte, 2016).

At the same time, empirical research on DDI is limited overall (Saura et al., 2021). Nevertheless, academic research on its drivers is emerging. A major stream of research on DDI has focused on its processes for facilitating decision-making (e.g., Babu et al., 2021) and on its economic and social value (e.g., Glass et al., 2015; OECD, 2014, 2015b; Saura et al., 2021). A few studies have discussed DDI from a technological perspective and identified associated variables including data and technology (e.g., Babu et al., 2021; Sultana et al., 2022a, 2022b) and user privacy (Saura et al., 2021). Studies also suggest that the success of DDI cannot be achieved without a proper combination of organisational resources and capabilities (Babu et al., 2021; Chatterjee et al., 2021). Extant literature has also examined its external environmental (hereafter “environmental”) factors, including technological infrastructure (e.g., Lafferty, 2019) and legal issues (e.g., Witjas-Paalberends et al., 2018). However, an integrated approach combining the technological, organizational, and environmental (TOE) perspectives on DDI drivers is scarce; this is the fundamental research gap that this study addresses. The second research gap that this study addresses emanates from the following studies. Jetzek et al. (2019) suggest an agenda for organizational-level research (rather than macro industry-level research) to study the ‘complex’ relationship of the variables that explain how value can be generated using data-driven applications. Instead of net effects, they suspect an ‘interplay’ of the related variables (p. 722). Similarly, Jetzek et al. (2014) have explored the drivers of the DDI mechanism (which can be explained with the TOE framework) and suggest that “authors [should] examine which configurations of identified enabling factors actually lead to an observed successful outcome” (p. 117), i.e., DDI development.
Recent literature also suggests that one of the fundamental limitations of studies that have applied the TOE framework is that “The underlying assumptions are that the [antecedent] factors are mutually independent, and no interaction and coordinate relationship exists among various factors” (Zhao & Fan, 2021, p. 3). Although the name TOE itself suggests a triadic relationship among the T-O-E variables, IS studies overlook such an integrative approach, which reminds us of the story of ‘the blind men and the elephant’ (see Saxe, 1873). Meanwhile, scholars explaining complex phenomena have established that individual factors rarely operate in isolation. Rather, “reality usually includes more than one combination of conditions that lead to high values in an outcome condition” (Woodside, 2013, p. 464). For TOE, Hossain et al. (2022) and Zhao and Fan (2021) show complex configurational trade-off effects among TOE variables. However, no study in the DDI field has empirically identified the configurations of the drivers of DDI development, which remains an important research gap. Given the complexity of DDI (Jetzek et al., 2019), identifying the different causal relationships among the DDI drivers is important for better theoretical and policy implications.

Against this backdrop, we develop the following research questions (RQs):

  • RQ1: What are the technology-organization-environment drivers of DDI, and how do they individually affect the development of DDI applications by firms?

  • RQ2: How do the technology-organization-environment drivers configure to explain high DDI development?

To answer the first question, we identified the TOE variables through a systematic literature review and investigated their effects on DDI development using partial least squares structural equation modelling (PLS-SEM). Our results suggest that the technological and organisational variables, but not the environmental variables, have significant individual effects on DDI development. To answer the second research question, we applied the fuzzy-set qualitative comparative analysis (fsQCA) technique because it deals with the complex configurational trade-off effects of the antecedent factors, providing a more realistic representation of a complex phenomenon (Woodside, 2014). Moreover, Hossain et al. (2022) have recommended that “fsQCA can successfully reveal which combinations of TOE variables are important” (p. 14). Based on complexity theory (Woodside, 2014), we consider that DDI development depends not on the net individual effects of the antecedent factors but on their specific configurations (i.e., the ‘gestalt’ effect). From the fsQCA, we find that the technological variables, firms’ exploitation capacity, and technology-oriented leadership are individually necessary, but not sufficient, for DDI development. For sufficient conditions, fsQCA suggests three solutions in which the TOE variables are combined differently.
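The distinction between necessary and sufficient conditions can be illustrated with the standard fuzzy-set consistency and coverage measures used in fsQCA. The following is a minimal sketch only: the membership scores, the variable names, and the choice of two conditions are hypothetical and are not the data or configurations of this study, and the cut-offs that a researcher applies to these measures are deliberately left out.

```python
from typing import List, Sequence


def conjunction(*conditions: Sequence[float]) -> List[float]:
    """Fuzzy AND: membership in a configuration is the minimum across its conditions."""
    return [min(scores) for scores in zip(*conditions)]


def _overlap(x: Sequence[float], y: Sequence[float]) -> float:
    """Sum of case-wise minimum memberships in X and Y."""
    return sum(min(xi, yi) for xi, yi in zip(x, y))


def sufficiency_consistency(x: Sequence[float], y: Sequence[float]) -> float:
    """Degree to which X is a subset of outcome Y: sum(min(x, y)) / sum(x)."""
    return _overlap(x, y) / sum(x)


def sufficiency_coverage(x: Sequence[float], y: Sequence[float]) -> float:
    """Share of outcome Y accounted for by X: sum(min(x, y)) / sum(y)."""
    return _overlap(x, y) / sum(y)


def necessity_consistency(x: Sequence[float], y: Sequence[float]) -> float:
    """Degree to which outcome Y is a subset of X: sum(min(x, y)) / sum(y)."""
    return _overlap(x, y) / sum(y)


# Hypothetical calibrated membership scores for five firms (illustration only).
data_quality = [0.9, 0.8, 0.6, 0.3, 0.7]
leadership = [0.8, 0.9, 0.4, 0.2, 0.9]
ddi_development = [0.9, 0.9, 0.5, 0.4, 0.8]

# A configuration combining both conditions.
config = conjunction(data_quality, leadership)

print("sufficiency consistency:", round(sufficiency_consistency(config, ddi_development), 3))
print("sufficiency coverage:", round(sufficiency_coverage(config, ddi_development), 3))
print("necessity of data quality:", round(necessity_consistency(data_quality, ddi_development), 3))
```

A single condition can score highly on necessity while a configuration of several conditions is what reaches high sufficiency consistency, which is the pattern that the net-effect view cannot express.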

This paper thus contributes to the literature in several ways. First, the major contribution of this paper is to apply an integrated framework, i.e., TOE, to the research field of DDI by integrating DDI drivers identified from a systematic literature review (SLR). To the best of our knowledge, this is the first study attempting to investigate context-specific antecedents of DDI, thereby expanding our understanding of the phenomenon. Second, our findings confirm that a multi-method approach, identifying both the net and configurational effects of the variables, offers richer insights into DDI development. Our fsQCA results suggest that the individual drivers have complex trade-off effects and that certain combinations of them act as sufficient conditions for DDI. Finally, methodologically, we contribute to the DDI research domain by applying complexity theory via fsQCA. Notably, fsQCA uses both qualitative and quantitative methods within its process. Although it does not specifically optimize the combination of antecedents to produce the best outcome, it provides sufficient and necessary conditions for DDI development using configurations of various antecedents. We hence argue that fsQCA shares the basic tenet of the soft operational research (OR) approach. Mingers (2011) mentions that soft OR deals with “wicked problems or messes”; as we are dealing with the complex problem of DDI development using fsQCA, we argue that this approach is a legitimate soft OR method.

2 Background

2.1 Data driven innovation: a brief discussion

Data driven innovation (DDI) “is simply about using data for innovation” (Curley & Salmelin, 2017, p. 123). It can also be defined as “the use of data and analytics to improve or foster new products, processes, organisational methods and markets” (OECD, 2015b, p. 21). Studies understand DDI as the analysis of large volumes of data, by private and public sector organisations, to make better decisions and create new products and services, thereby generating economic and social value (e.g., Glass et al., 2015). In short, “Innovation that creates real business value and stems from data processing and analysis is known as data-driven innovation or DDI” (Deloitte, 2016, p. 27).

Data for DDI development may come from different sources, closed or open. For example, Google, Amazon, and Facebook use customer/user data to develop DDIs for both their customers/users and their advertisers; such data may not be released to other organisations and are thus called closed data (Zuiderwijk & Janssen, 2015). In contrast, data may come from open sources with ‘open licencing’; these are called open data. “Open data is data that has been made publicly available for use in an accessible format, along with sufficient metadata to understand and analyse the data, and the legal freedom to use it for most or any purpose without asking for further permission” (Glass et al., 2015, p. 70). For instance, Terbine collects global Internet-of-things (IoT) raw data from different types of machines (e.g., vehicles, satellites) and makes it searchable by and available to any firm or user to develop DDI, e.g., for precision agriculture and smart transportation (Babu et al., 2021). Similarly, Dawex and Caruso provide data exchange platforms that share in-vehicle data of different vehicle manufacturers (Fetene et al., 2017). Recently, Amazon Web Services (AWS) has made it easier for DDI-developing firms to use data from world-leading data providers (Hejazi, 2020).

Other than private sector businesses, open data may come from government agencies that, while serving citizens, produce and collect enormous amounts of data, including traffic, weather, census, and crime data. This is called open government data (OGD), a subset of open data. Many countries understand that public data (which are not privacy-sensitive and not related to state security) should not be held captive but set free, and thus mandate public departments and agencies to release data in ‘open’ format (i.e., without any further copyright obligation to reuse or distribute) (Hossain et al., 2022). This OGD principle focuses on releasing data for use and re-use by public and private organisations (i.e., DDI developer communities) to promote innovation (Chatterjee et al., 2021). Governments realise that if data is not open, DDI will become the exclusive province of large firms (Deloitte, 2016), and small-to-medium firms will have difficulty surviving. By opening their datasets, governments thus encourage DDI and facilitate the realization of its potential.

Combining open data from both private and government agencies (hereafter “open data”), firms develop DDI applications that facilitate access to existing public services and/or provide innovative services to users (Babu et al., 2021). For DDI to take place, organisations use different techniques and technologies to define and capture relevant data, and to process and analyse it in order to produce innovative products and services, optimise organisational processes, enhance research and development, and more (OECD, 2014).

2.2 DDI initiatives using open data

Open data are used to develop a very wide range of DDI applications, including web-based portals and applications for smartphones and other mobile devices. For example, the Open Data Cube (ODC) (https://www.opendatacube.org/) is a global initiative where anyone can access, use, and reuse satellite data to increase its value and use. ODC also provides users with access to free and open data management technologies and software for data analysis. PatientsLikeMe, “the world’s largest integrated community, health management, and real-world data platform” (www.patientslikeme.com/about), connects patients with similar diseases (or symptoms) so they can share experience or information (e.g., hot spots in cities that trigger asthma attacks) (Hendler et al., 2012). It also provides “an evidence base of personal data for analysis and a platform for linking patients with clinical trials” (OECD, 2015b, p. 30). The Woozy app is one of the top health apps in the USA; it reports the most recent recalls issued by different government agencies, including the Food & Drug Administration (FDA) and the US Department of Agriculture. Additionally, users can customise their feed according to their allergies (Adida et al., 2010). Similarly, as a classic example showing “how data can help to revolutionise a sector”, SoilEssentials employs data to help farmers optimise their businesses, leading to precision agriculture (Scottish-enterprise, n.d.). In Nigeria, where no official traffic feed exists, Tsaboin crowdsources traffic data from individual passengers around bus stops in Lagos so that users can check traffic information in real time and make smart traffic decisions (OECD, 2015b). Similarly, Openbaarvervoer in the Netherlands and Tågtavlan in Sweden show real-time traffic information (van Oort et al., 2015). Finally, Närmaste provides location-based information about local services and companies (libraries, pharmacies, gas stations, etc.) in Sweden. The list could be extended with many similar applications.

2.3 Theoretical foundation of DDI development

The theoretical development of DDI and its related factors (e.g., its adoption and diffusion) is still emerging. Historically, innovation has been considered the key to organizational success (Chaudhuri et al., 2021), whereas the design and development of data-driven innovative services is a key to survival in today’s market (OECD, 2015a). To explain DDI use in organizations, the literature draws on several theories. Among them, the dynamic capability view (DCV) (Teece et al., 1997) is popular (e.g., Kozak et al., 2021). Over the past few decades, numerous studies have identified DCV as one of the prominent theories for formulating firm strategies, especially in the area of innovation and new product development (Sultana et al., 2022a). Drawing on the principle of DCV, Kozak et al. (2021) suggest that it is not organizational resources themselves, but the processes that transform these resources into business operations in a dynamic business environment, that are critical for the success of an organization.

The constituent elements of DCV leverage the technological and organizational capabilities required for improving organizational DDI. Accordingly, a few studies (e.g., Chatterjee et al., 2022; Sultana et al., 2022a, 2022b) have integrated DCV with the resource-based view (RBV) (Barney, 1991). RBV suggests that firms need to utilize different capabilities; more precisely, they need to be technologically innovative to survive and grow. From an empirical study conducted in Australia, Sultana et al. (2022a) find DDI capability to be a third-order construct, reflected by market-oriented, infrastructural, and innovation-talent capabilities, which are in turn reflected by six first-order capabilities. They further suggest that DDI capability directly as well as indirectly (mediated via competitive performance) increases organizations’ strategic market agility. Similarly, from a systematic literature review, Sultana et al. (2022b) propose that DDI capability is dependent on organizations’ management, infrastructural, and talent capabilities. In both studies, the ‘infrastructural capability’ is related to ‘data and technological capabilities’, whereas the other capabilities are directly related to organizational resources. Sultana et al. (2022b) understand ‘data’ as an internal capability of an organization to explore and exploit big data for effective decision-making by managers, and ‘technology’ as a generic capability to utilize hardware and software to analyse data. However, Sultana et al. (2022a) have not elaborated on the nature of ‘data and technology capability’ or how it contributes to firms’ infrastructural capability. Nevertheless, both studies suggest that organizational capabilities are necessary to handle the fast and unprecedented changes within the business environment, and that the constituent elements of DCV and RBV leverage the organizational and technological capabilities required for improving organizational DDI. Meanwhile, Chatterjee et al. (2022) have found that DDI capability (IT, and R&D) and techno-functional capability (remote work, and CRM technology) increase the supply chain (SC) management capability of Indian firms. Moreover, technology-oriented leadership moderates the relationship between firms’ SC capability and SC performance. Interestingly, their approach is different; they examine the causal effects of DDI and techno-functional capabilities on SC management capability, whereas other studies (e.g., Sultana et al., 2022a, 2022b) consider ‘capability’ as a higher-order construct.

Babu et al. (2021) have combined DCV with institutional theory (DiMaggio & Powell, 1983) and the technology-organisation-environment (TOE) framework (Tornatzky et al., 1990), and applied a mixed-method research approach (SLR, thematic analysis, and semi-structured interviews). They have proposed a seven-step DDI process, including conceptualizing innovation, data acquisition, data refinement, data storage and retrieval, data distribution, data presentation, and market feedback. Similarly, applying SLR and thematic analysis, Sultana et al. (2021) suggest a ‘standardized’ seven-step process for DDI, including data product conceptualization, data acquisition, data refinement, data storage and retrieval, data product distribution, data product presentation, and market feedback. Interestingly, although these two studies used different keywords in different databases for the SLR and applied different research methods, both arrive at essentially the same seven-step DDI process.

While there is a consensus about the complex nature of DDI and its associated variables, there is no agreement on how the antecedents of DDI can consistently explain it. Consequently, we witness inconsistencies in the literature. For instance, Luo (2022) conceptualizes strong leadership as a driver of DDI. However, Chatterjee et al. (2022) do not posit a direct effect of IT leadership on DDI capability but rather an effect on the relationship between SC capability and SC performance. Similarly, Chatterjee et al. (2021) have found that leadership is a moderator between DDI and firm performance. Moreover, while several studies suggest the independent influence of the drivers of DDI, their configurational effects have not yet been identified. This oversimplifies a complex process like DDI, where examining the net effects of its antecedents does not explain the phenomenon entirely. Furthermore, for a complex technology such as DDI, one factor is rarely sufficient to explain its success; rather, a configuration of factors is far more likely to do so. Therefore, we argue that the constellations of DDI drivers can offer more accurate and consistent answers and add valuable insights.

3 The research model explaining DDI development

Our research model explains the drivers of DDI development by firms. To identify these drivers, we have conducted a systematic literature review, following Saura et al. (2021) and Sultana et al. (2022b). Prior studies (e.g., Webster & Watson, 2002) suggest that a literature review can effectively identify emerging issues that could potentially benefit our theoretical foundation. Following Sultana et al. (2022b) in the DDI domain, our systematic literature review process had three main phases: (a) planning and searching, (b) screening and extraction, and (c) synthesizing.

In the planning and searching phase, we started by searching the Scopus database because of its comprehensiveness and its inclusion of major articles from other databases (Bhimani et al., 2019). We used the keywords “Data driven innovation” OR “Data-driven innovation”, searching only in article titles. The search covered papers from any time period and produced 134 papers. In the screening and extraction phase, we excluded 67 papers that were conference papers, book chapters, editorials, reviews, etc., and kept only the journal articles, as these reflect the state of the art of a research topic (e.g., Olanrewaju et al., 2020). Furthermore, we excluded 10 papers from unrelated fields, including medicine, physics, and nursing. In the final synthesis phase, we screened the abstracts of the papers and excluded 29 papers because they were not relevant to the topic of this paper. Finally, we reviewed 28 papers, which are presented in Table 1. Figure 1 summarises the entire step-by-step SLR process.
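The screening counts reported above can be checked with a short tally. This is only a sketch of the arithmetic; the stage labels are paraphrased from the text, and the variable names are illustrative.

```python
# Initial number of Scopus hits reported in the text.
initial_hits = 134

# (reason for exclusion, number of papers removed), in the order described.
exclusions = [
    ("non-journal items (conference papers, book chapters, editorials, reviews)", 67),
    ("unrelated fields (medicine, physics, nursing and similar)", 10),
    ("not relevant after abstract screening", 29),
]

remaining = initial_hits
for reason, removed in exclusions:
    remaining -= removed
    print(f"removed {removed:>2} for {reason}; {remaining} remain")

# The final count should match the number of papers reviewed in Table 1.
assert remaining == 28
```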

Table 1 The papers from systematic literature review on DDI antecedents
Fig. 1 The systematic literature review process

The papers reviewed in our SLR suggest that DDI is dependent on various technological, organisational, and environmental variables (see Table 1). Specifically, “DDI integration is not an easy task. It requires learning, knowledge and recognition of the need, qualifying personnel and, of course, the existence of both internal and external datasets” (Deloitte, 2016, p. 11). Therefore, the TOE framework (Tornatzky et al., 1990) can be used as a suitable theoretical lens for our current research. The TOE framework is an attempt to develop a unified framework for explaining the organisational adoption of a technological innovation along technological, organizational, and environmental dimensions. The framework is considered an integration of several IS theories, including the Technology Acceptance Model (TAM), Innovation Diffusion Theory (IDT), and institutional theory, and is thus effective in explaining an IS/IT phenomenon such as DDI (Hossain et al., 2017).

3.1 Technological variables

“The key to DDI is in fact the data itself” (Deloitte, 2016, p. 17). In other words, data is essential for DDI; however, more data does not simply mean more innovation (Gupta, 2022). Prior DDI studies blame several technological barriers for its low and slow adoption, including the availability, accessibility, and quality of data and metadata (Jetzek et al., 2014), and the compatibility, processability, licensing, and lack of standards of open data (see, for example, Janssen et al., 2012). As DDI development is highly dependent on open data, its success is highly dependent on the attributes of open data. By definition, open data are “freely accessible online, available without technical restrictions to re-use, and provided under open access license that allows the data to be re-used without limitation” (Jetzek et al., 2014, p. 102). Hence, the technological attributes of availability, accessibility, and open license are already captured by the definition of open data and therefore have not been considered in our study. As the quality of data and metadata is consistently considered critical for open data (e.g., Hossain et al., 2016a, 2016b, 2022b; Purwanto et al., 2020) and so for DDI (e.g., Hemerly, 2013; Witjas-Paalberends et al., 2018), we consider these as our technological dimensions driving DDI development.

3.1.1 Data quality

Data is the most important input of value creation through DDI and is considered the ‘foundation-level’ resource (Yu et al., 2021). Data quality is a ‘technical attribute’ of data (Purwanto et al., 2020), which can be defined as “data that are fit for use by data consumer” (Wang & Strong, 1996, p. 6). Availability of data and lack of data quality are consistent concerns of DDI stakeholders (Glass et al., 2015; Luo, 2022). In general, quality data improves DDI (Chatterjee et al., 2021) and eases its development (e.g., Cronholm et al., 2017). In contrast, poor-quality data is as good as no data. Following the ‘garbage in, garbage out’ principle, in the DDI context, poor-quality data leads to a fusion chain of wrong interpretations and to reduced DDI quality. In addition, insufficient data quality may increase the cost of accessing and interpreting data and reduce re-use opportunities (OECD, 2015b).

3.1.2 Metadata quality

Metadata refers to data about data; it describes some other data (or dataset) and the structure of the data in question. Metadata provide structured information that makes data/datasets easier to retrieve and use (Open Data Support, 2014). They may also include supplementary information about the records in a dataset (Glass et al., 2015), including the title and description of the data, method of collection, author/publisher, area and time period covered, licensing criteria, and date and frequency of release (Lisowska, 2016). High-quality metadata allows users to find the context of the data in a simple and accessible way. In contrast, poor metadata may merely state the existence of a dataset without defining the relationships between datasets. Metadata and their quality are a common concern for DDI development (Morabito, 2015). For good-quality DDI to develop, it is essential that open datasets have adequate, high-quality metadata to aid both the discoverability and the usability of the data by application developers (Abella et al., 2017; Jetzek et al., 2014). It is often observed that “there are metadata quality issues that could disrupt the success of Open Data” (Neumaier et al., 2016, p. 25) and thus DDI development. Lack of metadata quality makes it difficult for developers to find relevant datasets, understand them, and perform data analyses to develop DDI applications (Kubler et al., 2018; Villanova University, 2019). In contrast, the availability of ‘satisfactory’ metadata offers better data acquisition and analytics, which support successful DDI development (Morabito, 2015).
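As an illustration of the metadata elements listed above, the following minimal sketch models a dataset's metadata record and computes a simple completeness score. The class, field names, and scoring rule are hypothetical and do not follow any particular metadata standard; completeness is used here only as a crude proxy for one aspect of metadata quality, not as the measure used in this study.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record covering the elements named in the text."""
    title: Optional[str] = None
    description: Optional[str] = None
    collection_method: Optional[str] = None
    publisher: Optional[str] = None
    coverage_area: Optional[str] = None
    coverage_period: Optional[str] = None
    licence: Optional[str] = None
    release_date: Optional[str] = None
    release_frequency: Optional[str] = None


def completeness(meta: DatasetMetadata) -> float:
    """Share of metadata fields that are filled in (neither None nor empty)."""
    values = [getattr(meta, f.name) for f in fields(meta)]
    filled = sum(1 for value in values if value not in (None, ""))
    return filled / len(values)


# A sparse record merely states that a dataset exists.
sparse = DatasetMetadata(title="Traffic counts")

# A richer record gives developers the context they need to reuse the data.
rich = DatasetMetadata(
    title="Traffic counts",
    description="Hourly vehicle counts at selected intersections",
    collection_method="Roadside sensors",
    publisher="City transport agency",
    coverage_area="Metropolitan area",
    coverage_period="2020 to 2021",
    licence="Open licence",
    release_date="2022-01-15",
    release_frequency="Monthly",
)

print("sparse record completeness:", round(completeness(sparse), 2))
print("rich record completeness:", round(completeness(rich), 2))
```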

3.2 Organisational variables

DDI initiatives are organization driven. Because organizational variables differ, the success of DDI initiatives may vary among organisations. Hence, the DDI literature has paid significant attention to the organizational dimensions. Among the organisational variables, the common aspect is the capability and capacity of firms to acquire, analyse, and exploit knowledge in general and data more specifically (e.g., Babu et al., 2021; Kozak et al., 2021; Sultana et al., 2021). The next important organisational capabilities for DDI development are strong leadership (e.g., Chatterjee et al., 2021; Luo, 2022) and skilled professionals (e.g., Babu et al., 2021; Luo, 2022). Therefore, this study considers three organisational dimensions that drive DDI development: absorptive capacity, technology-oriented leadership, and skilled professionals.

3.2.1 Absorptive capacity

Absorptive capacity refers to “a firm’s ability to identify, assimilate, transform, and apply valuable external knowledge” (Roberts et al., 2012, p. 625). While the classical absorptive capacity literature mostly focuses on knowledge, some studies relate it with data/information acquisition, exploitation etc. that eventually generate knowledge (Cohen & Levinthal, 1990). For example, in DDI context, Yu et al. (2021) suggest that firms apply various strategies to produce, collect and exploit data for knowledge creation, market formation and investment mobilization. Precisely, Chatterjee et al. (2021) demonstrate that acquisition of data, its assimilation, synthesisation, and exploitation are associated with absorptive capacity, and assist firms towards DDI.

In extant literature, “absorptive capacity is treated as a second-order construct composed of three or four first-order reflective constructs” (Cooper & Molla, 2017, p. 396). For example, Cohen and Levinthal (1990, p. 128) suggest three components: acquisition, assimilation, and exploitation of knowledge. Later, Roberts et al. (2012) summarised identification, assimilation, and application of external knowledge as the components/sub-dimensions of absorptive capacity. Our study adopts the approach of Cooper and Molla (2017), who modelled absorptive capacity as a second-order formative construct consisting of four first-order constructs: acquisition, assimilation, transformation, and exploitation.

First, acquisition of data and knowledge is critical in the DDI context. For example, identifying and acquiring data from various sources is a prerequisite for generating new ideas and applications (Chatterjee et al., 2021). Second, organisations need prior related knowledge to assimilate and use new knowledge (Cohen & Levinthal, 1990). Hence, assimilation, i.e., analysing, understanding, and interpreting data and knowledge obtained from external sources, is important (Cooper & Molla, 2017). Third, transformation capabilities help firms to develop and refine the existing processes that facilitate integrating their existing knowledge with the newly acquired and assimilated knowledge (Cooper & Molla, 2017). The final component is exploitation capacity. DDI is related to business initiatives that are based on exploiting data and generating value (Jetzek et al., 2014). Since the open data revolution, data is now an accessible resource but often “is not sufficiently exploited”; the firms that “know how to exploit it wisely will enjoy huge economic and social advantages” (Deloitte, 2016, p. 5). Recent studies, in general, find a direct association between absorptive capacity and an organisation’s innovation capacity (e.g., Sancho-Zamora et al., 2022). In the DDI context, extant literature reports that DDI requires the exploitation of data to create value through data acquisition, data analysis, data curation, data storage, and data usage (e.g., Bresciani et al., 2021; Yu et al., 2021).

3.2.2 Technology-oriented leadership

Current studies suggest strong leadership in general (e.g., Luo, 2022) and IT leadership in particular (e.g., Chatterjee et al., 2022) as important organizational variables for DDI. More specifically, “leadership around data-driven decisions was crucial: if the CEO makes it clear that data is required to support every decision, then the organisation can orient around measured reality rather than around hunches about the way the world actually is” (Glass et al., 2015, p. 35). In an open data environment, although all firms have similar access to data, not all can use it because of differences in technology-oriented leadership (e.g., Jetzek et al., 2014). Therefore, utilising data to create innovative applications needs strong technology-oriented leadership. This specific type of leadership entails integrating IT effort with business purpose and operations, and deciding procedures and designing workflows related to DDI development (OECD, 2015c). Unlike general leadership, technology-oriented leadership focuses specifically on management’s IT and technology orientation, typified by CIO-type positions in the business leadership structure of a firm. Technology-oriented leadership implies managers with open data and DDI awareness and knowledge who place DDI initiatives within the overall business of the firm, set effective strategy, and assign an appropriate level of authority and resources (adapted from Hossain et al., 2016a, 2016b). They further establish strong IT strategies, including directing training, research, and innovation (Chatterjee et al., 2021; Feeny & Willcocks, 1998). In the DDI context, Chatterjee et al. (2022) have found that technology-oriented leadership moderates the relationship between firms’ SC management capability and SC performance. Deloitte (2016) suggests that firms with strong technology-oriented leadership produce more DDI applications than their counterparts do.

3.2.3 IT skill

DDI development demands different IT skills, for example, to collect data, make sense of the data, and analyse and apply data for application development (Kopanakis et al., 2016). “The anticipated benefits in productivity growth from DDI depend on several enabling and complementary factors, including in particular (i) the level of skills available to organisations, and (ii) the readiness of organisations” (OECD, 2014, p. 8). As DDI is relatively new, a skilled workforce is not readily available in the market (LaValle et al., 2011). “Lack of people with the right skills was the real constraint on DDI” (Glass et al., 2015, p. 35). Recent DDI studies have found that skilled professionals (Bresciani et al., 2021; Luo, 2022; Witjas-Paalberends et al., 2018; Yu et al., 2021) are critical for DDI as their unavailability disrupts DDI development (Babu et al., 2021; OECD, 2015b).

3.3 Environmental variables

3.3.1 Communication infrastructure

In the current context, communication infrastructure is defined as the extent to which a firm can access the required technological systems to collect data from various open sources, process it to develop applications, and bring the applications to market. DDI development is not easy without an effective technical communication infrastructure (Jetzek et al., 2019). An effective data communication infrastructure facilitates easy data exchange between government agencies, private firms, and the users (Jetzek et al., 2014). To develop an application, developer-firms need data that are distributed across different sources (e.g., weather, geographical, and population data, spatial planning, trade and finance). Therefore, a strong communication infrastructure, which will ensure that different datasets can communicate with each other, is required (Zuiderwijk & Janssen, 2014). Moreover, for DDI applications to run, continuous communication between the user’s device (‘client’) and the (database) server/s is required, which in turn requires an effective communication infrastructure. Recent DDI studies (e.g., Babu et al., 2021; Lafferty, 2019) suggest that the lack of effective technological and communication infrastructure is a significant barrier for DDI.

3.3.2 Multiple platform

In general, “a platform is a group of technologies that are used as a base upon which other applications, processes or technologies are developed” (Techopedia, n.d.). More specifically, a digital platform refers to an environment (or a technical architecture) upon which a DDI application is supported, developed and/or run. “The future of innovation will involve data-driven innovation based on new digital platforms” (Andersen & Pedersen, 2021, p. ix). The authors further discuss two different types of platforms that support DDI: “one is a broad platform, such as the platform offered by CBInsights (delivers industry reports at macro level) and the other type of platform, such as the one offered by Valuer.ai (operates solely at the micro level)”. The availability of different digital platforms is important for DDI because they “enable a data-driven world” (Bendor-Samuel, 2018). The data needed to develop DDI should be available on multiple platforms, including web and mobile, Windows or Mac, and smartphones and tablets running Windows, Apple, or Android. Similarly, DDI developers develop applications that end-users can run on multiple platforms.

The research model, integrating the TOE variables as the antecedents of DDI development, is presented in Fig. 1.

4 A configurational model of DDI development

The majority of TOE studies assume the relations between TOE variables are symmetric. However, Hossain et al. (2022) and Zhao and Fan (2021) suggest that technology adoption is quite complex: one variable by itself cannot explain the adoption phenomenon fully; rather, more than one variable together influence the outcome variable in question. Regression models, e.g., PLS-SEM, partially address this issue by employing moderation (aka interaction) effects but cannot handle multiple variables in a single moderating effect. Moreover, a ‘one-model-fits-all’ approach seldom works; we often see that two studies conducted in the same context and applying similar regression methods produce different results (Hossain et al., in press). This happens because these models assume that an independent variable (IV) is both a necessary and a sufficient condition to predict the dependent variable (DV), and that relations between the DV and IV are symmetric (Pappas et al., 2020). However, this assumption is overly optimistic and misleading because the relationship between two variables is unlikely to be symmetric (Woodside, 2014). Rather, one IV may be necessary for a DV but not always sufficient; it may need to be combined with other IVs in different combinations. Therefore, a configurational approach is required to identify the asymmetric relations between the IVs (‘conditions’) that predict the DV (‘outcome’).

We posit that DDI development is a complex phenomenon because it depends on the different configurations of TOE factors (Petter et al., 2007). On DDI, Jetzek et al. (2014) suggested a research agenda to investigate the “configurations of [the] … enabling factors” leading to DDI (p. 117). Our configurational research model (see Fig. 2) posits that DDI development depends on technological (i.e., data quality and metadata quality), organisational (i.e., absorptive capacity, technology-oriented leadership, and IT skill), and environmental (i.e., communication infrastructure and multiple platform) factors. When identifying the detailed nature of the configurations of the TOE variables, we specifically look at three propositions, which have been developed from the tenets of complexity theory (Woodside, 2014).

Fig. 2 The research model predicting DDI development

Fig. 3 A configurational view of DDI development

Proposition 1

A TOE factor individually may be necessary but is rarely sufficient for predicting high scores in DDI development.

Proposition 2

Outcomes of interest rarely result from a single condition i.e., causal conditions rarely operate in isolation. Hence, a combination of two or more TOE factors may suggest sufficient condition for a consistently high score in DDI development.

Proposition 3

Not just one deterministic solution, but disparate configurations of causal factors (i.e., conditions) are equifinal in leading to an outcome condition. It means high scores in DDI development can be achieved through different configurations of the TOE variables.

5 Research method

5.1 Measures

For reliability, our measures are based on previously developed indicators. Data quality is often discussed in terms of its attributes (Wang & Strong, 1996), including accuracy, currency, openness, accessibility, and completeness (Roa et al., 2019). In the open data and DDI context, data quality is characterised by completeness, correctness, ease of linking, and consistency of data. Therefore, to measure data quality, we adopt scales from Purwanto et al. (2020). Similarly, using the scale from Hossain et al. (2022), metadata quality has been measured with four items: consistent, easy to understand, structured, and well-documented features of metadata available on open datasets.

Following prior scholars, “absorptive capacity is modelled as a second-order reflective-formative construct” (Cooper & Molla, 2017, p. 396). Therefore, to be consistent with prior studies (e.g., Cooper & Molla, 2017; Sancho-Zamora et al., 2022), the lower-order constructs (LOC) (i.e., acquisition, assimilation, transformation, and exploitation) of the higher-order construct (HOC), i.e., absorptive capacity, are measured with reflective items. The items of acquisition, assimilation, transformation, and exploitation have been adopted from Jansen et al. (2005) and Flatten et al. (2011). As absorptive capacity is a HOC, it does not have any items of its own but uses the items of its LOCs as reflective measures (see Hair et al., 2021 for detail). Technology-oriented leadership has been measured with five items. Its first item, adapted from Feeny and Willcocks (1998), identifies the managers’ knowledge of DDI. The following three items, taken from Hossain et al. (2022), reflect whether the management endorses, develops, and communicates DDI, and provides training on it. The final item (management’s innovative ideas) has been obtained from Deloitte (2016); it is not validated but is consistently reported by DDI studies (e.g., Sultana et al., 2022a). Similarly, IT skill has been measured with four items from Hossain et al. (2022).

Because validated scales are scarce, the current study has developed the measures of the environmental factors (i.e., communication infrastructure and multiple platform) and DDI development. From extant studies (e.g., Hossain et al., 2022; Jetzek et al., 2014) and industry reports (e.g., Deloitte, 2016), we initially identified the items of the respective constructs. The items were then reviewed by a convenience sample of three academics (working in the IS field) and two PhD students (working on data analytics) to ensure consistency, ease of understanding, and relevance to the context. In this process, we checked the content validity of the items, applying a derivative of Q-Methodology (Van Exel & De Graaf, 2005) and an item vs. construct rating matrix (MacKenzie et al., 2011). We grouped the items and placed them in rows, and the columns represented the constructs. All the respondents correctly allocated the item groups to their respective constructs, which indicates adequate content validity of the items. Before finalising, we held a one-on-one session with these experts to ensure content validity. Table 4 contains our construct measures.

All these items have been rated on a 5-point Likert scale ranging from ‘strongly disagree’ to ‘strongly agree’ because “literature suggests that five-point scale appears to be less confusing and to increase response rate” (Bouranta et al., 2009, p. 280). All indicators are reflective in nature. We have the following demographic variables: size (i.e., number of employees), type (private, not-for-profit, and others) (Chaudhuri et al., 2021), industry category (e.g., technology, healthcare), business model (e.g., B-2-B, B-2-C) (Hallikainen et al., 2020), and the position of the respondent (e.g., senior and mid-level manager) (Chaudhuri et al., 2021). To measure firm size, we use the OECD (2005) definition, i.e., we assign firms to the categories of 1–49 employees (small), 50–249 employees (medium-sized), and 250 or more employees (large).

5.2 Data collection

The empirical context of our research is Australian organisations engaged in DDI development using open data. Australia has been chosen for this study because it is one of the forerunners that “make public data openly available and support its use to launch commercial and non-profit ventures … [and] make data-driven decisions” (DPMC, 2016, p. 25). In 2013 alone, DDI added $67 billion to the Australian economy (Keating, 2014). For data collection, we obtained a list of the names and website addresses of Australian private organisations from various sources including https://data.gov.au/ and the Open Data 500 Global Network. The contact email address was obtained from each organisation’s homepage. An email (invitation letter), along with the link to the online survey, was sent to each organisation with a request to forward it to the people engaged in DDI initiatives. In the invitation letter, we explained the purpose of the study and any future use of the data collected, promising utmost data confidentiality. Moreover, two authors browsed each of the websites and collected email addresses, where possible, of potential respondents. We also requested the participants to pass the questionnaire to other colleagues if they believed that their colleagues qualified to respond to the survey. Thus, we allowed multiple responses from one organisation, which is common in the DDI research field (e.g., Chatterjee et al., 2021). In fact, “Numerous authors in the management and marketing literature have called for using multiple, as opposed to single, respondents per organization” (Balloun et al., 2011, p. 288). Hence, we have applied a combination of convenience and snowball sampling for data collection.

The online survey, administered by a research assistant using the Qualtrics platform, was launched in March 2021. After two weeks, a reminder email was sent to the same email addresses. To protect privacy, we neither attached any identifying code to the web link nor recorded the IP addresses of the respondents. Through two waves of data collection over four weeks, we obtained 264 complete and usable responses. The profiles of the responding firms and the respondents are summarized in Table 5. In the questionnaire, we intentionally kept ‘annual revenue’ non-mandatory (i.e., optional) so that respondents would not feel any information-disclosure threat or social desirability bias. Consequently, more than 41% of the respondents did not answer this question; therefore, we discarded ‘annual revenue’ from the data analysis.

5.3 Common method bias

This study has used remedies to minimise common method bias (CMB). As procedural remedies, the measurement items used in this study have been taken from previously developed and applied studies, where possible. In the questionnaire, some items have been reverse-coded. Next, the instructions of the survey questionnaire have been kept as simple and direct as possible. In addition, respondents have not been allowed to revisit the questions they already attempted. Regarding statistical remedies, first, we compared respondents in the first wave (n = 155) with later respondents (n = 109) on all measures through a t-test. The t-test results do not show a significant difference at a significance level of 5%. Next, following Lindell and Whitney (2001), a marker variable “that is theoretically unrelated to substantive variables and for which its expected correlation with these substantive variables is 0” (Williams et al., 2010, p. 478) has been included in the model. The marker (“I like blue to other colours”), which is theoretically unrelated to the nomological network, shows an insignificant effect on DDID (β = 0.056, t = 1.552, p = 0.121). Moreover, the correlation of the marker variable is 0.108, 0.105, 0.145, and 0.112 with DDID, communication infrastructure, technology-oriented leadership, and IT skill, respectively. Together, these results suggest that CMB was negligible in this study (Williams et al., 2010).
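
As an illustration of the wave comparison described above, the following minimal sketch computes a two-sample t statistic for one measure. It assumes Welch's unequal-variance form, which is an assumption on our part because the variant of the t-test is not specified; the function name `welch_t` and the sample scores are hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(early, late):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n1, n2 = len(early), len(late)
    # Squared standard errors of the two group means
    v1, v2 = variance(early) / n1, variance(late) / n2
    t = (mean(early) - mean(late)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical scores on one measure, split by response wave
early_wave = [2, 3, 4]
late_wave = [3, 4, 5]
t_stat, dof = welch_t(early_wave, late_wave)
```

An absolute t value below the two-sided 5% critical value for the computed degrees of freedom would indicate no significant difference between the waves on that measure.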

6 Data analysis

Our research model has been validated employing two complementary methods, namely partial least squares structural equation modelling (PLS-SEM) and fuzzy-set qualitative comparative analysis (fsQCA); the use of such a combined approach is growing (e.g., Fang et al., 2016; Mikalef & Pateli, 2017). PLS-SEM “allows the testing of theoretical frameworks from a prediction perspective and the analysis of complex models with latent constructs” (Rasoolimanesh et al., 2021, p. 2). In turn, fsQCA “offers an alternative for—or complement to—linear regression analysis that is particularly suitable for the kinds of complex phenomena and causal relationships” (Fainshmidt et al., 2020, p. 456).

6.1 PLS-SEM analysis

For PLS data analyses, we have used SmartPLS 3.3.3. Following standard PLS procedure, the validity of the constructs was established by examining their reliability, convergent validity, and discriminant validity. For internal consistency, composite reliability (CR) and Cronbach’s alpha (α) for each construct were calculated. As shown in Table 6, all the values for CR and α are greater than the threshold of 0.70. In order to evaluate convergent validity of the constructs, outer loadings of the items and the average variance extracted (AVE) have been checked (Hair et al., 2021). Following Igbaria et al. (1995), one item with a loading below 0.6 has been discarded (see Table 4). Similarly, the AVE values of all constructs are well above the required minimum level of 0.50 (see Table 6). Discriminant validity has been first assessed based on the Fornell-Larcker criterion (see Table 6): the square root of each construct’s AVE is greater than its highest correlation with any other construct (Henseler et al., 2009). Next, the HTMT values are lower than the conservative threshold value of 0.85 (Hair Jr et al., 2021). The collective evidence suggests that the constructs and the items demonstrate good measurement properties. Further, the VIF values of the items (ranging between 1.127 and 2.9) are below the threshold value of 5; thus, collinearity has not reached critical levels in any of our constructs.
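
The reliability and validity checks above follow standard formulas, which can be sketched as follows. This is an illustrative sketch using the usual definitions for standardized loadings (AVE as the mean squared loading; CR as the squared sum of loadings over itself plus the summed error variances), not the SmartPLS implementation; the loadings and correlations shown are hypothetical.

```python
import math

def ave(loadings):
    """Average variance extracted: mean of squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    s = sum(loadings)
    error = sum(1 - l * l for l in loadings)
    return s * s / (s * s + error)

def fornell_larcker_ok(ave_value, correlations):
    """True if sqrt(AVE) exceeds the highest absolute inter-construct correlation."""
    return math.sqrt(ave_value) > max(abs(r) for r in correlations)

# Hypothetical standardized loadings for one construct
loadings = [0.8, 0.8, 0.8]
construct_ave = ave(loadings)                # compared against the 0.50 minimum
construct_cr = composite_reliability(loadings)  # compared against the 0.70 threshold
# Hypothetical correlations of this construct with the other constructs
discriminant = fornell_larcker_ok(construct_ave, [0.5, 0.7])
```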

Next, we assess the higher-order scale of absorptive capacity. It is found that the CR, α, and AVE of absorptive capacity are 0.839, 0.745, and 0.523, respectively, which provide evidence of reliable higher-order measures. The results confirm that each of the first-order constructs, i.e., acquisition (β = 0.320; t = 16.084), assimilation (β = 0.215; t = 5.246), transformation (β = 0.302; t = 10.402), and exploitation (β = 0.501; t = 15.442), has a strong association with the second-order construct (i.e., absorptive capacity). Further, they explain 99.6% of the variance in absorptive capacity. Hence, we conclude that the four LOCs statistically explain the HOC, i.e., absorptive capacity.

To assess the structural model, we first examine the R2 value of DDID, which is almost substantial (65.5%) (Henseler et al., 2009). Then, we evaluate the direct effects of the TOE variables. The results in Table 2 show that the technological and organisational variables have a significant positive impact on DDID. Next, the LOCs of absorptive capacity also have a significant indirect effect (through absorptive capacity) on DDID. However, we could not find any influence of the control variables on DDID.

Table 2 The structural model results showing the influence of the driver of DDI development

An additional analysis has been performed to control for the effect of the firm size and respondent’s position on DDID. To test the effect of the respondent’s job position, we have grouped the executives and senior managers under ‘senior manager’, and the operations managers, data analysts, and operations officers under ‘mid-level manager’. The bootstrapping results show that the firm size and respondent’s position have statistically non-significant effects on DDID (β = 0.023; t = 0.626; p = 0.532, and β = 0.025; t = 0.624; p = 0.533, respectively).

6.2 fsQCA analysis

To test the proposed configurational model (Fig. 2), we have applied asymmetric modelling using fsQCA (Ragin, 2000, 2018). This method examines the relationships between the outcome variable (i.e., DDI development) and all possible combinations of binary states, i.e., presence or absence of its conditions (i.e., TOE variables). For our analysis, we follow guidelines and recommendations from recent papers in IS (Mattke et al., 2021; Pappas & Woodside, 2021; Park et al., 2020). The fsQCA analysis has been carried out using the software fsQCA 3.0 (Ragin, 2018); the reason for this choice is that the software supports all the calculations required for the analysis.

Using fsQCA software (from www.fsQCA.com), we have first conducted data calibration. We have rescaled the latent variable scores produced by PLS-SEM into fuzzy values (values between 0 and 1) (Rasoolimanesh et al., 2021). This requires calibrating the standardized latent variable scores between −3 (i.e., full-set non-membership) and 3 (full-set membership), whereby 0 (zero) is the crossover point (intermediate-set membership) (Rasoolimanesh et al., 2021). Similarly, to calibrate size, the three values 3, 1, and 2 correspond to full-set membership, full-set non-membership, and intermediate-set membership, respectively. Calibration is not required for organisation type, where 1 is used for “private” and 0 for “non-private” firms (Olya & Akhshik, 2019).
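
The calibration step can be illustrated with a minimal sketch. It assumes the log-odds ("direct") method of calibration, in which the full-membership, crossover, and non-membership anchors map to log-odds of +3, 0, and −3; this choice of transform is an assumption for illustration, and the function name `calibrate` and its argument order are hypothetical rather than taken from the fsQCA software.

```python
import math

def calibrate(value, full, crossover, non):
    """Map a raw score to a fuzzy membership in (0, 1) via log-odds.

    The anchors map to log-odds of +3 (full), 0 (crossover), and -3 (non),
    i.e., memberships of about 0.953, 0.5, and 0.047.
    """
    deviation = value - crossover
    if deviation >= 0:
        scalar = 3.0 / (full - crossover)
    else:
        scalar = -3.0 / (non - crossover)
    log_odds = deviation * scalar
    return 1.0 / (1.0 + math.exp(-log_odds))

# Standardized latent variable scores with anchors 3, 0, and -3
score_memberships = [calibrate(z, 3, 0, -3) for z in (-3.0, 0.0, 1.2, 3.0)]

# Firm size coded 1 (small), 2 (medium), 3 (large) with anchors 3, 2, and 1
size_memberships = [calibrate(s, 3, 2, 1) for s in (1, 2, 3)]
```

Under this transform, a score at the crossover receives a membership of exactly 0.5, which is why cases sitting exactly on the crossover are often given special attention in fsQCA practice.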

Next, the calibrated data have been incorporated into a fuzzy-set truth table. The truth table lists all possible configurations of the conditions. In refining the truth table, we delete rows with no cases (Ragin, 2008). In addition, for configurations with a PRI consistency of < 0.75, the outcome in the truth table has been set to “0” to ensure that the sufficient configurations exhibit a satisfactory quality (Mattke et al., 2021; Ragin, 2008). Furthermore, from the truth table, we have removed the alternative solutions with fewer than two cases (Pappas & Woodside, 2021; Ragin, 2008). For a configuration to be considered ‘sufficient’, its consistency and coverage values need to be ≥ 0.75 (Pappas et al., 2020) and ≥ 0.2, respectively (Rasoolimanesh et al., 2021). The diagrammatic representation of the sufficient solutions for modelling high DDI development is outlined in Table 3. The results in Table 3 show that data quality, exploitation, and leadership are common in every solution. At this stage, using fsQCA software, we have run the necessary conditions test for high DDI development. For a condition to be considered ‘necessary’, the consistency and coverage values should be equal to or higher than 0.8 (Roy et al., 2018). Based on the results of the necessary condition test and the solutions in Table 3 (Mattke et al., 2021), we have identified that data quality (coverage = 0.861; consistency = 0.863), metadata quality (coverage = 0.868; consistency = 0.871), exploitation capacity (coverage = 0.869; consistency = 0.870), and technology-oriented leadership (coverage = 0.868; consistency = 0.869) are necessary conditions for a high score of DDID. This can be interpreted to mean that high DDID cannot occur without them.
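
The consistency and coverage measures used above can be sketched with the standard fuzzy-set formulas, where a configuration's membership is the minimum of its conditions' memberships and negation is one minus the membership. The sketch is illustrative only: the case data and function names are hypothetical, and it does not reproduce the truth-table minimisation performed by the fsQCA software.

```python
def fuzzy_and(*memberships):
    """Membership in a conjunction of conditions (fuzzy intersection)."""
    return min(memberships)

def fuzzy_not(membership):
    """Membership in the negation of a condition."""
    return 1.0 - membership

def overlap(x, y):
    return sum(min(a, b) for a, b in zip(x, y))

def sufficiency_consistency(x, y):
    """Degree to which configuration x is a subset of outcome y."""
    return overlap(x, y) / sum(x)

def sufficiency_coverage(x, y):
    """Share of outcome y accounted for by configuration x."""
    return overlap(x, y) / sum(y)

def necessity_consistency(x, y):
    """Degree to which outcome y is a subset of condition x."""
    return overlap(x, y) / sum(y)

# Hypothetical calibrated memberships for three cases
data_quality = [0.9, 0.6, 0.3]
leadership = [0.8, 0.7, 0.2]
outcome = [0.9, 0.7, 0.1]

# Membership of each case in the configuration "data quality AND leadership"
config = [fuzzy_and(d, l) for d, l in zip(data_quality, leadership)]
consistency = sufficiency_consistency(config, outcome)
coverage = sufficiency_coverage(config, outcome)
# Thresholds reported in the text for a sufficient configuration
is_sufficient = consistency >= 0.75 and coverage >= 0.2
```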

Table 3 The diagrammatic representation of the configurational solutions

The findings from the fsQCA on the configurations for DDID are presented in Table 3. Every combination (i.e., solution) explains the same outcome, i.e., DDID, to a specific extent, where a condition (i.e., variable) may be present, negated, or absent (no influence) in a particular solution. The overall solution coverage shows that 61.2% of DDID is explained by the three solutions (S1, S2, and S3).

7 Discussion

7.1 Findings

7.1.1 PLS results

The purpose of the PLS-SEM method is to understand whether the technological, organisational, and environmental factors affect DDI development. From the technology perspective, our results suggest that data readiness (i.e., quality of data and metadata) is a significant predictor of DDI development. Specifically, data quality has a positive effect on DDI development, which is in line with recent research showing that data is the foundation-level (Yu et al., 2021) and essential resource for DDI (Gupta, 2022). Others show that the overall quality of data is a concern for open data (Jetzek et al., 2019) and DDI (Witjas-Paalberends et al., 2018). Similarly, DDI development is dependent on the metadata quality of open data, which is reiterated by prior studies (Hossain et al., 2022, 2016a, 2016b; Trifacta, 2015). Combining these, this study confirms that the quality of data and metadata strongly increases the DDI development of a firm. Prior studies advocate that high-quality data and metadata drive high performance of open data; equally, low-quality data and metadata hinder it (Sadiq & Indulska, 2017). Therefore, to maximise the use of open data for DDI development, the providers of open data (e.g., government agencies) need to ensure that they provide high-quality and consistent data (Bresciani et al., 2021). Accordingly, they need to work on enhancing the quality of data and metadata by applying contemporary tools and techniques (McCord et al., 2022; Wahyudi et al., 2018).

Looking at the hierarchical model, all the LOCs, i.e., acquisition, assimilation, transformation, and exploitation, significantly contribute to their HOC, i.e., absorptive capacity. This is consistent with extant literature (e.g., Cooper & Molla, 2017). As far as the organisational variables are concerned, it is found that absorptive capacity is a strong predictor of DDI development. With a deeper look, exploitation and transformation capacities had the strongest effects on DDI development, followed by data acquisition. Extant studies (e.g., Belissent, 2018; Gupta, 2022) explain that the value of data is not situated in the data itself but in what firms can do with it. While data quality establishes that data is fit for use, firms need to ensure that its use is acquiescent (Chaudhuri et al., 2021), exploited, and streamlined. When both data quality and absorptive capacity work together, firms can create immense value from data, i.e., develop DDI applications (Gupta, 2022).

Moreover, both technology-oriented leadership and IT skill significantly contribute to organisational readiness. Earlier literature shows that technology-oriented leadership influences organisational performance and readiness through innovation (Chatterjee et al., 2022). It means that leaders with technological knowledge and expertise can support the development and deployment of big-data-based applications. This finding favours technology-oriented leadership over traditional leadership (Chatterjee et al., 2022). Although previous studies have reported a positive influence of environmental factors on DDI development (e.g., Babu et al., 2021; Jetzek et al., 2014), our results do not support it.

7.1.2 Results of fsQCA analysis

Our results suggest that data quality, technology-oriented leadership, and exploitation capacity are each necessary, but none of them individually is sufficient to predict high DDID scores. Rather, the solutions consist of several conditions (see solutions 1–3). Therefore, our proposition 1 is supported. This is consistent with prior studies (e.g., Hossain et al., 2022; Zhao & Fan, 2021) that suggest that no single TOE variable is sufficient. However, Hossain et al. (2022) find different necessary conditions for high performance of OGD, e.g., IT skill. Supporting the second proposition, the configurations shown in Table 3 combine more than one TOE variable in predicting high DDID. This supports the configurational approach, as opposed to net effects, in explaining a complex phenomenon. In line with our study, prior studies (Hossain et al., 2022; Zhao & Fan, 2021) also derive configurations combining the TOE variables for OGD performance. It implies that firms should focus on the TOE variables simultaneously. According to proposition 3, there can be alternative models, not just one, to explain an outcome. Our fsQCA results offer three causal models predicting high DDID scores; thus, proposition 3 is supported. Like ours, Hossain et al. (2022) and Zhao and Fan (2021) suggest four configurations for high OGD. It implies that the same outcome (e.g., DDID) can be obtained with more than one deterministic configuration of the TOE conditions (Woodside, 2014). In other words, there is more than one alternative model for simulating high DDID; for example, S1 is sufficient but not necessary because there are two alternative models (i.e., S2 and S3) that sufficiently explain the conditions leading to high DDID, as far as the TOE variables are concerned. From these supported propositions, we surmise that DDI development is a ‘causally complex’ phenomenon (Misangyi et al., 2017) where the ideal combinations of TOE variables are required.

Before we explain the individual solutions (i.e., solutions 1–3), we first report an interesting finding. Solutions 1 and 3 imply that, in general, the TOE variables have almost similar roles for medium and large firms. For these firms, environmental factors are ‘do not care’ conditions as long as they possess technological and organisational readiness. This validates our PLS results, which suggest a non-significant influence of the environmental factors on DDI development. This is in line with recent research from Hossain et al. (2022) that suggests the importance of management leadership and organisational skill for OGD initiatives in Australian government agencies. However, contrary to our solutions 1 and 3, they suggest a strong influence of external environmental factors. Therefore, contrary to a “one-size-fits-all” approach, our study shows that the TOE variables have differential effects on firms depending on their size.

The fsQCA results report three distinct configurations that sufficiently predict high DDID, where data quality, metadata quality, acquisition, exploitation, technology-oriented leadership and skills are present in all configurations. Moreover, acquisition capacity is common in every solution, which suggests its importance for DDI. Our PLS results also suggest the stronger effects of acquisition and exploitation, both on absorptive capacity and on DDI development. Although all firms can implement solution 1 to enhance DDI development, small firms can alternatively implement the second solution, and big firms the third one.

Solution 1 does not take firm size into account. In other words, all firms can implement the first combination of technological-organisational variables to enhance DDI development. This configuration further indicates that any firm can achieve DDI provided it has developed transformation capacity (assimilation being a ‘do not care’ condition) and high IT skill, along with the common factors. This solution supports our PLS-SEM results and is in line with prior studies that find the individual effects of these conditions (e.g., Chatterjee et al., 2022; Roberts et al., 2012).

Our second solution suggests a different combination of TOE variables for small firms. In this configuration, communication infrastructure is sufficient for DDI development when it is combined with the common conditions, even with low assimilation capacity. This solution suggests improving the underlying infrastructure for communicating data and high-quality DDI applications. This is in line with recent research showing that small firms do not care about multiple platforms for successful DDI as they cannot afford different platforms (The Cheap Squad, 2020). On the contrary, because of their size, which can be related to fewer resources, they depend on the infrastructures developed by external bodies. Moreover, their absent transformation or low assimilation capacity does not affect their DDI performance as long as they possess high acquisition and exploitation capacities. They rather struggle to develop DDI in the case of low data quality and metadata quality (Ataccama, 2021). Interestingly, this is the only solution that suggests the convergence of TOE variables for DDI development, which implies that the full TOE model works better for smaller firms.

Configuration 3 shows that large firms, even without assimilation and transformation capacities or skilled people, can still be successful in DDI development when they possess the common factors. The non-significant effect of assimilation and transformation for big firms is plausible as data integration etc. are routinized in bigger firms (Melovic et al., 2020). Similarly, bigger firms usually possess a highly skilled workforce and enjoy better access to skilled workers than smaller firms (Dupuy & de Grip, 2002); therefore, our finding is reasonable and is consistent with extant studies (Melovic et al., 2020).

7.2 Implications for research

Although previous studies have enumerated the failures of DDI, we still do not know much about why such initiatives fail (Dominic, 2019; Saulles, 2018). Our research implications are based on opening the black box of this phenomenon. Furthermore, this study complements other studies that have demonstrated the relevance of combining SEM and fsQCA in explaining complex phenomena (e.g., Rasoolimanesh et al., 2021) such as open data (e.g., Hossain et al., 2022). The study contributes to TOE research in general and DDI research in particular in three different ways.

Firstly, the original TOE framework holds that “slack is neither necessary nor sufficient for innovation” (Tornatzky et al., 1990, p. 161); we extend this traditional view. Our PLS analysis suggests that data quality, metadata quality, exploitation capacity, and technology-oriented leadership each have a positive and significant effect on DDI development. The fsQCA results show that these conditions are necessary for DDI in any firm. Therefore, growth of DDI is not possible without these technological and organizational resources. We further strengthen the traditional view of TOE by suggesting that not only organisational resources but, in fact, no single TOE variable is sufficient for DDI. This means that the TOE variables cannot explain DDI development when working individually (Zhao & Fan, 2021). Rather, the organizational variables have to combine with technological drivers for medium-to-large firms, whereas they need to combine with environmental variable(s) for small firms. This responds to a research agenda set by prior studies (e.g., Bresciani et al., 2021) and extends our current knowledge on DDI (Luo, 2022).

Secondly, TOE does not provide us with the ‘sufficient’ conditions for technology adoption. Extending TOE, this study offers three fsQCA solutions/configurations to predict high DDI development. All solutions suggest that DDI is not only a technological move but is also dependent on organizational variables. This extends our current knowledge and contributes to the body of knowledge on the interplay between TOE variables to predict DDI development, specifically in developed countries. Moreover, in a literature review on TOE, Baker (2012) concludes that the relationship between firm size and innovation is inconclusive. Our study adds some nuance to this. For instance, differences in IT skills play a critical role for small-to-medium organizations, and communication infrastructure is a major driver of DDI only for small firms. Our study thus deepens existing knowledge on how the drivers of DDI may differ with firm size. This opens a future research avenue to identify different equifinal solutions combining TOE variables to predict a given outcome. It further encourages researchers to revisit the models of prior studies and check whether a configurational model can better explain their research problem and context.

The third theoretical implication of the present study stems from the innovative approach we employed to explain the drivers of DDI development. In technology development contexts, scenarios are often complex and unique, since a factor can have a different effect from one organization to another, especially when the availability of resources differs. Thus, regression-based methods like PLS-SEM oversimplify the relationships between variables (Pappas et al., 2020). Accordingly, this study uses fsQCA to capture the complex relationships between TOE variables and thus differs from the majority of studies, which are based on variance-based methods (e.g., Chatterjee et al., 2021). As mentioned earlier, fsQCA is a soft OR method. Hence, the explicated configurations are the best possible configurations in terms of necessary and sufficient conditions for successful DDI development. Our fsQCA results suggest asymmetric relationships between TOE variables and DDI development, leading to the creation of new hypotheses suggesting differential effects of these variables and thereby contributing to theory development.
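The distinction between necessary and sufficient conditions, and the asymmetry it produces, can be illustrated with the standard set-theoretic consistency measures used in fsQCA. The following is a minimal sketch only: the membership scores are invented for five hypothetical firms and are not the study's data, the function and variable names are illustrative, and the cut-offs often used in the fsQCA literature (around 0.9 for necessity and 0.8 for sufficiency) are assumed here rather than taken from our analysis.

```python
# Illustrative fuzzy-set membership scores for five hypothetical firms.
# These values are invented for demonstration; they are NOT the study's data.
data_quality = [0.9, 0.8, 0.7, 0.95, 0.6]
comm_infra = [0.8, 0.3, 0.9, 0.2, 0.7]
ddi = [0.8, 0.4, 0.7, 0.5, 0.6]  # outcome: DDI development


def fuzzy_and(*conditions):
    """Membership in a configuration is the minimum across its conditions."""
    return [min(scores) for scores in zip(*conditions)]


def overlap(x, y):
    """Sum of case-wise minima, i.e. the size of the fuzzy intersection."""
    return sum(min(a, b) for a, b in zip(x, y))


def sufficiency_consistency(x, y):
    """Degree to which X is a subset of Y: sum(min(x, y)) / sum(x)."""
    return overlap(x, y) / sum(x)


def necessity_consistency(x, y):
    """Degree to which Y is a subset of X: sum(min(x, y)) / sum(y)."""
    return overlap(x, y) / sum(y)


# A single condition can be (almost) always present when the outcome occurs,
# yet fail to produce the outcome on its own.
dq_necessity = necessity_consistency(data_quality, ddi)
dq_sufficiency = sufficiency_consistency(data_quality, ddi)

# Combined with a second condition, the configuration becomes sufficient.
config = fuzzy_and(data_quality, comm_infra)
config_sufficiency = sufficiency_consistency(config, ddi)
# Coverage of a sufficient configuration uses the same ratio as necessity.
config_coverage = necessity_consistency(config, ddi)

print(f"data quality: necessity={dq_necessity:.2f}, sufficiency={dq_sufficiency:.2f}")
print(f"configuration: sufficiency={config_sufficiency:.2f}, coverage={config_coverage:.2f}")
```

In this toy example, data quality scores highly on necessity but falls short on sufficiency when considered alone, whereas its combination with communication infrastructure is highly consistent as a sufficient configuration. This mirrors the logic, though not the numbers, behind the claim that no single TOE variable is sufficient for DDI.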

7.3 Implications for practice

It is understood that DDI is transforming the economy and society of many countries, and is emerging as an essential tool to improve growth and prosperity (Andersen & Pedersen, 2021). Significant and growing interest is also observed among users of DDI applications that incorporate open data in usable formats, for example for smart mobility (e.g., Yadav et al., 2017). Such prospects drive growing interest among many organisations and researchers in how DDI initiatives can be encouraged. Our study contributes to ongoing discussions on DDI by identifying its drivers from a TOE perspective. Such research can give firm managers and policymakers insights that help them understand the topic better.

Our first practical implication guides firm managers and government agencies, who need to understand the value of high-quality data and metadata for DDI. Primarily, this study demonstrates that both data quality and metadata quality are necessary conditions for developing DDI applications. That is, DDI cannot flourish without them. High-quality data can effectively save lives if properly used and presented through data-driven applications (Peled, 2011). Therefore, our research suggests that, as data readiness (in terms of data quality and metadata quality) is essential for successful DDI development, government agencies (as a big source of open data) should ensure quality data and metadata. The same implication is valid for other open data providers as well, e.g., World Bank, World Health Organization, and Google (Public Data Explorer).

The second practical implication targets firm executives. Our results suggest that organizational dimensions are important for DDI. Considering absorptive capacity, exploitation of data is necessary while assimilation is appreciated, for any organization irrespective of size. This is historically evident (e.g., Ndiege et al., 2012). Therefore, firms need to exploit employees’ knowledge of data and establish strategies to advance its use. Such strategies may include practising a data-driven culture (Chatterjee et al., 2021) and ‘democratising’ data within organizations so that anyone can make sense of the data and use/re-use it (Glass et al., 2015).

The third implication can be useful for executives and governments. In terms of organizational readiness, technology-oriented leadership is necessary for DDI. This is reiterated by prior studies on DDI (e.g., Jetzek et al., 2019) that advocate that organizations must promote DDI through leadership. Thus, for successful DDI development, organizations should emphasize technology-oriented leadership over conventional leadership. Managers should promote DDI to a greater extent and train their workforce in data and IT matters. By contrast, a skilled IT workforce, the other aspect of organizational readiness, is important for small and medium organizations. “The availability of skilled workers affects the ability of organizations to create shared digital content and digital products and services” (Jetzek et al., 2019, p. 708). With the growing interest in data analytics, Big Data, and DDI, the gap between the supply of and demand for skilled professionals (e.g., data scientists, data analysts) is evident and is predicted to continue (Jetzek et al., 2019). Our finding is echoed by extant studies, e.g., “many more people must be equipped with the skills to develop, deploy and operate DDI services” (Analysis-Mason, 2016, p. 36). Therefore, governments (in liaison with universities) must develop policies that produce more graduates in this field. In addition, governments need to ensure that small and medium firms have affordable access to a highly skilled workforce.

Finally, we highlight that communication infrastructure plays an important role in DDI development for small firms in particular. Our finding is in line with Jetzek et al. (2019), who demonstrate that while OGD is a prime source of data for firms to develop DDI applications, the actual value generation depends on the digital infrastructure of a given country. It is plausible that small firms face resource constraints and therefore rely on external resources, including IT infrastructure. Our implication extends prior research such as Mosig et al. (2021), who discuss how an appropriate digital infrastructure for small businesses and start-up firms can be developed through internal and external resources to realize the true value of DDI. Therefore, to inspire successful DDI development, external bodies including governments and third-party technology providers play significant roles in ensuring an efficient communication infrastructure, especially for small firms.

7.4 Limitations and future research

Despite its significant implications, our study has several limitations that suggest directions for future research. The first limitation is related to the generalisability of our model. While our studied antecedents to predict DDI development are informed by a systematic literature review, some other factors could have been overlooked. For instance, the privacy and security challenges of DDI applications are a continuing concern (Saura et al., 2021), which could be considered as technological issues. Also, organisational variables are dynamic in nature; for example, different sets of analytical capability may decide the success of DDI by firms (Sultana et al., 2021; Wamba et al., 2017). This is even more critical for an fsQCA model, which runs various configurations of a given set of causal factors. Consequently, the solutions are sensitive to the range of factors included in the model: adding or removing factors may lead to significantly different solutions (Ordanini et al., 2014). Therefore, the configurational solution might differ if other sets of TOE variables are included. The other issue regarding the generalizability of the results is that data were collected from one country, i.e., Australia. The drivers, and thus their configurations to predict DDI development, may differ, especially in developing countries where the open data movement is yet to be realised at greater scale (Kassen, 2019). Nonetheless, future studies could apply other relevant TOE variables and check whether our variables participate in their configurations as well. In addition, annual revenue could be an important factor for firms to invest in DDI.

Second, our dependent variable was DDI development. The actual underlying variable of interest in this context is the successful development of DDI applications, which could be an objective measure. In our study, we had no practical way of accessing this, and we anticipated great difficulty in gathering such data in a survey; we thus had to use subjective measures. Therefore, we suggest research that is more engaged in the context (e.g., ethnographic studies) and that can enlighten us on the phenomenon. Further, apart from the supply side, investigating this phenomenon from the user side of the DDI ecosystem may deepen our knowledge of DDI mechanisms.

Finally, we collected cross-sectional data at a given time; longitudinal studies could assess the varied nature and effects of the variables, if any. In this regard, future research could compare the effects of TOE variables at different stages of the DDI development lifecycle, e.g., emerging, growth, maturity, and saturation (Riserbato, 2021). It is likely that in the early stage, internal as well as external stakeholders do not realise the potential of DDI and thus firms may experience a shortage of resources. As DDI matures over time, the TOE variables may change or new variables may emerge (Hossain et al., 2016a, 2016b).