To derive meaningful insights into trends, frequencies and distributions, a classical statistical data analysis was used. Based on a percentage-frequency analysis, many insightful findings along the main dimensions of the DDI framework could be identified. In the following subsections, we summarise all findings derived from the percentage-frequency analysis. We present these findings by first discussing some generic findings before discussing the findings in relation to the dimensions of the DDI framework.
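As a rough illustration of what a percentage-frequency analysis computes, the following minimal sketch counts how often each category occurs in a coded sample and expresses the counts as shares of the total. The labels, the sample and the function name `percentage_frequencies` are hypothetical and chosen for illustration only; they are not the study's data or tooling.

```python
from collections import Counter

# Hypothetical coding of a tiny sample: one target-market label per start-up.
# These labels and counts are illustrative only, not the study's data.
sample = ["B2B", "B2B", "B2B", "B2C", "multi-sided", "B2B", "multi-sided", "B2B"]


def percentage_frequencies(labels):
    """Return each category's share of the sample as a percentage."""
    counts = Counter(labels)
    total = len(labels)
    return {category: 100 * n / total for category, n in counts.items()}


shares = percentage_frequencies(sample)

# Print categories from most to least frequent.
for category, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{category}: {share:.1f}%")
```

The same counting can be repeated per dimension (value proposition, data, technology and so on) to obtain the kind of percentages reported in the subsections that follow.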
4.1 General Findings
It was important for us to find out whether the distinction between B2B and B2C has an influence on the design of data-driven innovation. In addition, we wanted to better understand the possible impact of the (non-)sector focus of data-driven innovations.
Target Customer: The majority of data-driven start-ups (78%) are addressing B2B markets. Only 2 out of 90 start-ups in our sample focused solely on end-customer markets. Start-ups addressing end-user needs prefer already established channels to deliver their offering to the users. They tend to rely on partnerships with established business partners to bring their offering to users. A second, quite frequent, strategy used by 19% of start-ups is positioning data-driven solutions as a multi-sided market offering combining complementary offerings to align private and business needs.
Seventy-five per cent of our start-up sample have developed a clear sector focus. Companies with a clear sector focus have one or more concrete customer segments in mind for which a concrete value proposition is delivered.
For example, CloudMedx Inc. (Footnote 8) designs artificial intelligence-driven software for medical analytics. Clinical partners at all levels can derive meaningful and real-time insights from their data and intervene at critical junctures of patient care. Its underlying clinical AI computing platform uses healthcare-specific NLP and machine learning to generate real-time clinical insights at all points of care to improve patient outcomes. By relying on evidence-based algorithms and deep learning, a wide variety of structured and unstructured data being stored in clinical workflows can be understood and used for decision making.
In comparison, we also found start-ups that focus on technology with cross-domain impact. In general, their solution will be used by other intra- or entrepreneurs to build data-driven solutions for end users.
For instance, the start-up DGraph Labs (Footnote 9) is offering an open-source distributed graph database. The company is planning to release an enterprise version that is closed source, as well as a hosted version (as it is easier to run hosted services for customers than trying to help them debug every issue on their own). Customers are using the service to build their own sector-specific applications.
In summary, sector-specific data-driven offerings are much more frequent than technology-driven sector-agnostic solutions. This is due to the very different pre-processing challenges of data sources in the various sectors, as well as the higher possibilities of identifying target groups in concrete sector settings. Most sector-agnostic offerings are intermediate functionalities addressing developers to build customised solutions.
4.2 Value Proposition
To analyse the value proposition in the context of data-driven businesses, our main focus is on the different ways data is used to generate value. Data value refers to the insights that can be generated out of data and how this can be used in a particular user or business context. In accordance with its value and complexity, we distinguish four different types of analytics that are used for generating different types of insights, i.e. descriptive analytics explain what happened, diagnostic analytics highlight why something happens, predictive analytics forecast what will happen in the future, and prescriptive analytics identify optimal actions and strategies (Zillner 2019).
Two out of every three start-ups rely on data analytics in general for generating insights. Among the start-ups using data analytics, 83% rely on descriptive analytics in their offering (i.e. every second start-up).
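The parenthetical "every second start-up" follows from multiplying the two shares stated above. The short check below makes that arithmetic explicit; the variable names are ours, and the multiplication is our reading of how the two figures relate.

```python
# Shares as stated in the text; combining them by multiplication is our reading.
share_using_analytics = 2 / 3        # "two out of every three start-ups"
descriptive_among_analytics = 0.83   # "83% rely on descriptive analytics"

# Share of ALL start-ups in the sample that use descriptive analytics.
share_descriptive_overall = share_using_analytics * descriptive_among_analytics

print(f"{share_descriptive_overall:.0%} of all start-ups")  # prints "55% of all start-ups"
```

Roughly 55% of the whole sample is consistent with the statement that about every second start-up uses descriptive analytics.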
For instance, the start-up Apptopia (Footnote 10) is using descriptive analytics to provide app analytics, data mining and business intelligence services. They collect, measure, analyse and provide user engagement statistics for mobile apps and visualise the aggregated data in classical dashboards. The unique selling point of their offering is the high number of data points they are able to integrate and visualise, i.e. they state that they rely on “more different data points than nearly any other app data provider in the world”. The insights, which can be generated by descriptive data in this large data set, are of interest to the worldwide mobile app developer community as they allow them to compare their own app performance with competing or related apps. Whenever app developers are engaging with the Apptopia platform to benchmark their own apps, additional valuable data sets can be generated. By offering free-of-charge descriptive analytics-based dashboards, Apptopia are able to attract a large number of developers to use their platform, which again allows them to produce high-value data sets that can be sold to business customers.
Four out of ten start-ups in our sample set relied on predictive analytics to generate value for their users.
For instance, the start-up Visiblee (Footnote 11) collects IP addresses and cookies of all website visitors and uses these to predict the identity of unknown visitors in real time. By relying on these real-time predictions, the company is able to increase the leads (Footnote 12) threefold.
Compared to descriptive and predictive analytics, we can observe that diagnostic and prescriptive analytics are used less frequently. Only every fifth data-driven start-up is offering a solution for automating manual tasks or activities, and match-making is observed in only 16% of cases.
To implement data-driven offerings, in general, several algorithms and approaches are combined. This is also true for the four different types of data analytics discussed earlier. In our sample, 4 out of 10 start-ups use two or more different types of data analytics, and 19% of start-ups even rely on three or more types of analytics to generate value.
For instance, Eliq (Footnote 13) provides a comprehensive platform for the intelligent energy monitoring of utilities. The AI-powered app offers a wide range of insights:
- By relying on descriptive analytics, Eliq shows periodic energy consumption patterns that can be drilled down into different time frames (yearly, monthly, hourly, etc.).
- By relying on diagnostic analytics, Eliq helps users to identify potential “energy leaks” or potential sources of energy theft.
- By integrating external data sources, such as extreme weather change forecasts, Eliq can inform users that their energy consumption is likely to change significantly (predictive analytics). Utilities benefit from such information as they can customise marketing communication accordingly.
- By relying on prescriptive analytics, the Eliq platform can not only inform users about increased energy consumption but also recommend strategies to overcome such high consumption scenarios, e.g. by upgrading or replacing devices with higher efficiencies. This allows utilities to establish a personalised and targeted user engagement.
Eliq is an example of a start-up that establishes a unique value proposition and competitive edge by offering a wide range of analytical services. We want to highlight that this is not a frequent pattern. The majority of start-ups (62%) focus on only one analytical offering.
4.3 Data
Data is the key resource for realising data-driven innovation. In general, we observe that the data sources used greatly influence the effort required for data pre-processing as well as the scope of the data-driven offering. If a data-driven innovation is based on image data, we can conclude that an image segmentation algorithm needs to be in place. Depending on how domain specific the underlying image data set is, a new image pre-processing algorithm needs to be developed. Similarly, in the case of personal data and of industrial or operational data, GDPR-compliant services and data privacy methods need to be in place, respectively.
For that reason, we recommend exploring the data assets early when scoping one’s data-driven innovation. Data exploration will help to understand:
- Whether the envisioned value proposition can be realised. Very often, we face the situation that the data quality is not good enough to generate the needed insights.
- How much effort is needed to create data of high quality. Often the raw data does not yet have the quality needed. The good news is that there exist many approaches to increase the quality of data for this scoped purpose. However, the expected return always needs to be aligned with the efforts needed. Other projects in the Big Data Value Public-Private Partnership (BDV PPP) have reported similar experiences (Metzger et al. 2020).
In the following, we will give an overview of which data types and sources are used and how frequently in data-driven innovations.
A wide range of different types of data sources exist that are relevant for developing data-driven innovation. Although only 19% of start-ups in our sample were addressing business-to-consumer (B2C) markets, personal data was still the most frequently used type of data (67%) in the analysed data-driven offerings. In consequence, this also implies that a high percentage of start-ups addressing business customers in Europe (Footnote 14) need to handle the constraints of the General Data Protection Regulation (GDPR).
For example, Oncora Medical (Footnote 15) is using personal data to fight cancer. The US-based company collects data on cancer patients including information related to treatments and clinical outcomes through an intuitive software used by doctors. Their objective is to deliver predictions that can help design better radiation treatments for patients, as well as enabling precision medicine in radiation oncology. The data collected is personal data and is thus sensitive and has higher standards of protection.
Industrial data, i.e. any data assets that are produced or used in industrial areas, is a second type of data which has high data protection requirements. In comparison to personal data, industrial data is used only half as often. Organisations seem to be reluctant (in particular if they do not see the immediate value) to share their industrial and operational data with third parties, such as start-ups, because they are afraid to reveal relevant business secrets.
One successful example, PlutoShift (Footnote 16), offers a platform that is helping industrial customers to improve their operational efficiency by identifying inefficient patterns of energy usage by analysing customer data stored in the cloud and operational sensor data. With energy being a high-cost driver, PlutoShift can help industrial customers to reduce resource consumption and operating costs.
The second most popular types of data source are time-series and temporal data. Fifty-six per cent of start-ups in our sample rely on these types of data to generate value. The high frequency might be due to the popularity of using behavioural data that is tracked within each user interaction on the web and mobile devices and is thus very likely to cover time-series data. Another very frequently used data source is geo-spatial data (46%), and the usage of Internet of Things (IoT) data is seen in 30% of our sample.
4.4 Technology
The BDV Strategic Research and Innovation Agenda (SRIA) (Zillner et al. 2017) describes five technical priorities identified by the BDVA ecosystem and experts as strategic technical objectives. In our study, we were interested in which of these technical areas were most frequently covered when realising data-driven innovation.
Among the five technology areas listed in the BDV SRIA, data analytics is used most frequently. Eighty-two per cent of the start-ups in our sample relied on some type of data analytics to implement their data-driven value proposition. The usage of technologies in the data management area is seen in 41% of cases and is very much in line with offerings addressing the challenges of processing unstructured data sources. Solutions for data protection are the least frequently addressed research challenge, at 13%. Looking at the extent to which BDV SRIA technologies are used in combination, we observed that more than half of the start-ups (59%) combine two or more technologies.
Uplevel Security (Footnote 17) is one example that combines data management with data protection. They redefine security automation by using graph theory for real-time alert correlation. Their product creates a dynamic security graph (data management) for an organisation based on incoming alerts, prior incident investigations and current threat intelligence (data protection). Uplevel Security then transforms the ingested data into subgraphs that continuously inform the main security graph. By automatically surfacing relationships, investigations no longer occur in isolation but begin with context.
Less frequently observed, 22% of the companies combine more than three technologies.
One example of this is the medical company CloudMedx (Footnote 18), which started with the aim to make healthcare affordable, accessible and standardised for all patients and doctors. The company uses NLP and proprietary clinical contextual ontologies (data management) and deep learning (data analytics) to extract key clinical concepts from electronic health records, which serve as insights for physicians and care teams with the goal to improve clinical operations, documentation and patient care. In addition, CloudMedx is presenting the results to dedicated teams through a user-friendly platform that allows for interactive predictive and prescriptive analytics to assess current metrics and build a path forward with informed decisions.
4.5 Network Strategies
For digital and data-driven innovations, network effects are important phenomena to reflect. In our study, 57% of start-ups rely on network effects. A network effect occurs when a product or a service becomes more valuable to its users as more people use it (Shapiro and Varian 1999). Network effects are also known as demand-side economies of scale and predominately exist in areas where networks are of importance, such as online social networks or online dating sites. A social network or dating site is more appealing to its user when it is able to continuously attract and add more and more users. In consequence, harnessing network effects requires developing a broader network of users in order for the network or site to differentiate itself from its competitors. For that reason, the critical mass of users and timing are key success factors in a network economy.
Due to the high impact of network effects, competitors starting from “ground zero” with no users in their network will face difficulties in entering the market successfully. In this context we use the expression “network effect” to highlight positive feedback (positive network externality; Footnote 19), i.e. the phenomenon that already existing strengths or weaknesses are reinforced, which might lead to extreme outcomes. In the most extreme case, positive feedback can lead to a winner-takes-all market (e.g. Google).
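To make the role of critical mass more tangible, the sketch below counts the number of distinct user pairs that could interact in a network of a given size. Treating this count as a proxy for the value of a network is a simplifying assumption of ours, not a model used in the study; the function name `potential_connections` is likewise hypothetical.

```python
def potential_connections(users):
    """Number of distinct user pairs in a network with the given number of users."""
    return users * (users - 1) // 2


# Assumption for illustration: more possible pairs means more potential value.
for users in (10, 100, 1_000):
    print(users, potential_connections(users))
```

Under this assumption, a tenfold increase in users yields roughly a hundredfold increase in possible interactions, which illustrates why an entrant with no users struggles against an incumbent with an established network.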
Network effects impact the underlying economics and operation of data-driven innovation. Instead of creating products that are early on the market and different from other offerings, the focus here is on scaling and scoping the demand perspective. Understanding network effects and their underlying market dynamics is crucial to successfully positioning data-driven products, services and businesses in the market. In doing so, data-driven innovation can harness network effects on three different levels.
First, data-driven businesses are relying on network effects at data level, if they are able to improve their offerings by the sheer amount of data they hold available. In our sample this was the case in 49% of start-ups.
For instance, the already mentioned company Apptopia (Footnote 20) uses big data technology to collect, measure, analyse and provide user engagement statistics for mobile apps. The more app providers produce data being connected to the platform, the more valuable the service becomes. In order to gain more real-time data, they attract app developers to connect to their platform by providing free data analytics products. With this free-of-charge value proposition, developers benefit from registering their mobile apps on the platform while giving the platform the permission to analyse user engagement data of the mobile app. Professional and expensive subscription fee models for business customers, including Google, Pinterest, Facebook, NBCUniversal, Deloitte and others, benefiting from real-time engagement insights of mobile apps, complement the revenue strategy of this offering.
In this context, multi-sided business models are the usual way forward. Typically, a multi-sided business model brings together two or more distinct but interdependent groups of customers. Value is only created if all groups are attracted and addressed simultaneously. The intermediary, in our example the company Apptopia, generates value by facilitating interactions between the different customer groups, whereas the value increases when more users are attracted. The more app developers register on the platform, the more accurate the statistics become. With an increasing number of business customers, Apptopia then creates the required resources to invest in advanced functionalities for app developers.
Second, when businesses are providing a technical foundation for others to build upon, we can observe network effects at infrastructure level. In our sample this was the case for 12% of start-ups. Based on a layer of common components, third-party players are invited to develop and produce an increasing number of data-driven offerings.
This set-up is also known as product platforms (Hagel et al. 2015). A prominent example is the Android platform – it provides the technical foundation for others to build apps. This includes any type of tool and service that enables the plug-and-play building of data-driven offerings, e.g. (open) standards, de facto standards, APIs and standardised data models. The more functionalities are available that help others to build and position innovative offerings better, faster, etc., the more attractive the offering itself becomes. The infrastructure layer has little value per se unless other users and partners create value on top of it.
An example of this dynamic is the agricultural-robotics technology company Skyx (Footnote 21). This company is offering neither hardware nor agriculture end-customer applications, but software that enables a modular swarm of autonomous drones for spraying. By providing a technology to plan and control the mission of drones in real time as well as to auto-pilot the entire fleet/swarm, it addresses the need of agri-spraying application developers to build their solutions at a higher quality and at less cost by relying on a standardised approach. In addition, as the software is compatible with any commercially available hardware, the cost of connecting the wide range of drones can be significantly reduced. Thus, Skyx provides tools and connectors for agri-spraying application developers to build their own solutions. The more drone hardware can be connected, and the more spraying functionalities can be provided, the more attractive the overall offering becomes for applicators.
Third, in cases where the number of marketplace participants is the key source of value, data-driven offerings can harness network effects at marketplace level. Offerings that are able to connect participants in their specific roles, such as buyer and seller, and consumer and producer, allow two participants to easily interact with each other.
The low number of network effects at marketplace level in our study (10%) indicates the difficulties and challenges in building them. The challenges are less at the technical level and more at the level of building critical size and balanced user communities. Several strategies to attract users from the different communities have been implemented by start-ups.
4.6 Revenue Strategy
We were interested in the question of how data-driven businesses make money. Is this different from traditional businesses? And can we identify some dominant revenue models?
Our first finding is that it was often difficult to find information about the type of revenue models used. Especially in cases when start-ups have been focusing on emerging technical advances, such as drones or autonomous driving, information about revenue models was – understandably – not available.
As emerging technology businesses are often seen as a risky investment or bet on the future in a market not yet established, the absence of revenue-related information is not surprising. This was the case for 10% of the companies analysed: we could not find or extract any information about the revenue model.
Our study confirmed the findings of Attenberger (2016) that revenue models have not changed through the usage of data technologies per se. The major difference to traditional businesses is that data-driven innovations rely on different types and combinations of revenue streams that are continuously changing over time in order to address the specific user needs of each customer segment. On the one hand, we observe new forms of value propositions, ranging from service offerings, to the bundling and unbundling of offerings, to intermediate offerings, to product differentiations through versioning, that allow the specific user needs to be addressed.
On the other hand, the majority of data-driven innovations have, in comparison to traditional businesses, a different cost structure. With data and data offerings being cheap to reproduce and deliver, the typical cost structure of data-driven innovations consists of fixed costs for the development of the offerings but low variable costs. This kind of cost structure leads to substantial economies of scale: the more offerings are sold, the more the average development cost per offering decreases. In addition, as the reproduction and distribution costs are often marginal, price dumping and a surplus of offerings in the competitive market are frequent phenomena. For instance, Aitken and Gauntlett (2013) counted more than 40,000 health apps in the app store being offered for free or for a very low price.
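The economies of scale described here can be sketched with a simple average-cost calculation: a one-off fixed cost is spread over all units sold, while each unit adds only a small variable cost. All figures and names below (`FIXED`, `VARIABLE`, `average_cost`) are hypothetical and serve only to illustrate the shape of the cost curve.

```python
def average_cost(fixed_cost, variable_cost_per_unit, units_sold):
    """Average cost per unit: fixed cost spread over units, plus variable cost."""
    if units_sold <= 0:
        raise ValueError("units_sold must be positive")
    return fixed_cost / units_sold + variable_cost_per_unit


# Hypothetical figures for illustration only.
FIXED = 500_000.0  # one-off development cost
VARIABLE = 0.10    # cost of reproducing and delivering one more copy

for units in (1_000, 10_000, 100_000, 1_000_000):
    print(units, round(average_cost(FIXED, VARIABLE, units), 2))
```

With these assumed numbers, the average cost falls from about 500 per unit at 1,000 units to well under 1 per unit at a million units, approaching the variable cost as volume grows.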
With this new cost structure for most data-driven innovations, organisations have a new flexibility to adjust the equation between value proposition and price in accordance with the user needs of various customer segments. In this context, companies elaborate the specific price level the targeted user group is willing to pay. The main objective for aligning the product version with the pricing version for each customer segment is to attract more users and interactions, as well as to grow the community.
The most frequently used revenue model in our study was the subscription model. We observed in this context a strong correlation with the spread and high adoption of the software as a service (SaaS) approach, which brings a lot of flexibility when used for deploying data-driven innovations. The second most frequently used revenue model is the selling of services in which a person’s time is paid for. These revenue models are very often used for open software offerings as well as when offerings are not standardised or off-the-shelf. Advertisement as a revenue model is rarely observed: in our sample, only 2% of start-ups apply it. Although this might seem surprising, it merely reflects the high percentage of B2B models.
4.7 Type of Business
Data-driven innovations can disrupt existing value chains. However, at the same time, we observe a large number of “low hanging fruits”, i.e. business opportunities in the scope of established processes (internal) or value chains (cross-organisational).
To classify data-driven business opportunities we will introduce four strategies with a significant impact on markets and associated value chains:
- (a) Providing new value to customers with an established market position
- (b) Developing a new marketplace/ecosystem
- (c) Leveraging an existing ecosystem by scoping a niche offering
- (d) Building technology assets that ensure a future competitive advantage
The following remarks describe the four strategies in detail and illustrate them with an example from our sample of start-ups.
In general, this classification is based on approaches available for the classification of traditional business opportunities. One important work in this context is Ardichvili et al. (2003), who classified business opportunities into two dimensions: value creation capability and value sought. Although both dimensions have at first glance a good mapping to the DDI supply and demand side, they did not reflect the changing nature of underlying business ecosystems. As already discussed at the beginning of this chapter, data-driven innovations are rarely developed alone but rely on the collaboration between many partners in the value chain.
When positioning data-driven offerings in the market, it is also necessary to reflect the associated business strategy and innovation ecosystem.
Data-driven services are often associated with the strategy of “Finding a new business partner”. This strategy tries to focus on one single customer (segment) and his or her business processes. Based on a detailed understanding of his/her business processes (including the pain points, happiness points and unaddressed user needs), new values/services for specific user needs are built. As the service is heavily focusing on this one specific partner, the overall market and business ecosystem is only observed in an indirect manner. In our study, the data-driven service business was the most frequently observed approach (with 78%) to position offerings in the market.
For instance, the company Arable provides an agricultural solution based on in-field measurements as a software-as-a-service (SaaS)-based service offering. To enable growth, advisors and businesses are invited to play a proactive role in ensuring high quality and longevity of their agricultural operations. As a consequence, the company can derive real-time, actionable monitoring and predictions related to weather risk and crop health by means of a tiered SaaS offering with different levels of services combined with IoT businesses. The tier I service includes reporting, integrating and visualisation, whereas the tier II services include predictions and advanced analytics.
Compared to data-driven services, the second type of business strategy – developing a data-driven marketplace – is significantly more complex as a new marketplace/ecosystem needs to be built up. Only 16% of companies in our sample relied on this approach. Market participants on the supply as well as on the demand side need to be attracted. In addition, it is necessary to ensure that a critical number of participants are providing their assets and at the same time a critical number of participants are requesting them.
The growth of the marketplace needs to be balanced on both sides – the supply and demand sides – in order to retain its attractiveness. It seems that organisations have been developing very different strategies to attract the different participant groups, e.g. by providing necessary IT services and analytics services, and offering services for free.
One example of this strategy is Zizoo (Footnote 22), a Vienna-based company that established a global boat rental platform. Zizoo is building a global digital booking platform and website connecting suppliers (charter companies) to travellers worldwide, similar to “Booking.com for Boats”. When the building of this marketplace started, the founders of the company were entering a market (the boat rental market) which was 10 years behind any other travel sector. As the majority of boat charter companies had not yet been digitalised, they needed to put a lot of effort into attracting the supply side to join their emerging marketplace. For instance, they offered charter companies a powerful inventory management tool and business intelligence for free. As they are making boat holidays affordable and accessible to everyone (bookings start at €20 a day), they were also able to attract the demand side.
Another strategy is to identify an existing healthy ecosystem that is already in place which gives the opportunity to position one’s own offering as a niche application. The so-called niche player leverages an existing ecosystem by scoping a niche offering in accordance with the defined constraints of the dominant or key player of the ecosystem. Typical examples of such strategies are the thousands of apps offered in the iOS or Android ecosystems for mobiles. In our sample we could observe this in 12% of cases.
One good example of this strategy is AIMS Innovation (Footnote 23). This start-up develops AI and machine learning technologies to give the world’s largest companies deep insights into and control of their most business-critical processes – such as safely distributing electricity, shipping thousands of daily orders to ecommerce customers or delivering the results of medical tests to doctors quickly and reliably. They are positioning their offering in the Microsoft ecosystem. According to their website, they offer the only artificial intelligence solution in IT operations covering all core Microsoft enterprise technologies.
The last type of business category is the emerging technology business that anticipates a future ecosystem or market. In our study this was seen in 9% of the sample. As the market is not yet settled and the technology is often in a very early stage, it is scoped as investment in the future. Thus, revenue strategies cannot be implemented. The main focus of emerging technology businesses is building capabilities/assets ensuring a future competitive advantage.
For instance, the company Carfit (Footnote 24) is working on creating the most comprehensive library of car vibrations. They systematically collect and generate data related to noise, vibration or harshness. An enhanced data analytics algorithm is in place to incorporate automotive domain expertise. The company is aiming at a car vibration tracking device that can help to lower car maintenance costs and increase the efficiency and transparency of the car’s operations. But the self-diagnostic and predictive maintenance platform only brings real value to end users when vehicles are moving autonomously. Thus, the company is addressing a future market (as today drivers are in general good at detecting abnormal noises in their car). However, when cars are moving autonomously the need for remote monitoring will become critical.