1 Data Circulation: The Catalyst of Economic Value Creation

Data needs to circulate to release its full value potential. What does this mean in practice?

Data has become a core asset in the economy, fostering new industries, processes, and products and creating significant competitive advantages.

Studies forecast the worldwide data valorization to reach $500B by 2022 and $708B by 2025 [1]. In the European Union, the data economy will soon contribute between 1.9% and 4% of GDP [2]. A Boston Consulting Group study shows that by generating just 1% of incremental revenue through data monetization, organizations can see their earnings increase by 10% and company valuation rise by more than 25% [3].

The need for data is skyrocketing, as is the volume being produced. Private and public organizations can take advantage of this mega-trend and develop data-driven strategies that:

  • Impact their corporate performance and competitiveness, by optimizing business processes, creating new data-driven products and services, developing and training AI models, and generating new revenue streams through innovative data productization and distribution strategies,

  • Impact society as a whole, by addressing more global objectives such as data altruism or leveraging data for decarbonization and other environmental use cases.

Innovation is greatly accelerated when data can be circulated, exchanged, and shared between organizations.

However, the data most useful for achieving these goals is mostly industrial data, which has until now remained poorly accessible or not accessible at all, owing to the lack of trustworthy frameworks allowing stakeholders external to the organization that produced or collected the data to access it.

The development of data ecosystems and data spaces will unlock the full potential of data.

Key requirements to be successful include:

  • Achieving a high level of trust among all stakeholders involved in the exchange of data

  • Building data spaces on the basis of solid and trusted data exchange environments

  • Regulating data ecosystems through data exchange regulatory frameworks

  • Providing traceability at all levels of data transactions

  • Creating optimal conditions for data governance within complex, distributed, and heterogeneous environments

2 Trust as the Cornerstone of Data Exchanges

The main challenge for data spaces is to enable the fair and trusted circulation of data. Trust is the cornerstone of data exchanges and must be achieved both at the operational level of the interactions between data providers and data consumers and at the more strategic and ecosystemic levels.

Trust is an essential condition for the development of data spaces. It can be expressed on several levels:

  • Trust in the source of the data: data acquirers engaging in data transactions want to be sure that data providers are acting in good faith. Trust will depend on many factors, including the reputation of the organization providing the data, its ability to deliver over time, and the visibility offered on the origin of the data being exchanged.

  • Trust in the acquirer and/or user of the data: data providers are expecting data acquirers to act according to the agreed terms and conditions for the use of the data being exchanged. Trust typically increases if the data provider already knows the data acquirer or if it can access sufficient details about the acquiring organization’s profile and identity before engaging in data transactions.

  • Trust in the use made of the data: clear and detailed legally binding contractual agreements, which can be based on open data licenses or commercial licenses, together with the guarantee that data transactions comply with the regulations in force, contribute significantly to increasing trust in the use of the data.

  • Trust in the data exchange service providers: data providers and data acquirers increasingly engage and conduct data transactions via structured data exchanges, data marketplaces, or data hubs that act as trusted third parties facilitating data exchanges and offering all the guarantees of security and compliance with regulations. Those entities, referred to as Data Intermediation Services providers in the EU Data Governance Act, play an essential role in the building of trust.

2.1 A Solid Data Exchange Environment at the Heart of the Data Value Chain

A data exchange platform is the technology layer that provides the tools and automation needed by data acquirers, data providers, and data exchange service providers to deploy their data exchange strategy.

Private corporations and public entities use data exchange platforms to:

  • Become more data-driven and use data to build better products and services, and improve their business processes

  • Unlock the full value of their data by distributing it, for a fee or for free, in a controlled way to other organizations

  • Position their organization strategically in emerging data ecosystems by providing data exchange services to other organizations

Data exchange platforms are used by data providers and data acquirers to conduct data transactions smoothly and at scale. They can rely on automated features and capabilities for:

  • Streamlining data discovery so that organizations spend less time searching for data and more time synthesizing valuable insights

  • Facilitating the packaging of data products with the description in clear terms of the conditions of use of the data and the licensing terms

  • Handling the technical exchange of the data through various file-based or API-based mechanisms

Data exchange platforms also include all the tools needed for data exchange service providers to orchestrate, industrialize, and automate the governance of the platform, and to grow its usage and the number of business use cases leveraging the data exchanged on the platform. The platform supports the main activities of the orchestrator, also called the operator of the platform, who acts as the trusted third party. These activities include:

  • The administration of the data exchange platform and all its participants, namely the orchestrator’s teams, data acquirers, and data providers, covering registration, vetting, access rights, etc.

  • The monitoring of the activity on the platform with automatic review of meta-information used by data providers to describe data offerings

  • The stimulation of participants to exchange data using dedicated tools for automatically matching data supply and demand using algorithms, combined with notification systems

  • The management of the business models applied to the use of the data exchange platform by participants, which can combine free, transactional, and/or subscription-based approaches

  • The production of various reports and metrics to analyze activity, behaviors, and trends, but also to enable orchestrators to meet the obligations regarding interoperability and the exercise of individual rights that regulations may impose.
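The automatic matching of data supply and demand listed above can be illustrated with a minimal sketch. It assumes that offerings and requests are each described by a set of keyword tags and that they are compared with Jaccard similarity. The class names, fields, sample data, and the 0.3 threshold are hypothetical and are not taken from any actual platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offering:
    provider: str
    title: str
    tags: frozenset


@dataclass(frozen=True)
class Request:
    acquirer: str
    tags: frozenset


def jaccard(a, b):
    """Share of tags that two descriptions have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match(request, offerings, threshold=0.3):
    """Return (score, offering) pairs at or above the threshold, best first."""
    scored = [(jaccard(request.tags, o.tags), o) for o in offerings]
    hits = [(s, o) for s, o in scored if s >= threshold]
    return sorted(hits, key=lambda pair: pair[0], reverse=True)


# Hypothetical sample data, for illustration only.
offerings = [
    Offering("EnergyCo", "Hourly grid load", frozenset({"energy", "grid", "hourly"})),
    Offering("AgriData", "Soil moisture", frozenset({"agriculture", "soil", "iot"})),
]
request = Request("CityLab", frozenset({"energy", "grid", "forecast"}))

for score, offering in match(request, offerings):
    # A real platform would trigger a notification here instead of printing.
    print(f"notify {request.acquirer}: '{offering.title}' from {offering.provider} ({score:.2f})")
```

In practice, matching would more likely rely on richer metadata and semantic search than on exact tag overlap; the sketch only shows the supply, demand, score, and notify loop.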

Data exchange platforms play a critical role in the building of data spaces by clearly separating the roles and responsibilities of those who make data circulate from those who collect, store, and transform data (Fig. 23.1).

Fig. 23.1 Data exchange platforms at the heart of the data value chain (©2021, Dawex)

2.2 Regulating Ecosystems: Compliance at the Heart of Data Exchange

Institutions, governments, and regulators worldwide are playing a crucial role in boosting a sustainable data economy. By regulating data access and use, they contribute to raising trust among all types of organizations engaged in data sharing and data exchange activities (Fig. 23.2).

Fig. 23.2 Data protection around the world (©CNIL, 2022)

Europe is taking leadership in establishing data regulations. Since the General Data Protection Regulation (GDPR [4]) in 2018, which has inspired many similar regulations around the world (the California Consumer Privacy Act (CCPA [5]), the Lei Geral de Proteção de Dados (LGPD [6]) in Brazil, or the Act on Protection of Personal Information (APPI [7]) in Japan), new European regulations have been proposed by the European Commission, extending their scope beyond personal data to cover all types of data. In particular, industrial data has been identified by the European Commission as the next challenge in the creation of a single market for data that will allow data to flow freely within the EU and across sectors for the benefit of businesses, researchers, and public administrations. The President of the Commission, Ursula von der Leyen, declared in her political guidelines for the 2019–2024 Commission that Europe must “balance the flow and use of data while preserving high privacy, security, safety and ethical standards.”

The ensuing Commission Work Programme 2020 outlined several strategic objectives, including the European Strategy for Data, which was adopted in February 2020. The data strategy aims at building a genuine single market for data and at making Europe a global leader in the data-agile economy.

As part of the European Strategy for Data, the recently adopted proposal for a Data Governance Act aims to facilitate the voluntary sharing of data by individuals and businesses, harmonize conditions for the use of certain public sector data, and define the requirements applicable to data intermediation services. In particular, it stipulates, among other conditions, that data intermediation services providers may not use the data for which they provide data intermediation services for purposes other than putting them at the disposal of data users, and that they shall provide data intermediation services through a separate legal person.

Complementing the Data Governance Act, the Data Act, whose proposal was published by the European Commission in February 2022, aims at ensuring fairness in the allocation of value across the data economy. The future regulation will:

  • Facilitate the access to and use of data by businesses by increasing legal certainty in the sharing or exchange of data, especially IoT data

  • Provide for the public bodies’ use of data held by enterprises in certain public emergencies

  • Facilitate switching between cloud and edge services

  • Provide for safeguards against unlawful data access by non-EU/EEA governments

  • Provide for the development of interoperability standards for data to be reused between sectors

Additional initiatives are also being driven in Europe through the Gaia-X association, which is developing common requirements for a European data infrastructure, defining a reference architecture, and providing a secure, federated system that meets the highest standards of digital sovereignty while promoting innovation. Gaia-X has also positioned data exchange as one of the four key pillars defining Gaia-X Federation Services.

3 The Need for Traceability

Traceability refers to the completeness of the information about every step in a process chain. It is the capability of an application to track and trace the state of objects, discover information regarding their past states, and potentially estimate their future states [8].

As data is being utilized in the context of specific projects or initiatives, knowing what data acquirers and data providers intend to do with the data really matters. If participants in a data ecosystem are seeking to leverage data circulation for the common good, as part of a progressive project with a societal impact, or want to participate in the data economy, data exchange environments must provide them with traceability capabilities to reassure all participants that the data is being used in accordance with the initial agreement. This means data traceability is not only a cornerstone of trust but a crucial element to guarantee the business viability of all collaborative data projects.

To fulfill their promise entirely, data exchange environments must provide complete traceability of data so that its pathway can be followed. Providing participants with traceability will generate trust in the exchange and in the environment. This is a key capability for any expanding data ecosystem. It enables participants to provide regulatory authorities with information on the source, nature, and development of all exchanges within the data ecosystem, a task that becomes increasingly time-consuming and complex as ecosystems grow in size and activity. Traceability also provides huge data management benefits to participants engaging in numerous data exchanges with several different organizations, under different licenses. Data providers and data acquirers are empowered to optimize the orchestration of their data sharing and exchanges.

Data exchange platform environments can provide multiple capabilities to implement traceability, such as negotiation tools that retrieve all interactions between data providers and data acquirers, including the negotiation status, the messages exchanged, and the dates of all contacts. Functions to follow all legal documents related to a data transaction and the status of all data exchanges can also go a long way toward providing increased visibility.
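As an illustration of such traceability capabilities, the following minimal sketch records interactions between a provider and an acquirer in an append-only log. Chaining each entry to the previous one with a SHA-256 hash is an assumption made here to show how tampering could be detected; the text does not prescribe this mechanism, and the class, field, and event names are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


class TransactionLog:
    """Append-only log in which each entry commits to the previous one."""

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(body):
        # Canonical JSON so the same body always hashes to the same value.
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def record(self, provider, acquirer, event, detail=""):
        body = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "provider": provider,
            "acquirer": acquirer,
            "event": event,  # hypothetical values: "contact", "license_signed"
            "detail": detail,
            "prev_hash": self.entries[-1]["hash"] if self.entries else "",
        }
        entry = dict(body, hash=self._digest(body))
        self.entries.append(entry)
        return entry

    def history(self, provider, acquirer):
        """All recorded interactions between one provider and one acquirer."""
        return [e for e in self.entries
                if e["provider"] == provider and e["acquirer"] == acquirer]

    def verify(self):
        """Return True if the chain of hashes is intact."""
        prev_hash = ""
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if body["prev_hash"] != prev_hash or self._digest(body) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True


log = TransactionLog()
log.record("EnergyCo", "CityLab", "contact", "request for sample")
log.record("EnergyCo", "CityLab", "license_signed", "12-month commercial license")
print(len(log.history("EnergyCo", "CityLab")), log.verify())
```

A log of this kind lets either party, or an auditor, list the dated history of a negotiation and check that no earlier entry has been modified.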

Multiple examples can be found to illustrate the urgent need to multiply traceability capabilities in all instances of data sharing and data exchanges, some of the most telling in the financial services sector. Stock exchanges are especially illustrative.

Financial markets are operated based on trust. Stock or commodities exchanges play a central role in creating trust, and traceability of all transactions that are taking place on those exchanges is vital for markets to operate properly, also allowing authorities to verify that trades are complying with regulations.

In the same way, proper traceability capabilities integrated into a complete and compliant data exchange framework will contribute to more efficient, trusted data exchanges, impacting directly the growth of such exchanges.

4 Data Exchange Governance: Toward Hybrid Data Exchange Platforms

A data exchange platform is first and foremost a place where data supply and demand meet, securely and with confidence: data acquirers want to use or exploit data owned by data providers.

Specific conditions are therefore essential to build a trustworthy digital foundation for exchanging or accessing data, such as:

  • Access conditions: these conditions allow only legitimate participants to share or acquire data.

  • Usage rules: these rules give data providers control over their data through licenses that specify the criteria for use of the data, or even the rights to sub-license it, thereby providing legal coverage to limit misuse of the data.

  • Traceability of data circulation: the origin of data must be traced, as well as all its circulation.

  • Data exchange governance: this requirement is a means of governing the exchange of data, particularly with a view to the right of audit by regulators or authorities, by having mechanisms for data circulation, traceability of data exchanges, and data use.

  • Economic model: this requirement reflects the commercial objective of the data exchange platform, which is to generate revenue for itself via its services. This objective is achieved through commissions earned on the activity taking place on the data exchange platform or through other additional means such as subscriptions to the platform.

  • Data sovereignty: the platform must have mechanisms that allow the data provider to control who can access its datasets and manage permissions, usage restrictions, and associated licenses.

  • Secure data exchange: this is a requirement related to the most fundamental aspect of the data marketplace: data exchange. These exchanges must be carried out in the most secure way, as the data exchanged has a high commercial and strategic value. The disclosure of this data would reduce its value and lead to commercial losses, competitive movements, and reputational and regulatory impacts.

  • Compliance and privacy: for data-driven innovation to thrive, it is essential that stakeholders share their data. To achieve this, the trust and security framework is key both to reassuring stakeholders and to providing technical guarantees of the security implemented.
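Several of the conditions above (access conditions, usage rules, data sovereignty) amount to checks that a platform could evaluate before releasing data. The sketch below is one hypothetical way to express them; the `License` fields, the set of vetted participants, and the purpose labels are assumptions made for illustration and do not describe an actual licensing schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class License:
    licensee: str
    allowed_purposes: frozenset
    valid_until: date
    sublicensing: bool = False


def may_use(lic, participant, purpose, on, vetted_participants):
    """Check an intended use against access conditions and usage rules."""
    # Access condition: only legitimate (vetted) participants may take part.
    if participant not in vetted_participants:
        return False, "participant has not been vetted"
    # Data sovereignty: the provider decided who holds the license.
    if participant != lic.licensee:
        return False, "participant is not the licensee"
    # Usage rules: the license bounds both the period and the purpose of use.
    if on > lic.valid_until:
        return False, "license has expired"
    if purpose not in lic.allowed_purposes:
        return False, f"purpose '{purpose}' is not licensed"
    return True, "ok"


vetted = {"CityLab", "EnergyCo"}
lic = License("CityLab", frozenset({"research", "forecasting"}), date(2025, 12, 31))

print(may_use(lic, "CityLab", "forecasting", date(2025, 6, 1), vetted))
print(may_use(lic, "CityLab", "resale", date(2025, 6, 1), vetted))
```

Returning a reason alongside the decision is a deliberate choice: the same reason can be written to the traceability records that regulators or auditors may later request.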

The objective in addressing these challenges would, therefore, be to extend the traditional technological capabilities of data exchange platforms to develop a new type of solution that creates the optimal conditions for data governance within complex, distributed, and heterogeneous environments. Numerous technologies are already making it possible to envisage answers to these problems: deeptech, and particularly IoT and edge computing, for example. Nevertheless, many challenges exist, from monitoring millions of heterogeneous devices to secure data exchange between them.

Depending on the use cases, different modes can be provided:

  • Managed mode: in this mode, the orchestrator acts as a trusted third party for the exchange of data. Data flows through the data exchange platform.

  • Distributed mode: in this mode, the data provider provides access to its data to the data acquirer when the transaction is finalized. Data flows peer-to-peer.

  • Decentralized mode: in this mode, part of the business logic is delegated to nodes, for example through the use of agents or smart contracts (Fig. 23.3).

Fig. 23.3 Illustration of managed, distributed, and decentralized modes (©2021, Dawex)

These three modes do not oppose each other but complement each other, correspond to different use cases, and require specific technological responses. Data exchange platforms implementing these capabilities are called hybrid data exchange platforms.
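The difference between the three modes can be summarized in a short sketch that returns, for each mode, the path taken by the data and the place where the business logic runs. This is an illustrative model only: the function and label names are hypothetical, and the peer-to-peer data path shown for the decentralized mode is an assumption, since the description above only states that part of the business logic moves to nodes.

```python
from enum import Enum


class Mode(Enum):
    MANAGED = "managed"
    DISTRIBUTED = "distributed"
    DECENTRALIZED = "decentralized"


def describe(mode, provider, acquirer, platform="platform"):
    """Return where the data flows and where the business logic runs."""
    if mode is Mode.MANAGED:
        # The orchestrator is the trusted third party; data transits through it.
        return {"data_path": [provider, platform, acquirer], "logic": [platform]}
    if mode is Mode.DISTRIBUTED:
        # The platform finalizes the transaction; the data then flows peer-to-peer.
        return {"data_path": [provider, acquirer], "logic": [platform]}
    # Decentralized: part of the logic is delegated to nodes (agents, smart contracts).
    # Assumption: the data path is modeled as peer-to-peer here.
    return {"data_path": [provider, acquirer], "logic": [platform, "nodes"]}


for mode in Mode:
    print(mode.value, describe(mode, "EnergyCo", "CityLab"))
```

A hybrid platform, in this reading, is one that can select among these behaviors per use case rather than being built around a single one.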

Distributed ledger technologies (DLTs), such as blockchains, are powerful tools to meet decentralization needs. DLTs thus offer disruptive alternatives for these platforms in a peer-to-peer data exchange approach. The auditability of the information facilitates monitoring and allows anomalies to be identified. The blockchain also helps equipment interoperability by providing a reliable communication layer. In cases of large-scale use such as in smart cities, the blockchain and the IoT will work together.

Meeting the requirements of distribution, decentralization, heterogeneity of actors, interoperability, protection of personal data, end-to-end encryption, and data governance requires finding the right balance between integrating distributed and managed technologies.

4.1 Innovative Nature of the Hybrid Approach

Indeed, there are different types of blockchains (public, private, pseudonymous, or with strong means of identification), using different consensus algorithms (proof of work, proof of stake, proof of participation, trusted execution environments), specific cryptographic means (including but not limited to asymmetric cryptography and zero-knowledge proofs), different types of tokens (including but not limited to utility, monetary, fungible, or non-fungible tokens), and different programming languages for writing smart contracts. Examples include Ethereum, Hyperledger, Quorum, and Zcash. These technologies are mature and implemented by many different economic actors and consortia, such as the European EBP/EBSI initiatives.

While these blockchain technologies are mature, their large-scale implementations beyond the PoC (Proof of Concept) or PoB (Proof of Business) stage are still relatively limited, facing many bottlenecks such as interoperability, environmental efficiency, privacy, and user experience, among others. The creation of data marketplaces that are both distributed and encrypted from end to end may be required only when the players involved are very heterogeneous. Thus, in addition to offering interfaces and functionalities that facilitate data exchanges, this new operating mode must offer maximum efficiency, transparency, traceability, and security. These four conditions are essential to create the environment of trust that is indispensable to the commitment of stakeholders in the exchange of data and the establishment of a transaction between these players.

Combined with great flexibility supporting multiple business models, the hybrid approach allows managed, distributed, and decentralized use cases, from data sharing among the organization’s business divisions to data exchange with external partners, leveraging free models, subscription-based access models, or pay-as-you-go models that charge a fee on each data transaction.
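The three business models mentioned above can be made concrete with a small worked example. The 5% commission rate, the subscription price, and the function names below are hypothetical values chosen for illustration; the text does not specify any pricing.

```python
from decimal import Decimal, ROUND_HALF_UP
from enum import Enum


class BusinessModel(Enum):
    FREE = "free"
    SUBSCRIPTION = "subscription"
    PAY_AS_YOU_GO = "pay_as_you_go"


def per_transaction_fee(model, amount, rate=Decimal("0.05")):
    """Fee charged by the platform on a single data transaction."""
    if model is BusinessModel.PAY_AS_YOU_GO:
        return (amount * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    # Free use costs nothing; subscriptions are billed per period instead.
    return Decimal("0.00")


def monthly_invoice(model, transaction_amounts, subscription_price=Decimal("200.00")):
    """Total owed to the platform for one month of activity."""
    total = sum((per_transaction_fee(model, a) for a in transaction_amounts),
                Decimal("0.00"))
    if model is BusinessModel.SUBSCRIPTION:
        total += subscription_price
    return total


amounts = [Decimal("1000.00"), Decimal("250.50")]
for model in BusinessModel:
    print(model.value, monthly_invoice(model, amounts))
```

`Decimal` is used instead of floating-point numbers so that monetary amounts are rounded explicitly and reproducibly.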