1 Introduction

Creating a Europe-wide Data Space is an ambitious goal that has emerged from European stakeholders and has been envisioned through many interventions over the last ten years. The quest for a European Digital Single Market, a common Data Strategy, and a new kind of regulatory framework based on European values are vehicles for a new kind of data economy, enabled by data sharing infrastructures and motivated by the need to keep pace with global competition. Data spaces are a must for Europe.

Simultaneously, another major technological breakthrough has happened. Artificial intelligence (AI) has become a central technology for facilitating the transformation of businesses that exploit digitalization. In AI, too, European values are essential for technology development. Principles of privacy, security, and fairness are the basis of solutions in which the European citizen, a human being, takes center stage in society.

In a digital society, data is needed to improve industry performance and to understand how sustainable our society is. Using data and AI, it is possible to answer big questions, such as how sustainable the planet is or what impact industry has on the climate.

This chapter delineates, starting from Sect. 22.2, the vision of the Big Data Value Association (BDVA) on Data Sharing Spaces, explaining the five pillars needed to create value in data, trust as a central concept, and tools and mechanisms for strategic stakeholders to create data sharing spaces jointly. Section 22.3 elaborates on the strategic challenges which need to be overcome to realize the vision. Section 22.4 sets out our call to action for the community to make this a reality. Section 22.5 summarizes the initial progress on the development of data platforms to support data sharing. Sections 22.6 and 22.7 detail the importance of data governance and trustworthy AI. Section 22.8 details an example of a data space in smart manufacturing. Finally, Section 22.9 concludes the chapter.

2 Vision

The realization of a functioning and frictionless European-governed data sharing space that can successfully generate economic value by broadening data access for AI relies on carefully planned iterative implementation strategies and a timely concerted effort between all relevant stakeholders [1]. As depicted in Fig. 22.1, the success of widespread data sharing activities revolves around the central concept of trust: in the validity of the data itself and the algorithms operating on it; in the entities governing the data space; in its enabling technologies; and in and among its wide variety of users (organizations and private individuals as data producers, consumers, or intermediaries).

Fig. 22.1 The BDVA Data Sharing Value “Wheel” [1] ©2020, Big Data Value Association. Used under permission from Big Data Value Association

To achieve the required levels of trust, each of the following five pillars must meet some of the necessary conditions:

  • Data—As a touted fifth European fundamental freedom, free movement of data relies on organizational data strategies that embed methodologies for data sharing by design (e.g., interoperability) and clear standard guidelines that help determine the market value of data assets.

  • Governance—A European-governed data sharing space can inspire trust by adhering to the more advanced European rules, guidelines, and regulations and promote European values. Participation should be equally open to all and subject to transparent and fair rules of conduct.

  • People—Data sharing needs to guarantee individual privacy and offer fair value or compensation for shared personal data. For Europe to drive data sharing activities, the European workforce needs appropriate reskilling and upskilling to meet the evolving labor market’s needs.

  • Organizations—More organizations (including business, research, and governmental) need to rethink their strategy to fully embrace a data culture that places data at the center of their value proposition, exploring new data-driven business models and exploiting new data value flows.

  • Technology – Safer experimentation environments are needed to catalyze the maturation of relevant technology behind trustworthy data, data access, and algorithms (privacy, interoperability, security, and quality). Standardization activities need to adjust for faster reaction times to emerging standards and the identification of new ones.

The BDVA recognizes two complementary high-impact opportunities that can materialize as a result of timely interventions to converge data sharing initiatives in Europe and realize its vision:

  • Achieve wider access to data to realize the full potential of emerging AI technology by designing and implementing a common, trustworthy, decentralized data space that enables safe and democratic data sharing and boosts the European data economy.

  • Achieve a European-governed data space, giving Europe the possibility to assume a prominent position steering international efforts to develop data and AI solutions that reflect and respect European ethical values, including democracy, privacy protection, and equality [2].

The introduced vision for a European-governed data sharing space needs to be built around and in consultation with the same wide array of stakeholders that can exploit its benefits as users. Figure 22.2, designed on the triple helix view of research, development, and innovation production, shows the various roles that the active strategic stakeholders (Industry, Academia, Government) can play in the realization of this vision, the tools they can actively contribute, and the existing potential to achieve different kinds of societal impact (Economic, Technological, Political, and Cultural).

Fig. 22.2 Tools and mechanisms for strategic stakeholders to jointly realize a data sharing space [1] ©2020, Big Data Value Association. Used under permission from Big Data Value Association

Rather than focusing on specific Business to Business (B2B) scenarios or restricting the vision to specific sectors, we envision a data sharing space that is open to all, thus offering equal opportunities and spanning all societal spheres, including private citizens. Even though the latter are not actors in the realization of the data sharing space, they still play an essential role in data sharing. We retain business, as the main economic driver, at the center of our recommendations; in addition to B2B cases, however, we also consider Business to Government and vice versa (B2G, G2B), Business to Science and vice versa (B2S, S2B), as well as Consumer to Business (C2B) opportunities.

3 Challenges

The BDVA community has identified the most critical challenges (see Table 22.1) that stand in the way of the expected value generated by the identified opportunities [1]. The challenges can be categorized into two main concerns: interorganizational (lack of suitable data sharing ecosystems) and intraorganizational (issues faced by data producers and consumers, as data sharing participants).

Table 22.1 Challenges for common European Data Sharing Spaces

The most pressing interorganizational concern remains the lack of functional and trustworthy data sharing ecosystems that inspire immediate large-scale participation. Primary causes include the lack of robust legal and ethical frameworks, governance models, and trusted intermediaries that guarantee data quality, reliability, and fair use. This is compounded by the lack of widespread adherence to emerging best practices and standards (e.g., interoperability, provenance, and quality assurance standards), whose pace of maturation also continues to fall short of expectations. From a technical point of view, data sharing solutions need to address European concerns like ethics-by-design for democratic AI, and the rapid shift towards decentralized mixed-mode data sharing and processing architectures also poses significant scalability challenges.

In terms of intraorganizational concerns, the first significant concern is the difficulty of determining the value of data due to a lack of data valuation standards and assessment tools, compounded by the highly subjective and party-dependent nature of data value and the lack of data sharing foresight exhibited by a majority of producers. The second concern revolves around the difficulty data producers face in balancing their data’s perceived value (after sharing) against the risks they are exposed to (upon sharing it) despite adhering to standard guidelines. Specific examples include the perceived loss of control over data (due to the fluid nature of data ownership, which remains hard if not impossible to define legally), the loss of trade secrets due to unintentional exposure or malicious reverse-engineering (in a business landscape that is already very competitive), and the legal risk arising from potential data policy breaches (including GDPR breaches and the exposure of private identities).

4 Call to Action

BDVA has identified five recommended preconditions for successfully developing, implementing, and adopting a European Data Sharing Space [1]. Following widespread consultation with all involved stakeholders, the recommendations have been translated into 12 concrete actions. These can effectively be implemented alongside the Horizon Europe and Digital Europe programs [3]. This call to action is aligned with the European Commission’s latest Data Strategy [4]. The recommended actions are categorized under five independent goals: Convergence, Experimentation, Standardization, Deployment, and Awareness, each of which is targeted towards specific stakeholders in the data sharing ecosystem. The implementation of the five goals should take place within the timeframe shown in Fig. 22.3. Assuming that the convergence initiatives required over the next three years yield satisfactory outcomes, deployment efforts can be scaled up with experimentation acting as a further catalyst. Other deployment efforts need to go hand in hand with intensified standardization activities, which are key to a successful European-governed data sharing space. Activities targeted at greater awareness for all end-users can initially target organizations, entities, and individuals that can act as data providers and then extend to all potential consumers as solid progress is achieved. The actions are targeted to specific actors, which map to one or more of the strategic stakeholders in Fig. 22.2.

Fig. 22.3 Timeframe for implementing the recommended actions over the next decade [1] ©2020, Big Data Value Association. Used under permission from Big Data Value Association

5 Convergence: Data Platform Projects of the Big Data Value PPP

Trusted platforms for the secure sharing of “closed” personal and industrial data are key for creating a European data market and data economy. The data platform projects running under the umbrella of the Big Data Value PPP develop integrated technology solutions for data collection, sharing, integration, and exploitation to facilitate the creation of such a European data market and economy [5].

The data platform projects fall under the following three main types:

  • Personal data platforms facilitate compliance with prevailing legislation and allow data subjects and data owners to remain in control of their data and its subsequent use. Personal data platforms preserve utility for data analysis and allow for the management of privacy versus utility trade-offs, metadata privacy, and query privacy.

  • Industrial data platforms facilitate trusted and secure sharing and trading of proprietary and commercial data assets. Industrial data platforms offer automated and robust controls on compliance with legal rights (including automated contracting) and on the fair remuneration of data owners.

  • Mixed data platforms represent a combination of the above two types of data platforms.
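
The privacy versus utility trade-off mentioned for personal data platforms can be illustrated with a toy example. The sketch below uses the Laplace mechanism from differential privacy, chosen here only as one well-known technique; the chapter does not state which mechanisms the platforms actually use. The function names, the count of 120, and the epsilon values are hypothetical.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count with Laplace noise; a counting query has sensitivity 1."""
    return true_count + laplace_noise(1.0 / epsilon, rng)


def mean_abs_error(true_count: int, epsilon: float,
                   trials: int = 2000, seed: int = 0) -> float:
    """Average distance between released and true count (lower means more utility)."""
    rng = random.Random(seed)
    errors = [abs(private_count(true_count, epsilon, rng) - true_count)
              for _ in range(trials)]
    return sum(errors) / trials


# Hypothetical query: how many data subjects match some criterion.
strong_privacy_error = mean_abs_error(true_count=120, epsilon=0.1)
weak_privacy_error = mean_abs_error(true_count=120, epsilon=10.0)
```

A smaller epsilon gives stronger privacy but a noisier, less useful answer, so `strong_privacy_error` comes out larger than `weak_privacy_error`. Managing the trade-off amounts to choosing the privacy parameter per use case.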

5.1 The Portfolio of Projects

The portfolio of the Big Data Value PPP covers the data platform projects shown in Table 22.2. This table gives an overview of these projects, the type of data platform they develop, and the domain and use cases they address. Each of these projects is briefly summarized below based on open data from https://cordis.europa.eu/.

Table 22.2 Portfolio of the Big Data Value PPP covering data platforms

BD4NRG delivers a reference architecture for Smart Energy, which aligns with the BDVA, IDSA, and FIWARE reference models and architectures to enable B2B multi-party data exchange while providing full interoperability of leading-edge big data technologies with smart grid standards and operational frameworks. BD4NRG delivers an open modular big data analytic toolbox as front-end for one-stop-shop analytics services development by orchestrating legacy and third-party assets (data, computing resources, models, algorithms).

BD4OPEM develops an analytic toolbox to improve existing energy services and create new ones, all available in an open innovation marketplace. The analytic toolbox is based on big data techniques, providing tools for enabling efficient business processes in the energy sector. By extracting more value from available data, a range of innovative services are created in the fields of grid monitoring, operation, and maintenance, network planning, fraud detection, smart houses/buildings/industries energy management, blockchain transactions, and flexibility aggregation for demand-response. The open innovation marketplace ensures secure data flows from data providers to solution providers, compliant with GDPR requirements so that asset management is enhanced, consumer participation in energy balancing is promoted, and new data-driven business models are created through innovative energy services.

DataPorts designs, implements, and operates a cognitive ports data platform that (i) connects to the different digital infrastructures currently existing in digitized seaports, enabling the interconnection of a wide variety of systems into an integrated ecosystem, (ii) sets the policies for trusted and reliable data sharing and trading based on data owners’ rules and offering a clear value proposition, and (iii) leverages the data collected to provide advanced data analytics services based on which the different actors in the port value chain can develop novel AI and cognitive applications.

DataVaults delivers a framework and a platform that places personal data coming from diverse sources at its center and that defines secure, trusted, and privacy-preserving mechanisms allowing individuals to take ownership and control of their data and share them at will, through flexible data sharing and fair compensation schemes with other entities (companies or not). The overall approach rejuvenates the personal data value chain into a multi-sided and multi-tier ecosystem governed and regulated by smart contracts, which safeguard personal data ownership, privacy, and usage and attribute value to those who produce it.

i3-MARKET addresses the growing demand for a single European data market economy by innovating marketplace platforms, demonstrating with industrial implementations that data economy growth is possible. i3-MARKET provides technologies for trustworthy (secure and reliable), data-driven collaboration and federation of existing and future marketplace platforms, with special attention to industrial data and particularly to sensitive commercial data assets from both SMEs and large industrial corporations.

KRAKEN brings personal data sharing and trading to a level of maturity that does not yet exist by leveraging the emerging paradigm of self-sovereign identity built upon a stack of distributed ledger technologies (multi-ledger), which ensures future compatibility with different specific blockchain implementations for identity management. KRAKEN provides a decentralized user-centric approach to personal data sharing and incorporates trust and security assurance levels deriving claims from national identity schemas.

MOSAICrOWN enables data sharing and collaborative analytics in multi-owner scenarios in a privacy-preserving way, ensuring proper protection of private/sensitive/confidential information. MOSAICrOWN provides effective and deployable solutions allowing data owners to maintain control over the data sharing process, enabling selective and sanitized disclosure, and providing for efficient and scalable privacy-aware collaborative computations.

MUSKETEER creates a validated, federated, privacy-preserving machine learning platform tested on industrial data that is interoperable, scalable, and efficient enough to be deployed in real use cases. MUSKETEER alleviates data sharing barriers by providing secure, scalable, and privacy-preserving analytics over decentralized datasets using machine learning. Data can continue to be stored in different locations with different privacy constraints but shared securely.
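
The idea that data can stay in different locations while still contributing to a shared model can be sketched with federated averaging, a commonly used federated learning scheme. This is not a description of MUSKETEER's actual algorithms or API: the one-parameter model, the two sites, and all function names below are hypothetical and chosen only to keep the example small.

```python
def local_update(weight, data, lr=0.1):
    """One gradient step for the model y = weight * x, using local data only."""
    grad = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
    return weight - lr * grad


def federated_average(updates, sizes):
    """Combine local weights, weighting each site by its number of samples."""
    total = sum(sizes)
    return sum(w * n for w, n in zip(updates, sizes)) / total


def train(sites, rounds=50, lr=0.1):
    """Run federated rounds; only model weights leave a site, never raw data."""
    weight = 0.0
    for _ in range(rounds):
        updates = [local_update(weight, data, lr) for data in sites]
        weight = federated_average(updates, [len(data) for data in sites])
    return weight


# Two hypothetical sites holding private (x, y) pairs that follow y = 2x.
sites = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
]
weight = train(sites)
```

After training, `weight` is close to 2.0 even though neither site ever saw the other's records. Real deployments add further protections, such as secure aggregation or noise on the updates, which this sketch omits.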

OpertusMundi delivers a trusted, secure, and highly scalable pan-European industrial geospatial data market, acting as a single point for the streamlined and trusted discovery, sharing, trading, remuneration, and use of proprietary and commercial geospatial data assets. The OpertusMundi platform guarantees low cost and flexibility to accommodate the current and emerging needs of data economy stakeholders regardless of their size, domain, and expertise.

PIMCity delivers Personal Information Management Systems (PIMS) that give users back control over their data while creating transparency in the market. PIMCity implements a PIMS development kit (PDK) to commoditize the complexity of creating PIMS. This lowers the barriers for companies and SMEs to enter the web data market. In addition, PIMCity designs and deploys novel mechanisms to increase users’ awareness.

PLATOON facilitates deploying distributed/edge processing and data analytics technologies for optimized real-time energy system management in a simple way for the energy domain expert. Data governance among the different stakeholders for multi-party data exchange, coordination, and cooperation in the energy value chain is guaranteed through IDS-based connectors. The PLATOON architecture and components are valuable for the different stakeholders of the energy sector value chain.

Safe-DEED brings together partners from cryptography, data science, business innovation, and the legal domain, aiming to improve security in data sharing, increase trust, and promote privacy-enhancing technologies in line with global macrotrends and the data economy. It also delivers a set of tools to ease the evaluation of data value in large companies, motivating data owners to use the protocols developed by Safe-DEED.

smashHit assures trusted and secure sharing of data streams from both personal and industrial platforms, which is required to build sectorial and cross-sectorial services. smashHit establishes a framework for processing data owner consent and legal rules and for effective contracting, as well as joint security and privacy-preserving mechanisms. The tools of this framework facilitate traceability of the use of data, data fingerprinting, and automatic contracting among the data owners, data providers, service providers, and users.

SYNERGY introduces a novel reference big data architecture and platform that leverages data, primarily or secondarily related to the electricity domain, coming from diverse sources (APIs, historical data, statistics, sensors/IoT, weather, energy markets, and various other open data sources). SYNERGY helps electricity stakeholders to simultaneously enhance their data reach and improve their internal intelligence on electricity-related optimization functions while getting involved in novel data (intelligence) sharing/trading models to shift individual decision-making to a collective intelligence level.

TheFSM delivers an industrial data platform that will significantly improve the way that food certification takes place in Europe. It brings together and builds upon existing innovations from ICT SMEs to deliver a uniquely open and collaborative virtual environment that will facilitate the exchange and connection of data between different food safety actors interested in sharing information critical to certification. TheFSM catalyzes the digital evolution of the quite traditional but very data-intensive business ecosystem of the global food certification market.

TRUSTS facilitates trust in the concept of data markets as a whole via its focus on developing a platform based on the experience of two large national projects while allowing the integration and adoption of future platforms. The TRUSTS platform acts independently and as a platform federator while investigating the legal and ethical aspects that apply to the entire data valorization chain, from data providers to consumers. TRUSTS delivers a fully operational and GDPR-compliant marketplace targeting both personal and industrial use by leveraging existing data marketplaces, such as the IDS.

5.2 Cross-Cutting Challenges in Data Platforms

A series of online workshops were organized by the BDVA and the EC under the umbrella of the European Big Data Value PPP to facilitate collaboration and interaction among the data platform projects. In addition to learning from each other about key aspects of the new data platform projects, these workshops facilitated identifying important transversal topics and challenges of common interest.

These challenges include:

  • Federation and interoperability of domain-specific data platforms: Many of the existing data platform projects focus on distinct vertical domains (such as energy, health, or transport). While such focus is important to deliver data platform services that are meaningful to their users (as they are not too generic and thus decoupled from the actual domain semantics and data types), federation and interoperation among these vertical data platforms will facilitate a further push towards a European Data Market. Among the current data platform projects, i3-MARKET, MUSKETEER, and TRUSTS investigate paths towards achieving such federation and interoperability. Complementing these paths, the EUHUBS4DATA project aims to set up a European federation of Big Data Digital Innovation Hubs (DIHs), with the ambition of becoming a reference instrument for data-driven cross-border experimentation and innovation and of supporting the growth of European SMEs and start-ups in a global Data Economy.

  • Multi-sided aspects of data platforms: To facilitate adoption and use of data platforms, their multi-sided aspects must be considered. To facilitate adoption by data and service users, concerns such as attractiveness and ease of access are important. To attract data and service providers, concerns such as data protection and privacy, incentive mechanisms and strategies that draw data providers to platforms, and mechanisms to enforce or better incentivize data quality are important.

  • Commonalities among building blocks: Many of the data platforms share common concerns with respect to data ingestion, sharing, protection, management, etc. As a result, many of the projects develop building blocks to address these different concerns of a data platform project. Many of the data platform projects focus on domain-specific building blocks or building blocks for certain types of data (e.g., personal vs. industrial). This is an important step towards delivering effective platform services. Still, there are domain-independent commonalities among these building blocks, which may be leveraged to identify commodity building blocks that make bootstrapping future data platforms and federating data platforms more efficient. Identifying such common building blocks may also pave the way towards a common reference architecture for European data platforms.

6 The Needs for Data Governance

While originally mainly technical, the meaning of “data governance” has grown in scope over time. There is no consensual definition of “data governance,” although the expression is broadly used. From the legal and policy side, governance generally refers to “the high-level management of organisations or countries, as well as the decision-making system and institutions for doing it.” Data governance can therefore be defined (very broadly) as a system of rights and responsibilities that determine who can take what actions with what data.
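
The broad definition above (who can take what actions with what data) can be made concrete with a small sketch. The Python below is purely illustrative: the `Grant` record, the role names, the action names, and the dataset identifier are hypothetical and are not taken from any BDVA specification or data space implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """One governance rule: a role may perform an action on a dataset."""
    role: str      # who (hypothetical role name)
    action: str    # what action, e.g. "read" or "share"
    dataset: str   # which data (hypothetical dataset identifier)


class GovernancePolicy:
    """A deny-by-default collection of grants."""

    def __init__(self, grants):
        self._grants = frozenset(grants)

    def is_allowed(self, role: str, action: str, dataset: str) -> bool:
        # Anything that is not explicitly granted is refused.
        return Grant(role, action, dataset) in self._grants


policy = GovernancePolicy([
    Grant("data_holder", "read", "machine_telemetry"),
    Grant("data_holder", "share", "machine_telemetry"),
    Grant("data_user", "read", "machine_telemetry"),
])
```

Here `policy.is_allowed("data_user", "read", "machine_telemetry")` is true, while the same user asking to `"share"` that dataset is refused. Real governance frameworks add legal, organizational, and contextual conditions that a lookup table of this kind cannot capture.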

The objective of developing integrated legal, ethical, organizational, and technical frameworks that can facilitate fair, compliant, trustworthy access to and reuse of data can only be reached through interdisciplinary efforts and the involvement of a broad range of stakeholders’ perspectives. The BDVA community and activities and the projects of the BDV PPP contribute to this R&I topic.

Advancing this forward-looking approach to data governance requires progress on the following challenging questions:

  • New regulations on substantive rights and organizational aspects are to be expected. However, we already have many legal frameworks dealing, more or less directly, with data and pursuing various legal and policy objectives. How to ensure that the new regulations will interact with the existing legal frameworks is an important challenge that requires research in order to build consistent frameworks and infrastructure from the perspective of “data” and data spaces and to create a clear and fair legal ecosystem.

  • To date, there is a tension between horizontal and sector-specific regulation of data. How to position the role of data spaces within this tension constitutes a challenge. In any case, data spaces should not result in (re-)creating “data silos” but work towards a genuine Common European Data Space. Data governance mechanisms can open avenues for regulating data while paying due consideration to the context in which they are processed. Based on case-specific studies, the factual and regulatory factors that influence data governance need further research to draw lessons for further regulation. Sandboxes and testbeds could be used to gain more insight into the concrete needs of stakeholders and, therefore, help resolve this tension.

  • It remains a challenge to safeguard a human-centric and fair approach. The “data control” paradigm should be further tested in concrete settings to assess to what extent it can serve to expand the data economy while protecting individuals with respect to data related to them and preventing commodification of their data. Data holders fear that they would “lose control” upon sharing “their” data. Therefore, empowering data holders (both legal entities and individuals) and turning them into active economic players concerning “their” data is increasingly viewed by policymakers as a way forward. In this respect, it should also be considered that data have not only economic value but also societal value, e.g., for research and for addressing collective challenges.

  • An important challenge lies in facilitating the creation of best practices from (sectorial) cases of specific (more collaborative) data governance mechanisms: data pools, data spaces, data commons, data trusts, data federations, data altruism, data cooperatives, data marketplaces, PIMS, usage rights for co-generated data, etc. Therefore, an experimental research approach is needed to identify the factors for success or failure, e.g., the technology, the nature of data and stakeholders, the objectives assigned to the governance mechanism, and the legal framework.

  • The data intermediation landscape is still in its infancy. It is, therefore, crucial to map the emerging data intermediation models (centralized and decentralized; cf. innovative business models in H2020 R&I projects) and how they could be influenced by EU legislation (such as the DGA proposal or the upcoming Data Act). Data intermediaries should not be channelled into a given direction, as other (also more collaborative and decentralized) models should be allowed to emerge in line with the data space objectives.

  • Technical and business models, architectures, and standards are emerging to structure and facilitate data sharing. These initiatives are aimed at bringing trust to data holders and data users, which also requires another step, namely, a solid legal infrastructure. The interplay of technical and legal infrastructure is a critical topic, e.g., legal layers of interoperability, the potential impact of fairness imperatives on the de facto appropriation of data, and the potential dynamic aspects of data that would require an evolutionary approach.

Results of these R&I efforts will create building blocks that stimulate the development of data spaces and the realization of trustworthy technology applications and ecosystems.

7 Towards Trustworthiness of Industrial AI

One of the core characteristics that AI-enabled industrial systems need to display is trustworthiness. The purpose of industrial AI is to boost the effectiveness and quality of the services delivered to the client and to ensure that deploying AI solutions in critical applications has no negative impact. Building systems that can be trusted is critical to their acceptance.

Although only some critical AI applications need high levels of trustworthiness, all applications need to be trustable. Trustworthiness will not be embodied by a single data source or technology but needs to be designed into an AI system, i.e., created by the interaction between all its technology building blocks and the data assets used [6]. Trustworthiness is built on multiple underlying system characteristics, such as reliability, dependability, safety, robustness, and transparency. In this context, high data quality and efficient and transparent means for data sharing are key levers to ensure the trustworthiness of industrial AI.

However, the trustworthiness of industrial AI involves the simultaneous achievement of objectives that are often in conflict. For instance, one critical challenge stems from the ever-increasing collection and analysis of personal data and the crucial requirement to protect the privacy of all involved data subjects as well as the commercially sensitive data of associated organizations and enterprises. As all means for trusted and secure sharing add cost and complexity to industrial AI systems, finding optimal trade-offs that do not add considerable complexity is a significant research challenge to be addressed.

In this context, the BDVA community has identified several R&I challenges [3, 7] that need to be addressed when implementing the trustworthiness of industrial AI while respecting transparent and secure data sharing:

  • To protect data for AI, improved privacy-preserving technologies are required. This includes data protection in machine learning, protecting the confidentiality and the integrity of training data, learned models, and test samples, and means for data protection in dynamic environments (e.g., cloud/fog/edge) with resource-constrained devices and immutable data stores.

  • All sensing and perception technologies that help create, access, assess, convert, and aggregate signals representing real-world objects into communicable data assets need to be transparent, traceable, and reliable. In addition, the data processing and management methods need to ensure data privacy, integrity, and accountability. To achieve this, the development of trusted execution environments for edge devices can keep sensitive data at the source.

  • All reasoning and decision-making algorithms need to be transparent, explainable, and complemented with efficient means for testing and validating the AI-based solution. This requires quality standards for reference datasets for the continuous testing and validation of the AI component performance in the context of the industrial AI system, techniques that can work reliably with insufficient and missing data, as well as benchmarks for determining the performance, robustness, reliability, usability, and other quality indicators of industrial AI systems.

  • For all industrial AI systems, safe interaction in safety-critical and unstructured environments needs to be ensured. This requires the co-development of technology and the development of data-based confidence measures.

  • Deploying AI and data systems often requires the integration of diverse technologies ranging from software to hardware. Meeting trustworthiness requirements, such as reliability, privacy, robustness, safety, dependability, and transparency, requires data-driven methodologies and tools as well as data-based validation processes and means for verification. This can be achieved by using data to identify hardware and software anomalies, and through new quality standards, verification methodologies, and “by-design” approaches.

8 Example: Smart Manufacturing Data Space

Data sharing spaces are developing for different sectors (e.g., healthcare) and different sources of data (e.g., Internet of Things/Smart Environments [8]). The manufacturing sector, taking into account both discrete product and continuous process industries, is considering Data Spaces from a threefold perspective: availability of FAIR, high-value datasets, adoption of advanced AI-based Industrial Data Platforms, and deployment of governance rules to respect the business and security models of all the stakeholders [9]. On the one side, there is a continuously growing amount of data produced at all stages (e.g., factory, product, supply chain), while on the other side, analytical skills and tools are becoming increasingly advanced, mainly thanks to the adoption of Artificial Intelligence and data-intensive applications. The third challenging aspect is governing the match between demand (e.g., manufacturing companies) and supply (e.g., ICT service providers) to maximize the benefits obtained by the concrete application of AI technologies in real business cases while respecting security and confidentiality constraints. This match relies on the availability of a secure and trusted framework that enables the interaction among different subjects and the exchange of information, knowledge, and data among them, for example, via Data Sovereignty governance models.

Several business cases are available in the Smart Manufacturing Industry ecosystem. One of them is a use case developed in the MUSKETEER project, one of the research actions listed in Table 22.1. The case considers the opportunity for an industrial company (a robot manufacturer) to set up a new business service based on the distributed knowledge available across its customers. The scenario assumes that a number of those customers may be interested in developing and deploying, on their own, an anomaly detection process for a defined robot-enabled operation (e.g., a welding action). Each of those customers collects data from the system in the production line, and part of that data is used to train a Machine Learning model able to detect whether the welding operations are performing appropriately. In this case, the knowledge captured in the ML model, and hence its accuracy, reflects only the cases that occurred at a single customer site. Assuming that different customers operate the same kind of robots for the same kind of activity, they may face and record different correct and incorrect behaviors. Consequently, they can collect different training sets. By sharing such knowledge in a secure Manufacturing Data Space, a new “Fleet Management” application can be developed, which can consider a broader and more complete variety of customer behaviors and configurations.

The MUSKETEER platform federates the Data Spaces of the robot manufacturer and its clients. Starting from local ML models trained at customer sites, the robot manufacturer can aggregate them into a more accurate and high-value Data Space. From a service economy point of view, by implementing such a Manufacturing Data Space, the robot manufacturer can provide a value-added service to all the clients in full respect of Data Sovereignty principles. The same paradigm is flexible enough to be applied to other kinds of analysis (e.g., predictive maintenance) or other domains (e.g., healthcare images).
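The aggregation of local ML models trained at customer sites can be illustrated with a minimal sketch. The text above does not state which aggregation algorithm the MUSKETEER platform uses, so the example assumes sample-weighted federated averaging, a common choice in federated learning. The function name `federated_average`, the two-parameter weight vectors, and the sample counts are hypothetical and serve illustration only. Only model parameters leave each site; the raw production data stays with the customer.

```python
from typing import List, Sequence, Tuple


def federated_average(updates: Sequence[Tuple[List[float], int]]) -> List[float]:
    """Combine locally trained weight vectors into one global vector.

    Each update is (weights, n_samples). Sites with more training samples
    contribute proportionally more to the result (assumed weighting scheme).
    """
    if not updates:
        raise ValueError("at least one local update is required")
    dim = len(updates[0][0])
    total_samples = sum(n for _, n in updates)
    if total_samples <= 0:
        raise ValueError("total sample count must be positive")

    global_weights = [0.0] * dim
    for weights, n_samples in updates:
        if len(weights) != dim:
            raise ValueError("all weight vectors must have the same length")
        share = n_samples / total_samples  # this site's share of all samples
        for i, w in enumerate(weights):
            global_weights[i] += share * w
    return global_weights


# Hypothetical local models from three customer sites:
# (weight vector of the local anomaly detector, number of training samples)
site_updates = [
    ([0.2, 1.0], 100),
    ([0.4, 2.0], 300),
    ([0.6, 3.0], 100),
]

# The manufacturer sees only these parameters, never the welding data itself.
global_model = federated_average(site_updates)
print(global_model)  # approximately [0.4, 2.0]
```

Weighting by sample count lets a site that has observed more welding operations influence the shared model more. A real deployment would typically add secure aggregation or other privacy-preserving measures on top of this step.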

9 Conclusions

In this chapter, we presented a wealth of work done on data platforms in the form of European research projects, which inform our understanding of European-governed data sharing spaces. These concrete activities provide a corpus of knowledge that can be used to derive best practices and ways to design data spaces.

Convergence of solutions, however, is necessary in order to build compatible data spaces. Therefore, while data space implementations are emerging, it is necessary to consider whether the different approaches and systems in their respective domains are interoperable and whether their data access solutions are scalable. In general, European software producers must eventually align their offerings, and data space implementations need to follow similar design principles.

As we have seen, many of the current approaches are domain-oriented, but several cross-cutting challenges exist, such as developing domain-independent building blocks and federating data sources. Cross-domain and cross-border data spaces are an opportunity that still requires further work. A cross-border use case that successfully boosts the value network of European businesses would eventually show the potential of European data spaces. There is no easy way out. We need to experiment, pilot, and learn how to navigate the construction of European data spaces.