1 Introduction

The continuous and significant growth of data, together with improved access to data and the availability of powerful computing infrastructure, has led to intensified activities around Big Data Value (BDV) and data-driven Artificial Intelligence (AI). Powerful data techniques and tools allow collecting, storing, analysing, processing and visualising vast amounts of data, enabling data-driven disruptive innovation within our work, business, life, industry and society. The rapidly increasing volumes of diverse data from distributed sources create significant technical challenges for extracting valuable knowledge. Many fundamental, technological and deployment challenges exist in the development and application of big data and data-driven AI to real-world problems. For example, what are the technical foundations of data management for data-driven AI? What are the key characteristics of efficient and effective data processing architectures for real-time data? How do we deal with trust and quality issues in data analysis and data-driven decision-making? What are the appropriate frameworks for data protection? What is the role of DevOps in delivering scalable solutions? How can big data and data-driven AI be used to power digital transformation in various industries?

For many businesses and governments in different parts of the world, the ability to effectively manage information and extract knowledge is now a critical competitive advantage. Many organisations are building their core business to collect and analyse information, to extract business knowledge and insight [3]. The impacts of big data value go beyond the commercial world to significant societal impact, from improving healthcare systems, the energy-efficient operation of cities and transportation infrastructure to increasing the transparency and efficiency of public administration.

The adoption of big data technology within industrial sectors helps organisations gain a competitive advantage. Driving adoption is a two-sided coin. On one side, organisations need to master the technology needed to extract value from big data. On the other side, they need to use the insights extracted to drive their digital transformation with new applications and processes that deliver real value. This book has been structured to help you understand both sides of this coin and bring together technologies and applications for Big Data Value.

The chapter is structured as follows: Section 2 defines the notion of Big Data Value. Section 3 explains the Big Data Value Public-Private Partnership (PPP) and Sect. 4 summarises the Big Data Value Association (BDVA). Sections 5, 6 and 7 structure the contributions of the book in terms of three key lenses: BDV Reference Model (Sect. 5), Big Data and AI Pipeline (Sect. 6) and the AI, Data and Robotics Framework (Sect. 7). Finally, Sect. 8 provides a summary.

2 What Is Big Data Value?

The term “Big Data” has been used by different major players to label data with different attributes [6, 8]. Several definitions of big data have been proposed in the literature; see Table 1.

Table 1 Definitions of big data [4]

Big data brings together a set of data management challenges for working with data which exhibits characteristics related to the 3 Vs:

  • Volume (amount of data): dealing with large-scale data within data processing (e.g., Global Supply Chains, Global Financial Analysis, Large Hadron Collider).

  • Velocity (speed of data): dealing with streams of high-frequency incoming real-time data (e.g., Sensors, Pervasive Environments, Electronic Trading, Internet of Things).

  • Variety (range of data types/sources): dealing with data using differing syntactic formats (e.g., Spreadsheets, XML, DBMS), schemas, and meanings (e.g., Enterprise Data Integration).

The Vs of big data challenge the fundamentals of existing technical approaches and require new data processing forms to enable enhanced decision-making, insight discovery and process optimisation. As the big data field matured, other Vs have been added, such as Veracity (documenting quality and uncertainty) and Value [1, 16].

The definition of value within the context of big data also varies. A collection of definitions for Big Data Value is provided in Table 2. These definitions clearly show a pattern of common understanding that the Value dimension of big data rests upon successful decision-making and action through analytics [5].

Table 2 Definitions of big data value [5]

3 The Big Data Value PPP

The European contractual Public-Private Partnership on Big Data Value (BDV PPP) commenced in 2015. It was operationalised with the Leadership in Enabling and Industrial Technologies (LEIT) work programme of Horizon 2020. The BDV PPP activities addressed the development of technology and applications, business model discovery, ecosystem validation, skills profiling, regulatory and IPR environments and many social aspects.

With an initial indicative budget from the European Union of 534 million euros for 2016–2020, and 201 million euros allocated in total by the end of 2018, the BDV PPP has mobilised 1570 million euros of private investments since its launch (467 million euros for 2018). Forty-two projects were running at the beginning of 2019. In just 2 years, the BDV PPP developed 132 innovations of exploitable value (106 delivered in 2018, 35% of which are significant innovations), including technologies, platforms, services, products, methods, systems, components and/or modules, frameworks/architectures, processes, tools/toolkits, spin-offs, datasets, ontologies, patents and knowledge. Ninety-three per cent of the innovations delivered in 2018 had an economic impact, and 48% had a societal impact. By 2020, the BDV PPP had projects covering a spectrum of data-driven innovations in sectors including advanced manufacturing, transport and logistics, health and bioeconomy. These projects have advanced the state of the art in key enabling technologies for big data value and in non-technological aspects, providing solutions, platforms, tools, frameworks and best practices for a data-driven economy and future European competitiveness in Data and AI [5].

4 Big Data Value Association

The Big Data Value Association (BDVA) is an industry-driven international not-for-profit organisation that grew over the years to more than 220 members across Europe, with a well-balanced composition of large, small and medium-sized industries as well as research and user organisations. BDVA has over 25 working groups organised in Task Forces and subgroups, tackling all the technical and non-technical challenges of big data value.

BDVA served as the private counterpart to the European Commission to implement the Big Data Value PPP program. BDVA and the Big Data Value PPP pursued a common shared vision of positioning Europe as the world leader in creating big data value. BDVA is also a private member of the EuroHPC Joint Undertaking and one of the leading promoters and driving forces of the European Partnership on AI, Data and Robotics in the framework programme MFF 2021–2027.

The mission of the BDVA is “to develop the Innovation Ecosystem that will enable the data-driven digital transformation in Europe delivering maximum economic and societal benefit, and, to achieve and to sustain Europe’s leadership on Big Data Value creation and Artificial Intelligence.” BDVA enables existing regional multi-partner cooperation to collaborate at the European level by providing tools and know-how to support the co-creation, development and experimentation of pan-European data-driven applications and services, as well as know-how exchange. The BDVA developed a joint Strategic Research and Innovation Agenda (SRIA) on Big Data Value [21]. It was initially fed by a collection of technical papers and roadmaps [2] and extended with a public consultation that included hundreds of additional stakeholders representing both the supply and the demand side. The BDV SRIA defined the overall goals, main technical and non-technical priorities, and a research and innovation roadmap for the BDV PPP. The SRIA set out the strategic importance of big data, described the Data Value Chain and the central role of Ecosystems, detailed a vision for big data value in Europe in 2020, analysed the associated strengths, weaknesses, opportunities and threats, and set out the objectives and goals to be accomplished by the BDV PPP within the European research and innovation landscape of Horizon 2020 and at national and regional levels.

5 Big Data Value Reference Model

Fig. 1 Big data value reference model

The BDV Reference Model (see Fig. 1) has been developed by the BDVA, taking into account input from technical experts and stakeholders along the whole big data value chain and interactions with other related PPPs [21]. The BDV Reference Model may serve as a common reference framework to locate big data technologies on the overall IT stack. It addresses the main concerns and aspects to be considered for big data value systems. The model is used to illustrate big data technologies in this book by mapping them to the different topic areas.

The BDV Reference Model is structured into horizontal and vertical concerns.

  • Horizontal concerns cover specific aspects along the data processing chain, starting with data collection and ingestion, and extending to data visualisation. It should be noted that the horizontal concerns do not imply a layered architecture. As an example, data visualisation may be applied directly to collected data (the data management aspect) without the need for data processing and analytics.

  • Vertical concerns address cross-cutting issues, which may affect all the horizontal concerns. In addition, vertical concerns may also involve non-technical aspects.

The BDV Reference Model has provided input to the ISO SC 42 Reference Architecture, which is now reflected in the ISO 20547-3 Big Data Reference Architecture.

5.1 Chapter Analysis

Table 3 shows how the technical outcomes presented in the different chapters in this book cover the horizontal and vertical concerns of the BDV Reference Model.

Table 3 Coverage of the BDV reference model’s core horizontal and vertical concerns by the book’s chapters

As this table indicates, the chapters in this book provide a broad coverage of the model’s concerns, thereby reinforcing the relevance of these concerns that were spelt out as part of the BDV SRIA.

The majority of the chapters cover the horizontal concerns of data processing architectures and data analytics, followed by data management. This indicates the critical role of big data in delivering value from large-scale data analytics and the need for dedicated data processing architectures to cope with the volume, velocity and variety of data. It also shows that data management is an important basis for delivering value from data and thus is a significant concern.

Many of the chapters cover the vertical concern of engineering and DevOps, indicating that sound engineering methodologies for building next-generation Big Data Value systems are relevant and increasingly available.

6 Big Data and AI Pipeline

A Big Data and AI Pipeline model (see Fig. 2) suitable for describing Big Data Applications is harmonised with the steps of the Big Data Application layer in ISO 20547-3. This book uses the pipeline to illustrate Big Data Applications and to map them to the different topic areas of the BDV Reference Model. Chapter 4 describes the Big Data and AI Pipeline in more detail and relates it to the Big Data Value Reference Model in Fig. 1 and the European AI, Data and Robotics Framework and Enablers in Fig. 3.

Fig. 2 Big data and AI pipeline on top, related to the ISO 20547-3 Big Data Reference Architecture

Fig. 3 European AI, data and robotics framework and enablers [20]

6.1 Chapter Analysis

Table 4 gives an overview of the extent to which the technical contributions described in the different chapters of this book relate to the four Big Data and AI Pipeline steps and, in particular, to any of the six big data types.

The Big Data and AI Pipeline steps are the following:

  • P1: Data Acquisition/Collection.

  • P2: Data Storage/Preparation.

  • P3: Analytics/AI/Machine Learning.

  • P4: Action/Interaction, Visualisation and Access.
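To make the four steps concrete, the following minimal Python sketch chains them as plain functions. It is illustrative only and is not taken from the book or from ISO 20547-3: the `Reading` type, the function names, the hard-coded sample records and the alert threshold are all assumptions, and the per-sensor mean merely stands in for whatever analytics or machine learning a real application would apply in P3.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class Reading:
    """Hypothetical cleaned sensor reading (illustrative field names)."""
    sensor: str
    value: float


def acquire() -> List[Dict[str, Optional[str]]]:
    """P1: Data Acquisition/Collection (here: hard-coded raw records)."""
    return [
        {"sensor": "a", "value": "1.0"},
        {"sensor": "a", "value": "3.0"},
        {"sensor": "b", "value": "10.0"},
        {"sensor": "b", "value": None},  # incomplete record
    ]


def prepare(raw: List[Dict[str, Optional[str]]]) -> List[Reading]:
    """P2: Data Storage/Preparation (drop incomplete records, parse types)."""
    return [
        Reading(r["sensor"], float(r["value"]))
        for r in raw
        if r["value"] is not None
    ]


def analyse(readings: List[Reading]) -> Dict[str, float]:
    """P3: Analytics/AI/Machine Learning (a per-sensor mean as a stand-in)."""
    grouped: Dict[str, List[float]] = {}
    for r in readings:
        grouped.setdefault(r.sensor, []).append(r.value)
    return {sensor: mean(values) for sensor, values in grouped.items()}


def act(stats: Dict[str, float], threshold: float = 5.0) -> List[str]:
    """P4: Action/Interaction, Visualisation and Access (alert messages)."""
    return [
        f"ALERT {sensor}: mean={m:.1f}"
        for sensor, m in sorted(stats.items())
        if m > threshold
    ]


def run_pipeline() -> List[str]:
    """Chain P1 to P4 in order."""
    return act(analyse(prepare(acquire())))


if __name__ == "__main__":
    print(run_pipeline())
```

Keeping each step as a separate function mirrors the way the chapters are mapped to individual pipeline steps: a contribution focused on P2 or P3 can be swapped in without touching acquisition or action.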

Part I on Technologies and Methods includes chapters that focus on the various technical areas mainly related to the pipeline steps P2 and P3, and mostly independent of the different big data types.

Part II on Processes and Applications includes chapters that typically cover the full big data pipeline, although some chapters have a more specific focus. With respect to the big data types, a majority of the chapters are related to time series and IoT data (12), followed by image data (8), spatiotemporal data (7), graph data (6) and chapters with a focus on text and natural language processing (2).

7 AI, Data and Robotics Framework and Enablers

Table 4 Coverage of AI pipeline steps and big data types as focused by the book’s chapters

In September 2020, BDVA, CLAIRE, ELLIS, EurAI and euRobotics announced the official release of the Joint Strategic Research Innovation and Deployment Agenda (SRIDA) for the AI, Data and Robotics Partnership [21], which unifies the strategic focus of each of the three disciplines engaged in creating the Partnership.

Together, these associations have proposed a vision for an AI, Data and Robotics Partnership: “The Vision of the Partnership is to boost European industrial competitiveness, societal wellbeing and environmental aspects to lead the world in developing and deploying value-driven trustworthy AI, Data and Robotics based on fundamental European rights, principles and values.”

To deliver on the vision, it is vital to engage with a broad range of stakeholders. Each collaborative stakeholder brings a vital element to the functioning of the Partnership and injects critical capability into the ecosystem created around AI, Data and Robotics by the Partnership. The mobilisation of the European AI, Data and Robotics Ecosystem is one of the core goals of the Partnership. The Partnership needs to form part of a broader ecosystem of collaborations that cover all aspects of the technology application landscape in Europe. Many of these collaborations will rely on AI, Data and Robotics as critical enablers to their endeavours. Both horizontal (technology) and vertical (application) collaborations will intersect within an AI, Data and Robotics Ecosystem.

Figure 3 sets out the primary areas of importance for AI, Data and Robotics research, innovation and deployment into three overarching areas of interest. The European AI, Data and Robotics Framework represents the legal and societal fabric that underpins the impact of AI on stakeholders and users of the products and services that businesses will provide. The AI, Data and Robotics Innovation Ecosystem Enablers represent essential ingredients for practical innovation and deployment. Finally, the Cross Sectorial AI, Data and Robotics Technology Enablers represent the core technical competencies essential for developing AI, Data and Robotics systems.

7.1 Chapter Analysis

Table 5 gives an overview of the extent to which the technical contributions described in the different chapters of this book are in line with the three levels of enablers covered in the European AI, Data and Robotics Framework.

Table 5 Coverage of the European AI, Data and Robotics Framework by the book’s chapters

Table 5 demonstrates that the European AI, Data and Robotics partnership and framework will be enabled by a seamless continuation of the current technological basis that the BDV PPP has established.

All cross-sectorial AI, Data and Robotics technology enablers are supported through contributions in this book. However, we observe a bias towards the topics “Knowledge and Learning,” “Reasoning and Decision Making” and “System, Methodologies, Hardware and Tools.” This is not surprising, as managing and analysing heterogeneous data sources, regardless of whether they are stored in one place or distributed across several, is one of the core challenges in extracting value from data. In addition, tools, methods and processes that integrate AI, Data, HPC or Robotics into systems, while ensuring that core system properties such as safety, robustness, dependability and trustworthiness can be integrated into the design cycle, tested and validated for use, have been and will remain core requirements and challenges when implementing and deploying data-driven industrial applications. Finally, many of the chapters describe how data-driven solutions bring value to particular vertical sectors and applications.

8 Summary

The continuous and significant growth of data, together with improved access to data and the availability of powerful computing infrastructure, has led to intensified activities around Big Data Value and data-driven Artificial Intelligence (AI).

The adoption of big data technology within industrial sectors helps organisations gain a competitive advantage. Driving adoption requires organisations to master the technology needed to extract value from big data and use the insights extracted to drive their digital transformation with new applications and processes that deliver real value. This book is your guide to help you understand, design and build technologies and applications that deliver Big Data Value.