From an operational point of view, BIG defined a set of activities based on a three-phase approach, as illustrated in Fig. 2.5. The three phases were:
Technology state of the art and sector analysis
Cross-sectorial roadmapping
Big data public private partnership
7.1 Technology State of the Art and Sector Analysis
In the first phase of the project, the sectorial forums and the technical working groups performed a parallel investigation in order to identify:
As part of the investigation, application sectors expressed their needs with respect to the technology as well as possible limitations and expectations regarding its current and future deployment.
Using the results of the investigation, a gap analysis compared the technological capabilities that were ready with the capabilities the sectors required at the time and expected to require in the future. The analysis produced a series of consensus-reflecting sectorial roadmaps that defined priorities and actions to guide further steps in big data research.
7.1.1 Technical Working Groups
The goal of the technical working groups was to investigate the state of the art in big data technologies to determine their level of maturity, clarity, understandability, and suitability for implementation. To allow for an extensive investigation and detailed mapping of developments, the technical working groups combined top-down and bottom-up approaches, with a focus on the latter. The working groups followed four steps: (1) literature research, (2) subject matter expert interviews, (3) stakeholder workshops, and (4) technical survey.
In the first step each technical working group performed a systematic literature review based on the following activities:
Identification of relevant types and sources of information
Analysis of key information in each source
Identification of key topics for each technical working group
Identification of the key subject matter experts for each topic as potential interview candidates
Synthesis of the key message of each data source into state-of-the-art descriptions for each identified topic
The experts within the consortium outlined the initial starting points for each technical area, and the topics were expanded through the literature search and from the subject matter expert interviews.
The following types of data sources were used: scientific papers published in workshops, symposia, conferences, journals and magazines, company white papers, technology vendor websites, open source projects, online magazines, analysts’ data, web blogs, other online sources, and interviews conducted by the BIG consortium. The groups focused on sources that mention concrete technologies and analysed them with respect to their values and benefits.
The synthesis step compared the key messages and extracted agreed views, which were then summarized in the technical white papers. Topics were prioritized based on the degree to which they were able to address business needs as identified by the sectorial forum working groups.
The literature survey was complemented with a series of interviews with subject matter experts for relevant topic areas. Subject matter expert interviews are well suited to data collection, particularly for exploratory research, because they allow expansive discussions that illuminate factors of importance (Oppenheim 1992; Yin 2009). The information gathered is likely to be more accurate than information collected by other methods, since the interviewer can avoid inaccurate or incomplete answers by explaining the questions to the interviewee (Oppenheim 1992).
The interviews followed a semi-structured protocol. The topics of the interview covered different aspects of big data, with a focus on:
Goals of big data technology
Beneficiaries of big data technology
Drivers and barriers for big data technologies
Technology and standards for big data technologies
An initial set of interviewees was identified from the literature survey, contacts within the consortium, and a wider search of the big data ecosystem. Interviewees were selected to be representative of the different stakeholders within the big data ecosystem. The selection of interviewees covered (1) established providers of big data technology (typically MNCs), (2) innovative sectorial players who are successful at leveraging big data, (3) new and emerging SMEs in the big data space, and (4) world leading academic authorities in technical areas related to the Big Data Value Chain.
7.1.2 Sectorial Forums
The overall objective of the sectorial forums was to acquire a deep understanding of how big data technology can be used in the various industrial sectors, such as healthcare, the public sector, finance and insurance, and media.
In order to identify the user needs and industrial requisites of each domain, the sectorial forums followed a research methodology encompassing three steps, as illustrated in Fig. 2.6. The steps were carried out separately for each industrial sector. However, where sectors were related (such as energy and transport), the results were merged for those sectors in order to highlight differences and similarities.
The aim of the first step was to identify both stakeholders and use cases for big data applications within the different sectors. To this end, a survey was conducted covering scientific reviews, market studies, and other Internet sources. This knowledge allowed the sectorial forums to identify and select potential interview partners and guided the development of the questionnaire for the domain expert interviews.
The questionnaire consisted of up to 12 questions that were clustered into three parts:
Direct inquiry of specific user needs
Indirect evaluation of user needs by discussing the relevance of the use cases identified at Step 1 as well as any other big data applications of which they were aware
Reviewing constraints that need to be addressed in order to foster the implementation of big data applications in each sector
In the second step, semi-structured interviews were conducted using the developed questionnaire. At least one representative of each stakeholder group identified in Step 1 was interviewed. To derive the user needs from the collected material, the most relevant and frequently mentioned use cases were aggregated into high-level application scenarios. The data collection and analysis strategy was inspired by the triangulation approach (Flick 2004). A reliable analysis of user needs was derived by reviewing and quantitatively assessing the high-level application scenarios. Examining the likely constraints of big data applications helped to identify the relevant requirements that needed to be addressed.
The third step involved a crosscheck and validation of the initial results of the first two steps by involving stakeholders of the domain. Some sectors conducted dedicated workshops and webinars with industrial stakeholders to discuss and review the outcomes. The results of the workshops were studied and integrated whenever appropriate.
7.2 Cross-Sectorial Roadmapping
Comparison among the different sectors enabled the identification of commonalities and differences at multiple levels, including technical, policy, business, and regulatory. The analysis was used to define an integrated cross-sectorial roadmap that provides a coherent holistic view of the big data domain. The cross-sectorial big data roadmap was defined using the following three steps:
Consolidation to establish a common understanding of requirements as well as technology descriptions and terms used across domains
Mapping to identify any technologies needed to address the identified cross-sector requirements
Temporal alignment to highlight which technologies need to be available at what point in time by incorporating the estimated adoption rate by the involved stakeholders
The remainder of this section describes each of these steps in more detail.
7.2.1 Consolidation
Alignment among the technical working groups, and between the technical working groups and the sectorial forums, was important and was facilitated through early exchange of drafts, one-on-one meetings, and the collection of consolidated requirements through the sectorial forums. In order to align the sector-specific labelling of requirements, a consolidated description was established. To this end, each sector provided its requirements with the associated user needs. In dedicated meetings, similar and related requirements were clustered and then merged, aligned, or restructured. As a result, the initial list of 13 high-level requirements and 28 sub-level requirements was reduced to 8 high-level requirements and 25 sub-level requirements. In summary, the consolidation phase reduced the total number of requirements from 41 to 33, a reduction of roughly 20 %.
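The size of the reduction can be checked directly from the counts given above. The following is a minimal sketch in Python; the dictionary keys and the function name `reduction_percent` are illustrative choices and are not part of the BIG methodology.

```python
# Requirement counts before and after consolidation, as stated in the text.
initial = {"high_level": 13, "sub_level": 28}
consolidated = {"high_level": 8, "sub_level": 25}


def reduction_percent(before, after):
    """Return the percentage drop in the total number of requirements."""
    total_before = sum(before.values())
    total_after = sum(after.values())
    return 100 * (total_before - total_after) / total_before


# 41 requirements before, 33 after: (41 - 33) / 41 is about 19.5 %.
print(round(reduction_percent(initial, consolidated), 1))
```

The exact value is about 19.5 %, which the text rounds to 20 %.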
7.2.2 Mapping
To map technology to requirements, the technical working groups indicated which technologies could be used to address the consolidated requirements. Besides providing a mapping between requirements and technologies, the technical working groups also indicated the associated research challenges.
Within a 1-day workshop, the initial mapping of technologies and requirements was consolidated in two steps. First, the indicated technological capabilities were analysed in further detail by describing how the sector-specific aspects of each cross-sector requirement could be handled. Second, for each cross-sector requirement, it was investigated whether technologies from several technical working groups needed to be combined in order to address the full scope of the requirement. At the end of the discussion, any technologies that were requested by at least two sectors were included in the cross-sector roadmap.
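The inclusion rule at the end of the workshop (a technology enters the cross-sector roadmap if at least two sectors requested it) can be expressed as a simple count. The sketch below is illustrative only: the function name, the `min_sectors` parameter, and the sector-to-technology assignments are hypothetical and do not reproduce the project's actual data.

```python
from collections import Counter


def cross_sector_technologies(requests, min_sectors=2):
    """Return technologies requested by at least `min_sectors` sectors."""
    counts = Counter()
    for technologies in requests.values():
        # Use a set so that a sector counts at most once per technology.
        counts.update(set(technologies))
    return sorted(t for t, n in counts.items() if n >= min_sectors)


# Hypothetical requests per sector, for illustration only.
example_requests = {
    "healthcare": ["data integration", "real-time analytics"],
    "finance": ["real-time analytics", "sentiment analysis"],
    "energy": ["data integration", "predictive maintenance"],
}

print(cross_sector_technologies(example_requests))
```

With this made-up input, only the two technologies named by two sectors each pass the threshold; the ones requested by a single sector remain sector-specific.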
7.2.3 Temporal Alignment
After identifying the key technologies, their temporal alignment needed to be defined. This was achieved by answering two questions:
The development time for each technology indicates how much time is needed to solve the associated research challenges. This time frame depends on the technical complexity of the challenge together with the extent to which sector-specific extensions are needed. To determine the adoption rate of a big data technology (or the associated use case), non-technical requirements were considered, such as the availability of business cases, suitable incentive structures, legal frameworks, potential benefits, and the total cost for all the stakeholders involved (Adner 2012).