1 Introduction

In August 2015 Fundacao Getulio Vargas (FGV) organized a 3-day seminar on applied research and invited Jane Tinkler from the London School of Economics to give a plenary lecture on how to assess the impact of social science research on policy decisions. She stressed that it often takes 15–20 years to see the effects of academic research in the real world. Her talk inspired the lead author to ask how research in the geosciences diffuses within academia and, from there, into industry. Why is it important to understand how ideas are adopted by industry? Because in the future, in addition to publishing in top journals, academics will probably need to demonstrate that their research is generating innovations to fuel national economies. For example, the Australian government has been funding a national survey since 2001 to collect data on the commercialization of the results of publicly funded research, especially their impact on intellectual property.

Since the pioneering work of Schumpeter in the 1940s, economists have agreed that a large component of modern economic growth has been driven by “innovation”, that is, the arrival of new ideas. Nowadays, most papers on the relationship between scientific research and innovation use citation data to measure the production of new ideas in science and patent data to measure the creation of new potentially successful commercial ideas. Patents have become particularly important in this context for three reasons (Agrawal and Henderson 2002):

  • The patenting process requires that inventors’ names, dates, assignee institutions, locations and detailed descriptions of the invention’s claims be recorded. Innovation-related details are rarely recorded systematically outside of patent records.

  • Innovations that are patented are expected to be commercially useful.

  • Patenting data has recently become available in machine-readable form.

This approach has proved very fruitful in fields where the technology is evolving rapidly and where patents protect their inventors, for example, pharmacy and biotechnology, nanotechnologies, and wind and solar power generation. But it is not pertinent in sectors where patents are less common and where the transfer of new ideas from academia to industry follows different channels (Zellner 2003; Martin and Tang 2007; Moser 2012; Maietta 2015). Geosciences is one such domain.

In order to discover how ideas diffuse within academia and from there into industry, we chose to focus on a specific new method (plurigaussian simulations) which was invented in France in the 1990s for simulating the internal architecture of oil reservoirs (Galli et al. 1994; Armstrong et al. 2011). It rapidly proved useful in other domains in the earth sciences: mining, hydrology and history matching. In the first part of this chapter, after collecting citation data from Google Scholar, we use complex dynamic networks first developed in statistical mechanics to track the diffusion of the method in the academic world. In the second half of the chapter we study how this method moved into industry.

The chapter is divided into five sections. The next one (Sect. 9.2) is a literature review on complex dynamic networks, especially citation networks. In Sect. 9.3 this technique is applied to our citation network for plurigaussian simulations. As only 9 out of the 550 citations were patents, these were not the vector in transferring the method into industry. In Sect. 9.4 we identify three key indicators showing how this innovation was incorporated in industry. Our conclusions follow in Sect. 9.5.

2 Review of Complex Networks

Over the past 30 years the methods developed by physicists for studying networks in statistical mechanics have been adapted to analyzing other types of networks including the world-wide web (Broder et al. 2000; Albert 1999, 2000), power grids (Watts and Strogatz 1998), telephone call grids (Abello et al. 1998) and airline timetables (Amaral et al. 2000). Newman (2001) and Barabasi et al. (2002) both studied citation networks in which the authors were the nodes in the network and a link was formed between two authors when they co-authored a paper. Newman (2001) studied four such collaboration networks:

  1. Los Alamos e-print Archive: a database of unrefereed preprints in physics submitted by the authors from 1992 to 2000;

  2. Medline: a database of articles on biomedical research published in refereed journals from 1961 to 2000. The entries are submitted by maintainers rather than by the papers’ authors, giving it greater coverage;

  3. Stanford Public Information Retrieval System (SPIRES): a database of preprints and published papers in high-energy physics;

  4. Networked Computer Science Technical Reference Library (NCSTRL): a database of preprints in computer science, submitted by participating institutions and stretching back about 10 years from 2000.

Although the databases went back earlier, Newman limited his study to the window from 1995 to 1999 in order to obtain a good static snapshot of the conditions at that time. In contrast, Barabasi et al. (2002) studied the evolution over time of patterns of collaboration in two specific fields, mathematics and neuroscience, over the period from 1991 to 1998, using databases consisting of 70,975 authors and 70,901 papers for mathematics and 209,293 authors and 210,750 papers for neuroscience.

By 2000, theoretical and empirical studies had uncovered three important results: firstly, most networks have the so-called small-world property, which means that the average separation between nodes is rather small; secondly, real networks display a higher degree of clustering than expected for purely random networks; and finally, the degree distribution follows a scale-free power-law form (Barabasi et al. 2002). Initially it had been expected that the Web would be a random network like those characterized by Erdos and Renyi (1959). In that case the probability of any two nodes being connected is constant, most nodes have a degree (number of connections) close to the average, and the degree distribution is approximately Poisson, with an exponentially decaying tail. Albert et al. (1999) showed that the degree distribution for the Web instead follows a power law, which means that a few nodes are highly connected while the vast majority have a smaller degree than average.
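The contrast between the two kinds of degree distribution can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: it compares a random graph with constant link probability against a network grown by preferential attachment, a well-known growth rule that produces heavy-tailed degrees but is not a method specified in this chapter. The function names, the network size and the seed are all hypothetical choices.

```python
import random

def random_graph_degrees(n, p, rng):
    """Random graph: every pair of nodes is linked with the same probability p."""
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                degree[i] += 1
                degree[j] += 1
    return degree

def preferential_attachment_degrees(n, m, rng):
    """Growing network (assumed rule): each new node links to m distinct
    existing nodes chosen with probability proportional to their degree."""
    # Start from a complete graph on m + 1 nodes, so every node has degree m.
    degree = [m] * (m + 1)
    # 'stubs' holds each node once per edge end, so a uniform draw from it
    # is a degree-proportional draw over nodes.
    stubs = [i for i in range(m + 1) for _ in range(m)]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        degree.append(m)
        for t in targets:
            degree[t] += 1
            stubs.extend([t, new])
    return degree

def share_below_mean(degrees):
    """Fraction of nodes whose degree is strictly below the average degree."""
    mean = sum(degrees) / len(degrees)
    return sum(1 for d in degrees if d < mean) / len(degrees)

if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed, arbitrary choice
    n, m = 1000, 2                  # illustrative sizes only
    er = random_graph_degrees(n, 2 * m / (n - 1), rng)   # same mean degree
    pa = preferential_attachment_degrees(n, m, rng)
    print("random graph: max degree", max(er), "share below mean", share_below_mean(er))
    print("pref. attach: max degree", max(pa), "share below mean", share_below_mean(pa))
```

In runs of this kind the random graph keeps all degrees close to the mean, whereas the grown network typically shows a few hubs and a majority of nodes below the average degree, which is the qualitative pattern described above.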

By computing the statistics of the number of authors per paper, the number of papers per author and the number of collaborators per author in various fields, Newman (2001) confirmed that their distributions follow a power-law form. All the networks contain a giant component of scientists, any two of whom can be connected by a shortest path of intermediate collaborators.
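The notions of a giant component and of a shortest path of intermediate collaborators can be made concrete with a breadth-first search. The following is a minimal sketch on an invented toy data set (the author labels "A" to "H" are hypothetical); it is not Newman's code, only an illustration of the two quantities.

```python
from collections import deque
from itertools import combinations

def coauthor_graph(papers):
    """Build an undirected graph: authors are nodes, co-authorship is a link."""
    graph = {}
    for authors in papers:
        for a in authors:
            graph.setdefault(a, set())   # solo authors become isolated nodes
        for a, b in combinations(authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def components(graph):
    """Return the connected components, largest first."""
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        comp, queue = {start}, deque([start])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in seen:
                    seen.add(w)
                    comp.add(w)
                    queue.append(w)
        result.append(comp)
    return sorted(result, key=len, reverse=True)

def separation(graph, source, target):
    """Number of co-authorship links on a shortest path, or None if unconnected."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if v == target:
            return dist[v]
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return None

if __name__ == "__main__":
    # Hypothetical author lists, one per paper.
    toy_papers = [["A", "B"], ["B", "C"], ["C", "D", "E"], ["F", "G"], ["H"]]
    g = coauthor_graph(toy_papers)
    print("largest component:", sorted(components(g)[0]))
    print("separation A-E:", separation(g, "A", "E"))
```

Averaging `separation` over all pairs inside the largest component gives the average separation that the small-world property refers to.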

3 Network Analysis of Google Citations of Plurigaussian Simulations

The first step in our study consisted of collecting all the publications up to December 2015, found by Google Scholar for the term “Plurigaussian simulations”. A total of 555 references were obtained. Google Scholar had ordered them from the most relevant to the least (as determined by its algorithm). They include journal articles, working papers, doctoral and master’s theses, final year projects, patents and the two books on Plurigaussian Simulations together with chapters from the books which are sold separately by the publishers. These citations can be split into four groups:

  1. (1)

    Pertinent documents which develop the theory, or report case studies;

  2. (2)

    Papers which mention that plurigaussian simulations could be used to model the internal architecture of reservoirs or orebodies but which prefer to use another method (usually multipoint geostatistics);

  3. (3)

    Papers which mention plurigaussian simulations briefly. For example, Laigle et al. (2013) commented in their concluding paragraph that “Another use would be to constrain geostatistical simulations by the model results, e.g., training maps for multipoint or plurigaussian methods”.

  4. (4)

    Papers which do not mention plurigaussian simulations at all.

Of the original 555 references, 307 fell into the first category, 166 into either the second or third while 82 fell into the fourth category. The last group were eliminated from further study. For the 473 references in the first 3 categories, we noted the information listed in Table 9.1. Table 9.2 summarises the statistics of applications in the four main domains.

Table 9.1 Information noted for each of the 473 documents

Table 9.2 Results for the four main applied fields

The main observations were:

  • Most papers were written by teams of authors (more than 3 per paper on average, up to 10 for some oil papers). Papers by single authors were usually dissertations. This confirms the finding by Wuchty et al. (2007) that papers are now being produced by teams of authors; solo papers are getting rarer.

  • International cooperation was a common feature: 28% of papers on oil, 21% for mining and 17% for water and history matching.

  • Many papers had authors from companies or consulting firms (57.8% for oil; 35.2% for mining; 23.8% for history matching) but far fewer for water (only 9.2%), probably because water is a public good whereas mining and oil companies are designed to make a profit.

  • Countries with strong mining and petroleum industries were well-represented amongst the papers.

  • Migration by scientists was a factor that accounted for the excellence of some countries.

  • Surprisingly, only 9 of the documents were patents, and these were all in the petroleum sector (either oil or history matching).

3.1 Building a Citation Network

In contrast to Newman (2001) and Barabasi et al. (2002), who built their networks by considering authors as nodes and linking those who had joint papers, we constructed the plurigaussian network by considering each publication as a node, with an edge between two publications when one cites the other. This produces a directed network. Our network (Fig. 9.1) is displayed with different colours for the different fields of application: black for oil, mauve for mining, blue for water, red for history matching, green for agriculture, mustard for soil science and white for others. As expected, publications in the same field tend to be clustered together in the network.
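The construction just described can be sketched in a few lines. This is a minimal illustration, not the code used for the chapter: the record layout (a `field` label and a `cites` list per publication) and the identifiers `P1` to `P5` are assumptions made for the example.

```python
from collections import Counter

def build_citation_network(publications):
    """Return directed edges (citing, cited), keeping only citations
    whose target is itself in the collected corpus."""
    edges = []
    for pub_id, record in publications.items():
        for cited in record["cites"]:
            if cited in publications and cited != pub_id:
                edges.append((pub_id, cited))
    return edges

def same_field_share(publications, edges):
    """Fraction of citations that link two publications of the same field."""
    if not edges:
        return 0.0
    same = sum(1 for src, dst in edges
               if publications[src]["field"] == publications[dst]["field"])
    return same / len(edges)

def in_degree(edges):
    """Number of citations received by each publication within the corpus."""
    return Counter(dst for _, dst in edges)

if __name__ == "__main__":
    # Hypothetical corpus: identifiers, fields and citations are invented.
    toy = {
        "P1": {"field": "oil", "cites": []},
        "P2": {"field": "oil", "cites": ["P1"]},
        "P3": {"field": "mining", "cites": ["P1"]},
        "P4": {"field": "mining", "cites": ["P3", "P1"]},
        "P5": {"field": "water", "cites": ["P1", "X9"]},  # X9 is outside the corpus
    }
    edges = build_citation_network(toy)
    print("edges:", edges)
    print("same-field share:", same_field_share(toy, edges))
    print("citations received:", dict(in_degree(edges)))
```

A same-field share well above what random linking would give is one simple way to quantify the clustering by field that is visible in the figure.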

Fig. 9.1 The citation network for plurigaussian simulations, with colours indicating the field of application: black for oil, mauve for mining, blue for water, red for history matching, green for agriculture, mustard for soil science and white for others. The size of each node is proportional to its rank according to PageRank (left panel) or Betweenness centrality (right panel)

As the network is composed of about 500 publications, it is interesting to know which nodes are the most important. Centrality measures answer this question using only the topology of the network. Here we used two measures: PageRank and Betweenness centrality. PageRank (Page et al. 1999) evaluates the importance of a node from the number of edges pointing to it, weighted by the importance of the nodes those edges come from, whereas Betweenness centrality (Freeman 1977) measures how often a node lies on the shortest paths between other pairs of nodes. Figure 9.1 shows the network of plurigaussian simulations when the node size is proportional to PageRank centrality (left panel) and Betweenness centrality (right panel). At first glance the two panels look very similar, but there are differences in the importance of some of the nodes, as can be seen in Table 9.3, which lists the ten most important publications according to these two centrality measures.
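To show how the two measures can rank the same nodes differently, here is a minimal pure-Python sketch. It assumes edges run from the citing publication to the cited one, uses a damping factor of 0.85 for PageRank (a conventional value, not one stated in this chapter) and computes unnormalised betweenness with Brandes' algorithm. The five-node network and its labels are invented for illustration.

```python
from collections import deque

def pagerank(nodes, edges, damping=0.85, iterations=100):
    """Power iteration; nodes without outgoing edges spread their rank evenly."""
    out = {n: [] for n in nodes}
    for src, dst in edges:
        out[src].append(dst)
    size = len(nodes)
    rank = {n: 1.0 / size for n in nodes}
    for _ in range(iterations):
        dangling = sum(rank[n] for n in nodes if not out[n])
        new = {n: (1 - damping) / size + damping * dangling / size for n in nodes}
        for n in nodes:
            for dst in out[n]:
                new[dst] += damping * rank[n] / len(out[n])
        rank = new
    return rank

def betweenness(nodes, edges):
    """Unnormalised betweenness on a directed, unweighted graph (Brandes)."""
    out = {n: [] for n in nodes}
    for src, dst in edges:
        out[src].append(dst)
    score = {n: 0.0 for n in nodes}
    for s in nodes:
        dist = {s: 0}
        sigma = {n: 0 for n in nodes}     # number of shortest paths from s
        sigma[s] = 1
        preds = {n: [] for n in nodes}    # predecessors on shortest paths
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in out[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {n: 0.0 for n in nodes}
        for w in reversed(order):         # accumulate dependencies backwards
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

if __name__ == "__main__":
    # Hypothetical network: A is a founding paper, C a widely used follow-up.
    nodes = ["A", "B", "C", "D", "E"]
    edges = [("B", "A"), ("C", "A"), ("C", "B"), ("D", "C"), ("E", "C")]
    pr, bt = pagerank(nodes, edges), betweenness(nodes, edges)
    for n in nodes:
        print(n, round(pr[n], 3), bt[n])
```

In this toy network the founding paper A comes out on top under PageRank, while the intermediate paper C, through which later papers reach the earlier ones, leads on betweenness. This is the kind of difference between the two rankings that the text describes.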

Table 9.3 Rank of publications according to two centrality measures, namely PageRank and Betweenness

4 Diffusion of the New Method into Industry

In our analysis of the citation network we had been surprised to find so few patents (only 9 out of 550). Moreover, these only started in 2006 (i.e. 10 years after the invention of the method). This was because software could not be patented before then (see Appendix 9.1 for more detail on this). As patent data could not be used to determine when the method actually reached industry, we needed some other criteria. Based on Tijssen et al. (2009), we used the following:

  • One of the authors comes from a mining or an oil company, or

  • One of the authors comes from a software vendor or a consulting group

It is important to distinguish between the two. Resource companies like Shell or Chevron, or Rio Tinto or Anglo-American are “end-users” whereas consultants and software vendors transfer the idea to end-users, so their business plans are quite different.

The citations came from four main applied fields (oil, mining, water resources and history matching). Looking back at Table 9.2, very few papers in water resources had an author from a company or a consulting firm (only 9.2%) compared to 57.8% for oil, 35.2% for mining and 23.8% for history matching. This is probably because water is a public good that generates relatively small profits compared to the oil industry or mining.

4.1 Co-authors and Repeat Co-authors from Industry

Although having a co-author from a company or a consulting group shows that the company is interested in the new technique, it does not tell us whether they have effectively adopted it. In some cases, co-authoring a paper with an academic is rather like “window-shopping”. It allows the company to test a new method on a case-study but adopting it as a standard procedure requires more time and effort (Martin and Tang 2007). Table 9.4 lists the companies and consultants which had co-authored more than 1 paper together with the number of papers, for each type of application. In applications to oil, seven companies and consulting groups had co-authored two or more papers, compared to 11 which had contributed to only 1; similarly five mining companies had co-authored two or more papers, compared to 8 which contributed to only 1 paper. It would be interesting to know what happened to the 11 oil companies that only participated in 1 paper, and likewise for the 8 mining companies. Did they lose interest in the method after an initial test study? Or did they decide to train their personnel or to outsource studies to consultants?

Table 9.4 Companies and consultants which had co-authored more than 1 paper, together with the number of papers, for each type of application
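The split between companies that co-authored a single paper and those that co-authored two or more amounts to a simple count per company and per field. The sketch below assumes each paper is recorded as a field label plus a list of industrial affiliations; that layout and the company names are hypothetical and are not taken from the chapter's data.

```python
from collections import Counter

def repeat_coauthors(papers):
    """Group companies by field into 'single' (1 paper) and 'repeat' (2 or more).

    papers: iterable of (field, [company names appearing among the authors]).
    """
    counts = Counter()
    for field, companies in papers:
        # A company is counted once per paper, however many of its staff sign it.
        for company in set(companies):
            counts[(field, company)] += 1
    summary = {}
    for (field, company), n_papers in counts.items():
        bucket = "repeat" if n_papers >= 2 else "single"
        groups = summary.setdefault(field, {"repeat": [], "single": []})
        groups[bucket].append(company)
    for groups in summary.values():
        groups["repeat"].sort()
        groups["single"].sort()
    return summary

if __name__ == "__main__":
    # Invented records for illustration only.
    toy_papers = [
        ("oil", ["Company A", "Consultant X"]),
        ("oil", ["Company A"]),
        ("oil", ["Company B"]),
        ("mining", ["Company C"]),
        ("mining", ["Company C", "Company C"]),
        ("water", []),
    ]
    for field, groups in repeat_coauthors(toy_papers).items():
        print(field, groups)
```

Counting each company once per paper keeps a paper signed by several employees of the same firm from being mistaken for repeat co-authoring.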

4.2 Surveys of Academics and Consultants

The last part of the study consisted of a survey to find out (a) which companies had started training their personnel by sending them to short courses or to postgraduate and masters courses, and (b) which were outsourcing studies. While there are clear limitations to what can be obtained from voluntary declarations because people tend to bias their answers and while our survey was far from exhaustive, the results give us some ideas about what has happened.

Three groups (the IFP at Rueil-Malmaison, the CG at Fontainebleau, and Jeffrey Yarus and Rich Chambers in the USA) ran extensive programs of short courses. Table 9.5 lists the short courses on truncated gaussian and plurigaussian simulations given by Christian Ravenne and Brigitte Doligez, both of the IFP. The Centre de Géostatistique was also active in giving short courses, often as pre-conference courses or in-house for oil companies. The consulting and software company Geovariances regularly gives a 5-day course on conditional simulations applied to mining and a 3-day course on advanced geostatistics for reservoir characterization. Both have modules on plurigaussian simulations. From 2000 to 2006, Jeffrey Yarus and Rich Chambers gave 4–5 courses per year through the Nautilus Training Organization and two more per year in Abu Dhabi for Schlumberger. After joining Landmark, they continued giving courses in Houston and London each year.

Table 9.5 Short Courses on the truncated gaussian method and on plurigaussian simulations by Christian Ravenne who was a geologist at the IFPEN before his retirement, and more recently by Brigitte Doligez, who is also a geologist at the IFPEN

Most postgraduate geostatistics courses have modules on simulation. Some students choose this topic for their project or thesis. The Ecole des Mines de Paris has been running a 9-month postgraduate geostatistics course called the CFSG since 1980. The last 3 months are devoted to a personal project on a real case-study, usually provided by the company sponsoring the student. Similarly, final year undergraduates and masters students have carried out studies on plurigaussian simulations at the University of Chile, at Edith Cowan University (Western Australia), at the University of Adelaide (South Australia) and at the federal university UFRGS (Rio Grande do Sul, Brazil), to mention just a few. As most of these studies are confidential, Google Scholar cannot find them. Table 9.6 lists the titles of projects that involved plurigaussian simulations and were carried out at various universities. One interesting feature is the number of studies that used data from the South American mining companies Codelco and Vale, which were absent from the list of “repeat co-authors”.

Table 9.6 List of the titles of confidential reports on plurigaussian simulations by students at various universities

Lastly, the consulting arm of the IFPEN, Beicip-Franlab, kindly provided us with a list of the consulting projects involving plurigaussian simulations that they have carried out for clients (Table 9.7). The range of companies involved is striking. Almost all of them are national oil companies, many located in the Middle East.

Table 9.7 Consulting studies involving plurigaussian simulations carried out by the consulting arm of the IFPEN, Beicip-Franlab, from 2000

Looking through these three tables, it is clear that the publications found by Google Scholar are really only the tip of the iceberg. Underneath, there are many unpublished dissertations and project reports by final year and masters level students which remain confidential, in contrast to PhD theses which are usually available on the internet. Most of these final year and masters dissertations were carried out on company data by a student who had been given time off work to study. We believe that these studies are a key step in getting new methods into regular use in industry. This suggests that university assessments should take account of final year projects and master’s level dissertations, which is not the case at present in most countries, because they are one of the key channels for transferring innovations into industry, at least as far as the earth sciences are concerned.

5 Conclusions and Perspectives for Future Work

Plurigaussian simulations were developed in France in the mid-1990s for simulating the internal architecture of oil reservoirs in order to better predict oil and gas production. Although they were originally designed for the petroleum industry, they rapidly found applications in mining and hydrology and then for history matching in the oil industry. From France the technique diffused to other European countries, then to countries like the USA, Brazil and Chile.

This chapter uses complex dynamic networks to describe how the method diffused within the academic community. Citations found using Google Scholar for the term “plurigaussian simulations” were used to track its diffusion within academia. In contrast to most citation networks, where the nodes are the authors of papers and a link corresponds to co-authoring, in our network the papers themselves are the nodes, which are linked when one paper cites another.

Papers were split according to the domain of the application: oil, mining, water or history matching. As expected, we found that

  • Most papers were written by teams of authors (more than 3 per paper on average). Papers by single authors were usually dissertations.

  • International cooperation was a common feature: 28% of papers on oil, 21% for mining and 17% for water and history matching.

  • Many papers had authors from companies or consulting firms (57.8% for oil; 35.2% for mining; 23.8% for history matching) but far fewer for water (only 9.2%), probably because water is a public good whereas mining and oil companies are designed to make a profit.

  • Countries with strong mining and petroleum industries were well-represented amongst the papers.

  • Migration by scientists was a factor that accounted for the excellence of some countries.

To our surprise there were few patents (only 9 out of 550) and these only started in 2006 (i.e. 10 years after the initial discovery). It turned out that software could not be patented before then. Studies on innovation consider that the presence of an author from industry demonstrates that company’s interest in the innovation under study. In the earth sciences, companies often co-author papers in order to test new methods on their own data.

One of the main contributions of our chapter is to identify this “window-shopping effect”. We consider that co-authoring a single paper does not necessarily mean that the company has really adopted the method. More effort is required to absorb new methods. Instead, we postulate that co-authoring a second paper indicates a more serious interest: we call this “repeat co-authoring”. We found that seven oil companies and consulting groups had co-authored two or more papers compared to 11 which had contributed to only 1; similarly, five mining companies had co-authored two or more papers compared to 8 which contributed to only 1 paper. It was surprising not to see South American mining companies such as Codelco and Vale among the mining companies. We were also curious to find out whether the 11 oil companies and 8 mining companies that only co-authored 1 paper had lost interest in the method, had trained staff to carry out studies for them, or had commissioned consultants to do them.

To find out what happened we carried out a survey of academics, end-users in companies and consultants. Clearly there are limitations to what can be obtained from voluntary declarations; people may bias their answers but the survey gave us some ideas about what had happened. The key results were:

  • Companies like Codelco and Vale had been active in providing data for final year and master’s level projects, but had not shown up as “repeat co-authors”.

  • A wide range of oil companies that had not published papers had chosen to provide in-house courses for personnel or had commissioned studies from Beicip-Franlab, the consulting division of the IFP.

5.1 What Lessons Can Be Learned from the Study for Policy-Makers

Firstly, while studies on patents can be very effective for assessing the industrial impact of new discoveries in some fields, they would have completely missed the target in this field, for two reasons: it was not possible to patent software developments until after 2005 and, even after that date, the new developments in mining software for these simulations were not patented.

Citation networks proved to be more effective than patents in this field. They allowed us to track the development of plurigaussian simulations within four different but inter-related academic domains and to industrial partners who publish in journals with academics. But even citations do not really allow us to get past the superficial “window-shopping” aspect of publications. Studying “repeat co-authoring” provides more in-depth insights; surveys of users give a clearer picture of whether companies are actually implementing new methods.

As Martin and Tang (2007) noted, firms and other users need to expend considerable effort to exploit scientific knowledge. In order to develop the in-house capability to carry out plurigaussian simulations, they need to acquire software and to train personnel. This study highlights the importance of postgraduate training and masters’ theses in transferring know-how and implicit knowledge to industry. The role of these courses in technology transfer to industry is undervalued in the current procedures for evaluating university departments.