
Playing the ‘Name Game’ to identify academic patents in Germany


Abstract

Identifying academic inventors is crucial for reliable assessments of academic patenting and for understanding patent-based university-to-industry technology transfer. It requires solving the “who is who” problem at the individual inventor level. This article describes data collection and matching techniques applied to identify academic inventors in Germany. To manage the large dataset, we adjust a matching technique applied in prior research by comparing the inventor and professor names in the first step after cleaning. We also suggest a new approach for determining the similarity score. To evaluate our methodology we apply it to the EP-INV-PatStat database and compare its results to alternative approaches. For our German data, results are less sensitive to the choice of name comparison algorithm than to the specific filtering criteria employed. Restricting the search to EPO applications or identifying inventors by professor title underestimates academic patenting in Germany.


Notes

  1. Improving data and methods to deal with these challenges was the objective of the project “Academic Patenting in Europe – Inventor Database”, which was supported by the European Science Foundation and provided the context for the present study. Results from this project have been published in Lissoni (2013).

  2. The country of residence is identified by using the country code information provided by PatStat.

  3. The standardized name id is provided by PatStat (doc_std_name_id) and allocates a standardized name to applicants and inventors with exactly the same or very similar names. However, the limited quality of this id does not allow us to use it as a unique inventor id.

  4. Raffo and Lhuillery (2009) examined patents by professors at Ecole Polytechnique Fédérale de Lausanne (EPFL).

  5. Using European patent data is advantageous because more complete address information is available in PatStat. Addresses are provided for 94 % of the inventors listed on EPO patents but only for 0.2 % of German patent inventors.

  6. For a more detailed description of the cleaning stage please refer to Appendix 1.

  7. To calculate the Jaccard similarity coefficient, we used the Microsoft Fuzzy Lookup add-in for Microsoft Excel (downloadable at: http://www.microsoft.com/en-us/download/details.aspx?id=15011, last accessed on February 8, 2014).
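
For illustration only, a minimal sketch of a Jaccard coefficient computed on character trigrams of two cleaned name strings; the actual computation in our study relied on the Fuzzy Lookup add-in, whose internal tokenization may differ, so the n-gram choice below is an assumption:

```python
def jaccard_similarity(name_a: str, name_b: str, n: int = 3) -> float:
    """Jaccard coefficient on character n-grams of two cleaned name strings."""
    def ngrams(s: str) -> set:
        s = " ".join(s.upper().split())  # normalize case and internal whitespace
        return {s[i:i + n] for i in range(max(len(s) - n + 1, 1))}

    a, b = ngrams(name_a), ngrams(name_b)
    return len(a & b) / len(a | b) if (a | b) else 0.0


# A misspelled inventor name still receives a high score against the professor list.
print(jaccard_similarity("MUELLER HANS", "MUELER HANS"))
```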

  8. The matching procedure allows us to account for misspelled names and consequently reduces false negative matches (type II errors). As discussed before, the algorithm has the following disadvantage: it also produces high similarity scores when a PROFLIST name is contained within a longer name stored in DE-INV, or vice versa. This results in false positive matches (type I errors). To reduce the false positives, we delete matches in which the length of the professor's first name and surname differs from the length of the inventor's first name and surname by more than two characters.
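
A minimal sketch of this length filter, under one possible reading in which a candidate match is discarded as soon as either the first-name or the surname length difference exceeds two characters (the field names are hypothetical):

```python
def passes_length_filter(prof_first: str, prof_last: str,
                         inv_first: str, inv_last: str,
                         max_diff: int = 2) -> bool:
    """Keep a candidate match only if first-name and surname lengths are close."""
    return (abs(len(prof_first) - len(inv_first)) <= max_diff
            and abs(len(prof_last) - len(inv_last)) <= max_diff)


# "HANS-PETER" vs "HANS" differ by six characters, so this match would be deleted.
print(passes_length_filter("HANS-PETER", "MUELLER", "HANS", "MUELLER"))  # False
```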

  9. We used a slightly modified procedure for the simple-string name comparison. Instead of comparing the full name strings, we first separated them into their components and then recombined these components into first name and surname combinations. These name combinations were used for the simple-string name comparison.
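
A rough illustration of this recombination step under our reading of the procedure; the helper functions and the way the combinations are formed are assumptions, not the exact implementation:

```python
from itertools import permutations


def name_combinations(full_name: str) -> set:
    """Split a cleaned name into components and build first-name/surname pairs."""
    parts = full_name.upper().split()
    return {f"{first} {last}" for first, last in permutations(parts, 2)}


def simple_string_match(inventor: str, professor: str) -> bool:
    """Exact match on any first-name/surname combination of the two names."""
    return bool(name_combinations(inventor) & name_combinations(professor))


print(simple_string_match("SCHMIDT PETER KARL", "PETER SCHMIDT"))  # True
```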

  10. The term “name group” denotes the inventors matched with a professor in the first step.

  11. The identified combinations of criteria generated values of three. We therefore conservatively assumed that true positive matches would produce values greater than four.

  12. Project on academic patenting in Europe (APE-INV) funded by the European Science Foundation (ESF).

  13. We also disregarded professors from German Federal Armed Forces universities and professors affiliated with the Charité medical center.

  14. We could have taken these steps at the beginning of matching; however, we included these professors in another sample. The results are available upon request.

  15. In contrast, we did not find statistically significant differences between the groups regarding forward or backward citations.

  16. The precision rate is computed by dividing the number of true positive matches by the sum of true positive and false positive matches. The recall rate is computed by dividing the number of true positive matches by the sum of true positive and false negative matches (Lissoni et al. 2010).
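
Expressed as formulas (standard definitions consistent with the description above, with TP, FP, and FN denoting true positives, false positives, and false negatives):

$$\text{precision} = \frac{TP}{TP + FP}, \qquad \text{recall} = \frac{TP}{TP + FN}$$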

  17. Cleaning was based on SQL scripts developed by Julio Raffo (http://wiki.epfl.ch/patstat/cleaning). We adapted the scripts and determined the order of execution; the individual steps were executed in the following order (a minimal sketch in code follows this list):

    1. Replacement of umlauts
    2. Elimination of invalid characters
    3. Replacement of accented letters
    4. Conversion to upper case
    5. Removal of title information and saving it in a separate variable
    6. Removal of double spaces as well as spaces at the beginning or end
    7. Removal of commas
    8. Replacement of abbreviations
    9. Removal of letters not included in the Latin alphabet
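
A minimal sketch of these nine steps in Python; the regular expressions and replacement tables are simplified stand-ins for the adapted SQL scripts, not the scripts themselves:

```python
import re

UMLAUTS = {"Ä": "AE", "Ö": "OE", "Ü": "UE", "ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}
ACCENTS = {"É": "E", "È": "E", "Á": "A", "À": "A", "Ç": "C"}          # excerpt only
TITLES = r"\b(PROF|DR|DIPL ING|ING)\b\.?"                             # illustrative
ABBREVIATIONS = {"GMBH": "GESELLSCHAFT MIT BESCHRAENKTER HAFTUNG"}    # illustrative


def clean_name(raw: str) -> tuple[str, str]:
    """Return the cleaned name and the extracted title information."""
    s = raw
    for k, v in UMLAUTS.items():                     # (1) replace umlauts
        s = s.replace(k, v)
    s = re.sub(r"[^\w\s,.'-]", " ", s)               # (2) drop invalid characters
    for k, v in ACCENTS.items():                     # (3) replace accented letters
        s = s.replace(k, v)
    s = s.upper()                                    # (4) convert to upper case
    titles = " ".join(re.findall(TITLES, s))         # (5) extract title information
    s = re.sub(TITLES, " ", s)
    s = re.sub(r"\s+", " ", s).strip()               # (6) collapse and trim spaces
    s = s.replace(",", " ")                          # (7) remove commas
    for k, v in ABBREVIATIONS.items():               # (8) replace abbreviations
        s = s.replace(k, v)
    s = re.sub(r"[^A-Z0-9 .'-]", "", s)              # (9) keep Latin alphabet only
    return re.sub(r"\s+", " ", s).strip(), titles


print(clean_name("Prof. Dr. Jürgen Müller"))         # ('JUERGEN MUELLER', 'PROF DR')
```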

References

  • Buenstorf, G., & Geissler, M. (2012). Not invented here: Technology licensing, knowledge transfer and innovation based on public research. Journal of Evolutionary Economics, 22, 481–511.

  • Cohen, W. M., Nelson, R. R., & Walsh, J. P. (2002). Links and impacts: The influence of public research on industrial R&D. Management Science, 48(1), 1–23.

  • Cohen, W. W., Ravikumar, P., & Fienberg, S. E. (2003). A comparison of string metrics for matching names and records. In Proceedings of the IJCAI-03 Workshop on Information Integration (pp. 73–78). Acapulco, Mexico.

  • Czarnitzki, D., Glänzel, W., & Hussinger, K. (2007). Patent and publication activities of German professors: An empirical assessment of their co-activity. Research Evaluation, 16(4), 311–319.

  • Czarnitzki, D., Hussinger, K., & Schneider, C. (2012). The nexus between science and industry: Evidence from faculty inventions. Journal of Technology Transfer, 37, 755–776.

  • De Rassenfosse, G., Schoen, A., & Wastyn, A. (2014). Selection bias in innovation studies: A simple test. Technological Forecasting and Social Change, 81, 287–299.

  • Etzkowitz, H., & Leydesdorff, L. (2000). The dynamics of innovation: From National Systems and "Mode 2" to a Triple Helix of university–industry–government relations. Research Policy, 29(2), 109–123.

  • Geuna, A., & Nesta, L. J. J. (2006). University patenting and its effects on academic research: The emerging European evidence. Research Policy, 35(6), 790–807.

  • Harhoff, D., Scherer, F. M., & Vopel, K. (2003). Citations, family size, opposition and the value of patent rights. Research Policy, 32(8), 1343–1363.

  • Lissoni, F. (Ed.) (2013). Special Issue: Academic patenting in Europe. Industry and Innovation, 20(5), 379–502.

  • Lissoni, F., Coffano, M., Maurino, A., Pezzoni, M., & Tarasconi, G. (2010). APE-INV's "Name Game" algorithm challenge: A guideline for benchmark data analysis & reporting.

  • Lissoni, F., Llerena, P., McKelvey, M., & Sanditov, B. (2008). Academic patenting in Europe: New evidence from the KEINS database. Research Evaluation, 17(2), 87–102.

  • Lissoni, F., Lotz, P., Schovsbo, J., & Treccani, A. (2009). Academic patenting and the professor's privilege: Evidence on Denmark from the KEINS database. Science and Public Policy, 36(8), 595–607.

  • Lissoni, F., Sanditov, B., & Tarasconi, G. (2006). The KEINS database on academic inventors: Methodology and contents. Working Paper, Università Commerciale Luigi Bocconi.

  • Mejer, M. (2011). Entrepreneurial scientists and their publication performance: An insight from Belgium. Working Paper, ECARES.

  • OECD. (2003). Turning science into business: Patenting and licensing at Public Research Organisations. Paris: OECD Publishing.

  • Raffo, J., & Lhuillery, S. (2009). How to play the "Names Game": Patent retrieval comparing different heuristics. Research Policy, 38(10), 1617–1627.

  • Schmoch, U. (2007). Patentanmeldungen aus deutschen Hochschulen. Studien zum deutschen Innovationssystem. Karlsruhe: Fraunhofer Institut System- und Innovationsforschung.

  • Schmoch, U., Dornbusch, F., Mallig, N., Michels, C., Schulze, N., & Bethke, N. (2012). Vollständige Erfassung von Patentanmeldungen aus Universitäten. Karlsruhe: Fraunhofer Institut System- und Innovationsforschung.

  • Schoen, A., & Buenstorf, G. (2013). When do universities own their patents? An explorative study of patent characteristics and organizational determinants in Germany. Industry and Innovation, 20(5), 422–437.

  • Thursby, J., Fuller, A. W., & Thursby, M. (2009). US faculty patenting: Inside and outside the university. Research Policy, 38(1), 14–25.

  • Trajtenberg, M., Shiff, G., & Melamed, R. (2006). The "Names Game": Harnessing inventors' patent data for economic research. Working Paper, National Bureau of Economic Research.

  • von Proff, S., Buenstorf, G., & Hummel, M. (2012). University patenting in Germany before and after 2002: What role did the professors' privilege play? Industry and Innovation, 19(1), 23–44.


Acknowledgments

The authors are grateful for comments received from the participants at the Name Game Workshops in Brussels and Leuven. Financial support by the European Science Foundation (project ESF-APE-INV) is gratefully acknowledged.

Author information

Corresponding author

Correspondence to Anja Schoen.

Appendices

Appendix 1

Cleaning

The cleaning stage was aimed at reducing noise in the relevant data without losing information needed in subsequent stages (Raffo and Lhuillery 2009). The number of false positive matches should be as low as possible. Raffo and Lhuillery (2009) tested the impact of different cleaning algorithms on the matching results, using the "simple-string match" algorithm as a standard. Their results showed that each cleaning algorithm by itself produces only a small improvement; applied together, however, the cleaning algorithms increased the precision rate by 7 % and the recall rate by 64 %.Footnote 16

Based on these results, the cleaning stage in this project comprises nine steps and was applied to DE-INV and PROFLIST. In addition to capitalization and comma deletion, these nine steps include eliminating title information from the name field and saving it in a separate file.Footnote 17 Replacing umlauts (ä, ö, ü) was especially important for Germany. Moreover, we separated the address field into street, postal code, and country information, and we extracted initials for names. Where only the postal code or only the city was specified, we supplemented the missing information.
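
A minimal sketch of the address split, assuming five-digit German postal codes and a "<street>, <postal code> <city>" layout; both the assumed field layout and the regular expression are our own simplifications, not the scripts actually used:

```python
import re

# German postal codes have five digits; the assumed input layout is
# "<street>, <postal code> <city>", which is not guaranteed for every record.
ADDRESS_RE = re.compile(r"^(?P<street>.*?),?\s*(?P<plz>\d{5})\s+(?P<city>.+)$")


def split_address(address: str) -> dict:
    """Split a raw address string into street, postal code, and city."""
    m = ADDRESS_RE.match(address.strip())
    if not m:
        return {"street": address.strip(), "plz": None, "city": None}
    return {k: v.strip() for k, v in m.groupdict().items()}


print(split_address("Arcisstrasse 21, 80333 Muenchen"))
# {'street': 'Arcisstrasse 21', 'plz': '80333', 'city': 'Muenchen'}
```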

Appendix 2

See Table 10.

Table 10 Identified criteria combinations inventor–inventor filtering


Cite this article

Schoen, A., Heinisch, D. & Buenstorf, G. Playing the ‘Name Game’ to identify academic patents in Germany. Scientometrics 101, 527–545 (2014). https://doi.org/10.1007/s11192-014-1400-x
