What do firms know? What do they produce? A new look at the relationship between patenting profiles and patterns of product diversification


In this work, we analyze the relationship between the patterns of firm diversification, if any, across product lines and across bodies of innovative knowledge, proxied by the patent classes in which the firm is present. Put more emphatically, we investigate the relationship between “what a firm does” and “what a firm knows.” Using a newly developed dataset matching information on patents and products at the firm level, we provide evidence on firms’ technological and product scope, on the relationship between the two, and on the size-scaling and coherence properties of diversification itself. Our analysis shows that firms are typically much more diversified in terms of products than in terms of technologies, with their main products more closely related to the exploitation of their innovative knowledge. The scaling properties show that the number of products and technologies increases log-linearly with firm size. Moreover, the directions of diversification display coherence between neighboring activities even at relatively high degrees of diversification. These findings are well in tune with a capability-based theory of the firm.



  1. On the limitations, but also the relevance, of using patents as a proxy for innovation, see the critical review by Griliches (1990).

  2. See, in particular, the evidence presented in Dernis et al. (2015) on the patenting strategies of the world’s top corporate R&D investors across the five top intellectual property offices (IP5), which include, in addition to the USPTO and the EPO, the Japan Patent Office (JPO), the Korean Intellectual Property Office (KIPO) and the State Intellectual Property Office of the People’s Republic of China (SIPO).

  3. Notice that a single patent may be owned by more than one firm. In the case of co-patenting, our analysis credits the patent to each co-patentee, as is usually done in the literature (see, for example, Breschi et al. 2003).

  4. Information on total turnover is available only in 2000 and 2003.

  5. For more on the dataset, see Grazzi et al. (2013).

  6. Notice that figures for more recent years are less reliable due to the time lag between a patent application and its grant.

  7. See in particular the Schmoch et al. (2003) concordance table.

  8. In what follows, we will assume that “products” and “technological fields” map one-to-one into each other (a 4-digit ISIC Rev.3 sector).

  9. For a detailed description of the transaction-level trade data and the product classification employed, see Bernard et al. (2015).

  10. We focus on exports to extra-EU destinations for several reasons. Most importantly, firm-level exports to the EU are not recorded for all exporters, and the recording criteria have changed over time.

  11. See Hirsch and Lev (1971) and Kim et al. (1993) for empirical evidence on the negative relationship between firm diversification and sales volatility.

  12. The matrix is calculated taking into account only firms which are active in at least two products (technological fields), and only products (technological fields) in which at least two firms are active. The resulting matrix is 136 × 136 in the case of products (taking into account 2571 firms) and 102 × 102 in the case of technological fields (taking into account 940 firms).

  13. When making inference about skewed distributions, such as the distributions of firms across products or technological fields, inference based on P values is to be preferred to statistics-based inference, such as the t-statistics used in Teece et al. (1994). The point is discussed at greater length in Bottazzi and Pirino (2010).
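
The sample restriction described in note 12 can be illustrated with a minimal sketch. It rests on assumptions that are not stated in the text: that the matrix is a count of firms jointly active in each pair of activities, and that the two filters are applied repeatedly until both hold at once. The function name `cooccurrence_matrix` and the toy data are hypothetical, and this is not the routine of Bottazzi and Pirino.

```python
import numpy as np

def cooccurrence_matrix(incidence):
    """Filter a firms x activities 0/1 array and count joint occurrences.

    Assumption: filters are iterated to a fixed point so that every kept
    firm has >= 2 kept activities and every kept activity has >= 2 kept firms.
    """
    B = (np.asarray(incidence) > 0).astype(int)
    firm_idx = np.arange(B.shape[0])   # original row labels of kept firms
    act_idx = np.arange(B.shape[1])    # original column labels of kept activities
    while True:
        keep_firms = B.sum(axis=1) >= 2
        keep_acts = B[keep_firms].sum(axis=0) >= 2
        if keep_firms.all() and keep_acts.all():
            break
        B = B[keep_firms][:, keep_acts]
        firm_idx = firm_idx[keep_firms]
        act_idx = act_idx[keep_acts]
    # Entry (i, j): number of firms active in both activity i and activity j;
    # the diagonal holds the number of firms active in each activity.
    J = B.T @ B
    return J, firm_idx, act_idx

# Toy data (hypothetical): 4 firms, 4 activities.
toy = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 1],
]
J, firms, acts = cooccurrence_matrix(toy)
```

In the toy data, dropping the single-activity firm and the thinly populated activities leaves a firm with only one remaining activity, which is why the sketch repeats the filters rather than applying each once. Whether the original computation does the same is not specified in the note.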


We thank the participants at the Conference on Entrepreneurship, Innovation and Enterprise Dynamics organized by the OECD Working Party on Industry Analysis (WPIA), Paris, December 2014; the Finkt final conference in Rimini (2015); the CAED conference in Istanbul (2015); the XXII Organization Science Winter Conference, Park City (2016); and participants at faculty seminars at Notre Dame University and at SPRU, University of Sussex. We also thank Giulio Bottazzi and Davide Pirino for allowing us to use the computing routines associated with their coherence measure. Marco Grazzi gratefully acknowledges the Centre for Business Research at the University of Cambridge, and in particular Andrea Mina, for a very fruitful visiting period, and the Fondazione Cassa dei Risparmi di Forlì, Grant ORGANIMPRE, for financial support. Daniele Moschella received financial support from the Italian Ministry of Education and Research under the SIR Programme (Project code RBSI14JAFW). The Project has been partly supported by the European Commission under the H2020 RIA (Grant Agreement 649186). Without the unique backing of the Italian Statistical Office, and in particular Roberto Monducci, this whole endeavor would not have been possible. The usual disclaimer applies.

