To examine the information transfer between the USPTO and retail investors, I estimate the following regression:
$$ \text{Log}(1 + \text{Local trading}_{i,j,t}) \mid \text{Patent}_{j,t} = \beta_{0} + \beta_{1}\,\text{Libraries}_{i,t} + \sum \text{Controls} + \sum \text{County}_{i} + \sum \text{Firm}_{j} \times \text{Date}_{t} $$
(1)
where Local trading is the local trading volume of retail investors in county i in response to firm j’s release of a patent at time t. The variable is constructed over a three-day window: all patents in my sample period are released on Tuesdays, so local trading volume covers trading from Tuesday to Thursday. Libraries is an indicator variable that takes the value of one if there is a PTDL in county i at time t and zero otherwise. The coefficient on Libraries captures the difference in trading volume after the release of a patent between counties with a PTDL and counties without one. I assume that investors who receive the patent information react similarly to the release of a patent across locations, since the information is identical in all locations. In other words, investors in Santa Clara County and New York County would react similarly. I also assume that investors are most likely to use the PTDL closest to them; for example, investors near Boston are more likely to use the resources at the Boston Public Library than those at other libraries. This assumption comports with the findings of Brown and Arshem (1993), who show that most visitors originate from the area near the PTDL. In addition, the setting requires that only a subset of local investors use the PTDL. These investors then disseminate the information, which ensures that it is transferred locally (Ivković & Weisbenner, 2007; Hong et al., 2005).
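The construction of the dependent variable can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the code used in the study: the function name, the layout of the trade records, and the example date are hypothetical, and the window is taken to be the release Tuesday plus the following two calendar days.

```python
import math
from datetime import date, timedelta

def three_day_log_volume(trades, release_date, window_days=3):
    """Return log(1 + dollar volume) traded from the release date onward.

    `trades` is a hypothetical list of (trade_date, dollar_amount) pairs for
    one county-firm pair. Absolute amounts are summed, so purchases and
    sales both add to volume.
    """
    window_end = release_date + timedelta(days=window_days - 1)
    volume = sum(
        abs(amount)
        for trade_date, amount in trades
        if release_date <= trade_date <= window_end
    )
    return math.log1p(volume)

release = date(1991, 1, 8)  # hypothetical release date (a Tuesday)
trades = [
    (date(1991, 1, 7), 900.0),    # Monday: before the release, excluded
    (date(1991, 1, 8), 500.0),    # Tuesday: included
    (date(1991, 1, 10), -250.0),  # Thursday: a sale, counted in absolute value
    (date(1991, 1, 11), 400.0),   # Friday: outside the window, excluded
]
value = three_day_log_volume(trades, release)  # log(1 + 750)
```

Using `math.log1p` keeps county-firm pairs with no trading in the sample, since they map to zero rather than to an undefined logarithm.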
I include four county characteristics to control for the decision to open a PTDL in a county. (1) I include Log(1 + County patents), which is defined as the natural logarithm of one plus the number of patents in thousands released in county i in the year of the event date. (2) I include Log(1 + County scientific value), which is defined as the natural logarithm of one plus the adjusted number of citations in thousands of patents released in county i in the year of the event date. (3) I include Log(1 + County economic value), which is defined as the natural logarithm of one plus the economic value in billions of patents released in county i in the year of the event date. (4) I include Log(County population), which is defined as the natural logarithm of the population in thousands in county i in the year of the event date. I also include Local firm, which is an indicator variable that takes the value of one if the headquarters of firm j is located in county i and zero otherwise. Please refer to the Appendix for descriptions of all variables.
I include county fixed effects to control for time-invariant county characteristics (e.g., larger counties might show stronger reactions than smaller ones). I include firm × date fixed effects to control for all fundamentals of the event (e.g., the scientific and economic value of the patent) and any simultaneous actions (e.g., positive or negative media coverage). I cluster standard errors by county, which allows each county to have its own unobserved effect on trading volume (Petersen, 2009).
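To make the role of the two sets of fixed effects concrete, the following sketch estimates the slope on Libraries after absorbing county and firm × date effects by alternating demeaning. It is a minimal illustration under stated assumptions: the panel is synthetic and noise-free, all names and values are hypothetical, controls are omitted, and no county-clustered standard errors are computed.

```python
from collections import defaultdict

def demean(values, groups):
    """Subtract each group's mean from the values in that group."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]

def within_slope(y, x, county, firm_date, rounds=50):
    """OLS slope of y on x after absorbing county and firm-x-date effects."""
    for _ in range(rounds):  # alternating projections onto the two effect sets
        y, x = demean(y, county), demean(x, county)
        y, x = demean(y, firm_date), demean(x, firm_date)
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

# Synthetic balanced panel: county B always has a PTDL, county A opens one
# at date 3, county C never has one. The true coefficient is set to 0.05.
county_effect = {"A": 1.0, "B": 2.5, "C": -0.5}
rows = []
for c in county_effect:
    for firm in ("F", "G"):
        for t in range(1, 5):
            libraries = 1.0 if c == "B" or (c == "A" and t >= 3) else 0.0
            event_effect = 0.3 * t + (0.7 if firm == "G" else 0.0)
            outcome = 0.05 * libraries + county_effect[c] + event_effect
            rows.append((outcome, libraries, c, (firm, t)))

y, x, county, firm_date = (list(col) for col in zip(*rows))
beta_1 = within_slope(y, x, county, firm_date)  # recovers 0.05 up to rounding
```

The sketch also shows why identification comes from counties whose PTDL status changes over time: county B's constant indicator is absorbed entirely by its county fixed effect.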
Descriptive statistics
Table 3 reports the descriptive statistics for the variables in this analysis. Of the observations, 867,034 (19.4%) are in counties with a PTDL, while 3,603,893 (80.6%) are in counties without one. In other words, for each observation with a PTDL, I observe on average four observations without one, which allows me to vary the availability of patent information across groups of investors for the same patent release. The mean local trading volume is substantially higher in counties with a PTDL than in counties without one ($550.25 versus $205.07). Furthermore, counties with a PTDL have on average a larger population and generate more patents, and these patents have higher scientific and economic values.
PTDL and local trading volume
In Table 4, I examine the association between patent information availability and local trading volume. In columns (1) to (3), I show the estimates of Eq. 1 with the natural logarithm of one plus the absolute dollar trading volume in firm j’s stock in county i within three days of the patent release as the dependent variable. Column (1) includes only the treatment variable Libraries, with standard errors clustered by county. The result of column (1) suggests that investors in counties with a PTDL increase their local trading volume by 12.9% relative to investors in counties without a PTDL. To ensure that these results are not driven by the USPTO selection mechanism (Sneed, 2000), I include the variables Log(1 + County patents), Log(1 + County scientific value), Log(1 + County economic value), and Log(County population) as control variables in column (2). These variables mimic the USPTO selection mechanism and ensure an unbiased coefficient estimate for the treatment variable Libraries. Furthermore, to ensure that the results are not driven by any simultaneous action that materializes in counties with a PTDL because of firms’ location choices, I include the variable Local firm. Firms might select counties with a PTDL as a location for their headquarters to benefit from resources at the library. If any simultaneous action is correlated with the firm’s location choice, the coefficient estimate for the treatment variable Libraries would be biased. The variable also accounts for any home bias that might affect local trading (Ivković & Weisbenner, 2005; Seasholes & Zhu, 2010). Column (2) also includes county fixed effects to control for time-invariant county characteristics. The results of column (2) indicate that the existence of a PTDL increases local trading volume by 5.5% relative to counties without a PTDL. However, this specification ignores that some events might be associated with higher trading volumes for reasons unrelated to the PTDL.
Hence, to control for these factors, column (3) adds firm × date fixed effects to the model. The inclusion of these fixed effects decreases the coefficient estimate for Libraries to 4.6%. The coefficient estimates for Libraries in all three specifications are positive and significant at the p ≤ 0.1 level. The adjusted R² increases from 0.2% in column (1) to 3.7% in column (3), highlighting the importance of these characteristics in this analysis.
To ensure that the results are not driven by outliers, columns (4) and (5) present the coefficient estimates for Libraries for two alternative dependent variables that treat each observation equally. Column (4) uses the natural logarithm of one plus the number of trades in firm j’s stock in county i within three days of the patent release. Column (5) uses a local trading indicator variable that takes the value of one if there was trading in firm j’s stock in county i within three days of the patent release and zero otherwise. I use a linear probability model to accommodate the many fixed effects in this analysis. Both dependent variables minimize the influence of large or small trades on the estimation. The existence of a PTDL increases the number of local trades by 0.4% and the probability of a local trade by 0.5%. In addition, the coefficient estimates for the treatment variable Libraries are significant at the p ≤ 0.1 level, indicating that the results of the baseline specification are not driven by outliers.
The modest level of significance arises because the coefficient estimates represent the average effect across low-value and high-value patents. The retail investor reaction, however, is driven by the value of the patent information and not by the PTDL itself. In Table 5, I show greater significance once Libraries is interacted with the scientific value of the patent.
A frequently stated concern with difference-in-differences research designs is the violation of the parallel trends assumption. To address this concern, I define leads (t − 4 to t − 2) and lags (t = 0 to t + 2) of the treatment variable Libraries in half-year increments and use them as treatment variables. For example, Libraries t − 2 is an indicator variable that takes the value of one a year before the opening of a PTDL and zero otherwise. Similarly, Libraries t + 2 is an indicator variable that takes the value of one a year after the opening of a PTDL and zero otherwise. The baseline period in this analysis is the period t − 1. In Fig. 1, I examine the coefficient estimates before and after the opening of a PTDL to provide evidence of the validity of the parallel trends assumption. The coefficient estimates for the leads of the treatment variable Libraries are not statistically significant at the p ≤ 0.1 level, indicating that there are no systematic differences in local trading volume between the two groups of counties before the opening of a PTDL.
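The lead and lag indicators can be sketched as follows. This is a hypothetical illustration rather than the study's code: it assumes that half-year periods are counted in completed calendar months relative to the opening date, and that observations outside the range t − 4 to t + 2 simply receive zeros on all indicators, which the text does not specify. The names and dates are invented.

```python
from datetime import date

EVENT_PERIODS = (-4, -3, -2, 0, 1, 2)  # t - 1 is the omitted baseline period

def half_year_period(event_date, opening_date):
    """Number of half-years from the PTDL opening to the patent release (floored)."""
    months = (event_date.year - opening_date.year) * 12
    months += event_date.month - opening_date.month
    if event_date.day < opening_date.day:
        months -= 1  # the current month is not yet complete
    return months // 6  # floor division keeps pre-opening periods negative

def event_time_dummies(event_date, opening_date):
    """One indicator per lead or lag; all zeros in the baseline period."""
    period = half_year_period(event_date, opening_date)
    return {f"Libraries_t{p:+d}": int(period == p) for p in EVENT_PERIODS}

opening = date(1992, 7, 1)  # hypothetical PTDL opening date
a_year_before = event_time_dummies(date(1991, 8, 6), opening)   # period -2
just_before = event_time_dummies(date(1992, 6, 30), opening)    # baseline
```

Omitting t − 1 means every plotted coefficient is measured relative to the half-year immediately before the opening.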
The analysis only considers short-term market reactions after the release of a patent to ensure a clean identification of the information transfer. As a result, the size of the coefficient estimates for Libraries is not equivalent to the total economic effect of the information transfer. In fact, the results of this analysis likely constitute only a fraction of the total economic effect, since patent information is also used in licensing agreements and debt contracts, which might be affected by the information transfer (Mann, 2018). In addition, these information transfers might influence the efficiency of stock price discovery (Hegde et al. 2018). In sum, the results of my analysis underline the existence of an information transfer between the USPTO and retail investors. These results also highlight that retail investors collect patent information and use it in their investment decisions.
Pre-release local trading volume, scientific value, and snow
To ensure that the results of the previous analysis are not driven by factors unrelated to an information transfer between the USPTO and retail investors, I examine the financial market reaction before the release of a patent. The specification of the previous analysis would be questionable if the existence of a PTDL were associated with the retail investor reaction before the release of a patent. The dependent variables in columns (1) and (2) of Table 5 are measured over days −3, −2, and −1 relative to the patent release date. The coefficient estimates for Libraries are not statistically significant at the p ≤ 0.1 level in either specification, suggesting that the increase in local trading volume in the previous analysis is not driven by factors unrelated to the release of a patent and hence unrelated to the information transfer.
In addition, if the results of the previous analysis are driven by an information transfer between the USPTO and retail investors, I expect the value of the information to drive the retail investor reaction. Consequently, columns (3) and (4) of Table 5 show the coefficient estimates for the treatment variable Libraries and its interaction with the scientific value Log(1 + Scientific value). I define Log(1 + Scientific value) as the natural logarithm of one plus the number of citations of the patent event. I follow prior research and deduct from the citation count the mean number of citations for all patents in the same technology class in the same year. This adjustment accounts for systematic differences between patents in different technology classes in different years (Hall et al., 2001). To ensure nonnegative values, I set the adjusted citations to zero for patents that have adjusted citations below zero. The coefficient estimates for Libraries are not statistically significant at the p ≤ 0.1 level; however, the coefficient estimates for the interaction are significant at the p ≤ 0.01 level in both specifications. The result of column (3) indicates that a 1% increase in the scientific value of a patent increases the local trading volume by 3.1% if a PTDL exists in county i at time t. A similar association is observable in column (4) for the number of local trades. These results suggest that investors are not reacting to the opening of a PTDL; they are reacting to the release of a patent and the value of the information.
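The citation adjustment described above can be sketched in a few lines. The record layout and the example figures are hypothetical, and the class-year mean is computed only over the patents passed to the function, whereas the text refers to all patents in the same technology class and year.

```python
import math
from collections import defaultdict

def log_scientific_value(patents):
    """Map patent id -> log(1 + class-year-adjusted citations, floored at zero).

    `patents` is a hypothetical list of dicts with the keys
    'id', 'tech_class', 'year', and 'citations'.
    """
    totals, counts = defaultdict(float), defaultdict(int)
    for p in patents:
        key = (p["tech_class"], p["year"])
        totals[key] += p["citations"]
        counts[key] += 1

    result = {}
    for p in patents:
        key = (p["tech_class"], p["year"])
        adjusted = p["citations"] - totals[key] / counts[key]
        result[p["id"]] = math.log1p(max(adjusted, 0.0))  # negative values set to zero
    return result

sample = [
    {"id": "P1", "tech_class": "A", "year": 1990, "citations": 10},
    {"id": "P2", "tech_class": "A", "year": 1990, "citations": 2},
    {"id": "P3", "tech_class": "A", "year": 1990, "citations": 0},
]
values = log_scientific_value(sample)  # class-year mean is 4, so P1 -> log(1 + 6)
```

Because of the floor at zero, every patent at or below its class-year mean has a scientific value of zero, so the interaction with Libraries is identified only from above-average patents.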
To rule out that the results of the previous analysis are driven by the USPTO selection mechanism, I follow the approach of Engelberg and Parsons (2011) and use extreme snowfall as an exogenous shock to patent information availability. I exploit the fact that the sample period is largely pre-Internet, so investors must physically access patent information if they want to use it. Extreme snowfall is unrelated to the USPTO selection mechanism; however, it can prevent patent information from reaching investors by, for example, blocking roads. Columns (5) and (6) of Table 5 include the coefficient estimates for the variable Snow and its interaction with the treatment variable Libraries. Snow is an indicator variable that takes the value of one if the snowfall in county i within three days after the patent release exceeds 20 inches and zero otherwise.
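As a small illustration, the Snow indicator and its interaction with Libraries could be built as below. The sketch assumes that the 20-inch threshold applies to cumulative snowfall over the three days, which the text leaves open (the threshold could also apply to a single day); the function and key names are hypothetical.

```python
def snow_terms(libraries, daily_snowfall_inches, threshold_inches=20.0):
    """Return the Snow indicator and its interaction with Libraries.

    `libraries` is the 0/1 PTDL indicator for the county; `daily_snowfall_inches`
    holds the snowfall on each of the three days after the patent release.
    """
    snow = int(sum(daily_snowfall_inches) > threshold_inches)
    return {"Snow": snow, "Libraries_x_Snow": libraries * snow}

blizzard = snow_terms(libraries=1, daily_snowfall_inches=[12.0, 9.5, 1.0])
light_snow = snow_terms(libraries=1, daily_snowfall_inches=[3.0, 0.0, 0.5])
no_library = snow_terms(libraries=0, daily_snowfall_inches=[12.0, 9.5, 1.0])
```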
In this setting, the coefficient estimates for Libraries represent the effect of a PTDL on local trading volume on days without extreme snowfall. The inclusion of the interaction of Libraries and Snow does not substantially change the coefficient estimate for the treatment variable Libraries, compared to the baseline regression. Furthermore, the lack of statistical significance of the coefficient estimate for Snow indicates that extreme snowfall itself does not significantly influence local trading volume. The variable of interest is the interaction of Libraries and Snow, which is negative and significant at the p ≤ 0.05 level. These coefficient estimates indicate that the information transfer breaks down on days with extreme snowfall, which supports the existence of a transfer between the USPTO and retail investors. It is noteworthy that the interaction variable offsets the positive effect of the PTDL (i.e., the sum of these coefficients is not statistically different from zero at the p ≤ 0.1 level), which means that the information transfer is severely disrupted. Overall, these results indicate that the findings of the previous analysis are not driven by the USPTO selection mechanism but by an information transfer between the USPTO and retail investors.
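The interpretation of the snow specification can be written out explicitly. The coefficient labels β₂ and β₃ are introduced here for illustration and are assumptions about notation; controls and fixed effects are abbreviated by dots.

$$ \text{Log}(1 + \text{Local trading}_{i,j,t}) \mid \text{Patent}_{j,t} = \beta_{0} + \beta_{1}\,\text{Libraries}_{i,t} + \beta_{2}\,\text{Snow}_{i,t} + \beta_{3}\,\text{Libraries}_{i,t} \times \text{Snow}_{i,t} + \dots $$

$$ \text{PTDL effect without extreme snowfall} = \beta_{1}, \qquad \text{PTDL effect with extreme snowfall} = \beta_{1} + \beta_{3} $$

Under this notation, the statement that the interaction offsets the positive effect of the PTDL corresponds to failing to reject the hypothesis that β₁ + β₃ = 0.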
Characteristics of the information transfer
In Table 6, I examine the characteristics of the information transfer. In columns (1) to (5), I use the market reactions on the individual days after the release of a patent to measure the speed of the transfer. The coefficient estimates indicate that the transfer is relatively fast, since the strongest market reactions occur on day 0 (release date) and day 1 after the release (p ≤ 0.1). The existence of a PTDL increases local trading volume by 1.6% on day 0 and by 2.5% on day 1. Market reactions from day 2 to day 4 are weaker, suggesting that the retail investors incorporate the patent information quickly into the stock price. This finding comports with the results of Kogan et al. (2017), who demonstrate that the strongest aggregate financial market reactions occur from day 0 to day 2. The speed of the market reactions also mitigates the concern that the results of the previous analysis are driven by local news coverage, since this would require patent information to be disseminated on release day, which is unlikely, given the existence of print deadlines.
The literature has shown that distance is an important determinant of information transfers (e.g., Belenzon & Schankerman, 2013; Agrawal et al., 2017). I follow this literature and examine how distance affects the information transfer in my setting. In column (6) of Table 6, I include the variable Log(Distance), which is defined as the natural logarithm of the distance in miles between investors in county i and the closest PTDL at time t. The coefficient estimate for Log(Distance) is negative and statistically significant at the p ≤ 0.05 level. The result in column (6) indicates that a 1% increase in the distance to the closest PTDL decreases the local trading volume by 1.3%, suggesting that distance significantly affects the information transfer. A possible explanation for this result is that the patent information needs time to reach investors farther from the PTDL.
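One way to compute such a distance measure is sketched below. How the study measures distance is not stated in this section, so the great-circle (haversine) distance between a county centroid and each open PTDL is an assumption, as are the function names, the coordinates, and the Earth radius constant.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an approximation

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2
    a += math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def log_distance_to_closest(county_centroid, open_libraries):
    """Log of the distance from a county centroid to the nearest open PTDL."""
    closest = min(haversine_miles(*county_centroid, *lib) for lib in open_libraries)
    return math.log(closest)  # undefined when the distance is zero

# Hypothetical coordinates: a centroid on the equator and two libraries east of it.
centroid = (0.0, 0.0)
libraries = [(0.0, 5.0), (0.0, 1.0)]
log_dist = log_distance_to_closest(centroid, libraries)  # one degree is about 69.1 miles
```

Since the variable is a plain logarithm rather than a logarithm of one plus the distance, a county whose centroid coincides with a PTDL would need separate handling.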
Investor characteristics
In Table 7, I focus on the characteristics of different groups of retail investors. The results of this analysis should be viewed as descriptive and interpreted with caution. They do not rule out alternative explanations, and a large share of observations lack necessary data.
In columns (1) to (4), I examine whether investors with different jobs differ in their sensitivity to the availability of patent information. Columns (1) and (2) include the coefficient estimates for Libraries for investors in management jobs and technical jobs, respectively. Neither coefficient estimate is statistically significant at the p ≤ 0.1 level. Columns (3) and (4) include the coefficient estimates for Libraries for retired investors and the remaining investors, respectively. The coefficient estimate for the retired investors is significant at the p ≤ 0.01 level, while the estimate for the remaining investors is not statistically significant at the p ≤ 0.1 level. A possible explanation for these results is that retail investors have sufficient knowledge to process patent information, independent of their jobs, but that collecting and understanding patent information requires adequate time. However, the insignificant coefficient estimates should be interpreted with caution, as they are only marginally insignificant and might be the result of a lack of statistical power.
In columns (5) and (6), I examine whether female and male investors differ in their use of patent information. The coefficient estimate for Libraries for female investors is not statistically significant at the p ≤ 0.01 level, whereas the estimate for male investors is significant at the p ≤ 0.05 level. A possible explanation for the difference is differing levels of confidence in the precision of the information (Barber & Odean, 2001).
Information environment, trading costs, and attention
In columns (1) and (2) of Table 8, I examine whether the information environment affects retail investors and their use of patent information. I measure the information environment using two variables that are positively associated with a strong information environment: analyst following and institutional ownership (Bhushan, 1989; Bushee & Noe, 2000; Shores, 1990). I define Log(1 + Analyst following) as the natural logarithm of one plus the number of analyst earnings forecasts for firm j before the release of the patent. I define Institutional ownership as the percentage of firm j’s shares held by institutional investors before the release of the patent. I expect that retail investors react more strongly when the information environment is stronger and hence search costs are lower. Columns (1) and (2) include the coefficient estimates for the interactions of Libraries with Log(1 + Analyst following) and Institutional ownership, respectively. Both coefficients are positive and statistically significant at the p ≤ 0.01 level, consistent with the argument that retail investors complement traditional curated disclosures, such as financial statements and analyst reports, with uncurated patent disclosures.
In columns (3) and (4), I examine whether trading costs affect retail investors. I use two measures of liquidity that are associated with trading costs: turnover and the bid-ask spread. I define Turnover as the ratio of daily volume to shares outstanding on the day before the release of the patent. Similarly, I define Bid − ask spread as the bid-ask spread on the day before the release of the patent. The bid-ask spread is calculated using the Corwin and Schultz (2012) estimator. I expect that retail investors react more strongly to patent information when liquidity is high and trading costs are low. Columns (3) and (4) include the coefficient estimates for the interactions of Libraries with Turnover and Bid − ask spread, respectively. Both coefficients are statistically significant at the p ≤ 0.05 level, consistent with the argument that low trading costs make trades more attractive for retail investors.
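The two liquidity measures can be sketched as follows. The spread function implements the two-day high-low estimator of Corwin and Schultz (2012) in its basic form; setting negative estimates to zero is one common convention and is an assumption here, as is the choice of which two days enter the estimate. Function names and prices are hypothetical.

```python
import math

def turnover(daily_volume, shares_outstanding):
    """Shares traded on a day divided by shares outstanding."""
    return daily_volume / shares_outstanding

def corwin_schultz_spread(high_1, low_1, high_2, low_2):
    """Two-day high-low spread estimate; negative estimates are set to zero."""
    # beta: sum of squared daily log high-low ranges over the two days
    beta = math.log(high_1 / low_1) ** 2 + math.log(high_2 / low_2) ** 2
    # gamma: squared log range of the combined two-day high and low
    gamma = math.log(max(high_1, high_2) / min(low_1, low_2)) ** 2
    denom = 3 - 2 * math.sqrt(2)
    alpha = (math.sqrt(2 * beta) - math.sqrt(beta)) / denom - math.sqrt(gamma / denom)
    spread = 2 * (math.exp(alpha) - 1) / (1 + math.exp(alpha))
    return max(spread, 0.0)

# Same range on both days: the range is attributed entirely to the spread.
stable = corwin_schultz_spread(10.1, 10.0, 10.1, 10.0)
# Price drifts upward between days: the raw estimate is negative, so zero is returned.
trending = corwin_schultz_spread(10.1, 10.0, 10.6, 10.5)
share_turnover = turnover(50_000, 10_000_000)
```

In the first example the estimate equals the range divided by the midpoint of the high and low, which is the intuition behind the estimator: the daily range reflects both volatility and the spread, while only the volatility component grows with the horizon.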
Finally, in columns (5) and (6), I examine whether the intensity of attention affects retail investors and their use of patent information. I use the amount of competing information and a salient corporate event as proxies for attention. I define Log(Daily patents) as the natural logarithm of the number of patents released on day t. I define M&A news as an indicator variable that takes the value of one if there is any M&A news within the seven days before time t and zero otherwise. Since stocks in the news grab the attention of retail investors (Barber & Odean, 2008), I expect these investors to react more strongly to patent information if less information competes for their attention and if M&A news increases attention on the pertinent firm. Columns (5) and (6) include the coefficient estimates for the interactions of Libraries with Log(Daily patents) and M&A news, respectively. The negative coefficient on Log(Daily patents) (p ≤ 0.05) and the positive coefficient on M&A news (p ≤ 0.01) are consistent with the argument that attention influences the response of retail investors.
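The two attention proxies can be illustrated with a brief sketch. It assumes that the seven-day window covers the seven calendar days strictly before the release date, which the text does not spell out; the function names and dates are hypothetical.

```python
import math
from datetime import date, timedelta

def log_daily_patents(patents_released_on_day):
    """Natural logarithm of the number of patents released on the event day."""
    return math.log(patents_released_on_day)

def mna_news(release_date, news_dates, lookback_days=7):
    """1 if any M&A news item falls in the lookback window before the release."""
    window_start = release_date - timedelta(days=lookback_days)
    return int(any(window_start <= d < release_date for d in news_dates))

release = date(1991, 1, 8)  # hypothetical release date
recent_news = mna_news(release, [date(1991, 1, 3)])  # five days earlier
stale_news = mna_news(release, [date(1990, 12, 20)])  # outside the window
competition = log_daily_patents(2000)                 # hypothetical patent count
```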
PTDL and profitability
In Table 9, I examine whether trades made by retail investors with access to patent documents are more profitable than those made by retail investors without access. I use a sample of individual trading days with buying activity, which allows me to accurately calculate the returns. I calculate the cumulative abnormal returns (CAR) and buy-and-hold abnormal returns (BHAR) for 20, 60, and 240 trading days. I use a market-adjusted model, in which abnormal returns are defined as returns in excess of the CRSP value-weighted market return. I estimate a regression with the CAR and BHAR as dependent variables to determine the influence of patent information on profitability. Columns (1) to (6) include coefficient estimates for Libraries across event windows. The coefficient on Libraries represents the difference in returns between trades made by retail investors in counties with a PTDL and trades made by retail investors in counties without a PTDL. The coefficient estimates for Libraries are significant at the p ≤ 0.05 level across all specifications. The return differentials range from 0.7% to 7.9%, depending on the model and the event window. These return differentials translate into returns of 115 and 1,293 dollars, respectively. This result suggests that retail investors collect and use patent information, since availability of this information is associated with significantly higher stock returns.
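The two return measures under a market-adjusted model can be sketched as below. The function names and the two-day return series are hypothetical and serve only to show how CAR and BHAR differ; the study uses windows of 20, 60, and 240 trading days.

```python
def car(stock_returns, market_returns):
    """Cumulative abnormal return: sum of daily returns in excess of the market."""
    return sum(r - m for r, m in zip(stock_returns, market_returns))

def bhar(stock_returns, market_returns):
    """Buy-and-hold abnormal return: compounded stock return minus compounded market return."""
    stock_growth, market_growth = 1.0, 1.0
    for r, m in zip(stock_returns, market_returns):
        stock_growth *= 1.0 + r
        market_growth *= 1.0 + m
    return stock_growth - market_growth

stock = [0.10, -0.05]   # hypothetical daily stock returns
market = [0.02, 0.01]   # hypothetical daily market returns
car_value = car(stock, market)    # 0.08 - 0.06 = 0.02
bhar_value = bhar(stock, market)  # 1.045 - 1.0302 = 0.0148
```

The two measures diverge because BHAR compounds returns while CAR adds them, and the gap tends to widen over longer windows such as 240 trading days.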