This section describes in detail how the dataset used to estimate Eqs. 1 and 2 was compiled. It also describes the main variables and provides a descriptive portrait of the spatial distribution of machine imports.
Compiling the dataset
The empirical analysis is based primarily on the Customs Statistics (CS) as well as firm-level financial information provided by the Hungarian Statistical Office. The CS contains the universe of exports and imports by Hungarian economic agents between 1992 and 2003. It gives information on yearly trade aggregated to the 6-digit Harmonized System product level, together with the countries of origin and destination. The quantity measurements allow the calculation of unit prices. It is important to point out that while trade data are available after 2003, their structure and classifications change after Hungary’s EU accession in 2004. This prevents the investigation from going beyond that date.Footnote 6
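The unit price calculation mentioned above can be sketched as follows. This is a minimal illustration, assuming that each customs record carries an import value and a quantity; the record layout, the firm names and the HS code are hypothetical and not taken from the CS.

```python
def unit_price(value, quantity):
    """Return value per unit of quantity, or None if the quantity is missing or not positive."""
    if quantity is None or quantity <= 0:
        return None
    return value / quantity


# Hypothetical records: (firm, HS6 product, origin country, year, value, quantity)
records = [
    ("firm_a", "845221", "DE", 1995, 12000.0, 4),
    ("firm_b", "845221", "IT", 1995, 9000.0, 0),  # no usable quantity
]

# The last two fields of each record are value and quantity.
prices = [unit_price(value, qty) for *_, value, qty in records]
```

Records without a usable quantity yield no unit price rather than a division error, which keeps them out of any later price statistics.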
This dataset is merged with a panel of Hungarian manufacturing firms for 1992–2003 containing very detailed balance sheet information. This allows us to include the following firm-level characteristics in the empirical estimations: firm size, defined by average annual employment; foreign ownership, indicating a majority foreign share in the subscribed capital of the firm; and total factor productivity (TFP).Footnote 7 The dataset provides the sectoral classification according to NACE rev. 1. For more details on this data see Békés et al. (2011).
Table 1 Number of machines allocated to manufacturing sectors
To identify machine import events we rely on the Standard International Trade Classification (SITC) rev. 3, which we match to the CS. Group 7 of the SITC, titled Machinery and transport equipment, defines capital products used in sector-specific production. As the focus of this study is on manufacturing machines only, transport equipment and vehicles are excluded. In any case, vehicles are less production-specific and widely available via wholesalers in Hungary, so firms are more likely to procure them locally than to import them. This leaves us with the range of machines listed in the SITC from Power generating machinery and equipment (71) to Electrical machinery, apparatus and appliances (77).
Let us now define the list of machines that each industrial sector uses by generating it from our import data. As a preliminary step, we consider only a subset of the manufacturing sectors and omit industries where the imported machines may in fact be inputs to the firms’ final product, i.e. Manufacture of machinery and equipment. See Table 1 for the list of manufacturing sectors considered. We match the set of machines from SITC 71–77 at the 5-digit level to each sector by looking at actual machine imports from 1992–2003. A machine is matched to a sector if it is imported by at least 3 firms in that sector. Additionally, machines for general industry purposes, such as computers or air conditioning units, are excluded. We have also checked that each machine is in line with the industry's activity. That is, matches like Manufacture of textiles (17) and gas-operated metalworking machinery (73742) are not considered for the analysis.
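The matching rule described above can be sketched in a few lines. This is an illustrative sketch only: the input layout, the function name and the machine labels are hypothetical, and the final plausibility check against industry activity is a manual step that is not reproduced here.

```python
from collections import defaultdict

MIN_FIRMS = 3  # threshold stated in the text


def match_machines_to_sectors(imports, general_purpose=frozenset()):
    """Match machines to sectors from observed imports.

    imports: iterable of (firm_id, sector, machine_code) tuples.
    general_purpose: machine codes excluded as general industry equipment.
    Returns {sector: set of machine codes imported by at least MIN_FIRMS firms}.
    """
    importers = defaultdict(set)
    for firm_id, sector, machine in imports:
        if machine in general_purpose:
            continue
        importers[(sector, machine)].add(firm_id)

    matched = defaultdict(set)
    for (sector, machine), firms in importers.items():
        if len(firms) >= MIN_FIRMS:
            matched[sector].add(machine)
    return dict(matched)
```

Counting distinct firms rather than import transactions means that one firm importing the same machine repeatedly does not satisfy the threshold on its own.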
The matching process allocated 143 individual machines to industries, with the Tobacco industry having only 3 and the Fabricated metal products sector having the maximum of 40 machines. In Table 1 the sum of machines is 210, which implies that some machines are matched to more than one sector. For example, industrial sewing machines can be used by both the textiles and the wearing apparel industries.Footnote 8 For details on the list of machines, see Table 20 of the Online appendix.
Given the list of machines per sector, we can look at machine importing events at the firm level. Only the first import of a machine by a firm is considered; subsequent imports are omitted. To improve the reliability of the data and the economic significance of the research, we omit firms with fewer than 10 employees on average.
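A minimal sketch of these two sample restrictions follows, assuming import records of the form (firm, machine, year) and a lookup of average employment per firm. Both structures and the function name are hypothetical.

```python
def first_import_events(imports, avg_employment, min_size=10):
    """Keep the first import year per (firm, machine) for firms above the size cutoff.

    imports: iterable of (firm, machine, year) tuples.
    avg_employment: {firm: average number of employees}.
    Firms with fewer than min_size employees on average, or with
    no employment record, are dropped.
    """
    first = {}
    for firm, machine, year in imports:
        if avg_employment.get(firm, 0) < min_size:
            continue
        key = (firm, machine)
        if key not in first or year < first[key]:
            first[key] = year
    return first
```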
We also make some restrictions on the country dimension. For each machine we consider only the 15 most important trade partners, ranked by volume share of imports for that particular machine, and we consider only machines that are imported from at least 3 countries. This restriction ensures that firms have country choices. It applies only when country choice is investigated: when only machine choice is investigated we keep machines from all countries. The partner list consists of 35 countries, with Germany, Italy and Austria as chief suppliers of imported machines. The list of countries is presented in Table 16.
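The country restriction can be illustrated as follows. This is a sketch under stated assumptions: import volumes are given per machine and country, and ties in volume are broken alphabetically by country code, which is a choice made here for reproducibility and not a rule from the text.

```python
from collections import defaultdict


def country_choice_sets(volumes, top_n=15, min_countries=3):
    """Build the country choice set of each machine.

    volumes: {(machine, country): import volume}.
    Machines imported from fewer than min_countries countries are dropped;
    for the rest, the top_n countries by volume are kept.
    """
    by_machine = defaultdict(dict)
    for (machine, country), volume in volumes.items():
        by_machine[machine][country] = volume

    choice_sets = {}
    for machine, vols in by_machine.items():
        if len(vols) < min_countries:
            continue
        # Largest volume first; alphabetical order breaks ties (assumption).
        ranked = sorted(vols, key=lambda c: (-vols[c], c))
        choice_sets[machine] = ranked[:top_n]
    return choice_sets
```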
Descriptions of machines and machine importers
Only a small fraction of manufacturers import machines directly. Table 2 shows the number of firms in the selected manufacturing sample. It shows that only about half of the firms import any goods from abroad, intermediate goods included. Machine importers are even scarcer: only about a fifth of the firms import machines. Note that these are only the firms that import from our industry-specific list of machines, so this approach underestimates their share.
Table 2 Number of firms by import activity
A firm that ever imported in our period imports 1.7 machines a year on average. Looking at firm activity, we observe an importing firm for 6 years on average, and the firm imports a total of 6 different machines. Firms import from 3.2 different countries on average. The largest number of different machines imported by one firm is 31, and the firm with the widest variety of sources imports from 16 countries altogether.
Table 3 provides statistics on importing firms by the number of machines they import. The upper panel concentrates on core machines (used in a single sector) only. We find that while the largest group of firms imports only one machine, more than 43% of importers are multi-machine importing firms. About 10% of them import 5 or more machines. Looking at imports over a shorter period or even in a single year reveals that about 17–27% of the importers import multiple machines in a given year. This provides sufficient within-firm variation for our estimation strategy, even when only core machines are considered.
The lower panel shows corresponding statistics for any machine imported. Patterns are similar to those for the core machines. As the variety of machines considered increases, the number of firms that import only a single machine decreases. About one third of the importers import more than one machine in a year.
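A tabulation of this kind can be sketched as below. The bins (1, 2–4, 5 or more machines) are an assumption chosen to match the groups mentioned in the text; the actual layout of Table 3 may differ.

```python
from collections import Counter


def importer_shares(firm_machine_pairs):
    """Share of importing firms by the number of distinct machines imported.

    firm_machine_pairs: iterable of (firm, machine) tuples; duplicates are ignored.
    Returns shares for the assumed bins "1", "2-4" and "5+".
    """
    machines = {}
    for firm, machine in firm_machine_pairs:
        machines.setdefault(firm, set()).add(machine)

    def bin_of(n):
        if n == 1:
            return "1"
        return "2-4" if n <= 4 else "5+"

    counts = Counter(bin_of(len(m)) for m in machines.values())
    total = sum(counts.values())
    return {b: counts[b] / total for b in ("1", "2-4", "5+")}
```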
Table 3 Share of importers by the number of machines imported
In line with earlier evidence,Footnote 9 a comparison of importing firms with non-importers shows that importers are larger and have superior productivity. Regressing a dummy for being a machine importer on firm characteristics, we find that machine importing firms are 110% larger (in terms of the number of employees) and 40% more productive (in terms of total factor productivity); see details in Table 14 of the Appendix.
The data allow us to describe the distribution of the unit prices of the machines firms import. The prices show considerable heterogeneity both across and within machine categories. The average within-category standard deviation of log price equals the standard deviation of all prices. Prices vary considerably across countries as well, for at least two reasons. First, import prices are recorded including cost, insurance and freight (CIF), which suggests that duties and distance increase the price of the machines. Second, prices vary with the value added and the price of technology embedded in the machines. Figure 1 illustrates this by showing the differences in the price distribution of machines from Italy, the USA and the UK. These differences may be explained by differences in shipping costs as well as in the quality of products shipped from different countries.
Location of peers
Investigating the effect of peers on importing activity requires heterogeneity across space. If machine imports exhibit stickiness in space, that is, a new machine importer is influenced by previous importers, new importers should be relatively close to previous ones.
The data also include the location of each firm’s headquarters at the municipality level, including the postcode.Footnote 10 Using this information we geo-code the locations and assign geographical coordinates to each firm at the postcode level using the Geonames.org dataset and Google Earth. In Hungary most settlements have a single postcode; here the coordinates refer to the center of the settlement. Most larger cities and agglomerations, however, have multiple postcodes.Footnote 11 There is also a small share of settlements that share the same postcode, hence it is important to define location by both postcode and settlement. We call these spatial units postal districts. Geo-coding firms this way enables us to measure the shortest distance between them.Footnote 12 Each such district can be aggregated into larger administrative units: municipalities, micro-regions or counties. Hungary has 20 counties representing the European Union’s NUTS3 classification (see Table 15 in the Appendix).
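Given coordinates for two postal districts, a distance in kilometers can be computed as sketched below. The great-circle (haversine) formula is an assumption made for illustration; the exact distance measure used in the analysis is not specified in this section. The coordinates in the example are approximate city-center values for Budapest and Debrecen.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in decimal degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Approximate coordinates (illustrative only): Budapest and Debrecen.
budapest_debrecen_km = great_circle_km(47.4979, 19.0402, 47.5316, 21.6273)
```

Two firms in the same postal district share the same coordinates, so their distance under this measure is zero.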
Machine importing activity is observed in 2329 postal districts, which is 63% of all 3658 districts where any production activity in the selected manufacturing sectors can be detected. This is illustrated in Fig. 2, which displays a map of Hungary and shows the distribution of the total number of machines imported in each location over the sample period. In over forty districts more than 50 machines are imported. These are predominantly located in larger townships in Hungary. In about 100 districts we see imports of more than 25 but fewer than 50 machines, and over 670 districts have firms importing fewer than 25 but more than 5 machines. In the remaining districts, a bit more than 1500, local firms import 5 machines or fewer.
As a next step, we look at machine import instances and categorize them according to the existence of previous activities. We use threshold values from 1 to 50 km with 5 km steps to investigate within what distances peers are most likely to locate. Figure 3 shows the share of imported machines in selected years that do not have peers within a specific distance. As the distance to peers may depend on the size of a given agglomeration, we show results by three size categories: for firms in the capital, for firms in larger cities (20 county capitals) and for all smaller locations.
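The peerless share at each distance threshold can be sketched as follows. The event layout, the function name and the threshold grid (1 km, then 5 km steps up to 50 km) are assumptions; a peer is taken here to be a different firm that imported the same machine in an earlier year. Any distance function can be passed in, for instance a great-circle distance between postal districts.

```python
THRESHOLDS_KM = [1] + list(range(5, 55, 5))  # 1, 5, 10, ..., 50 (assumed grid)


def peerless_shares(events, distance, year, thresholds=THRESHOLDS_KM):
    """Share of imports in `year` with no earlier same-machine importer nearby.

    events: list of (firm, machine, year, location) first-import events.
    distance: function (location, location) -> km.
    Returns {threshold_km: share}, or None per threshold if no imports in `year`.
    """
    current = [e for e in events if e[2] == year]
    shares = {}
    for d in thresholds:
        peerless = 0
        for firm, machine, _, loc in current:
            has_peer = any(
                other != firm and m == machine and y < year and distance(loc, l) <= d
                for other, m, y, l in events
            )
            if not has_peer:
                peerless += 1
        shares[d] = peerless / len(current) if current else None
    return shares
```

Because earlier imports accumulate, the share computed for a later year can only be based on a weakly larger set of potential peers.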
Take the results for firms in Budapest, the capital city (denoted by the red line with solid dots), and consider the mid-point of 1997. We see that 80% of the machine imports took place without the same machine being imported by peers within 1 km. This ratio drops sharply to about 20% when we consider peers within a 5 km radius and decreases to close to zero at around 10–15 km. The red shaded area shows the corresponding ratios for 1993 and 2003, the beginning and the end of our sample; since the count of peers is cumulative, the peerless ratios are always lower at a later point in time.
The statistics for larger cities are presented in yellow (line with hollow circles). Here, in 1997, about 70% of the machine imports took place without same-machine peers within 1 km; the share drops to about 45% within 5 km before decreasing gradually to 20% within the 50 km radius. Interestingly, the band around the 1997 value is rather wide in Fig. 3 for larger cities, which suggests significant variation in the presence of peers over time. For smaller cities and settlements (green line marked with crosses), results show the highest share of firms without peers, over 85% in any year. This ratio gradually decreases with distance, and the statistics become similar to those calculated for larger cities when the distance exceeds 30 km.
The figure suggests that firms located in different types of districts face different degrees of difficulty in finding another firm that has imported the relevant machines. In the capital city, 80% of machine imports have a peer within 5 km, while in small cities this share is around 20%.
While the findings from Fig. 3 already motivate the use of distance thresholds of 1 km, 5 km, 15 km and 30 km for the analysis, it is still worth looking at the distribution of peers from a different perspective. Instead of the share of peerless imports within a distance, Table 4 looks at the distance of the closest peer for the same three time periods. The table has two panels: the left one shows the distribution of machine imports by closest same-machine peers, while the right panel looks at imports by the spatial distribution of same-machine, same-country peers.
Table 4 Share of imports with and without previous importers in selected years
Even in 1993, the second year of our sample, 11.5% of the importing events involve machines that were imported in the previous year by other firms within 1 km, and more than half of them have peers within the 30 km radius. As time advances the chance of not having any peer diminishes, and more and more firms have local peers when they import machines. By 2000, half of the imports take place in locations where there was a previous import within the 5 km radius.
Additionally, Table 4 shows that even in 1993, at least 70% of the machine imports had same-country peers. About 5% of the imports have peers within 1 km, while 27% of them have peers within 30 km. We find an accumulation of peers within the 1–15 km range. By the year 2000, the share of imports with immediate (1 km) same-country peers increases to 8%, and the share with peers within 15 km increases to 30%.
Timing of imports
Investigating the effect of peers on importing activity requires an additional source of heterogeneity: variation across time. If a new machine importer is influenced by previous importers, those who import earlier should be closer to peers than those who import later.
To investigate the timing of machine imports, let us look at adoption, that is, importing a machine that had been imported before. For any given postal district and machine, we calculate the number of years it took for the second importing activity to follow the first one. We then take a district-level average across the machines imported and plot it in Fig. 4. The distribution of timing shows considerable variation in the adoption rate. It shows that, on average, timing is negatively correlated with city size: early adoption (1–2 years) is concentrated around agglomerations such as the capital city and important manufacturing centers. At the same time, late adoption (7+ years) is found in smaller settlements and in the greater vicinity of agglomerations. That is, foreign machines are adopted in smaller municipalities later than in larger cities. In fact, in major cities the imported machine arrives first, in 1992 or 1993. New machines get imported in smaller settlements much later.
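The adoption lag described above can be sketched as follows. This sketch assumes first-import events of the form (district, machine, firm, year) and treats the second importing activity as the second-earliest first import by any firm in the district; two firms importing in the same year therefore give a lag of zero. These conventions are assumptions, not details taken from the text.

```python
from collections import defaultdict
from statistics import mean


def average_adoption_lag(events):
    """District-level average of years between first and second import of a machine.

    events: iterable of (district, machine, firm, year) tuples.
    District-machine pairs with a single importer contribute no lag.
    Returns {district: mean lag in years}.
    """
    first_year = defaultdict(dict)
    for district, machine, firm, year in events:
        by_firm = first_year[(district, machine)]
        by_firm[firm] = min(year, by_firm.get(firm, year))

    lags = defaultdict(list)
    for (district, _machine), by_firm in first_year.items():
        ordered = sorted(by_firm.values())
        if len(ordered) >= 2:
            lags[district].append(ordered[1] - ordered[0])
    return {district: mean(values) for district, values in lags.items()}
```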
We examine the possible spatial dependence of imports by looking at average distances between importers in kilometers over time. Figure 5 investigates how far technology, as embodied by machines, travels in time. The distance is calculated in the following manner. Assume that at time zero (1992 in our case) a set of K firms, \(k=1\ldots K\), imports machine m. The next year new firms import the same machine m. We measure their distance to the closest firm of the existing set of K firms. If a new importer is in the same district as any of the previous K importers, the distance is taken to be zero. An average of the distances calculated in this fashion tells us how far a machine “travels” in a year.
The distance is calculated for each year after the first import of a given machine m, always with respect to the original K firms. If the locations of the successive waves of imports were independent of the location of the pioneer importers, the distance should be uniform over time. Figure 5 shows that in the years immediately after the first import, followers are located closer on average than in later years. It shows that new machine imports tend to be close to old ones within 3–4 years of the first import. Additionally, it shows that the investigation should cover the 15–30 km radius in addition to the very close peers. The 15–30 km radius can be considered to cover a group of settlements (an urbanized center) or a micro-region.
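The distance calculation for a single machine can be sketched as follows. The event layout and the function name are hypothetical, and the distance function is passed in as a parameter; with coordinates at the postal district level, importers in the same district as an original importer obtain a distance of zero automatically.

```python
def travel_distance_by_lag(events, distance):
    """Average distance of later importers to the nearest original importer.

    events: list of (firm, year, location) first-import events for one machine.
    distance: function (location, location) -> km.
    Returns {years since first import: mean distance in km}.
    """
    first_year = min(year for _, year, _ in events)
    originals = [loc for _, year, loc in events if year == first_year]

    by_lag = {}
    for _, year, loc in events:
        if year == first_year:
            continue  # the original importers define the reference set
        nearest = min(distance(loc, origin) for origin in originals)
        by_lag.setdefault(year - first_year, []).append(nearest)
    return {lag: sum(d) / len(d) for lag, d in sorted(by_lag.items())}
```

Averaging the resulting values across machines for each lag would give a profile comparable in spirit to the one plotted in Fig. 5.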
All in all, these results are consistent with the idea that machine imports exhibit peer effects and that learning takes place in a rather limited geography, even allowing time for information spillover.
One idea behind the spillover effects, as mentioned above, is that peer effects can lower the fixed cost of importing for following firms, and as a consequence relatively lower productivity firms can catch up. This would suggest that firms that import first are more productive than the ones that follow. Table 5 tests this idea and compares firm productivity by the relative timing of the machine imports. The baseline group consists of firms that import a machine 5 or more years later than the pioneer. The pioneer is the firm that imports a given machine first within a given distance. The first column compares importer firms to national pioneers. The remaining columns compare the firm with local pioneers: pioneers within 30 km, within 15 km and within the 1 km neighborhood. Consequently, the initial sample size containing all firm-pairs decreases as the size of the neighborhood shrinks.
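The grouping by relative timing can be sketched as follows. The group labels follow the lags discussed in the text; the event layout and function names are hypothetical, and the neighborhood is drawn around the importing firm, which is an assumption about how local pioneers are identified.

```python
def lag_to_pioneer(event, events, distance, radius_km=None):
    """Years between an import and the earliest import of the same machine nearby.

    event: (firm, machine, year, location), assumed to be contained in events.
    radius_km: neighborhood radius; None means the national comparison.
    """
    _firm, machine, year, loc = event
    first_year = min(
        y
        for _, m, y, l in events
        if m == machine and (radius_km is None or distance(loc, l) <= radius_km)
    )
    return year - first_year


def timing_group(lag_years):
    """Map a lag in years to the comparison groups used in the text."""
    if lag_years == 0:
        return "pioneer"
    if lag_years <= 2:
        return "1-2 years"
    if lag_years <= 4:
        return "3-4 years"
    return "5+ years (baseline)"
```

A firm can be a follower nationally and a pioneer locally at the same time, which is why the groups have to be recomputed for each radius.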
Results show that pioneer importers are always more productive than followers, especially compared with those that import 5 or more years later. This is a common finding across all distances we look at, but in some cases the results are more pronounced. For example, in the first column, where firms are compared in productivity with respect to their time lag to the country-level pioneer, pioneers are 170% more productive than firms that follow them five or more years later. Even firms that follow 1 or 2 years later and firms that follow 3–4 years later are more productive, by about 45% each.
Let us consider local pioneers only, that is, firms that are pioneers (importing a machine m first) within a certain distance. When the analysis is restricted to comparing follower firms to local pioneers, the differences are smaller but still robust, even in the case of the smallest distance examined. When firms within 1 km of each other are compared, the productivity premium of the pioneers is 51%, while firms lagging behind by 1 or 2 years have only a 25% productivity advantage over the base group. Finally, there does not seem to be a significant productivity difference between firms that import the machine 3–4 years later and those that import it 5 or more years later than the local pioneer.
Table 5 Relative productivity advantage of machine importer pioneers
Peer effects in import decision
This work focuses on understanding the drivers of machine selection—comparing choices within the firm. Before we turn to our main results, we take a look at the basic question of how local spillovers could affect the choice to become a machine importer at all—whether firms with local experienced peers are more likely to import machines.
In Table 6 we look at the probability that a firm imports any machine from its choice set depending on the local presence of past importers. We focus only on core machines, which means that peers are same-sector importers. We look at three cross-sections and let the dependent variable take the value one if the firm imports for the first time during a period of 3 years (1994–1996, 1997–1999 or 2000–2002). In each period we regress the import dummy separately on four indicator variables that measure the existence of past imports at various distances.
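A regression of this type can be illustrated with a linear probability model containing a constant and a single peer indicator. The estimator is an assumption, since this section does not state which one is used, and any further controls are omitted. With one binary regressor, the OLS slope equals the difference in import shares between firms with and without peers.

```python
def lpm_single_indicator(imported, has_peer):
    """OLS of a 0/1 outcome on a constant and one 0/1 regressor.

    imported: list of 0/1 import dummies, one per firm.
    has_peer: list of 0/1 peer indicators, same order; must not be constant.
    Returns (intercept, slope).
    """
    n = len(imported)
    mean_y = sum(imported) / n
    mean_x = sum(has_peer) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(has_peer, imported))
    var = sum((x - mean_x) ** 2 for x in has_peer)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return intercept, slope
```

Dividing the slope by the overall import probability expresses the estimate as a relative increase over the baseline.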
Table 6 Propensity to import any machine
Results in Table 6 suggest that firms with local peers are more likely to import a core machine. Compared to the baseline probability of machine import, an average of 11% in the examined years, peer presence is associated with an increase of over 30%. We also find that the correlation is higher the smaller the distance at which peer presence is measured.
In this specification peer presence means the existence of firms that have previously imported any core machine. A past importer may not have imported machine m itself, but another machine from the set. Hence, these findings are only indicative.