Data Sources and Methodology
Macroeconomic data were taken from the February 2006 version of the IMF's International Financial Statistics. Bank-level data were downloaded from the February 2006 version of Bankscope and cleaned by matching bank identities and deleting duplicate entries, as well as entries with possible measurement errors. The Bankscope data set was complemented with confidential supervisory data on the composition of bank loans, obtained from the central banks of all CEECs except Latvia and Hungary, and with data on bank ownership from various sources, such as Euromoney and banks' websites. Details on the coverage and compatibility of the different components of the data set are presented below. Tables A1 present the summary statistics for the final dataset. The definitions of variables and units of measurement for bank-level and macroeconomic data are presented in Table A3.
Summary statistics by country
Matching bank identifiers: Bankscope uses a unique identifier for each bank. This identifier remains unchanged when the bank's name changes and sometimes even when the bank is merged with or acquired by another bank. A new identifier is assigned only if a merger or an acquisition intrinsically changes the bank. Data for the banks operating in Central and Eastern Europe during 2002–2004 were first downloaded using the February 2006 update of Bankscope. These data were then merged with the historical dataset provided by Ugo Panizza, using the unique identifiers and cross-checking the two sources against each other on the 2002 data.
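The merge on identifiers, with a cross-check on the overlap year, can be illustrated with a minimal sketch. It is not the code used in the study: the function name `merge_panels`, the choice of total assets as the cross-checked variable, the relative tolerance, and the rule that the newer download wins where both sources report a value are all assumptions made for illustration, and the numbers are toy data.

```python
def merge_panels(historical, update, overlap_year, tolerance):
    """Combine two {(bank_id, year): total_assets} panels.

    Returns (merged, mismatches), where mismatches lists the bank ids whose
    overlap-year values differ by more than `tolerance` in relative terms.
    """
    mismatches = []
    for (bank_id, year), new_value in update.items():
        if year != overlap_year or (bank_id, year) not in historical:
            continue
        old_value = historical[(bank_id, year)]
        scale = max(abs(old_value), abs(new_value))
        if scale > 0 and abs(old_value - new_value) / scale > tolerance:
            mismatches.append(bank_id)
    merged = dict(historical)
    merged.update(update)  # assumption: the newer download wins on overlap
    return merged, mismatches


# Toy data (hypothetical identifiers and values)
historical = {(1, 2001): 10.0, (1, 2002): 12.0, (2, 2002): 50.0}
update = {(1, 2002): 12.0, (1, 2003): 15.0, (2, 2002): 80.0, (2, 2003): 85.0}

merged, mismatches = merge_panels(historical, update, overlap_year=2002, tolerance=0.05)
print(mismatches)  # banks whose 2002 figures disagree across the two sources
```

Banks flagged in `mismatches` would be candidates for manual inspection, since a shared identifier with diverging 2002 figures may indicate a merger or a restated balance sheet.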
Avoiding duplications: Bankscope includes both consolidated and unconsolidated balance sheet data. When both are available for the same bank, a different identifier is assigned to each type of data. Moreover, at the time of a merger, the banks involved might stay in the dataset along with the merged entity. To make sure that observations were not duplicated for the same bank, the following procedure was applied so that information from only one of the balance sheets was included. First, using the ‘rank’ variable in Bankscope, which ranks the banks within a country, nonranked banks were dropped. A second step was necessary, however, to make sure that the duplication was not due to a merger event: if a bank was not ranked but had assets greater than the country average, its history of mergers and acquisitions was examined carefully. The premerger banks were then reranked to ensure that they were included in the dataset, and the postmerger banks were deranked to exclude them from the premerger period. Many such banks had both consolidated and unconsolidated balance sheets. To be able to identify individual banks, the unconsolidated data were preserved when both balance sheets were available. If unconsolidated data were unavailable, consolidated data were used to avoid dropping the banks from the sample.
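Two mechanical parts of this procedure can be sketched as follows: flagging nonranked banks with above-average assets for manual review, and preferring the unconsolidated statement when both exist. The sketch does not reproduce the manual review of merger histories or the reranking step. The field names (`bank`, `statement`, `rank`, `assets`), the `"U"`/`"C"` codes, and the computation of the country average over all records of a country are hypothetical choices, not Bankscope's actual layout.

```python
from collections import defaultdict


def flag_for_review(records):
    """Return unranked records whose assets exceed their country's mean assets."""
    by_country = defaultdict(list)
    for r in records:
        by_country[r["country"]].append(r["assets"])
    mean_assets = {c: sum(v) / len(v) for c, v in by_country.items()}
    return [
        r for r in records
        if r["rank"] is None and r["assets"] > mean_assets[r["country"]]
    ]


def pick_statement(records):
    """Keep one record per (bank, year): unconsolidated if present, else consolidated."""
    chosen = {}
    for r in records:
        key = (r["bank"], r["year"])
        current = chosen.get(key)
        if current is None or (r["statement"] == "U" and current["statement"] != "U"):
            chosen[key] = r
    return list(chosen.values())


# Toy data (hypothetical banks in one country)
records = [
    {"bank": "A", "country": "X", "year": 2003, "statement": "C", "rank": 1, "assets": 900.0},
    {"bank": "A", "country": "X", "year": 2003, "statement": "U", "rank": None, "assets": 800.0},
    {"bank": "B", "country": "X", "year": 2003, "statement": "C", "rank": 2, "assets": 100.0},
    {"bank": "C", "country": "X", "year": 2003, "statement": "U", "rank": None, "assets": 50.0},
]

to_review = flag_for_review(records)  # large unranked banks, to be checked by hand
sample = pick_statement(records)      # one statement per bank-year
```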
Excluding outliers: To ensure that the analysis is not affected by potential measurement errors and misreporting, about 4% of the observations in the tails of the distributions of the two main variables (bank-level credit growth and DD) were dropped.
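One possible way to implement such trimming is sketched below. The text does not state how the roughly 4% was split across tails and variables, so the rule used here (drop an observation if either variable falls outside its own lower or upper quantile cutoff, with a hypothetical 1% per tail) is an assumption, as are the function and field names.

```python
def quantile(sorted_vals, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def trim_tails(rows, keys, tail_share):
    """Drop rows whose value for any key lies outside its [tail_share, 1 - tail_share] quantiles."""
    bounds = {}
    for k in keys:
        vals = sorted(r[k] for r in rows)
        bounds[k] = (quantile(vals, tail_share), quantile(vals, 1 - tail_share))
    return [
        r for r in rows
        if all(bounds[k][0] <= r[k] <= bounds[k][1] for k in keys)
    ]


# Toy data: 101 observations with perfectly negatively related variables
rows = [{"credit_growth": float(i), "dd": float(100 - i)} for i in range(101)]
kept = trim_tails(rows, ["credit_growth", "dd"], tail_share=0.01)
print(len(rows) - len(kept), "observations dropped")
```

Because an observation is dropped when either variable is extreme, the total share removed depends on how strongly the two variables' tails overlap.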
Coding ownership: Bankscope does not provide historical information about bank ownership; it provides only the share held by foreign and public investors in the current year. Thanks to extensive work by Micco et al. (2004), historical ownership data up to 2002 were available for the study. To extend the time coverage to 2004, the most recent ownership information on Central and Eastern European banks was obtained from Bankscope. This information was complemented with information from banks' websites and Bankscope data on parent banks to update ownership information for 2003 and 2004.
Merging in loan breakdowns: The central banks in six of the eight countries included in the study provided bank-by-bank data on the composition of loans, as collected by supervisory authorities. The data covered the period from 1995 to 2005 (except in the Czech Republic, where the coverage was from 2000 to 2005) and broke down total loans into (i) loans to households in local currency, (ii) loans to corporates in local currency, (iii) loans to households in foreign currency, and (iv) loans to corporates in foreign currency. For confidentiality reasons, most countries were unable to disclose the identity of the banks. Banks from the supervisory dataset and from the Bankscope dataset were matched using data on total loans and total assets. To reduce the likelihood of measurement errors and ensure data consistency, dummy variables identifying banks with rapidly growing household and foreign-currency portfolios, rather than actual data on household and foreign-currency loans, were used.
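The matching of anonymized supervisory records to Bankscope banks on total loans and total assets can be illustrated with a minimal sketch. The text does not describe the matching rule, so the greedy one-to-one nearest match within a relative tolerance shown here is an assumption, as are the function names, field names, and the 5% tolerance.

```python
def relative_gap(a, b):
    """Relative difference between two amounts (0.0 if both are zero)."""
    scale = max(abs(a), abs(b))
    return 0.0 if scale == 0 else abs(a - b) / scale


def match_banks(supervisory, bankscope, tolerance):
    """Greedy one-to-one matching on total loans and total assets.

    Returns {supervisory_id: bankscope_id} for pairs whose loans and assets
    both agree within `tolerance`; closest pairs are matched first.
    """
    candidates = []
    for s in supervisory:
        for b in bankscope:
            gap = max(
                relative_gap(s["loans"], b["loans"]),
                relative_gap(s["assets"], b["assets"]),
            )
            if gap <= tolerance:
                candidates.append((gap, s["id"], b["id"]))
    candidates.sort()
    matches, used = {}, set()
    for _, s_id, b_id in candidates:
        if s_id not in matches and b_id not in used:
            matches[s_id] = b_id
            used.add(b_id)
    return matches


# Toy data for a single country-year (hypothetical identifiers and amounts)
supervisory = [
    {"id": "s1", "loans": 100.0, "assets": 250.0},
    {"id": "s2", "loans": 40.0, "assets": 90.0},
    {"id": "s3", "loans": 5.0, "assets": 7.0},
]
bankscope = [
    {"id": "b1", "loans": 41.0, "assets": 91.0},
    {"id": "b2", "loans": 101.0, "assets": 249.0},
]

matches = match_banks(supervisory, bankscope, tolerance=0.05)
```

Supervisory records with no Bankscope bank within the tolerance, such as `s3` here, remain unmatched rather than being forced onto the nearest candidate.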