In the following discussion, based on the information included in the country-specific analyses in this Special Issue (Careja & Bevelander, 2018; Salentin & Schmeets, 2017; Serrano Sanguilinda et al., 2017), we have extracted a few basic dimensions which allow us to build a broad picture of the sampling frames available. The analyses contain much more information and detail relevant for researchers interested in survey-based studies in these countries; we therefore strongly recommend that scholars read the articles themselves. The aim of this concluding article is to identify possibilities for sampling immigrant minorities in a way that supports comparative research. As probability samples are regarded as the cornerstone of sound scholarly survey research, our discussion of sampling strategies leading to comparable high-quality samples of immigrant minorities focuses on the possibilities of obtaining probability samples in the six countries.
From the framework discussed in the section “A frame of comparison: definitions, identification strategies, and sampling procedures”, we have derived four broad questions to organize the available information: (i) What are the basic concepts and terminological definitions in the national discourse on migration? (ii) How can the target population of immigrant minorities be identified? (iii) Given the importance of probability samples, do population registers exist that identify the target population and allow scholars to develop a sampling frame? (iv) If such register information is not available, or is incomplete, unreliable, or not accessible for scientific research, what other sampling designs have been used? In the following section we summarize the main conclusions from the country analyses. Additional information can be found in Table 1, where the six countries are arranged from left to right: countries in which the ideal case of a register-based probability sample is feasible appear on the left (Denmark, Sweden, The Netherlands, Spain), and countries in which the ideal case is difficult (Germany) or impossible (Italy) to obtain appear on the right.
Basic concepts and definitions
Foreign citizenship is the identification criterion present in all six countries’ official statistics and government publications. Depending on naturalization rates and citizenship law, however, this criterion does not allow the identification of all persons with an immigrant background. Naturalization of first-generation immigrants depends on residence status, residence duration, language proficiency, etc. When it comes to the children of those immigrants, matters quickly become complicated. For example, while Germany traditionally applied ius sanguinis to new-born children, in 2000 it introduced ius soli for the second generation of immigrants, which grants them two citizenships until adulthood; they then have to choose one of the two by age 23 (Footnote 7). In Denmark, to mention another example, citizenship depends on place of birth and the citizenship of the parents, with the mother’s citizenship being somewhat more important. It should also be noted that registers may record only one citizenship if an individual has several, and may not record the former citizenship if the individual acquires the host country’s citizenship. Moreover, it is important to acknowledge that some individuals entering the country have already acquired its citizenship, or acquire it more or less automatically, because they themselves or their ancestors emigrated from the country in earlier years. Important examples in this respect are German (late) emigrants (“(Spät)Aussiedler”) from the former eastern European countries, who migrated to Germany after the fall of the Iron Curtain; or the children of Jewish emigrants who left Germany because of the Holocaust, for whom it has become attractive to hold a second (“European”) citizenship in addition to their first one (Harpaz, 2013, 2015). While the latter do not necessarily immigrate to Germany and rather use the German passport for easier travelling, the former certainly do.
Country of birth provides a much more comprehensive account of the immigrant population, as the numbers in Fig. 4 suggest. However, in official statistics and government publications, the category foreign born is used only in The Netherlands, Sweden, and Spain. Denmark distinguishes between persons of Danish origin, immigrants, and descendants of immigrants, with immigrants identified by country of birth. Germany uses the term migration background to identify foreign-born individuals and their children, but this information is based on samples (the German “Mikrozensus”) and does not come from registers or census data. Finally, foreign born is not a term used in Italian official statistics and government publications, which only acknowledge foreign citizens.
The Netherlands and Denmark also differentiate their information with respect to first and second generations of immigrants, i.e., between immigrants and their descendants. In principle, a similar distinction can also be made in Germany, which differentiates individuals with a migration background according to whether or not they have migration experience of their own. The latter are those who were born in Germany (and hence did not migrate to Germany). Nevertheless, this group does not comprise the whole second generation, because there are children of immigrants who were born outside Germany. In Sweden, identifying children of immigrants on the basis of official statistics is more difficult, as the Swedish-born category also includes children of Swedish ethnic origin.
This brief overview underlines that any comparative sampling endeavor must take into account the fact that the officially used terms are not perfectly overlapping, and therefore a careful preliminary mapping of the population of interest as captured by country-specific terminology must be undertaken.
Available registers and coverage
As Table 1 in the Appendix shows, all six countries have population registers, either at the national or the local (municipal) level, which in principle include the aforementioned immigrant groups as long as they hold a valid residence permit. Local registers exist in Germany (“Melderegister”) and in Italy (“Anagrafe”), while registers in The Netherlands (BRP – Basic Population Register), Denmark (“Det Centrale Person Register”), Sweden (“Folkbokförd”), and Spain (“Padrón”) are maintained at the national level, even though the data may come from local authorities. As will be discussed below, identification problems are smallest for first-generation immigrants, because many of them can be identified by their foreign citizenship or country of birth. Second-generation immigrants, foreign-born citizens, refugees, and asylum seekers are more difficult to identify, either because they do not yet have a residence permit or because the citizenship criterion does not work for foreign-born citizens and naturalized second-generation immigrants. Refugees and asylum seekers are kept in separate registers in some countries, and hence may be identifiable, for example, in The Netherlands, Denmark, and Sweden, but not in the other countries.
Needless to say, sampling frames are much more easily derived from centralized registers. If registers are available only at the local level, one has to draw a sample of localities first and then draw samples from the local registers of those localities (Footnote 8). These multi-stage cluster samples are (in statistical terms) less efficient than simple random samples and, moreover, require more resources to implement.
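The loss of efficiency in clustered designs can be illustrated with Kish's well-known approximation of the design effect, deff = 1 + (m - 1) * rho, where m is the number of interviews per sampled locality and rho is the intra-cluster correlation of the survey variable. The following is a minimal sketch, not a calculation taken from the country analyses: it assumes equal-sized clusters, and all numerical values are hypothetical.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Kish's approximate design effect for equal-sized clusters.

    cluster_size: interviews per sampled locality (hypothetical value)
    icc: intra-cluster correlation of the survey variable (hypothetical value)
    """
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n: int, cluster_size: float, icc: float) -> float:
    """Size of a simple random sample with the same precision as the cluster sample."""
    return n / design_effect(cluster_size, icc)


if __name__ == "__main__":
    # Hypothetical design: 1000 interviews, 10 per locality, icc = 0.05
    n, m, rho = 1000, 10, 0.05
    print(f"design effect: {design_effect(m, rho):.2f}")
    print(f"effective sample size: {effective_sample_size(n, m, rho):.0f}")
```

Under these assumed values the clustered sample is worth noticeably fewer interviews than a simple random sample of the same nominal size, which is the sense in which multi-stage designs are less efficient.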
After arrival (usually within a few weeks or months) and given a minimum duration of residence, registration is compulsory in all countries except Italy. However, incentives for registration vary significantly between countries, and the rules are possibly enforced differently. For example, in Germany, fines can be imposed for infringement, but they are rarely enforced because government authorities lack the resources to check for unregistered residents. In the other countries, access to welfare benefits and services is contingent on being a legal resident, and non-registration may have negative consequences. For example, in Sweden and Denmark each legal (i.e., registered) resident has their own personal identification number, which makes opening a bank account and accessing daily public services easier. In Spain, registration is mandatory for accessing basic services, such as primary health care and education. There are also public campaigns for registration, because municipal budgets depend on the number of registered residents. However, statistical information on the amount of undercoverage is missing in almost all countries, and it is difficult to estimate. Given the voluntary registration rules in Italy, it is reasonable to expect that immigrants are strongly underrepresented in Italian population registers compared to registers in the other countries.
Overcoverage seems to be a larger problem than undercoverage, because there are hardly any incentives to deregister when individuals leave their place of residence, and this risk is especially high for mobile persons such as immigrants. Countries implement various strategies to “clean” their records, with varying rates of success. Table 1 in the Appendix shows some scattered evidence of the amount of overcoverage for some countries. In principle, this information should also be available from the last census round in the EU, but systematic research is missing here.
The discrepancy between the out-movement of immigrants and the register information at a given point in time can be problematic for obtaining a representative sample of immigrants. These problems are likely to increase if out-migration is not random, which is a reasonable expectation. In the EU context, immigrants of EU origin can more easily move across borders, and this increased mobility means that they are more likely than other groups to have left the country but failed to remove themselves from the register.
Content of the registers
Parallel to the issues of under- and overcoverage, the content of the registers is of crucial importance. Are all the necessary variables (see Fig. 2) included in order to identify different groups of migrants, to stratify sampling units according to important socio-demographic characteristics (say, age and gender), or to apply alternative sampling strategies, such as identification strategies based on names? Moreover, how complete and reliable is this information? And finally, is all this information accessible to scholars from academia? (Footnote 9)
For each of the six countries, the authors of the country-specific articles rated the available variables with respect to their completeness and reliability (see Table 1 in the Appendix). As previously discussed, it is not surprising that the citizenship criterion is available in all countries, has little missing data, and is highly reliable where it exists. The same is true for possible stratifiers such as age and gender. However, additional citizenships, which would (at least partly) identify naturalized immigrants, are not included in the registers of five of the six countries. Country of birth is available with similar quality (few missing values; existing information highly reliable) in all countries except Italy. In Germany, however, the amount of missing information on country of birth is higher than for citizenship, and in some municipalities country of birth is withheld by the local authorities. Hence, identifying foreign-born individuals is least problematic in The Netherlands, Denmark, Sweden, and Spain. The same is true if one wants to differentiate immigrants according to their time of residence in the host country: date of arrival is available only in these four countries (in Spain a special petition is needed to access this information), not in Germany and Italy. In order to identify second-generation immigrants, information on their parents is needed. These links are available only in the countries with a long tradition of register-based statistical accounting (The Netherlands, Denmark, and Sweden) and for minors with a migration background in Germany, but not in Italy and Spain. Finally, identification of immigrants by names is problematic in and of itself. Inherent problems aside, name identification is also difficult to apply in a comparative sampling design because of data-protection regulations in some of the six countries.
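For planning a comparative design, the availability pattern described above can be written down as a small lookup that answers which countries support a given identification strategy. The sketch below is only an illustration: the variable names, the country codes, and the three-level coding ("yes", "partial", "no") are our own assumptions, the entries merely restate the text, and Table 1 in the Appendix remains the authoritative source. Additional citizenships are left out because the text does not say which single country records them.

```python
COUNTRIES = ["NL", "DK", "SE", "ES", "DE", "IT"]

# Hypothetical coding of register content as described in the text.
AVAILABILITY = {
    "citizenship": {c: "yes" for c in COUNTRIES},
    # DE: more missing data, withheld by some municipalities
    "country_of_birth": {"NL": "yes", "DK": "yes", "SE": "yes", "ES": "yes",
                         "DE": "partial", "IT": "no"},
    # ES: accessible only via a special petition
    "date_of_arrival": {"NL": "yes", "DK": "yes", "SE": "yes", "ES": "yes",
                        "DE": "no", "IT": "no"},
    # DE: links to parents exist only for minors with a migration background
    "parent_link": {"NL": "yes", "DK": "yes", "SE": "yes", "ES": "no",
                    "DE": "partial", "IT": "no"},
}


def countries_with(variables, allow_partial=False):
    """Return the countries whose registers contain all requested variables."""
    accepted = {"yes", "partial"} if allow_partial else {"yes"}
    return [
        country for country in COUNTRIES
        if all(AVAILABILITY[var][country] in accepted for var in variables)
    ]


if __name__ == "__main__":
    # First generation, differentiated by time of residence
    print(countries_with(["country_of_birth", "date_of_arrival"]))
    # Second generation, identified via links to parents
    print(countries_with(["parent_link"]))
```

Setting `allow_partial=True` additionally returns countries where a variable exists only with restrictions, which may be acceptable for some research questions but not for others.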
Accessibility and statistical infrastructure
Table 1 in the Appendix also gives information on the accessibility of the registers for scholars from academia and on the national statistical infrastructure that researchers can use to design their research, such as regular national reports or regular data sources (censuses, large-scale surveys) providing national statistics on migration and other relevant auxiliary information for sampling designs. Registers are available for scientific research in all countries except Italy; however, the countries differ widely in the conditions attached to access. For example, the research has to be in the public interest (Germany), supported by the public administration (Spain), or done in cooperation with a national (domestic) research institute or university (The Netherlands, Denmark, Sweden). These conditions are certainly difficult for foreign researchers to fulfill, and hence cooperation or affiliation with national researchers is always useful, if not necessary. Linkage with other registers is also possible (and available for researchers) in the countries with a long tradition of register-based statistical accounting (such as Denmark or Sweden), but not in Germany, with its historical reluctance towards centralized and linked registers after the experience of the Third Reich. In Italy, only the national statistical office ISTAT is able to link information across registers.
Other registers and sampling methods
Finally, Table 1 in the Appendix mentions some other registers and sampling methods that have been or could be used in the six countries to sample immigrants. Electoral, foreigner, and telephone registers exist, but they are not accessible for scientific research (e.g., the foreigner register in Germany), have severe coverage problems (telephone registers in all countries), or are not a viable alternative because they are based on the population registers (electoral registers). Screening the whole population for immigrants by area-based sampling methods or random dialing of telephone numbers is feasible in principle, but in practice it entails excessive search costs because immigrants are still a minority. Costs can be reduced in area-based sampling methods because immigrants tend to concentrate in particular geographical units. However, exploiting the geographically clustered nature of the target population increases the risk of overlooking immigrants outside the clusters and decreases the efficiency of estimates based on these samples. Moreover, this cost-reducing method cannot be applied to samples based on units other than geographical ones, such as blocks of telephone numbers, and, even more importantly, the total population of telephone numbers is often unknown because of prepaid and foreign mobile phones. If all these registers and sampling methods fail, respondent-driven or centre-based sampling techniques are the only practical alternatives.
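Simple arithmetic shows why screening is expensive and why concentrating on areas with a high share of immigrants lowers the cost. As a rough sketch, assume that every screened person belongs to the target group with a probability equal to the group's share in the screened population, and that a fixed proportion of eligible persons agrees to be interviewed. Both assumptions are simplifications, and all numbers below are hypothetical.

```python
import math


def expected_screenings(target_interviews: int,
                        prevalence: float,
                        response_rate: float = 1.0) -> int:
    """Approximate number of persons to screen to reach the interview target.

    prevalence: share of the target group in the screened population
    response_rate: share of eligible persons who complete an interview
    """
    if not (0 < prevalence <= 1) or not (0 < response_rate <= 1):
        raise ValueError("prevalence and response_rate must lie in (0, 1]")
    return math.ceil(target_interviews / (prevalence * response_rate))


if __name__ == "__main__":
    # Hypothetical comparison: general population vs. high-concentration areas
    for label, share in [("general population", 0.05),
                         ("high-concentration areas", 0.25)]:
        needed = expected_screenings(500, share, response_rate=0.5)
        print(f"{label}: about {needed} screening contacts")
```

The saving from restricting the screening to high-concentration areas comes at the price noted above: immigrants living outside those areas have no chance of selection unless the remaining areas are sampled as well, at a lower rate.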