The rise of cross-national survey data harmonization in the social sciences: emergence of an interdisciplinary methodological field

Quality & Quantity

Abstract

Cross-national survey data harmonization combines surveys conducted in multiple countries and across many time periods into a single, coherent dataset. Methodologically, ex post survey data harmonization is especially complex because it combines projects that were not specifically designed to be comparable. We examine the institutional and intellectual history of nine large-scale ex post survey data harmonization (SDH) projects in the social sciences from the 1980s to the 2010s. An interdisciplinary methodological field of SDH slowly emerges, facilitated in part by a partnership between academia and government and by the coordinated contributions of social scientists, survey methodologists and computer scientists. While there has been a learning process, it is in terms of accumulated practicalities, and not with the coordination or institutional apparatus one would expect from a 30-year effort.

Notes

  1. We are thankful to the anonymous reviewer who provided the above insights.

  2. http://www.lisdatacenter.org/wps/liswps/1.pdf. Accessed February 7, 2014.

  3. http://www.lafollette.wisc.edu/facultystaff/smeeding/smeeding-timothy-cv.pdf. Accessed February 7, 2014.

  4. http://www.lisdatacenter.org/wps/liswps/12.pdf. Accessed February 7, 2014. Harmonization speeds forward via technological advance. Since its inception, LIS faced a major problem in supplying its data to interested researchers. By 1987, LIS began to solve the problem of data access through the then-new system called BITNET, described as “an electronic mail and file transfer network” that linked about 400 academic and research institutions around the world (Rainwater and Smeeding 1987, p. 9). It was the early internet.

  5. Data had to be available for 1979, the baseline for all datasets. They transgressed that rule by including 1981 data from Germany and Canada, which they justified on practical grounds (see Smeeding et al. 1985: Footnote 3, p. F-1).

  6. “…(as measured by response rates and other indicators of nonsampling error)” (Smeeding et al. 1985, p. 4). They begin to discuss data quality in detail on p. 18.

  7. Household, family or both. This problem required some harmonization of what “household” versus “family” means.

  8. Classifications Newsletter, Number 27, August 2011, p. 5. ISCED was not updated again until 1997, and then again in 2011. http://www.uis.unesco.org/Education/Documents/UNSD_newsletter_27e_ISCED.pdf. Accessed February 7, 2014. Sometime afterward, LIS began to also use ISCED http://www.lisdatacenter.org/wp-content/uploads/standardisation-of-education-levels.pdf. Accessed February 7, 2014.

  9. The LIS harmonization process is described here: http://www.lisdatacenter.org/wp-content/uploads/our-lis-documentation-harmonisation-guidelines.pdf. According to the LIS Guidelines (p. 12): “Socio-demographic variables… are all individual level variables and report the major socio-demographic characteristics of the household members.” They are: Living arrangements, demographics, immigration, health, and education. LIS’ wealth data also has harmonization guidelines, but as of this writing, they are not available on the website. Instead, they have a document on “behavioral variable mapping” http://www.lisdatacenter.org/wp-content/uploads/2011/02/behavioural-variable-mapping-2011-03.pdf.

  10. MTUS has time use data from the 1960s to the present.

  11. Burkhauser and Lillard (2005, p. 10).

  12. It has since been dated back to 1970 http://www.human.cornell.edu/pam/research/centers-programs/german-panel/cnef.cfm. Accessed February 11, 2014.

  13. Frick et al. (2007, p. 630) write: “The original core variables to be harmonized were income and demographic characteristics of respondents to the PSID and the German Socio-Economic Panel SOEP, and reflects the objectives of the original project that motivated the creation of the CNEF to compare and understand income-based inequality and income mobility in the US and Germany.”

  14. “The effort to harmonize existing panel studies share one significant organizational feature: active researchers conceived, planned, and carried out how the data would be harmonized. While data managers, some in government statistical agencies, were often involved in the process, it was researchers who decided how to define equivalently the variables of interest.”

  15. Burkhauser and Lillard (2005, p. 12): “Researchers guided by theory and concepts flowing from the research pertinent to the object of their studies are best able to make the assumptions necessary to harmonize data across countries.”

  16. We are thankful to one of the reviewers who formulated these four principles of sustained SDH success.

  17. Again, we are thankful to the reviewer who pointed this out to us.

  18. For more discussion on the shortcomings of ECHP, see Burkhauser and Lillard (2005, pp. 14–15); CHER 2014, pp. 6–7; and undated CHINTEX document, p. 5; for more information on ECHP (2011), see the EuroPanel Users Network. http://epunet.essex.ac.uk/echp.php.html. Accessed February 13, 2014.

  19. https://www.destatis.de/DE/Methoden/Methodenpapiere/Chintex/ResearchResults/FinalConference/Downloads/Hahlen.pdf;jsessionid=7E7753BEE731A012B3D321FA73D189F0.cae3?__blob=publicationFile. Accessed February 13, 2014.

  20. “Furthermore, the project investigates important hypotheses about the data quality of panel surveys (non-response, reporting errors and panel effects) which are of general interest for survey statisticians.” (CHINTEX, date unknown, p. 1).

  21. https://www.destatis.de/DE/Methoden/Methodenpapiere/Chintex/ResearchResults/FinalConference/Einfuehrung.html. Accessed February 14, 2014.

  22. See CROS, a website maintained by the European Commission. http://www.cros-portal.eu/content/chintex. “The CROS Portal is a content management system based on Drupal and stands for "Portal on Collaboration in Research and Methodology for Official Statistics".” http://www.cros-portal.eu/page/about-cros-portal. “The European Commission maintains this website to enhance public access to information about its initiatives and European Union policies in general.” http://www.cros-portal.eu/page/legal-notice.

  23. “One objective of this chapter is to establish a simple framework of harmonisation which is useful to pinpoint the research issues of CHINTEX and to highlight the differences to other issues of harmonisation”.

  24. https://www.h2.scb.se/tus/tus/introduction1.html. Accessed February 14, 2014.

  25. http://ec.europa.eu/research/social-sciences/projects/010_en.html. Accessed February 13, 2014. According to CEPS’ website, “CEPS/INSTEAD is a centre of reference for research in the social and economic sciences in the Grand Duchy of Luxembourg, a public institution under the jurisdiction of the Ministry of Higher Education and Research.” http://www.ceps.lu/?type=module&id=53. Accessed February 13, 2014.

  26. There is no official information about the fate of CHER. From existing documents, it seems to have ended in 2003 when they submitted their final report to the European Commission. To get more information, and since CHER was a CEPS project, I emailed a colleague to ask when CHER ended. They wrote, “The project has been on for a while as an experiment to test whether various national level panel studies could be homogeneized and merged in one big dataset. I don't know exactly the date when the CHER was closed. Probably it was 2003, but the data it contains should be older, probably dating to 1999. As a matter of fact, it was a nice data-set but I've never used it because it was too old. Not by chance only 16 papers have been produced with CHER.” Anonymous, Personal Communication, February 13, 2014.

  27. As of April 2006, EPUnet “is now finished,” according to its website. “Jean-Marc Museux from EUROSTAT has produced three papers on EU-SILC, the successor to the ECHP”. http://epunet.essex.ac.uk/view_news.php%3FID=36.html. Accessed February 13, 2014.

  28. “The EU-SILC project was launched in 2003 on the basis of a "gentlemen's agreement" in six Member States (Belgium, Denmark, Greece, Ireland, Luxembourg and Austria) and Norway. The start of the EU-SILC instrument was in 2004 for the EU-15 (except Germany, the Netherlands, the United Kingdom) and Estonia, Norway and Iceland.” http://epp.eurostat.ec.europa.eu/portal/page/portal/microdata/eu_silc. Accessed February 13, 2014.

  29. Most SDH is about economics and was run by economists. Only the TUS, ISMF, GBS and now, the Harmonization Project, are not explicitly about economics; they are run by sociologists and political scientists.

  30. Schröder and Ganzeboom (2013), “Measuring and Modeling Levels of Education in European Societies”. European Sociological Review.

  31. Of the founding date, the only evidence we’ve found is that, in 1993, Ganzeboom and Treiman published a paper in which they reference the ISMF (Ganzeboom and Treiman 1993, p. 470). In a book chapter in the International Handbook of Sociology (2000) Treiman and Ganzeboom describe the history of social stratification and mobility research. They presented a clear trend toward cross-national surveys and attempts at standardizing measurement of stratification variables (see also Ganzeboom and Treiman 2000).

  32. http://www.harryganzeboom.nl/ISMF/index.htm. Accessed February 14, 2014.

  33. Ganzeboom and Treiman (2000) Ascription and achievement after career entry. Paper presented at the 2000 meeting of ISA’s RC 28.

  34. http://ccsg.isr.umich.edu/harmonization.cfm. Accessed February 7, 2014.

  35. http://ccsg.isr.umich.edu/pdf/13DataHarmonizationNov2010.pdf. Accessed February 7, 2014.

  36. This is a problem with social science in general: social researchers rarely keep good records of the research process and are reluctant, for whatever reasons, to share enough about the ups and downs of their scientific pursuit.

Acknowledgements

This work is supported by the project “Democratic Values and Protest Behavior: Data Harmonization, Measurement Comparability, and Multi-Level Modeling in Cross-National Perspective”, funded by the National Science Centre, Poland, under the grant number 2012/06/M/HS6/00322. We thank Kazimierz M. Slomczynski and Marta Kolczynska for their comments on an earlier draft, and Dean Lillard for advice on the citation of harmonized data.

Author information

Corresponding author

Correspondence to Joshua Kjerulf Dubrow.

About this article

Cite this article

Dubrow, J.K., Tomescu-Dubrow, I. The rise of cross-national survey data harmonization in the social sciences: emergence of an interdisciplinary methodological field. Qual Quant 50, 1449–1467 (2016). https://doi.org/10.1007/s11135-015-0215-z

  • DOI: https://doi.org/10.1007/s11135-015-0215-z
