Computers and the Humanities, Volume 30, Issue 5, pp 381–392

Integrating nineteenth-century Canadian and American census data sets


  • Lisa Y. Dillon
    • Ph.D. Program, Department of History, University of Minnesota

DOI: 10.1007/BF00054021

Cite this article as:
Dillon, L.Y. Comput Hum (1996) 30: 381. doi:10.1007/BF00054021


The comparative use of census data is a useful way to study social characteristics across national boundaries. However, truly comparative demographic history is not possible without fully integrating separate census data, uniting multiple data files with a common set of comparably coded variables. This paper describes the integration of the 1871 Canadian census public use sample with similar samples of the 1850 and 1880 American censuses to form the Integrated Canadian-American Public Use Microdata Series (ICAPUMS). These data sets lent themselves well to integration because of their strong similarities in sampling design, data collection and data organization. Consistency in the availability and treatment of variables also eased integration of the samples, although the harmonization of occupation variables presented significant challenges. The ICAPUMS features a general household relationship variable which allows us to examine household structure across the two countries and three years. The paper concludes by proposing some general principles of census data set integration. This integrated data set is now available to researchers on the website of the University of Minnesota Historical Census Projects (
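The idea of uniting separate samples under a common set of comparably coded variables can be sketched roughly as follows. This is only an illustration under stated assumptions: the sample labels, field names, source codes and general categories are invented for the example and do not reproduce the ICAPUMS codebook or the way its household relationship variable was actually constructed.

```python
# Hypothetical crosswalks: each sample's own codes mapped to one shared scheme.
# None of these codes come from the actual census samples.
GENERAL_RELATE = {
    "ca1871": {"H": "head", "W": "spouse", "C": "child"},
    "us1880": {1: "head", 2: "spouse", 3: "child", 4: "boarder"},
}


def harmonize(sample, records):
    """Recode one sample's records into the shared scheme.

    The source code is kept alongside the general code so that no
    detail from the original sample is lost in the pooled file.
    """
    crosswalk = GENERAL_RELATE[sample]
    harmonized = []
    for rec in records:
        harmonized.append({
            "sample": sample,
            "serial": rec["serial"],
            "relate_source": rec["relate"],
            # Codes without a crosswalk entry are flagged, not dropped.
            "relate_general": crosswalk.get(rec["relate"], "unknown"),
        })
    return harmonized


def integrate(samples):
    """Pool several harmonized samples into a single list of records."""
    pooled = []
    for sample, records in samples.items():
        pooled.extend(harmonize(sample, records))
    return pooled


pooled = integrate({
    "ca1871": [{"serial": 1, "relate": "H"}],
    "us1880": [{"serial": 7, "relate": 4}, {"serial": 7, "relate": 9}],
})
for row in pooled:
    print(row)
```

Keeping both the source code and the general code in each pooled record is one plausible design choice: comparisons across samples use the general variable, while the original detail remains available for work within a single sample.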

Key words

comparative demographic history; census data set integration; ICAPUMS; IPUMS; coding schemes; Canada; United States



PUMS: Public Use Microdata Sample
ICAPUMS: Integrated Canadian-American Public Use Microdata Series
IPUMS: Integrated Public Use Microdata Series

Copyright information

© Kluwer Academic Publishers 1997