4.1 Overview of PIAAC Data Available for Secondary Analysis

Based on the PIAAC data, diverse interdisciplinary questions—such as social inequality, competency and ageing issues, and the role of digitalisation—can be investigated in an internationally comparable way, thereby addressing genuine political demands. PIAAC data contain information about basic skills (literacy, numeracy, and problem solving in technology-rich environments) that are considered to be prerequisites for understanding specific domains of knowledge in a broad range of contexts, from education through work to everyday life. Furthermore, the PIAAC data include a wide range of information on variables, such as social background, and engagement with literacy, numeracy, and information and communication technologies (ICTs) that influence the development and maintenance of skills. The data also include information on respondents’ current activity, employment status and income, and generic skill use in the workplace (e.g. social skills, manual skills). In addition, PIAAC includes questions on health status, volunteering, political efficacy, and social trust (OECD 2014).

The number of publications that refer to PIAAC has increased strongly in recent years (for an overview, see Maehler et al. 2020). Questions addressed by these publications include, for example:

  • To what extent does the educational attainment acquired through formal education predict literacy skills needed in daily life?

  • What is the relationship between skills and labour market outcomes in terms of wages and employment chances?

  • Who participates in further education, and why?

  • How should the (forced) migrant population be covered in future surveys?

  • How is test taking (dis)engagement related to cognitive ability or item difficulty?

  • To what extent are non-cognitive skills (e.g. the Big Five) related to cognitive skills such as literacy?

The present chapter provides an overview of the PIAAC datasets available worldwide (see Table 4.1).Footnote 1 It differentiates the available datasets in terms of their accessibility, the extent of the information provided, the population group in focus, and the design of the underlying study. PIAAC Public Use Files are intended for general public use and are therefore highly anonymised. They are freely available and integrated in data analysis web tools (see Chaps. 5 and 6 in this volume), which take the complex study design into account and allow international comparisons to be made without advanced knowledge of statistical programmes. By contrast, PIAAC Scientific Use Files and Restricted Use Files provide access to more detailed variables. Because they may contain individually identifiable information that is confidential and protected by law, they are available only for scientific research purposes and only after a data use agreement has been signed. Scientific use files are provided mainly by the statistics centres or research data centres of the respective countries, and sophisticated statistical knowledge is required for their analysis (see, e.g. Chaps. 7, 8, 9, and 10 in this volume). The present chapter also presents PIAAC datasets that focus on specific population groups, for example the population of 66- to 80-year-olds in Germany (Friebe et al. 2017) and the incarcerated adult population in the United States (Hogan et al. 2016a). Regarding the design of the underlying studies, although some longitudinal data exist, most available datasets are cross-sectional. All datasets reported in this chapter are listed in the reference list.

Most of the datasets presented in what follows can be merged using the respondent ID in order to perform cross-national analyses. When merging the datasets for the various countries, the variables SEQID and CNTRYID_E should be used as identifiers. Although SEQID is a unique identification key within each country dataset, it is not unique across countries. Thus, an identifier combining both variables must be created. Variable labels are identical throughout all PIAAC Public Use Files. Labels in the PIAAC Scientific Use Files (e.g. the German Scientific Use File) may differ for variables that include country-specific information when categories are collapsed for data protection reasons (e.g. CNT_CITSHIP). Therefore, in order to avoid loss of information, care must be taken when merging datasets. The International Database (IDB) Analyzer can also be used to merge PIAAC datasets (see also Chap. 6 in this volume).
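The need for a combined identifier can be illustrated with a minimal Python sketch. It is not an official procedure: the two inline CSV snippets, their values, the AGE column, and the name COMBINED_ID are invented for illustration, and real files contain many more variables. Only SEQID and CNTRYID_E are taken from the text above.

```python
import csv
import io

# Toy stand-ins for two country files. All values are invented placeholders.
COUNTRY_A = "CNTRYID_E,SEQID,AGE\n40,1,34\n40,2,51\n"
COUNTRY_B = "CNTRYID_E,SEQID,AGE\n276,1,27\n276,2,45\n"


def read_rows(text):
    """Parse CSV text into a list of dicts (one dict per respondent)."""
    return list(csv.DictReader(io.StringIO(text)))


def add_combined_id(rows):
    """Attach a key built from CNTRYID_E and SEQID to every row.

    COMBINED_ID is a hypothetical name chosen for this sketch.
    """
    for row in rows:
        row["COMBINED_ID"] = f"{row['CNTRYID_E']}_{row['SEQID']}"
    return rows


def pool(*datasets):
    """Stack country datasets and verify that the combined key is unique."""
    pooled = [row for rows in datasets for row in add_combined_id(rows)]
    ids = [row["COMBINED_ID"] for row in pooled]
    if len(ids) != len(set(ids)):
        raise ValueError("COMBINED_ID is not unique across the pooled data")
    return pooled


pooled = pool(read_rows(COUNTRY_A), read_rows(COUNTRY_B))
print([row["COMBINED_ID"] for row in pooled])
```

In this toy example, SEQID values 1 and 2 occur in both country files, so SEQID alone would not identify a respondent in the pooled data, whereas the combined key does. The uniqueness check is a design choice that makes an accidental double import of the same country fail loudly.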

The datasets are presented in this chapter in the order in which they appear in the columns in Table 4.1, beginning with the public use files and ending with the description of the PIAAC datasets on non-cognitive skills. The datasets of the countries within the different dataset groups are presented in alphabetical order.

4.2 PIAAC Public Use Files

File Description

The PIAAC Public Use Files contain information on the respondents’ background and on their cognitive assessment (in literacy, numeracy, and problem solving in technology-rich environments).

Mode of Data Collection

Face-to-face interview (computer-assisted personal interview, CAPI) to collect the background information; computer-based or paper-based assessment of skills in literacy, numeracy, and problem solving in technology-rich environments.

Sample Description and Size

In each participating country, the sample comprised approximately 5000 adults aged 16–65 years.Footnote 2

Format and Access

The PIAAC Public Use Files (see OECD 2016d to OECD 2016gg; OECD 2019b to OECD 2019g)Footnote 3 containing individual unit record data are freely available and accessible for downloading in SAS, SPSS, and CSV format (https://www.oecd.org/skills/piaac/data/) for each of the countries that participated in the Survey of Adult Skills in 2011–2012 (Round 1 of the first cycle: 24 countries), 2014–2015 (Round 2 of the first cycle: nine countries), and 2017 (Round 3 of the first cycle: six countries). A do-file to import CSV into Stata is also available.

The Australian PIAAC Public Use File is not available on the OECD website. However, researchers can apply to the Australian Bureau of Statistics for data accessFootnote 4,Footnote 5 or use the International Data Explorer to analyse the Australian data (see Chap. 5 in this volume).

Table 4.1 PIAAC datasets by country and year of assessment

The Cypriot PIAAC Public Use File (Michaelidou-Evripidou et al. 2016) is available for downloading in SPSS and Stata format at the GESIS Data Archive.Footnote 6 The US PIAAC Public Use File (Holtzman et al. 2014a) is also downloadable in SPSS, SAS, and ASCII format at the National Center for Education Statistics.Footnote 7 The Canadian PIAAC Public Use File (Canadian Public Use Microdata File/PUMF)Footnote 8 is also provided by Statistics Canada.Footnote 9

For cross-national analyses, the public use files can be merged using the respondent ID. The International Database (IDB) Analyzer can also be used to merge the PIAAC datasets (see Chap. 6 in this volume).

Documentation

Information on the methodology, design, and implementation of PIAAC can be found in the technical reports on the study (OECD 2014, 2016a; see also Chap. 2 in this volume) and in the results reports (OECD 2013, 2016b, c). An international master questionnaire is available for downloading at the OECD PIAAC Data and Tools webpage.Footnote 10 The questionnaires in the country-specific languages are also available on that webpage, as are an international codebook and a derived variables codebook.

4.3 PIAAC Log Files

File Description

The log files from the PIAAC study provide information on how participants processed their answers (OECD 2019a). During the PIAAC assessment 2011–2012 (Round 1 of the first cycle), user interactions with the computer were logged automatically. This means that respondents’ actions (e.g. starting a unit, opening a webpage, entering an answer) within the assessment tool were recorded and stored with time stamps in separate log files. These log files contain paradata for each participant in the domains of literacy, numeracy, and/or problem solving in technology-rich environments. More information on the log files and their analysis is available in Chap. 10 of this volume.

Sample Description and Size

PIAAC log file data are available for 17 countries that participated in Round 1 of the PIAAC study (see Table 4.1). The sample in each participating country comprised approximately 5000 adults aged 16–65 years.

Format and Access

The log data from the PIAAC cognitive assessments are available as public use files (see OECD 2017a to OECD 2017q) and can be downloaded free of charge from the GESIS Data Archive Footnote 11 after registering on the corresponding webpage. The PIAAC log files are provided in their raw XML format. The files usually contain the complete log data for individual respondents. However, information that could potentially identify an individual respondent has been removed. The data can be matched with corresponding background and cognitive response data available in the PIAAC Public Use Files using the SEQID variable.

To help researchers analyse log data, a customised analysis tool, the PIAAC LogDataAnalyzer, is available (currently at http://piaac-logdata.tba-hosting.de/download/). The tool includes functions such as data extraction, data cleaning, and visualisation of the log data files. It can be used for some data analysis tasks as well as for exporting selected data to files that can be used by other tools. Users can select variables for export and, in doing so, generate predefined variables such as ‘Number of using cancel button’; ‘Time on task’; and ‘Number of page visits’.
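Matching log-derived variables to the public use file of the same country via SEQID might look like the following minimal Python sketch. It assumes that the log-derived variables have already been exported to a flat table. The column names time_on_task_sec, page_visits, and background_var, as well as all values, are hypothetical and do not reproduce the actual export format of any tool.

```python
import csv
import io

# Invented stand-in for variables exported from the log files of ONE country.
# Column names other than SEQID are hypothetical.
LOG_EXPORT = (
    "SEQID,time_on_task_sec,page_visits\n"
    "1,84.2,3\n"
    "2,131.0,7\n"
    "4,59.5,2\n"
)

# Invented stand-in for an excerpt of the same country's public use file.
PUF_EXCERPT = (
    "SEQID,background_var\n"
    "1,A\n"
    "2,B\n"
    "3,C\n"
)


def index_by_seqid(text):
    """Read CSV text into a dict keyed by SEQID, rejecting duplicate IDs."""
    indexed = {}
    for row in csv.DictReader(io.StringIO(text)):
        seqid = row["SEQID"]
        if seqid in indexed:
            raise ValueError(f"duplicate SEQID {seqid}")
        indexed[seqid] = row
    return indexed


def match_on_seqid(log_rows, puf_rows):
    """Join both sources on SEQID and report IDs found in only one of them."""
    shared = sorted(log_rows.keys() & puf_rows.keys(), key=int)
    matched = [{**puf_rows[s], **log_rows[s]} for s in shared]
    log_only = sorted(log_rows.keys() - puf_rows.keys(), key=int)
    puf_only = sorted(puf_rows.keys() - log_rows.keys(), key=int)
    return matched, log_only, puf_only


matched, log_only, puf_only = match_on_seqid(
    index_by_seqid(LOG_EXPORT), index_by_seqid(PUF_EXCERPT)
)
print(len(matched), log_only, puf_only)
```

Reporting unmatched IDs rather than silently dropping them is deliberate: respondents without computer-based assessment data would not be expected to appear in the log data, and the size of that group is worth checking before analysis. Because SEQID is unique only within a country, this sketch applies to a single country file at a time.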

Documentation

Information on the methodology, design, and implementation of PIAAC can be found in the technical reports on the study (OECD 2014, 2016a; see also Chap. 2 in this volume) and in the results reports (OECD 2013, 2016b, c). An overview of process data recorded in log files in the PIAAC study and how to use them can be found in OECD (2019a), in Chap. 10 of the present volume, and on the PIAAC Log Data Documentation webpageFootnote 12. The aforementioned webpage provides, inter alia, information on released items, an overview of the interactions that are possible with the items, the corresponding log events, and the booklet order of the domains of cognitive assessment.

The documentation regarding released items is available to all users. Depending on the research question, the full documentation, which also covers non-released items, may be required. Because of this confidential content, individuals who wish to obtain access to the full documentation must apply to the OECD and sign a confidentiality agreement.Footnote 13 The completed application form and the signed confidentiality agreement must be sent to the contact officer at the OECDFootnote 14. If the application is approved, the user will be provided with a username and password that will grant access to the full documentation online.

4.4 Extended PIAAC Data File Versions

This section describes extended national datasets that are available for Austria, Canada, Germany, Italy, New Zealand, and the United States. They contain additional information (e.g. some of the national adaptations) and/or more detailed information (e.g. age or income).

Extended data files are also available for Norway (see Norwegian Center for Research Data)Footnote 15 and Sweden (see Statistics Sweden).Footnote 16 However, the rules of use in these countries are more restrictive (use is permitted only for researchers within the country), and information is available only in the language of the respective country. As the Norwegian and Swedish PIAAC data can be linked to administrative information, these datasets are presented in Sect. 4.6 on the linking of PIAAC data to administrative data.

4.4.1 Austria

4.4.1.1 Extended PIAAC Public Use File for Austria

File Description

The Austrian PIAAC Public Use File (OECD 2016d) contains information on the respondents’ background and on their cognitive assessment (in literacy, numeracy, and problem solving in technology-rich environments). The Extended PIAAC Public Use File for Austria contains additional national education variables.

Mode of Data Collection

Face-to-face interview (computer-assisted personal interview, CAPI) to collect the background information; computer-based or paper-based assessment of skills in literacy, numeracy, and problem solving in technology-rich environments.

Sample Description and Size

The sample comprised 5130 adults aged 16–65 years.

Format and Access

The dataset (Statistics Austria 2015) is available for downloading free of charge (in SPSS and Excel format) at Statistics Austria’s website.Footnote 17 The Extended PIAAC Public Use File for Austria can be merged with the PIAAC datasets of other participating countries in order to perform cross-national analyses. The International Database (IDB) Analyzer can be used to merge the PIAAC datasets (see Chap. 6 in this volume).

Documentation

Information on the methodology, design, and implementation of PIAAC can be found in the technical reports on the study (OECD 2014, 2016a; see also Chap. 2 in the present volume) and in the results reports (OECD 2013, 2016b, c). The international master questionnaire, the international codebook, and a derived variables codebook are available on the OECD PIAAC Data and Tools webpage. A German version of the background questionnaire is available for downloading at Statistics Austria’s website.Footnote 18

4.4.1.2 Scientific Use File PIAAC 2011/2012 for Austria

File Description

The Austrian PIAAC Public Use File (OECD 2016d) contains information on the respondents’ background and on their cognitive assessment (in literacy, numeracy, and problem solving in technology-rich environments). It excludes certain background variables (e.g. some of the national adaptations), and some variables were not released in all the available detail. These variables were suppressed or coarsened to comply with national data protection legislation. The Austrian PIAAC Scientific Use File includes many of the suppressed background variables. Furthermore, other variables (e.g. age and income) have been released in full detail.

Mode of Data Collection

Face-to-face interview (computer-assisted personal interview, CAPI) to collect the background information; computer-based or paper-based assessment of skills in literacy, numeracy, and problem solving in technology-rich environments.

Sample Description and Size

The sample comprised 5130 adults aged 16–65 years.

Format and Access

The dataset (Statistics Austria 2014) is available in SPSS format and is accessible for academic research only. Researchers must sign an individual data distribution contract (in English or German) provided at Statistics Austria’s websiteFootnote 19. The data distribution contract must be signed by the project leader; key information (e.g. title, description, and duration of project) about the project and the user(s) must be provided. The data are delivered free of charge. Users are expected to make publications resulting from the research available to the data provider. The Scientific Use File PIAAC 2011/2012 for Austria can be merged with the PIAAC datasets of other participating countries in order to perform cross-national analyses. The International Database (IDB) Analyzer can also be used to merge the PIAAC datasets (see Chap. 6 in this volume).

Documentation

Information on the methodology, design, and implementation of PIAAC can be found in the technical reports on the study (OECD 2014, 2016a; see also Chap. 2 in this volume) and in the results reports (OECD 2013, 2016b, c). The international master questionnaire, the international codebook, and a derived variables codebook are available on the OECD PIAAC Data and Tools webpage. A German version of the background questionnaire is available for downloading at Statistics Austria’s website.Footnote 20

4.4.2 Canada

4.4.2.1 Canadian Public Use Microdata File (PUMF)

File Description

While the Canadian PIAAC Public Use File (OECD 2016f) contains information on the respondents’ background and on their cognitive assessment (in literacy, numeracy, and problem solving in technology-rich environments), the Canadian PIAAC Public Use Microdata File (PUMF; Statistics Canada 2013) contains additional national variables (e.g. education).

Mode of Data Collection

Face-to-face interview (computer-assisted personal interview, CAPI) to collect the background information; computer-based or paper-based assessment of skills in literacy, numeracy, and problem solving in technology-rich environments.

Sample Description and Size

The sample comprised 26,683 adults aged 16–65 years.

Format and Access

The dataset (Statistics Canada 2013)Footnote 21 can be ordered free of charge (in SPSS and Excel format) at Statistics Canada’s website.Footnote 22 The PUMF can be merged with the PIAAC datasets of other participating countries in order to perform cross-national analyses. The International Database (IDB) Analyzer can be used to merge the PIAAC datasets (see Chap. 6 in this volume).

Documentation

English-language and French-language information on the methodology, design, and implementation of PIAAC can be found on the Canadian PIAAC website (http://www.piaac.ca). Furthermore, general information on the methodology, design, and implementation of PIAAC can be found in the technical reports on the study (OECD 2014, 2016a; see also Chap. 2 in this volume) and in the results reports (OECD 2013, 2016b, c). The PIAAC questionnaires (in English and French) can be downloaded at Statistics Canada’s website.Footnote 23 The international master questionnaire, the international codebook, and a derived variables codebook are available on the OECD PIAAC Data and Tools webpage. Furthermore, a Canadian Data Dictionary is available at the Canadian PIAAC website.

4.4.3 Germany

4.4.3.1 PIAAC Germany Scientific Use File (SUF)

File Description

The German PIAAC Public Use File (OECD 2016h) contains information on the respondents’ background and on their cognitive assessment (in literacy, numeracy, and problem solving in technology-rich environments). It suppresses certain background variables (e.g. some of the national adaptations), and some of the included variables have not been released in all available detail. Background variables were suppressed or coarsened to comply with national data protection legislation. The German PIAAC Scientific Use File includes many of these suppressed variables and releases other variables in full detail (e.g. age and income).

Mode of Data Collection

Face-to-face interview (computer-assisted personal interview, CAPI) to collect the background information; computer-based or paper-based assessment of skills in literacy, numeracy, and problem solving in technology-rich environments.

Sample Description and Size

The sample comprised 5465 adults aged 16–65 years.

Format and Access

The dataset (Rammstedt et al. 2016a) is available in SPSS and Stata format for academic research only, after signing a data distribution contract (in English or German).Footnote 24 The data distribution contract requires the provision of key information about the project (e.g. title, description, and duration) and the users. The data can be used only during the time period specified by the contract. Users are charged a processing fee. The PIAAC Germany Scientific Use File can be merged with the PIAAC datasets of other participating countries in order to perform cross-national analyses (the procedure is described by Perry et al. 2017). The International Database (IDB) Analyzer can also be used to merge the PIAAC datasets (see Chap. 6 in this volume).

Documentation

Information on the methodology, design, and implementation of PIAAC in Germany can be found in the technical report on the study (Zabal et al. 2014) and in the results reports (OECD 2013; Rammstedt et al. 2013). The German background questionnaire is available in PDF formatFootnote 25 and in HTML format.Footnote 26 A codebook in Excel format and a study description are available at the GESIS Data Archive.Footnote 27 Further documentation is also available on the PIAAC Research Data Center website.Footnote 28 Moreover, a User Guide (Perry et al. 2017) provides information necessary for conducting basic analyses using the corresponding PIAAC data.

4.4.3.2 PIAAC Germany Scientific Use File (SUF): Regional Data

File Description

This dataset provides detailed regional information that was excluded from the regular German PIAAC Scientific Use File due to national data protection legislation. Additionally available indicators include, for example, municipality code, classified size of the political municipality, and number of the sample point.

Mode of Data Collection

For the sample selection of the PIAAC study in Germany, the regional information was extracted from the official statistics of the Federal Statistical Office as of December 30, 2009 (Zabal et al. 2014).

Sample Description and Size

The sample comprised 5465 adults aged 16–65 years.

Format and Access

The dataset (Rammstedt et al. 2016b) is available in SPSS and Stata format and accessible for academic research only. For analyses, the data must be merged with the German PIAAC Scientific Use File (Rammstedt et al. 2016a) using the respondent ID. Use of these regional data is subject to special contractual provisions. Due to the sensitive nature of the data, special restrictions apply, and the data can be analysed only on-site at a guest workstation in the Safe Room at GESIS (contact: PIAAC Research Data Center).Footnote 29

Documentation

Information on the methodology, design, and implementation of PIAAC in Germany can be found in the technical report on the study (Zabal et al. 2014) and in the results reports (OECD 2013; Rammstedt et al. 2013). The German background questionnaire is available in PDF formatFootnote 30 and in HTML format.Footnote 31 A codebook in Excel format and a study description are available at the GESIS Data Archive.Footnote 32 Further documentation is also available on the PIAAC Research Data Center website.Footnote 33

4.4.3.3 PIAAC Germany Scientific Use File (SUF): Microm Data

File Description

The dataset contains contextual information that describes either the household or the neighbourhood of the respondents. This information was not included in the regular PIAAC Scientific Use File due to national data protection legislation. These spatial data are provided by microm Micromarketing-Systeme und Consult GmbH in Neuss, Germany.Footnote 34 The microm data available include more than 100 variables from the domains of sociodemographics and socio-economics, consumer behaviour, area and site planning, and strategic segmentation models. For example, variables contain information about the type of residential area, the number of private households and businesses, sociodemographic and socio-economic characteristics (e.g. unemployment, religious denominations, ethnic composition), mobility (e.g. population fluctuation), affinity towards fundraising, communications and print media, Sinus-Milieus®, and purchasing power at the level of street sections.

Mode of Data Collection

The microm data are compiled from several cooperation partners that focus on market research (e.g. public opinion), financial data (e.g. credit institutions), or digital or IT data (e.g. telephone companies). The PIAAC survey collects the background information by means of a face-to-face interview (computer-assisted personal interview, CAPI); the assessment of skills in literacy, numeracy, and problem solving in technology-rich environments is computer-based or paper-based.

Sample Description and Size

The sample comprised 5465 adults aged 16–65 years.

Format and Access

The dataset (Rammstedt et al. 2017a, b) is available in SPSS and Stata format and accessible for academic research only. For analyses, the data must be merged with the German PIAAC Scientific Use File (Rammstedt et al. 2016a) using the respondent ID. Use of this dataset is subject to special contractual provisions. Due to the sensitive nature of the data, special restrictions apply, and the data can be analysed only on-site at a guest workstation in the Safe Room at GESIS (contact: PIAAC Research Data Center).Footnote 35

Documentation

Information on the methodology, design, and implementation of PIAAC in Germany can be found in the technical report on the study (Zabal et al. 2014) and in the results reports (OECD 2013; Rammstedt et al. 2013). The German background questionnaire is available in PDF formatFootnote 36 and in HTML format.Footnote 37 A codebook in Excel format and a study description are available at the GESIS Data Archive Footnote 38. Further documentation is also available on the PIAAC Research Data Center website.Footnote 39

4.4.4 Italy

4.4.4.1 PIAAC Italian Extended File

File Description

For Italy, an Extended PIAAC Public Use File contains additional national variables on respondents’ background, for example regional information (macro region: North East, North West, Centre, South, Islands) and information on parents’ occupation (e.g. according to ISCO-08).

Mode of Data Collection

Face-to-face interview (computer-assisted personal interview, CAPI) to collect the background information; computer-based or paper-based assessment of skills in literacy, numeracy, and problem solving in technology-rich environments.

Sample Description and Size

The sample comprised 4621 adults aged 16–65 years.

Format and Access

The Italian PIAAC Public Use File – Extended (INAPP 2018) is usually provided in SPSS format. However, on specific request, the dataset can also be provided in SAS or Stata format. Researchers or other interested persons must sign an individual data distribution contract (in English or Italian) provided by the Istituto Nazionale per l’Analisi delle Politiche Pubbliche (INAPP)Footnote 40. The agreement does not specify a data usage period. The data are provided free of charge. The Italian PIAAC Public Use File – Extended can be merged with the public use files of other participating countries in order to perform cross-national analyses. The International Database (IDB) Analyzer can be used to merge the PIAAC datasets (see Chap. 6 in this volume).

Documentation

Information on the methodology, design, and implementation of PIAAC can be found in the technical reports on the study (OECD 2014, 2016a; see also Chap. 2 in this volume) and in the results reports (OECD 2013, 2016b, c). An Italian-language version of the background questionnaire is available for downloading at the INAPP websiteFootnote 41. The questionnaire is also available on the OECD PIAAC Data and Tools webpage, as are an international codebook and a derived variables codebook.

4.4.5 New Zealand

4.4.5.1 PIAAC New Zealand Extended File

File Description

The New Zealand PIAAC Public Use File (OECD 2016w) contains information on respondents’ background and on their cognitive assessment (in literacy, numeracy, and problem solving in technology-rich environments). For New Zealand, an extended public use file is available with country-specific variables (e.g. education) and international variables (e.g. a continuous age variable) that were confidentialised or suppressed for the public use file version.

Mode of Data Collection

Face-to-face interview (computer-assisted personal interview, CAPI) to collect the background information; computer-based or paper-based assessment of skills in literacy, numeracy, and problem solving in technology-rich environments.

Sample Description and Size

The sample comprised 6177 adults aged 16–65 years. The sample design included screening for two subpopulations: 16- to 25-year-olds and persons of Māori ethnicity. The additional samples obtained for these subpopulations support more in-depth analysis. The total achieved sample sizes were N = 1422 for 16- to 25-year-olds and N = 1146 for Māori.

Format and Access

The extended New Zealand PIAAC Public Use File (Ministry of Education of New Zealand 2016) is provided in a range of formats (SPSS, Stata, and SAS) by the Government of New Zealand. The Ministry of Education makes this dataset available to researchers under a memorandum of understanding (MOU). The following webpage provides information on New Zealand’s participation in PIAAC: https://www.educationcounts.govt.nz/data-services/data-collections/international/piaac.Footnote 42 A data usage period is not specified by the contract. The MOU continues to apply while the researcher is using or retains the dataset. The data are provided free of charge.

The PIAAC New Zealand Extended File can be merged with the public or extended use files of other participating countries in order to perform cross-national analyses. The International Database (IDB) Analyzer can be used to merge the PIAAC datasets (see Chap. 6 in this volume).

Documentation

Information on the methodology, design, and implementation of PIAAC can be found in the technical reports on the study (OECD 2014, 2016a; see also Chap. 2 in this volume) and in the results reports (OECD 2013, 2016b, c). The international master questionnaire, the international codebook, and a derived variables codebook are available on the OECD PIAAC Data and Tools webpage. Furthermore, a data dictionary is available for the New Zealand national variables.

4.4.6 United States

4.4.6.1 US PIAAC 2012 Restricted Use File (RUF)

File Description

The US PIAAC Restricted Use File (RUF) contains information on respondents’ background and on their cognitive assessment (in literacy, numeracy, and problem solving in technology-rich environments) from the US PIAAC main study, for which data collection was completed in 2012. In addition to the variables in the US PIAAC Public Use File (NCES 2014-045REV; OECD 2016gg), the US PIAAC Restricted Use File contains detailed versions of variables (e.g. continuous age and earnings variables) and additional data (e.g. on race and ethnicity) collected through US-specific questionnaire routing. The data contain sensitive information, which is confidential and protected by US federal law.

Mode of Data Collection

Face-to-face interview (computer-assisted personal interview, CAPI) to collect the background information; computer-based or paper-based assessment of skills in literacy, numeracy, and problem solving in technology-rich environments.

Sample Description and Size

The sample comprised 5010 adults aged 16–65 years.

Format and Access

The US PIAAC Restricted Use File (Holtzman et al. 2014b) is available in SPSS and SAS formats and accessible only for scientific research purposes and only in the United States. Individual researchers must apply through an organisation in the United States (e.g. a university or a research institution). The organisation must apply for and sign a contract prior to obtaining access to the restricted-use data. Depending on the type of organisation, this contract takes the form of a restricted-use data licence or a memorandum of understanding (MOU).Footnote 43 The application must be submitted via an online application systemFootnote 44. Key information must be provided about the project (e.g. title, description, and duration) and the user. The data can be used only during the time period specified by the contract. Users are charged a processing fee and are expected to make publications resulting from the research available to the data provider.

Documentation

Information on the methodology, design, and implementation of PIAAC can be found in the technical reports on the study (OECD 2014, 2016a; see also Chap. 2 in this volume) and in the results reports (OECD 2013, 2016b, c). In addition, specific information on the methodology, design, and implementation of PIAAC in the United States can be found in the technical report on the study (Hogan et al. 2013) and in the results reports (Goodman et al. 2013; OECD 2013). An English-language and a Spanish-language background questionnaire (HTML format) are available for downloading at the National Center for Education Statistics website.Footnote 45 The US codebook and background compendium are provided together with the data.

4.4.6.2 PIAAC 2012/2014: US National Supplement Public Use Data File (PUF) – Household

File Description

The PIAAC 2012/2014 US National Supplement Public Use Data Files – Household (Holtzman et al. 2016a; NCES 2016667REV) contain information on respondents’ background and on their cognitive assessment (in literacy, numeracy, and problem solving in technology-rich environments) from the first and second US PIAAC data collections completed in 2012 and 2014, respectively. The 2014 sampling design supported oversampling (younger adults, aged 16–35, and unemployed adults) and the addition of a population group (older adults, aged 66–74), but the data cannot be analysed separately from the 2012 data on a national level. The expanded national sample of the combined data collections supports more accurate and reliable national estimates for these subgroups and, in the case of older adults, estimates for new groups not represented in the first round of PIAAC.

Mode of Data Collection

Face-to-face interview (computer-assisted personal interview, CAPI) to collect the background information; computer-based or paper-based assessment of skills in literacy, numeracy, and problem solving in technology-rich environments.

Sample Description and Size

The US PIAAC main study (2012) sample comprised 5010 adults aged 16–65 years. The US PIAAC National Supplement (2014) household sample comprised 3660 adults aged 16–74 years. Hence, the dataset contains a total of 8670 surveyed respondents.
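The composition of the combined file can be checked with a short sketch. Only the respondent counts and age ranges are taken from the text; the dictionary labels and the function name are illustrative assumptions and do not correspond to variable names in the NCES files.

```python
# Respondent counts and age ranges as stated in the text.
# The collection labels are hypothetical, chosen only for this sketch.
COLLECTIONS = {
    "main_study_2012": {"n": 5010, "age_range": (16, 65)},
    "national_supplement_2014": {"n": 3660, "age_range": (16, 74)},
}


def combined_sample(collections):
    """Return the total number of respondents and the overall age span."""
    total = sum(c["n"] for c in collections.values())
    youngest = min(c["age_range"][0] for c in collections.values())
    oldest = max(c["age_range"][1] for c in collections.values())
    return total, (youngest, oldest)


total, age_span = combined_sample(COLLECTIONS)
print(total, age_span)  # 8670 (16, 74)
```

The wider age span of the combined file stems entirely from the 2014 supplement, which is why estimates for adults aged 66–74 rest on the 2014 respondents alone.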

Format and Access

The US PIAAC 2012/2014 National Supplement Public Use File (Holtzman et al. 2016a) is available for downloading in SPSS, SAS, and raw format at the National Center for Education Statistics website.Footnote 46 A version of the Public Use File is provided on the OECD website Footnote 47, thus enabling researchers to conduct cross-country analyses using the 2012/2014 combined household US sample.

Documentation

Information on the methodology, design, and implementation of PIAAC can be found in the technical reports on the study (OECD 2014, 2016a; see also Chap. 2 in this volume) and in the results reports (OECD 2013, 2016b, c). In addition, specific information on the methodology, design, and implementation of PIAAC in the United States can be found in the technical report on the study (Hogan et al. 2016a) and in the results report (Rampey et al. 2016). An English-language and a Spanish-language background questionnaire (HTML format) and a codebook and background compendium are available for downloading at the National Center for Education Statistics website.Footnote 48

4.4.6.3 PIAAC 2012/2014: US National Supplement Restricted Use Data File (RUF) – Household

File Description

The US PIAAC 2012/2014 National Supplement Restricted Use Data Files – Household (Holtzman et al. 2016b; NCES 2016668REV) contain information on respondents’ background and on their cognitive assessment (in literacy, numeracy, and problem solving in technology-rich environments) from the first and second US PIAAC data collections, completed in 2012 and 2014, respectively. The 2014 sampling design supported oversampling (younger adults, aged 16–35, and unemployed adults) and the addition of a population group (older adults, aged 66–74), but the data cannot be analysed separately from the 2012 data on a national level. The expanded national sample of the combined data collections supports more accurate and reliable national estimates for these subgroups and, in the case of older adults, estimates for new groups not represented in the first round of PIAAC. The Restricted Use Files contain detailed versions of variables and additional data collected through US-specific questionnaire routing (e.g. continuous age and earnings variables, language spoken). A detailed variable-level comparison of the PUF and RUF versions is available in the technical report (Table E-5; Hogan et al. 2016).

Mode of Data Collection

Face-to-face interview (computer-assisted personal interview, CAPI) to collect the background information; computer-based or paper-based assessment of skills in literacy, numeracy, and problem solving in technology-rich environments.

Sample Description and Size

The US PIAAC main study (2012) sample comprised 5010 adults aged 16–65 years. The US PIAAC National Supplement (2014) household sample comprised 3660 adults aged 16–74 years. Hence, the dataset contains a total of 8670 surveyed respondents.

Format and Access

The PIAAC 2012/2014 US National Supplement Restricted Use File (Holtzman et al. 2016b) is available in SPSS and SAS format and accessible only for scientific research purposes and only in the United States. Individual researchers must apply through an organisation in the United States (e.g. a university or a research institution). The organisation must apply for and sign a contract prior to obtaining access to the restricted-use data. Depending on the type of organisation, this contract takes the form of a restricted-use data licence or a memorandum of understanding (MOU). The application must be submitted via an online application systemFootnote 49. Key information must be provided about the project (e.g. title, description, and duration) and the user. The data can be used only during the time period specified by the contract. Users are charged a processing fee and are expected to make publications resulting from the research available to the data provider.

A synthetic version of the Restricted Use File (S-RUF) is provided on the OECD websiteFootnote 50 in order to enable researchers outside the United States to prepare computer code for the analysis of PIAAC data on the US Restricted Use File (RUF). The generated code (in SAS, SPSS, or Stata) must then be submitted to the American Institutes for ResearchFootnote 51, where the requested analyses will be run on the real US RUF. The output undergoes a confidentiality review and is returned to the researcher after approval. The synthetic version does not include variables with open-ended/verbatim responses or variables with a high degree of detail (e.g. occupation).

Documentation

Information on the methodology, design, and implementation of PIAAC can be found in the technical reports on the study (OECD 2014, 2016a; see also Chap. 2 in this volume) and in the results reports (OECD 2013, 2016b, c). In addition, specific information on the methodology, design, and implementation of PIAAC in the United States can be found in the technical report on the study (Hogan et al. 2016a) and in the results report (Rampey et al. 2016). An English-language and a Spanish-language background questionnaire (HTML format) are available for downloading at the National Center for Education Statistics website.Footnote 52 The codebook and background compendium are provided together with the data. For the synthetic version of the RUF (and researchers outside the United States), a codebook and a User Guide are available on the OECD PIAAC Data and Tools webpage.

4.5 PIAAC Data Files with a Focus on Specific Population Groups

4.5.1 Germany

4.5.1.1 German PIAAC National Supplement (SUF): Prime Age

File Description

The German PIAAC Prime Age dataset comprises a national oversample of adults in former East Germany aged 26–55 years from Round 1 of the PIAAC data collection in Germany, which was completed in 2012. This is considered to be an age group whose members are in the active employment phase and have usually completed vocational training. Respondents were surveyed using the same procedures, instruments, and assessments that were used for the PIAAC main study. The dataset contains background information and information on the cognitive assessment (in literacy, numeracy, and problem solving in technology-rich environments).

Mode of Data Collection

Face-to-face interview (computer-assisted personal interview, CAPI) to collect the background information; computer-based or paper-based assessment of skills in literacy, numeracy, and problem solving in technology-rich environments.

Sample Description and Size

The oversample comprised 560 adults aged 26–55 years. In total (i.e. together with the participants of the German PIAAC main study in the corresponding age group), the sample contains 4000 adults aged 26–55 years.

Format and Access

The dataset (Solga and Heisig 2015) is available in SPSS and Stata format for academic research only, after signing a data distribution contract (in English or German).Footnote 53 In addition, key information about the project (e.g. title, description, and duration) and the user must be provided. The data can be used only during the time period specified by the contract. Users are charged a processing fee and are expected to make publications resulting from the research available to the data provider.

Documentation

Information on the methodology, design, and implementation of PIAAC in Germany can be found in the technical report on the study (Zabal et al. 2014) and in the results reports (OECD 2013; Rammstedt et al. 2013). The German background questionnaire is available in PDF formatFootnote 54 and in HTML format.Footnote 55 The codebook (in Excel format) is available at the GESIS Data Archive,Footnote 56 and further documentation is available on the PIAAC Research Data Center website.Footnote 57

4.5.1.2 German PIAAC National Supplement (SUF): Competencies in Later Life (CiLL)

File Description

The German PIAAC CiLL study (Friebe et al. 2014) comprises a national oversample of adults aged 66–80 years from Round 1 of the PIAAC data collection in Germany, which was completed in 2012. Respondents were surveyed using the same procedures, instruments, and assessments that were used for the PIAAC main study. The dataset contains background information and information on the cognitive assessment (in literacy, numeracy, and problem solving in technology-rich environments).

Mode of Data Collection

Face-to-face interview (computer-assisted personal interview, CAPI) to collect the background information; computer-based or paper-based assessment of skills in literacy, numeracy, and problem solving in technology-rich environments.

Sample Description and Size

The sample comprised 1392 adults aged 66–80 years.

Format and Access

The dataset (Friebe et al. 2017) is available in SPSS and Stata format for academic research only, after signing a data distribution contract (in English or German).Footnote 58 In addition, key information about the project (e.g. title, description, and duration) and the user must be provided. The data can be used only during the time period specified by the contract. Users are charged a processing fee and are expected to make publications resulting from the research available to the data provider.

Documentation

Information on the methodology, design, and implementation of PIAAC in Germany can be found in the technical report on the study (Zabal et al. 2014) and in the results reports (OECD 2013; Rammstedt et al. 2013). The German background questionnaire is available in PDF formatFootnote 59 and in HTML format.Footnote 60 The codebook in Excel format and a study description are available at the GESIS Data Archive.Footnote 61 Further documentation is also available on the PIAAC Research Data Center website.Footnote 62

4.5.2 United States

4.5.2.1 PIAAC 2014: US National Supplement Public Use Data Files (PUF)-Prison

File Description

The PIAAC 2014 US National Supplement Public Use Data Files-Prison (Hogan et al. 2016a; NCES 2016337REV) contain information on the background and the cognitive assessment (in literacy, numeracy, and problem solving in technology-rich environments) of incarcerated adults surveyed in the US PIAAC National Supplement Prison Study, data collection for which was conducted in 2014. The direct assessments of literacy, numeracy, and problem solving in technology-rich environments administered to adult inmates were the same as those administered to the US PIAAC household participants. However, the household background questionnaire was modified and tailored specifically to address the experiences and needs of this subgroup.

Mode of Data Collection

Face-to-face interview (computer-assisted personal interview, CAPI) to collect the background information; computer-based or paper-based assessment of skills in literacy, numeracy, and problem solving in technology-rich environments.

Sample Description and Size

The sample comprised 1319 adults aged 16–74 years incarcerated in prisons in the United States.

Format and Access

The PIAAC 2014 US National Supplement Public Use Data Files-Prison (Hogan et al. 2016a) are available for downloading in SPSS, SAS, and raw format at the National Center for Education Statistics website Footnote 63.

Documentation

Information on the methodology, design, and implementation of US PIAAC can be found in the technical report on the study (Hogan et al. 2016a) and in the results report (Rampey et al. 2016). An English-language and a Spanish-language background questionnaire (HTML format) and a codebook and background compendium are available for downloading at the National Center for Education Statistics websiteFootnote 64.

4.5.2.2 PIAAC 2014: US National Supplement Restricted Use Data Files (RUF)-Prison

File Description

The PIAAC 2014 US National Supplement Restricted Use Data Files-Prison (Hogan et al. 2016b; NCES 2016058REV) contain information on the background and the cognitive assessment (in literacy, numeracy, and problem solving in technology-rich environments) of incarcerated adults who were surveyed in the US PIAAC National Supplement Prison Study, data collection for which was conducted in 2014. The direct assessments of literacy, numeracy, and problem solving in technology-rich environments administered to adult inmates were the same as those administered to the US PIAAC household participants. However, the household background questionnaire was modified and tailored specifically to address the experiences and needs of this subgroup. The Restricted Use Files contain detailed versions of variables and additional data collected through US-specific questionnaire routing (e.g. continuous age and earnings variables, language spoken). A detailed variable-level comparison of the PUF and RUF versions is available in the technical report (Table E-6; Hogan et al. 2016b).

Mode of Data Collection

Face-to-face interview (computer-assisted personal interview, CAPI); computer-based or paper-based measurement of basic skills in literacy, numeracy, and problem solving in technology-rich environments.

Sample Description and Size

The sample comprised 1319 adults aged 16–74 years incarcerated in prisons in the United States.

Format and Access

The PIAAC 2014 US National Supplement Restricted Use Files-Prison (Hogan et al. 2016b) are available in SPSS and SAS format and accessible only for academic research and only in the United States. Individual researchers must apply for access through an organisation in the United States (e.g. a university or a research institution). The organisation must apply for and sign a contract prior to obtaining access to the restricted-use data. Depending on the type of organisation, this contract takes the form of a restricted-use data licence or a memorandum of understanding (MOU).Footnote 65 The application must be submitted via an online application systemFootnote 66. Key information must be provided about the project (e.g. title, description, and duration) and the user. The data can be used only during the time period specified by the contract. Users are charged a processing fee and are expected to make publications resulting from the research available to the data provider.

Documentation

Information on the methodology, design, and implementation of US PIAAC can be found in the technical report on the study (Hogan et al. 2016a) and in the results report (Rampey et al. 2016). An English-language and a Spanish-language background questionnaire (HTML format) are available for downloading at the National Center for Education Statistics websiteFootnote 67. The codebook and background compendium are provided together with the data.

4.6 Linking PIAAC Data Files to Administrative Data

To date, datasets linking PIAAC data to administrative data are in the pilot phase and are partially available in Canada (Longitudinal and International Study of Adults, LISA), the Nordic countries (Denmark, Estonia, Finland, Norway, and Sweden), and Germany.

In Canada, the LISA data, which include the PIAAC data in the first wave of measurement, are available for in-country research. The LISA data can be linked to historical administrative data since 1982 (e.g. Pension Plan in Canada, PPIC, or the Immigration Database). The linkage to administrative data is available for 8600 LISA respondents who underwent PIAAC assessments (at Wave 1).

Norway and Sweden already offer researchers the possibility of analysing the respective country data on PIAAC by linking them to administrative data. In Norway, however, this possibility is available only to researchers within the country. To broaden access, NordMAN (Nordic Microdata Access Network; http://nordman.network/) has been established; it will integrate PIAAC survey data linked to administrative data for five European countries (Denmark, Estonia, Finland, Norway, and Sweden) on a common platform, thereby widening access for researchers within the Network. Extending use to researchers outside the Network is currently under discussion; it depends, among other things, on legal issues. These data are described in Sect. 4.6.4.

In a pilot project, the German PIAAC-Longitudinal (PIAAC-L) data have been individually linked to the employment biography data provided by the German Institute for Employment Research (IAB). The resulting dataset is known as PIAAC-L-ADIAB. The linked administrative data are available for 2086 PIAAC-L respondents (at Wave 1) and were tested and analysed by researchers as part of the pilot. An exemplary description of the work with these data can be found in Chap. 11 in this volume.

4.6.1 Canada

4.6.1.1 Longitudinal and International Study of Adults (LISA)

File Description

The Longitudinal and International Study of Adults (LISA) examines changes in Canadian society over time. There have been four waves of LISA data collection to date: Wave 1 in 2012, Wave 2 in 2014, Wave 3 in 2016, and Wave 4 in 2018 (not yet released). Data collection for Wave 5 will begin in January 2020. In Wave 1 (2011–2012), to improve operational efficiency and enhance analytical value, LISA and PIAAC shared a portion of their samples. LISA collects a wide range of information about education, training and learning, families, housing, health, labour, income, pensions, spending, and wealth. Variables are obtained through the administration of the survey component and subsequent integration with various administrative files. The Canadian PIAAC data contain information on respondents’ background and on their cognitive assessment (in literacy, numeracy, and problem solving in technology-rich environments). The target populations of LISA and PIAAC (2011–2012) differed. The LISA target population covered individuals aged 15 years and over, whereas the PIAAC target population covered only 16- to 65-year-olds. Because PIAAC and LISA share a common sample, proficiency scores can be analysed together with a wide range of LISA survey variables and administrative data.

Mode of Data Collection

Face-to-face interview (computer-assisted personal interview, CAPI) to collect the background information; computer-based or paper-based assessment of skills in literacy, numeracy, and problem solving in technology-rich environments.

Sample Description and Size

LISA uses household interviews to collect information from approximately 34,000 Canadians aged 15 years and over from more than 11,000 households (23,900 responding persons in 2012). Data from the PIAAC assessment are available for 8600 respondents, and only in the LISA 2012 (Wave 1) microdata files.

Format and Access

The LISA data are currently available only in Canada, via Canadian Research Data Centres (RDCs). Researchers must submit proposals to the RDC Program requesting LISA data and must specify whether they require access to the LISA survey data or the LISA data integrated with administrative data. The application process and guidelines depend on the affiliation of the principal investigator (e.g. whether the researcher works for an academic institution that is a member of the Canadian Research Data Centre Network) and the type of research to be conducted. Detailed information on the data access process can be found on the Statistics Canada website.Footnote 68 Users are expected to make publications resulting from the research available to the data provider.

Documentation

English-language and French-language information on the methodology, design, and implementation of LISA and PIAAC can be found on the Statistics Canada websiteFootnote 69 and on the Canadian PIAAC websiteFootnote 70. Furthermore, general information on the methodology, design, and implementation of PIAAC can be found in the technical reports on the study (OECD 2014, 2016a; see also Chap. 2 in this volume) and in the results reports (OECD 2013, 2016b, c). The questionnaires (in English and French) of all waves of LISA can be downloaded at Statistics Canada’s website.Footnote 71 The international master questionnaire, the international codebook, and a derived variables codebook are available on the OECD PIAAC Data and Tools webpage.

4.6.2 Norway

4.6.2.1 Linking PIAAC Norway Data to Administrative Data

File Description

The Norwegian PIAAC data contain information on respondents’ background and on their cognitive assessment (in literacy, numeracy, and problem solving in technology-rich environments). The PIAAC data provided by the Norwegian Centre for Research Data (NSD)Footnote 72 contain more detailed information—for example, on earnings, country of birth, and occupation (detailed, four-digit, ISCO-08 codes)—than that available in the Norwegian PIAAC Public Use File (OECD 2016v).

Furthermore, the Norwegian PIAAC data can be extended with administrative (register) data, such as demographic data (e.g. citizenship and marital status), data on educational attainment and current education, employment, occupation and industry, and information about the workplace of the respondents and about social security for the years 2010–2020. These linked data are provided by Statistics Norway.

Mode of Data Collection

PIAAC data: face-to-face interview (computer-assisted personal interview, CAPI) to collect the background information; computer-based or paper-based assessment of skills in literacy, numeracy, and problem solving in technology-rich environments. The administrative data are derived from administrative registers (e.g. the population register).

Sample Description and Size

The sample comprised 5128 adults aged 16–65 years.

Format and Access

The Norwegian PIAAC data (Statistics Norway 2015) are provided in SPSS, Stata, and SAS format by the Norwegian Centre for Research Data (NSD) to researchers, teachers, and students located in Norway. Data can be ordered via NSD’s order form (currently at https://nsd.no/nsd/english/orderform.html).Footnote 73 Users must sign an access letter and a confidentiality agreement that stipulates conditions for use. The data distribution contract must be signed by each member of a project who wishes to use the data. In addition, key information about the project and the user must be provided. The data contract can be concluded for a term of 2 years.

The Norwegian PIAAC dataset can also be extended with variables from administrative registers. Anonymous datasets are created by Statistics Norway for specific research projects. In other words, when researchers apply for access, a dataset with the specific variables ordered is created for the research project in question.Footnote 74

Documentation

Information on the methodology, design, and implementation of PIAAC can be found in the technical reports on the study (OECD 2014, 2016a; see also Chap. 2 in this volume) and in the results reports (Fridberg et al. 2015; OECD 2013, 2016b, c). There is also a national documentation report (Gravem and Lagerstrøm 2013). Further documentation for administrative data is made available when the data/variables are ordered. A Norwegian-language version of the background questionnaire is available for downloading at the OECD PIAAC Data and Tools webpage, as are an international codebook and a derived variables codebook.

4.6.3 Sweden

4.6.3.1 Linking PIAAC Sweden Data to Administrative Data

File Description

The Swedish PIAAC data contain information on respondents’ background and on their cognitive assessment (in literacy, numeracy, and problem solving in technology-rich environments). The PIAAC data at Statistics Sweden contain more detailed information—for example, on earnings, country of birth, and occupation (detailed, four-digit, ISCO-08)—than that available in the Swedish PIAAC Public Use File (OECD 2016dd). Furthermore, the Swedish PIAAC data were extended with administrative (register) data, such as demographic data (e.g. citizenship and marital status), data on educational attainment and current education, employment, occupation and industry, and information about the workplace of the respondents and about social security. This information is available for the years 2008 and 2011 for each respondent of PIAAC 2012. It is also possible to combine the PIAAC data with register data about the region in which the respondent lives (e.g. NUTS 2).

Mode of Data Collection

PIAAC data: face-to-face interview (computer-assisted personal interview, CAPI) to collect the background information; computer-based or paper-based assessment of skills in literacy, numeracy, and problem solving in technology-rich environments. The administrative data are derived from administrative registers (e.g. the population register).

Sample Description and Size

The sample comprised 4469 adults aged 16–65 years.

Format and Access

The Swedish PIAAC data are provided in SPSS, Stata, SAS, and R format for research purposes within the EU/EEA through the remote access system MONA (Microdata Online Access) at Statistics Sweden Footnote 75. MONA is a tool for delivering microdata at Statistics Sweden. Users of MONA work in a Windows environment via remote connection. Microdata are visible on the computer screen and can be processed using statistical software available in MONA. Results can be retrieved via email, but processed microdata are stored in MONA and may not be downloaded.

There is not one standard dataset. Rather, datasets with register variables have to be created for a specific research project. When researchers apply for access, a dataset with the specific variables ordered is created for the research project in question. Research projects must apply to Statistics Sweden for access to the data; a research plan, also containing a description of variables, should be included in the application. Statistics Sweden conducts a confidentiality review based on the research plan. If the application is approved and confidentiality agreements between Statistics Sweden and the research project are signed, the project obtains access to the data through MONA.
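The idea of a project-specific dataset can be illustrated with a minimal sketch: survey records are joined to register records by a respondent identifier, and only the register variables ordered for the project are attached. All identifiers, variable names, and values below are invented for illustration; they do not reflect the actual Statistics Sweden data structure or the procedure used in MONA.

```python
# Hypothetical survey and register extracts keyed by a pseudonymous respondent ID.
# All values are invented for this sketch.
survey = {
    "r001": {"literacy_score": 281.0, "age_group": "25-34"},
    "r002": {"literacy_score": 265.5, "age_group": "45-54"},
}
register = {
    "r001": {"marital_status": "married", "industry": "education", "region": "SE11"},
    "r002": {"marital_status": "single", "industry": "health", "region": "SE23"},
}


def build_project_dataset(survey, register, ordered_variables):
    """Attach only the ordered register variables to each survey record."""
    dataset = {}
    for respondent_id, answers in survey.items():
        row = dict(answers)  # copy so the survey extract stays unchanged
        register_row = register.get(respondent_id, {})
        for name in ordered_variables:
            # A respondent without a register entry gets None, not a dropped row.
            row[name] = register_row.get(name)
        dataset[respondent_id] = row
    return dataset


project = build_project_dataset(survey, register, ["marital_status", "region"])
print(project["r001"])
```

In this sketch, register variables that were not ordered (here, `industry`) never enter the project dataset, which mirrors the principle that each research project receives only the variables described in its research plan.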

Documentation

Information on the methodology, design, and implementation of PIAAC can be found in the technical reports on the study (OECD 2014, 2016a; see also Chap. 2 in this volume) and in the results reports (Fridberg et al. 2015; OECD 2013, 2016b, c). Further documentation for administrative data is made available when the data/variables are ordered. A Swedish-language version of the background questionnaire is available for downloading at the OECD PIAAC Data and Tools webpage, as are an international codebook and a derived variables codebook.

4.6.4 The Nordic PIAAC Database

File Description

The Nordic PIAAC database contains microdata from the survey, as well as data from registers of five Nordic European countries: Denmark, Estonia, Finland, Norway, and Sweden. It contains information on respondents’ background and on their cognitive assessment (in literacy, numeracy, and problem solving in technology-rich environments) from the PIAAC data collection completed in 2012. Furthermore, data from national registers in Denmark, Estonia, Finland, and Sweden for the reference years 2008 and 2011 and in Norway for 2011 are available for each respondent. Diverse types of register data are available, such as demographic data (e.g. citizenship and marital status), data on educational attainment and current education, employment, occupation and industry, and information about the workplace of the respondents and about social security.

Mode of Data Collection

PIAAC data: face-to-face interview (computer-assisted personal interview, CAPI) to collect the background information; computer-based or paper-based assessment of skills in literacy, numeracy, and problem solving in technology-rich environments. The administrative data are derived from the administrative registers of the respective countries (e.g. the population register).

Sample Description and Size

The sample comprised 7328 adults aged 16–65 years in Denmark, 7632 adults aged 16–65 years in Estonia, 5464 adults aged 16–65 years in Finland, 5128 adults aged 16–65 years in Norway, and 4469 adults aged 16–65 years in Sweden (Fridberg et al. 2015).

Format and Access

The Nordic PIAAC database is stored in safe domains of the Nordic National Statistical Institutions (Nordic NSIs). It is currently provided only for research purposes within the Network countries and can be accessed via NordMAN (Nordic Microdata Access Network)Footnote 76. NordMAN describes the processes for obtaining access to Nordic PIAAC data combined with register data (application forms and procedures, confidentiality review and agreements, etc.). The data can be accessed via remote access systems at the statistical offices in Sweden, Finland, and Denmark.

There is not one standard dataset. Rather, datasets with register variables have to be created for specific research projects. When researchers apply for access, a dataset with the specific variables ordered is created for the research project in question. The application must be submitted to a committee comprising representatives from each country. In Sweden, for instance, the application must be approved by Statistics Sweden. If the application is approved, the researcher signs the necessary contracts and confidentiality agreements and is then allowed to analyse the Nordic microdata in SPSS, Stata, SAS, or R format via NordMAN. Prior to data delivery, all outputs are subject to output control by the data-hosting NSI. Fees are charged for the data preparation procedure and the use of the system.

Documentation

Information on the methodology, design, and implementation of PIAAC can be found in the technical reports on the study (OECD 2014, 2016a; see also Chap. 2 in this volume) and in the results reports (Fridberg et al. 2015; OECD 2013, 2016b, c). Danish-language, Estonian-language (and Russian-language), Finnish-language, Norwegian-language (and English-language), and Swedish-language versions of the background questionnaire are available for downloading at the OECD PIAAC Data and Tools webpage, as are an international codebook and a derived variables codebook.

4.7 PIAAC Longitudinal Data Files

Four countries that participated in the first cycle of PIAAC (2011–2012), namely Canada, Germany, Italy, and Poland, have carried out follow-up studies with different strategies and focuses. Of these, only the German PIAAC-Longitudinal study included a reassessment of basic skills using PIAAC instruments (see Rammstedt et al. 2017a, b).

In Canada, a subset of the respondents of the Canadian social survey Longitudinal and International Study of Adults (LISA; N = 27,285) participated in PIAAC. These respondents are being reinterviewed biennially as part of LISA (see Situ 2015). The LISA study is described in more detail in Sect. 4.6 on linking PIAAC data to administrative data.

In Germany, respondents who had participated in the 2011/2012 PIAAC survey were reapproached for the panel study PIAAC-Longitudinal (PIAAC-L). PIAAC-L (N = 3758) consisted of three follow-up waves to the initial PIAAC 2012 survey, which were conducted in 2014, 2015, and 2016. Extensive background information and information on non-cognitive skills, household composition, and living conditions was collected, and a reassessment of literacy and numeracy was carried out in 2015.

A follow-up to PIAAC in Italy (2014/2015) collected longitudinal information on Italian PIAAC respondents (N = 2003) and focused on non-cognitive skills. A Polish follow-up to PIAAC (postPIAAC) also focused on non-cognitive skills. Conducted in 2014/2015 (N = 5224), it collected additional background information on the PIAAC respondents as well as information on their non-cognitive skills (e.g. the Big Five personality traits, grit). Basic cognitive skills tests (e.g. working memory test or coding speed test) and a basic ICT skills test were also administered (e.g. Palczyńska and Świst 2016). As the data from the ItalianFootnote 77 and Polish (to access the data, contact the Polish Educational Research Institute [IBE])Footnote 78 follow-up studies have not yet been published and made available to external researchers, they are not presented in detail here. Because only the German PIAAC-L study included a reassessment of basic skills, it is the only follow-up study described in the following section.

4.7.1 Germany

4.7.1.1 PIAAC-Longitudinal Scientific Use File

File Description

The German PIAAC-Longitudinal (PIAAC-L) study was a collaborative effort undertaken by GESIS – Leibniz Institute for the Social Sciences (lead), the German Institute for Economic Research (DIW), and the Leibniz Institute for Educational Trajectories (LIfBi). PIAAC-L was designed as a three-wave follow-up survey to PIAAC (2012), with data collections in 2014, 2015, and 2016 (for an overview, see Fig. 4.1). The PIAAC-L questionnaires were based on core instruments from the German Socio-Economic Panel (SOEP) and also included various additional questions and modules on the respondents’ background. In addition, assessment instruments from PIAAC and the National Educational Panel Study (NEPS) measuring key competencies were implemented.

Fig. 4.1 German PIAAC-Longitudinal (PIAAC-L) study

The person questionnaire included questions on the following topics: background information, family, and childhood; biographical calendar; formal education (general and vocational education) and continuing professional education; work status, situation, and history; income and benefits; health, attitudes, personality, opinions, and satisfaction; and time use and leisure activities. The household questionnaire assessed living situation, conditions, and costs; household income and benefits and wealth; and children and other household members.

The objective of the PIAAC-L project was to significantly expand the German PIAAC database by adding a longitudinal dimension and enhancing the depth and breadth of information available on the German PIAAC respondents (for an overview of the rationale and design of the study, see Rammstedt et al. 2017a, b).

Mode of Data Collection

Face-to-face interview (CAPI) and computer-based or paper-based cognitive assessment.

Sample Description and Size

The sample comprised German PIAAC 2012 respondents aged 18–65 years who agreed to participate in PIAAC-L and other members of their household aged 18 years and over (total initial sample at the first wave: N = 6231). Whereas the focus and the groups of addressed persons varied somewhat across waves, German PIAAC 2012 respondents (N = 5465)—the so-called anchor persons—were consistently the central response units in PIAAC-L (Zabal et al. 2016). Wave 1 was designed to target anchor persons (n = 3758) and their household members aged 18 years and over (i.e. born in 1996 or earlier; n = 2473). In Wave 2, anchor persons (n = 3263) and their partners, if living in the same household, were addressed (n = 1368). The design of the third wave was similar to that of the first wave: anchor persons (n = 2967) and all household members aged 18 years and over (i.e. born in 1998 or earlier) were to be interviewed (n = 1914).
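The wave-specific sample sizes reported above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the figures given in the text; the function name `wave_summary` and the dictionary layout are illustrative assumptions. Reading the anchor counts as a share of the N = 5465 German PIAAC 2012 respondents is a simplification for orientation only and is not the response-rate definition used in the technical reports.

```python
# Sample sizes as reported in the text (see Zabal et al. 2016).
PIAAC_2012_RESPONDENTS = 5465

# wave -> (anchor persons, other addressed persons: household members or partners)
WAVES = {
    1: (3758, 2473),
    2: (3263, 1368),
    3: (2967, 1914),
}


def wave_summary(waves, baseline):
    """Return the total sample size and the anchor share of the baseline per wave.

    `wave_summary` is a hypothetical helper written for this illustration.
    """
    summary = {}
    for wave, (anchors, others) in sorted(waves.items()):
        summary[wave] = {
            "total": anchors + others,
            "anchor_share_of_baseline": anchors / baseline,
        }
    return summary


summary = wave_summary(WAVES, PIAAC_2012_RESPONDENTS)
for wave, row in summary.items():
    print(
        f"Wave {wave}: total n = {row['total']}, "
        f"anchor persons as share of PIAAC 2012 respondents = "
        f"{row['anchor_share_of_baseline']:.1%}"
    )
```

The Wave 1 total computed this way reproduces the initial sample of N = 6231 stated above.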

Format and Access

The German PIAAC-L data (GESIS et al. 2017) are available as a scientific use file (SPSS and Stata format) for academic research only, after signing a data distribution contract (in English or German).Footnote 79 In addition, key information about the project (e.g. title, description, and duration) and the user must be provided. The data can be used only during the time period specified by the contract. Users are charged a processing fee and are expected to make publications resulting from the research available to the data provider. The data of the anchor persons from all three PIAAC-L waves can be matched to data from the German PIAAC Scientific Use File (see Sect. 4.4.3).
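Matching anchor persons across files amounts to joining records on a shared respondent identifier. The sketch below illustrates the idea in plain Python with invented toy records. The identifier name `person_id`, the variable names, and all values are hypothetical and are not taken from the actual scientific use files; the codebooks document the real identifier and variable names.

```python
def match_anchor_persons(piaac_2012_rows, piaac_l_rows, key="person_id"):
    """Inner-join PIAAC-L records onto PIAAC 2012 records by a shared identifier.

    PIAAC-L records without a PIAAC 2012 counterpart (e.g. household members
    who were not PIAAC respondents) are dropped. `key` is a hypothetical
    identifier name used for illustration only.
    """
    baseline = {row[key]: row for row in piaac_2012_rows}
    matched = []
    for row in piaac_l_rows:
        base = baseline.get(row[key])
        if base is None:
            continue  # no PIAAC 2012 record: not an anchor person
        merged = dict(base)
        # Add the follow-up variables; the identifier itself is already present.
        merged.update({k: v for k, v in row.items() if k != key})
        matched.append(merged)
    return matched


# Toy records with invented values, for illustration only.
piaac_2012 = [
    {"person_id": 1, "literacy_2012": 270.0},
    {"person_id": 2, "literacy_2012": 301.5},
    {"person_id": 3, "literacy_2012": 255.2},
]
piaac_l_wave2 = [
    {"person_id": 1, "literacy_2015": 274.1},
    {"person_id": 3, "literacy_2015": 251.0},
    {"person_id": 9, "literacy_2015": 290.3},  # household member, no 2012 record
]

matched = match_anchor_persons(piaac_2012, piaac_l_wave2)
for record in matched:
    print(record)
```

In practice, the same join would typically be carried out in Stata or SPSS, the formats in which the files are distributed, and any analysis of the matched data would still need to apply the appropriate weights.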

Documentation

Information on the methodology, design, and implementation of PIAAC-L can be found in the German-language fieldwork report (Steinacker and Wolfert 2017) and the English-language technical reports on the study (Bartsch et al. 2017; Martin et al. 2018; Zabal et al. 2016). The person and household questionnaires (in German, as administered in the field, but with English labels) can be downloaded at the PIAAC Research Data Center websiteFootnote 80. English-language codebooks for data on persons, households, and weights are available (in Excel and PDF format) on the respective websites.

4.8 Linking PIAAC Data to Other Surveys

Three PIAAC participating countries—Denmark, Singapore, and the United States—have surveyed persons who had been surveyed before in another large-scale assessment, namely, the Programme for International Student Assessment (PISA).

In Denmark, 1881 participants aged 15–16 years at PISA 2000 were retested and interviewed again in PIAAC 2011–2012. The Danish Center for Social Science Research (VIVE) is responsible for PIAAC; for more information on the corresponding data and their availability for scientific research, please contact VIVE.Footnote 81 As no updated information on this dataset is currently available, it cannot be described in this volume.

Singapore surveyed persons who participated in PISA 2009. However, these data are not available for research purposes (OECD 2016a). Finally, in the United States, PISA 2012 participants were issued with PIAAC questionnaires. These datasets can be used for research purposes and will be described below.

4.8.1 US Program for International Student Assessment Young Adult Follow-Up Study (PISA YAFS) Data

File Description

The Program for International Student Assessment Young Adult Follow-Up Study (PISA YAFS) is a new study that examines a key transition period for US young adults in terms of their characteristics, academic skills, and other life outcomes. It was conducted in the United States with a sample of students who participated in PISA 2012, when they were 15 years old. These students were assessed again 4 years later in 2016, at about age 19, with the OECD’s Education and Skills Online (ESO) literacy, numeracy, and problem solving in technology-rich environments assessments, which were based on the Programme for the International Assessment of Adult Competencies (PIAAC). They were also given a background questionnaire about their education and employment status, attitudes, and interests.

Thus, in addition to providing information on skills performance at age 19, PISA YAFS can also examine the relationship between that performance and young adults’ performance on PISA 2012 at age 15. Moreover, it can examine the relationship between their earlier PISA 2012 performance and other aspects of their lives at age 19, such as their engagement in postsecondary education, their participation in the workforce, their attitudes towards their lives, their ability to make their own choices, and their vocational interests.

Mode of Data Collection

Online data collection, using a platform developed for PISA YAFS in combination with the OECD-provided platform Education and Skills Online (ESO). The specially developed PISA YAFS platform gathered information on (i) current education study status (participation, level of degree, area of study); (ii) formal education activities; and (iii) nonformal learning activities in the 12 months preceding the study. The ESO non-cognitive modules collected information on respondents’ (i) basic demographics, (ii) career interests and intentionality (CII), (iii) behavioural performance competencies (BPC), and (iv) subjective well-being and health (SWBH). The ESO platform also assessed participants’ skills in literacy, numeracy, and problem solving in technology-rich environments.

Sample Description and Size

The PISA YAFS sample comprised around 2320 young adults who were about 19 years old in 2016, who participated in PISA 2012 at the age of 15, and who provided contact information for follow-up.

Format and Access

The PISA YAFS data are scheduled to be available in 2020. The data will take the form of public use files, provided in SPSS and SAS formats on the National Center for Education Statistics website.Footnote 82

Documentation

Information on the methodology, design, and implementation of PISA YAFS is planned to be available in 2020 and will be found in the technical report and the results reports on the study (https://nces.ed.gov/surveys/pisa/followup.asp). General information on the methodology, design, and implementation of PIAAC can be found in the technical reports on the study (OECD 2014, 2016a; see also Chap. 2 in this volume) and in the results reports (OECD 2013, 2016b, c). Information, questionnaires, and codebooks on PISA are available on the OECD website.Footnote 83

4.9 PIAAC Data Files on Non-Cognitive Skills

The PIAAC Pilot Studies on Non-Cognitive Skills were designed to test the measurement properties of nine personality scales: the Big Five, Traditionalism, Self-Control, Self-Efficacy, Honesty/Integrity, Socio-Emotional Skills, Intellectual Curiosity, Job Orientation Preferences, and Vocational Interests (Kankaraš 2017). The first study—the English Pilot Study on Non-Cognitive Skills—was realised with a complex design in the United States and the United Kingdom. The second study—the International Pilot Study on Non-Cognitive Skills—was realised in five countries (Germany, Spain, France, Japan, and Poland); the questionnaire focused on the properties of selected personality scales.

4.9.1 PIAAC English Pilot Study on Non-Cognitive Skills (SUF)

File Description

This online survey (see also Kankaraš 2017) was designed to test the measurement properties of nine personality scales: the Big Five, Traditionalism, Self-Control, Self-Efficacy, Honesty/Integrity, Socio-Emotional Skills, Intellectual Curiosity, Job Orientation Preferences, and Vocational Interests. Eight of these nine scales were existing scales (or combinations of existing scales) available for use in the public domain. The study (data collection period: June–July 2016) was conducted in two phases, each with a somewhat different study design. The objectives of the online survey were to test (a) the measurement characteristics of the selected scales; (b) the relationships of the selected scales with background and other characteristics of respondents; (c) different item formulations—original vs. simplified; (d) different response options—with or without a neutral/middle category; (e) scales with different item formats—multiple choice vs. forced choice (Vocational Interests Scale); and (f) the new balanced scales (compared to the original unbalanced scales).

Mode of Data Collection

The entire survey was conducted online. It was implemented using the SurveyMonkey platform.

Sample Description and Size

The sample comprised 5910 adults aged 16–65 years from the United States and the United Kingdom in the first phase and 1606 adults in the second phase (United States only).

Format and Access

The English Pilot Study on Non-Cognitive Skills (OECD 2018a) is available as a scientific use file (in SPSS and Stata format) for academic research only, after signing a data distribution contract.Footnote 84 The scientific use file contains data from the first and second phases. In addition, key information about the project (e.g. title, description, and duration) and the user(s) must be provided. The data can be used only during the time period specified by the contract. Users are charged a processing fee.

Documentation

Information on the methodology, design, and implementation of the PIAAC English Pilot Study on Non-Cognitive Skills can be found on the PIAAC Research Data Center websiteFootnote 85 and the GESIS Data Archive.Footnote 86 A questionnaire item bank (Excel format), a codebook (Excel format), and further information are also available on the aforementioned webpage.

4.9.2 International Pilot Study on Non-Cognitive Skills (SUF)

File Description

This study was designed with the following objectives: first, to test the measurement characteristics of selected scales, and second, to test the cross-national comparability of selected scales. The measurement properties of nine personality scales—the Big Five, Traditionalism, Self-Control, Self-Efficacy, Honesty/Integrity, Socio-Emotional Skills, Intellectual Curiosity, Job Orientation Preferences, and Vocational Interests—were tested (data collection period: January–March 2017).

Mode of Data Collection

The entire survey was conducted online. It was implemented using the SurveyMonkey platform.

Sample Description and Size

The sample comprised 6924 adults aged 16–65 years from Germany, Spain, France, Japan, and Poland.

Format and Access

The International Pilot Study on Non-Cognitive Skills (OECD 2018b) is available as a scientific use file (in SPSS and Stata format) for academic research only, after signing a data distribution contract.Footnote 87 In addition, key information about the project (e.g. title, description, and duration) and the user(s) must be provided. The data can be used only during the time period specified by the contract. Users are charged a processing fee.

Documentation

Information on the methodology, design, and implementation of the PIAAC International Pilot Study on Non-Cognitive Skills can be found on the PIAAC Research Data Center websiteFootnote 88 and at the GESIS Data Archive.Footnote 89 The questionnaires in the respective country languages (PDF format), item translations in the respective country languages (Excel format), an English-language codebook (Excel format), and further information are also available on the aforementioned webpage.