Experimental and Longitudinal Data for Scientific and Policy Research: Open Access to Data Collected in the Longitudinal Internet Studies for the Social Sciences (LISS) Panel

This chapter presents the Dutch Longitudinal Internet Studies for the Social sciences (LISS) panel. This infrastructure provides an innovative method of data collection and data access. It offers researchers the opportunity to field surveys and conduct experiments, to analyze the effect of interventions, and to link collected survey data to administrative data available from Statistics Netherlands. The infrastructure is used mostly by academic researchers but also by applied researchers, in a wide variety of studies in various disciplines. The authors demonstrate how the LISS infrastructure can be used to carry out relevant scientific and policy research to tackle contemporary societal challenges. The examples of policy-relevant research presented here focus on the adequacy of retirement savings, retirement expenditure goals, and the behavioral responses of individuals to certain policies through stated preference analyses.


Longitudinal Core Study
To prevent panel members having to answer similar questions month after month (due to the popularity of certain topics at certain times), it was decided to have a rich and lengthy core questionnaire. This core questionnaire, designed with assistance from international experts in the relevant fields, follows changes over the life course of individuals and households. The questionnaire is repeated annually and covers eight modules, each with its own theme:
• Health
• Politics and values
• Religion and ethnicity
• Social integration and leisure
• Family and household
• Work and schooling
• Personality
• Economic situation: assets, income, and housing
Data from the longitudinal core study allow for analyses of changes in people's lives, their reaction to life events, and the effects of societal changes and policy measures. For example, the core modules "Politics and Values" and "Social Integration and Leisure" were used by Van Ingen and Van der Meer (2016) to test four possible explanations for the well-documented correlation between civic engagement and political socialization. Kalmijn (2015) used data from the core module on "Family and Household" to examine the effects of divorce and repartnering on the relationships that fathers have with their adult children. Six waves of the "Health" module were used by Cabus et al. (2016) to estimate the short-run causal effect of tumor detection and treatment on psychosocial well-being, work, and income.
The major strength of the longitudinal core study, however, is the opportunity it provides to combine data from studies and experiments proposed by researchers with data from the core study. This greatly enhances the cost-effectiveness and scientific value of the experimental modules proposed by researchers. It eliminates the need to collect an array of background variables in each survey or experiment, and allows for links with a wealth of other (non-retrospective) information available on the panel members.
An example is a multi-wave study on mental health (Lamers et al. 2011). The researchers who proposed this study argued that there is a growing consensus that mental health is not merely the absence of mental illness but also includes the presence of positive feelings (emotional well-being) and positive functioning in both individual life (psychological well-being) and community life (social well-being). Lamers et al. examined a new self-report questionnaire for positive mental health assessment, the Mental Health Continuum-Short Form (MHC-SF). The collected data were enriched with data from the longitudinal core study (modules "Health," "Personality," "Politics and Values," and "Social Integration and Leisure"). So far, data from this particular multi-wave study combined with data from the data archive have resulted in ten articles published in peer-reviewed scientific journals, a book chapter, a Master's thesis, and a PhD thesis. In addition to the scientific value, the project has also had a societal impact, as the MHC-SF is now widely used by the Dutch Association of Mental Health and Addiction Care.
Another example in which data were successfully merged with the longitudinal core study is a project that aims to determine whether or not people change their reform preferences when faced with increasing reform pressures such as an aging society (Naumann et al. 2015). The researchers collected data in July 2013, September 2013, and January 2014 and combined their data with data from the core modules "Economic Situation (Income)," "Politics and Values," and "Work and Schooling" from the 2008 and 2013 waves. Naumann et al. confirmed theoretical expectations that people change their support for unemployment benefits in reaction to changes in their individual material circumstances. Job loss leads to increased support for public unemployment benefits. The availability of longitudinal data, in particular data covering the period of the international economic crisis (2008-2009), made the analysis possible.

Experimental Data
The LISS panel has also been used successfully for various types of experiments. Before the main recruitment of the LISS panel even started, a comprehensive pilot study was fielded to determine the optimal recruitment strategy for an infrastructure such as LISS (Scherpenzeel and Toepoel 2012). The factors that were considered in the pilot study were contact mode (recruitment either by telephone, in person, or by a combination of these methods), incentive amount, timing of the incentive, content of the advance letter, and timing of the panel participation request. Scherpenzeel and Toepoel showed that all incentives were found to have much stronger effects on response rates when they were distributed with the advance letter (prepaid) than when they were paid later (promised). The highest response rate was found with a prepaid incentive of EUR 10. For more results of the recruitment pilot, we refer to Scherpenzeel and Toepoel (2012).
The LISS panel is an ideal infrastructure for studying survey methodological issues, in particular when they relate to the mode of interviewing (online). However, many experiments contributing to substantive research were also run in LISS. For example, Bellemare and Sebald (2011) presented a class of two-player extensive-form games allowing measurement of belief-dependent preferences, including guilt aversion as an important special case. A total of 2000 LISS panel members were invited to participate in a sequential game that was played across 2 consecutive months. Bellemare and Sebald found evidence of significant guilt aversion in the Dutch population: a significant proportion of the population was found to be willing to pay to avoid letting down the other player, in line with predictions of belief-dependent models of guilt aversion.
Another example of a substantive experiment concerned ethical behavior and class status (Trautmann et al. 2013). In this experiment, randomly selected LISS panel members had to make decisions that determined how much money they and someone else would earn. Trautmann et al. showed that ethical behavior is affected by moral values, social orientation, and the costs and benefits of taking various actions. Strong class differences emerged in each of these areas, leading to differences in behavior.

Innovations in Data Collection
One of the goals in LISS is to innovate data collection methods through experiments with new technologies. The first large-scale experiment started in 2010. A random sample of (about) 1000 LISS households was provided with an advanced bathroom scale. This scale measures body weight and impedance (on which fat and muscle percentage is based). The scale establishes a wireless connection with the gateway via a radio signal. This minimizes the respondents' burden to provide the data; they are requested only to step on the scale (once a day, once a week, or at an unspecified frequency). A first empirical analysis is reported in Kooreman and Scherpenzeel (2014), based on almost 80,000 measurements collected in 2011.
The measurement of time use typically relies on paper diaries. As this is quite burdensome for the respondents, response rates in time-use surveys are generally low. In addition, the traditional setup is rather expensive. A smartphone allows time-use data to be collected in a more efficient way. In close collaboration with the Netherlands Institute for Social Research (SCP), a pilot study was carried out in the LISS panel to test the feasibility of collecting time-use data with smartphones. A total of 2000 LISS panel members participated in the study; some members of the sample did not own a smartphone and were lent one for a short period of time. A time-use app was developed specifically for this study, which had similarities with the paper version. There were also differences with the paper version, such as the possibility of copying repeated activities from a previous time slot and of filling in activities such as sleeping and working for longer time periods. The results of this feasibility study can be found in Sonck and Fernee (2013).
Smartphones can also be used to track travel behavior, using the phone's global positioning system (GPS) functionality. Traditionally, data on travel behavior are collected through cross-sectional travel surveys using a paper diary. This entails a serious time investment by the respondent, especially when travel data are collected for a longer period of time. Moreover, due to memory effects, the accuracy of the data is rather low. Using a dedicated app, travel data were collected from a random selection of LISS panel members. Data collection took place in 3 years (2013, 2014, and 2015); in each year a random selection of 500 panel members participated in the study, using their own smartphone with the app installed or a loan smartphone. Geurs et al. (2015) analyzed the first dataset and concluded that using the app is a promising alternative to traditional travel diaries.
The measurement of physical activity is a final example of an experiment with a new measurement device. The goal of this experiment was to form a more realistic and complete picture of physical activity by combining objective measures and self-reports, particularly in the context of international studies on physical activity. The study involved an accelerometer, developed by GENEActiv (https://www.activinsights.com/products/geneactiv/). The device is wearable as a watch and is waterproof. It measures acceleration in three dimensions, body temperature, and light intensity, at a frequency of 60 measurements per second (60 Hz). Approximately 1000 LISS panel members participated in the main study, using 300 devices. Panel members wore the device for 8 consecutive days, day and night. For each participant this resulted in a large dataset; for the entire sample in the experiment, approximately 5 terabytes of raw data were produced. The same devices were shared with research teams running the English Longitudinal Study of Ageing (United Kingdom) and the Understanding America Study (United States). Kapteyn et al. (2018) showed that self-reports and objective measures of physical activity tell a strikingly different story about differences between the Netherlands and the United States: for the same level of self-reported activity, the Dutch are significantly more physically active than Americans.
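To give a sense of scale, the figures quoted above can be combined in a back-of-the-envelope calculation. The sketch below assumes continuous recording for the full 8 days, exactly 1000 participants, and decimal terabytes; none of these assumptions is confirmed by the study documentation, so the per-sample figure is only indicative.

```python
# Rough data-volume arithmetic from the figures quoted in the text.
SAMPLE_RATE_HZ = 60          # measurements per second
DAYS_WORN = 8                # consecutive days, day and night
PARTICIPANTS = 1000          # "approximately 1000" (assumed exact here)
TOTAL_BYTES = 5 * 10**12     # "approximately 5 terabytes" (decimal TB assumed)

SECONDS_PER_DAY = 60 * 60 * 24

# Number of measurement instants recorded for one participant.
samples_per_participant = SAMPLE_RATE_HZ * SECONDS_PER_DAY * DAYS_WORN

# Average raw data volume per participant and per measurement instant.
bytes_per_participant = TOTAL_BYTES / PARTICIPANTS
bytes_per_sample = bytes_per_participant / samples_per_participant

print(f"{samples_per_participant:,} samples per participant")
print(f"{bytes_per_participant / 10**9:.1f} GB per participant")
print(f"{bytes_per_sample:.1f} bytes per measurement instant")
```

Under these assumptions each participant contributes roughly 41 million measurement instants and about 5 GB of raw data, which illustrates why dissemination of such data is not trivial.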

Open-Access Data Policy
Access to the data collected in LISS is open to every researcher, free of charge, both in the Netherlands and abroad. Data are made available through the advanced LISS Data Archive (www.dataarchive.lissdata.nl/). This archive, based on existing international specifications, has been awarded the Data Seal of Approval (www.datasealofapproval.org), confirming adherence to the guidelines for trusted digital repositories. Any researcher who signs a confidentiality statement can use the data. Use of variables collected in different waves (or studies) is facilitated by allowing researchers to collect such variables in a "shopping basket," which then automatically generates a dataset according to the user's specification(s).
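The logic of such a "shopping basket" can be illustrated with a minimal sketch. It is not the archive's actual implementation: the function name, the `member_id` key, and the toy values are hypothetical and serve only to show how variables from different waves can be combined into one record per panel member.

```python
from collections import defaultdict


def build_basket(waves, wanted):
    """Combine selected variables from several waves into one record per member.

    waves  : dict mapping a wave label to a list of row dicts; each row has a
             hypothetical "member_id" key plus survey variables.
    wanted : dict mapping a wave label to the variable names to keep.
    """
    merged = defaultdict(dict)
    for wave, rows in waves.items():
        for row in rows:
            record = merged[row["member_id"]]
            for var in wanted.get(wave, []):
                # Suffix with the wave label so repeated items stay distinct.
                record[f"{var}_{wave}"] = row.get(var)
    return dict(merged)


# Invented toy data: a health item in two waves, an income item in one.
waves = {
    "2008": [{"member_id": 1, "health": 3, "income": 2100},
             {"member_id": 2, "health": 4, "income": 1800}],
    "2013": [{"member_id": 1, "health": 2, "income": 2300}],
}
basket = build_basket(waves, {"2008": ["health", "income"], "2013": ["health"]})
print(basket)
```

Members who did not take part in a given wave simply lack the corresponding variables, which mirrors the unbalanced structure typical of panel data.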
In June 2018 more than 2800 users were registered, affiliated with more than 100 institutes worldwide (including top universities such as Harvard, Stanford, and the University of Michigan). Data are used for both scientific and policy-relevant/socially relevant research. So far, more than 491 papers based on LISS data have been published, including 236 articles in peer-reviewed international scientific journals and 31 PhD theses.

Linking to Administrative Data
Statistics Netherlands (SN) collects a large variety of data which can be accessed by researchers under very strict privacy and confidentiality procedures. Through collaboration between SN and CentERdata, all LISS data can be linked to administrative data. This is only possible within the remote access environment of SN. Researchers can send LISS data to SN through a secure connection. SN then matches identification numbers from LISS panel members with the identification numbers available in SN's records. LISS panel members are informed about this linking and can opt out at any time. Once a particular person has opted out, record linkage for that person is no longer possible. Less than 10% of panel members have opted out. Linkage with administrative data lays the groundwork for an even richer data resource, as it is possible to augment survey records with administrative data on, for example, labor, income, wealth, pension entitlement, and health care. Section 3 presents some examples of research projects based on LISS data combined with data from SN registers.
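The matching step, including the opt-out rule, can be sketched as follows. This is a simplified illustration and not the procedure used by SN: the function, the field names, and the toy values are hypothetical, and the real linkage takes place only inside SN's remote access environment.

```python
def link_records(survey_rows, register_rows, opted_out):
    """Join survey rows to register rows on a shared identifier.

    Members listed in `opted_out` are never linked, and survey rows without
    a match in the register are dropped from the linked file.
    """
    register = {row["id"]: row for row in register_rows}
    linked = []
    for row in survey_rows:
        pid = row["id"]
        if pid in opted_out:
            continue  # opt-out: no linkage for this person
        admin = register.get(pid)
        if admin is None:
            continue  # no matching administrative record
        extra = {k: v for k, v in admin.items() if k != "id"}
        linked.append({**row, **extra})
    return linked


# Invented toy data: member 2 has opted out, member 3 has no register record.
survey = [{"id": 1, "goal": 1500}, {"id": 2, "goal": 1800}, {"id": 3, "goal": 1200}]
register = [{"id": 1, "pension": 900}, {"id": 2, "pension": 1100}]
linked = link_records(survey, register, opted_out={2})
print(linked)
```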
Some studies (pending publication) link LISS data to external data sources available from institutes other than SN. One study links data from the core study ("Health" module) to data on air pollution (available from the Dutch National Institute for Public Health and the Environment (RIVM)). A second example is a study that links LISS data to weather data (available from the Royal Netherlands Meteorological Institute (KNMI)). In both cases files with postal codes and the variable of interest are merged to create a file with LISS data enriched with postal codes. The merging of files is performed by CentERdata; the researcher receives the merged file excluding the postal codes. As long as the variable of interest is at a sufficiently high level of aggregation, no individual panel member can be identified in this way.
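A minimal sketch of this procedure is given below. The function, the field names, and the values are invented for illustration; the point is only that the postal code serves as the merge key and is removed before the file is released to the researcher.

```python
def attach_area_value(panel_rows, area_values, name, key="postal_code"):
    """Add an area-level variable to each panel row and drop the merge key.

    panel_rows  : list of row dicts that include the postal code.
    area_values : dict mapping a postal code to the external variable.
    name        : name under which the external variable is stored.
    """
    released = []
    for row in panel_rows:
        # Copy everything except the postal code itself.
        out = {k: v for k, v in row.items() if k != key}
        # Rows with an unknown postal code receive a missing value.
        out[name] = area_values.get(row[key])
        released.append(out)
    return released


# Invented toy data: a health score per member and a pollution level per area.
panel = [{"id": 1, "postal_code": "5037", "health": 3},
         {"id": 2, "postal_code": "2311", "health": 4}]
pollution = {"5037": 18.2, "2311": 21.5}
released = attach_area_value(panel, pollution, "pm10")
print(released)
```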

Societal Challenges
Society is currently experiencing a number of significant trends, with a host of attendant challenges:
- The population is aging, with implications for the cost of health care, sustainability of the social security and pension systems, and the structure of labor and product markets.
- Health disparities are substantial, and health status varies strikingly by socioeconomic status. Obesity rates are rising sharply, and levels of physical activity are falling.
- The immigrant population has become sizeable and has not been effectively integrated into the labor market, the educational system, or the social fabric.
- Volatility in the international financial system has put savers' investments at risk.
- Work patterns are changing across generations and age groups, as well as by gender. With people now working until later in life, older workers often require adaptations in workplaces and shorter working hours. Female participation in the labor force has increased dramatically, but women are much more likely to work part time than men, with implications for career prospects and compensation.
Through its open-access data policy, LISS offers a rich and valuable source of information that can be used to address the challenges posed by these and other trends. Sound policy that can positively shape the future of citizens will depend on high-quality research in the social sciences to inform decision-making, both in government and in industry.
Survey data from LISS in combination with administrative data have played an important role in policy discussions on the future of the Dutch pension system. Results have also been used for a "pension coach" app to help the Dutch public prepare for retirement. The next three subsections describe policy-relevant research based on combinations of LISS panel data and administrative data and its impact on society.

Retirement Savings Adequacy
Population aging and the poor performance of financial markets in recent years have put the sustainability of pension arrangements in many Western countries under pressure. To investigate whether the Dutch population will be able to cope with possible cutbacks in pension benefits, De Bresser and Knoef (2015) analyzed their preparedness in 2008, on the eve of the prolonged economic slump. To do so they compared self-reports of minimal and preferred expenditures during retirement with annuitized wealth from administrative data. The rationale for this approach is that preferences and constraints are likely to vary across individuals and households. Measuring readiness against a single universal threshold, such as a retirement income equal to or greater than 70% of previous earnings, fails to capture relevant differences in coping strategies.
For the subjective assessment of minimal and preferred expenditure levels during retirement, De Bresser and Knoef (2015) used a questionnaire fielded in the LISS panel in January 2008. A question about minimal retirement expenditure was raised at the beginning of the survey, after a couple of items regarding housing costs during retirement. The question was phrased as follows:

This question refers to the overall level of spending that applies to you [and your partner/spouse] during retirement. What is the minimal level of monthly spending that you want during retirement?
Please think of all your expenditures, such as food, clothing, housing, insurance, etc. Remember, please assume that prices of the things you spend your money on remain the same in the future as today (i.e., no inflation).
The quality of any evaluation of retirement readiness depends on the ability to measure financial resources. Survey reports of assets are known to suffer from substantial non-response and under-reporting, particularly when it comes to categories of ownership such as stocks and savings accounts (Bound et al. 2001; Johansson and Klevmarken 2007). Therefore, De Bresser and Knoef (2015) preferred to use more reliable administrative sources. They matched the LISS survey data with tax records and data from pension funds and banks that are available at SN. This allowed them to construct a complete and precise measure of the resources available to households.
The quality of the self-reported expected retirement expenditures is also important. This depends on the degree to which people can predict their expenditure needs during retirement. De Bresser and Knoef (2015) showed that people report reasonable expenditures compared with their current income level. Furthermore, young people provide similar answers to retirees, who know what it is like to be retired. Finally, the model controls for the fact that some individuals have thought about retirement more than others and that some people will find it more difficult than others to answer questions about consumption needs during retirement.
De Bresser and Knoef (2015) found that, overall, the Dutch population was well prepared for retirement. The median difference between the after-tax annuity that can be obtained at age 65 and the individual-specific level of minimal expenditure was 25%, taking into consideration public and occupational pensions. Still, for a sizable minority of the sample, close to 20%, the annuity falls short of minimum expenditure, even if all sources of wealth are taken into account (including private savings and housing wealth, in addition to public and private pensions). The size of those deficits is large enough to be problematic, with a median shortfall of around 30%. The self-employed and the divorced stand out as vulnerable groups with relatively modest pension entitlements.
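The summary measures reported above can be illustrated with a small sketch. It assumes that the gap is expressed relative to each household's own minimal expenditure, which the text does not state explicitly; the function, the field names, and the numbers are invented and do not reproduce the authors' model, which additionally controls for respondent characteristics.

```python
from statistics import median


def readiness_summary(households):
    """Summarize the gap between annuitized wealth and minimal expenditure.

    Each household dict holds a monthly after-tax "annuity" and a
    self-reported minimal monthly expenditure "min_spend". The gap is
    computed relative to min_spend (an assumption made for this sketch).
    """
    gaps = [(h["annuity"] - h["min_spend"]) / h["min_spend"] for h in households]
    shortfalls = [g for g in gaps if g < 0]
    return {
        "median_gap": median(gaps),
        "share_short": len(shortfalls) / len(gaps),
        "median_shortfall": median(shortfalls) if shortfalls else None,
    }


# Invented toy data for five households (EUR per month).
sample = [
    {"annuity": 1500, "min_spend": 1200},
    {"annuity": 2000, "min_spend": 1600},
    {"annuity": 900, "min_spend": 1200},
    {"annuity": 1800, "min_spend": 1200},
    {"annuity": 2400, "min_spend": 2000},
]
summary = readiness_summary(sample)
print(summary)
```

Because each household is compared with its own stated minimum rather than with a universal replacement rate, the same annuity can count as adequate for one household and as a shortfall for another.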
The results of De Bresser and Knoef (2015) stimulated the policy debate in the Netherlands on shortages but also on households that save more than they need to finance their retirement. As explained by Knoef et al. (2015), there is considerable variation in retirement savings adequacy. The results of the study were commented on in print media, on the radio, and on televised news programs in the Netherlands. The advisory report by the Social and Economic Council on the future of the Dutch pension system referred to the results (SER 2015). This advisory report was taken seriously by the relevant ministries in their plans to reform the pension system. Furthermore, the results were used by pension funds and insurance companies in the pension field to gain insight into the composition of wealth in Dutch households. The results were also presented to the State Secretary of the Ministry of Social Affairs and Employment, who was especially interested in vulnerable groups (regarding pension accumulation). Finally, the Dutch central bank made use of the results in its paper on Dutch household balance sheets (DNB 2015). With this paper the Dutch central bank aimed to limit disruptive tax incentives with regard to the accumulation of wealth by Dutch households.

Retirement Expenditure Goals After the Crisis
The Dutch Authority for the Financial Markets raised the question of whether or not retirement expenditure goals changed after January 2008, in response to the financial crisis. Therefore, a new questionnaire was fielded in the LISS panel in December 2014, again asking about retirement expenditure goals. In addition to the question on minimal retirement expenditure described above, De Bresser et al. (2018) also asked a question about preferred retirement expenditure. Figure 1 shows the median of minimal and preferred retirement expenditure goals by income quintile. Both minimal and preferred retirement expenditure goals increase with income. However, when retirement expenditure goals are divided by current income, these ratios decline with income. Whereas the median poor person needs about 100% of current income after retirement, the median rich person needs about 60%.
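The two quantities discussed here (median goals and median goal-to-income ratios per income quintile) can be computed along the following lines. The sketch assigns quintiles by income rank, which is one of several possible conventions, and uses invented data chosen only to mimic the pattern described in the text.

```python
from statistics import median


def goals_by_quintile(people):
    """Median expenditure goal and goal-to-income ratio per income quintile.

    Each person dict holds current monthly "income" and a retirement
    expenditure "goal". Quintiles are assigned by income rank.
    """
    ranked = sorted(people, key=lambda p: p["income"])
    n = len(ranked)
    groups = {q: [] for q in range(1, 6)}
    for i, person in enumerate(ranked):
        groups[i * 5 // n + 1].append(person)
    return {
        q: {
            "median_goal": median(p["goal"] for p in members),
            "median_ratio": median(p["goal"] / p["income"] for p in members),
        }
        for q, members in groups.items()
        if members
    }


# Invented toy data (EUR per month), two people per quintile.
people = [
    {"income": 1000, "goal": 1000}, {"income": 1200, "goal": 1200},
    {"income": 1800, "goal": 1500}, {"income": 2000, "goal": 1600},
    {"income": 2500, "goal": 1900}, {"income": 2700, "goal": 2000},
    {"income": 3200, "goal": 2200}, {"income": 3500, "goal": 2400},
    {"income": 5000, "goal": 3000}, {"income": 6000, "goal": 3600},
]
table = goals_by_quintile(people)
for q, row in table.items():
    print(q, row["median_goal"], round(row["median_ratio"], 2))
```

In this toy example the median goal rises from the lowest to the highest quintile while the ratio to current income falls, the same qualitative pattern as in Figure 1.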
Descriptive statistics about minimal and preferred expenditure are in themselves interesting. They are therefore used by a large Dutch insurance company in its "pension coach" app, which is available for the whole Dutch population free of charge. After filling in retirement expenditure goals and wealth and pension entitlements, the app indicates what the person must do to reach his/her retirement expenditure goal. To help people determine their retirement expenditure goal, they are offered information on the interquartile range of retirement goals that peers with the same income and household situation reported in the representative LISS panel.
Fig. 1 Retirement expenditure goals by income quintile. Note: This figure shows preferred and minimal retirement expenditure goals (left) and preferred and minimal retirement expenditure goals divided by current income (right). Source: Own calculations based on data described in De Bresser et al. (2018)
Comparing the data for January 2008 and December 2014, De Bresser et al. (2018) show that minimal retirement expenditure goals (in real terms) declined by about EUR 200 per month over that period. In particular, high-income individuals, homeowners, widows, and men with self-employed individuals in their households reduced their retirement expenditure goals.
People may have adjusted their pension ambitions downward because of gloomy media reports about pensions. Alternatively, in line with the life cycle model of Modigliani and Brumberg (1954), individuals may smooth out exogenous wealth shocks from the crisis over their remaining life cycle by changing their expenditure and/or labor supply. De Bresser et al. (2018) show that a shock in pension wealth of EUR 100 reduced retirement expenditure goals by EUR 23-33 on average. In addition, gloomy media reports may have led to lower expectations regarding the future (for all people, not related to individual declines in expenditure goals).
The results were presented at a World Economic Forum expert meeting and were used by insurance companies to inform their financial advisors on "what is an adequate pension." The Dutch Authority for the Financial Markets and the Ministry of Finance used the results in Fig. 2 to identify groups that are vulnerable in terms of retirement savings adequacy. For the year 2017, the Ministry of Finance has announced a focus on divorced people, based partly on these results. Expected pension annuities are also calculated in Knoef et al. (2017) for a large administrative dataset (the Income Panel Survey, covering about 90,000 Dutch households). Here, as a benchmark to judge savings adequacy, a 70% replacement rate was used (e.g., Haveman et al. 2007) but also income-dependent replacement rates based on the self-reported expenditure goals in the LISS panel (Fig. 1). These results were used by the Ministry of Social Affairs and Employment (SZW 2016) to formulate policy options to reform the Dutch pension system (ahead of the elections in March 2017). Policies were proposed to stimulate pension accumulation for the self-employed (since the results showed the self-employed to be a vulnerable group with regard to pension accumulation). The report also proposed to differentiate between pensions for homeowners and renters, since the results showed large differences in the pension accumulation of renters and homeowners. The Ministry of Finance used the results for its study group on sustainable growth and also in the annual government report that clarifies the expected income and expenditure of the national government.

Stated Preference Analyses to Guide Policies
Sometimes preferences for policies cannot be measured, for example, because the policy is not yet in place. In such cases, surveys can be used to estimate the behavioral responses of individuals to certain policies by a stated preference analysis. In a stated preference analysis, respondents are typically placed in a hypothetical decision context and are asked to make several choices. In this way preferences can be elicited which can guide policies in the near future. Below the authors describe three stated preference analyses in the LISS panel that guide policies in an aging society. The first two are about the labor market for older workers, and the third is about the design of long-term care.
Kantarci and Van Soest (2013) estimated preferences for retirement plans. By using a stated preference analysis, they studied the effects of pension incentives and increasing retirement age on the preferences for retiring full time or part time at a later age. They find that two in five respondents prefer partial retirement over early or delayed abrupt full retirement. This suggests scope for policy interventions that emphasize partial retirement plans, offering flexible solutions for employees to optimize their retirement paths. Furthermore, the results show that individuals are responsive to an increasing retirement age at both the extensive and intensive margins.
Oude Mulders et al. (2014) fielded a stated preference analysis among all managers in the LISS panel, to examine a manager's considerations in the decision to rehire employees after mandatory retirement. This information is important for governments that want to increase the labor force participation among the elderly.
The results show that employers are strongly affected by employees who offer to work for a significantly lower wage. Overall, employers are disinclined to rehire employees after mandatory retirement, although large differences exist between employees.
More recently, Van Ooijen et al. (2017) investigated the public's willingness to pay for long-term care in the Netherlands. Their first results show that people are more willing to pay for household chores and personal care than for chaperoning, entertainment, or insurance for the purchase of medical devices. First results also show that people who expect to need relatively more long-term care are more inclined to buy an insurance plan for long-term care. This means that adverse selection is likely to be a serious concern when long-term care responsibilities are transferred from the government to the individual and insurers wish to enter the market.

Future Developments and Challenges
This chapter explained the LISS panel, an ultramodern and (cost-)efficient research infrastructure that is now solidly in place. More than 9 years' worth of rich and innovative data has now been collected through this infrastructure. Researchers worldwide have accessed the data for use in scientific, policy, and societal studies. The open data policy has facilitated a vast amount of monodisciplinary research, but multidisciplinary research is also possible by merging data from different disciplines. The authors have shown how results of the LISS panel in combination with administrative data have an influence in the political arena.
The world of primary data collection will change. New forms of data collection, including wearable computing and data collected through sensor technology, will replace the more traditional ways of collecting information. Lengthy surveys might be replaced by shorter high-frequency surveys. Infrastructures such as the LISS panel can provide a natural environment to launch new forms of data collection and to conduct high-frequency data collection. Disseminating the data may become a challenge, however, in part due to the size of databases (as in the accelerometer example described in Sect. 2.3) but also on account of privacy regulations. The more detailed information that becomes available on individuals, the easier it might become to identify individual respondents. Take, for example, the high-frequency data that were collected on travel behavior (Sect. 2.3). The daily GPS data can easily reveal where the respondent lives and works. It is impossible to make these data openly available for the research community, especially in combination with all the other data collected in LISS. All data need to be thoroughly anonymized before they can be made available (e.g., by disseminating only distances travelled).
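As an illustration of the last point, a GPS trace can be reduced to a single distance figure before release. The sketch below uses the haversine great-circle formula with a mean Earth radius of 6371 km; this is a common approximation chosen for the example, not necessarily the method applied to the LISS travel data, and the function names are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a standard approximation


def haversine_km(p, q):
    """Great-circle distance in km between two (latitude, longitude) points."""
    lat1, lon1, lat2, lon2 = map(radians, (p[0], p[1], q[0], q[1]))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def distance_travelled_km(trace):
    """Sum the distances between consecutive points of one GPS trace.

    Only this total would be released; the coordinates themselves, which
    could reveal home and work locations, are discarded.
    """
    return sum(haversine_km(a, b) for a, b in zip(trace, trace[1:]))


# Invented trace: one degree of longitude along the equator and back.
trace = [(0.0, 0.0), (0.0, 1.0), (0.0, 0.0)]
print(round(distance_travelled_km(trace), 1))
```

Whether a distance total alone is sufficiently anonymous depends on what other variables are released alongside it, so this step would be one part of a broader disclosure review.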
The biggest challenge concerns the funding of infrastructures such as LISS. It is a public benefit and needs budgets from scientific funding agencies to maintain its high-grade scientific standards. Generating income through users of such infrastructures might be an option, but this may not be to the advantage of the open data policy. Users who pay for their data are, in general, not in favor of immediately sharing their data with others. And even if they are willing to share the data, resources are required to make the data dissemination (including accessible metadata) feasible. It seems unfair to charge these costs to the individual researcher or research team who commissioned the survey. Budget is also needed to make the administrative data more easily accessible for both scientific and policy-relevant research. Investments need to be made in computational power and software tools, and to serve the international research community, all documentation of metadata needs to be made available in English.
We live in a society where policies are improved by insights derived from data. These can be survey data but also administrative data, unstructured data, or a combination of these. As a research community, we need to make sure that data remain as accessible as possible.
Marike Knoef is Professor of Empirical Microeconomics at Leiden University and board member of the Network for Studies on Pensions, Aging and Retirement (Netspar). In addition, she is a fellow at the Research Centre for Education and the Labor Market (ROA). She holds a PhD in Econometrics from Tilburg University. Before Marike joined Leiden University, she worked at CentERdata. Furthermore, she gained experience at the Netherlands Bureau for Economic Policy Analysis and the Dutch Social and Economic Council. Marike's research interests include household saving behavior, economics of aging, labor economics, and health. She gives master classes on these topics at the TIAS School for Business and Society. Recently, she was granted a subsidy for one of her research projects, "Uncertainty over the life cycle: implications for pensions and savings behavior."
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.