This chapter gives an overview of policy toolkits focusing on ageing-related issues in the area of employment and pensions. The figurative notion of a policy toolkit does not have a well-defined meaning. Intuitively, it refers to a set of items that aid in the development or assessment of policies. Given this ambiguity, it is beyond the scope of this chapter to examine all existing proposals for policy tools or toolkits. Instead, we focus exclusively on analytical policy tools, which are used to assess the efficacy and efficiency of existing policies in the area of ageing. In other words, rather than attempting to provide a full inventory of previous work in the field, this chapter explores existing knowledge with the aim of identifying pervasive practices regarding the link between research and policies. To this end, it proposes a clearer definition of policy toolkits and a typology of policy tools, which is subsequently used to provide a synthetic overview of toolkits available in the broader field of ageing. We then spell out the underlying conception of the relationship between research and policy-making that informs the analysis of specific policies. In the concluding section, we also raise some critical questions regarding the public role of policy toolkits.

What is a Policy Toolkit?

Policy tools or toolkits are a common end product of policy-driven research. ‘Policy toolkits’ are conceived here as comprehensive sets of recommendations for the setup or reform of policies that are based on insights gained from research. In other words, the primary objective of policy toolkits is to inform policy makers of the key parameters that need to be considered for specific policy decisions relevant to a particular issue. Toolkits (a) establish the existing evidence that is relevant to a given policy goal (e.g. extending working life), (b) lay out the potential solutions, (c) address their applicability across contexts and (d) assess their long-term impact.

This initial conceptualisation is still markedly broad, as the referenced tools and their purpose can be conceived in a myriad of ways. An important distinction refers to whether the purpose of the tools is analytical or strategic. An analytical toolkit aims to identify which policies best achieve certain given objectives. By contrast, a strategic toolkit aims to influence the policy process in a particular way that has been established a priori. Informing and influencing policies is the main purpose of think tanks, and many interest groups are similarly looking for ways to effectively advocate their political goals. For example, toolkits for civil society organisations in Africa have been released by an alliance of NGOs and the UN Development Programme (Sonke Gender Justice Network 2013) as well as by the Catholic Church (CAFOD 2005). Additionally, there are more technical or implementation-oriented toolkits consisting of concrete guidelines that inform or instruct policy-makers concerned with reforming existing policy schemes or setting up new ones. An example of an implementation-oriented toolkit is the Policy Toolkit for Strengthening Health Sector Reform published by the Latin American and Caribbean Regional Sector Reform Initiative (Scribner and Brinkerhoff 2000), a joint effort of the US Agency for International Development and other organisations, which primarily addresses government officials. Similarly, the OECD (2010) has produced a Consumer Policy Toolkit directed at policy makers, which reviews policy tools and gives guidelines on developing an adequate consumer policy. The present chapter focuses exclusively on analytical toolkits.

The audiences of policy tools are not only politicians, policy makers and public administration, but also (other) social scientists as well as the interested public in general: the tools are also used in the wider debate around the mentioned policy issues, and can serve the articulation of public opinions in democratic societies. At the same time, it is the least complex tools that are most often used in wider debates, as they lend themselves more readily to addressing general audiences.

A Typology of Tools

As a framework to map existing analytical policy tools, we propose the following typology: (1) good practice; (2) social indicators; (3) programme evaluation; (4) simulation and forecast. Table 4.1 gives an overview of the different tool types and their key properties. The order used here follows the degree of technical complexity.

Table 4.1 Overview of types of policy tools

We speak of toolkits if various similar tools are provided as a package. For example, the OECD Employment Outlook periodically publishes a series of standardised labour market indicators (e.g. employment rate, long-term unemployment rate) broken down by multiple variables (country, sex, age, etc.). Each report can thus be understood as a toolkit containing a set of tools.

Each type of policy tool functions in a different way, given its distinct purpose, as we explain in more detail below. In addition, some of the strengths and weaknesses of each type of policy tool are also briefly discussed. Definitions and concrete examples of each type of policy toolkit are provided in Table 4.2.

Table 4.2 Definition and examples of policy toolkits

Good Practice

The most basic analytical tool consists of the identification of good practice policies. The status of ‘good practice’ is attained based on the positive assessment of a policy or practice, typically through expert opinions or public discourse. It is the simplest, yet possibly also the most powerful analytical tool. It emphasises the virtues of a particular case that achieves good results, stressing the elements or defining features that are deemed responsible for its outstanding performance. The identification is usually based on predominantly qualitative analysis which employs interpretative research methods involving a case-oriented and context-sensitive perspective. Ultimately, this tool aims at imitation as the main implementation mechanism.

However, this tool rests on the often problematic assumption that the model of ‘good practice’ can be simply copied partly or entirely to improve the functioning of other cases. Moreover, the acquisition of the status as ‘good practice’ is often based on merely anecdotal evidence. Lacking a systematic method for comparison, the outstanding position that is discursively assigned to certain pioneer cases, role models or prototypes can be incidental. What practice is en vogue and counts as the ‘best’ is partly subject to dynamics of herd mentality and groupthink. Not unlike the fashion cycle, perceptions of boom or bust can also change quickly as fresh empirical evidence becomes available. For example, the German model of publicly subsidised private pensions (Riester-Rente) was first considered a failure as uptake was slow initially, then deemed good practice for a number of years as participation rates rose at a healthy pace, and is now seriously questioned again as projected benefit levels disappoint and administration costs turn out too high given the moderate average performance of funds (Hagen 2018). Therefore, it is important to maintain a critical distance and not place too much weight on the presumed superiority of a given practice over others before it has been put to a more rigorous test, e.g. through more technically refined policy tools such as social indicators or programme evaluation (which are described in detail below).

There are two classes of ‘good practice’ that are relevant in the present context: (a) good practice in legislation and public welfare programmes on the one hand, and (b) good practice at the workplace level on the other.

In the realm of legislation, a famous case of a ‘good practice’ is the switch to a non-financial or notional defined contribution system of pensions in Sweden, which is considered the first major pension reform in an advanced industrial society to react to the challenges posed by population ageing. By adjusting benefits according to average life expectancy and economic growth it offered a systematic solution that would ensure system sustainability (Glans 2008). Many international observers took note as the reform tackled a common problem many other countries were facing in a similar manner. The system was celebrated in the pension policy discourse and several of its components were adopted in other national pension reforms (see, e.g. Palmer 2000). As another example, in 2014 the German parliament passed legislation introducing a minimum wage, thereby ending a decade-long controversy in the country on the subject. In the public debate on the issue, the presence of minimum wage regulations in most other advanced economies was a powerful argument. In addition, a commission of trade union and business representatives evaluates the minimum wage every two years, which led to the recent increase from 8.50€ to 8.84€ per hour. Interestingly, by introducing the system of regular monitoring, stakeholders are building a body of evidence to influence further policy development.

While both ‘good practices’ mentioned here are examples of large-scale systemic welfare state reforms, smaller pieces of legislation can also become ‘good practice’. For example, in the ‘employer toolkit’ for managers of older workers published by the UK Department for Work and Pensions (2016), it is recommended to limit exposure to night work for workers over 60 and increase rest periods (despite recognising that there exists no robust evidence that shift work has more adverse consequences for the wellbeing of older workers). An extensive report of good practice based on company case studies recommends work groups that are of mixed age (European Commission 2006: 145). These are typical examples of good practice at the workplace level.

Social Indicators

Social indicators are ‘[e]asily identified features of a society which can be measured, which vary over time, and are taken as revealing some underlying aspect of social reality’ (Scott and Marshall 2005: 61). They are clearly defined quantitative measures assessing the outcomes that current policies produce in specific societal domains. Social indicators are often established as time series to ease comparability and are used in all fields of policy. Examples of social indicators in the field of old age and work are the unemployment rate among 55–64 year-old persons, poverty rates among people of pension age, or average replacement rates offered by national pension schemes.

An indicator usually consists of a single figure that contains the relevant information in a very condensed form. At the same time, there are often variations of one and the same indicator (e.g. poverty levels based on different poverty definitions). In some cases (such as poverty), these variations reflect a lack of agreement on which is the most appropriate measure of an underlying matter. Other indicators, by contrast, are highly standardised and conventional (for example mortality rates). Social indicators are based on administrative data, censuses or large social surveys. They are particularly useful for comparing outcomes over time, between gender, age or social groups, between spatial units (such as cities, regions, countries) or between administrative units. Due to their condensed form, social indicators are very powerful and attractive tools which are easy to use and to disseminate.

Still, as they are so condensed it is of paramount importance to understand the origin of an indicator, i.e. (the generation of) its data base and its mathematical derivation, in order to interpret it accurately. Their reductionism is thus also the weakness of social indicators, as they can be easily drawn upon or understood in oversimplifying or erroneous ways. Misinterpretations can arise, for example, if the content of what the indicator measures is misconceived, if trends are misread or if variations across different subpopulations are not adequately shown. As a famous Churchill quote illustrates (‘I only believe in statistics that I doctored myself’), social indicators carry the risk of being instrumentalised in detrimental ways.
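To illustrate why the derivation of an indicator matters, the following minimal Python sketch computes a relative poverty rate for the same group of people under two different poverty definitions. The income figures are invented for illustration, and the function name `poverty_rate` and the thresholds of 50% and 60% of median income are assumptions chosen for the example, not definitions taken from any of the toolkits discussed here.

```python
from statistics import median

def poverty_rate(incomes, threshold_share):
    """Share of persons whose income lies below threshold_share * median income."""
    poverty_line = threshold_share * median(incomes)
    poor = sum(1 for income in incomes if income < poverty_line)
    return poor / len(incomes)

# Hypothetical annual incomes of ten persons of pension age (illustrative only).
incomes = [7000, 8500, 8800, 12500, 14000, 16000, 18000, 21000, 26000, 40000]

# Same data, two poverty definitions (assumed thresholds).
rate_50 = poverty_rate(incomes, 0.5)  # poverty line at 50% of median income
rate_60 = poverty_rate(incomes, 0.6)  # poverty line at 60% of median income

print(f"Poverty rate at 50% of median: {rate_50:.0%}")
print(f"Poverty rate at 60% of median: {rate_60:.0%}")
```

In this toy example the same population yields a poverty rate of 10% under the stricter definition and 30% under the more generous one, which is why a reported figure cannot be interpreted without knowing the definition behind it.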

Notably, not every quantitative measure relating to policy outcomes is a social indicator. Rather, social indicators are those measures which are seen as capturing a crucial aspect of policy outcomes, such as the distribution of resources, economic performance, etc. What kinds of measures become important and conventional as social indicators is the result of social processes, in particular the interaction of social sciences and policy practice, in the course of which the related measure becomes charged with meaning (see section “Good Practice” for further details).

Nowadays, social indicators are widely used on different policy levels, be they local, regional, national, or international. Complex infrastructures producing and reporting social indicators have been established (at least) on national and international levels. International organisations like Eurostat, the Organisation for Economic Co-operation and Development (OECD) or the International Labour Organisation (ILO), use a multitude of social indicators for reports on various features of societies. While indicators are frequently compared between countries and over time, similar reporting systems often exist on national and regional levels.

Programme Evaluation

Programme evaluation refers to the measurement of the efficacy and efficiency of public policies or workplace practices. It focuses on comparing costs and benefits of a given programme, thus calculating the effectiveness and productivity of specific investments. This puts decision-makers in the public and private sector in the position to make informed choices about the efficient allocation of resources. To be capable of comparing inputs and outputs in an orderly manner, programme evaluation is based on the precise definition of the aims of the programme, the sound accounting of budgets and clear definitions of the applied financial concepts. Often, pre-defined ‘performance indicators’ (which share many features of the social indicators described in the foregoing section) are used to measure outputs.

The gold standard to measure the efficacy and efficiency of a policy programme or intervention consists of the application of an experimental research design. Simply comparing participants with non-participants or measuring the output of interest before and after participation in the programme may lead to flawed results because of possible confounding factors, selection effects and environmental influences. Rather, a rigorous impact assessment aims to find out whether a possible change in the target population has indeed been a direct consequence of the programme, or possibly would have happened anyway. The causal effect of the programme is identified by means of comparison with a counter-factual scenario in which the programme does not exist. Therefore, such programme evaluations characteristically involve closed experiments with treatment and control groups (or sometimes natural experiments) to examine the direct effects of a given policy reform or public intervention. To further illustrate this tool type, Text Box 4.1 provides an example of a US programme evaluation of an organisational redesign policy aimed at facilitating flexibility in the workplace.
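The logic of the counter-factual comparison can be sketched in a few lines of Python. The data below are invented for illustration and do not stem from any actual evaluation; the function names `before_after_change` and `difference_in_means` are likewise hypothetical. The sketch assumes that participants were randomly assigned to the treatment and control groups, so that the control group can stand in for the scenario without the programme.

```python
from statistics import mean

# Hypothetical binary outcome, invented for illustration:
# 1 = the worker expects to retire at 67 or later, 0 = earlier.
treated_before = [0, 1, 0, 0, 1, 0, 0, 1, 0, 0]  # treatment group, before the programme
treated_after = [1, 1, 0, 1, 1, 0, 1, 1, 0, 0]   # treatment group, after the programme
control_after = [0, 1, 0, 1, 1, 0, 0, 1, 0, 0]   # control group, same point in time

def before_after_change(before, after):
    """Naive estimate: change within the treatment group over time."""
    return mean(after) - mean(before)

def difference_in_means(treated, control):
    """Experimental estimate: treated minus control at the same point in time."""
    return mean(treated) - mean(control)

naive_estimate = before_after_change(treated_before, treated_after)
experimental_estimate = difference_in_means(treated_after, control_after)

print(f"Before-after change in treatment group: {naive_estimate:.2f}")
print(f"Treatment-control difference:           {experimental_estimate:.2f}")
```

In this constructed example the before-after comparison suggests a larger effect than the comparison with the control group, because part of the change would have occurred anyway. A real evaluation would additionally report the statistical uncertainty of the estimate and adjust for covariates.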

Programme evaluation can also be used for the appropriate fine-tuning of policy programmes, to check whether there are problems in their implementation (this also falls under the label of process evaluation), whether given programmes work better for certain subgroups of the population or segments of the economy, etc. Sometimes, rather than employing an experimental design, the evaluation of public policies is based on a dense narrative or process tracing of the policy and its success. In these instances, the boundary to ‘good practice’ tools (described above) is blurred as both approaches rely on “soft” methods for the measurement of performance.

The strength of programme evaluations resides in their analytical power and especially in the elegance of the experimental design. As this design ideally produces clear-cut estimates of the causal impact of a programme, it is highly appealing to decision-makers who can convincingly demonstrate tangible results to stakeholders. The proven impact and cost-benefit relation of a particular programme may also serve as a performance threshold for similar programmes, thereby providing validated measurement scales that allow benchmarking the efficacy and efficiency of policy interventions in different areas.

Text Box 4.1: Example of a Programme Evaluation

The evaluation of the STAR programme is a prime example of a programme evaluation in the context of policies for extending working lives. The study was carried out by Phyllis Moen, Erik Kojola, Erin L. Kelly and Yagmur Karakaya and published in the journal “Work, Aging and Retirement” in 2016. The policy evaluated in this randomised controlled trial was called “Support. Transform. Achieve. Results”, a programme that targeted workers aged 50 to 64 years. This organisational intervention was carried out in the IT division of a large US company. The intervention involved three elements:

(1) participatory training sessions in which working groups discussed ideas to increase employees’ working time flexibility by improving the efficiency of work processes;

(2) training sessions for supervisors to become more mindful of employees’ private affairs and aware of possible work-life imbalances in their organisations;

(3) evaluation measures that focus on results rather than on hours spent at the workplace (“face time”), e.g. by avoiding inefficient meetings requiring unnecessary physical presence.

The authors report substantial effects on expectations of later retirement measured five years after the introduction of STAR: “the likelihood of expecting to retire later, at age 67 or older, is on average 10.3% points higher for those in STAR, net of all other factors” (Moen et al. 2016: 330). Although the exact mechanisms behind this positive outcome are not unambiguously clear, the findings convincingly demonstrate that flexibility interventions are capable of altering retirement expectations. By making working conditions more accommodating for older workers, later retirement becomes a more attractive option.

The most important disadvantages of this methodology are pragmatic in nature. Implementation issues include elevated costs, work intensity and time requirements, especially if oversimplifying approaches like before-after-comparisons are to be avoided. Evaluating a public policy of a certain scale is a demanding task because often many actors are involved who need to be coordinated to ensure the proper setup of the experiment (e.g. compliance with assignment to treatment status, avoidance of contamination effects, etc.). Since programme evaluation usually involves considerable personnel costs and time requirements, there is the risk that eventual efficiency gains will be outweighed by the administrative and other costs of implementing the evaluation. Finally, as was the case with ‘good practice’, the functioning of a policy programme is always to some extent context-dependent, and it is possible that a given programme will not work in the same way in a different social environment.

Forecasts, Projections and Simulations

Projections, forecasts and simulations usually serve to predict future outcomes (in the case of projections and forecasts) or to speculate on potential outcomes (simulations) of a policy or several interrelated policies. They usually refer to the aggregate level of outcomes, not to the individual level, and involve several indicators that have been collected through large-scale surveys, censuses or administrative data. Based on models using advanced statistical methods, this type of tool serves to infer from past and current policy outcomes and their causes to future or potential outcomes in order to establish clearly determined scenarios of what will happen or of what might happen if certain ancillary conditions change in a specific way.

In more detail, projections and forecasts often target an important social indicator. Forecasts extrapolate past changes and current influences on the targeted measure into the future, while projections are based on specific assumptions regarding ancillary conditions. As the latter are often uncertain, projections are frequently based on different scenarios. Typical examples of this are population projections, which are usually established on the basis of several different scenarios regarding births, deaths and net-migration (e.g. Tabeau et al. 2001). Simulations work quite differently as they recreate real individual-level events. Moreover, assumptions about ancillary conditions tend to involve changes that are currently not very probable. For example, a simulation may be combined with projections in order to answer so-called “what if” questions, such as what would happen if a certain policy was introduced or ceased, or what would have happened if it had not been introduced. As can be seen from the above example, boundaries between projections and forecasts on the one hand, and simulations, on the other, can be fluid.
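The idea of scenario-based projections can be made concrete with a deliberately simplified Python sketch. It projects a total population under three sets of assumptions about births, deaths and net migration. All rates, the starting population and the scenario labels are hypothetical, and the function `project_population` is an illustrative name. Actual population projections typically use cohort-component methods that distinguish age and sex, which this sketch does not attempt.

```python
def project_population(start, years, birth_rate, death_rate, net_migration):
    """Return the projected population for years 0..years under fixed assumptions."""
    path = [float(start)]
    for _ in range(years):
        current = path[-1]
        births = current * birth_rate   # births proportional to current population
        deaths = current * death_rate   # deaths proportional to current population
        path.append(current + births - deaths + net_migration)
    return path

# Hypothetical scenario assumptions (population in thousands, rates per year).
scenarios = {
    "low": {"birth_rate": 0.008, "death_rate": 0.011, "net_migration": 0.0},
    "medium": {"birth_rate": 0.010, "death_rate": 0.010, "net_migration": 2.0},
    "high": {"birth_rate": 0.012, "death_rate": 0.009, "net_migration": 5.0},
}

projections = {
    name: project_population(start=1000.0, years=20, **assumptions)
    for name, assumptions in scenarios.items()
}

for name, path in projections.items():
    print(f"{name}: {path[-1]:.1f} thousand after 20 years")
```

The spread between the low and the high scenario widens with every projected year, which reflects the growing uncertainty of projections over longer time horizons.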

Projections, forecasts and simulations become more complex the more ancillary conditions are included in the underlying statistical model. In most cases, projections and forecasts can only provide a simplified prediction of the future, because it is not possible to include all ancillary conditions in the model. Moreover, trends and ancillary conditions can change in unpredictable ways, for example due to unforeseen events, such as wars or economic crises. Generally, results are more precise for the near than for the distant future.

Projections, forecasts and simulations are very challenging tools to assess policy results, as they require detailed quantified assumptions about the crucial influences on the outcome of interest. The latter can only be derived from good statistical explanations of the past or very good theories—simple extrapolations from past trends to the future, without any ancillary assumptions, will often produce inadequate projections.

As projections, forecasts and simulations can help to speculate about the future in a systematic way and to assess potential outcomes of a policy, they can be crucial for political planning. Like other tools, however, they have to be adequately understood and interpreted to fully exploit their potential, and not doing so might result in consequential fallacies about the success or failure of policies. An adequate understanding of projections, forecasts and simulations importantly also includes the uncertainties inherent in each of these tools. Therefore, these types of tools tend to be targeted at expert audiences, be they policy experts or social scientists.

In addition to these four different types of tools, it should be mentioned that ‘policy briefs’ are common synthetic toolkits which can combine the insights from several or all of the four types of analysis tools to recommend a compact set of policies. Recent examples in the area of ageing are the Gender Extended Working Life Policy Briefs (e.g. Ardito et al. 2018; Lössbroek et al. 2018).

Interaction of Toolkits and Policy Processes

While a typology of toolkits provides a useful categorisation to delineate policy toolkits by types, it provides little insight into the effectiveness of these toolkits. In this regard, it is critical to understand the policy process which these tools aim to inform. To study how policy toolkits influence actual policy decisions, some researchers focus on the ways in which policies are produced, captured and packaged as ‘knowledge products’ (such as national policies or service frameworks) and/or how these knowledge products are then transferred to the realm of practice. Such approaches discuss the existence of a ‘gap’ between research and practice, which is usually manifest in the low uptake of research evidence, in the patchy implementation of policies, and in stakeholder behaviour defending particular interests. According to these approaches, it is important to rethink knowledge and policy utilisation, and in fact, to frame knowledge and policy as an integral element of practice, rather than apart from it (Gkeredakis et al. 2011).


Policies are actions aiming to achieve certain outcomes in response to ‘some sort of problem that requires attention’ (Birkland 2011: 8). While the term policy encompasses a wide range of actions and legislation, in the context of this chapter, the interactions between policy toolkits and regulatory policies are in focus. Commonly, a distinction is drawn between public policies and other policies such as company policies. Public policies are ‘ultimately made by governments’ (Birkland 2011: 9) at various levels. Especially in the European Union (EU), supranational policies have been increasingly influential for policy-making in the member states. Next to supranational and national policies, in a number of countries, such as Germany, Spain or the US, legislative competences also exist at subnational level. The extent to which these subnational authorities can pass legislation varies considerably from country to country. Policy-making at these different levels thus never stands alone, but is structurally embedded in a multi-level surrounding. In addition to public actors, the private sector also influences policy making. Particularly in the area of extended working life, corporate practices and workplace arrangements regarding older employees are a critical component of the broader policy framework.

Policy-Making and Policy Toolkits

In order to assess the impact of policy toolkits, it is important to take account of the way policy-making works. The most influential and most commonly applied framework for policy analysis is the concept of the policy cycle. It emphasises ‘the political process as a continuous process of policy-making’ (Jann and Wegrich 2007: 44) that consists of different phases or stages, which serve heuristic purposes. In practice, the different stages might not be clearly distinguishable as temporal phases (Sabatier 2007). In addition, not all phases necessarily form part of every policy process. The most common version of the policy cycle distinguishes four phases of policy-making: (1) problem recognition and agenda-setting, (2) policy formulation and adoption, (3) policy implementation, and (4) policy evaluation.

(1) The starting point of every policy process is the identification of a given development, trend or situation as a problem that requires political action (Jann and Wegrich 2007). Agenda-setting has been characterised as ‘an ongoing competition among issue proponents to gain the attention of media professionals, the public, and policy elites’ (Dearing and Rogers 1996: 1–2). Policy tools and expertise can support efforts to put a specific problem on the political agenda. Especially toolkits that provide accurate and reliable information about the current state of society, ongoing trends or expected developments, such as social indicators and forecasts, can provide the basis for the identification of societal issues and their wider ramifications. In this way, they can become an important part of the assessment of the situation and of a particular problem being articulated as a political issue. Finally, analytical policy tools can contribute to legitimising political action (Barkenbus 1998).

(2) Once a political issue has become part of the political agenda, the goals of the policy dealing with it have to be defined, alternative routes of action considered and a decision on the course of action has to be adopted. Within this stage, a different set of policy tools gains importance: policy toolkits that provide insights into potential implications of different policy designs and into key factors for minimising negative side effects or unintended consequences, such as good practices, evaluation of previous policies or simulations, are especially valuable tools that can aid the formulation of a policy. Good practice and policy evaluation can draw attention to relevant features of institutional arrangements, helping to identify an adequate route for political action.

(3) Once a specific policy is adopted, its implementation can leave considerable space for interpretation that affects outcomes. Policies are thus interpreted and applied during implementation, influencing their shape and outcomes (Sabatier and Mazmanian 1980). In this phase, policy toolkits can provide information about factors that enable or impede a successful implementation.

(4) The last stage of the policy cycle is the evaluation of policies and of their implementation. Previous evaluations, indicators and good practice examples can be used in the course of this evaluation. In this way, policy analysis becomes an integral part of the political process. For instance, based on best practice examples and the evaluation of similar existing policies, lessons can be learnt and, depending on the outcome of the evaluation, either a new policy cycle is started or the policy process is terminated.

In brief, the role of policy toolkits is clearest in the evaluation stage, where policy outputs are systematically examined and analysed, but policy toolkits can provide important input during the other stages as well. Due to the inherent particularities of every stage of the policy cycle, different types of policy toolkits can gain importance to different degrees in these stages. Toolkits providing information on societal developments, trends and problems are helpful in the initial stage of the policy cycle, and toolkits offering detailed insights on policy features can be used in policy formulation as well as during the stage of implementation. The role of analytical toolkits in the first stages of the policy cycle is contingent on the specific circumstances. In democratic societies, in principle all policies are subject to public debates regarding their legitimacy and the efficient use of resources. Policy toolkits provide a sound empirical basis for this analytical task and thus fulfil a crucial function at the interface between research and practice.

Discussion and Conclusions

This chapter has defined policy toolkits as evidence-based sets of recommendations to create or change specific policies. We have developed a typology of tools and provided a structured overview of examples of existing policy toolkits in the area of employment and pension reforms in ageing societies. Furthermore, we have placed policy toolkits within a conceptual framework of the overall policy process, and have shown how toolkits may enter the different stages of the policy cycle. The identified policy toolkits need to be further reviewed to better understand their effectiveness in improving the policy process. We have also suggested that it is critical to understand the policy cycle and which stage of the cycle a given policy toolkit addresses.

This conception of policy toolkits inevitably entails some limitations. As a precondition for the development of toolkits for policy analysis, there needs to be at least a tentative consensus on the societal goals and challenges that said policies aim to address. Notably, this starting point implies a normative position that has a political dimension and is influenced by national and international debates involving diverse sets of actors and stakeholders. The objectives established by the European Commission include the promotion of healthy and active ageing to guarantee the sustainability of European welfare states, but also the inclusiveness and social cohesion of European societies. In the public debate, these goals are arguably widely shared across European societies as well as among different social actors and segments of the population. However, these goals are also notoriously vague, and discordant voices criticise the ideological connotations of the ‘active ageing’ paradigm (e.g. van Dyk et al. 2013), highlight the adverse effects of extending working life on gender equality (Ní Léime and Street 2017), or question the scope of the demographic ‘burden’ in the first place (e.g. Spijker and McInnes 2013). There is also the more general debate on the extent to which social sciences actually should be judged by their capacity to produce “useful” knowledge in the first place (e.g. Demers 2011).

Moreover, it can be questioned to what extent the different policy goals are congruent with each other and can be simultaneously achieved. To a certain extent, the two sets of goals, those pertaining to efficiency and those pertaining to equality, are in fact at least partially competing with each other. Thus, trade-offs between them need to be negotiated. In this case, is the main benchmark for public policies the extent to which they contribute to economic efficiency, or the extent to which they help attenuate social inequalities in terms of health, gender, class or other dimensions of stratification? Obviously, it should be up to democratically elected politicians, not scientists or technocrats, to establish the order of political priorities which applied research should adhere to.