Background

A more sustainable energy system is key to addressing global environmental and social challenges, and increased energy efficiency in the built environment is essential to realizing it. The built environment alone accounts for a large part of global final energy use, 32% according to the Intergovernmental Panel on Climate Change (IPCC) [1], and the energy demand is expected to increase over the coming decades. At the same time, the sector holds an estimated worldwide energy efficiency potential of 50–75% in existing buildings and 50–90% in new buildings [1]. To realize these potentials, and to bring about a transformation towards a more energy-efficient built environment, policy interventions will be necessary to overcome market failures and to accelerate desirable changes in the socio-technical system [1,2,3].

Over the years, various types of policy instruments for energy efficiency have been introduced, including legislative instruments such as building codes and minimum standards; financial instruments such as subsidies and tax reductions; informative instruments such as campaigns and labelling schemes; but also new types of instruments such as technology procurement programs, voluntary agreements, actor network platforms and white certificate schemes [4,5,6]. By applying and combining various types of policy instruments, the goal is not only to achieve incremental improvements towards a more energy-efficient built environment but also to support a transition of society, i.e. the introduction of new emerging technologies as well as changes in social norms, behaviour and institutional capacity.

Energy efficiency in buildings is a complex field, shaped by multiple actors (e.g. designers, developers, contractors, engineers, owners, tenants, capital providers, etc.), institutions (e.g. authorities, regulations and norms) and technological factors [7]. Building type, age, use and structure vary, as do the lifestyles of tenants, which creates mixed incentives for undertaking energy efficiency measures, while buildings are intrinsically slow to change owing to the long lifespans of measures [1, 7]. Altogether, the design of policy measures to achieve a transition towards a more energy-efficient built environment is a complex process that requires advanced knowledge of how various policy measures affect change.

Understanding which policy measures to use and how to design successful policy strategies for transformative change will require evaluations and a vigorous dialogue on how to evaluate [8,9,10]. Today, however, the methods and processes underpinning evaluations in the energy and environmental fields are often described as multifarious and largely lacking in systematization [9, 11].

The objective of this paper is thus to propose a theory-based framework that can be used for assessing existing policy evaluation practices, with a view to enabling evaluations to further support transformative policy strategies for energy efficiency. This framework is based on evaluation theory, policy analysis and transition research, which we will return to below. The objective is also to assess to what extent existing evaluations already apply a transformative evaluation approach.

We apply Vedung’s [12] definition of evaluation as a ‘careful retrospective assessment of the merit, worth and value of administration, output and outcome of government interventions, which is intended to play a role in future, practical action situations’. Moreover, we extend the boundaries of this definition to not only include retrospective ex-post evaluations but also prospective ex-ante assessments, since both types are recognized in the European Commission’s [13] guidelines for better regulation, as well as in the European Union’s (EU) currently enforced 7th Environment Action Programme as means to ‘improve environmental integration and policy coherence’ (see also [14]).

In principle, a thorough evaluation approach that supports the development of relevant knowledge as well as processes of change in society should build on essential theoretical knowledge bases such as evaluation theory and policy analysis. In evaluation theory, attention is drawn to the need for broad and reflexive methods [15, 16]; value judgements that reflect multiple stakeholders’ concerns; and facilitation of the use of evaluations through stakeholder involvement [12, 17,18,19,20]. Policy analysis brings a complementary emphasis on assessing the policy instrument’s management and administration, its role in a societal system (see, e.g. [21]), and the need for policy mixes to enhance system transformations [22, 23].

Furthermore, to capture and support more transformative changes in the energy system, as is often argued for and described when motivating and designing new energy policy instruments, transition research could be a very valuable complement to the evaluation design. Some key concepts that are central in most of the transition literature, and that can provide insights to a policy evaluation, include a system-oriented, scale-oriented and multi-actor based approach, along with processes that motivate transitions, such as visioning, experimentation and learning (see, e.g. [24,25,26,27]).

At present, however, it is unclear to what extent existing policy evaluations and policy evaluation approaches focus on, and are capable of, capturing and supporting transformative processes for the achievement of an energy-efficient society. To what extent do evaluations, either on their own or in combination with other evaluations, support the development of relevant knowledge, and to what extent do they rely on a theoretical knowledge base, providing a comprehensive and rigorous analysis and assessment? To answer this, we review and assess 33 evaluations of policy instruments for energy efficiency in buildings in Sweden. Based on the review, we discuss the evaluation practices applied today, and seek to identify means to further support the development and application of transformative evaluation strategies for energy policy. Sweden is chosen as a suitable case for the review because it is one of the forerunners in policy evaluation, with an extensive evaluation practice spanning multiple sectors [8, 28].

The outline of the paper is as follows: the ‘Theoretical framework’ section presents the proposed theory-based framework along with its theoretical underpinnings. The ‘Methods’ section presents the methods used for data collection and analysis when conducting the review of 33 policy evaluations. The ‘Results’ section presents results drawn from the review, showcasing actual evaluation practices of policy instruments for energy efficiency in buildings in Sweden. In the ‘Discussion’ and ‘Conclusion’ sections, we discuss and summarize our conclusions on how to further support a transformative evaluation approach and the development of relevant knowledge for a more energy-efficient society.

Theoretical framework

In this paper, we present a theory-based evaluation framework designed to assess existing evaluations and their ability to present a transformative and comprehensive evaluation approach for policy instruments. The framework is based on evaluation theory and is, moreover, complemented with insights drawn from the fields of policy analysis and transition research. While evaluation theory and policy analysis provide insights into the methodological and contextual conditions of the evaluations, transition research provides conceptual approaches for describing the structure and processes that support transformative societal changes (see, e.g. [24, 29,30,31]). Although transition research is a field consisting of multiple sub-orientations, certain key concepts cut across the wide span of orientations. For this study, we have identified such key concepts and incorporated them into the theoretical framework. First, we identify and introduce the need for a system-, scale- and multi-actor approach in the evaluations to support transformative changes. Second, we identify the need for evaluations to capture processes of visioning, experimentation and learning. We will return to these concepts in the description of the framework below.

The framework is categorized in accordance with Alkin and Christie’s [17] evaluation tree, arranged around the three main theoretical branches of (I) methods applied in evaluations, (II) value judgments in evaluations and (III) use of evaluations. For each category, a number of sub-categories have been identified, referring to key issues from evaluation theory, policy analysis and transition research, as described below. A full list of the categories and sub-categories that guide the review is found in Appendix 1.

Methods applied in evaluations

Methods applied in evaluations cover the entire process of evaluation, from theoretical points of departure to data collection and analysis. Methods for assessing interventions and, to some extent, data collection can largely be described as quantitative (e.g. statistical analysis and models) or qualitative (e.g. interviews, document analysis) [10, 32], and a mixed methods approach has been stressed by scholars [10, 32, 33] for the provision of comprehensive results. Relatedly, triangulation of data, methods, analysts and theories is emphasized for a systematic validation of consistency in findings from various approaches [34, 35], along with a robust counterfactual for attributing effects [36].

In the field of transition research, the application of a system approach is emphasized for assessing transformative change, along with the need to capture the interconnectedness of multiple components of the system [24, 26, 37]. Systems can be described in terms of technology, actors, networks and institutions, which are all interlinked. From an evaluation point of view, and for the case of energy efficiency, this would require a consideration of norms and institutions, actors and their behaviour, and interlinkages and dependencies within the societal system in which the evaluated energy policy is enforced. Moreover, the evaluation should encompass multiple actors (e.g. authorities, organizations, businesses and individuals) that hold a stake in or are affected by the energy policy, acknowledging that some actors may be driving transformative changes while others may be counteracting them [27]. It would also require attention to any technological factors that may challenge or change the current socio-technical system configuration. Related to this, scale is seen as another key component, often highlighting the importance of niches (hubs where small-scale experimentation can be carried out) and their protection through active shielding, nurturing or empowering efforts [38]. As an example, niches within energy efficiency in buildings may be small test-beds for, e.g. new building materials or appliances, which are supported by policy instruments that facilitate their activities.

Transition research also emphasizes processes that motivate and drive transitions, such as processes of visioning, experimentation and learning. Visioning is emphasized for its guiding of transitions by, for example, providing common goals and inspiring and activating actors [27, 31, 39]. Visions related to energy efficiency may be found in transnational goals, such as the EU energy and climate targets, and in nationally determined goals, thus relating to the institutional components of the system-perspective laid out above. Capturing of visioning within an evaluation may, however, not only include the investigation of efforts made towards reaching such goals, but may also include the assessment of future outcomes and their contribution to societal changes, calling for a long-term perspective [30, 31, 40].

In addition to visioning, experimentation and learning are recognized as essential transformative drivers [25, 27, 40, 41]. Learning is here viewed as the acquiring of new knowledge, for example derived from experiences and experimentation, which leads to improvement of some kind, through actions and policy decisions, or through alterations of paradigms and ideas [42,43,44]. Experimentation for energy efficiency in buildings can include physical innovation in terms of, e.g. building materials and technological equipment, but arguably also innovative policy design. Thus, the capturing of experimentation and learning in the evaluation of energy policy instruments requires an investigation of innovative efforts both in terms of policy design, its ability to facilitate or favour experimentation, and its intended outcome. Lastly, to focus on transformative efforts, forces upholding the status quo also need to be acknowledged, along with the efforts geared towards disrupting those [45].

To summarize this category, we review the traditional evaluation methods applied, approaches to assess impact and the construction of counterfactuals when performing impact evaluation, along with the assessment of side-effects, rebound effects and triangulation. Moreover, the potential to capture transformative efforts within the evaluations is assessed by identifying the use of a system-wide perspective, scale, a multi-actor approach, visioning, experimentation and learning (see Appendix 1).

Value judgements in evaluations

The decision regarding which value criteria to employ may be guided by actors in a descriptive approach, departing from the wide range of values held by stakeholders, or be determined by the evaluator in a prescriptive manner, guided by particular, justified values [46]. Criteria traditionally concern policy outcomes (effects) and goal attainment [10, 12, 33], but may also concern efficiency (cost-effectiveness or cost-benefit), relevance, flexibility, persistence or predictability [47]. They can also relate to democracy and social justice [48], by investigating, e.g. acceptability, transparency and equity [47], or may, as emphasized within policy analysis, concern consistency, coherence, credibility and comprehensiveness [23]. In addition, value judgements should express reflexivity regarding established goals, needs and methods (frequently called double-loop learning [49]), which refers to the examination of contributions and implications of a policy program’s goals on a societal level, with regard to how they serve to improve societal welfare (see, e.g. [15]). From a transition research perspective, a particular focus is placed on the different interests held by different stakeholders, and their role in the policy process, as captured by a multi-actor approach.

To summarize this category, we review the selected criteria used in the evaluations, how these criteria reflect the interests of different groups, by whom they have been decided and whether the evaluation expresses reflexivity in terms of challenging established goals, needs and methods (see Appendix 1).

Use of evaluations

Making use of evaluation results is paramount in evaluation theory [12, 18, 50]; however, the drivers for conducting an evaluation vary in terms of its intended utilization, spanning accountability, learning and political purposes [12, 14]. Ideally, an evaluation should capture and address the concerns of various stakeholders, not only those of the decision makers, and this calls for an elaborate evaluation design that is able to serve multiple kinds of uses [20].

The use of evaluation results, moreover, hinges upon the timing of the evaluation, especially in a rapidly developing policy field like energy efficiency [51]. As such, the design of the evaluation should mirror the timeframe of the use, where more rapidly executed evaluations may be preferable to more extensive ones on certain occasions, and vice versa.

From a utilization perspective, the key components adapted from the field of transition research are a multi-actor approach and learning. Learning matters through its role in contributing to improved policies and processes [42, 44]. A multi-actor approach matters both for capturing how different groups affect or are affected by a transition and for spurring increased utilization of evaluation results, by focusing the evaluation on relevant questions and by enabling commitment [12, 18]. Thus, it is important to assess what is done in the evaluation process to increase the accessibility of the evaluation to user groups that are not the obvious participants.

To summarize this category, we review the potential use of evaluations by investigating the identification and involvement of stakeholders in the evaluation process, both in terms of partaking in evaluation design and for data collection, and by other activities undertaken in order to facilitate further use. We also review the time frame for the use of the results (see Appendix 1).

Methods

Assessment and review approach

The assessment approach applied in this paper builds on the theory-based evaluation framework presented in the ‘Theoretical framework’ section, which gathers core aspects to be considered in order for evaluations to be able to provide essential knowledge in complex fields such as energy efficiency. By reviewing evaluations conducted for energy efficiency policy instruments in Sweden, following the theory-based evaluation framework, the intention is to identify the extent to which the aspects highlighted in the framework are already applied in practice.

The review is designed as a qualitative, systematic review of existing policy evaluations. The use of a systematic review for analysing existing evaluations is an established method, but rather than synthesizing results from the evaluations (see, e.g. [52, 53]), we focus on the conduct of evaluation, namely methods, value and use (see ‘Theoretical framework’ section). Each of these three framework categories has a number of sub-categories attached, which together form a protocol to guide the review (see Appendix 1). The review covers 33 Swedish policy evaluations, each of which has been closely read and qualitatively analysed according to the framework. When reviewing, we have identified methods, value judgements and aspects of use, as they have been presented and described in the evaluation reports. Thus, we have avoided interpretation of the evaluation content. Owing to the scope of the study, we have not conducted interviews with stakeholders such as evaluation commissioners, evaluators or other stakeholders involved in the implementation of the evaluated policy instruments.

Evaluations performed in Sweden and evaluations included in the review

Over the last decade, a variety of policy instruments aimed at advancing energy efficiency in buildings have been introduced in Sweden, and many of these policy instruments have been evaluated to form the basis for decision making and reporting. Although policy evaluations may be conducted by a range of actors (governmental, non-governmental and researchers), we include in this study only those evaluations that were directly commissioned by governmental authorities. This is in line with the aim of the study, which is to assess the current governmental evaluation practices in Sweden and the extent to which such evaluations have been used to inform transformative policies for energy efficiency. The evaluations reviewed in this study were selected using the following criteria:

  • The evaluated policy instrument was aimed at energy efficiency in buildings

  • The evaluation was commissioned by governmental authorities

  • The evaluation was commissioned between the years 2005 and 2015
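As an illustration only, the three selection criteria above can be read as a simple filter over candidate evaluations. The Python sketch below is not part of the study's method: the `Evaluation` record, its field names and the example candidates are hypothetical, and the period 2005–2015 is assumed to be inclusive at both ends.

```python
from dataclasses import dataclass


# Hypothetical record for a candidate evaluation; the field names are
# illustrative and do not come from the reviewed material.
@dataclass(frozen=True)
class Evaluation:
    title: str
    targets_building_energy_efficiency: bool
    commissioned_by_government: bool
    year_commissioned: int


def meets_selection_criteria(ev: Evaluation) -> bool:
    """Apply the three selection criteria (period assumed inclusive)."""
    return (
        ev.targets_building_energy_efficiency
        and ev.commissioned_by_government
        and 2005 <= ev.year_commissioned <= 2015
    )


# Made-up candidates, for illustration only.
candidates = [
    Evaluation("Example A", True, True, 2010),
    Evaluation("Example B", True, False, 2012),  # not government-commissioned
    Evaluation("Example C", False, True, 2008),  # not aimed at buildings
    Evaluation("Example D", True, True, 2016),   # outside the period
]

selected = [ev.title for ev in candidates if meets_selection_criteria(ev)]
print(selected)
```

Only the first made-up candidate passes all three criteria; each of the others fails exactly one of them.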

The evaluations were collected from state websites, national reports and interpersonal contacts with Swedish governmental authorities (e.g. the Swedish Energy Agency; the National Board of Housing, Building and Planning) as well as with external consultants who conducted evaluations. All evaluations collected that met the selection criteria were reviewed. Thus, this study presents the review of 33 evaluations of policy instruments conducted or commissioned by Swedish governmental authorities in the period 2005–2015. The following policy initiatives are included:

  • Legislative instruments (two evaluations): revision of building codes and energy requirements.

  • Financial instruments (five evaluations): subsidies and tax reductions for energy efficiency measures and subsidies for performing energy audits.

  • Informative instruments (18 evaluations): demonstration projects, municipal energy advisory programs, energy performance certificates for buildings, online information portals, facilitating energy services and public authority best practice.

  • Other instruments (eight evaluations): technology procurement programs and cooperative network programs for energy efficiency in buildings.
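The composition of the sample listed above can be checked with a short tally. The Python sketch below uses only the counts stated in the list; the dictionary keys are shorthand labels chosen for illustration, and the percentage shares are derived here rather than reported in the study.

```python
# Number of evaluations per instrument type, as stated in the list above.
sample = {
    "legislative": 2,
    "financial": 5,
    "informative": 18,
    "other": 8,
}

# The four groups should add up to the 33 reviewed evaluations.
total = sum(sample.values())

# Share of each instrument type in percent, rounded to one decimal.
shares = {kind: round(100 * n / total, 1) for kind, n in sample.items()}

print(total)
print(shares)
```

The tally confirms that the four groups account for all 33 evaluations, and that informative instruments make up just over half of the sample.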

Moreover, a number of evaluations were collected that were deemed to not fully meet the selection criteria. These evaluations concerned policy instruments that were only partly or indirectly connected to energy efficiency in buildings, and were therefore not included in the review:

  • Evaluations of multi-sectoral policy instruments: the contribution of the policy instrument to energy efficiency in buildings was not evaluated specifically.

  • Assessments of policy mixes using energy system models; economic modelling; or bottom-up/top-down effect calculations: these evaluations contained a number of energy policy instruments that were assessed in combinations, commonly spanning different sectors, of which not all directly referred to energy efficiency in buildings. Examples of such policy instruments include energy taxes, carbon taxes and policies targeting fuel substitution.

Thus, the final selection of policy evaluations that was reviewed within the scope of this study places focus on evaluations conducted of individual policy instruments. An overview of the final sample is found in Table 1, showcasing the distribution of type of evaluand, commissioner and evaluator. A full reference list of the evaluations, both those included in the review and those that were not, is found in Appendix 2.

Table 1 Overview of review sample of 33 evaluations, showcasing distribution of policy instrument type, commissioner and evaluator

Results

Below, we present results drawn from the review of evaluations, and assess the extent to which the Swedish evaluations of energy efficiency policies apply the components of transformative evaluations, presented in the theory-based evaluation framework in the ‘Theoretical framework’ section.

Before going into depth in the review, we start by providing some context for the reviewed evaluations. As outlined above, all 33 evaluations were commissioned by governmental authorities. Two of these authorities, the Swedish Energy Agency and the National Board of Housing, Building and Planning, also conducted evaluations, whereas the majority were conducted by external consultants, as illustrated in Table 1. The external consultants comprise eight different firms, whose contributions to the evaluations ranged from data collection and compilation of reports to the full execution of the evaluation.

The stated purposes of the evaluations within the reviewed sample spanned many objectives: many evaluations focused on outcomes and impacts, while others focused more on process-related inquiries, such as administrative burdens, the organizations involved in the implementation, or the views of different stakeholders. In all, the foci of the commissioned evaluations were relatively limited in scope and time, and although the evaluations provided learning on essential elements, none of them explicitly expressed a purpose to advance knowledge supporting a broad transformative change in society. Within the review sample, certain policy programs recurred through yearly evaluations, such as the municipal advisory program, or through evaluations conducted mid-term and at the end of an implementation, providing insights over time.

Below, we present the findings from the review following the three main categories: methods applied, value judgements and use.

Methods applied in evaluations

A key challenge in evaluating energy efficiency policies is how to provide credible results and vital learning, and an important aspect of this challenge is the choice of methods to be applied in the evaluations. The evaluations of energy efficiency policies covered in this study show a respectable mix of methods used to assess the results of the policy instruments (Fig. 1). The majority of the evaluations were based on two (17/33) or three methods (12/33); the most frequently used being interviews (30) and document analysis (covering, e.g. applications, regulations, statutes) (28).

Fig. 1 Methods used for analysis in the reviewed evaluations (‘Interviews in groups’ is a method employed to gather several different types of stakeholders for interviews and discussions regarding early evaluation results)

Methods for data collection and analysis overlap in surveys, interviews and document analysis. Surveys, and in some instances document studies, were commonly used to provide data for both qualitative analysis (e.g. opinions, functions of a policy instrument) and quantitative analysis (e.g. number of measures taken, number of applications). The purely quantitative methods for analysis that were used in the sample were calculations and statistical analysis. Altogether, the reviewed evaluations leaned towards qualitative methods and assessment of effects of a ‘softer’ nature, geared towards influencing stakeholders in taking energy efficiency measures and facilitating knowledge transfer between actors. Although the review does indicate a use of multiple methods, triangulation as a tool for systematically checking the consistency and validity of results was generally scarce. Different data sources or methods were not commonly combined to test the consistency of findings, but rather used as complements to each other, providing input to specific questions.

Both evaluation theory and the policy analysis literature stress the construction of a counterfactual for the assessment of the actual impact of any policy intervention (see, e.g. [36]). The review shows that the most prevalent type of counterfactual analysis applied in the evaluations of energy efficiency policies in Sweden was derived from interviews or surveys (14/15), where evaluators sought to determine the additionality by asking stakeholders for their opinion on the extent to which the policy instrument had affected their decisions and actions. Calculated counterfactuals and reference scenarios were less commonly used (3/15). Two of the reviewed evaluations used multiple methods for the construction of counterfactuals, which were synthesized from two and three methods respectively, the latter combining surveys, calculations and a reference group [54, 55]. The use of a reference group was only seen in this one evaluation, which concerned a subsidy for energy audits. The choice of methods for constructing the counterfactuals was, however, rarely discussed in the reviewed evaluation reports. Multi-actor involvement occurred in the interviews or surveys, but reflections relating to the selection of actors to be involved, or the number of respondents needed for constructing a robust counterfactual, were not apparent in the written reports, although they may of course have been undertaken during the evaluation design process.

To provide knowledge on transformative changes in society, and the potential drivers and barriers for energy efficiency, a system-wide evaluation approach is required. Policy analysis specifically brings forward the assessment of aspects such as side-effects and rebound effects. Among the reviewed evaluations, only one incorporated a more thorough consideration of side-effects, by identifying the environmental impacts and assessing their costs [56]. Eight other evaluations mentioned side-effects, or rather co-benefits, such as marketing and competition advantages, altered value of buildings as an effect of energy efficiency measures and the creation of joint platforms for knowledge exchange between authorities and businesses. Rebound effects were mentioned in one evaluation, which noted that increased energy efficiency can lead to increased energy use due to improved comfort [57].

Transition research emphasizes a system- and scale-oriented perspective. Such an assessment can be designed in many different ways, including for example the assessment of system components such as actors, institutions and technological factors. The review of the Swedish evaluations shows that many evaluations focused on actors (20/33): on outcomes and effects that influenced different actors in their actions (e.g. the municipal advisory program or the energy audits), or their knowledge acquisition (e.g. within the collaborative network programs). The actors that were considered included authorities, beneficiaries, organizations and companies. Institutional aspects were considered in seven evaluations, of which four referred to evaluation of the building codes [56, 58], the act on energy declarations in buildings [59] and the regulation on support for investment in energy efficiency [60]. It should be mentioned that institutional factors, such as overarching regulations and institutions (both national and at EU level), were mentioned in some evaluations with regard to how the evaluand related to or aided in fulfilling overarching initiatives. For example, 11 evaluations, all performed by the same external consultant, schematically mapped the evaluand in relation to such institutions, but did not elaborate further on potential effects of their interactions. Technological factors, or the role of new technological innovations, were addressed in nine evaluations. These evaluations mainly concerned policy programs for technology procurement, demonstration programs and the cooperative network program LÅGAN. One evaluation concerning a tightening of building codes acknowledged technological factors from the perspective that the current level of technology merited a sharpening of the regulations [56]. In all, the reviewed evaluations assessed actors, institutions and technological factors, but did not elaborate further on their interactions within the system.

Transition research also advocates a multi-actor approach, to capture the potential and the effects of transformative interventions. For the particular case of energy efficiency in buildings, such actors include private and public house owners and tenants; the construction industry; other businesses and organizations engaged in either providing or adopting energy efficiency measures; and authorities and municipalities. In the reviewed sample, 31 evaluations incorporated stakeholders to some extent, and approximately half of these evaluations (16/31) took a multi-actor approach, involving two or more groups either targeted or otherwise involved in the implementation of the policy instrument. These were commonly authorities, beneficiaries and representatives from businesses and organizations. Discussions about actor groups that were consequently not involved were uncommon within the reviewed material, but may be valuable for opening up the evaluation boundaries.

Transition research, moreover, brings forward processes of visioning, experimentation and learning as central in the analysis of transformative changes. Visioning may be captured through a long-term evaluation approach. In the reviewed documents, we did not find such approaches, although there were some recurring evaluations of the same policy instruments, especially in the cases of the municipal energy advisory programs, cooperative network programs and subsidies for energy audits. Visioning may also be captured and supported in evaluations through a combination of ex-ante and ex-post evaluations; one evaluation [55] was constructed as such, creating an initiative for both learning and prediction of future outcomes simultaneously.

In the assessments of experimentation and learning, we looked for the acknowledgement of experimental efforts in terms of innovative policy and potential outcome, as well as consideration of facilitation for experimentation. In all, we found four evaluations that took an experimentation criterion into account. These evaluations did not showcase different foci than did other evaluations in the sample, but differed in that they concerned policy instruments that may be more prone to experimentation. Three of them, concerning cooperative network programs [61], technology procurement [62] and a demonstration program for passive houses [63], stated that the evaluated instruments provided platforms for experimentation that facilitated the development of new energy efficient technologies, including, e.g. new building materials, windows and energy steering systems. A number of examples of successful technologies were presented within particularly the two latter evaluations, to showcase good practices that had been nurtured through the implementations. Another key feature that was mentioned within these evaluations was the need to take risks and perhaps fail in order to learn. Related to experimentation was, moreover, the upscaling of new innovations, which to a certain degree was captured in these evaluations through their assessments of innovative projects: from their inception to making an impact on a larger scale. Although these are important components for capturing transformational efforts, the notion of experimentation was, in these cases, embedded within the policy instruments, and as such a required evaluation angle.

Conversely, the fourth evaluation reached a different assessment regarding experimentation. This evaluation concerned a policy instrument aimed at providing investment support for various measures for energy efficiency [54], and stated that the instrument had been unsuccessful in providing support for experimentation relating to unconventional technology and technology development. This is an important note, since it draws attention to the fact that the policy instrument in question was failing to break new ground, which in itself is a valuable insight to be drawn from the evaluation results. It should, however, be emphasized that not all policy instruments allow experimentation to the same extent; a program for technology procurement or a demonstration project holds greater potential to spur experimentation than do, e.g. regulations or energy audits. The lack of acknowledgement of experimentation thus needs to be regarded on the basis of the particular policy instrument’s context and objectives.

Another important aspect of learning to be drawn from a policy evaluation is the identification of potential lock-in effects and path dependencies, i.e. to analyse how the current system configuration, with its norms and current technologies, is affecting and potentially hindering stakeholders in pursuing a more energy efficient pathway. It would also be valuable to uncover potential efforts to disrupt such configurations, e.g. to question norms or other factors that have an inhibiting effect on energy efficiency uptake. In the review sample, such discussions were not apparent, apart from the reflection on the (lacking) ability of a financial policy instrument to support development of new technology presented above [54].

Value judgments in evaluations

Evaluation theory states that the nature of evaluation is normative, and that assessments require a value base and criteria for valuation. Valuing is, however, broader than just criteria; it is also about the legitimacy of the value claims, which is closely related to the involvement of multiple actors, social justice and reflexivity. It is crucial to take all these aspects into account in order to evaluate and support transformative processes. The potential success of policy instruments for energy efficiency in buildings, as well as the realization of transformative changes in society, will very much rely on values held by the many involved actors. For this reason, a multi-criteria evaluation approach is often advocated [47], which is able to illuminate the implementation from various perspectives and angles.

In the reviewed evaluation reports, various criteria for valuation were applied, with the number of criteria per evaluation ranging from one to four. The most frequently used criteria in the reviewed sample were effectiveness and impact (Fig. 2). These criteria partly revolved around elements that could be measured, e.g. in saved kilowatt hours or in monetary terms, but were also of a softer nature, concerning, e.g. impacts on stakeholders’ actions in taking energy efficiency measures. The third most commonly applied criterion was instrumental feasibility and assessments concerning administrative processes (13/33).

Fig. 2 Value criteria used for assessment within the review sample

In order to support transformative changes and efforts, evaluation criteria should not be limited to just impacts and effectiveness, but should also include the drivers for change and their implications. Such criteria may be more process-related and aimed at the mechanisms behind a successful implementation, such as acceptability, relevance and coordination with other policies. While relevance as a criterion was scarcely applied within the reviewed evaluations (1/33), five evaluations explicitly assessed policy coordination with other, similar policy instruments, thus providing valuable insights on potential synergies or conflicts.

A broader assessment may also aim to capture various viewpoints of stakeholders. Energy efficiency policies may affect different stakeholders in different ways, raising the question of who benefits from the policy instrument and who does not. Discussions concerning legitimacy of value claims were not apparent within the written reports, leaving issues concerning the evaluation approach, value constructions or value dissonance (different values held by different stakeholders) in the evaluation process unmentioned. Moreover, it is also important to cover aspects of reflexivity in the evaluations in terms of challenging established goals and needs. Such reflexive elements were found in eight evaluations, which presented discussions concerning the instrument’s goals or design, or the intention underlying the instrument. These evaluations covered instruments of the informative (3) and financial (3) kinds, as well as a procurement program and a cooperative network program. Concrete examples include the evaluation of a subsidy for house owners to change windows [57], which questioned the need for the policy instrument due to lacking additionality, and an evaluation of the municipal advisory program [64], which questioned targeting the advisory program at a wide range of actors rather than at a more limited range with high saving potential.

Use of evaluations

Evaluation theory emphasizes that evaluations are undertaken in order to be used, and that stakeholder involvement in the evaluation process is essential in order to enhance use. Involvement of stakeholders will help gear the evaluation towards issues of importance that can lead to essential learning. Stakeholders should be involved in the evaluation process to focus it, to make it timely, to participate in decisions on methods and data collection and in the interpretation of findings, and to influence value judgements [12, 19]. From the perspective of transition research, the involvement of multiple actors is, furthermore, emphasized for the understanding of various actors’ roles in driving a transition [27]. In this paper, we thus argue that for the realization of an energy transition that is in part fuelled by strong energy efficiency policies, a reflection on use of results, actor involvement and learning in evaluations is essential.

As illustrated in Table 1, the predominant way for Swedish governmental authorities to proceed in the undertaking of a policy evaluation was to commission external consultants for conducting the evaluation (26/33). There are examples of the authorities conducting evaluations as well: in the role of both commissioner and evaluator (4/33); as commissioned by the government (1/33); and in more extensive collaborations between both authority and consultant (2/33).

In terms of involvement and use, it is not apparent from the written reports to what extent the commissioners were involved in the design of the evaluation. Following the notion that actor involvement may increase and facilitate use of evaluation results, we looked further into actor involvement in the reviewed evaluations. The review showed that actor involvement was dominated by interviews, with the number of respondents per study ranging from one to 700. Discussion about the scope and size of the group of respondents was uncommon in the reviewed evaluations, and the involvement was largely limited to data collection rather than facilitating further use. Evaluations conducted by one consultant included seminars where a group of selected stakeholders was invited to discuss preliminary results (the interviews in groups). While this may have had a facilitating effect on future use of the results, it was seemingly not the primary aim. Although a strategy for facilitating further use was not clearly stated in the reviewed evaluations, it may have been part of the design process or otherwise carried forward after the closing of the report. It should also be mentioned that the majority of the review sample was freely available on state websites, which in itself may increase use.

Lastly, returning to the foci of the reviewed evaluations, a number of evaluations had a stated objective of providing knowledge that could be used for decisions for further financial support of the program, and for improvements of the next phase of a policy program, as such indicating direct areas for use. From a learning perspective, in creating a knowledge base and capturing new knowledge, the review revealed there was an ex-post emphasis in the evaluation practice, as the absolute majority of the reviewed evaluations were conducted in retrospect (30/33). Many evaluations concluded with a set of recommendations or suggestions for further improvements of the policy design, thus summarizing and emphasizing what had been learnt from the evaluation of the implementation. If used and conveyed properly, this may serve as a valuable source of knowledge for how to improve current and new designs of policy instruments for accelerating energy efficiency in buildings.

Discussion

Achieving energy efficiency and transformative changes in the built environment entails addressing multiple components of a complex system, and without proper evaluation of policy instruments, they risk being ineffective. In this paper, we argue that for evaluations to be able to inform about transformative effects, a sound evaluation methodology is essential along with a comprehensive evaluation approach that is able to capture transformative efforts of various scales.

The review and assessment of the Swedish evaluations, first of all, show an impressive range of evaluations undertaken, with multiple evaluations conducted yearly. Moreover, we see a general application of sound evaluation approaches and, to some extent, evaluation aspects supporting transformative processes of change. The sample varies in terms of both evaluation foci and scope: while some evaluations were rather extensive and evaluated a number of different aspects of the implementations, others were significantly shorter and narrower, and were underpinned by a limited amount of data. On this note, we acknowledge that evaluations may be commissioned under limited time and resources, and that they therefore differ in scope.

In all, the sample leaned towards qualitative methods. Thus, the sample indicates that the individual, mainly qualitative evaluations in this study provide a good complement to other purely quantitative schemes for monitoring and evaluation, such as indicators and modelling assessments, which commonly focus on policy packages (see, e.g. excluded evaluations in Appendix 2). Moreover, the evaluations frequently combined several methods, notably interviews and document studies. In these evaluations, the document studies provided insights about, for example, how steering documents, application forms and reports supported or shaped the intervention, while the interviews provided deeper understanding about the workings of the policy, how it had been perceived and received and where there was need for further tuning or development. In combination, they provided valuable insights on the functions and effects of energy policies, and were transparent in their representation of the stakeholders that were involved.

Nevertheless, triangulation of methods for systematic validation of findings was rare, contrasting the predominant application of multiple methods and data sources for the conduct of the same evaluation. Thus, we see an opportunity to apply these methods more flexibly to allow triangulation for validation of findings, and to cross-fertilize and potentially gain additional information when sources are cross-checked. Of course, this is a matter of time and resources as pointed out above, but since the foundation within the evaluation practices largely is in place, the application of triangulation does not necessarily require extensive amounts of additional resources, while presenting a potential to provide added value and robustness to the evaluation results.

The same argument holds for the application of counterfactual constructions for assessing the impact of the policy instruments, where we see that the practices may be strengthened further by combining different methods, by discussing the limitations of the counterfactual and by deliberately including a wide range of actors in its construction. A good practice to be highlighted from the review was the evaluation combining three different methods for synthesizing a counterfactual, including actor involvement through a reference group [55].

A last note on the methods applied is that, as discussed by Vedung [12] and Weiss [10], the purpose of the evaluation guides the evaluator towards selecting appropriate methods. As we have seen, the methods applied in the reviewed evaluations do, to certain degrees, incorporate key aspects from transition research, even though none of the reviewed evaluations had a stated purpose of investigating the transformative potential of the policy instrument. Thus, by bringing transition to the focal area of the inquiry, the methods applied can also be further strengthened and developed to investigate transformative contributions of an energy policy instrument.

As regards value judgements within the evaluations, the majority of evaluations in the review applied two or more criteria, thereby taking a wider approach. However, the criteria were often limited to effectiveness and impact, whereas criteria for investigating mechanisms of the implementation that aided or counteracted the processes or outcomes of a policy instrument, such as acceptability and predictability, were underrepresented, as were assessments of relevance, a criterion that has been recognized as key for environmental policy evaluation [13, 65]. By contrast, evaluations that opened up for a broader scope and set of criteria, often combining assessments of outcomes and impacts with assessments regarding instrumental and administrative feasibility or coordination with other policy instruments, gave rise to a deeper understanding of the workings of the implementation. These evaluations are, moreover, potentially more likely to inform about transformative contributions of the evaluand, since they could to a larger degree engage in discussions regarding the design and processes of a policy instrument, and how these link to effects on energy efficiency in the built environment.

Turning then to the evaluations’ application and acknowledgement of key aspects for capturing contributions to an energy transition, the reviewed evaluations showed that assessments of changes on a system level were limited, and predominantly characterised by scattered evaluations of side-effects and rebound effects. An evaluation focus on the potential transformation of the current institutions, technological factors or the participation and behavioural aspects among actors was scarce, if present at all, as was evaluation of the interaction between these different system components. As for actor involvement, good practices were seen, e.g. in six evaluations that used interviews in groups as a method for discussing results and issues, which allowed for deeper participation and discussion among actors. Otherwise, actor involvement was largely limited to data collection, where selected groups were asked to partake in surveys, interviews or workshops. Generally, the evaluations did not explicitly explain to what extent stakeholders such as beneficiaries, businesses, organizations and authorities had been invited to partake in decisions regarding evaluation design and the selection of criteria or methods to be used in the evaluation processes, but we acknowledge that such discussions may have been undertaken outside of the final evaluation report. Nevertheless, transitions are not isolated events, but are complex processes involving and affecting multiple actors and system components (see, e.g. [24, 27]), and without accounting for these factors, the evaluations risk becoming isolated snapshots of effects that are not put in relation to other efforts or policy initiatives. Thus, if evaluations take a deliberate systems and multi-actor approach, they may be better equipped for a deeper analysis of whether the policy instrument is in fact delivering crucial contributions to a transition.

As for the process-related aspects of a transition, such as the acknowledgement of visioning, experimentation, upscaling and learning, our review showed that these were featured in some evaluations, through combined ex-post and ex-ante assessments, or through discussions about how the evaluand had promoted experimentation or otherwise capitalized on learning. By and large, however, such assessments were subordinate, which is possibly explained by the fact that the evaluations’ foci were not aimed at them. As was previously mentioned, we thus see great potential in deliberately gearing the evaluations’ foci towards capturing transformative efforts, and thereby prompting an adaptation of methods to include key concepts for a transition.

Furthermore, many policy instruments for energy efficiency in buildings are aimed at encouraging a change of behaviour among citizens, organizations and authorities alike. Energy efficiency in buildings is, thus, ultimately not only a matter of new technological innovation and improvement of building materials and equipment, but certainly also a matter of altering our behaviour and how we live in and interact with our buildings, something that was frequently overlooked in the reviewed evaluations. This needs to be addressed if evaluations are to provide the multi-faceted information that can support the highly multi-faceted challenge of realizing an energy transition.

Since the study at hand has been scoped by close reading of existing evaluation reports, a further investigation of the evaluation design processes is suggested for future research. This could include interviews with commissioners, evaluators and other stakeholders involved in the implementation, and would allow an extended analysis of the current evaluation practices with regards to underlying decisions and commission requirements. An extended analysis could also include evaluations commissioned and performed by a broader span of actors, such as academia and stakeholders in the private sector. While we in this study focused on evaluations commissioned by governmental authorities in order to facilitate an assessment of how these evaluations in particular are performed with regards to assessing transformative contributions, an extended analysis could further provide insights about how policy evaluations from different sectors can strengthen each other in order to support transformative processes of change.

Conclusion

Evaluation holds great potential to support the realization of energy efficiency potentials in the built environment, by informing policy makers in designing and deciding upon strong and effective policy instruments. If evaluations are equipped to address the complexity of the energy system, and the transformative processes of change in the system, essential insights will be gained to accelerate energy efficiency. Supported by the theory-based framework, our review shows that the current evaluation practices have the structures in place to certain degrees for capturing transformative efforts and effects on a system level. In order to fully harness the potential of evaluations, however, some areas are outlined that may be further strengthened.

First, this study shows that numerous policy evaluations are performed, but their use is fragmented and lacking in coordination building up to an overarching transformative evaluation approach. To address this, we propose the application of an evaluation framework that can guide the structuring of evaluation designs so that they, in combination, can provide a more comprehensive overview of effects from current implementations at work.

Secondly, in order to capture transformative efforts, evaluations should more actively gear the focus towards effects on the system level, acknowledge the scale of change, stress a more thorough multi-actor approach and include actor involvement in the evaluation designs. Based on the review, we conclude that such aspects are present in various combinations and to various extents in the evaluations, but do not yet effectively assess interlinkages and effects on a grander scale. A further development of these evaluation components can enable a more transformative approach to evaluation. In addition to this, evaluations targeting transformative changes should also seek to address drivers for change, such as visioning, experimentation and learning. The review suggests that these aspects were acknowledged in certain policy evaluations: visioning through a long-term evaluation approach, and experimentation and learning seemingly depending on the type of evaluand and its ability to be experimental. Nonetheless, these drivers are key components if evaluations are to inform about whether the energy policy is contributing to new technologies or insights about policy designs, and to ensure that learning is capitalized on and conveyed into new and stronger policy initiatives.

Lastly, concerning the methodologies for conducting evaluations, current evaluation practices show strengths in the predominant application of multiple methods and criteria in the same evaluation. We also notice some good practices in how counterfactuals are constructed, building on different methods and sources. In order to further advance good practices, an emphasis should be placed on extending the use of multiple criteria, and especially criteria promoting reflexivity, to further capture the processes and effects of policy instruments. The frequent application of multiple methods, furthermore, paves the way for triangulation that can validate findings and thus provide added robustness to the results.

Altogether, we find the proposed theory-based evaluation framework useful for assessing and discussing both robustness and transformative efforts of current policy instruments and evaluation practices. In order to further promote an evaluation approach that can support transformative changes in society, we stress the need to further link evaluation theory and policy analysis with transition research to design evaluations that can provide a more systematic picture of the progress towards a transition to a more sustainable energy system.