Evaluating the achievements and impacts of EC framework programme transport projects
The purpose of this paper is to present the elements and evaluation methods that should be included in a framework for evaluating the achievements and impacts of transport projects supported in EC Framework Programmes (FP). Further, the paper discusses the potential of such an evaluation framework to produce recommendations regarding future transport research and policy objectives, as well as mutual learning as a basis for strategic long-term planning.
The paper describes the two-dimensional evaluation methodology developed in the course of the FP7 METRONOME project. The dimensions are: (1) achievement of project objectives and targets at different levels and (2) research project impacts according to four impact groups. The methodology uses four complementary approaches in evaluation, namely evaluation matrices, coordinator questionnaires, lead user interviews and workshops.
Based on testing of the methodology with a sample of FP5 and FP6 projects, the main results relating to the rationale, implementation and achievements of FP projects are presented. In general, achievement of objectives in both FPs was good. The strongest impacts were identified within the impact group of management and co-ordination. The scientific and end-user impacts of the projects were also adequate, but wider societal impacts were quite modest.
The paper concludes with a discussion both on the theoretical and practical implications of the proposed methodology and by presenting some relevant future research needs.
Keywords: Transport research projects; Evaluation of research programmes; Evaluation methods
Evaluation has been a legislative requirement for European Research and Technology Development (RTD) programmes since the early 1980s. Since then the Commission Services have gained varied experience in evaluating research. The launch of the Fourth Framework Programme (FP) in 1994 led the European Commission (EC) to introduce a new evaluation scheme consisting of annual reporting of continuous monitoring, and a five-year assessment that includes a review of the two previous research programmes. The most recent ex-post evaluation of FP6 deals with the entirety of FP6 and provides some input into the interim evaluation of FP7 to be performed in 2010. The Expert Group of the FP6 evaluation addressed three broad sets of issues: the rationale, implementation and achievements of FP6. For FP7, a new monitoring system, or internal management tool, consisting of a series of annual reports and a system of indicators is under development.
In the field of transport, the evaluation of European research projects’ achievements and impacts does not have a long tradition. Some national-level evaluations have been carried out in recent years (e.g. Pihlajamaa and Berg, Kalenoja et al. in Finland; Albrecht and Vaněček in the Czech Republic), but essentially research evaluation is a new, emerging field in the transport context at both national and European levels.
Currently, EU RTD evaluation practices comprise continuous monitoring, five-year assessments and mid-term evaluations. They are characterised by a strong focus on monitoring compared to impact assessment, on projects and programmes rather than the broad policy context, and by a heavy reliance on expert panels rather than studies. There is also a constraint imposed by the limited time and financial resources devoted to evaluation (EC Joint Research Centre (2002) RTD Evaluation Toolbox. http://www.fteval.at/files/evstudien/epub.pdf). Georghiou and Polt detail that in terms of evaluation at the European level, ‘there is no single model of good practice’, but peer reviews and expert groups are used for evaluation processes. This is also emphasised by Durieux and Fayl, who state that the most important means within the evaluation of European RTD programmes are independent expert panels, interviews, questionnaires and core indicators. The panels are made up of people with high levels of responsibility in the field, which in practice results in a balance of experts with either an industrial or an academic background. Experts are selected by the Commission on the basis of their experience and knowledge of community research policy, which indicates that they will be drawn primarily from the knowledge sector. Efforts are also made to ensure a balance among different sectors of the research community as well as a geographic spread of evaluators.
The range of users of knowledge produced by evaluations is broad, because evaluations may be conducted both internally and externally and by different organisations. The most typical user categories are decision makers, policy makers, practitioners, scientists, consultants, auditors, trained evaluators, programme and project managers, project participants, economic analysts, NGOs and consumer groups [8, 11, 23, 24].
Within each of the categories, there is a significant diversity of users, whose expectations for evaluation results and methodologies may vary. Consequently, the nature of the knowledge produced depends on its use, i.e. by whom and how the knowledge will be used. In addition, making use of the evaluation knowledge produced appears to be very challenging. Even though FP evaluations are becoming a permanent practice, the development of evaluation methodologies is often short-sighted and discontinuous, and the results are not disseminated widely to different stakeholders. Our view is that these issues need to be addressed carefully in the future, to allow evaluations to gain a greater role in guiding future policy and research agendas.
Our interest in transport research evaluation was initiated by the METRONOME project, financed under the FP7 of the EC, which aims to develop a methodology for evaluating research project impacts in the field of transport. The project, together with four other transport research evaluation projects (AGAPE, AIMS, MEFISTO and SITPRO Plus), represents transport research’s contribution to the overall trend in EC FP evaluations and provides a means of obtaining a more detailed view of transport research achievements in previous FP projects. In addition, the project contributes to new research policy objectives in the field of transport.
In the traditional view, European investment in RTD creates a demand for information on the efficiency with which RTD is managed, the quality of the work itself, and the economic and social returns. Evaluation schemes set up to supply this information are important tools for policymakers, and they give the research community an opportunity to demonstrate its achievements. Hence, the traditional role of research programme evaluations has been to legitimise past research activities. Since the focus has been on ex-post evaluations, very little attention has been given to future development, learning and strategic long-term planning, elements that are gaining strength in the contemporary evaluation literature (e.g. [2, 9, 14, 15, 16, 17]). Kuhlmann, for example, argues that the current RTD arena, with well-organised actors (having differing interests, values, and power) but no dominant player, competition for impact and resources, and a search for (some) alignment and policy learning, requires more from evaluation practices than just legitimacy. It requires considering research evaluations as ‘Strategic Intelligence’ in order to steer the research policy developments of the future. Arnold complements Kuhlmann’s arguments by claiming that a growing EU research budget also means: an increased need for accountability; the efficiency of the European RTD system coming under scrutiny; timing forthcoming evaluations in line with the need for an informed debate on future EU RTD policy; a need to focus more on the “fundamental” aspects and less on minor implementation issues; and a need to develop evaluation capacities as part of the European Research Area.
What kind of elements should a framework for evaluating the achievements and potential impacts of transport projects supported in EC framework programmes include?
What are the forms of evaluation methods required within such a framework?
Can such an evaluation framework produce recommendations for future transport research and policy objectives as well as mutual learning for the basis of strategic long term planning—the strategic intelligence?
In order to find answers to the above questions, we have structured the article as follows: First, we present the theoretical background for evaluation of research. Second, we describe the evaluation methodology developed in the course of the METRONOME project. In the subsequent section we present, based on the methodology testing, the main results relating to the rationale, implementation and achievements of the FP5 and FP6 transport research projects. We conclude with a discussion on both the theoretical and practical implications of our method and by presenting some relevant future research needs.
2 Theoretical background for evaluation of research
According to the classical definition of Scriven, “Evaluation is the process of determining the merit, worth and value of things.” It is the process of distinguishing the worthwhile from the worthless, the precious from the useless. Chelimsky emphasises that evaluation by definition is social research. As regards programme evaluation, she points out that it is the application of systematic research methods to the assessment of programme design, implementation and effectiveness.
Output: the concrete result of a research project (e.g. final report of a project)
Outcome: the product or process arising from the research result (e.g. new methodology, software tool, process)
Impact: the product, event, condition and/or change that follows from the outcome (e.g. policy initiative, new product/service development)
Effect/effectiveness: broad, general, societal change that indicates the extent to which the impacts of a programme, policy or organisation have promoted the achievement of set goals and/or initiated societal change (e.g. established norms and regulation, contributed to strategy processes of public and private organizations).
In RTD evaluation, there are two basic types of evaluation. The first, summative evaluation, focuses on relationships between inputs and outputs. Here, we can distinguish between impact and effectiveness evaluation and goal achievement evaluation. Impact and effectiveness evaluation differs from goal achievement evaluation in that it takes into account the side impacts or unanticipated impacts that a programme may have, which the latter type of evaluation does not cover. In this light, it is useful to divide impacts into: (1) anticipated and unanticipated, (2) inside and outside the target area (or relevant or irrelevant) and (3) productive and detrimental (or neutral in impact) (e.g. ). Goal achievement evaluation, in turn, focuses on the relevance of objectives or the costs arising from the activity, which the former type does not take into account.
The second type, formative evaluation, focuses on future development, learning, strategic long term planning and structural change, issues grouped under an umbrella concept ‘strategic intelligence’ in the contemporary evaluation literature (e.g. [2, 14, 15, 16, 17]).
In general, the main purpose of recent research programme evaluations has been to justify past research actions (value for money), and consequently the focus has been on summative evaluations. It seems, however, that the perspective of ‘strategic intelligence’ is also growing stronger in European programme evaluation. The METRONOME evaluation method we present in the following sections includes features of both summative and formative evaluation.
Evaluations of research often include both qualitative and quantitative elements. The qualitative aspect tends to constitute a process of peer review by people with expertise within the appropriate area or different kinds of participatory approaches (workshops, interviews, etc.), whilst the quantitative aspect frequently involves the use of indicators. In the latter case, the data can be obtained e.g. by questionnaire survey. Traditionally, involvement of informed peers has been regarded as the most reliable and comprehensive way (and indeed sometimes the only way) to judge scientific quality and societal impact [4, 21]. Quantitative data has been seen as a supportive element to the peer review process .
Basically, carrying out valid evaluations requires complementary information and knowledge, produced by various methods. In addition, the nature of produced knowledge depends on its use, i.e. by whom and how the knowledge will be used. For example, for legitimising purposes, the knowledge can be indicator based and quantitative, but if the focus is on strategic development, the knowledge needs to be qualitative and participatory.
Setting and defining of evaluation objectives
Choice of evaluation methods
Specification of goals of the policy, programme, organisation or similar to be evaluated
Identification of the evaluation target’s impact and effectiveness mechanisms
Identification of contextual issues
Reviewing objectives in relation to observed impacts
Utilisation of evaluation information in setting the goals and future needs
Based on the previous practical and theoretical considerations, there seems to be a need for improved strategic intelligence and indicators to understand the actual dynamics (see e.g. LEG ) and impacts of research programmes and involvement of relevant stakeholders. It is not, however, realistic to try to find one general methodology for programme evaluation, but rather to specify different mixes of approaches depending on overall focus and purpose of the evaluation. The following framework for evaluation of the impacts of transport research projects presents our view on such a framework for the transport domain.
3.1 The framework
The proposed evaluation framework focuses on three themes currently relevant for European transport research: Strengthening industrial competitiveness (IndCo); Contributing to sustainable development (SuD); and Improving community and public policies (CPP). The methodology takes a two-dimensional approach to project impact evaluation. On the one hand, the projects’ achievements are evaluated against the FP Work Programme objectives and targets set for the IndCo, SuD and CPP themes (goal achievement evaluation). On the other hand, it evaluates, through the METRONOME impact model, the impacts of the FP research projects according to four impact groups (impact evaluation). Based on these two approaches, which include a mix of evaluation methods, ‘strategic intelligence’, i.e. recommendations relating to the definition of performance targets for future FPs and new research policy objectives, research instruments and actor networks, can be formed (formative evaluation).
Identification of European transport research and policy objectives for Industrial Competitiveness; Sustainable Development; and Community and Public Policies
Screening and selection of FP themes and projects for the evaluation
Evaluating project achievements and impacts
In the first phase, the thematic European transport research and policy objectives are derived from relevant European policy documents and research work programmes. The second phase includes the following three steps. First, the FP themes and key actions relevant for the transport theme are identified from the FP Work Programme. Second, outputs of projects (final reports) under the selected themes are gathered. Third, projects to go through a detailed evaluation are selected with the help of text mining software and a checklist (for details, see METRONOME Deliverable D2.1). As a result of the project selection, a sample of (e.g. 30) “best matching” projects within each of the evaluation themes (IndCo, SuD and CPP) can be selected for detailed evaluation. The third phase, the actual project evaluation in the METRONOME framework, is based on the following two pillars.
The METRONOME impact model thus proposes four indicator groups (Table 1). Impact indicators on management and coordination reflect the ‘enabling factors’ or ‘tools’ for complementing the impacts measured in the other three groups. Scientific impact indicators reflect the quality and validity of research project results (outcomes) versus the project’s own and FP objectives and targets set on different levels. Customer/End user impact indicators reflect the (short-term) benefit of the research results to their actual end users (e.g. EC, industry, national governments, ministries, research organisations, etc.). Societal impact indicators reflect long-term effects of the research on society (e.g. on the transport system end-users: individuals, logistics companies, industry, etc.).
3.2 The evaluation methods
The following sections present the four complementary evaluation methods developed and tested in the course of the FP7 METRONOME project. The methods are: two different project evaluation matrices (based on project reports); coordinator questionnaire; and lead user interviews. In order to get a comprehensive view of the programme achievements and impacts, a specific mix of evaluation methods was applied for each of the three evaluation themes (IndCo, SuD, and CPP).
3.2.1 Evaluation of achievements of FP objectives and targets by research projects—a matrix approach
- Step 1
Identification of Industrial Competitiveness domains
Based on a detailed review of the scientific literature and European Union policy documents, relevant domains are identified. As an example, such domains can be:
Technologies. Processes and Services
Patents & Standards
Societal & Environmental
- Step 2
Identification of Framework Programme specific objectives and targets related to Industrial Competitiveness
Here, a detailed analysis of the policy objectives and measurable targets of the FP Work Programmes is carried out.
- Step 3
Definition of Indicators based on each Framework Programme target
An indicator is defined as the effort to quantify and simplify phenomena and help understand complex realities. Indicators are aggregates of raw and processed data but they can be further aggregated to form complex indices. The indicators are defined by transforming the targets set by the FP Work Programmes to measurable statistics and indices.
- Step 4
Grouping of indicators based on Framework Programme objectives
The grouping of indicators is carried out at two levels: (1) according to the objectives to which the targets addressed by each indicator are related; and (2) so as to reduce the number of indicators that address the same topic both semantically and logically. In the latter case, two or more indicators within the same group of indicators per objective can be merged into one indicator that measures more than one characteristic.
- Step 5
Relating each indicator to one of the domains
Each indicator is associated with the domains that it addresses, on semantic and logical grounds. The association identifies the exact domains to which each indicator relates and also provides useful qualitative insights into how each indicator relates to these domains.
- Step 6
Definition of the Evaluation Framework and success/failure criteria
The Evaluation Framework is a database that contains general information on the project under evaluation (such as name, acronym, etc.). In addition, there are fields where each indicator is measured. The indicators selected for each project assessed are based on the objectives and the resulting targets, which are transformed into indicators according to steps 3 and 4. The overall question to be answered for each indicator is: “Rate the extent to which the project contributed/addressed the indicator”. The measuring scale for each indicator is presented in Table 2.
Table 1 Indicator groups and examples of indicators
Impact indicators on management and coordination
• Improved or new networks with public/private organizations
• Networks with global/EU/national partners
• Systematic dialogue with policymakers, customer involvement in project planning
• Efficiency of the research: results (outcomes) versus resources used
Scientific impact indicators
• Achievements of research projects: outcomes versus FP objectives/targets set
• Fit between framework and data
• The power to address previously unsolved questions
• Number of publications and/or patents
Customer/End user impact indicators
• Public-policy initiatives, new business initiatives/activities
• Long-term product or service development
• Advantage and stability of the research results
Societal impact indicators
• Implementation of research output by policy field, industry or other societal stakeholders
• Active use of implemented research output by societal groups
• Contribution of priority setting, e.g. future research goals
• Contribution to strategy processes of public and private organizations
• Established norms, standards, regulation
Table 2 The scale for measuring indicators and related definitions
• The project contributed significantly to the indicator/addressed it to a very large extent.
• The project contributed moderately to the indicator/addressed it to a moderate extent.
• The project contributed to/addressed the indicator to a moderate extent, although the project did not aim to do so.
• No (not at all): The project did not contribute to/address the indicator.
• The project did not contribute to/address the indicator, because the project did not aim to do so.
The actual implementation of the proposed evaluation method is carried out by deploying the Evaluation Framework files and filling in the necessary information by measuring the extent to which the project under evaluation addressed each indicator. An indicative sample of an Evaluation Framework database file is presented in Table 3.
Table 3 Sample Evaluation Framework database file
Rate the extent to which the project contributed/addressed the indicator: No (not at all)
- Step 7
Definition of the Justification Matrix for selecting projects
The selection of the projects to be evaluated is determined through a project selection Justification Matrix. This matrix comprises the domains that have already been defined. For a project to be selected, it must address at least one of the Industrial Competitiveness domains.
- Step 8
Selection of projects based on the Justification Matrix and sampling
The projects to be evaluated are identified in two steps. First, the thematic area addressed by the project is identified, together with the project objectives. Second, a two-page indexed project identification document is created for each project. If one or more of the search criteria for a domain are found during this indexed search, the domain is considered relevant to the project and is marked positively in the Justification Matrix. The process is iterative and has to be repeated several times in order to establish relevance to each of the domains assessed.
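As a minimal sketch of Steps 7 and 8, the row of the Justification Matrix for one project could be derived by a keyword search over its indexed identification document. The search terms below are hypothetical placeholders, not the criteria used in METRONOME, and simple substring matching stands in for the text mining software.

```python
# Hypothetical search criteria per domain (illustrative only).
DOMAIN_KEYWORDS = {
    "Patents & Standards": ["patent", "standard"],
    "Societal & Environmental": ["emission", "safety", "societ"],
}


def justification_row(document_text, domain_keywords=DOMAIN_KEYWORDS):
    """Mark each domain True if any of its search terms occurs in the text."""
    text = document_text.lower()
    return {
        domain: any(term in text for term in terms)
        for domain, terms in domain_keywords.items()
    }


def is_selected(row):
    """A project qualifies when at least one domain is marked positively."""
    return any(row.values())


# Made-up two-sentence stand-ins for project identification documents.
bus_row = justification_row("The project developed a new emission standard for buses.")
other_row = justification_row("The project studied the history of medieval poetry.")
```

Here `is_selected(bus_row)` is true because both domains are marked, while `other_row` marks no domain and the project would be left out.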
- Step 9
Testing the applicability of the method on a small number of projects.
In order to ensure the applicability of the proposed evaluation method, a validation step has to be executed at this stage. The method is tested by applying it to a small number of projects. The number of projects undergoing this test is considered sufficient when it reaches 2–5% of the total projects to be assessed, and it should be within a range of 12–25 projects in total, independent of the actual total number of projects [27, 28]. Note that this is only the sample size for testing the method, i.e. for investigating the effectiveness and applicability of the data-mining techniques of the previous eight steps; it is not the actual sample size required for the projects, as mentioned in step 8. If the application of the method is not considered successful (e.g. no results can be measured, no projects can be found, no association between indicators and targets can be justified, etc.), the user is advised to return to Step 3 and re-run the evaluation process, addressing the shortcomings identified.
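The sample-size rule for the test can be read in more than one way. The sketch below shows one possible reading, in which the 2–5% band is computed first and then bounded to the 12–25 range; the function and parameter names are our own assumptions.

```python
def pilot_sample_range(total_projects, low_pct=2, high_pct=5,
                       min_projects=12, max_projects=25):
    """Return (smallest, largest) test sample size under one reading of the rule.

    Integer arithmetic is used so that the percentages are exact.
    """
    if total_projects <= 0:
        raise ValueError("total_projects must be positive")

    def clamp(n):
        return max(min_projects, min(max_projects, n))

    lower = clamp((total_projects * low_pct + 99) // 100)   # 2 %, rounded up
    upper = clamp((total_projects * high_pct) // 100)        # 5 %, rounded down
    upper = max(lower, upper)
    # A sample can never exceed the number of projects available.
    return min(lower, total_projects), min(upper, total_projects)


medium = pilot_sample_range(400)    # 2-5 % gives 8-20, bounded below by 12
large = pilot_sample_range(1000)    # 2-5 % gives 20-50, bounded above by 25
```

Under this reading, a programme of 400 projects would be tested on 12 to 20 projects and one of 1,000 projects on 20 to 25.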
- Step 10
Qualitative analysis of all projects and analysis of the results
This step involves the analysis of the results of the evaluation process described above. Each selected project is rated for each indicator. The ratings of all projects are then analysed collectively in the following manner. Each value of the scale used for rating indicators is assigned a number from one to five. For each indicator, an index is then created for each scale value according to the number of times that value was used in rating the projects assessed, expressed as a share of all ratings for that indicator, so that the indexes for each indicator always sum to one. This procedure is repeated for each indicator per objective. A graphical chart is then created as follows: the x-axis is labelled with the five scale values and the indicators of each objective, and the y-axis with the index achieved for each indicator per scale value. An illustrative example of such a chart is presented in Fig. 4.
The same procedure is used to analyse each indicator separately against each objective or Framework Programme.
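The index calculation in Step 10 can be sketched as follows, assuming that the index of a scale value is its share of all ratings given to one indicator (which is what makes the indexes sum to one). The ratings used are made-up illustrative data.

```python
from collections import Counter

# The five scale values, numbered one to five as in Step 10.
SCALE = (1, 2, 3, 4, 5)


def scale_indices(ratings):
    """Return, for one indicator, the share of projects rated at each scale value."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r not in SCALE for r in ratings):
        raise ValueError("ratings must be scale values from 1 to 5")
    counts = Counter(ratings)
    total = len(ratings)
    return {value: counts.get(value, 0) / total for value in SCALE}


# Made-up ratings of ten projects for a single indicator.
example_ratings = [5, 4, 4, 2, 5, 1, 4, 3, 5, 4]
indices = scale_indices(example_ratings)
```

Plotting `indices` for every indicator of an objective, with the scale values on the x-axis, gives a chart of the kind described above.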
- Step 11
Relation of all projects’ evaluation results to Industrial Competitiveness domains
The project results are related to the defined domains (see step 6) as follows. Each time that a response according to the evaluation scale is recorded, a relation to the respective domains which are assigned to the respective indicator is made. The total number of responses according to the five scale values of the evaluation process indicates the performance for each evaluation domain.
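A minimal sketch of Step 11, assuming each indicator has already been assigned to one or more domains in Step 5: every recorded response is counted once for each domain of its indicator. The indicator identifiers and responses are hypothetical.

```python
from collections import Counter, defaultdict


def domain_performance(responses, indicator_domains):
    """Tally responses per domain and scale value.

    responses: iterable of (indicator, scale_value) pairs, one per project rating.
    indicator_domains: mapping from indicator to the list of domains it addresses.
    """
    tallies = defaultdict(Counter)
    for indicator, scale_value in responses:
        for domain in indicator_domains[indicator]:
            tallies[domain][scale_value] += 1
    return {domain: dict(counter) for domain, counter in tallies.items()}


# Hypothetical indicator-to-domain assignment and made-up responses.
example_domains = {
    "I1": ["Patents & Standards"],
    "I2": ["Patents & Standards", "Societal & Environmental"],
}
example_responses = [("I1", 5), ("I2", 5), ("I2", 3)]
performance = domain_performance(example_responses, example_domains)
```

An indicator linked to several domains contributes to each of them, so the domain totals need not add up to the number of responses.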
- Step 12
Conclusions, recommendations and further use by EC services
The final step of the proposed method consists of the interpretation of the results and drawing of conclusions and recommendations.
3.2.2 A simple matrix approach to evaluate project achievements and impacts
An alternative matrix approach, simpler than the one presented above, was tested in the course of the METRONOME project. The approach includes two complementary evaluation matrices and contributes both to goal achievement and impact assessment. It was tested under the Sustainable Development and Community and Public Policies themes (for details see METRONOME Deliverable D4.1 and METRONOME Deliverable D5.1).
The first evaluation matrix supports a qualitative evaluation of the extent to which research projects financed under the FP have contributed to the evaluation theme, e.g. SuD. Based on a review of the FP research and commissioning structures, at least three levels of objectives can be identified as relevant to many of the transport research projects commissioned under the past two Framework Programmes. These are: (1) FP Work Programme-level (WP) objectives; (2) Work Programme sub-level (thematic) objectives (under which the project was commissioned); and (3) project-level objectives. The evaluation matrix enables evaluators to specify whether each of the above objectives has been met fully, partially, indirectly or not at all. The same approach is applied to evaluate the potential impacts of research projects in the four impact groups, using the related indicators (see Table 1). In addition, each completed evaluation needs to be accompanied by a textual summary. The summary supplements the evaluation matrix by detailing other relevant and specific information about projects and/or their outcomes. The research objectives at the different levels and the impact indicators form the basis of the evaluation matrices. The evaluation matrices are completed on the basis of the published Final Reports of projects. A skeleton template for the approach adopted is shown in Appendix 1.
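The first evaluation matrix could be represented along the following lines. This is only an illustrative data structure under our own naming assumptions, not the template of Appendix 1: it records one rating per objective at each of the three levels and checks that a textual summary accompanies the ratings.

```python
from dataclasses import dataclass, field

# The three objective levels and the four possible ratings named in the text.
LEVELS = ("FP Work Programme", "Work Programme sub-level (thematic)", "Project")
RATINGS = ("fully", "partially", "indirectly", "not at all")


@dataclass
class ObjectiveMatrix:
    """Hypothetical container for one project's goal achievement evaluation."""
    project: str
    summary: str = ""
    ratings: dict = field(default_factory=dict)

    def rate(self, level, objective, rating):
        """Record how far one objective at the given level has been met."""
        if level not in LEVELS:
            raise ValueError(f"unknown objective level: {level}")
        if rating not in RATINGS:
            raise ValueError(f"unknown rating: {rating}")
        self.ratings[(level, objective)] = rating

    def is_complete(self):
        """True when every level has a rated objective and a summary exists."""
        rated_levels = {level for level, _ in self.ratings}
        return rated_levels == set(LEVELS) and bool(self.summary.strip())


# Made-up example project and objectives.
matrix = ObjectiveMatrix(project="EXAMPLE")
matrix.rate("FP Work Programme", "objective A", "partially")
matrix.rate("Work Programme sub-level (thematic)", "objective B", "fully")
complete_before = matrix.is_complete()
matrix.rate("Project", "objective C", "indirectly")
matrix.summary = "Short textual summary of the evaluation."
complete_after = matrix.is_complete()
```

The same structure could hold the impact ratings by replacing the objective levels with the four impact groups of Table 1.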
The second part of the evaluation matrix concerns the success of project result dissemination (Appendix 2). FP projects typically result in the publication of a wide range of deliverables and outputs, both formal and informal. The dissemination quality matrix enables evaluators to specify the characteristics of specific dissemination activities undertaken during and after the project lifetime. Consequently, it assesses the potential effects of project results and indicates whether estimated impacts upon the objectives are likely to have been achieved in practice. The matrix indicators (list of activities) are selected on the basis that they are comprehensive whilst also feasible to be answered based upon written documents in the public domain. Project dissemination reports, project final reports and websites provide evidence of the scope and nature of dissemination activities conducted. In addition, each completed evaluation should be accompanied by a textual summary of the dissemination information assessment, detailing other relevant and specific dissemination information about projects.
3.2.3 Assessment of potential project impacts
Relationships between indicators and questions in the METRONOME co-ordinator questionnaire
Indicator: Question in the co-ordinator questionnaire
Level of definition of research goals: The research goals required specific elaboration at the start of the project
Level of theoretical difficulties in the definition of the methodology: There were theoretical difficulties in defining the research methodology
Level of achievement of research objectives: The research objectives were all met
Fitness of project resources for the project needs/expenditures: The research budget and human resources available were insufficient
Level of publication of results: The project results have been adequately published in scientific journals and/or books
Transferability into policy initiatives: The project results have been transferred into policy initiatives, recommendations and/or regulations
Fitness between end-user needs and results: Needs and views of end-users were taken into consideration
Involvement level of civil servants involved: Civil servants and/or policy makers were involved in the project
Involvement level of transport operators involved: Transport operators or service sector were involved in the project
Involvement level of transport industry involved: Transport industry sector was involved in the project
Encouragement of potential for future research: The project raised new unsolved research questions
Quality of the dissemination of results: The project results have been adequately disseminated to end-users
Quality of dissemination through the website: The project webpage was user-friendly and updated regularly
Level of encouragement received by society from the project: The project encouraged the participation of society in research (development of awareness campaigns, public inquiries, etc.)
Extent to which the project produced a helpful networking: The project (consortium) has improved networking between researchers and public/private organisations
Level of stability of networking: The consortium members have developed a stable research network
Adequacy of the frequency of project meetings: The project included too many consortium meetings and workshops
Adequacy of output of the project in terms of the extension of reports: Additional effort should be made to reduce the extension of project deliverables
Adequacy of the financial instrument: The financial instrument was adequate for the project
During the METRONOME project, it was discovered that to complement the information from the questionnaires, it is advisable to carry out detailed co-ordinator interviews. Interviews can provide additional information about the projects, dissemination and use of results in order to draw conclusions on the impacts of the evaluated projects.
3.2.4 Lead-user views on project achievements and impacts
The METRONOME consortium considered the following fourth approach the most important one for collecting information on programme or project impacts. This approach includes a workshop with potential lead-users, and interviews conducted among potential and target users of FP projects. The approach was tested in the context of the Contribution to Community and Public Policies theme (for details see METRONOME Deliverable D5.1).
Lead-users can be defined as persons (civil servants, consultants, scientists, policy makers, etc.) who actually use the knowledge gained from EU research projects.
The workshop organized as part of the METRONOME project focused primarily on gathering information on specific evaluation indicators that would be relevant to lead-users. Based on this information, a specific questionnaire was produced and used to collect information from potential lead-users. The selection of potential lead-users was based on respondents’ characteristics and not on their potential interest in specific projects in the METRONOME sample sets. This meant that the lead-users’ views did not reflect their opinion on the sample projects, but on a wider sample of FP projects.
The data collection, using the questionnaire, took place in two ‘waves’. In the first ‘wave’ the METRONOME partners approached self-selected lead-users. By using the developed uniform questionnaire format, consistency between the results of interviews by different partners was maintained. The partners were free to use either telephone or face-to-face interviews or distribute the questionnaire by email to pre-selected respondents. The questions in the questionnaire related to the perceived impact of FP research in general, the results of specific projects in which the respondents had been involved, the benefits for the respondent and his/her organisation and what did and did not work in FP projects. After analyses of the first response wave, it was decided to enlarge the number of responses by adding a second ‘wave’. This second wave included mainly the people registered as potentially interested participants for a planned 2nd METRONOME workshop. In addition, the questionnaire format was slightly changed to better accommodate the use of email.
Of the total number of questionnaires that became available for analysis, around 20% were from respondents who had not been involved in FP projects in any way and could therefore not answer any questions on specific project results. For those involved in projects, the involvement ranged from being a project partner to participating in project events (workshops, etc.).
In order to test the feasibility of the developed framework and the evaluation methods within it, a case study of 100 FP5 and FP6 transport projects was carried out in the course of the METRONOME project. The projects represented the themes of Industrial Competitiveness (50 projects), Sustainable Development (30 projects) and Contribution to Community and Public Policies (20 projects). A specific combination of the evaluation methods presented above was applied to each of the themes. The case study projects were financed under either the FP5 thematic priorities Sustainable Mobility and Intermodality and Land Transport and Marine Technologies or the FP6 priorities Sustainable Surface Transport and Research for Policy Support. Altogether 700 transport projects were financed under those priorities during the years 1999–2006.
[Table: FPs and number of thematic objectives/targets set for different levels. Levels listed: Specific Work Programme (WP) or Thematic Area objectives; Key Action (KA) of the Work Programme or Programme Subdivision (PS) objectives (or targets); Strategic project objectives.]
The strategic project objectives were the set of objectives that was best met. This is hardly surprising, as these are the objectives that have the most direct relevance to the project. A surprising finding, based on the matrix evaluations, was that in the fields of Sustainable Development and Community and Public Policies, both FP5 and FP6 projects were considered to have contributed more to higher-level WP objectives than to the lower-level KA or PS objectives, which could be considered more directly applicable to the projects commissioned. One explanation could be that the higher-level objectives are more general and thus easier to meet than the lower-level objectives, which are more specific. Also, when a project meets its specific objectives satisfactorily, but not the European policy objectives, it might be because the project is focused on one single goal only and has thus been evaluated low for the wider objectives. In the field of SuD only 20%, and in the field of CPP 50%, of the projects reviewed met their strategic project objectives and the relevant KA (or equivalent) objectives that they were commissioned under, as well as one or more of the relevant WP objectives. This suggests that there could be considerable discrepancies between the different levels of objectives set in the SuD and CPP fields.
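The criterion behind the shares reported above (strategic project objectives met, relevant KA or equivalent objectives met, and at least one relevant WP objective met) can be illustrated with a minimal sketch. The record structure, field names and sample records below are hypothetical and invented purely for illustration; they are not METRONOME data, and the sketch is not the project's own tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectAssessment:
    """Matrix-evaluation outcome for one project (hypothetical structure)."""
    name: str
    theme: str              # e.g. "SuD" or "CPP"
    strategic_met: bool     # strategic project objectives met
    ka_met: bool            # relevant KA (or equivalent) objectives met
    wp_objectives_met: int  # number of relevant WP objectives met


def meets_all_levels(p: ProjectAssessment) -> bool:
    """True if the project meets its strategic and KA objectives
    and at least one relevant WP objective."""
    return p.strategic_met and p.ka_met and p.wp_objectives_met >= 1


def share_meeting_all_levels(projects, theme: str) -> float:
    """Fraction of the projects in `theme` that meet all objective levels."""
    in_theme = [p for p in projects if p.theme == theme]
    if not in_theme:
        raise ValueError(f"no projects found for theme {theme!r}")
    return sum(meets_all_levels(p) for p in in_theme) / len(in_theme)


# Invented illustrative records (assumption), not results from the case study.
sample = [
    ProjectAssessment("A", "SuD", True, True, 1),
    ProjectAssessment("B", "SuD", True, False, 2),
    ProjectAssessment("C", "SuD", True, True, 0),
    ProjectAssessment("D", "SuD", True, False, 0),
    ProjectAssessment("E", "SuD", False, True, 1),
    ProjectAssessment("F", "CPP", True, True, 2),
    ProjectAssessment("G", "CPP", True, False, 1),
]

for theme in ("SuD", "CPP"):
    print(theme, share_meeting_all_levels(sample, theme))
```

In this toy sample every SuD project but one meets its strategic objectives, yet only one of five passes the combined criterion, which mirrors how good project-level results can coexist with a low share at the combined level.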
Based on the co-ordinator surveys, the level of funding was considered sufficient in FP5 projects, but not in FP6 projects. For example, in the field of CPP less than 30% of the respondents considered the research budget adequate. In addition, the availability of (input) data was considered much better in FP5 than in FP6. The lead-user interviews did not corroborate the above results. Instead, based on the interviews, the cost-effectiveness of the projects in terms of money or resources spent was considered better in FP6 than in FP5.
In general, and based on all approaches, project management in both FPs was carried out satisfactorily. The level of expertise among project participants was considered high in both FPs by both the co-ordinators and the lead-users. Dissemination of project results, however, seemed to be a contradictory issue. On the one hand, the majority of project co-ordinators agreed that the project results in both FPs were adequately disseminated to the end-users. The lead-users agreed with the good dissemination level regarding FP6 projects. On the other hand, in more general terms, e.g. for FP evaluation purposes, neither the level nor the quality of project result dissemination was adequate. At the time of the METRONOME evaluation, the project results were not easily available from a centralised web address.
Within all three evaluation themes, the vast majority of strategic project objectives of the project sample were considered to be fully met. This indicates that at the individual project level, in terms of both substance and practicalities, the projects worked well. However, as argued in the previous section, this does not guarantee a positive contribution to higher-level objectives, since there seem to be discrepancies between the different levels of objectives and targets set for the FPs.
In the field of Industrial Competitiveness, major achievements were found in the fields of development of advanced technologies, processes and services, and in contribution to societal, environmental (e.g. safety, traffic congestion) and financial issues. These same fields were emphasised in both FPs. The main contributions of Sustainable Development projects in both FPs were identified as developing, integrating and managing a more efficient, safer, more secure and environmentally friendly transport system to provide user-friendly door-to-door services. Contribution to the development of decision-making tools was the main achievement of Community and Public Policies related projects in both FPs.
Based on both the lead-user survey and the co-ordinator survey, scientific publications and a high level of scientific expertise in general were considered the main immediate impacts of FP5. Improved networking between researchers and public/private organisations and strengthened networks between international parties were seen as the major immediate impacts of FP6. In addition, patents and standards produced in IndCo projects (especially in FP6) represent immediate impacts, even though the transport industry and service sector seem not to have been greatly involved.
As regards the intermediate impacts, the major successes of the activities of both FPs can be considered to be strengthened expertise, development of decision-making tools, and co-operation with end-users in the projects. Contributions (often indirect) to new transport policy development, and to new product or service development, were also considered slightly positive. The networking or co-operation evoked seemed to be strongest among project research partners, but could also be identified among stakeholders in both the public and private sectors. Failure to convert project results into standards, norms or regulations and the fact that the projects did not raise new unsolved research questions were considered to be the weaknesses of the FP5 and FP6 transport projects. This indicates that even though tools for decision-making have been developed and some contributions to e.g. transport and SuD policies and strategies have been made, the practical, regulatory outcomes have either been modest or are not known. In addition, the identified discrepancy between the FP objectives of different levels and the low level of achievement of European-level objectives by the projects in the matrix evaluations support this finding.
Evaluating the ultimate impacts, which might be realised 10 or more years down the road, is a difficult task within all of the METRONOME evaluation themes, since most of the FP5 and FP6 projects are more recent. In addition, investigating the impact pathways and mechanisms (e.g. follow-up research projects and their impacts and consequences) was considered too time-consuming and costly for our case study resources, but should certainly be considered an essential part of future evaluations. However, improved transport safety and awareness of environmental impacts from transport, and the consequent utilisation of developed environmental impact assessment methods or even implementation of identified transport measures, are examples of such ultimate impacts.
Testing the METRONOME methodology illustrated that different mixes of evaluation methods (both qualitative and quantitative) are needed for evaluation of projects under the themes of IndCo, SuD and CPP. The main findings regarding the suitability of tested evaluation methods are presented below.
First, the project evaluation matrix provided an indication of the results of each research project evaluated, as well as a holistic summary of the research project findings, their contribution to objectives, and an estimation of impact areas and types. The dissemination quality matrix supported the Final Report analysis by providing a more detailed indication of the potential impacts of research projects. The matrix approaches were easy to apply, but time-consuming. However, in order to gain a thorough understanding of the projects, their background and achievements, allocating enough time to their evaluation is necessary.
Second, the co-ordinator survey provided the co-ordinators’ self-evaluation of the potential and actual impacts of the projects. The results were useful as supplements to other evaluation methods, but taken by themselves they carry the risk of bias that is always present in self-evaluations. The lead-user interviews were found to be the most valuable source of information regarding the actual use of research results. These kinds of interviews should be promoted in the future, combined with in-depth, long-term project impact evaluations, in co-operation with technology platforms and EC officers.
The third motivation for using various methods in thematic evaluations stems from the different time perspectives of the expected impacts. The most typical immediate impacts of all projects were publications and networking. If we exclude those, IndCo projects were more likely to produce immediate results (e.g. patents and prototypes) than SuD and CPP projects, which focused more on intermediate impacts, like strengthened expertise, public discourse and support for decision making and strategy development (see also Fig. 1). The ultimate impacts in all themes were very difficult to evaluate because of the long-term perspective (10 years or more). In addition, the different target areas of thematic impacts would require different approaches.
The main difficulty encountered during the METRONOME methodological development was the availability of project result (e.g. Final Report) data. A structured, up-to-date FP project result database that is ready and available for the evaluators would enable more reliable, less time-consuming and less costly FP impact evaluations. Other major difficulties identified during the methodological development were the relatively low response rate in the co-ordinator survey and the interpretation of the multi-level objective and target structures of the FPs as the basis for evaluation. In order to avoid missing or misinterpreted objectives and targets, input of strategic research objectives and targets from official EC data sources should be ready and available to the evaluators. As regards the surveys, having EC bodies send the questionnaires officially could improve the response rate; the responses could even be required as a part of project proceedings.
In our view, the METRONOME evaluation represents only a first phase in an FP impact evaluation process. As it often takes a long time for project impacts to materialise, only a repeated (and simultaneously elaborated) evaluation process can provide a more detailed analysis of project or programme impacts. Further, and as a complement to the former, more emphasis and resources are needed for integrating future-oriented elements (formative evaluation) into evaluation methodologies, so that they can better support strategic research and policy planning (including WP objective setting) in the changing European transport and research environment.
6 Conclusions and future research needs
Based on the testing of the METRONOME framework with a sample of 100 FP5 and FP6 projects, we can conclude the following. The evaluation methodology proved to be useful in producing information for the definition of performance targets for future FPs and for new research policy objectives from the perspectives of: (1) achieving FP objectives and targets, (2) the FPs’ implementation and operational environment, and (3) research project outcomes and impacts. These areas represent the traditional evaluation perspectives. Further, we can claim that the framework also provided information from new evaluation perspectives, such as using complementary evaluation methods, focusing on a wider perspective than just (managerial) implementation issues (e.g. WP structure analysis) and seeking alignment and mutual learning with research and policy development. Consequently, we may argue that the developed framework can be seen as a first step towards formative evaluation, the requested “strategic intelligence”, in transport research evaluations.
Testing our methodology showed that achievement of objectives in both FPs was good throughout and in some cases even very good. Potential impacts in all four impact groups (management and co-ordination, scientific, end-user, and societal) were positive. The impacts of projects in both FPs were strongest within the group of management and co-ordination. Also scientific and end-user impacts were adequate, but wider societal impacts quite modest.
To conclude, it seems that FP5 and FP6 have certainly played a significant role in the European science and technology agenda. For evaluation of the role of the FPs on the global map or their contribution to EU research competitiveness at international level, the project sample does not give a representative insight. Experience from the METRONOME evaluation methodology development and testing revealed the following future research needs in relation to FP impact evaluation and transport research in general, in the fields of IndCo, SuD and CPP.
If one looks at the potential impacts of FPs on shaping the European Research Area (ERA), the most critical issues are the availability and dissemination of FP project result data. This concerns both the lower, individual project level and the centralised EC level. Currently the project results are not easily available for the use of individual projects/persons or for FP evaluation purposes. Consequently, the FP output quality needs improvement both at the project level (e.g. longer supported maintenance of web sites) and at the community level (a centralised FP project output database). Managerial incentives from the Commission, such as rewards and bonuses for successful projects and excellent R&D achievements, could help to increase project quality in terms of project results and dissemination activities alike.
Another important issue is the lack of consistency identified between the different levels of objectives set for FP Work Programmes. Only a few of the evaluated projects met all of their own strategic objectives, WP objectives on two levels and relevant European policy objectives. In order to clarify future FP evaluation in terms of objective achievement, the consistency of the WP objective structure should be increased. In addition, the supporting role of current and future FP evaluation methodologies in WP objective/target setting should be analysed carefully and the methodologies developed further.
Other aspects identified as relevant for formative future FP evaluations were the following. First, close co-operation with technology platforms and EC officials in project evaluations might result in a more comprehensive and detailed view of project achievements and enhance the uptake of evaluation results. Second, and related to the former, investigating the follow-up research project paths that certain (groups of) projects have evoked might lead to a detailed understanding of the intermediate or even ultimate impacts of FP projects on a certain field. Third, including in the evaluation transport projects commissioned under programmes other than transport (e.g. Information Society, Environment and Security) could provide a more comprehensive view of the impacts of FP transport research. Finally, finding the right time for FP evaluation is always difficult. In our case, for example, FP5 and FP6 are not on an equal footing in the evaluation because of the temporal aspects. The later implementation of FP6 might have evoked, depending on the circumstances, more intense (positive or negative) responses in the surveys than the more distant FP5.
- 1. Albrecht V, Vaněček J (2008) Assessment of participation of the Czech Republic in the EU framework programmes. Technology Centre of the Academy of Sciences of the Czech Republic
- 3. Arnold E (2009) How meta-evaluation helps us understand the effects of the framework programmes. Presentation at the EUFORDIA Conference, Prague, 25 February 2009
- 4. Arnold E, Guy K (1997) Technology diffusion programmes and the challenge for evaluation. In: OECD conference on policy evaluation in innovation and technology, chapter 6. Paris
- 6. Durieux L, Fayl G (1997) The scheme used for evaluating the European research and technological development programmes. In: OECD (1997a). Policy evaluation in innovation and technology: towards best practices
- 7. Expert Group (2009) Evaluation of the sixth framework programmes for research and technological development 2002–2006. Report of the Expert Group. Chairman: E. Th. Rietschel, Rapporteur: E. Arnold
- 8. Fahrenkrog G, Polt W, Rojo J, Tubke A, Zinocker K (eds) (2002) RTD evaluation toolbox: assessing the socio-economic impact of RTD policies
- 10. Georghiou L, Polt W (2004) EU evaluation practice. INTEREG Research Report Series. Institute of Technology and Regional Policy
- 11. IEG (2007) Sourcebook for evaluating global and regional partnership programs. Indicative principles and standards
- 12. Kalenoja H, Mäntynen J, Pöllänen M (2004) Evaluation of JALOIN programme and suggested measures for promoting pedestrian and bicycle traffic in Finland. Publications of the Ministry of Transport and Communications 40/2004, 98 p
- 13. Kuitunen S, Hyytinen K (2004) Julkisten tutkimuslaitosten vaikutusten arviointi. Käytäntöjä, kokemuksia ja haasteita. VTT Tiedotteita 2230. Espoo: VTT. [Impact evaluation of public research organisations. Practices, experiences and challenges.] (Only in Finnish.) http://virtual.vtt.fi/inf/pdf/tiedotteet/2004/T2230.pdf
- 14. Kuhlmann S (2009) Lessons from and for research policy impact evaluation. Presentation at the EUFORDIA Conference, Prague, 24–25 February 2009
- 15. Kuhlmann S (2003) Evaluation as a source of “Strategic Intelligence”. In: Shapira Ph, Kuhlmann S (eds) Learning from science and technology policy evaluation: experiences from the United States and Europe. E. Elgar, Cheltenham, pp 352–379
- 16. LEG Expert Group for the follow-up of the research aspects of the revised Lisbon strategy (2008) The governance challenge for knowledge policies in the Lisbon Strategy: between revolution and illusion. Final Synthesis Report, 18 June 2008
- 17. LEG Expert Group for the follow up of the research aspects of the Lisbon strategy (2009) The open method of coordination in research policy: assessment and recommendations. European Commission Directorate-General for Research, Directorate C-European research area: knowledge based economy, Unit C.3 - Economic analysis and monitoring of national research policies and the Lisbon strategy. January 2009
- 18. Leonard JD, Washington S, Williams B, Manning DG, Roberts C, Baccus AR, Devanhalli A, Ogle J, Melcher D (2003) Scientific approaches to transportation research. NCHRP 20–45 Report. Georgia Institute of Technology
- 19. Lähteenmäki-Smith K, Hyytinen K, Kutinlahti P, Konttinen J (2006) Research with an impact. Evaluation practises in public research organisations. VTT Research Notes 2336. Espoo 2006
- 20. Merkx F, van der Weijden I, Oostveen A-M, Spaapen J (2007) Evaluation of research in context. A quick scan of an emerging field (www.eric-project.nl)
- 22. Nagarajan N, Vanheukelen M (1997) Evaluating EU expenditure programmes: a guide to intermediate and ex post evaluation. XIX/02 - Budgetary overview and evaluation. DG XIX, European Commission
- 23. OECD (1999) Improving evaluation practices. Best practice guidelines for evaluation background paper. PUMA/PAC (99)1. 45 p
- 24. OECD (1997) Policy evaluation in innovation and technology: towards best practices
- 25. Pihlajamaa J, Berg P (2008) Understanding the client at Finnra: ASTAR research programme final assessment report. Helsinki 2008. Finnish Road Administration, Central Administration, Finnra reports 1/2008. 32 p. + app. 11 pp 998–6
- 26. Scriven M (1991) Evaluation thesaurus. Sage Publications, Newbury Park
- 29. Tassey G (2003) Methods for assessing the economic impacts of government R&D. Planning report 03–1. National institute of Standards & Technology, NIST
METRONOME Project deliverables
- 30. López-Lambas M, López-Suárez E, La Paix Puello L, Binsted A, Tuominen A, Järvi T (2009) METRONOME D4.1 Sustainable development methodology development and application results. European Commission, FP7
- 31. Mitsakis E, Tyrinopoulos Y, Binsted A (2009) METRONOME D3.1 Industrial competitiveness methodology development and application results. European Commission, FP7
- 32. Sitov A, van der Waard J, Flikkema H, Binsted A, Ferris C, Tuominen A (2009) METRONOME D5.1 Methodology for evaluation of FP5 and FP6 project impacts on community and public policies. European Commission, FP7
- 33. Tuominen A, Järvi T, Hyytinen K, Pesonen A, van der Waard J, Mitsakis E, Sito A, Binsted A, López-Lambas M, López-Suárez E, La Paix-Puello L (2009) METRONOME final deliverable. METRONOME methodology for evaluation of research project impacts in the field of transport. European Commission, FP7
This article is published under license to BioMed Central Ltd. Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.