A Primer on Public Sector Evaluations

Chapter in Policy, Program and Project Evaluation

Abstract

This chapter provides a brief survey of program evaluation methods, their objectives, strengths, and weaknesses. It first presents methods suited to guiding the efficient allocation and utilization of resources: cost-benefit analysis, cost-effectiveness analysis, social marginal cost of funds analysis, and data envelopment analysis. This is followed by a discussion of multiple-criteria evaluation (MCE), which is far more holistic: beyond efficiency, it addresses the relevance of a program, its effectiveness in achieving its objectives, and the sustainability of its benefits. Newer MCE approaches such as the Iron Triangle and Results-Oriented Management Evaluation are briefly sketched. Finally, theory-based evaluation is outlined, where the focus is not just on whether a program succeeds or fails but also on how and why it does so.


References

  • Andrews, Matthew, and Anwar Shah. 2001. From Washington to ROME: A Road Less Traveled by Public Sector Reformers. Washington, DC: Operations Evaluation Department, World Bank.

  • Andrews, Matthew, and Anwar Shah. 2005. Citizen-Centered Governance: A New Approach to Public Sector Reform. In Public Expenditure Analysis, ed. Anwar Shah, Chapter 6, 153–182. Washington, DC: World Bank.

  • Angrist, Joshua, Guido Imbens, and Donald Rubin. 1996. Identification of Causal Effects Using Instrumental Variables. Journal of the American Statistical Association 91 (434): 444–455.

  • Banker, R.D., and R.C. Morey. 1986a. Efficiency Analysis for Exogenously Fixed Inputs and Outputs. Operations Research 34 (4): 513–521.

  • ———. 1986b. The Use of Categorical Variables in Data Envelopment Analysis. Management Science 32 (12): 1613–1627.

  • Barnow, B., G. Cain, and A. Goldberger. 1980. Issues in the Analysis of Selectivity Bias. In Evaluation Studies Review Annual, ed. E. Stromsdorfer and G. Farkas, vol. 5. San Francisco: Sage.

  • Boruch, Robert F. 1997. Randomized Experiments for Planning and Evaluation: A Practical Guide. Thousand Oaks: Sage.

  • Briggs, A., and P. Fenn. 1998. Confidence Intervals or Surfaces? Uncertainty on the Cost-Effectiveness Plane. Health Economics 7: 723–740.

  • Charnes, A., W.W. Cooper, and Edwardo L. Rhodes. 1978. Measuring the Efficiency of Decision Making Units. European Journal of Operational Research 2 (6): 429–444.

  • ———. 1979. Short Communication: Measuring the Efficiency of Decision Making Units. European Journal of Operational Research 3 (4): 339.

  • ———. 1981. Evaluating Program and Managerial Efficiency: An Application of Data Envelopment Analysis to Program Follow Through. Management Science 27 (6): 668–697.

  • Charnes, A., W.W. Cooper, A.Y. Lewin, and L.M. Seiford. 1994. Data Envelopment Analysis: Theory, Methodology, and Application. Boston: Kluwer.

  • Chen, H., and P.H. Rossi. 1980. The Multi-Goal, Theory-Driven Approach to Evaluation: A Model Linking Basic and Applied Social Science. Social Forces 59 (1): 106–122.

  • ———. 1989. Issues in the Theory-Driven Perspective. Evaluation and Program Planning 12 (4): 299–306.

  • Corbeil, R. 1986. Logic on Logic Models. In Evaluation Newsletter. Ottawa: Office of the Comptroller General of Canada.

  • Davidson, E. Jane. 2006. The “Baggaging” of Theory-Based Evaluation. Journal of Multidisciplinary Evaluation 4: iii–xiii.

  • Dehejia, Rajeev. 2015. Experimental and Non-experimental Methods in Development Economics: A Porous Dialectic. Journal of Globalization and Development 6 (1): 47–69.

  • Dehejia, R., and S. Wahba. 2002. Propensity Score Matching Methods for Non-experimental Causal Studies. Review of Economics and Statistics 84 (1): 441–462.

  • Department of Finance. 1987. The Choice of Discount Rate for Evaluating Public Sector Investment Projects: A Discussion Paper. Department of Finance, Australia.

  • Donaldson, Stewart, and Laura Gooler. 2003. Theory-Driven Evaluation in Action: Lessons from a $20 Million Statewide Work and Health Initiative. Evaluation and Program Planning 26: 355–366.

  • Duflo, Esther, Rachel Glennerster, and Michael Kremer. 2007. Chapter 61 Using Randomization in Development Economics Research: A Toolkit. In Handbook of Development Economics, vol. 4, 3895–3962. https://doi.org/10.1016/S1573-4471(07)04061-2.

  • Flay, B.R., T.Q. Miller, D. Hedeker, O. Siddiqui, C.F. Britton, B.R. Brannon, A. Johnson, W.B. Hansen, S. Sussman, and C. Dent. 1995. The Television, School and Family Smoking Prevention and Cessation Project: VIII. Student Outcomes and Mediating Variables. Preventive Medicine 24 (1): 29–40.

  • Heckman, J., N. Hohmann, J. Smith, and M. Khoo. 2000. Substitution and Dropout Bias in Social Experiments: A Study of an Influential Social Experiment. Quarterly Journal of Economics 115 (2): 651–694.

  • Husser, Philippe. 2019. Do Not Stick to the Iron Triangle in Project Management. https://www.philippehusser.com/do-not-stick-to-the-iron-triangle-in-project-management-2/

  • Imbens, Guido W., and Joshua D. Angrist. 1994. Identification and Estimation of Local Average Treatment Effects. Econometrica 62 (2): 467–475.

  • Kittelsen, S.A.C., and F.R. Førsund. 1992. Efficiency Analysis of Norwegian District Courts. Journal of Productivity Analysis 3 (3): 277–306.

  • Kusek, Jody Zall, and Ray C. Rist. 2004. Ten Steps to a Results-Based Monitoring and Evaluation System. Washington, DC: The World Bank.

  • Levin, Henry M. 1983. Cost Effectiveness: A Primer. Beverly Hills: Sage.

  • Lipsey, M.W., and A.J. Pollard. 1989. Driving Toward Theory in Program Evaluation: More Models to Choose From. Evaluation and Program Planning 12 (4): 317–328.

  • Luellen, Jason K., William R. Shadish, and M.H. Clark. 2005. Propensity Scores: An Introduction and Experimental Test. Evaluation Review 29 (6): 530–558.

  • McClendon, McKee J. 1994. Multiple Regression and Causal Analysis. Itsaca: F. E. Peacock.

  • O’Brien, B., and A. Briggs. 2002. Analysis of Uncertainty in Health Care Cost-Effectiveness Studies: An Introduction to Statistical Issues and Methods. Statistical Methods in Medical Research 11: 455–468.

  • Owens, D.K. 1998. Interpretation of Cost-Effectiveness Analysis. Journal of General Internal Medicine 13: 716–717.

  • Pawson, R., and N. Tilley. 1997. Realistic Evaluation. London: Sage.

  • Pearce, D., G. Atkinson, and S. Mourato. 2006. Cost-Benefit Analysis and the Environment: Recent Developments. Paris: OECD.

  • Pollack, Julien, Jane Helm, and Daniel Adler. 2018. What Is the Iron Triangle, and How Has It Changed? International Journal of Managing Projects in Business 11 (2): 527–547.

  • Quade, Edward S. 1967. Introduction and Overview. In Cost-Effectiveness Analysis, ed. Thomas A. Goldman. New York: Frederick A. Praeger.

  • Rosenbaum, Paul R., and Donald B. Rubin. 1983. The Central Role of the Propensity Score in Observational Studies for Causal Effects. Biometrika 70: 41–55.

  • Rossi, Peter Henry, and Howard E. Freeman. 1993. Evaluation: A Systematic Approach. Newbury Park: Sage.

  • Rush, B., and A. Ogborne. 1991. Program Logic Models: Expanding Their Role and Structure for Program Planning and Evaluation. Canadian Journal of Program Evaluation 6 (2): 95–106.

  • Scriven, Michael. 1967. The Methodology of Evaluation. In Perspectives of Curriculum Evaluation, AERA Monograph Series on Curriculum Evaluation, ed. Ralph W. Tyler, Robert M. Gagne, and Michael Scriven, vol. 1. Chicago: Rand McNally.

  • ———. 1991. Evaluation Thesaurus. Newbury Park: Sage.

  • Shadish, William R., Thomas D. Cook, and Donald T. Campbell. 2001. Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Boston: Houghton Mifflin.

  • Shah, Anwar. 2005. On Getting the Giant to Kneel: Approaches to a Change in the Bureaucratic Culture. Chapter 9. In Fiscal Management, ed. Anwar Shah, 211–228. Washington, DC: World Bank.

  • Squire, Lyn, and Herman G. van der Tak. 1975. Economic Analysis of Projects. Baltimore: Johns Hopkins University Press.

  • Suchman, E.A. 1967. Evaluative Research: Principles and Practice in Public Service and Social Action Programs. New York: Russell Sage Foundation.

  • Todd, Petra. 2007. Chapter 60 Evaluating Social Programs with Endogenous Program Placement and Selection of the Treated. In Handbook of Development Economics, vol. 4, 3847–3894. https://doi.org/10.1016/S1573-4471(07)04060-0.

  • Tulkens, H. 1993. On FDH Efficiency Analysis: Some Methodological Issues and Application to Retail Banking, Courts and Urban Transit. Journal of Productivity Analysis 4 (1–2): 183–210.

  • Van Hout, B.A., M.J. Al, G.S. Gordon, and F.F. Rutten. 1994. Costs, Effects and C/E-Ratios Alongside a Clinical Trial. Health Economics 3: 309–319.

  • Weiss, Carol H. 1987. Where Politics and Evaluation Research Meet. In The Politics of Program Evaluation, ed. D. Palumbo. Newbury Park: Sage.

  • ———. 1995. Nothing as Practical as Good Theory: Exploring Theory-Based Evaluation for Comprehensive Community Initiatives for Children and Families. In New Approaches to Evaluating Community Initiatives: Volume 1, Concepts, Methods and Contexts, ed. J.P. Connell, A.C. Kubisch, L.B. Schorr, and C.H. Weiss. Washington, DC: The Aspen Institute.

  • ———. 1997a. How Can Theory-Based Evaluation Make Greater Headway? Evaluation Review 21 (4): 501–524.

  • ———. 1997b. Theory-Based Evaluation: Past, Present, and Future. In Progress and Future Directions in Evaluation, New Directions for Evaluation, ed. D.J. Rog, vol. 76. San Francisco: Jossey-Bass.

  • ———. 1998. Evaluation. Upper Saddle River: Prentice Hall.

  • Wholey, J.S. 1994. Assessing the Feasibility and Likely Usefulness of Evaluation. In Handbook of Practical Program Evaluation, ed. J.S. Wholey, H.P. Hatry, and K.E. Newcomer. San Francisco: Jossey-Bass.

  • World Bank. 2002. Guidelines and Criteria for OED Project Evaluations. Operations Evaluation Department, Unpublished Note, July 1, 2000, The World Bank Group.

  • ———. 2020. Guidance. The Independent Evaluation Group, Unpublished Note, January.


Annex: An Example of a Multi-Criteria Evaluation Approach—The Practice by the World Bank Operations Evaluation Department/the Independent Evaluation Group

The World Bank has been a premier user of MCE in evaluating its programs and projects. The approach of the Bank's evaluation department (formerly the Operations Evaluation Department, OED; now the Independent Evaluation Group, IEG) has evolved over time. The OED/IEG approach uses the following criteria.

Relevance of Objectives

Definition: The extent to which the project’s objectives are consistent with the country’s current development priorities and with current Bank country and sectoral assistance strategies and corporate goals, as expressed in Poverty Reduction Strategy Papers, Country Assistance Strategies, Sectoral Strategy Papers, and Operations Policy Papers.

The IEG considers the following factors in overall relevance: government ownership and commitment; explicit Bank strategy; results framework; analytical underpinning; flexibility; strategic focus; appropriateness of instrument mix; Bank capacity; Bank and IFC coordination; and Bank and other development partners’ collaboration.

Rating of relevance by OED/IEG:

  • High/Relevant: Most of the major objectives were highly relevant.

  • Substantial/Mostly Relevant: Most of the major objectives were at least substantially relevant.

  • Modest/Partially Relevant: Most of the major objectives were not highly or substantially relevant.

  • Negligible/Not Relevant: Most of the major objectives were irrelevant or negligibly relevant.

Efficacy

Definition: The extent to which the project’s objectives were achieved, or expected to be achieved, taking into account their relative importance.

Rating of Efficacy by OED/IEG:

  • High/Achieved: Major objectives were fully met, or expected to be fully met, with no shortcomings.

  • Substantial/Mostly Achieved: Major objectives were met, or expected to be met, with only minor shortcomings.

  • Modest/Partially Achieved: Major objectives were met, or expected to be met, but with significant shortcomings.

  • Negligible/Not Achieved: Most objectives were not met, or expected not to be met, due to major shortcomings.

Efficiency (by OED; the IEG Dropped This Criterion)

Definition: The extent to which the project achieved, or is expected to achieve, a return higher than the opportunity cost of capital and benefits at least cost compared to alternatives.

Ratings by OED:

  • High: Project represents sector/industry best practice in terms of cost-effectiveness, and economic returns (if estimates are available) greatly exceed the opportunity cost of capital.

  • Substantial: Project meets sector/industry standards in terms of cost-effectiveness, and economic returns (if estimates are available) exceed the opportunity cost of capital.

  • Modest: Project fails to meet sector/industry standards in terms of cost-effectiveness, and economic returns (if estimates are available) are near the opportunity cost of capital.

  • Negligible: Project is well below sector/industry standards in terms of cost-effectiveness, and economic returns (if estimates are available) are significantly below the opportunity cost of capital.
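
The efficiency ladder above compares a project's economic return with the opportunity cost of capital. As a minimal sketch of how such a classification could be encoded: the function name is hypothetical, and the 1.5x and plus/minus 10% bands are illustrative assumptions, since the OED criteria say only "greatly exceed," "exceed," "near," and "significantly below."

```python
def rate_efficiency(economic_return: float, opportunity_cost: float) -> str:
    """Map a project's economic return to an OED-style efficiency rating.

    Band widths are assumptions for illustration, not OED policy.
    """
    if economic_return >= 1.5 * opportunity_cost:   # greatly exceeds
        return "High"
    if economic_return > 1.1 * opportunity_cost:    # exceeds
        return "Substantial"
    if economic_return >= 0.9 * opportunity_cost:   # near
        return "Modest"
    return "Negligible"                             # significantly below

# An 18% return against a 10% opportunity cost of capital rates "High".
print(rate_efficiency(0.18, 0.10))
```

In practice the OED also weighed sector/industry cost-effectiveness benchmarks alongside any return estimate, which a one-dimensional threshold rule cannot capture.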

Sustainability (by OED; the IEG Dropped This Criterion)

Definition: The resilience to risk of net benefits flows over time.

Assessments of sustainability take into account nine factors:

  • Technical resilience

  • Financial resilience (including policies on cost recovery)

  • Economic resilience

  • Social support (including conditions subject to Safeguard Policies)

  • Environmental resilience

  • Government ownership (including by central governments and agencies, and availability of O&M funds)

  • Other stakeholder ownership (including local participation, beneficiary incentives, civil society/NGOs, private sector)

  • Institutional support (including supportive legal/regulatory framework, and organizational and management effectiveness)

  • Resilience to exogenous influences (including terms of trade, economic shocks, regional political, and security situations)

OED Ratings:

  • Highly Likely: Project net benefits flow meets most of the relevant factors determining overall resilience at the “high” level, with all others rated at the “substantial” level.

  • Likely: Project net benefits flow meets all relevant factors determining overall resilience at the “substantial” level.

  • Unlikely: Project net benefits flow meets some but not all relevant factors determining overall resilience at the “substantial” level.

  • Highly Unlikely: Project net benefits flow meets few of the relevant factors determining overall resilience at the “substantial” level.

  • Not Evaluable: Insufficient information is available to make a judgment.

Result (New Criterion by the IEG)

Definition: The extent to which specified output targets were met.

Ratings by IEG:

  • Met: Specified output targets were fully met.

  • Mostly Met: Major output targets were met.

  • Partially Met: Some output targets were met.

  • Not Met: Most output targets were not met.

Outcome/Effectiveness

Definition: The extent to which the project’s major relevant objectives were achieved, or are expected to be achieved, efficiently.

Ratings by OED/IEG (the IEG has consolidated the six OED ratings into the four shown after the slash):

  • Highly Satisfactory/Achieved: Project achieved or exceeded, or is expected to achieve or exceed, all its major relevant objectives efficiently without major shortcomings.

  • Satisfactory/Achieved: Project achieved, or is expected to achieve, most of its major relevant objectives efficiently with only minor shortcomings.

  • Moderately Satisfactory/Mostly Achieved: Project achieved, or is expected to achieve, most of its major relevant objectives efficiently but with either significant shortcomings or modest overall relevance.

  • Moderately Unsatisfactory/Partially Achieved: Project is expected to achieve its major relevant objectives with major shortcomings or is expected to achieve only some of its major relevant objectives, yet achieve positive efficiency.

  • Unsatisfactory/Not Achieved: Project has failed to achieve, and is not expected to achieve, most of its major relevant objectives with only minor development benefits.

  • Highly Unsatisfactory/Not Achieved: Project has failed to achieve, and is not expected to achieve, any of its major relevant objectives with no worthwhile development benefits.

An important limitation of the OED approach to the assessment of outcome is that the outcome is considered independent of sustainability. A project may be judged “Highly Satisfactory” while it may not have been sustained.

Institutional Development Impact (IDI; by the OED Only)

Definition: The extent to which a project improves the ability of a country or region to make more efficient, equitable, and sustainable use of its human, financial, and natural resources through: (a) better definition, stability, transparency, enforceability, and predictability of institutional arrangements, and/or (b) better alignment of the mission and capacity of an organization with its mandate, which derives from these institutional arrangements. IDI considers that the project is expected to make a critical contribution to the country’s/region’s ability to effectively use human, financial, and natural resources, either through the achievement of the project’s stated ID objectives or through unintended effects.

OED Ratings:

  • Substantial: Project as a whole made, or is expected to make, a significant contribution to the country’s/region’s ability to effectively use human, financial, and natural resources, either through the achievement of the project’s stated ID objectives or through unintended effects.

  • Modest: Project as a whole increased, or is expected to increase, to a limited extent the country’s/region’s ability to effectively use human, financial, and natural resources, either through the achievement of the project’s stated ID objectives or through unintended effects.

  • Negligible: Project as a whole made, or is expected to make, little or no contribution to the country’s/region’s ability to effectively use human, financial, and natural resources, either through the achievement of the project’s stated ID objectives or through unintended effects.

The IEG no longer uses this criterion.

The Bank Performance

In addition, the OED rated Bank performance on two dimensions, “Quality at Entry” and “Supervision,” as follows.

  • The Quality at Entry ratings took into consideration: project consistency with Bank strategy for the country; grounding in economic and sector work; development objective statement; approach and design appropriateness; government ownership; involvement of stakeholders/beneficiaries; adequacy of technical analysis; economic and financial impact analysis; environmental assessment; impact on poverty reduction and social issues; institutional analysis; adequacy of financial management arrangements; readiness for implementation; and assessment of risk and sustainability.

  • The Supervision ratings took into account two major factors: focus on development impact, and adequacy of supervision inputs and processes. The focus on development impact covers: timely identification/assessment of implementation and development impact; appropriateness of proposed solutions and follow-up; and effectiveness of Bank actions. The adequacy of supervision inputs and processes covers: adequacy of Bank supervision resources; supervision reporting quality; attention to fiduciary aspects; and attention to monitoring and evaluation.

OED Ratings on Bank Performance

  • Highly Satisfactory: Bank performance was rated as Highly Satisfactory on both quality at entry and supervision, or Highly Satisfactory on the one dimension with significantly higher impact on project performance and at least Satisfactory on the other.

  • Satisfactory: Bank performance was rated at least Satisfactory on both quality at entry and supervision, or Satisfactory on the one dimension with significantly higher impact on project performance and no less than Unsatisfactory on the other.

  • Unsatisfactory: Bank performance was not rated at least Satisfactory on both quality at entry and supervision, or Unsatisfactory on the one dimension with significantly higher impact on project performance and no higher than Satisfactory on the other.

  • Highly Unsatisfactory: Bank performance was rated as Highly Unsatisfactory on both quality at entry and supervision, or Highly Unsatisfactory on the one dimension with significantly higher impact on project performance and no higher than Unsatisfactory on the other.
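
Setting aside the "significantly higher impact" override, the four rules above reduce to taking the weaker of the two dimension ratings. A hedged sketch of that simplification (the function name and the reduction itself are assumptions of this sketch, not OED wording):

```python
# Ordinal scale shared by both dimensions, from worst to best.
SCALE = [
    "Highly Unsatisfactory",
    "Unsatisfactory",
    "Satisfactory",
    "Highly Satisfactory",
]

def rate_bank_performance(quality_at_entry: str, supervision: str) -> str:
    """Simplified overall Bank performance: the weaker of the two dimensions.

    The OED rules also let a dimension with 'significantly higher impact
    on project performance' dominate; that override is omitted here.
    """
    return SCALE[min(SCALE.index(quality_at_entry), SCALE.index(supervision))]
```

For example, Highly Satisfactory quality at entry combined with Satisfactory supervision yields an overall Satisfactory under this simplification.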

The IEG instead rates Bank performance based upon (a) strategic relevance at country level and (b) effectiveness of Bank interventions. The effectiveness is assessed by relevance, result, efficacy, and overall effectiveness criteria.

The Borrower Performance (by the OED Only)

Definition: The extent to which the borrower assumed ownership and responsibility to ensure quality of preparation and implementation, and complied with covenants and agreements, towards the achievement of development objectives and sustainability.

OED rated borrower performance on three counts: (a) preparation; (b) implementation; and (c) compliance. Preparation took into consideration institutional and financial constraints. Implementation considered macro and sectoral policies/conditions; government commitment; appointment of key staff; counterpart funding; and administrative procedures; the performance of the implementing agency was also considered. Compliance covered all major covenants and commitments undertaken by the borrower.

Ratings

  • Highly Satisfactory: Borrower performance was rated Highly Satisfactory on at least two of the three performance factors.

  • Satisfactory: Borrower performance was rated at least Satisfactory on two of the three factors.

  • Unsatisfactory: Borrower performance was not rated at least Satisfactory on two of the three factors.

  • Highly Unsatisfactory: Borrower performance was rated Highly Unsatisfactory on at least two of the three factors.
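
The three-factor borrower rules lend themselves to a direct encoding. A minimal sketch, assuming each factor is rated on the same four-point scale used elsewhere in the annex (the function name is hypothetical):

```python
def rate_borrower_performance(preparation: str, implementation: str,
                              compliance: str) -> str:
    """Overall borrower rating derived from the three factor ratings,
    following the threshold rules quoted above (checked top-down)."""
    factors = [preparation, implementation, compliance]
    if factors.count("Highly Satisfactory") >= 2:
        return "Highly Satisfactory"
    at_least_satisfactory = sum(
        f in ("Satisfactory", "Highly Satisfactory") for f in factors
    )
    if at_least_satisfactory >= 2:
        return "Satisfactory"
    if factors.count("Highly Unsatisfactory") >= 2:
        return "Highly Unsatisfactory"
    return "Unsatisfactory"
```

The checks are ordered so that two Highly Unsatisfactory factors produce Highly Unsatisfactory rather than falling through to the plain Unsatisfactory case.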

The IEG no longer rates borrower performance against these criteria.

Source: World Bank (2002, 2020).

Copyright information

© 2020 The Author(s)

About this chapter

Cite this chapter

Deb, S., Shah, A. (2020). A Primer on Public Sector Evaluations. In: Shah, A. (eds) Policy, Program and Project Evaluation. Palgrave Macmillan, Cham. https://doi.org/10.1007/978-3-030-48567-2_2

  • DOI: https://doi.org/10.1007/978-3-030-48567-2_2

  • Publisher Name: Palgrave Macmillan, Cham

  • Print ISBN: 978-3-030-48566-5

  • Online ISBN: 978-3-030-48567-2

  • eBook Packages: Economics and Finance (R0)
