A Primer on Public Sector Evaluations

Chapter in Policy, Program and Project Evaluation

Abstract

This chapter provides a brief survey of program evaluation methods, their objectives, strengths, and weaknesses. It first presents methods suited to guiding the efficient allocation and utilization of resources: cost-benefit analysis, cost-effectiveness analysis, social marginal cost of funds analysis, and data envelopment analysis. This is followed by a discussion of multiple-criteria evaluation (MCE), which is far more holistic: beyond efficiency, it addresses the relevance of a program, its effectiveness in achieving its objectives, and the sustainability of its benefits. Newer MCE approaches such as the Iron Triangle and Results-Oriented Management Evaluation are briefly sketched. Finally, theory-based evaluation is outlined, where the focus is not just on whether a program succeeds or fails but also on how and why it does so.


References

  • Andrews, Matthew, and Anwar Shah. 2001. From Washington to ROME: A Road Less Traveled by Public Sector Reformers. Washington, DC: Operations Evaluation Department, World Bank.

  • Andrews, Matthew, and Anwar Shah. 2005. Citizen-Centered Governance: A New Approach to Public Sector Reform. In Public Expenditure Analysis, ed. Anwar Shah, Chapter 6, 153–182. Washington, DC: World Bank.

  • Angrist, Joshua, Guido Imbens, and Donald Rubin. 1996. Identification of Causal Effects Using Instrumental Variables. Journal of the American Statistical Association 91 (434): 444–455.

  • Banker, R.D., and R.C. Morey. 1986a. Efficiency Analysis for Exogenously Fixed Inputs and Outputs. Operations Research 34 (4): 513–521.

  • ———. 1986b. The Use of Categorical Variables in Data Envelopment Analysis. Management Science 32 (12): 1613–1627.

  • Barnow, B., G. Cain, and A. Goldberger. 1980. Issues in the Analysis of Selectivity Bias. In Evaluation Studies Review Annual, ed. E. Stromsdorfer and G. Farkas, vol. 5. San Francisco: Sage.

  • Boruch, Robert F. 1997. Randomized Experiments for Planning and Evaluation: A Practical Guide. Thousand Oaks: Sage.

  • Briggs, A., and P. Fenn. 1998. Confidence Intervals or Surfaces? Uncertainty on the Cost-Effectiveness Plane. Health Economics 7: 723–740.

  • Charnes, A., W.W. Cooper, and Edwardo L. Rhodes. 1978. Measuring the Efficiency of Decision Making Units. European Journal of Operational Research 2 (6): 429–444.

  • ———. 1979. Short Communication: Measuring the Efficiency of Decision Making Units. European Journal of Operational Research 3 (4): 339.

  • ———. 1981. Evaluating Program and Managerial Efficiency: An Application of Data Envelopment Analysis to Program Follow Through. Management Science 27 (6): 668–697.

  • Charnes, A., W.W. Cooper, A.Y. Lewin, and L.M. Seiford. 1994. Data Envelopment Analysis: Theory, Methodology, and Application. Boston: Kluwer.

  • Chen, H., and P.H. Rossi. 1980. The Multi-Goal, Theory-Driven Approach to Evaluation: A Model Linking Basic and Applied Social Science. Social Forces 59 (1): 106–122.

  • ———. 1989. Issues in the Theory-Driven Perspective. Evaluation and Program Planning 12 (4): 299–306.

  • Corbeil, R. 1986. Logic on Logic Models. In Evaluation Newsletter. Ottawa: Office of the Comptroller General of Canada.

  • Davidson, E. Jane. 2006. The “Baggaging” of Theory-Based Evaluation. Journal of Multidisciplinary Evaluation 4: iii–xiii.

  • Dehejia, Rajeev. 2015. Experimental and Non-experimental Methods in Development Economics: A Porous Dialectic. Journal of Globalization and Development 6 (1): 47–69.

  • Dehejia, R., and S. Wahba. 2002. Propensity Score Matching Methods for Non-experimental Causal Studies. Review of Economics and Statistics 84 (1): 441–462.

  • Department of Finance. 1987. The Choice of Discount Rate for Evaluating Public Sector Investment Projects: A Discussion Paper. Department of Finance, Australia.

  • Donaldson, Stewart, and Laura Gooler. 2003. Theory-Driven Evaluation in Action: Lessons from a $20 Million Statewide Work and Health Initiative. Evaluation and Program Planning 26: 355–366.

  • Duflo, Esther, Rachel Glennerster, and Michael Kremer. 2007. Chapter 61 Using Randomization in Development Economics Research: A Toolkit. In Handbook of Development Economics, vol. 4, 3895–3962. https://doi.org/10.1016/S1573-4471(07)04061-2.

  • Flay, B.R., T.Q. Miller, D. Hedeker, O. Siddiqui, C.F. Britton, B.R. Brannon, A. Johnson, W.B. Hansen, S. Sussman, and C. Dent. 1995. The Television, School and Family Smoking Prevention and Cessation Project: VIII. Student Outcomes and Mediating Variables. Preventive Medicine 24 (1): 29–40.

  • Heckman, J., N. Hohmann, J. Smith, and M. Khoo. 2000. Substitution and Dropout Bias in Social Experiments: A Study of an Influential Social Experiment. Quarterly Journal of Economics 115 (2): 651–694.

  • Husser, Philippe. 2019. Do Not Stick to the Iron Triangle in Project Management. https://www.philippehusser.com/do-not-stick-to-the-iron-triangle-in-project-management-2/

  • Imbens, Guido W., and Joshua D. Angrist. 1994. Identification and Estimation of Local Average Treatment Effects. Econometrica 62 (2): 467–475.

  • Kittelsen, S.A.C., and F.R. Førsund. 1992. Efficiency Analysis of Norwegian District Courts. Journal of Productivity Analysis 3 (3): 277–306.

  • Kusek, Jody Zall, and Ray C. Rist. 2004. Ten Steps to a Results-Based Monitoring and Evaluation System. Washington, DC: The World Bank.

  • Levin, Henry M. 1983. Cost Effectiveness: A Primer. Beverly Hills: Sage.

  • Lipsey, M.W., and A.J. Pollard. 1989. Driving Toward Theory in Program Evaluation: More Models to Choose From. Evaluation and Program Planning 12 (4): 317–328.

  • Luellen, Jason K., William R. Shadish, and M.H. Clark. 2005. Propensity Scores: An Introduction and Experimental Test. Evaluation Review 29 (6): 530–558.

  • McClendon, McKee J. 1994. Multiple Regression and Causal Analysis. Itsaca: F. E. Peacock.

  • O’Brien, B., and A. Briggs. 2002. Analysis of Uncertainty in Health Care Cost-Effectiveness Studies: An Introduction to Statistical Issues and Methods. Statistical Methods in Medical Research 11: 455–468.

  • Owens, D.K. 1998. Interpretation of Cost-Effectiveness Analysis. Journal of General Internal Medicine 13: 716–717.

  • Pawson, R., and N. Tilley. 1997. Realistic Evaluation. London: Sage.

  • Pearce, D., G. Atkinson, and S. Mourato. 2006. Cost-Benefit Analysis and the Environment: Recent Developments. Paris: OECD.

  • Pollack, Julien, Jane Helm, and Daniel Adler. 2018. What Is the Iron Triangle, and How Has It Changed? International Journal of Managing Projects in Business 11 (2): 527–547.

  • Quade, Edward S. 1967. Introduction and Overview. In Cost-Effectiveness Analysis, ed. Thomas A. Goldman. New York: Frederick A. Praeger.

  • Rosenbaum, Paul R., and Donald B. Rubin. 1983. The Central Role of the Propensity Score in Observational Studies for Causal Effects. Biometrika 70: 41–55.

  • Rossi, Peter Henry, and Howard E. Freeman. 1993. Evaluation: A Systematic Approach. Newbury Park: Sage.

  • Rush, B., and A. Ogborne. 1991. Program Logic Models: Expanding Their Role and Structure for Program Planning and Evaluation. Canadian Journal of Program Evaluation 6 (2): 95–106.

  • Scriven, Michael. 1967. The Methodology of Evaluation. In Perspectives of Curriculum Evaluation, AERA Monograph Series on Curriculum Evaluation, ed. Ralph W. Tyler, Robert M. Gagne, and Michael Scriven, vol. 1. Chicago: Rand McNally.

  • ———. 1991. Evaluation Thesaurus. Newbury Park: Sage.

  • Shadish, William R., Thomas D. Cook, and Donald T. Campbell. 2001. Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Boston: Houghton Mifflin.

  • Shah, Anwar. 2005. On Getting the Giant to Kneel: Approaches to a Change in the Bureaucratic Culture. Chapter 9. In Fiscal Management, ed. Anwar Shah, 211–228. Washington, DC: World Bank.

  • Squire, Lyn, and Herman G. van der Tak. 1975. Economic Analysis of Projects. Baltimore: Johns Hopkins University Press.

  • Suchman, E.A. 1967. Evaluative Research: Principles and Practice in Public Service and Social Action Programs. New York: Russell Sage Foundation.

  • Todd, Petra. 2007. Chapter 60 Evaluating Social Programs with Endogenous Program Placement and Selection of the Treated. In Handbook of Development Economics, vol. 4, 3847–3894. https://doi.org/10.1016/S1573-4471(07)04060-0.

  • Tulkens, H. 1993. On FDH Efficiency Analysis: Some Methodological Issues and Application to Retail Banking, Courts and Urban Transit. Journal of Productivity Analysis 4 (1–2): 183–210.

  • Van Hout, B.A., M.J. Al, G.S. Gordon, and F.F. Rutten. 1994. Costs, Effects and C/E-Ratios Alongside a Clinical Trial. Health Economics 3: 309–319.

  • Weiss, Carol H. 1987. Where Politics and Evaluation Research Meet. In The Politics of Program Evaluation, ed. D. Palumbo. Newbury Park: Sage.

  • ———. 1995. Nothing as Practical as Good Theory: Exploring Theory-Based Evaluation for Comprehensive Community Initiatives for Children and Families. In New Approaches to Evaluating Community Initiatives: Volume 1, Concepts, Methods and Contexts, ed. J.P. Connell, A.C. Kubisch, L.B. Schorr, and C.H. Weiss. Washington, DC: The Aspen Institute.

  • ———. 1997a. How Can Theory-Based Evaluation Make Greater Headway? Evaluation Review 21 (4): 501–524.

  • ———. 1997b. Theory-Based Evaluation: Past, Present, and Future. In Progress and Future Directions in Evaluation, New Directions for Evaluation, ed. D.J. Rog, vol. 76. San Francisco: Jossey-Bass.

  • ———. 1998. Evaluation. Upper Saddle River: Prentice Hall.

  • Wholey, J.S. 1994. Assessing the Feasibility and Likely Usefulness of Evaluation. In Handbook of Practical Program Evaluation, ed. J.S. Wholey, H.P. Hatry, and K.E. Newcomer. San Francisco: Jossey-Bass.

  • World Bank. 2002. Guidelines and Criteria for OED Project Evaluations. Operations Evaluation Department, Unpublished Note, July 1, 2000, The World Bank Group.

  • ———. 2020. Guidance. The Independent Evaluation Group, Unpublished Note, January.


Annex: An Example of a Multi-Criteria Evaluation Approach—The Practice by the World Bank Operations Evaluation Department/the Independent Evaluation Group

The World Bank has been a premier user of MCE in evaluating its programs and projects. The approach of the Bank's evaluation department (formerly the Operations Evaluation Department, OED; now the Independent Evaluation Group, IEG) has evolved over time. The OED/IEG approach uses the following criteria.

Relevance of Objectives

Definition: The extent to which the project’s objectives are consistent with the country’s current development priorities and with current Bank country and sectoral assistance strategies and corporate goals, as expressed in Poverty Reduction Strategy Papers, Country Assistance Strategies, Sectoral Strategy Papers, and Operations Policy Papers.

The IEG considers the following factors in overall relevance: government ownership and commitment; explicit Bank strategy; results framework; analytical underpinning; flexibility; strategic focus; appropriateness of instrument mix; Bank capacity; Bank and IFC coordination; and Bank and other development partners’ collaboration.

Rating of relevance by OED/IEG:

  • High/Relevant: Most of the major objectives were highly relevant.

  • Substantial/Mostly Relevant: Most of the major objectives were at least substantially relevant.

  • Modest/Partially Relevant: Most of the major objectives were not highly or substantially relevant.

  • Negligible/Not Relevant: Most of the major objectives were irrelevant or negligibly relevant.

Efficacy

Definition: The extent to which the project’s objectives were achieved, or expected to be achieved, taking into account their relative importance.

Rating of Efficacy by OED/IEG:

  • High/Achieved: Major objectives were fully met, or expected to be fully met, with no shortcomings.

  • Substantial/Mostly Achieved: Major objectives were met, or expected to be met, with only minor shortcomings.

  • Modest/Partially Achieved: Major objectives were met, or expected to be met, but with significant shortcomings.

  • Negligible/Not Achieved: Most objectives were not met, or expected not to be met, due to major shortcomings.

Efficiency (by OED; the IEG Dropped This Criterion)

Definition: The extent to which the project achieved, or is expected to achieve, a return higher than the opportunity cost of capital and benefits at least cost compared to alternatives.

Ratings by OED:

  • High: Project represents sector/industry best practice in terms of cost-effectiveness, and economic returns (if estimates are available) greatly exceed the opportunity cost of capital.

  • Substantial: Project meets sector/industry standards in terms of cost-effectiveness, and economic returns (if estimates are available) exceed the opportunity cost of capital.

  • Modest: Project fails to meet sector/industry standards in terms of cost-effectiveness, and economic returns (if estimates are available) are near the opportunity cost of capital.

  • Negligible: Project is well below sector/industry standards in terms of cost-effectiveness, and economic returns (if estimates are available) are significantly below the opportunity cost of capital.
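
The efficiency ladder above compares a project's economic return with the opportunity cost of capital. As a minimal sketch of how such a classification could be encoded: the function name is hypothetical, and the 1.5x and plus/minus 10% bands are illustrative assumptions, since the OED criteria say only "greatly exceed," "exceed," "near," and "significantly below."

```python
def rate_efficiency(economic_return: float, opportunity_cost: float) -> str:
    """Map a project's economic return to an OED-style efficiency rating.

    Band widths are assumptions for illustration, not OED policy.
    """
    if economic_return >= 1.5 * opportunity_cost:   # greatly exceeds
        return "High"
    if economic_return > 1.1 * opportunity_cost:    # exceeds
        return "Substantial"
    if economic_return >= 0.9 * opportunity_cost:   # near
        return "Modest"
    return "Negligible"                             # significantly below

# An 18% return against a 10% opportunity cost of capital rates "High".
print(rate_efficiency(0.18, 0.10))
```

In practice the OED also weighed sector/industry cost-effectiveness benchmarks alongside any return estimate, which a one-dimensional threshold rule cannot capture.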

Sustainability (by OED; the IEG Dropped This Criterion)

Definition: The resilience to risk of net benefits flows over time.

Assessments of sustainability take into account nine factors:

  • Technical resilience

  • Financial resilience (including policies on cost recovery)

  • Economic resilience

  • Social support (including conditions subject to Safeguard Policies)

  • Environmental resilience

  • Government ownership (including by central governments and agencies, and availability of O&M funds)

  • Other stakeholder ownership (including local participation, beneficiary incentives, civil society/NGOs, private sector)

  • Institutional support (including supportive legal/regulatory framework, and organizational and management effectiveness)

  • Resilience to exogenous influences (including terms of trade, economic shocks, regional political, and security situations)

OED Ratings:

  • Highly Likely: Project net benefits flow meets most of the relevant factors determining overall resilience at the “high” level, with all others rated at the “substantial” level.

  • Likely: Project net benefits flow meets all relevant factors determining overall resilience at the “substantial” level.

  • Unlikely: Project net benefits flow meets some but not all relevant factors determining overall resilience at the “substantial” level.

  • Highly Unlikely: Project net benefits flow meets few of the relevant factors determining overall resilience at the “substantial” level.

  • Not Evaluable: Insufficient information is available to make a judgment.

Result (New Criterion by the IEG)

Definition: The extent to which specified output targets were met.

Ratings by IEG:

  • Met: Specified output targets were fully met.

  • Mostly Met: Major output targets were met.

  • Partially Met: Some output targets were met.

  • Not Met: Most output targets were not met.

Outcome/Effectiveness

Definition: The extent to which the project’s major relevant objectives were achieved, or are expected to be achieved, efficiently.

Ratings by OED/IEG (the IEG has consolidated the six OED ratings into the four shown after the slash):

  • Highly Satisfactory/Achieved: Project achieved or exceeded, or is expected to achieve or exceed, all its major relevant objectives efficiently without major shortcomings.

  • Satisfactory/Achieved: Project achieved, or is expected to achieve, most of its major relevant objectives efficiently with only minor shortcomings.

  • Moderately Satisfactory/Mostly Achieved: Project achieved, or is expected to achieve, most of its major relevant objectives efficiently but with either significant shortcomings or modest overall relevance.

  • Moderately Unsatisfactory/Partially Achieved: Project is expected to achieve its major relevant objectives with major shortcomings or is expected to achieve only some of its major relevant objectives, yet achieve positive efficiency.

  • Unsatisfactory/Not Achieved: Project has failed to achieve, and is not expected to achieve, most of its major relevant objectives with only minor development benefits.

  • Highly Unsatisfactory/Not Achieved: Project has failed to achieve, and is not expected to achieve, any of its major relevant objectives with no worthwhile development benefits.

An important limitation of the OED approach to the assessment of outcome is that the outcome is considered independent of sustainability. A project may be judged “Highly Satisfactory” while it may not have been sustained.

Institutional Development Impact (IDI; by the OED Only)

Definition: The extent to which a project improves the ability of a country or region to make more efficient, equitable, and sustainable use of its human, financial, and natural resources through: (a) better definition, stability, transparency, enforceability, and predictability of institutional arrangements, and/or (b) better alignment of the mission and capacity of an organization with its mandate, which derives from these institutional arrangements. IDI considers that the project is expected to make a critical contribution to the country’s/region’s ability to effectively use human, financial, and natural resources, either through the achievement of the project’s stated ID objectives or through unintended effects.

OED Ratings:

  • Substantial: Project as a whole made, or is expected to make, a significant contribution to the country’s/region’s ability to effectively use human, financial, and natural resources, either through the achievement of the project’s stated ID objectives or through unintended effects.

  • Modest: Project as a whole increased, or is expected to increase, to a limited extent the country’s/region’s ability to effectively use human, financial, and natural resources, either through the achievement of the project’s stated ID objectives or through unintended effects.

  • Negligible: Project as a whole made, or is expected to make, little or no contribution to the country’s/region’s ability to effectively use human, financial, and natural resources, either through the achievement of the project’s stated ID objectives or through unintended effects.

The IEG no longer uses this criterion.

The Bank Performance

In addition, the OED rated Bank performance on two dimensions, “Quality at Entry” and “Supervision,” as follows.

  • The Quality at Entry ratings took into consideration: project consistency with Bank strategy for the country; grounding in economic and sector work; development objective statement; approach and design appropriateness; government ownership; involvement of stakeholders/beneficiaries; adequacy of technical analysis; economic and financial impact analysis; environmental assessment; impact on poverty reduction and social issues; institutional analysis; adequacy of financial management arrangements; readiness for implementation; and assessment of risk and sustainability.

  • The Supervision ratings took into account two major factors: focus on development impact, and adequacy of supervision inputs and processes. The focus on development impact covers: timely identification/assessment of implementation and development impact; appropriateness of proposed solutions and follow-up; and effectiveness of Bank actions. The adequacy of supervision inputs and processes covers: adequacy of Bank supervision resources; supervision reporting quality; attention to fiduciary aspects; and attention to monitoring and evaluation.

OED Ratings on Bank Performance

  • Highly Satisfactory: Bank performance was rated as Highly Satisfactory on both quality at entry and supervision, or Highly Satisfactory on the one dimension with significantly higher impact on project performance and at least Satisfactory on the other.

  • Satisfactory: Bank performance was rated at least Satisfactory on both quality at entry and supervision, or Satisfactory on the one dimension with significantly higher impact on project performance and no less than Unsatisfactory on the other.

  • Unsatisfactory: Bank performance was not rated at least Satisfactory on both quality at entry and supervision, or Unsatisfactory on the one dimension with significantly higher impact on project performance and no higher than Satisfactory on the other.

  • Highly Unsatisfactory: Bank performance was rated as Highly Unsatisfactory on both quality at entry and supervision, or Highly Unsatisfactory on the one dimension with significantly higher impact on project performance and no higher than Unsatisfactory on the other.
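
Setting aside the "significantly higher impact" override, the four rules above reduce to taking the weaker of the two dimension ratings. A hedged sketch of that simplification (the function name and the reduction itself are assumptions of this sketch, not OED wording):

```python
# Ordinal scale shared by both dimensions, from worst to best.
SCALE = [
    "Highly Unsatisfactory",
    "Unsatisfactory",
    "Satisfactory",
    "Highly Satisfactory",
]

def rate_bank_performance(quality_at_entry: str, supervision: str) -> str:
    """Simplified overall Bank performance: the weaker of the two dimensions.

    The OED rules also let a dimension with 'significantly higher impact
    on project performance' dominate; that override is omitted here.
    """
    return SCALE[min(SCALE.index(quality_at_entry), SCALE.index(supervision))]
```

For example, Highly Satisfactory quality at entry combined with Satisfactory supervision yields an overall Satisfactory under this simplification.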

The IEG instead rates Bank performance based upon (a) strategic relevance at country level and (b) effectiveness of Bank interventions. The effectiveness is assessed by relevance, result, efficacy, and overall effectiveness criteria.

The Borrower Performance (by the OED Only)

Definition: The extent to which the borrower assumed ownership and responsibility to ensure quality of preparation and implementation, and complied with covenants and agreements, towards the achievement of development objectives and sustainability.

OED rated borrower performance on three counts: (a) preparation; (b) implementation; and (c) compliance. Preparation took into consideration institutional and financial constraints. Implementation considered macro and sectoral policies/conditions; government commitment; appointment of key staff; counterpart funding; and administrative procedures; the performance of the implementing agency was also considered. Compliance covered all major covenants and commitments undertaken by the borrower.

Ratings

  • Highly Satisfactory: Borrower performance was rated Highly Satisfactory on at least two of the three performance factors.

  • Satisfactory: Borrower performance was rated at least Satisfactory on two of the three factors.

  • Unsatisfactory: Borrower performance was not rated at least Satisfactory on two of the three factors.

  • Highly Unsatisfactory: Borrower performance was rated Highly Unsatisfactory on at least two of the three factors.
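
The three-factor borrower rules lend themselves to a direct encoding. A minimal sketch, assuming each factor is rated on the same four-point scale used elsewhere in the annex (the function name is hypothetical):

```python
def rate_borrower_performance(preparation: str, implementation: str,
                              compliance: str) -> str:
    """Overall borrower rating derived from the three factor ratings,
    following the threshold rules quoted above (checked top-down)."""
    factors = [preparation, implementation, compliance]
    if factors.count("Highly Satisfactory") >= 2:
        return "Highly Satisfactory"
    at_least_satisfactory = sum(
        f in ("Satisfactory", "Highly Satisfactory") for f in factors
    )
    if at_least_satisfactory >= 2:
        return "Satisfactory"
    if factors.count("Highly Unsatisfactory") >= 2:
        return "Highly Unsatisfactory"
    return "Unsatisfactory"
```

The checks are ordered so that two Highly Unsatisfactory factors produce Highly Unsatisfactory rather than falling through to the plain Unsatisfactory case.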

The IEG no longer rates borrower performance against these criteria.

Source: World Bank (2002, 2020).

Copyright information

© 2020 The Author(s)

About this chapter

Cite this chapter

Deb, S., Shah, A. (2020). A Primer on Public Sector Evaluations. In: Shah, A. (eds) Policy, Program and Project Evaluation. Palgrave Macmillan, Cham. https://doi.org/10.1007/978-3-030-48567-2_2

  • DOI: https://doi.org/10.1007/978-3-030-48567-2_2

  • Publisher Name: Palgrave Macmillan, Cham

  • Print ISBN: 978-3-030-48566-5

  • Online ISBN: 978-3-030-48567-2

  • eBook Packages: Economics and Finance (R0)
