Scientometrics, Volume 101, Issue 1, pp 85–107

A propensity score approach in the impact evaluation on scientific production in Brazilian biodiversity research: the BIOTA Program

  • Fernando A. B. Colugnati
  • Sergio Firpo
  • Paula F. Drummond de Castro
  • Juan E. Sepulveda
  • Sergio L. M. Salles-Filho


Abstract

Evaluation has become a regular practice in the management of science, technology and innovation (ST&I) programs, and several methods have been developed to identify the results and impacts of programs of this kind. Most evaluations conclude that the intervention concerned, in this case an ST&I program, had a positive impact relative to the baseline, but they do not control for effects that might have improved the indicators even in the absence of the intervention, such as improvements in the socio-economic context. The quasi-experimental approach therefore arises as an appropriate way to identify the real contribution of a given intervention. This paper describes and discusses the use of propensity scores (PS) in quasi-experiments as a methodology for evaluating the impact of research programs on scientific production, presenting a case study of the BIOTA Program run by FAPESP, the State of São Paulo Research Foundation (Brazil). The fundamentals of quasi-experiments and causal inference are presented, stressing the need to control for biases due to the lack of randomization, along with a brief introduction to the PS estimation and weighting technique used to correct for observed bias. The application of the PS methodology is compared with the traditional multivariate analysis usually employed.
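The PS estimation and weighting step mentioned in the abstract can be sketched in a few lines. The sketch below is not the authors' implementation (their toolkit, per the bibliography, is boosted models via the twang package in R); it is a minimal, self-contained illustration using plain logistic regression and ATT ("weighting by the odds") weights on synthetic data, where the single covariate x stands in for observable researcher characteristics.

```python
import math

# Synthetic illustration (not the paper's data): one covariate x,
# with treated units tending to have higher x -- the selection
# bias that PS weighting is meant to correct.
control_x = [0.0, 1.0, 2.0, 3.0, 4.0]
treated_x = [2.0, 3.0, 4.0, 5.0, 6.0]
x = control_x + treated_x
t = [0] * len(control_x) + [1] * len(treated_x)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Step 1: estimate the propensity score e(x) = P(T=1 | x) with a
# plain logistic regression fitted by gradient ascent.
a, b = 0.0, 0.0
lr = 0.02
for _ in range(50000):
    grad_a = sum(t[i] - sigmoid(a + b * x[i]) for i in range(len(x)))
    grad_b = sum((t[i] - sigmoid(a + b * x[i])) * x[i] for i in range(len(x)))
    a += lr * grad_a
    b += lr * grad_b

ps = [sigmoid(a + b * xi) for xi in x]

# Step 2: ATT weights -- treated units keep weight 1, controls get
# the odds e/(1-e), which up-weights controls that resemble the
# treated group.
w = [1.0 if ti == 1 else pi / (1.0 - pi) for ti, pi in zip(t, ps)]

# Balance check: after weighting, the control-group mean of x
# should move toward the treated-group mean.
n_c = len(control_x)
raw_control_mean = sum(control_x) / n_c
weighted_control_mean = (
    sum(wi * xi for wi, xi in zip(w[:n_c], x[:n_c])) / sum(w[:n_c])
)
treated_mean = sum(treated_x) / len(treated_x)
```

The balance check at the end is the point of the method: once the weighted control group matches the treated group on the observed covariates, the remaining difference in outcomes (here, scientific production indicators) can be attributed to the program under the usual unconfoundedness assumption.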


Keywords

Quasi-experiment · Propensity score · Impact evaluation · BIOTA Program · Bibliometrics


This work was supported by the São Paulo Research Foundation (FAPESP) [Grant number 2008/58628-7]; and Coordination for the Improvement of Higher Level Personnel (CAPES) [Grant AUX-PE-PNPD- 1945/2008].

Conflict of interest

The authors declare no conflict of interest. FAPESP played no role in the design of the study, data analysis, or manuscript preparation.



Copyright information

© Akadémiai Kiadó, Budapest, Hungary 2014

Authors and Affiliations

  • Fernando A. B. Colugnati (1, 4) – corresponding author
  • Sergio Firpo (2)
  • Paula F. Drummond de Castro (1)
  • Juan E. Sepulveda (3)
  • Sergio L. M. Salles-Filho (1)

  1. Laboratory of Studies on the Organization of Research and Innovation – GEOPI, Department of Science and Technology Policy, Institute of Geosciences, University of Campinas, Campinas, Brazil
  2. São Paulo School of Economics/FGV, São Paulo, Brazil
  3. Institute of Economics/UNICAMP, Campinas, Brazil
  4. Department of Clinical Medicine, Federal University of Juiz de Fora, Juiz de Fora, Brazil
