A propensity score approach in the impact evaluation on scientific production in Brazilian biodiversity research: the BIOTA Program

Scientometrics


Evaluation has become a regular practice in the management of science, technology and innovation (ST&I) programs, and several methods have been developed to identify the results and impacts of programs of this kind. Most such evaluations conclude that the interventions concerned, in this case ST&I programs, had a positive impact compared with the baseline, but they do not control for effects that might have improved the indicators even in the absence of the intervention, such as improvements in the socio-economic context. The quasi-experimental approach therefore arises as an appropriate way to identify the real contributions of a given intervention. This paper describes and discusses the use of the propensity score (PS) in quasi-experiments as a methodology for evaluating the impact of research programs on scientific production, presenting a case study of the BIOTA Program run by FAPESP, the State of São Paulo Research Foundation (Brazil). The fundamentals of quasi-experiments and causal inference are presented, stressing the need to control for biases due to lack of randomization, together with a brief introduction to the PS estimation and weighting technique used to correct for observed bias. The application of the PS methodology is compared with the traditional multivariate analysis usually employed.
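
As a rough illustration of the PS estimation and weighting idea summarized above, the following minimal sketch uses simulated data. It is not the authors' implementation: the logistic propensity model, the single covariate, the odds weights targeting the effect on the treated, the simulated effect of 2.0, and all function names are assumptions made for this example only.

```python
import numpy as np


def fit_logistic(X, t, n_iter=25):
    """Fit a logistic regression by Newton-Raphson.

    X must already contain an intercept column; t is a 0/1 treatment indicator.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (t - p)                           # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])    # observed information
        beta += np.linalg.solve(hess, grad)
    return beta


def att_ps_weighting(x, t, y):
    """Weighted difference in mean outcomes, targeting the effect on the treated.

    Treated units get weight 1; comparison units get the odds ps / (1 - ps),
    which reweights them to resemble the treated group on the observed covariate.
    """
    X = np.column_stack([np.ones_like(x), x])
    ps = 1.0 / (1.0 + np.exp(-X @ fit_logistic(X, t)))
    w = np.where(t == 1, 1.0, ps / (1.0 - ps))
    treated = t == 1
    mean_treated = np.average(y[treated], weights=w[treated])
    mean_control = np.average(y[~treated], weights=w[~treated])
    return mean_treated - mean_control


def simulate(n=20000, effect=2.0, seed=0):
    """Hypothetical data: covariate x drives both participation and the outcome."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    p_true = 1.0 / (1.0 + np.exp(-0.8 * x))   # non-random selection into treatment
    t = rng.binomial(1, p_true)
    y = 1.5 * x + effect * t + rng.normal(size=n)
    return x, t, y


x, t, y = simulate()
naive = y[t == 1].mean() - y[t == 0].mean()   # confounded by x
weighted = att_ps_weighting(x, t, y)          # adjusted for observed selection
print(f"naive difference:    {naive:.2f}")
print(f"PS-weighted estimate: {weighted:.2f}")
```

In this simulated setting the naive comparison overstates the effect because units with higher x are both more likely to participate and have higher outcomes; the weighted estimate corrects only for selection on the observed covariate, which is the kind of bias the abstract refers to.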



  1. Note that by expressing potential outcomes as time differences in the number of publications we are able to control for unobserved individual heterogeneities that are fixed over time and that could be conditionally correlated with the treatment status.

  2. Assumption (iii) is known as Stable Unit-Treatment Value Assumption (SUTVA). See, for example, Rosenbaum and Rubin (1983).

  3. These categories differ mainly in terms of the amount of funding awarded (Thematic Projects typically receive more and involve more institutions and participants), duration (2 years on average for projects with Regular Awards, 4 years for Thematic Projects and 2 to 4 years for Young Investigators projects) and research profile (Young Investigators projects typically aim to establish a new line of research at the institution involved, besides targeting researchers who are starting their careers).

  4. These projects were selected by searching FAPESP’s database for projects that contained terms relating to biodiversity and then validating the results with the management of BIOTA and FAPESP.

  5. FAPESP analysts, researchers and managers of biodiversity research projects at public and private organizations.

  6. In particular, see the two paragraphs (and Assumption 3.2) immediately preceding Section 3.1 on page 7 of Abadie (2005).
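
The differencing argument in note 1 can be written out under an assumed additive model. The notation below (an individual fixed effect, a common period effect and a constant treatment effect) is illustrative only and is not taken from the paper.

```latex
% Assumed model for researcher i in period t \in \{0, 1\} under treatment status d \in \{0, 1\}:
%   \alpha_i  = time-invariant unobserved heterogeneity
%   \lambda_t = common period effect
%   \tau      = treatment effect, active only in the post-treatment period t = 1
\begin{align*}
Y_{it}(d) &= \alpha_i + \lambda_t + \tau \, d \, \mathbb{1}[t = 1] + \varepsilon_{it}, \\
\Delta Y_i(d) &= Y_{i1}(d) - Y_{i0}(d)
  = (\lambda_1 - \lambda_0) + \tau \, d + (\varepsilon_{i1} - \varepsilon_{i0}).
\end{align*}
```

The term \alpha_i cancels in the difference, so any correlation between fixed individual characteristics and treatment status no longer enters the differenced outcome.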


  • Abadie, A. (2005). Semiparametric difference-in-differences estimators. The Review of Economic Studies, 72(1), 1–19.

  • Azoulay, P., Zivin, J. S. G., & Manso, G. (2011). Incentives and creativity: Evidence from the academic life sciences. RAND Journal of Economics, 42(3), 527–554.

  • Böhmer, S., & von Ins, M. (2009). Different—Not just by label: Research-oriented academic careers in Germany. Research Evaluation, 18(3), 177–184.

  • Carney, J., Smith, W., Parsad, A., Johnston, K., & Millsap, M. (2008). Evaluation of the Faculty Early Career Development (CAREER) Program. Bethesda, MD: Abt Associates Inc.

  • Chen, S., Mu, R., & Ravallion, M. (2008). Are there lasting impacts of aid to poor areas? Evidence for rural China. Policy Research Working Paper 4084, World Bank, Washington, DC.

  • Crump, R. K., Hotz, V. J., Imbens, G. W., & Mitnik, O. A. (2009). Dealing with limited overlap in estimation of average treatment effects. Biometrika, 96(1), 187–199.

  • D’Agostino, R. B., Jr., & Rubin, D. B. (2000). Estimating and using propensity scores with partially missing data. Journal of the American Statistical Association, 95, 749–759.

  • Ferraro, P. J. (2009). Counterfactual thinking and impact evaluation in environmental policy. New Directions for Evaluation 122 (Special Issue: Environmental Program and Policy Evaluation: Addressing Methodological Challenges).

  • Ferraro, P. J., & Pattanayak, S. K. (2006). Money for nothing? A call for empirical evaluation of biodiversity conservation investments. PLoS Biology, 4(4), e105. doi:10.1371/journal.pbio.0040105.

  • Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29(5), 1189–1232.

  • Frondel, M., & Schmidt, C. M. (2005). Evaluating environmental programs: The perspective of modern evaluation research. Ecological Economics, 55(4), 515–526.

  • Hastie, T., Tibshirani, R., & Friedman, J. (2001). The elements of statistical learning. New York: Springer.

  • Hirano, K., Imbens, G. W., & Ridder, G. (2003). Efficient estimation of average treatment effects using the estimated propensity score. Econometrica, 71, 1161–1189.

  • Hornbostel, S., Böhmer, S., Klingsporn, B., Neufeld, J., & von Ins, M. (2009). Funding of young scientist and scientific excellence. Scientometrics, 79(1), 171–190.

  • Hoshino, T. (2008). A Bayesian propensity score adjustment for latent variable modeling and MCMC algorithm. Computational Statistics & Data Analysis, 52(3), 1413–1429.

  • Ihaka, R., & Gentleman, R. (1996). R: A language for data analysis and graphics. Journal of Computational and Graphical Statistics, 5(3), 299–314.

  • Inglesi-Lotz, R., & Pouris, A. (2011). Scientometric impact assessment of a research policy instrument: The case of rating researchers on scientific outputs in South Africa. Scientometrics, 88(3), 747–760.

  • Jacob, B. A., & Lefgren, L. (2011). The impact of NIH postdoctoral training grants on scientific productivity. Research Policy, 40, 864–874.

  • Jalan, J., & Ravallion, M. (2003). Estimating the benefit incidence of an antipoverty program by propensity-score matching. Journal of Business and Economic Statistics, 21(1), 19–30.

  • Joly, C. A., Rodrigues, R. R., Metzger, J. P., Haddad, C. F. B., Verdade, L. M., Oliveira, M. C., et al. (2010). Biodiversity conservation research, training, and policy in São Paulo. Science, 328, 1358–1359.

  • Khandker, S. R., Koolwal, G. B., & Samad, H. A. (2009). Handbook on impact evaluation: Quantitative methods and practices. Washington, DC: World Bank.

  • Liu, X., Zhang, L., & Hong, S. (2011). Global biodiversity research during 1900–2009: A bibliometric analysis. Biodiversity and Conservation, 20, 807–826.

  • Lynch, L., Gray, W., & Geoghegan, J. (2007). Are farmland preservation program easement restrictions capitalized into farmland prices? What can a propensity score matching analysis tell us? Applied Economic Perspectives and Policy, 29(3), 502–509.

  • McCaffrey, D., Ridgeway, G., & Morral, A. (2004). Propensity score estimation with boosted regression for evaluating adolescent substance abuse treatment. Psychological Methods, 9(4), 403–425.

  • Mena-Chalco, J. P., & Cesar, R. M., Jr. (2009). scriptLattes: An open-source knowledge extraction system from the Lattes platform. Journal of the Brazilian Computer Society, 15(4), 31–39.

  • Mutz, R., & Daniel, H.-D. (2012). The generalized propensity score methodology for estimating unbiased journal impact factors. Scientometrics, 92(2), 377–390.

  • Nelder, J., & Wedderburn, R. (1972). Generalized linear models. Journal of the Royal Statistical Society. Series A (General), 135(3), 370–384.

  • Ravallion, M. (2008). Evaluating anti-poverty programs. In T. P. Schultz & J. Strauss (Eds.), Handbook of development economics (4th ed., pp. 3787–3846). Amsterdam: North-Holland.

  • Ridgeway, G. (1999). The state of boosting. Computing Science and Statistics, 31, 172–181.

  • Ridgeway, G. (2005). GBM 1.5 package manual.

  • Ridgeway, G. (2006). Assessing the effect of race bias in post-traffic stop outcomes using propensity scores. Journal of Quantitative Criminology, 22(1), 1–29.

  • Ridgeway, G., McCaffrey, D., & Morral, A. (2006). twang: Toolkit for weighting and analysis of nonequivalent groups. Software for using matching methods in R. Available at

  • Robins, J. M., & Rotnitzky, A. (2001). Comment on “inference for semiparametric models: Some questions and an answer” by P. Bickel and J. Kwon. Statistica Sinica, 11, 920–936.

  • Rodrigues, R. R., & Bonomi, V. L. R. (Eds.). (2008). Diretrizes para conservação e restauração da biodiversidade no estado de São Paulo. São Paulo: SMA-SP & FAPESP.

  • Rosenbaum, P., & Rubin, D. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1), 41–55.

  • Rubin, D. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5), 688–701.

  • Rubin, D. (2001). Using propensity scores to help design observational studies: Application to the tobacco litigation. Health Services and Outcomes Research Methodology, 2, 169–188.

  • Salles-Filho, S., Avila, F. D., Sepulveda, J., & Colugnati, F. A. B. (2010a). Multidimensional assessment of technology and innovation programs: The impact evaluation of INCAGRO-Peru. Research Evaluation, 19(5), 361–372.

  • Salles-Filho, S., Bonacelli, M. B. M., Zackiewicz, M., Castro, P. F. D., & Bin, A. (2007). Development and application of a methodology for evaluating S,T&I Programs: The decomposition method. In Workshop Internacional sobre Inovações Metodológicas na Avaliação de Impacto da pesquisa Agropecuária, 2008, Brasília, XII Seminário Latino-Iberoamericano de Gestión Tecnológica - ALTEC 2007, Buenos Aires, 2007 (Vol. 1, pp. 2–6).

  • Salles-Filho, S., Castro, P. F. D., Bonacelli, M. B. M., Zeitoum, C., & Sá, F. P. (2010b). Indicators for evaluation of results and impacts of research program in conservation and sustainable use of biodiversity, the case of BIOTA/FAPESP. In International conference: Getting post 2010 biodiversity targets right, Bragança Paulista.

  • Stuart, E. A. (2010). Matching methods for causal inference: A review and a look forward. Statistical Science, 25(1), 1–21.

  • White, H. (1980). A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica, 48, 817–830.



This work was supported by the São Paulo Research Foundation (FAPESP) [Grant number 2008/58628-7] and by the Coordination for the Improvement of Higher Level Personnel (CAPES) [Grant AUX-PE-PNPD- 1945/2008].

Conflict of interest

The authors declare no conflict of interest. FAPESP played no role in the design of the study, in data analysis, or in manuscript preparation.

Author information

Corresponding author

Correspondence to Fernando A. B. Colugnati.

Cite this article

Colugnati, F.A.B., Firpo, S., de Castro, P.F.D. et al. A propensity score approach in the impact evaluation on scientific production in Brazilian biodiversity research: the BIOTA Program. Scientometrics 101, 85–107 (2014).
