Abstract
Evaluation has become a regular practice in the management of science, technology and innovation (ST&I) programs, and several methods have been developed to identify the results and impacts of programs of this kind. Most such evaluations conclude that the intervention concerned, in this case an ST&I program, had a positive impact relative to the baseline, but they do not control for effects that might have improved the indicators even in the absence of the intervention, such as improvements in the socio-economic context. The quasi-experimental approach therefore arises as an appropriate way to identify the real contribution of a given intervention. This paper describes and discusses the use of propensity scores (PS) in quasi-experiments as a methodology for evaluating the impact of research programs on scientific production, presenting a case study of the BIOTA Program run by FAPESP, the State of São Paulo Research Foundation (Brazil). Fundamentals of quasi-experiments and causal inference are presented, stressing the need to control for biases due to the lack of randomization, along with a brief introduction to the PS estimation and weighting technique used to correct for observed bias. The application of the PS methodology is compared with the traditional multivariate analysis usually employed.
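The PS estimation and weighting technique mentioned above can be sketched in a few lines. The example below is a minimal illustration, not the authors' implementation: the covariate, the logistic propensity model, and the effect size are all invented. Treated units keep weight 1 and controls are weighted by the odds e/(1 − e) of their estimated propensity e, which reweights the control group to resemble the treated group and recovers the effect on the treated.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000
x = rng.normal(size=n)                     # observed covariate (e.g., prior productivity)
p_true = 1 / (1 + np.exp(-0.8 * x))        # selection: high-x units more likely treated
t = rng.binomial(1, p_true)
y = 1.5 * x + 2.0 * t + rng.normal(size=n) # outcome; true treatment effect is +2

# Fit a logistic propensity model by gradient ascent (pure NumPy).
X = np.column_stack([np.ones(n), x])
beta = np.zeros(2)
for _ in range(2000):
    e = 1 / (1 + np.exp(-X @ beta))
    beta += 0.1 * X.T @ (t - e) / n
e = 1 / (1 + np.exp(-X @ beta))

# Weighting for the effect on the treated: controls get odds weights e/(1-e).
w = np.where(t == 1, 1.0, e / (1 - e))
att = y[t == 1].mean() - np.average(y[t == 0], weights=w[t == 0])
naive = y[t == 1].mean() - y[t == 0].mean()
print(f"naive difference: {naive:.2f}, PS-weighted estimate: {att:.2f}")
```

Because selection into treatment depends on x, the naive treated-minus-control difference overstates the effect, while the weighted estimate lands near the true value of 2.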
Notes
Note that by expressing potential outcomes as time differences in the number of publications we are able to control for unobserved individual heterogeneities that are fixed over time and that could be conditionally correlated with the treatment status.
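The first-difference logic of this note can be made explicit (the notation below is ours, not the authors'). Write unit i's publication count in period s as the sum of a time-invariant individual effect, a common period effect, the treatment effect, and noise; differencing across periods removes the individual effect:

```latex
Y_{is} = \alpha_i + \lambda_s + \tau D_{is} + \varepsilon_{is}, \qquad s \in \{0,1\},
\qquad\Rightarrow\qquad
\Delta Y_i = Y_{i1} - Y_{i0} = (\lambda_1 - \lambda_0) + \tau D_i + (\varepsilon_{i1} - \varepsilon_{i0}),
```

where $D_i = D_{i1}$ since no unit is treated at baseline. The fixed effect $\alpha_i$ cancels, so comparisons of $\Delta Y_i$ between treated and control units are immune to time-invariant unobservables, even when these are correlated with treatment status.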
Assumption (iii) is known as Stable Unit-Treatment Value Assumption (SUTVA). See, for example, Rosenbaum and Rubin (1983).
These categories differ basically in terms of the amount of funding awarded (Thematic Projects typically receive more, as well as involving more institutions and participants), duration (on average, two years for projects with Regular Awards, four years for Thematic Projects, and two to four years for Young Investigators projects) and research profile (Young Investigators projects typically aim to establish a new line of research at the institution involved, besides targeting researchers who are starting their career).
These projects were selected by searching FAPESP’s database for projects that contained terms relating to biodiversity and then validating the results with the management of BIOTA and FAPESP.
FAPESP analysts, researchers, and managers of biodiversity research projects at public and private organizations.
In particular, see the two paragraphs (and Assumption 3.2) immediately preceding Section 3.1 on page 7 of Abadie (2005).
References
Abadie, A. (2005). Semiparametric difference-in-differences estimators. The Review of Economic Studies, 72(1), 1–19.
Azoulay, P., Zivin, J. S. G., & Manso, G. (2011). Incentives and creativity: Evidence from the academic life sciences. RAND Journal of Economics, 42(3), 527–554.
Böhmer, S., & von Ins, M. (2009). Different—Not just by label: Research-oriented academic careers in Germany. Research Evaluation, 18(3), 177–184.
Carney, J., Smith, W., Parsad, A., Johnston, K., & Millsap, M. (2008). Evaluation of the Faculty Early Career Development (CAREER) Program. Bethesda, MD: Abt Associates Inc.
Chen, S., Mu, R., & Ravallion, M. (2008). Are there lasting impacts of aid to poor areas? Evidence for rural China. Policy Research Working Paper 4084, World Bank, Washington, DC.
Crump, R. K., Hotz, V. J., Imbens, G. W., & Mitnik, O. A. (2009). Dealing with limited overlap in estimation of average treatment effects. Biometrika, 96(1), 187–199.
D’Agostino, R. B., Jr., & Rubin, D. B. (2000). Estimating and using propensity scores with partially missing data. Journal of the American Statistical Association, 95, 749–759.
Ferraro, P. J. (2009). Counterfactual thinking and impact evaluation in environmental policy. New Directions for Evaluation 122 (Special Issue: Environmental Program and Policy Evaluation: Addressing Methodological Challenges).
Ferraro, P. J., & Pattanayak, S. K. (2006). Money for nothing? A call for empirical evaluation of biodiversity conservation investments. PLoS Biology, 4(4), e105. doi:10.1371/journal.pbio.0040105.
Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29(5), 1189–1232.
Frondel, M., & Schmidt, C. M. (2005). Evaluating environmental programs: The perspective of modern evaluation research. Ecological Economics, 55(4), 515–526.
Hastie, T., Tibshirani, R., & Friedman, J. (2001). The elements of statistical learning. New York: Springer.
Hirano, K., Imbens, G. W., & Ridder, G. (2003). Efficient estimation of average treatment effects using the estimated propensity score. Econometrica, 71, 1161–1189.
Hornbostel, S., Böhmer, S., Klingsporn, B., Neufeld, J., & von Ins, M. (2009). Funding of young scientist and scientific excellence. Scientometrics, 79(1), 171–190.
Hoshino, T. (2008). A Bayesian propensity score adjustment for latent variable modeling and MCMC algorithm. Computational Statistics & Data Analysis, 52(3), 1413–1429.
Ihaka, R., & Gentleman, R. (1996). R: A language for data analysis and graphics. Journal of Computational and Graphical Statistics, 5(3), 299–314.
Inglesi-Lotz, R., & Pouris, A. (2011). Scientometric impact assessment of a research policy instrument: The case of rating researchers on scientific outputs in South Africa. Scientometrics, 88(3), 747–760.
Jacob, B. A., & Lefgren, L. (2011). The impact of NIH postdoctoral training grants on scientific productivity. Research Policy, 40, 864–874.
Jalan, J., & Ravallion, M. (2003). Estimating the benefit incidence of an antipoverty program by propensity-score matching. Journal of Business and Economic Statistics, 21(1), 19–30.
Joly, C. A., Rodrigues, R. R., Metzger, J. P., Haddad, C. F. B., Verdade, L. M., Oliveira, M. C., et al. (2010). Biodiversity conservation research, training, and policy in São Paulo. Science, 328, 1358–1359.
Khandker, S. R., Koolwal, G. B., & Samad, H. A. (2009). Handbook on impact evaluation quantitative methods and practices. Washington, DC: World Bank.
Liu, X., Zhang, L., & Hong, S. (2011). Global biodiversity research during 1900–2009: A bibliometric analysis. Biodiversity Conservation, 20(807), 807–826.
Lynch, L., Gray, W., & Geoghegan, J. (2007). Are farmland preservation program easement restrictions capitalized into farmland prices? What can a propensity score matching analysis tell us? Applied Economic Perspectives and Policy, 29(3), 502–509.
McCaffrey, D., Ridgeway, G., & Morral, A. (2004). Propensity score estimation with boosted regression for evaluating adolescent substance abuse treatment. Psychological Methods, 9(4), 403–425.
Mena-Chalco, J. P., & Cesar, R. M., Jr. (2009). scriptLattes: An open-source knowledge extraction system from the Lattes platform. Journal of the Brazilian Computer Society, 15(4), 31–39.
Mutz, R., & Daniel, H.-D. (2012). The generalized propensity score methodology for estimating unbiased journal impact factors. Scientometrics, 92(2), 377–390.
Nelder, J., & Wedderburn, R. (1972). Generalized linear models. Journal of the Royal Statistical Society. Series A (General), 135(3), 370–384.
Ravallion, M. (2008). Evaluating anti-poverty programs. In T. P. Schultz & J. Strauss (Eds.), Handbook of development economics (4th ed., pp. 3787–3846). Amsterdam: North-Holland.
Ridgeway, G. (1999). The state of boosting. Computing Science and Statistics, 31, 172–181.
Ridgeway, G. (2005). GBM 1.5 package manual. http://cran.r-project.org/doc/packages/gbm.pdf.
Ridgeway, G. (2006). Assessing the effect of race bias in post-traffic stop outcomes using propensity scores. Journal of Quantitative Criminology, 22(1), 1–29.
Ridgeway, G., McCaffrey, D., & Morral, A. (2006). twang: Toolkit for weighting and analysis of nonequivalent groups. Software for using matching methods in R. Available at http://cran.r-project.org/web/packages/twang/index.html.
Robins, J. M., & Rotnitzky, A. (2001). Comment on “inference for semiparametric models: Some questions and an answer” by P. Bickel and J. Kwon. Statistica Sinica, 11, 920–936.
Rodrigues, R. R., & Bonomi, V. L. R. (Eds.). (2008). Diretrizes para conservação e restauração da biodiversidade no estado de São Paulo. São Paulo: SMA-SP & FAPESP.
Rosenbaum, P., & Rubin, D. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1), 41–55.
Rubin, D. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5), 688–701.
Rubin, D. (2001). Using propensity scores to help design observational studies: Application to the tobacco litigation. Health Services and Outcomes Research Methodology, 2, 169–188.
Salles-Filho, S., Avila, F. D., Sepulveda, J., & Colugnati, F. A. B. (2010a). Multidimensional assessment of technology and innovation programs: The impact evaluation of INCAGRO-Peru. Research Evaluation, 19(5), 361–372.
Salles-Filho, S., Bonacelli, M. B. M., Zackiewicz, M., Castro, P. F. D., & Bin, A. (2007). Development and application of a methodology for evaluating S,T&I Programs: The decomposition method. In Workshop Internacional sobre Inovações Metodológicas na Avaliação de Impacto da pesquisa Agropecuária, 2008, Brasília, XII Seminário Latino-Iberoamericano de Gestión Tecnológica - ALTEC 2007, Buenos Aires, 2007 (Vol. 1, pp. 2–6).
Salles-Filho, S., Castro, P. F. D., Bonacelli, M. B. M., Zeitoum, C., & Sá, F. P. (2010b). Indicators for evaluation of results and impacts of research program in conservation and sustainable use of biodiversity, the case of BIOTA/FAPESP. In International conference: Getting post 2010 biodiversity targets right, Bragança Paulista.
Stuart, E. A. (2010). Matching methods for causal inference: A review and a look forward. Statistical Science, 25(1), 1–21.
White, H. (1980). A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica, 48, 817–830.
Acknowledgments
This work was supported by the São Paulo Research Foundation (FAPESP) [Grant number 2008/58628-7]; and Coordination for the Improvement of Higher Level Personnel (CAPES) [Grant AUX-PE-PNPD- 1945/2008].
Conflict of interest
The authors declare no conflict of interest. FAPESP played no role in the design of the study, data analysis, or in manuscript preparation.
Cite this article
Colugnati, F.A.B., Firpo, S., de Castro, P.F.D. et al. A propensity score approach in the impact evaluation on scientific production in Brazilian biodiversity research: the BIOTA Program. Scientometrics 101, 85–107 (2014). https://doi.org/10.1007/s11192-014-1397-1