Prediction of Forest Aboveground Biomass: An Exercise on Avoiding Overfitting

  • Sara Silva
  • Vijay Ingalalli
  • Susana Vinga
  • João M. B. Carreiras
  • Joana B. Melo
  • Mauro Castelli
  • Leonardo Vanneschi
  • Ivo Gonçalves
  • José Caldas
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7835)

Abstract

Mapping and understanding the spatial distribution of forest aboveground biomass (AGB) is an important and challenging task. This paper describes an exercise in predicting the forest AGB of Guinea-Bissau, West Africa, using synthetic aperture radar data and measurements of tree size collected in field campaigns. Several methods were attempted, from linear regression to different variants and techniques of Genetic Programming (GP), including the cutting-edge geometric semantic GP approach. The results were compared with one another in terms of root mean square error and correlation between predicted and expected values of AGB. None of the methods was able to produce a model that generalizes well to unseen data or significantly outperforms the model obtained by the state-of-the-art methodology, and the latter was itself no better than a simple linear model. We conclude that AGB prediction is a difficult problem, aggravated by the small size of the available data set.
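
The two comparison criteria named in the abstract, root mean square error and correlation between predicted and expected AGB, can be illustrated with a minimal sketch. It assumes that "correlation" means the Pearson coefficient, which the abstract does not state. The function names and the sample values are hypothetical and are not taken from the paper.

```python
import math


def rmse(expected, predicted):
    """Root mean square error between two equally long sequences."""
    if len(expected) != len(predicted) or not expected:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((e - p) ** 2 for e, p in zip(expected, predicted))
    return math.sqrt(squared / len(expected))


def pearson(expected, predicted):
    """Pearson correlation coefficient (an assumed reading of 'correlation')."""
    if len(expected) != len(predicted) or not expected:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(expected)
    mean_e = sum(expected) / n
    mean_p = sum(predicted) / n
    cov = sum((e - mean_e) * (p - mean_p) for e, p in zip(expected, predicted))
    var_e = sum((e - mean_e) ** 2 for e in expected)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_e == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_e * var_p)


# Hypothetical AGB values for illustration only (not data from the paper).
expected_agb = [10.0, 20.0, 30.0, 40.0]
predicted_agb = [12.0, 18.0, 33.0, 37.0]

error = rmse(expected_agb, predicted_agb)
corr = pearson(expected_agb, predicted_agb)
print(f"RMSE = {error:.3f}, correlation = {corr:.3f}")
```

To judge generalization rather than fit, both measures would be computed on samples held out from training, since a low training error alone says nothing about performance on unseen data.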

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Sara Silva (1, 2)
  • Vijay Ingalalli (1)
  • Susana Vinga (1, 3)
  • João M. B. Carreiras (4)
  • Joana B. Melo (4)
  • Mauro Castelli (1, 5)
  • Leonardo Vanneschi (5, 1)
  • Ivo Gonçalves (2)
  • José Caldas (1)

  1. INESC-ID, IST, Universidade Técnica de Lisboa, Lisboa, Portugal
  2. CISUC, Universidade de Coimbra, Coimbra, Portugal
  3. FCM, Universidade Nova de Lisboa, Lisboa, Portugal
  4. Instituto de Investigação Científica Tropical, Lisboa, Portugal
  5. ISEGI, Universidade Nova de Lisboa, Lisboa, Portugal
