Multi-scale Stacked Sequential Learning

  • Oriol Pujol
  • Eloi Puertas
  • Carlo Gatta
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5519)


One of the most widely used assumptions in supervised learning is that data are independent and identically distributed. This assumption does not hold in many real cases. Sequential learning is the discipline of machine learning that deals with dependent data, in which neighboring examples exhibit some kind of relationship. The literature offers several approaches that try to capture and exploit this correlation by different means. In this paper we focus on meta-learning strategies and, in particular, on the stacked sequential learning approach. The main contribution of this work is two-fold. First, we generalize stacked sequential learning; this generalization reflects the key role played by the modeling of neighboring interactions. Second, we propose an effective and efficient way of capturing and exploiting sequential correlations that takes long-range interactions into account by means of a multi-scale pyramidal decomposition of the predicted labels. This new method also subsumes the standard stacked sequential learning approach. We tested the proposed method on two different classification tasks: text line classification in a FAQ data set and image classification. Results on these tasks clearly show that our approach outperforms standard stacked sequential learning. Moreover, we show that the proposed method makes it possible to control the trade-off between the level of detail and the desired range of the interactions.
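The second contribution can be illustrated with a minimal sketch. It assumes that a first-stage classifier has already produced one predicted score per sequence position (in stacked learning these are typically obtained by cross-validation) and shows how a multi-scale decomposition of those predictions could be appended to the original features before a second-stage classifier is trained. The function names, the Gaussian scales (powers of two), the border clamping, and the neighbor sampling scheme are all illustrative assumptions and are not taken from the paper.

```python
import math


def gaussian_kernel(sigma):
    """Discrete Gaussian kernel truncated at 3 sigma and normalized to sum to 1."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, sigma):
    """Convolve a 1-D signal with a Gaussian, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out


def multiscale_decomposition(predictions, num_scales=3):
    """Split predictions into num_scales detail layers plus a coarse residual.

    Each detail layer is the difference between two successive smoothings
    (assumed sigmas: 1, 2, 4, ...), so all layers sum back to the input.
    """
    layers = []
    current = list(predictions)
    for s in range(num_scales):
        blurred = smooth(current, sigma=2 ** s)
        layers.append([c - b for c, b in zip(current, blurred)])
        current = blurred
    layers.append(current)  # coarsest residual
    return layers


def extended_features(features, predictions, num_scales=3, window=1):
    """Append multi-scale neighborhood samples of the predictions to each row.

    At layer s, neighbors are sampled with a stride of 2**s (an assumption),
    so coarser layers describe longer-range interactions.
    """
    layers = multiscale_decomposition(predictions, num_scales)
    n = len(predictions)
    extended = []
    for i in range(n):
        row = list(features[i])
        for s, layer in enumerate(layers):
            step = 2 ** s
            for offset in range(-window, window + 1):
                j = min(max(i + offset * step, 0), n - 1)
                row.append(layer[j])
        extended.append(row)
    return extended


# Hypothetical first-stage scores for a sequence of eight examples.
first_stage_scores = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0]
original_features = [[float(i), 1.0] for i in range(len(first_stage_scores))]
second_stage_input = extended_features(original_features, first_stage_scores)
```

In this sketch, setting `num_scales=0` leaves a single layer equal to the raw predictions, sampled in a plain window around each position. That is the windowed extension used in standard stacked sequential learning, which is one way to read the claim that the multi-scale method subsumes it. Increasing `num_scales` extends the range of the captured interactions, while `window` controls how densely each scale is sampled.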


Sequential Learning, Text Line, Time Series Prediction, Multiscale Approach, Support Lattice





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Oriol Pujol (1, 2)
  • Eloi Puertas (1)
  • Carlo Gatta (2)
  1. Dept. Matemàtica Aplicada i Anàlisi, Universitat de Barcelona, Barcelona, Spain
  2. Computer Vision Center, Bellaterra, Spain
