An Extended System for Labeling Graphical Documents Using Statistical Language Models

  • Andrew O’Sullivan
  • Laura Keyes
  • Adam Winstanley
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3926)


This paper describes a proposed extended system for the recognition and labeling of graphical objects within architectural and engineering documents that integrates Statistical Language Models (SLMs) with shape classifiers. Traditionally used for Natural Language Processing, SLMs have been successful in fields such as Speech Recognition and Information Retrieval. Similarities between natural language and technical graphical data suggest that adapting SLMs for use with graphical data is a worthwhile approach. Statistical Graphical Language Models (SGLMs) are applied to graphical documents, based on associations between different classes of shape in a drawing, to automate the structuring and labeling of graphical data. The SGLMs are designed to be combined with other classifiers to improve their recognition performance. SGLMs perform best when the graphical domain being examined has an underlying semantic system, that is, when graphical objects have not been placed randomly within the data. A system that combines a Shape Classifier with SGLMs is described.
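The idea of combining a shape classifier with an SGLM can be illustrated with a minimal sketch. The abstract does not specify the model order, the smoothing, or the combination rule, so everything below is an assumption made for illustration: a bigram model over the class labels of adjacent objects, add-alpha smoothing, and a product-rule combination with the shape classifier's scores. The class names, training sequences, and function names are hypothetical.

```python
from collections import Counter

def train_bigram(sequences, classes, alpha=1.0):
    """Estimate P(label | neighbour label) with add-alpha smoothing (assumed scheme)."""
    pair_counts = Counter()
    context_counts = Counter()
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    k = len(classes)
    return {
        (prev, cur): (pair_counts[(prev, cur)] + alpha)
        / (context_counts[prev] + alpha * k)
        for prev in classes
        for cur in classes
    }

def combine(shape_probs, bigram, neighbour):
    """Product-rule combination of shape scores and bigram context, renormalised."""
    scores = {c: p * bigram[(neighbour, c)] for c, p in shape_probs.items()}
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

def label_unknown(shape_probs, bigram, neighbour):
    """Pick the class with the highest combined score."""
    posterior = combine(shape_probs, bigram, neighbour)
    return max(posterior, key=posterior.get)

# Hypothetical training data: label sequences of adjacent objects in drawings.
classes = ["door", "wall", "window"]
training = [
    ["wall", "door", "wall"],
    ["wall", "window", "wall"],
    ["wall", "door", "wall"],
    ["wall", "door", "wall"],
]
bigram = train_bigram(training, classes)

# Hypothetical shape-classifier output for one unknown object next to a wall.
shape_probs = {"door": 0.40, "window": 0.45, "wall": 0.15}
shape_only = max(shape_probs, key=shape_probs.get)
combined = label_unknown(shape_probs, bigram, neighbour="wall")
print(shape_only, combined)
```

In this toy setting the shape classifier alone slightly prefers one class, while the context learned from neighbouring labels shifts the decision to another. This only works because the training sequences are not random, which matches the abstract's remark that SGLMs perform best when the domain has an underlying semantic system.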


Graphical Data, Graphical Document, Graphical Notation, Graphical Object, Unknown Object
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Andrew O’Sullivan (1)
  • Laura Keyes (1)
  • Adam Winstanley (2)
  1. School of Informatics and Engineering, Institute of Technology Blanchardstown, Dublin 15, Ireland
  2. Department of Computer Science, NUI Maynooth, Maynooth, Co. Kildare, Ireland
