Classification of Text Processing Components: The Tesla Role System

  • Conference paper
  • In: Advances in Data Analysis, Data Handling and Business Intelligence

Abstract

Modeling component interactions is a major challenge in the design of component systems. In most cases, the components in such systems interact via the results they produce. This approach leads to two conflicting requirements. On the one hand, the interfaces between the components must be specified exactly. On the other hand, the interfaces should not be overly restrictive, as this may force the data produced by the components to be converted into the system’s data format. Such a conversion is difficult for complex data types (e.g., graphs or matrices), which often require non-trivial access methods that a general data format does not support.

The approach introduced in this paper tries to overcome this dilemma by meeting both demands: a role system provides a generic mechanism that nevertheless enables text processing components to produce highly specific results. The role concept described here has been adopted by the Tesla (Text Engineering Software Laboratory) framework.
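
The abstract does not show how a role is expressed, so the following is only a minimal sketch of one possible reading, not Tesla's actual API. It assumes that a role pairs a specific result type with its own access methods, and that components declare which roles they consume and produce, so that a framework can connect them by comparing declarations alone. All names (`Tokens`, `CooccurrenceGraph`, `Tokenizer`, `CooccurrenceCounter`, `can_connect`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Tokens:
    """Hypothetical role: a tokenized text with list-style access."""
    items: List[str]


@dataclass
class CooccurrenceGraph:
    """Hypothetical role: a graph-shaped result with its own access method."""
    edges: Dict[Tuple[str, str], int]

    def weight(self, a: str, b: str) -> int:
        # Specific access method that a flat, general data format would not offer.
        return self.edges.get((a, b), 0)


class Tokenizer:
    """Hypothetical component: consumes no role, produces Tokens."""
    consumes: tuple = ()
    produces = Tokens

    def run(self, text: str) -> Tokens:
        return Tokens(text.lower().split())


class CooccurrenceCounter:
    """Hypothetical component: consumes Tokens, produces a CooccurrenceGraph."""
    consumes = (Tokens,)
    produces = CooccurrenceGraph

    def run(self, tokens: Tokens) -> CooccurrenceGraph:
        edges: Dict[Tuple[str, str], int] = {}
        # Count each ordered pair of adjacent tokens.
        for a, b in zip(tokens.items, tokens.items[1:]):
            edges[(a, b)] = edges.get((a, b), 0) + 1
        return CooccurrenceGraph(edges)


def can_connect(producer, consumer) -> bool:
    """Generic check: only the declared roles are compared, never the data."""
    return producer.produces in consumer.consumes


tokens = Tokenizer().run("a b a b")
graph = CooccurrenceCounter().run(tokens)
```

In this sketch the connection check stays generic, while the graph result keeps its specific `weight` access method and is never converted into a common data format.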

Acknowledgements

We would like to thank Maryia Fedzechkina and Sonja Subicin for their help.

Author information

Correspondence to Stephan Schwiebert.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hermes, J., Schwiebert, S. (2009). Classification of Text Processing Components: The Tesla Role System. In: Fink, A., Lausen, B., Seidel, W., Ultsch, A. (eds) Advances in Data Analysis, Data Handling and Business Intelligence. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01044-6_26