Representing document content via an object-oriented paradigm

  • Roberto Basili
  • Massimo Di Nanni
  • Maria Teresa Pazienza
Communications 2B Intelligent Information Retrieval
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1609)


Efficient software infrastructures for the design and implementation of Intelligent Information Systems (IIS) are very important, especially in the area of intelligent NLP-based systems. Several approaches have recently been proposed in the literature. However, the emphasis is usually on the integration of heterogeneous linguistic processors, while the problem of representing linguistic data in vivo is left in the shadow. This paper discusses an object-oriented architecture for an NLP-based IIS devoted to information extraction tasks. An application of this model to a distributed document categorisation framework, employed within an existing system, TREVI [9], is discussed as a relevant case study.
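
As a rough illustration of what an object-oriented representation of document content can look like, the following minimal Python sketch holds a document as objects that processors annotate in place. It is not the architecture described in the paper or the TREVI design: the class names (Token, Document, KeywordCategoriser), the annotation key, and the trigger-word categorisation rule are all hypothetical and chosen only to make the idea concrete.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """A surface word plus annotations attached by processors (hypothetical)."""
    text: str
    annotations: dict = field(default_factory=dict)


@dataclass
class Document:
    """Document content held as objects rather than as raw marked-up text."""
    doc_id: str
    tokens: list = field(default_factory=list)
    categories: list = field(default_factory=list)

    @classmethod
    def from_text(cls, doc_id, text):
        # Whitespace tokenisation is an assumption made for brevity.
        return cls(doc_id, [Token(word) for word in text.split()])


class KeywordCategoriser:
    """Hypothetical processor: assigns a category when a trigger word occurs."""

    def __init__(self, triggers):
        self.triggers = triggers  # maps lower-cased word -> category label

    def process(self, document):
        for token in document.tokens:
            label = self.triggers.get(token.text.lower())
            if label is not None:
                # Record the cue on the token and the label on the document.
                token.annotations["category_cue"] = label
                if label not in document.categories:
                    document.categories.append(label)
        return document


doc = Document.from_text("d1", "Shares fell after the merger announcement")
KeywordCategoriser({"shares": "finance", "merger": "finance"}).process(doc)
print(doc.categories)
```

In this sketch every processor reads and writes the same Document object, so the linguistic data stays in one shared structure instead of being passed around as separately formatted text between processors.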




  1. The common object request broker: Architecture and specification, ver. 2.0. Technical document ptc/96-03-0, OMG, 1995.
  2. McKelvie D., Brew C., and Thompson H. Using SGML as a basis for data-intensive NLP. In ANLP97, 1997.
  3. Gamma E., Helm R., Johnson R., Vlissides J., and Booch G. (Foreword), editors. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley Professional Computing, October 1994.
  4. EAGLES. Evaluation of natural language processing systems. In EAG-EWG-PR.2, 1994.
  5. Miller G. WordNet: an on-line lexical database. International Journal of Lexicography, 3:656–691, 1994.
  6. Cunningham H., Humphreys K., Gaizauskas R., and Wilks Y. Software infrastructure for natural language processing. In ANLP97, 1997.
  7. Mazzucchelli L. and Marabello M.V. Specification of the overall toolkit architecture. In EP 23311 TREVI Project Deliverable 7D1, 1997.
  8. Fowler M., Scott K. (Contributor), and Booch G., editors. UML Distilled: Applying the Standard Object Modeling Language. Addison-Wesley Object Technology Series, June 1997.
  9. Basili R., Mazzucchelli L., Di Nanni M., Marabello M.V., and Pazienza M.T. NLP for text classification: the TREVI experience. In Proceedings of the Second International Conference on Natural Language Processing and Industrial Applications, Université de Moncton, New Brunswick (Canada), August 1998.
  10. Grishman R. and CAWG. TIPSTER text phase II: Architecture design. Technical report, New York University, 1996.
  11. Zajac R., Carper M., and Sharples N. An open distributed architecture for reuse and integration of heterogeneous NLP component. In ANLP97, 1997.
  12. Peters W., Cunningham H., McCauley C., Bontcheva K., and Wilks Y. Uniform language resource access and distribution. In ICLRE98, 1998.

Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Roberto Basili (1)
  • Massimo Di Nanni (1)
  • Maria Teresa Pazienza (1)

  1. Department of Computer Science, System and Production, University of Roma, Roma, Italy
