An Object-Oriented Schema for Querying Audio

  • José Martinez
  • Rania Lutfi
  • Marc Gelgon
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2425)


To be fully usable, digital multimedia content should be supported by a set of tools to query it and, more generally, to manipulate it. This is one of the major goals of an audio database management system (DBMS). Existing work, e.g., on radio or television archives, generally tackles the signal processing aspects, often leaving the DBMS question open. In this paper, we lay the foundations for integrating audio into a general-purpose DBMS in the form of an object-oriented library. This library provides additional capabilities on top of (MPEG-7) audio descriptions used for mere selections.


Speech Recognition, Audio Data, Temporal Projection, Audio Segment, Sound Track
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • José Martinez (1)
  • Rania Lutfi (1)
  • Marc Gelgon (1)
  1. Institut de Recherche en Informatique de Nantes (IRIN / BaDRI), Ecole polytechnique de l’université de Nantes, Nantes Cedex 3, France
