Automatic Image Description Based on Textual Data

Badr, Youakim; Chbeir, Richard

doi:10.1007/11890591_7


Conference paper

Part of the book series: Lecture Notes in Computer Science (JODS, volume 4244)

Abstract

Over the last two decades, images have been produced in increasing amounts in several application domains. In medicine, for instance, a large number of images from various imaging modalities (e.g. computed tomography, magnetic resonance, nuclear imaging) are produced daily to support clinical decision-making. A fully functional Image Management System has therefore become a requirement for end-users. Despite current research, practice has shown that the problem of image management is closely tied to image representation. The contribution of this paper is twofold: it facilitates the representation of images and the extraction of their content and context descriptors. First, we introduce an expressive and extensible XML-based meta-model able to capture the metadata and content-based descriptions of images. Second, we propose an information extraction approach that provides an automatic description of image content using related metadata. It automatically generates XML instances, which mark up metadata and salient objects matched by extraction patterns. We illustrate our proposal using the medical domain of lung X-rays and present our first experimental results.
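
To make the pattern-based extraction step concrete, the following is a minimal Python sketch of how textual data attached to an image might be turned into an XML instance. It is an illustration under stated assumptions, not the authors' implementation: the element names (`image`, `metadata`, `content`, `salientObject`), the regular expressions, and the sample report are all hypothetical, and the paper's actual meta-model and extraction patterns are not shown in this preview.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical extraction patterns (assumptions for illustration only).
METADATA_PATTERNS = {
    "modality": re.compile(r"\b(x-ray|computed tomography|magnetic resonance)\b", re.I),
    "date": re.compile(r"\b(\d{4}-\d{2}-\d{2})\b"),
}
SALIENT_OBJECT_PATTERN = re.compile(
    r"\b(?P<name>opacity|nodule|mass)\s+in\s+the\s+"
    r"(?P<location>(?:left|right)\s+(?:upper|middle|lower)\s+lobe)\b",
    re.I,
)


def describe_image(report: str) -> ET.Element:
    """Build an XML description from the text that accompanies an image."""
    root = ET.Element("image")

    # Metadata: keep the first match of each pattern, if any.
    metadata = ET.SubElement(root, "metadata")
    for field, pattern in METADATA_PATTERNS.items():
        match = pattern.search(report)
        if match:
            ET.SubElement(metadata, field).text = match.group(1).lower()

    # Content: one element per salient object matched in the text.
    content = ET.SubElement(root, "content")
    for match in SALIENT_OBJECT_PATTERN.finditer(report):
        obj = ET.SubElement(content, "salientObject")
        obj.set("name", match.group("name").lower())
        obj.set("location", match.group("location").lower())
    return root


# Invented sample report, used only to exercise the sketch.
report = (
    "Chest x-ray taken on 2005-03-14. "
    "Nodule in the right upper lobe; opacity in the left lower lobe."
)
print(ET.tostring(describe_image(report), encoding="unicode"))
```

Keeping the patterns in plain data structures, separate from the XML-building code, is one way such a sketch could be adapted to another domain by swapping patterns rather than rewriting logic.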

Author information

Authors and Affiliations

PRISMa – INSA de Lyon, 20 Av. Einstein, F-69621, Villeurbanne, France
Youakim Badr
LE2I – Bourgogne University, BP 47870, 21078 Cedex, Dijon, France
Richard Chbeir

Editor information

Editors and Affiliations

EPFL-IC-IIF-LBD, Station 14 - INJ 236, 1015, Lausanne, Switzerland
Stefano Spaccapietra

About this paper

Cite this paper

Badr, Y., Chbeir, R. (2006). Automatic Image Description Based on Textual Data. In: Spaccapietra, S. (eds) Journal on Data Semantics VII. Lecture Notes in Computer Science, vol 4244. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11890591_7

DOI: https://doi.org/10.1007/11890591_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46329-0
Online ISBN: 978-3-540-46330-6
eBook Packages: Computer Science, Computer Science (R0)