Abstract
Video codec technologies such as MPEG, together with improved microprocessor performance, make it possible to build environments in which large volumes of video can be stored. The ability to search and retrieve stored video is therefore becoming increasingly important. This paper proposes a technique for mutual spotting retrieval between speech and video images, in which either speech or video serves as a query to retrieve the other. The technique uses a network that organizes itself incrementally and represents redundant structures in degenerate form, enabling efficient searches. As a result, expressing the database in network form reduces its size by about one half for speech and about three fourths for video. Applied to a database containing six hours' worth of speech and video, the technique performed a search from video to speech in 0.5 seconds per frame.
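The network's construction details are not given in this excerpt; as a minimal sketch of the core idea of incrementally merging redundant structure, the following assumes the speech or video data has been reduced to sequences of discrete labels (e.g., phoneme or shot labels, hypothetical here) and stores them in a trie so that shared prefixes are represented once:

```python
class Node:
    """A node in an incrementally built label network."""
    def __init__(self):
        self.children = {}

def insert(root, seq):
    """Add a label sequence, reusing any already-shared prefix nodes."""
    node = root
    for sym in seq:
        node = node.children.setdefault(sym, Node())

def count_nodes(root):
    """Count all nodes reachable from root, including root itself."""
    return 1 + sum(count_nodes(c) for c in root.children.values())

# Hypothetical label sequences sharing common prefixes
sequences = [("a", "b", "c", "d"), ("a", "b", "c", "e"), ("a", "b", "x", "y")]
root = Node()
for s in sequences:
    insert(root, s)

total_symbols = sum(len(s) for s in sequences)  # naive storage: 12 symbols
network_nodes = count_nodes(root) - 1           # merged storage: 7 nodes
print(total_symbols, network_nodes)             # → 12 7
```

The reduction from 12 stored symbols to 7 network nodes illustrates, in miniature, how representing redundant structure in degenerate form can roughly halve storage, as the abstract reports for speech.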
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
Cite this paper
Endo, T., Zhang, J.X., Nakazawa, M., Oka, R. (1999). Mutual Spotting Retrieval between Speech and Video Image Using Self-Organized Network Databases. In: Nishio, S., Kishino, F. (eds) Advanced Multimedia Content Processing. AMCP 1998. Lecture Notes in Computer Science, vol 1554. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48962-2_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65762-0
Online ISBN: 978-3-540-48962-7
eBook Packages: Springer Book Archive