Compensation of audio data with a high frequency components for realistic media FTV

Yeo, Sung-Dae; Cho, Tae-Il; Kim, Jong-Un; Park, Goo-Man; Kim, Seong-Kweon

doi:10.1007/s11042-016-3713-7

Published: 29 June 2016

Volume 76, pages 11361–11376, (2017)
Multimedia Tools and Applications

Abstract

In free-viewpoint TV (FTV), audio should be rendered to match the video at the selected viewpoint. However, when only a minimal number of microphones is used, it is difficult to acquire an audio signal accurate enough to render audio for the image at the chosen viewpoint. In particular, the high frequency components (HFC) are degraded because of the microphone's polar pattern. This degradation makes signal restoration imperfect and reduces the clarity of the perceived sound. In this paper, a method for compensating the degraded HFC of the audio signal is proposed, with the aim of producing an immersive audio effect in realistic media. Our experimental results show that the low frequency components (LFC) of the audio signal suffered only slight directional degradation despite the microphone's polar pattern, and that the HFC can be compensated by adapting the attenuation inclination of the LFC. This research is expected to help produce immersive audio effects for realistic media.
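
The abstract does not spell out the compensation procedure, so the following is only a minimal sketch of one possible reading of "adapting the attenuation inclination of LFC": fit a straight line to the off-axis attenuation measured in the low band, extrapolate that line to a high frequency, and boost the high band by the shortfall. The function names, the linear dB-versus-Hz model, and all numeric values are illustrative assumptions, not the authors' method or data.

```python
def fit_lfc_slope(freqs_hz, level_db):
    """Least-squares line through low-band levels (dB relative to on-axis).

    Returns (slope in dB/Hz, intercept in dB). Hypothetical helper.
    """
    n = len(freqs_hz)
    mean_f = sum(freqs_hz) / n
    mean_l = sum(level_db) / n
    cov = sum((f - mean_f) * (l - mean_l) for f, l in zip(freqs_hz, level_db))
    var = sum((f - mean_f) ** 2 for f in freqs_hz)
    slope = cov / var
    return slope, mean_l - slope * mean_f


def hfc_gain_db(freq_hz, measured_db, slope, intercept):
    """Gain (dB) that lifts a measured high-band level onto the LFC trend line."""
    expected_db = slope * freq_hz + intercept
    return expected_db - measured_db


# Made-up off-axis levels for illustration only (dB relative to on-axis).
low_freqs = [250.0, 500.0, 1000.0]
low_levels = [-1.0, -1.5, -2.5]

slope, intercept = fit_lfc_slope(low_freqs, low_levels)
# Assumed measurement: the 8 kHz band arrives at -20 dB off-axis.
gain = hfc_gain_db(8000.0, -20.0, slope, intercept)
print(f"slope={slope:.4f} dB/Hz, intercept={intercept:.2f} dB, gain={gain:.2f} dB")
```

Under these assumptions the gain is positive whenever the high band has fallen off faster than the low-band trend predicts; a real system would need measured polar-pattern data and probably a different frequency model.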

Acknowledgments

The work was supported by the ICT R&D program of MSIP/IITP, Republic of Korea, [B0101-15-0042, Volumetric 3D Image and 3D Audio Realization Technology].

Author information

Authors and Affiliations

Graduate School of NID Fusion Technology, Seoul National University of Science & Technology, Seoul, Republic of Korea
Sung-Dae Yeo, Tae-Il Cho, Jong-Un Kim, Goo-Man Park & Seong-Kweon Kim

Corresponding author

Correspondence to Seong-Kweon Kim.

About this article

Cite this article

Yeo, SD., Cho, TI., Kim, JU. et al. Compensation of audio data with a high frequency components for realistic media FTV. Multimed Tools Appl 76, 11361–11376 (2017). https://doi.org/10.1007/s11042-016-3713-7

Received: 27 October 2015
Revised: 31 March 2016
Accepted: 24 June 2016
Published: 29 June 2016
Issue Date: May 2017
DOI: https://doi.org/10.1007/s11042-016-3713-7
