
Simulated smart phone recordings for audio identification

The Journal of Supercomputing

Abstract

This paper studies the use of simulated recordings in audio identification experiments. Instead of using actual recordings, we generate simulated recordings from a measured room impulse response, which greatly reduces the burden of manually recording audio items. By comparing the correlations between actual and simulated recordings, we conclude that this approach is feasible. The audio identification experiments use Moving Picture Experts Group (MPEG) audio signature descriptors to represent the simulated recordings. We also add environmental noise, provided by the European Telecommunications Standards Institute (ETSI), to the simulated recordings to study the performance degradation. Finally, we study whether filtering in the descriptor domain can improve accuracy. The experimental results show that filtering in the frequency direction yields higher accuracy for items with a signal-to-noise ratio of 10 dB.
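
The simulation idea in the abstract (convolve a clean item with a room impulse response, then add environmental noise at a chosen signal-to-noise ratio) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names `simulate_recording`, `add_noise_at_snr`, and `correlation` are hypothetical, NumPy is an assumed tool, and the impulse response and noise in the demo are synthetic stand-ins for the measured response and the ETSI noise recordings used in the paper.

```python
import numpy as np


def simulate_recording(clean, rir):
    """Convolve a clean signal with a room impulse response (RIR).

    The result is truncated to the length of the clean signal so that it
    can be compared sample by sample with an actual recording.
    """
    return np.convolve(clean, rir, mode="full")[: len(clean)]


def add_noise_at_snr(signal, noise, snr_db):
    """Add noise to a signal, scaled to reach the requested SNR in dB."""
    # Repeat or truncate the noise so it matches the signal length.
    noise = np.resize(noise, signal.shape)
    signal_power = np.mean(signal ** 2)
    noise_power = np.mean(noise ** 2)
    # Choose the gain so that signal_power / (gain^2 * noise_power)
    # equals 10^(snr_db / 10).
    gain = np.sqrt(signal_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return signal + gain * noise


def correlation(a, b):
    """Pearson correlation coefficient between two equal-length signals."""
    return float(np.corrcoef(a, b)[0, 1])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.standard_normal(16000)      # stand-in for a clean audio item
    # Synthetic, exponentially decaying RIR (assumption, not measured data).
    rir = np.exp(-np.arange(400) / 50.0) * rng.standard_normal(400)
    rir[0] = 1.0
    simulated = simulate_recording(clean, rir)
    noise = rng.standard_normal(4000)       # stand-in for environmental noise
    noisy = add_noise_at_snr(simulated, noise, snr_db=10.0)
    print("correlation(simulated, noisy):", correlation(simulated, noisy))
```

In an actual experiment, the same `correlation` helper could be applied to an actual recording and its simulated counterpart, which is the kind of comparison the abstract uses to judge whether the simulation is feasible.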

Acknowledgments

This work was supported in part by the Ministry of Science and Technology of Taiwan through Grant NSC 102-2221-E-027-076.

Author information

Corresponding author

Correspondence to Shingchern D. You.

About this article

Cite this article

You, S.D., Lin, YC. Simulated smart phone recordings for audio identification. J Supercomput 72, 1799–1812 (2016). https://doi.org/10.1007/s11227-015-1533-6

  • DOI: https://doi.org/10.1007/s11227-015-1533-6
