Room Identification with Personal Voice Assistants (Extended Abstract)

Part of the Lecture Notes in Computer Science book series (LNCS, volume 13106)

Abstract

Personal Voice Assistants (PVAs) let users interact with digital environments and computer systems using speech. In this work we describe how to identify the room in which the speaker is located. Identification uses only the audio signal, with no other sensor input. We feed the output of existing trained speaker identification models into a Support Vector Machine (SVM) to perform room identification. This method re-uses existing elements of PVA ecosystems and does not require an intensive training phase. In our evaluation, rooms are identified with almost 90% accuracy. Room identification might be used as an additional security mechanism, and the work shows that speech signals recorded by PVAs can also leak additional information.
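The pipeline outlined in the abstract (vectors from a pre-trained speaker model, classified by an SVM) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `fake_embedding` is a hypothetical stand-in that generates synthetic vectors in place of real speaker-model outputs, the SVM is a simple linear one-vs-rest variant trained by subgradient descent on the hinge loss, and the dimensions, room count, and hyperparameters are invented for the example. The accuracy it prints describes the synthetic data only and says nothing about the paper's results.

```python
import random

def fake_embedding(room, rng, dim=8):
    """Hypothetical stand-in for a speaker-model output vector (assumption).
    Each room shifts one coordinate, mimicking a room-dependent bias."""
    vec = [rng.gauss(0.0, 0.3) for _ in range(dim)]
    vec[room] += 2.0
    return vec

def train_linear_svm(X, y, epochs=50, lam=0.01, lr=0.1, seed=0):
    """Binary linear SVM: subgradient descent on the L2-regularised hinge loss.
    Labels in y must be +1 or -1."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            # Regulariser step: shrink the weights slightly.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Hinge-loss step: only samples inside the margin contribute.
            if margin < 1.0:
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b

def train_one_vs_rest(X, labels, n_rooms):
    """One binary SVM per room: that room versus all others."""
    return [
        train_linear_svm(X, [1 if lab == room else -1 for lab in labels])
        for room in range(n_rooms)
    ]

def predict(models, x):
    """Return the room whose SVM gives the highest decision score."""
    scores = [sum(wj * xj for wj, xj in zip(w, x)) + b for w, b in models]
    return scores.index(max(scores))

rng = random.Random(42)
n_rooms = 3  # assumed number of rooms for this toy example
train = [(fake_embedding(r, rng), r) for r in range(n_rooms) for _ in range(40)]
test = [(fake_embedding(r, rng), r) for r in range(n_rooms) for _ in range(20)]

models = train_one_vs_rest([x for x, _ in train], [r for _, r in train], n_rooms)
accuracy = sum(predict(models, x) == r for x, r in test) / len(test)
print(f"accuracy on synthetic data: {accuracy:.2f}")
```

In a real setting, `fake_embedding` would be replaced by whatever vector the chosen speaker identification model emits for a recording, and a library SVM would likely be preferable to this hand-written one. Only the lightweight classifier is trained here, which is in line with the abstract's point that an intensive training phase is not required.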


Acknowledgement

This publication has emanated from research conducted with the financial support of Science Foundation Ireland under Grant number 19/FFP/6775. For the purpose of Open Access, the author has applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission.

Author information

Corresponding author

Correspondence to Mohammadreza Azimi.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Azimi, M., Roedig, U. (2022). Room Identification with Personal Voice Assistants (Extended Abstract). In: , et al. Computer Security. ESORICS 2021 International Workshops. ESORICS 2021. Lecture Notes in Computer Science, vol 13106. Springer, Cham. https://doi.org/10.1007/978-3-030-95484-0_19

  • DOI: https://doi.org/10.1007/978-3-030-95484-0_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-95483-3

  • Online ISBN: 978-3-030-95484-0

  • eBook Packages: Computer Science, Computer Science (R0)