TExtractor: An OSINT Tool to Extract and Analyse Audio/Video Content

  • Conference paper
Innovation, Engineering and Entrepreneurship (HELIX 2018)

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 505)

Abstract

Hacking, data breaches, and information loss are a growing concern for organizations. Aware of the escalation of cyber threats, organizations are looking for ways to detect and mitigate cyberattack scenarios. Cyber intelligence, that is, knowledge produced from data and information on cyber threats and their actors, is one of the means explored for this purpose. OSINT (Open Source INTelligence) is one of the areas of data collection for the production of cyber intelligence.

In this paper we propose an OSINT tool, TExtractor, to facilitate the process of obtaining information about cyber threats. TExtractor extracts text from audio/video content in open sources and searches it for keywords linked to the activities of malicious actors. To support its development, we conducted a study to measure the effectiveness of different tools for extracting text from audio/video sources. The results, presented in the paper, show that a tool like TExtractor can detect references to cyberattacks in audio/video sources in real time, with an accuracy between 60% and 70%.
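
The two steps named in the abstract, searching extracted text for keywords and scoring a transcription against a known reference, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the keyword list, the function names `find_keywords` and `word_accuracy`, and the choice of an in-order word-match ratio as the accuracy measure are all assumptions made for illustration, and the speech-to-text step itself is left out (the transcript is assumed to be already available as a string).

```python
import re
from difflib import SequenceMatcher

# Hypothetical watch list; the abstract does not give the actual keyword set.
KEYWORDS = ["ddos", "ransomware", "data breach", "phishing"]


def find_keywords(transcript, keywords=KEYWORDS):
    """Count case-insensitive whole-word occurrences of each keyword."""
    text = transcript.lower()
    hits = {}
    for kw in keywords:
        pattern = r"\b" + re.escape(kw.lower()) + r"\b"
        count = len(re.findall(pattern, text))
        if count:
            hits[kw] = count
    return hits


def word_accuracy(reference, hypothesis):
    """Fraction of reference words that the hypothesis reproduces in order.

    An assumed, simple measure; the paper's own metric may differ.
    """
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    if not ref_words:
        raise ValueError("reference transcript is empty")
    matcher = SequenceMatcher(a=ref_words, b=hyp_words)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref_words)


# Usage with made-up text standing in for a speech-to-text result.
transcript = "We plan a DDoS on the bank, then deploy ransomware. The ransomware is ready."
print(find_keywords(transcript))   # {'ddos': 1, 'ransomware': 2}
print(word_accuracy("the attack starts at midnight",
                    "the attack starts at midday"))   # 0.8
```

Whole-word matching avoids flagging a keyword that merely appears inside a longer word, at the cost of missing inflected or misspelled forms that a transcription engine may produce.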

Author information

Corresponding author

Correspondence to João Paulo Magalhães.

Copyright information

© 2019 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Magalhães, A., Magalhães, J.P. (2019). TExtractor: An OSINT Tool to Extract and Analyse Audio/Video Content. In: Machado, J., Soares, F., Veiga, G. (eds) Innovation, Engineering and Entrepreneurship. HELIX 2018. Lecture Notes in Electrical Engineering, vol 505. Springer, Cham. https://doi.org/10.1007/978-3-319-91334-6_1