Scientific and Technical Information Processing, Volume 44, Issue 5, pp 329–337

An Information Retrieval System for Decision Support: An Arctic-Related Mass Media Case Study

  • D. A. Devyatkin
  • R. E. Suvorov
  • I. V. Sochenkov


This paper addresses the problem of building a comprehensive information retrieval system that supports decision making within a specified broad topic. We analyze the requirements for such a system, the types of information sources, and typical search queries, and we propose an architecture and an integrated pipeline. We present a case study in the field of Arctic exploration (oil and mining, ecological issues, etc.) and report its results, including the most prominent topics and typical associations between entities.


information retrieval, mass media monitoring, event detection, information extraction, relation extraction, knowledge base, decision support





Copyright information

© Allerton Press, Inc. 2017

Authors and Affiliations

  • D. A. Devyatkin
    • 1
  • R. E. Suvorov
    • 1
  • I. V. Sochenkov
    • 1
  1. Institute for Systems Analysis, Computer Science and Control Federal Research Center, Russian Academy of Sciences, Moscow, Russia
