Information Aggregation in an Enterprise

  • Chapter
Smart Information Systems

Part of the book series: Advances in Computer Vision and Pattern Recognition ((ACVPR))

Abstract

In this chapter, we discuss the application of a distributed information retrieval system in an enterprise environment. Focusing on the characteristics of enterprise information, such as the heterogeneity of available information, security policies, and the distributed nature of data repositories, we investigate how state-of-the-art distributed information retrieval approaches can be applied to build such a system. In a case study, we present an application of these techniques in an office environment that allows employees to find the most relevant documents across different data repositories while respecting access rights. The application illustrates the advantages of a multi-agent software infrastructure in which the individual components of such a retrieval engine are realized by specific software agents.

Acknowledgments

We would like to thank ITDZ Berlin for their support and cooperation in realizing the pilot project.

Author information

Corresponding author

Correspondence to Erwin Gunadi.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Gunadi, E., Albayrak, S. (2015). Information Aggregation in an Enterprise. In: Hopfgartner, F. (eds) Smart Information Systems. Advances in Computer Vision and Pattern Recognition. Springer, Cham. https://doi.org/10.1007/978-3-319-14178-7_4

  • DOI: https://doi.org/10.1007/978-3-319-14178-7_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-14177-0

  • Online ISBN: 978-3-319-14178-7

  • eBook Packages: Computer Science, Computer Science (R0)
