Applied Intelligence, Volume 35, Issue 3, pp 375–398

A large-scale distributed framework for information retrieval in large dynamic search spaces

  • Eugene Santos Jr.
  • Eunice E. Santos
  • Hien Nguyen
  • Long Pan
  • John Korah


One of the main problems facing human analysts dealing with large amounts of dynamic data is that important information may not be assessed in time to aid the decision-making process. We present a novel distributed processing framework called Intelligent Foraging, Gathering and Matching (I-FGM) that addresses this problem by concentrating on resource allocation and adapting to computational needs in real time. It serves as an umbrella framework in which the various tools and techniques available in information retrieval can be used effectively and efficiently. We implement a prototype of I-FGM and validate it through both empirical studies and theoretical performance analysis.


  • Information search and retrieval
  • Distributed processing
  • Multi-agent architecture
  • Dynamic anytime processing
  • Content analysis and indexing





Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Eugene Santos Jr. (1)
  • Eunice E. Santos (2)
  • Hien Nguyen (3)
  • Long Pan (4)
  • John Korah (2)

  1. Thayer School of Engineering, Dartmouth College, Hanover, USA
  2. Department of Computer Science, University of Texas at El Paso, El Paso, USA
  3. Mathematical and Computer Sciences Department, University of Wisconsin, Whitewater, USA
  4. Department of Computer Science, Virginia Polytechnic Institute & State University, Blacksburg, USA
