A File Search Method Based on Intertask Relationships Derived from Access Frequency and RMC Operations on Files

  • Yi Wu
  • Kenichi Otagiri
  • Yousuke Watanabe
  • Haruo Yokota
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6860)

Abstract

The tremendous growth in the number of files stored in filesystems makes it increasingly difficult to find desired files. Traditional keyword-based search engines cannot retrieve files that do not contain the search keywords. To tackle this problem, we use file-access logs to derive intertask relationships for file search. Our observations are that 1) files related to the same task are frequently used together, and 2) a set of Rename, Move, and Copy (RMC) operations tends to initiate a new task. We have implemented a system named SUGOI, which detects two types of task, FI tasks and RMC tasks, from file-access logs. An FI task corresponds to a group of files frequently accessed together. An RMC task is generated by RMC operations. The system then constructs a graph of intertask relationships based on the influence of RMC operations and the similarity between tasks. By utilizing the detected tasks and intertask relationships, our system expands the search results of a keyword-based search engine. Experiments using actual file-access logs indicate that the proposed approach significantly improves search results.
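
The first observation, that files belonging to the same task tend to be accessed together, can be illustrated with a small sketch. This is not the SUGOI implementation: the log format (timestamp in seconds, file path), the fixed time window, the support threshold, and the function names `split_into_windows`, `frequent_pairs`, and `expand_results` are all assumptions made for illustration, and a simple pairwise co-occurrence count stands in for whatever grouping method the system actually uses.

```python
from collections import Counter
from itertools import combinations


def split_into_windows(log, window_seconds):
    """Group accessed paths into fixed-length time windows (an assumed rule)."""
    windows = {}
    for timestamp, path in log:
        windows.setdefault(timestamp // window_seconds, set()).add(path)
    return list(windows.values())


def frequent_pairs(windows, min_support):
    """Return file pairs that co-occur in at least min_support windows."""
    counts = Counter()
    for files in windows:
        for pair in combinations(sorted(files), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


def expand_results(hits, pairs):
    """Add files that are frequently co-accessed with any keyword hit."""
    expanded = set(hits)
    for a, b in pairs:
        if a in hits:
            expanded.add(b)
        if b in hits:
            expanded.add(a)
    return expanded


# Hypothetical access log: (seconds since start, file path).
log = [
    (0, "report.doc"), (10, "figures.xls"),
    (700, "report.doc"), (720, "figures.xls"), (730, "notes.txt"),
    (1300, "budget.xls"),
]
windows = split_into_windows(log, window_seconds=600)
pairs = frequent_pairs(windows, min_support=2)
expanded = expand_results({"report.doc"}, pairs)
print(pairs)
print(sorted(expanded))
```

In this toy log, a keyword hit on `report.doc` would also surface `figures.xls`, even if that file contains none of the search keywords, because the two files share two access windows.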

Keywords

file-access logs, desktop search, full-text search, task mining

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Yi Wu
    • 1
  • Kenichi Otagiri
    • 1
  • Yousuke Watanabe
    • 2
  • Haruo Yokota
    • 1
  1. Department of Computer Science, Tokyo Institute of Technology, Japan
  2. Global Scientific Information and Computing Center, Tokyo Institute of Technology, Japan