
Semantic association ranking schemes for information retrieval applications using term association graph representation

Sadhana

Abstract

Most Information Retrieval (IR) techniques represent documents using the traditional vector space and probabilistic language models, i.e., the bag-of-words model. In this paper, associations among words in the documents are assessed and expressed in a Term Association Graph model that represents the document content and the relationships among keywords. An earlier attempt at exploiting the term association graph addressed a non-personalized document re-ranking task. This paper experiments with improved non-personalized and personalized re-ranking strategies that exploit the term association graph data structure to assess the importance of a document for the user query; documents are then re-ranked according to the associations and similarities that exist among them. The paper proposes several approaches under two models, namely the Term Rank based Approach (TRA) and the Path Traversal based Approaches (PTA1, PTA2, and PTA3). These approaches employ the term association graph and have been evaluated using a manually prepared real dataset and the benchmark OHSUMED dataset. The results obtained are reasonably promising.
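The abstract does not say how term associations are measured or how terms are ranked, so the following is only an illustrative sketch of the general idea, not the authors' TRA or PTA procedures. It assumes that the association weight between two terms is the number of documents in which they co-occur, that term importance comes from a PageRank-style iteration over that graph, and that a document's score is the summed rank of its terms that are query terms or graph neighbours of query terms. The function names `build_term_graph`, `term_rank`, and `rerank` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_term_graph(docs):
    """Undirected term graph (assumption: edge weight = number of
    documents in which the two terms co-occur)."""
    graph = defaultdict(lambda: defaultdict(int))
    for doc in docs:
        terms = sorted(set(doc.lower().split()))
        for a, b in combinations(terms, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def term_rank(graph, damping=0.85, iterations=50):
    """Weighted PageRank-style power iteration over the term graph
    (an assumed stand-in for a term-ranking step)."""
    nodes = list(graph)
    if not nodes:
        return {}
    rank = {t: 1.0 / len(nodes) for t in nodes}
    # Every node in the graph has at least one edge, so totals are > 0.
    out_weight = {t: sum(graph[t].values()) for t in nodes}
    for _ in range(iterations):
        new_rank = {}
        for t in nodes:
            incoming = sum(rank[n] * w / out_weight[n]
                           for n, w in graph[t].items())
            new_rank[t] = (1 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank


def rerank(docs, query, graph, rank):
    """Order documents by the summed rank of their terms that are query
    terms or direct graph neighbours of query terms."""
    query_terms = set(query.lower().split())
    related = set(query_terms)
    for q in query_terms:
        related.update(graph.get(q, {}))

    def score(doc):
        return sum(rank.get(t, 0.0)
                   for t in set(doc.lower().split()) & related)

    return sorted(docs, key=score, reverse=True)


if __name__ == "__main__":
    docs = [
        "graph ranking retrieval",
        "graph retrieval model",
        "cooking pasta recipe",
    ]
    graph = build_term_graph(docs)
    ranks = term_rank(graph)
    for doc in rerank(docs, "retrieval", graph, ranks):
        print(doc)
```

In this toy setup a document that shares no term with the query can still score above zero if it contains terms associated with the query terms in the graph, which is the behaviour a bag-of-words match alone would miss.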



Acknowledgements

The work presented in this paper was supported and funded by the Department of Science and Technology (DST), Ministry of Science and Technology, Government of India, under the INSPIRE scheme. The authors thank the DST and the anonymous reviewers for their helpful comments.

Author information


Corresponding author

Correspondence to K VENINGSTON.


Cite this article

VENINGSTON, K., SHANMUGALAKSHMI, R. & NIRMALA, V. Semantic association ranking schemes for information retrieval applications using term association graph representation. Sadhana 40, 1793–1819 (2015). https://doi.org/10.1007/s12046-015-0413-3


  • DOI: https://doi.org/10.1007/s12046-015-0413-3
