Abstract
Keyword-based search engines often return inaccurate results: a query may return irrelevant URLs even though they contain the given keyword, and relevant URLs may be missed because they contain a synonym of the keyword rather than the keyword itself. The proposed algorithm addresses both problems by using Search Engine Result Pages (SERPs) to generate a ranked list of candidate synonyms for individual keywords; the relevance of the URLs retrieved for each synonym is verified by comparing them with the URLs retrieved for the original keyword. The technique is scalable and can be applied to online data on the dynamic, domain-independent, and unstructured World Wide Web. Candidate synonyms are ranked using co-occurrence frequencies and various page-count-based measures. Experimental results show that the proposed algorithm performs best when combined with WebJaccard.
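The abstract names WebJaccard without defining it. The sketch below uses the definition common in the page-count similarity literature (for example, the work of Bollegala, Matsuo and Ishizuka): the page count for the conjunctive query divided by the page counts for the two individual queries minus the joint count, with the score set to zero when the joint count falls below a small noise threshold. The function names, the threshold of 5, and all page counts are illustrative assumptions, not values or code from the chapter, and the chapter's algorithm may combine this measure with co-occurrence frequencies in ways not shown here.

```python
def web_jaccard(hits_p, hits_q, hits_pq, threshold=5):
    """WebJaccard similarity from search-engine page counts.

    hits_p    -- page count for query P alone
    hits_q    -- page count for query Q alone
    hits_pq   -- page count for the conjunctive query "P AND Q"
    threshold -- joint counts at or below this are treated as noise
                 (assumed value; not taken from the chapter)
    """
    if hits_pq <= threshold:
        return 0.0
    return hits_pq / (hits_p + hits_q - hits_pq)


def rank_candidates(keyword_hits, candidates):
    """Rank candidate synonyms by WebJaccard score, highest first.

    candidates maps each candidate word to a pair
    (page count for the candidate, joint page count with the keyword).
    """
    scored = [
        (word, web_jaccard(keyword_hits, hits, joint))
        for word, (hits, joint) in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical page counts for the keyword "car" (made up for illustration).
ranking = rank_candidates(
    1000,
    {
        "automobile": (400, 200),
        "vehicle": (800, 300),
        "banana": (500, 3),
    },
)
for word, score in ranking:
    print(f"{word}: {score:.3f}")
```

With these made-up counts, "banana" scores zero because its joint count is below the threshold, which shows how the cutoff filters out accidental co-occurrences before ranking.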
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Venugopal, K.R., Srikantaiah, K.C. (2020). Automatic Discovery and Ranking of Synonyms for Search Keywords in the Web. In: Web Recommendations Systems. Springer, Singapore. https://doi.org/10.1007/978-981-15-2513-1_4
DOI: https://doi.org/10.1007/978-981-15-2513-1_4
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-2512-4
Online ISBN: 978-981-15-2513-1
eBook Packages: Computer Science (R0)