Abstract
As the number of news publications increases each day, so does the need for effective search algorithms. Simple word-based approaches are inherently limited because they ignore much of the information in natural language. In this paper we propose a linguistic approach called Destiny, which utilizes this information to improve search results. The major difference from approaches that represent text as a bag-of-words is that Destiny represents sentences as graphs, with words as nodes and the grammatical relations between words as edges. The proposed algorithm is evaluated on a custom corpus of user-rated sentences and, compared to a TF-IDF baseline, performs significantly better in terms of Mean Average Precision, normalized Discounted Cumulative Gain, and Spearman’s Rho.
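To make the contrast between a bag-of-words and a sentence graph concrete, the following is a minimal sketch and not the Destiny algorithm itself. It assumes the grammatical relations are already available as hand-written triples (no parser is invoked), and the names `Edge`, `SentenceGraph`, `word_overlap`, and `edge_overlap`, the relation labels `nsubj` and `dobj`, and the Jaccard score are all illustrative assumptions rather than details taken from the paper.

```python
from typing import Iterable, NamedTuple


class Edge(NamedTuple):
    """One grammatical relation between two words (hypothetical structure)."""
    head: str       # governing word
    relation: str   # relation label, e.g. "nsubj" (assumed label set)
    dependent: str  # dependent word


class SentenceGraph:
    """A sentence as a graph: words are nodes, grammatical relations are edges."""

    def __init__(self, edges: Iterable[Edge]) -> None:
        self.edges = frozenset(edges)
        # Every word that takes part in at least one relation becomes a node.
        self.nodes = frozenset(
            word for e in self.edges for word in (e.head, e.dependent)
        )


def jaccard(a: frozenset, b: frozenset) -> float:
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def word_overlap(a: SentenceGraph, b: SentenceGraph) -> float:
    """Bag-of-words style comparison: only the words themselves are compared."""
    return jaccard(a.nodes, b.nodes)


def edge_overlap(a: SentenceGraph, b: SentenceGraph) -> float:
    """Graph style comparison: the labeled relations between words are compared."""
    return jaccard(a.edges, b.edges)


# "dog bites man" and "man bites dog" use the same words in different roles.
dog_bites_man = SentenceGraph([
    Edge("bites", "nsubj", "dog"),
    Edge("bites", "dobj", "man"),
])
man_bites_dog = SentenceGraph([
    Edge("bites", "nsubj", "man"),
    Edge("bites", "dobj", "dog"),
])

print(word_overlap(dog_bites_man, man_bites_dog))
print(edge_overlap(dog_bites_man, man_bites_dog))
```

In this toy case the word-level score treats the two sentences as identical, while the edge-level score separates them because no labeled relation is shared. A real system would obtain the relations from a parser and would need a more tolerant graph comparison than exact edge matching.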
Acknowledgment
The authors are partially supported by the Dutch national program COMMIT.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Schouten, K., Frasincar, F. (2016). Web News Sentence Searching Using Linguistic Graph Similarity. In: Arnicans, G., Arnicane, V., Borzovs, J., Niedrite, L. (eds) Databases and Information Systems. DB&IS 2016. Communications in Computer and Information Science, vol 615. Springer, Cham. https://doi.org/10.1007/978-3-319-40180-5_22
DOI: https://doi.org/10.1007/978-3-319-40180-5_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-40179-9
Online ISBN: 978-3-319-40180-5
eBook Packages: Computer Science, Computer Science (R0)