Abstract
We present a learning-to-rank approach to classifying folktales, such as fairy tales and urban legends, according to their story type, a concept widely used by folktale researchers to organize and classify folktales. A story type groups similar stories, often with recurring plots and themes. Our work is guided by two frequently used story type classification schemes. Unlike in most information retrieval problems, the relevant notion of text similarity here goes beyond topical similarity. We experiment with approaches inspired by distributed information retrieval and with features that compare subject-verb-object triplets. Our system proved highly effective compared with a baseline system.
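To make the idea of comparing subject-verb-object triplets concrete, the following is a minimal sketch rather than the authors' method. It assumes the triplets have already been extracted and lemmatized by some parser, and it uses plain Jaccard overlap as a single hand-written ranking feature. The names `triplet_jaccard` and `rank_story_types` and the toy data are hypothetical, and a learning-to-rank model would combine many such features instead of sorting by one.

```python
from typing import Dict, Iterable, List, Tuple

# (subject, verb, object); assumed to be extracted and lemmatized beforehand.
Triplet = Tuple[str, str, str]


def triplet_jaccard(a: Iterable[Triplet], b: Iterable[Triplet]) -> float:
    """Jaccard overlap between two collections of triplets (0.0 if both are empty)."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


def rank_story_types(
    query: Iterable[Triplet],
    story_types: Dict[str, List[Triplet]],
) -> List[str]:
    """Order story type labels by triplet overlap with the query story, best first."""
    query_set = set(query)
    scores = {
        label: triplet_jaccard(query_set, triplets)
        for label, triplets in story_types.items()
    }
    return sorted(scores, key=lambda label: scores[label], reverse=True)


# Toy, hand-made data for illustration only.
query_story = [("wolf", "eat", "grandmother"), ("girl", "visit", "grandmother")]
candidates = {
    "ATU 510A": [("prince", "find", "slipper")],
    "ATU 333": [("wolf", "eat", "grandmother"), ("hunter", "kill", "wolf")],
}
ranking = rank_story_types(query_story, candidates)
```

With this toy data, the story type that shares a triplet with the query is ranked ahead of the one that shares none. Exact matching on whole triplets is a deliberate simplification; matching on partial triplets or on verb classes would be a natural refinement.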
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nguyen, D., Trieschnigg, D., Theune, M. (2013). Folktale Classification Using Learning to Rank. In: Serdyukov, P., et al. Advances in Information Retrieval. ECIR 2013. Lecture Notes in Computer Science, vol 7814. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36973-5_17
DOI: https://doi.org/10.1007/978-3-642-36973-5_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36972-8
Online ISBN: 978-3-642-36973-5
eBook Packages: Computer Science, Computer Science (R0)