Key-Phrases as Means to Estimate Birth and Death Years of Jewish Text Authors

  • Dror Mughaz
  • Yaakov HaCohen-Kerner
  • Dov Gabbay
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9398)


In this study, we try to determine the time-frame in which the author of a given document lived. The documents in question are rabbinic texts written in Hebrew, Aramaic, and Yiddish. They are usually undated and contain no bibliographic section, which makes determining the time-frame a challenge. To address it, we define a set of key-phrases and formulate various types of rules (“Iron-clad”, Heuristic, and Greedy constraints) that define the time-frame. These rules are based on key-phrases and key-words found in the authors' documents. Identifying an author's time-frame can help determine the generation in which specific documents were written, can support the examination of documents (for example, to conclude whether they were edited), and can help identify an anonymous author. We tested these rules on two corpora of documents, authored by 12 and 24 rabbinic authors, respectively, and the results are promising.
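
As a rough illustration of how key-phrase hits could be turned into bounds on an author's lifetime, the sketch below applies only constraints that follow from logic alone, in the spirit of the “Iron-clad” rules named above. It is not the authors' rule set: the `Mention` schema, the `as_deceased` flag, and the function `estimate_bounds` are hypothetical names introduced here, and the Heuristic and Greedy constraints are not modelled. It also assumes the birth and death years of the mentioned authors are known.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Mention:
    """One reference to another author, tagged by a key-phrase (hypothetical schema)."""
    birth: int         # known birth year of the mentioned author
    death: int         # known death year of the mentioned author
    as_deceased: bool  # True if the key-phrase marks the mentioned author as dead


def estimate_bounds(mentions: Iterable[Mention]) -> Tuple[Optional[int], Optional[int]]:
    """Return (latest possible birth year, earliest possible death year) of the writer.

    Assumed constraints (illustrative only):
      * mentioned as deceased: the writer was still writing after that death,
        so writer_death >= mentioned.death
      * mentioned as living: the writer wrote during that person's lifetime,
        so writer_birth <= mentioned.death and writer_death >= mentioned.birth
    """
    latest_birth: Optional[int] = None
    earliest_death: Optional[int] = None

    def tighten_death(year: int) -> None:
        nonlocal earliest_death
        earliest_death = year if earliest_death is None else max(earliest_death, year)

    def tighten_birth(year: int) -> None:
        nonlocal latest_birth
        latest_birth = year if latest_birth is None else min(latest_birth, year)

    for m in mentions:
        if m.as_deceased:
            tighten_death(m.death)
        else:
            tighten_birth(m.death)
            tighten_death(m.birth)
    return latest_birth, earliest_death


# Example with made-up years: one author mentioned as deceased, one as living.
bounds = estimate_bounds([
    Mention(birth=1500, death=1570, as_deceased=True),
    Mention(birth=1540, death=1600, as_deceased=False),
])
print(bounds)
```

Each mention can only tighten a bound, so the result is independent of the order in which mentions are processed. Rules that narrow the range further, at the cost of certainty, would be layered on top of these bounds.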


Keywords: Hebrew-Aramaic documents, Key-phrases, Key-words, Knowledge discovery, Time analysis, Undated documents, Undated references



Copyright information

© Springer International Publishing Switzerland 2015

Open Access This chapter is distributed under the terms of the Creative Commons Attribution Noncommercial License, which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

Authors and Affiliations

  • Dror Mughaz (1, 2)
  • Yaakov HaCohen-Kerner (2)
  • Dov Gabbay (1, 3)

  1. Department of Computer Science, Bar-Ilan University, Ramat-Gan, Israel
  2. Department of Computer Science, Lev Academic Center, Jerusalem, Israel
  3. Department of Informatics, King's College London, London, UK
