Extracting Personal Concepts from Users’ Emails to Initialize Their Personal Information Models

  • Sven Schwarz
  • Frank Marmann
  • Heiko Maus
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6882)


Although the Semantic Desktop paradigm has great potential, new users face the cold-start problem. Having to start with empty models is a barrier to any semantic technology, and filling them with generally known concepts does not work for personal models. We propose to analyze the email database of a user and extract concepts of multiple types to fill the empty Personal Information Model (PIMO). The paper presents results of the research project Semopad funded by the Stiftung Rheinland-Pfalz für Innovation under contract no. 961-386261/1001.





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Sven Schwarz (1)
  • Frank Marmann (2)
  • Heiko Maus (1)

  1. Deutsches Forschungszentrum für Künstliche Intelligenz GmbH, Kaiserslautern, Germany
  2. Capgemini, Düsseldorf, Germany
