Extracting Personal Concepts from Users’ Emails to Initialize Their Personal Information Models

  • Sven Schwarz
  • Frank Marmann
  • Heiko Maus
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6882)


Although the Semantic Desktop paradigm has great potential, new users face a cold-start problem. Starting with empty models is a barrier to adopting any semantic technology, and filling them with generally known concepts does not work for personal models. We propose to analyze a user's email database and extract concepts of multiple types to fill the empty Personal Information Model (PIMO). The paper presents results of the research project Semopad, funded by the Stiftung Rheinland-Pfalz für Innovation under contract no. 961-386261/1001.


Noun Phrase, Computational Linguistics, Concept Type, Semantic Technology, Personal Concept





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Sven Schwarz (1)
  • Frank Marmann (2)
  • Heiko Maus (1)

  1. Deutsches Forschungszentrum für Künstliche Intelligenz GmbH, Kaiserslautern, Germany
  2. Capgemini, Düsseldorf, Germany
