Multilingual Lexicalisation and Population of Event Ontologies: A Case Study for Social Media

Towards the Multilingual Semantic Web

Abstract

We describe a semi-automatic method for ontology-driven lexical acquisition and ontology population. The method is language-independent and weakly supervised. It comprises three distributional-semantics sub-algorithms that learn semantic classes, modifiers and event patterns from an unannotated text corpus. The distributional features these algorithms use are linear contexts, extracted without any language-specific resources apart from a list of stop words, which makes the method applicable across languages and domains. To illustrate the feasibility of the approach, we learned lexicalisations of concepts from the domain of natural disasters in Spanish and English. We then populated an event micro-ontology by performing event extraction on tweets published during several major tropical storms. The evaluation showed promising precision, while event extraction recall leaves room for improvement.
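
The abstract names linear contexts as the only distributional features, with a stop-word list as the sole language-specific resource. The sketch below illustrates that general idea in Python and is not the chapter's algorithm: the function names, the tiny stop-word list, the window size and the scoring rule (number of distinct contexts shared with the seed terms) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stop-word list (the only language-specific resource).
STOP_WORDS = {"the", "a", "of", "in", "and"}


def linear_contexts(tokens, i, window=2):
    """Left and right n-grams (n = 1..window) adjacent to position i.

    Contexts made up only of stop words are skipped as uninformative.
    """
    feats = []
    for n in range(1, window + 1):
        if i - n >= 0:
            left = tokens[i - n:i]
            if not all(t in STOP_WORDS for t in left):
                feats.append("L:" + " ".join(left))
        if i + n < len(tokens):
            right = tokens[i + 1:i + 1 + n]
            if not all(t in STOP_WORDS for t in right):
                feats.append("R:" + " ".join(right))
    return feats


def rank_candidates(sentences, seeds, window=2):
    """Rank non-seed words by how many distinct contexts they share with the seeds."""
    contexts_of = defaultdict(set)
    for sentence in sentences:
        tokens = sentence.lower().split()  # whitespace tokenisation only
        for i, tok in enumerate(tokens):
            if tok not in STOP_WORDS:
                contexts_of[tok].update(linear_contexts(tokens, i, window))

    seed_feats = set()
    for seed in seeds:
        seed_feats |= contexts_of.get(seed, set())

    scores = Counter()
    for tok, feats in contexts_of.items():
        shared = len(feats & seed_feats)
        if tok not in seeds and shared:
            scores[tok] = shared
    return scores.most_common()


corpus = [
    "the hurricane destroyed many houses",
    "the typhoon destroyed many houses",
    "the storm hit the coast",
]
print(rank_candidates(corpus, {"hurricane"}))  # [('typhoon', 2)]
```

Because the sketch relies only on whitespace tokenisation and a stop-word list, the same code runs unchanged on a Spanish corpus once the stop words are swapped. A realistic system would need a much larger corpus and a weighting scheme for contexts rather than raw overlap counts.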

Author information

Corresponding author

Correspondence to Hristo Tanev.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Tanev, H., Zavarella, V. (2014). Multilingual Lexicalisation and Population of Event Ontologies: A Case Study for Social Media. In: Buitelaar, P., Cimiano, P. (eds) Towards the Multilingual Semantic Web. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43585-4_16

  • DOI: https://doi.org/10.1007/978-3-662-43585-4_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-43584-7

  • Online ISBN: 978-3-662-43585-4

  • eBook Packages: Computer Science, Computer Science (R0)
