Multilingual Lexicalisation and Population of Event Ontologies: A Case Study for Social Media

Tanev, Hristo; Zavarella, Vanni

doi:10.1007/978-3-662-43585-4_16


Abstract

We describe a semi-automatic method for ontology-driven lexical acquisition and ontology population. Our method is language independent and weakly supervised and encompasses three distributional semantics sub-algorithms to learn semantic classes, modifiers and event patterns from an unannotated text corpus. The distributional features which our algorithms use are linear contexts, extracted without any language-specific resources, apart from a list of stop words. This makes our method applicable across different languages and domains. To illustrate the feasibility of our approach, we learned lexicalizations of concepts from the domain of natural disasters in Spanish and English. Then, we populated an event micro-ontology by performing event extraction from tweets published during several big tropical storms. The evaluation showed quite promising precision, while the event extraction recall could be improved further.
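
As a rough illustration of what "linear contexts" extracted with only a stop-word list might look like, the sketch below collects the word n-grams immediately to the left and right of each term and compares two terms by their shared contexts. This is not the chapter's algorithm: the whitespace tokenisation, the window size, the tiny stop-word list, the overlap score, and the function names `linear_contexts`, `context_vectors` and `overlap` are all assumptions made for this example.

```python
from collections import Counter, defaultdict

# Hypothetical stop-word list. The abstract names a stop-word list as the only
# language-specific resource, but its contents are not given here.
STOP_WORDS = {"the", "a", "of", "in", "by", "and"}


def linear_contexts(tokens, index, window=2):
    """Return the left/right n-gram contexts (n = 1..window) adjacent to tokens[index].

    Contexts that consist only of stop words are skipped (an assumption).
    """
    feats = []
    for n in range(1, window + 1):
        left = tokens[max(0, index - n):index]
        right = tokens[index + 1:index + 1 + n]
        if len(left) == n and not all(w in STOP_WORDS for w in left):
            feats.append("L:" + " ".join(left))
        if len(right) == n and not all(w in STOP_WORDS for w in right):
            feats.append("R:" + " ".join(right))
    return feats


def context_vectors(sentences, window=2):
    """Map every non-stop-word token to a Counter of its linear contexts."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()  # naive whitespace tokenisation
        for i, tok in enumerate(tokens):
            if tok in STOP_WORDS:
                continue
            vectors[tok].update(linear_contexts(tokens, i, window))
    return vectors


def overlap(a, b):
    """Shared context features, each weighted by its smaller frequency."""
    return sum(min(a[f], b[f]) for f in a.keys() & b.keys())


corpus = [
    "roads destroyed by the storm",
    "houses destroyed by the flood",
    "the storm hit the coast",
]
vectors = context_vectors(corpus)
print(overlap(vectors["roads"], vectors["houses"]))
```

In this toy corpus "roads" and "houses" share their right-hand contexts, which is the kind of signal a distributional method could use to group them into one semantic class. A real system would need a large corpus and a more careful weighting scheme.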

Author information

Authors and Affiliations

European Commission, Joint Research Centre, Ispra, Italy
Hristo Tanev & Vanni Zavarella

Corresponding author

Correspondence to Hristo Tanev.

Editor information

Editors and Affiliations

National University of Ireland, Galway, Ireland
Paul Buitelaar
Universität Bielefeld, Bielefeld, Germany
Philipp Cimiano

About this chapter

Cite this chapter

Tanev, H., Zavarella, V. (2014). Multilingual Lexicalisation and Population of Event Ontologies: A Case Study for Social Media. In: Buitelaar, P., Cimiano, P. (eds) Towards the Multilingual Semantic Web. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43585-4_16

DOI: https://doi.org/10.1007/978-3-662-43585-4_16
Published: 19 August 2014
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-43584-7
Online ISBN: 978-3-662-43585-4
eBook Packages: Computer Science, Computer Science (R0)
