Abstract
We describe a semi-automatic method for ontology-driven lexical acquisition and ontology population. Our method is language-independent and weakly supervised; it comprises three distributional-semantics sub-algorithms that learn semantic classes, modifiers, and event patterns from an unannotated text corpus. The distributional features our algorithms use are linear contexts, extracted without any language-specific resources apart from a list of stop words. This makes our method applicable across different languages and domains. To illustrate the feasibility of our approach, we learned lexicalizations of concepts from the domain of natural disasters in Spanish and English. We then populated an event micro-ontology by performing event extraction on tweets published during several major tropical storms. The evaluation showed promising precision, while event extraction recall leaves room for improvement.
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Tanev, H., Zavarella, V. (2014). Multilingual Lexicalisation and Population of Event Ontologies: A Case Study for Social Media. In: Buitelaar, P., Cimiano, P. (eds) Towards the Multilingual Semantic Web. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43585-4_16
DOI: https://doi.org/10.1007/978-3-662-43585-4_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-43584-7
Online ISBN: 978-3-662-43585-4
eBook Packages: Computer Science, Computer Science (R0)