Overcoming Linguistic Barriers to the Multilingual Semantic Web

Hirst, Graeme

doi:10.1007/978-3-662-43585-4_1

Abstract

I analyze Berners-Lee, Hendler, and Lassila’s description of the Semantic Web, discussing what it implies for a Multilingual Semantic Web and the barriers that the nature of language itself puts in the way of that vision. Issues raised include the mismatch between natural language lexicons and hierarchical ontologies, the limitations of a purely writer-centered view of meaning, and the benefits of a reader-centered view. I then discuss how we can start to overcome these barriers by taking a different view of the problem and considering distributional models of semantics in place of purely symbolic models.

Notes

1. I will refer to these authors, and metonymously to this paper, as BLHL.
2. Wilks (2009), echoed by Borin (2012), suggests that, a fortiori, “ontologies” as presently constructed are nothing more than substandard lexicons disguised as something different.
3. Thanks to Kazuko Nakajima for the translation of this text from Japanese.
4. For example, contemporary researchers in biodiversity have trouble searching the legacy literature in the field because diachronic changes both in the terminology and in the conceptual understanding of the domain result in there being no shared ontologies. “Even competent and well-intentioned researchers often have difficulties searching this literature. Simple Google-style keyword searches are frequently insufficient, because in this literature, more so perhaps than most other fields of science, related concepts are often described or explained in different terms, or in completely different conceptual frameworks, from those of contemporary research. As a result, interesting and beneficial relations with legacy publications, or even with whole literatures, may remain hidden to term-based methods” (Hirst et al. 2013).
5. The collaborative annotation of a Semantic Web page with semantic interpretations generated by software agents that are beyond the control of its author raises many issues that are outside the scope of this chapter. The annotations might be objectionable to the author or counterproductive to his or her goals; they could be willfully misleading or outright vandalism. While these issues may also arise with the present-day public tagging or bookmarking of sites by users (Breslin et al. 2009), their scale is greatly magnified when the annotations become a central part of the Semantic Web retrieval mechanism rather than merely some user’s advisory opinion.

References

Adamska-Sałaciak, A. (2013). Equivalence, synonymy, and sameness of meaning in a bilingual dictionary. International Journal of Lexicography, 26(3), 329–345. doi:10.1093/ijl/ect016.
Baroni, M., Bernardi, R., & Zamparelli, R. (2014). Frege in space: A program for compositional distributional semantics. Linguistic Issues in Language Technology, 9(6) (February 2014).
Berners-Lee, T. (2009). The next Web. In TED Conference, Long Beach, CA. www.ted.com/talks/tim_berners_lee_on_the_next_web.html.
Berners-Lee, T., Hendler, J., & Lassila, O. (2001, May). The Semantic Web. Scientific American, 284(5), 34–43.
Borin, L. (2012). Core vocabulary: A useful but mystical concept in some kinds of linguistics. In D. Santos, K. Lindén, & W. Ng’ang’a (Eds.), Shall we play the Festschrift game? (pp. 53–65). Berlin: Springer. doi:10.1007/978-3-642-30773-7_6.
Breslin, J. G., Passant, A., & Decker, S. (2009). The social Semantic Web. Berlin: Springer. doi:10.1007/978-3-642-01172-6.
Cimiano, P., Buitelaar, P., McCrae, J., & Sintek, M. (2011). LexInfo: A declarative model for the lexicon–ontology interface. Journal of Web Semantics, 9(1), 29–51. doi:10.1016/j.websem.2010.11.001.
Cimiano, P., Unger, C., & McCrae, J. (2014). Ontology-based interpretation of natural language. San Rafael: Morgan & Claypool Publishers.
Clarke, D. (2012). A context-theoretic framework for compositionality in distributional semantics. Computational Linguistics, 38(1), 41–71. doi:10.1162/COLI_a_00084.
Edmonds, P., & Hirst, G. (2002). Near-synonymy and lexical choice. Computational Linguistics, 28(2), 105–144. doi:10.1162/089120102760173625.
Erk, K. (2012). Vector space models of word meaning and phrase meaning: A survey. Language and Linguistics Compass, 6(10), 635–653. doi:10.1002/lnco.362.
Erk, K. (2013). Towards a semantics for distributional representations. In Proceedings, 10th International Conference on Computational Semantics (IWCS-2013), Potsdam. www.aclweb.org/anthology/W13-0109.
Farrell, R. B. (1977). German synonyms. Cambridge: Cambridge University Press.
Fish, S. (1980). Is there a text in this class? The authority of interpretive communities. Cambridge: Harvard University Press.
Freitas, A., O’Riain, S., & Curry, E. (2013). A distributional semantic search infrastructure for linked dataspaces. In The Semantic Web: ESWC 2013 Satellite Events. Lecture Notes in Computer Science (Vol. 7955, pp. 214–218). Berlin: Springer. doi:10.1007/978-3-642-41242-4_27.
Friedman, J., Moran, D. B., & Warren, D. S. (1978). Explicit finite intensional models for PTQ. American Journal of Computational Linguistics, microfiche 74, 3–22. www.aclweb.org/anthology/J79-1074.
Fujiwara, Y., Isogai, H., & Muroyama, T. (1985). Hyogen Ruigo Jiten. Tokyo: Tokyodo Publishing.
Gracia, J., Montiel-Ponsoda, E., Cimiano, P., Gómez-Pérez, A., Buitelaar, P., & McCrae, J. (2012). Challenges for the multilingual Web of data. Journal of Web Semantics, 11, 63–71. doi:10.1016/j.websem.2011.09.001.
Hirst, G. (2007). Views of text-meaning in computational linguistics: Past, present, and future. In G. Dodig Crnkovic & S. Stuart (Eds.), Computation, information, cognition — The Nexus and the Liminal (pp. 270–279). Newcastle: Cambridge Scholars Publishing. ftp.cs.toronto.edu/pub/gh/Hirst-ECAPbook-2007.pdf.
Hirst, G. (2008). The future of text-meaning in computational linguistics. In P. Sojka, A. Horák, I. Kopeček, & K. Pala (Eds.), Proceedings, 11th International Conference on Text, Speech and Dialogue (TSD 2008). Lecture Notes in Artificial Intelligence (Vol. 5246, pp. 1–9). Berlin: Springer. doi:10.1007/978-3-540-87391-4_1.
Hirst, G. (2009a). Ontology and the lexicon. In S. Staab & R. Studer (Eds.), Handbook on ontologies. International Handbooks on Information Systems (2nd ed., pp. 269–292). Berlin: Springer. doi:10.1007/978-3-540-92673-3_12.
Hirst, G. (2009b, July). Limitations of the philosophy of language understanding implicit in computational linguistics. Proceedings, 7th European Conference on Computing and Philosophy, Barcelona (pp. 108–109). ftp.cs.toronto.edu/pub/gh/Hirst-ECAP-2009.pdf.
Hirst, G. (2013). Computational linguistics. In K. Allan (Ed.), The Oxford handbook of the history of linguistics. Oxford: Oxford University Press.
Hirst, G., & Mohammad, S. (2011). Semantic distance measures with distributional profiles of coarse-grained concepts. In A. Mehler, K. U. Kühnberger, H. Lobin, H. Lüngen, A. Storrer, & A. Witt (Eds.), Modeling, learning, and processing of text technological data structures. Studies in Computational Intelligence Series (Vol. 370, pp. 61–79). Berlin: Springer. doi:10.1007/978-3-642-22613-7_4.
Hirst, G., & Ryan, M. (1992). Mixed-depth representations for natural language text. In P. S. Jacobs (Ed.), Text-based intelligent systems (pp. 59–82). Hillsdale, NJ: Lawrence Erlbaum Associates. ftp.cs.toronto.edu/pub/gh/Hirst+Ryan-92.pdf.
Hirst, G., Talent, N., & Scharf, S. (2013, 27 May). Detecting semantic overlap and discovering precedents in the biodiversity research literature. In Proceedings of the First International Workshop on Semantics for Biodiversity (S4BioDiv) (CEUR Workshop Proceedings, Vol. 979), 10th Extended Semantic Web Conference (ESWC-2013), Montpellier, France. ceur-ws.org/Vol-979/.
Hjelmslev, L. (1961). Prolegomena to a theory of language (rev. ed.). (F. J. Whitfield, Trans.). Madison: University of Wisconsin Press. (Originally published as Omkring sprogteoriens grundlæggelse, 1943.)
Inkpen, D., & Hirst, G. (2006). Building and using a lexical knowledge-base of near-synonym differences. Computational Linguistics, 32(2), 223–262. www.aclweb.org/anthology/J06-2003.
Kennedy, A., & Hirst, G. (2012, December). Measuring semantic relatedness across languages. In Proceedings, xLiTe: Cross-Lingual Technologies Workshop at the Neural Information Processing Systems Conference, Lake Tahoe, NV. ftp.cs.toronto.edu/pub/gh/Hirst-ECAP-2009.pdf.
McCrae, J., Aguado-de-Cea, G., Buitelaar, P., Cimiano, P., Declerck, T., Gómez-Pérez, A., et al. (2012). Interchanging lexical resources on the Semantic Web. Language Resources and Evaluation, 46(4), 701–719. doi:10.1007/s10579-012-9182-3.
Mitchell, J., & Lapata, M. (2010). Composition in distributional models of semantics. Cognitive Science, 34(8), 1388–1429. doi:10.1111/j.1551-6709.2010.01106.x.
Mohammad, S., Gurevych, I., Hirst, G., & Zesch, T. (2007). Cross-lingual distributional profiles of concepts for measuring semantic distance. In 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2007), Prague (pp. 571–580). www.aclweb.org/anthology/D07-1060.
Mohammad, S., & Hirst, G. (2006, July). Distributional measures of concept-distance: A task-oriented evaluation. In Proceedings, 2006 Conference on Empirical Methods in Natural Language Processing (EMNLP 2006), Sydney, Australia (pp. 35–43). www.aclweb.org/anthology/W06-1605.
Montague, R. (1974). Formal philosophy. New Haven: Yale University Press.
Nováček, V., Handschuh, S., & Decker, S. (2011). Getting the meaning right: A complementary distributional layer for the web semantics. In Proceedings, 10th International Semantic Web Conference (ISWC-2011) (Vol. 1, pp. 504–519). Lecture Notes in Computer Science, Vol. 7031. Berlin: Springer. doi:10.1007/978-3-642-25073-6_32.
Reddy, M. J. (1979). The conduit metaphor: A case of frame conflict in our language about language. In A. Ortony (Ed.), Metaphor and thought (pp. 284–324). Oxford: Oxford University Press. [Reprinted unchanged in the second edition, 1993, pp. 164–201.]
Schogt, H. G. (1976). Sémantique synchronique: synonymie, homonymie, polysémie. Toronto: University of Toronto Press.
Schogt, H. G. (1988). Linguistics, literary analysis, and literary translation. Toronto: University of Toronto Press.
Turney, P. D., & Pantel, P. (2010). From frequency to meaning: Vector space models of semantics. Journal of Artificial Intelligence Research, 37, 141–188. doi:10.1613/jair.2934.
Wilks, Y. (2009). Ontotherapy, or how to stop worrying about what there is. In N. Nicolov, G. Angelova, & R. Mitkov (Eds.), Recent advances in natural language processing V (pp. 1–20). Amsterdam: John Benjamins.

Acknowledgements

This work was supported financially by the Natural Sciences and Engineering Research Council of Canada. For helpful comments, I am grateful to Lars Borin, Philipp Cimiano, Nadia Talent, the anonymous reviewers, and the participants of the Dagstuhl Seminar on the Multilingual Semantic Web.

Author information

Authors and Affiliations

Department of Computer Science, University of Toronto, Toronto, ON, Canada, M5S 3G4
Graeme Hirst

Corresponding author

Correspondence to Graeme Hirst.

Editor information

Editors and Affiliations

National University of Ireland, Galway, Ireland
Paul Buitelaar
Universität Bielefeld, Bielefeld, Germany
Philipp Cimiano

About this chapter

Cite this chapter

Hirst, G. (2014). Overcoming Linguistic Barriers to the Multilingual Semantic Web. In: Buitelaar, P., Cimiano, P. (eds) Towards the Multilingual Semantic Web. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43585-4_1

DOI: https://doi.org/10.1007/978-3-662-43585-4_1
Published: 19 August 2014
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-43584-7
Online ISBN: 978-3-662-43585-4
eBook Packages: Computer Science, Computer Science (R0)