Abstract
I analyze Berners-Lee, Hendler, and Lassila’s description of the Semantic Web, discussing what it implies for a Multilingual Semantic Web and the barriers that the nature of language itself puts in the way of that vision. Issues raised include the mismatch between natural language lexicons and hierarchical ontologies, the limitations of a purely writer-centered view of meaning, and the benefits of a reader-centered view. I then discuss how we can start to overcome these barriers by taking a different view of the problem and considering distributional models of semantics in place of purely symbolic models.
Notes
- 1.
I will refer to these authors, and metonymously to this paper, as BLHL.
- 2.
- 3.
Thanks to Kazuko Nakajima for the translation of this text from Japanese.
- 4.
For example, contemporary researchers in biodiversity have trouble searching the legacy literature in the field because diachronic changes both in the terminology and in the conceptual understanding of the domain result in there being no shared ontologies. “Even competent and well-intentioned researchers often have difficulties searching this literature. Simple Google-style keyword searches are frequently insufficient, because in this literature, more so perhaps than most other fields of science, related concepts are often described or explained in different terms, or in completely different conceptual frameworks, from those of contemporary research. As a result, interesting and beneficial relations with legacy publications, or even with whole literatures, may remain hidden to term-based methods” (Hirst et al. 2013).
- 5.
The collaborative annotation of a Semantic Web page with semantic interpretations generated by software agents that are beyond the control of its author raises many issues that are outside the scope of this chapter. The annotations might be objectionable to the author or counterproductive to his or her goals; they could be willfully misleading or outright vandalism. While these issues may also arise with the present-day public tagging or bookmarking of sites by users (Breslin et al. 2009), their scale is greatly magnified when the annotations become a central part of the Semantic Web retrieval mechanism rather than merely some user’s advisory opinion.
References
Adamska-Sałaciak, A. (2013). Equivalence, synonymy, and sameness of meaning in a bilingual dictionary. International Journal of Lexicography, 26(3), 329–345. doi:10.1093/ijl/ect016.
Baroni, M., Bernardi, R., & Zamparelli, R. (2014). Frege in space: A program for compositional distributional semantics. Linguistic Issues in Language Technology, 9(6) (February 2014).
Berners-Lee, T. (2009). The next Web. In TED Conference, Long Beach, CA. www.ted.com/talks/tim_berners_lee_on_the_next_web.html.
Berners-Lee, T., Hendler, J., & Lassila, O. (2001, May). The Semantic Web. Scientific American, 284(5), 34–43.
Borin, L. (2012). Core vocabulary: A useful but mystical concept in some kinds of linguistics. In D. Santos, K. Lindén, & W. Ng’ang’a (Eds.), Shall we play the Festschrift game? (pp. 53–65). Berlin: Springer. doi:10.1007/978-3-642-30773-7_6.
Breslin, J. G., Passant, A., & Decker, S. (2009). The social Semantic Web. Berlin: Springer. doi:10.1007/978-3-642-01172-6.
Cimiano, P., Buitelaar, P., McCrae, J., & Sintek, M. (2011). LexInfo: A declarative model for the lexicon–ontology interface. Journal of Web Semantics, 9(1), 29–51. doi:10.1016/j.websem.2010.11.001.
Cimiano, P., Unger, C., & McCrae, J. (2014). Ontology-based interpretation of natural language. San Rafael: Morgan & Claypool Publishers.
Clarke, D. (2012). A context-theoretic framework for compositionality in distributional semantics. Computational Linguistics, 38(1), 41–71. doi:10.1162/COLI_a_00084.
Edmonds, P., & Hirst, G. (2002). Near-synonymy and lexical choice. Computational Linguistics, 28(2), 105–144. doi:10.1162/089120102760173625.
Erk, K. (2012). Vector space models of word meaning and phrase meaning: A survey. Language and Linguistics Compass, 6(10), 635–653. doi:10.1002/lnco.362.
Erk, K. (2013). Towards a semantics for distributional representations. In Proceedings, 10th International Conference on Computational Semantics (IWCS-2013), Potsdam. www.aclweb.org/anthology/W13-0109.
Farrell, R. B. (1977). German synonyms. Cambridge: Cambridge University Press.
Fish, S. (1980). Is there a text in this class? The authority of interpretive communities. Cambridge: Harvard University Press.
Freitas, A., O’Riain, S., & Curry, E. (2013). A distributional semantic search infrastructure for linked dataspaces. In The Semantic Web: ESWC 2013 Satellite Events. Lecture Notes in Computer Science (Vol. 7955, pp. 214–218). Berlin: Springer. doi:10.1007/978-3-642-41242-4_27.
Friedman, J., Moran, D. B., & Warren, D. S. (1978). Explicit finite intensional models for PTQ. American Journal of Computational Linguistics, microfiche 74, 3–22. www.aclweb.org/anthology/J79-1074.
Fujiwara, Y., Isogai, H., & Muroyama, T. (1985). Hyogen Ruigo Jiten. Tokyo: Tokyodo Publishing.
Gracia, J., Montiel-Ponsoda, E., Cimiano, P., Gómez-Pérez, A., Buitelaar, P., & McCrae, J. (2012). Challenges for the multilingual Web of data. Journal of Web Semantics, 11, 63–71. doi:10.1016/j.websem.2011.09.001.
Hirst, G. (2007). Views of text-meaning in computational linguistics: Past, present, and future. In G. Dodig Crnkovic & S. Stuart (Eds.), Computation, information, cognition — The Nexus and the Liminal (pp. 270–279). Newcastle: Cambridge Scholars Publishing. ftp.cs.toronto.edu/pub/gh/Hirst-ECAPbook-2007.pdf.
Hirst, G. (2008). The future of text-meaning in computational linguistics. In P. Sojka, A. Horák, I. Kopeček, & K. Pala (Eds.), Proceedings, 11th International Conference on Text, Speech and Dialogue (TSD 2008). Lecture Notes in Artificial Intelligence (Vol. 5246, pp. 1–9). Berlin: Springer. doi:10.1007/978-3-540-87391-4_1.
Hirst, G. (2009a). Ontology and the lexicon. In S. Staab & R. Studer (Eds.), Handbook on ontologies. International Handbooks on Information Systems (2nd ed., pp. 269–292). Berlin: Springer. doi:10.1007/978-3-540-92673-3_12.
Hirst, G. (2009b, July). Limitations of the philosophy of language understanding implicit in computational linguistics. Proceedings, 7th European Conference on Computing and Philosophy, Barcelona (pp. 108–109). ftp.cs.toronto.edu/pub/gh/Hirst-ECAP-2009.pdf.
Hirst, G. (2013). Computational linguistics. In K. Allan (Ed.), The Oxford handbook of the history of linguistics. Oxford: Oxford University Press.
Hirst, G., & Mohammad, S. (2011). Semantic distance measures with distributional profiles of coarse-grained concepts. In A. Mehler, K. U. Kühnberger, H. Lobin, H. Lüngen, A. Storrer, & A. Witt (Eds.), Modeling, learning, and processing of text technological data structures. Studies in Computational Intelligence Series (Vol. 370, pp. 61–79). Berlin: Springer. doi:10.1007/978-3-642-22613-7_4.
Hirst, G., & Ryan, M. (1992). Mixed-depth representations for natural language text. In P. S. Jacobs (Ed.), Text-based intelligent systems (pp. 59–82). Hillsdale, NJ: Lawrence Erlbaum Associates. ftp.cs.toronto.edu/pub/gh/Hirst+Ryan-92.pdf.
Hirst, G., Talent, N., & Scharf, S. (2013, 27 May). Detecting semantic overlap and discovering precedents in the biodiversity research literature. In Proceedings of the First International Workshop on Semantics for Biodiversity (S4BioDiv) (CEUR Workshop Proceedings, Vol. 979), 10th Extended Semantic Web Conference (ESWC-2013), Montpellier, France. ceur-ws.org/Vol-979/.
Hjelmslev, L. (1961). Prolegomena to a theory of language (rev. ed.). (F. J. Whitfield, Trans.). Madison: University of Wisconsin Press. (Originally published as Omkring sprogteoriens grundlæggelse, 1943.)
Inkpen, D., & Hirst, G. (2006). Building and using a lexical knowledge-base of near-synonym differences. Computational Linguistics, 32(2), 223–262. www.aclweb.org/anthology/J06-2003.
Kennedy, A., & Hirst, G. (2012, December). Measuring semantic relatedness across languages. In Proceedings, xLiTe: Cross-Lingual Technologies Workshop at the Neural Information Processing Systems Conference, Lake Tahoe, NV. ftp.cs.toronto.edu/pub/gh/Hirst-ECAP-2009.pdf.
McCrae, J., Aguado-de-Cea, G., Buitelaar, P., Cimiano, P., Declerck, T., Gómez-Pérez, A., et al. (2012). Interchanging lexical resources on the Semantic Web. Language Resources and Evaluation, 46(4), 701–719. doi:10.1007/s10579-012-9182-3.
Mitchell, J., & Lapata, M. (2010). Composition in distributional models of semantics. Cognitive Science, 34(8), 1388–1429. doi:10.1111/j.1551-6709.2010.01106.x.
Mohammad, S., Gurevych, I., Hirst, G., & Zesch, T. (2007). Cross-lingual distributional profiles of concepts for measuring semantic distance. In 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2007), Prague (pp. 571–580). www.aclweb.org/anthology/D07-1060.
Mohammad, S., & Hirst, G. (2006, July). Distributional measures of concept-distance: A task-oriented evaluation. In Proceedings, 2006 Conference on Empirical Methods in Natural Language Processing (EMNLP 2006), Sydney, Australia (pp. 35–43). www.aclweb.org/anthology/W06-1605.
Montague, R. (1974). Formal philosophy. New Haven: Yale University Press.
Nováček, V., Handschuh, S., & Decker, S. (2011). Getting the meaning right: A complementary distributional layer for the web semantics. In Proceedings, 10th International Semantic Web Conference (ISWC-2011) (Vol. 1, pp. 504–519). Lecture Notes in Computer Science, Vol. 7031. Berlin: Springer. doi:10.1007/978-3-642-25073-6_32.
Reddy, M. J. (1979). The conduit metaphor: A case of frame conflict in our language about language. In A. Ortony (Ed.), Metaphor and thought (pp. 284–324). Oxford: Oxford University Press. [Reprinted unchanged in the second edition, 1993, pp. 164–201.]
Schogt, H. G. (1976). Sémantique synchronique: synonymie, homonymie, polysémie. Toronto: University of Toronto Press.
Schogt, H. G. (1988). Linguistics, literary analysis, and literary translation. Toronto: University of Toronto Press.
Turney, P. D., & Pantel, P. (2010). From frequency to meaning: Vector space models of semantics. Journal of Artificial Intelligence Research, 37, 141–188. doi:10.1613/jair.2934.
Wilks, Y. (2009). Ontotherapy, or how to stop worrying about what there is. In N. Nicolov, G. Angelova, & R. Mitkov (Eds.), Recent advances in natural language processing V (pp. 1–20). Amsterdam: John Benjamins.
Acknowledgements
This work was supported financially by the Natural Sciences and Engineering Research Council of Canada. For helpful comments, I am grateful to Lars Borin, Philipp Cimiano, Nadia Talent, the anonymous reviewers, and the participants of the Dagstuhl Seminar on the Multilingual Semantic Web.
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Hirst, G. (2014). Overcoming Linguistic Barriers to the Multilingual Semantic Web. In: Buitelaar, P., Cimiano, P. (eds) Towards the Multilingual Semantic Web. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43585-4_1
DOI: https://doi.org/10.1007/978-3-662-43585-4_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-43584-7
Online ISBN: 978-3-662-43585-4
eBook Packages: Computer Science (R0)