Overcoming Linguistic Barriers to the Multilingual Semantic Web

Chapter in: Towards the Multilingual Semantic Web

Abstract

I analyze Berners-Lee, Hendler, and Lassila’s description of the Semantic Web, discussing what it implies for a Multilingual Semantic Web and the barriers that the nature of language itself puts in the way of that vision. Issues raised include the mismatch between natural language lexicons and hierarchical ontologies, the limitations of a purely writer-centered view of meaning, and the benefits of a reader-centered view. I then discuss how we can start to overcome these barriers by taking a different view of the problem and considering distributional models of semantics in place of purely symbolic models.

Notes

  1. I will refer to these authors, and metonymously to this paper, as BLHL.

  2. Wilks (2009), echoed by Borin (2012), suggests that, a fortiori, “ontologies” as presently constructed are nothing more than substandard lexicons disguised as something different.

  3. Thanks to Kazuko Nakajima for the translation of this text from Japanese.

  4. For example, contemporary researchers in biodiversity have trouble searching the legacy literature in the field because diachronic changes both in the terminology and in the conceptual understanding of the domain result in there being no shared ontologies. “Even competent and well-intentioned researchers often have difficulties searching this literature. Simple Google-style keyword searches are frequently insufficient, because in this literature, more so perhaps than most other fields of science, related concepts are often described or explained in different terms, or in completely different conceptual frameworks, from those of contemporary research. As a result, interesting and beneficial relations with legacy publications, or even with whole literatures, may remain hidden to term-based methods” (Hirst et al. 2013).

  5. The collaborative annotation of a Semantic Web page with semantic interpretations generated by software agents that are beyond the control of its author raises many issues that are outside the scope of this chapter. The annotations might be objectionable to the author or counterproductive to his or her goals; they could be willfully misleading or outright vandalism. While these issues may also arise with the present-day public tagging or bookmarking of sites by users (Breslin et al. 2009), their scale is greatly magnified when the annotations become a central part of the Semantic Web retrieval mechanism rather than merely some user’s advisory opinion.

References

  • Adamska-Sałaciak, A. (2013). Equivalence, synonymy, and sameness of meaning in a bilingual dictionary. International Journal of Lexicography, 26(3), 329–345. doi:10.1093/ijl/ect016.

  • Baroni, M., Bernardi, R., & Zamparelli, R. (2014). Frege in space: A program for compositional distributional semantics. Linguistic Issues in Language Technology, 9(6) (February 2014).

  • Berners-Lee, T. (2009). The next Web. In TED Conference, Long Beach, CA. www.ted.com/talks/tim_berners_lee_on_the_next_web.html.

  • Berners-Lee, T., Hendler, J., & Lassila, O. (2001, May). The Semantic Web. Scientific American, 284(5), 34–43.

  • Borin, L. (2012). Core vocabulary: A useful but mystical concept in some kinds of linguistics. In D. Santos, K. Lindén, & W. Ng’ang’a (Eds.), Shall we play the Festschrift game? (pp. 53–65). Berlin: Springer. doi:10.1007/978-3-642-30773-7_6.

  • Breslin, J. G., Passant, A., & Decker, S. (2009). The social Semantic Web. Berlin: Springer. doi:10.1007/978-3-642-01172-6.

  • Cimiano, P., Buitelaar, P., McCrae, J., & Sintek, M. (2011). LexInfo: A declarative model for the lexicon–ontology interface. Journal of Web Semantics, 9(1), 29–51. doi:10.1016/j.websem.2010.11.001.

  • Cimiano, P., Unger, C., & McCrae, J. (2014). Ontology-based interpretation of natural language. San Rafael: Morgan & Claypool Publishers.

  • Clarke, D. (2012). A context-theoretic framework for compositionality in distributional semantics. Computational Linguistics, 38(1), 41–71. doi:10.1162/COLI_a_00084.

  • Edmonds, P., & Hirst, G. (2002). Near-synonymy and lexical choice. Computational Linguistics, 28(2), 105–144. doi:10.1162/089120102760173625.

  • Erk, K. (2012). Vector space models of word meaning and phrase meaning: A survey. Language and Linguistics Compass, 6(10), 635–653. doi:10.1002/lnco.362.

  • Erk, K. (2013). Towards a semantics for distributional representations. In Proceedings, 10th International Conference on Computational Semantics (IWCS-2013), Potsdam. www.aclweb.org/anthology/W13-0109.

  • Farrell, R. B. (1977). German synonyms. Cambridge: Cambridge University Press.

  • Fish, S. (1980). Is there a text in this class? The authority of interpretive communities. Cambridge: Harvard University Press.

  • Freitas, A., O’Riain, S., & Curry, E. (2013). A distributional semantic search infrastructure for linked dataspaces. In The Semantic Web: ESWC 2013 Satellite Events. Lecture Notes in Computer Science (Vol. 7955, pp. 214–218). Berlin: Springer. doi:10.1007/978-3-642-41242-4_27.

  • Friedman, J., Moran, D. B., & Warren, D. S. (1978). Explicit finite intensional models for PTQ. American Journal of Computational Linguistics, microfiche 74, 3–22. www.aclweb.org/anthology/J79-1074.

  • Fujiwara, Y., Isogai, H., & Muroyama, T. (1985). Hyogen Ruigo Jiten. Tokyo: Tokyodo Publishing.

  • Gracia, J., Montiel-Ponsoda, E., Cimiano, P., Gómez-Pérez, A., Buitelaar, P., & McCrae, J. (2012). Challenges for the multilingual Web of data. Journal of Web Semantics, 11, 63–71. doi:10.1016/j.websem.2011.09.001.

  • Hirst, G. (2007). Views of text-meaning in computational linguistics: Past, present, and future. In G. Dodig Crnkovic & S. Stuart (Eds.), Computation, information, cognition — The Nexus and the Liminal (pp. 270–279). Newcastle: Cambridge Scholars Publishing. ftp.cs.toronto.edu/pub/gh/Hirst-ECAPbook-2007.pdf.

  • Hirst, G. (2008). The future of text-meaning in computational linguistics. In P. Sojka, A. Horák, I. Kopeček, & K. Pala (Eds.), Proceedings, 11th International Conference on Text, Speech and Dialogue (TSD 2008). Lecture Notes in Artificial Intelligence (Vol. 5246, pp. 1–9). Berlin: Springer. doi:10.1007/978-3-540-87391-4_1.

  • Hirst, G. (2009a). Ontology and the lexicon. In S. Staab & R. Studer (Eds.), Handbook on ontologies. International Handbooks on Information Systems (2nd ed., pp. 269–292). Berlin: Springer. doi:10.1007/978-3-540-92673-3_12.

  • Hirst, G. (2009b, July). Limitations of the philosophy of language understanding implicit in computational linguistics. Proceedings, 7th European Conference on Computing and Philosophy, Barcelona (pp. 108–109). ftp.cs.toronto.edu/pub/gh/Hirst-ECAP-2009.pdf.

  • Hirst, G. (2013). Computational linguistics. In K. Allan (Ed.), The Oxford handbook of the history of linguistics. Oxford: Oxford University Press.

  • Hirst, G., & Mohammad, S. (2011). Semantic distance measures with distributional profiles of coarse-grained concepts. In A. Mehler, K. U. Kühnberger, H. Lobin, H. Lüngen, A. Storrer, & A. Witt (Eds.), Modeling, learning, and processing of text technological data structures. Studies in Computational Intelligence Series (Vol. 370, pp. 61–79). Berlin: Springer. doi:10.1007/978-3-642-22613-7_4.

  • Hirst, G., & Ryan, M. (1992). Mixed-depth representations for natural language text. In P. S. Jacobs (Ed.), Text-based intelligent systems (pp. 59–82). Hillsdale, NJ: Lawrence Erlbaum Associates. ftp.cs.toronto.edu/pub/gh/Hirst+Ryan-92.pdf.

  • Hirst, G., Talent, N., & Scharf, S. (2013, 27 May). Detecting semantic overlap and discovering precedents in the biodiversity research literature. In Proceedings of the First International Workshop on Semantics for Biodiversity (S4BioDiv) (CEUR Workshop Proceedings, Vol. 979), 10th Extended Semantic Web Conference (ESWC-2013), Montpellier, France. ceur-ws.org/Vol-979/.

  • Hjelmslev, L. (1961). Prolegomena to a theory of language (rev. ed.). (F. J. Whitfield, Trans.). Madison: University of Wisconsin Press. (Originally published as Omkring sprogteoriens grundlæggelse, 1943.)

  • Inkpen, D., & Hirst, G. (2006). Building and using a lexical knowledge-base of near-synonym differences. Computational Linguistics, 32(2), 223–262. www.aclweb.org/anthology/J06-2003.

  • Kennedy, A., & Hirst, G. (2012, December). Measuring semantic relatedness across languages. In Proceedings, xLiTe: Cross-Lingual Technologies Workshop at the Neural Information Processing Systems Conference, Lake Tahoe, NV. ftp.cs.toronto.edu/pub/gh/Hirst-ECAP-2009.pdf.

  • McCrae, J., Aguado-de-Cea, G., Buitelaar, P., Cimiano, P., Declerck, T., Gómez-Pérez, A., et al. (2012). Interchanging lexical resources on the Semantic Web. Language Resources and Evaluation, 46(4), 701–719. doi:10.1007/s10579-012-9182-3.

  • Mitchell, J., & Lapata, M. (2010). Composition in distributional models of semantics. Cognitive Science, 34(8), 1388–1429. doi:10.1111/j.1551-6709.2010.01106.x.

  • Mohammad, S., Gurevych, I., Hirst, G., & Zesch, T. (2007). Cross-lingual distributional profiles of concepts for measuring semantic distance. In 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2007), Prague (pp. 571–580). www.aclweb.org/anthology/D07-1060.

  • Mohammad, S., & Hirst, G. (2006, July). Distributional measures of concept-distance: A task-oriented evaluation. In Proceedings, 2006 Conference on Empirical Methods in Natural Language Processing (EMNLP 2006), Sydney, Australia (pp. 35–43). www.aclweb.org/anthology/W06-1605.

  • Montague, R. (1974). Formal philosophy. New Haven: Yale University Press.

  • Nováček, V., Handschuh, S., & Decker, S. (2011). Getting the meaning right: A complementary distributional layer for the web semantics. In Proceedings, 10th International Semantic Web Conference (ISWC-2011) (Vol. 1, pp. 504–519). Lecture Notes in Computer Science, Vol. 7031. Berlin: Springer. doi:10.1007/978-3-642-25073-6_32.

  • Reddy, M. J. (1979). The conduit metaphor: A case of frame conflict in our language about language. In A. Ortony (Ed.), Metaphor and thought (pp. 284–324). Oxford: Oxford University Press. [Reprinted unchanged in the second edition, 1993, pp. 164–201.]

  • Schogt, H. G. (1976). Sémantique synchronique: synonymie, homonymie, polysémie. Toronto: University of Toronto Press.

  • Schogt, H. G. (1988). Linguistics, literary analysis, and literary translation. Toronto: University of Toronto Press.

  • Turney, P. D., & Pantel, P. (2010). From frequency to meaning: Vector space models of semantics. Journal of Artificial Intelligence Research, 37, 141–188. doi:10.1613/jair.2934.

  • Wilks, Y. (2009). Ontotherapy, or how to stop worrying about what there is. In N. Nicolov, G. Angelova, & R. Mitkov (Eds.), Recent advances in natural language processing V (pp. 1–20). Amsterdam: John Benjamins.

Acknowledgements

This work was supported financially by the Natural Sciences and Engineering Research Council of Canada. For helpful comments, I am grateful to Lars Borin, Philipp Cimiano, Nadia Talent, the anonymous reviewers, and the participants of the Dagstuhl Seminar on the Multilingual Semantic Web.

Author information

Corresponding author

Correspondence to Graeme Hirst.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Hirst, G. (2014). Overcoming Linguistic Barriers to the Multilingual Semantic Web. In: Buitelaar, P., Cimiano, P. (eds) Towards the Multilingual Semantic Web. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43585-4_1

  • DOI: https://doi.org/10.1007/978-3-662-43585-4_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-43584-7

  • Online ISBN: 978-3-662-43585-4

  • eBook Packages: Computer Science (R0)
