The GOLD Community of Practice: an infrastructure for linguistic data on the Web

Language Resources and Evaluation

Abstract

The GOLD Community of Practice is proposed as a model for linking on-line linguistic data to an ontology. The key components of the model include the linguistic data resources themselves and those focused on the knowledge derived from data. Data resources include the ever-increasing amount of linguistic field data and other descriptive language resources being migrated to the Web. The knowledge resources capture generalizations about the data and are anchored in the General Ontology for Linguistic Description (GOLD). It is argued that such a model is in the spirit of the vision for a Semantic Web and, thus, provides a concrete methodology for rendering highly divergent resources semantically interoperable. The focus of this work, then, is not on annotation at the syntactic level, but rather on how annotated Web resources can be linked to an ontology. Furthermore, a methodology is given for creating specific communities of practice within the overall Web infrastructure for linguistics. Finally, ontology-driven search is discussed as a key application of the proposed model.
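As a rough illustration of the ontology-driven search mentioned in the abstract, the following minimal sketch maps resource-specific annotation tags to shared concepts and retrieves annotations by concept, including its subconcepts. Everything in it is an assumption made for illustration: the concept names, the tag mappings, the resource names, and the functions `is_a` and `search` are hypothetical and are not taken from GOLD or from the article.

```python
from dataclasses import dataclass

# Hypothetical ontology fragment: child concept -> parent concept.
# These names are illustrative only and are not taken from GOLD itself.
SUBCLASS_OF = {
    "PastTense": "Tense",
    "FutureTense": "Tense",
    "Tense": "MorphosyntacticFeature",
}

# Hypothetical per-resource mappings from local annotation tags to concepts.
# Each resource keeps its own tag set; only the mapping is shared.
TAG_TO_CONCEPT = {
    "resource_a": {"PST": "PastTense", "FUT": "FutureTense"},
    "resource_b": {"past": "PastTense"},
}


@dataclass(frozen=True)
class Annotation:
    resource: str  # which data resource the annotation comes from
    form: str      # the annotated linguistic form
    tag: str       # the resource-specific tag


def is_a(concept, ancestor):
    """Return True if `concept` equals `ancestor` or is a descendant of it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = SUBCLASS_OF.get(concept)
    return False


def search(annotations, query_concept):
    """Return annotations whose tag maps to `query_concept` or a subconcept."""
    hits = []
    for ann in annotations:
        concept = TAG_TO_CONCEPT.get(ann.resource, {}).get(ann.tag)
        if concept is not None and is_a(concept, query_concept):
            hits.append(ann)
    return hits


data = [
    Annotation("resource_a", "w1", "PST"),
    Annotation("resource_b", "w2", "past"),
    Annotation("resource_a", "w3", "FUT"),
    Annotation("resource_b", "w4", "unknown"),  # unmapped tag, never matched
]
past_hits = search(data, "PastTense")   # matches across both resources
tense_hits = search(data, "Tense")      # also matches the future-tense form
```

In this sketch a query for a general concept finds data from both resources even though they use different tags, which is the kind of semantic interoperability the abstract describes. Tags with no mapping are simply skipped rather than guessed at.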

Acknowledgements

Special thanks go to Terry Langendoen for his support of our research project from the beginning. The idea to construct an ontology for linguistics was conceived by the authors during their work on the E-MELD project [emeld.org] (NSF ITR-0094934). We gratefully acknowledge the support of the E-MELD PIs and associates, especially Gary Simons, Helen Aristar-Dry, and Anthony Aristar. We also acknowledge the comments of the members of the GOLD summit held in November 2004 in Fresno, CA, including Jeff Good, Baden Hughes, Laura Buszard-Welcher, Brian Fitzsimons, and Ruby Basham. Finally, we gratefully acknowledge the NSF-funded Data-Driven Linguistic Ontology Development project (BCS-0411348), which supported the authors during the writing of this manuscript.

Author information

Corresponding author

Correspondence to Scott Farrar.

About this article

Cite this article

Farrar, S., Lewis, W.D. The GOLD Community of Practice: an infrastructure for linguistic data on the Web. Lang Resources & Evaluation 41, 45–60 (2007). https://doi.org/10.1007/s10579-007-9016-x

  • DOI: https://doi.org/10.1007/s10579-007-9016-x
