Abstract
The idea that ambiguity can be productive in data science remains controversial. Efforts to make scientific publications and data intelligible to computers generally assume that accommodating multiple meanings for words, known as polysemy, undermines reasoning and communication. This assumption has nonetheless been contested by historians, philosophers, and social scientists, who have applied qualitative research methods to demonstrate the generative and strategic value of polysemy. Recent quantitative results from linguistics have also shown how polysemy can actually improve the efficiency of human communication. I present a new conceptual typology based on a synthesis of prior research about the aims, norms, and circumstances under which polysemy arises and is evaluated. The typology supports a contextual pluralist view of polysemy’s value for scientific research practices: polysemy does both substantial positive and negative work in science, but its utility is context-sensitive in ways that are often overlooked by the norms that people, including prior scholars researching polysemy, have formulated to regulate its use. I also propose that historical patterns in the use of partial synonyms, i.e., terms with overlapping meanings, provide an especially promising phenomenon for integrative research addressing these issues.
Data availability
Not applicable.
Acknowledgements
This project was funded by NSF Science and Technology Studies Grant STS-1827993. My thanks to Joeri Witteveen, Elizabeth Lerman, Nico Franz, and Manfred Laubichler for their conversations and feedback about the ideas presented here. My special thanks also to the reviewers whose constructive comments helped improve the manuscript significantly. All mistakes are entirely my own.
Funding
Funding was provided by NSF STS-1827993.
Contributions
Not applicable.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Code availability
Not applicable.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sterner, B. Explaining ambiguity in scientific language. Synthese 200, 354 (2022). https://doi.org/10.1007/s11229-022-03792-x
DOI: https://doi.org/10.1007/s11229-022-03792-x