Linguistic inferences from pro-speech music

Musical gestures generate scalar implicatures, presuppositions, supplements, and homogeneity inferences

  • Original Research
  • Linguistics and Philosophy

Abstract

Language has a rich typology of inferential types. It was recently shown that subjects are able to divide the informational content of new visual stimuli among the various slots of the inferential typology: when gestures or visual animations are used in lieu of specific words in a sentence, they can trigger the very same inferential types as language alone (Tieu et al., 2019). How general are the relevant triggering algorithms? We show that they extend to the auditory modality and to music cognition. We tested whether pro-speech musical gestures, i.e. musical excerpts that replace words in sentences, can give rise to the same inferences. We show that it is possible to replicate the same typology of inferences using pro-speech music. Minimal and complex musical excerpts can behave just like language, gestures, and visual animations with respect to the logical behavior of their content when embedded in sentences. Specifically, we found that pro-speech music can generate scalar implicatures, presuppositions, supplements, and homogeneity inferences.

Notes

  1. Here we use the example of presuppositions as an illustration of the main argument of the paper. The argument however applies similarly to the three other inferences tested: scalar implicatures, supplements and homogeneity inferences. A summary of the different inferential mechanisms can be found in Appendix III.

  2. Generally, the projection problem refers to the computation of the presupposition and asserted content of a sentence from the presupposition and asserted content of its constituents. By contrast, projection tests, such as the family-of-sentences test [so called in Chierchia and McConnell-Ginet (2000)], i.e. embedding under negation, modality or question, are used to check whether a given proposition is a presupposition (Beaver, 2001; Chemla, 2009; Geurts, 1999; Heim, 1990, 1992; Stalnaker, 1974).

  3. Although ‘pro-speech music’ [literally, music replacing words] is a specific kind of musical gesture [i.e. the iconic musical motives or excerpts roughly used in lieu of gestures in Tieu et al. (2019)], we use ‘pro-speech music’ and ‘musical gestures’ interchangeably throughout this paper.

  4. Except for the paradigm testing supplements in Sect. 3.3, we mainly used basic scales, drum sounds or isolated tones. Our definition of these as music could be contested because of their simple nature. However, even if these stimuli did not count as music, our claims about the generality of the algorithm that divides content into at-issue and non-at-issue components would remain unaltered.

  5. All musical gestures can be directly accessed by clicking on the hyperlinks.

  6. As pointed out by an anonymous reviewer, we cannot know when precisely participants drew the inference. However, whenever the inference is actually triggered, it is unclear how our predictions would be different at this stage. As the same issue was present in Tieu et al. (2019), we assume that our predictions would not have been different.

  7. For instance, our data do not allow us to decide between a grammatical and a neo-Gricean theory of scalar implicatures, but they do allow us to claim that, regardless of the details of the algorithm responsible for scalar implicatures, this algorithm extends to music cognition and must therefore be domain-general.

  8. As pointed out by a reviewer, it cannot be ruled out that when asked about inferences, subjects understand that they must guess the right word in the stimuli, e.g. ‘climb’. Indeed, the target sentence was visible while listening to the stimulus, just like in Tieu et al. (2019). Since this issue applies to all of this literature, we leave it as a problem for future research to understand if and how participants consider the lexical material from the target sentence to be relevant to the comprehension of the stimulus.

  9. All materials, including stimuli, design files and analysis scripts, are accessible at https://osf.io/hw45u/?view_only=89f983db777f49e9a6f5b41b3dea60d6. Materials were uploaded prior to the beginning of data collection; see the preregistration folder. Final results and statistics are available in the results folder.

  10. For presentational clarity, we choose to present scalar implicatures by going through the Gricean reasoning. We then mention the alternative theory, the grammatical account of scalar implicatures. In any case, these should be viewed as placeholders for any theory of scalar implicatures. Note that any theory of scalar implicatures predicts that if alternatives are provided in the context, implicatures should be derived. This is, in a way, a sanity check to confirm that the inferential mechanisms work as expected with pro-speech sounds, and that there can be competition among musical excerpts.

  11. In our stimuli, alternatives were provided in the context: the alternative, e.g. drum\(\times \)3, to the musical gesture, drum\(\times \)1, was salient in the context. For this reason, our results don’t speak to the issue of alternative generation. Our point is that scalar implicatures can be triggered by pro-speech music as long as two alternatives are available. In other words, whatever the mechanism responsible for alternative generation, subjects are able to interpret one musical alternative as logically stronger than the other, have it compete with the other, and finally draw a scalar implicature from this process. However, if scalar implicatures were in fact to arise when alternatives are not directly given, then a general theory of alternatives could be needed. This is an exciting question for future research. Schlenker (2020) discussed such a theory in relation to Katzir (2007) in the case of gestures. As to music, it would not be surprising that implicatures are triggered whenever a musical gesture competes with a contextually salient (but not explicitly given) more informative alternative.

  12. The original sequence of sentences and the visual animations can be accessed at Tieu et al.’s supplementary materials page: https://mfr.au-1.osf.io/render?url=https://osf.io/v5xa3/?direct%26mode=render%26action=download%26mode=render.

  13. In this paradigm, we embedded complex music in language and uncovered rich linguistic inferences. A non-trivial extension of our paradigm may in the future test whether such logical inferences can arise in purely musical environments.

  14. Although this music unambiguously conveys a feeling of fear, danger or suspense, there are many possible situations this music can refer to that would trigger such feelings, considering that music can indeed refer to several external non-musical situations sharing some structural and/or emotional properties (Schlenker, 2017, 2019b).

  15. As music is not generally used to directly communicate ideas or convey information about the world, investigating the intermediary case of onomatopoeias may bridge our findings with the results from Tieu et al. (2019). Since onomatopoeias are used in combination with language and are highly iconic, there are substantial reasons to believe that any inference type from the typology we replicated with musical gestures could just as well be replicated with onomatopoeias: since onomatopoeias occur naturally in lieu of words, we would expect them to display the same inferential behavior as musical gestures, which do not occur naturally in lieu of words and are arguably more difficult to process. To verify that this was the case, we ran a similar experiment with onomatopoeias instead of musical stimuli. Not every inference type yielded a significant result (see Appendix I for details). However, the results were not significantly different across the two experiments, leaving open the possibility that the onomatopoeias experiment, which we ran on fewer participants, was lacking power.

  16. As pointed out by the Editor, instead of considering whether the premise triggered a homogeneous or a non-homogeneous reading, we could have shown that the interpretation of the musical gesture flips from universal in the positive environment to existential in the negative environment. In this case, we would have been interested in the interaction between the inference type factor, and the environment (positive vs. negative). We display this alternative analysis in Appendix II, and show that it leads to a significant interaction between the two factors, indicating that the reading did change from universal to existential once embedded under negation.

  17. The detailed analyses and statistical scripts to compute interactions are available in the results folder at https://osf.io/hw45u/?view_only=89f983db777f49e9a6f5b41b3dea60d6.

  18. Bivalence refers to the existence of pre-conditions to the meaning.

  19. The output is actually closer to <presupposition, presupposition+assertion>, because the assertion in <presupposition, assertion> is underdetermined.

References

  • Abrusán, M. (2010). Triggering verbal presuppositions. Semantics and Linguistic Theory, 20, 684–701.

  • Abrusán, M. (2011). Predicting the presuppositions of soft triggers. Linguistics and Philosophy, 34, 491–535.

  • Abusch, D. (2002). Lexical alternatives as a source of pragmatic presuppositions. Semantics and Linguistic Theory, 12, 1–19.

  • Abusch, D. (2010). Presupposition triggering from alternatives. Journal of Semantics, 27, 37–80.

  • Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68, 255–278.

  • Beaver, D. (1994). When variables don’t vary enough. Semantics and Linguistic Theory, 4, 35–60.

  • Beaver, D. (2001). Presupposition and assertion in dynamic semantics. Stanford: CSLI Publications.

  • Chemla, E. (2009). Presuppositions of quantified sentences: Experimental data. Natural Language Semantics, 17, 299–340.

  • Chemla, E., & Schlenker, P. (2012). Incremental vs. symmetric accounts of presupposition projection: An experimental approach. Natural Language Semantics, 20, 177–226.

  • Chierchia, G., Fox, D., & Spector, B. (2012). Scalar implicature as a grammatical phenomenon. In C. Maienborn, K. von Heusinger, & P. Portner (Eds.), Semantics: An international handbook of natural language meaning (Vol. 3, pp. 2297–2331). Berlin: De Gruyter Mouton.

  • Chierchia, G., & McConnell-Ginet, S. (2000). Meaning and grammar: An introduction to semantics. Cambridge, MA: MIT Press.

  • Chomsky, N. (1980). On cognitive structures and their development: A reply to Piaget. In M. Piattelli-Palmarini (Ed.), Language and learning (pp. 35–54). London: Routledge and Kegan Paul.

  • Chomsky, N. (1988). Language and problems of knowledge: The Managua lectures. Cambridge, MA: MIT Press.

  • Fox, D. (2013). Presupposition projection from quantificational sentences: Trivalence, local accommodation, and presupposition strengthening. MIT Web Domain.

  • Gajewski, J. (2005). Neg-raising: Polarity and presupposition. Ph.D. thesis, MIT.

  • George, B. R. (2008). A new predictive theory of presupposition projection. Semantics and Linguistic Theory, 18, 358–375.

  • Geurts, B. (1998). The mechanisms of denial. Language, 74, 274–307.

  • Geurts, B. (1999). Presuppositions and pronouns. Amsterdam: Elsevier.

  • Geurts, B., & van Tiel, B. (2016). When “All the Five Circles” are four: New exercises in domain restriction. Topoi, 35, 109–122.

  • Grice, H. (1975). Logic and conversation. In P. Cole & J. Morgan (Eds.), Syntax and semantics 3: Speech acts (pp. 41–58). New York: Academic Press.

  • Heim, I. (1988). The semantics of definite and indefinite noun phrases. New York: Garland.

  • Heim, I. (1990). Presupposition projection. In R. van der Sandt (Ed.), Presupposition, lexical meaning and discourse processes: Workshop reader. Nijmegen: University of Nijmegen.

  • Heim, I. (1992). Presupposition projection and the semantics of attitude verbs. Journal of Semantics, 9, 183–221.

  • Heim, I., & Kratzer, A. (1998). Semantics in generative grammar. Oxford: Blackwell.

  • Horn, L. (1972). On the semantic properties of the logical operators in English. Ph.D. thesis, University of California at Los Angeles.

  • Kadmon, N. (2001). Formal pragmatics: Semantics, pragmatics, presupposition, and focus. Malden, MA: Wiley-Blackwell.

  • Katzir, R. (2007). Structurally-defined alternatives. Linguistics and Philosophy, 30, 669–690.

  • Križ, M. (2015). Aspects of homogeneity in the semantics of natural language. Ph.D. thesis, University of Vienna.

  • Križ, M. (2016). Homogeneity, non-maximality, and all. Journal of Semantics, 33, 493–539.

  • Križ, M. (2019). Homogeneity effects in natural language semantics. Language and Linguistics Compass, 13, e12350.

  • Križ, M., & Spector, B. (2020). Interpreting plural predication: Homogeneity and non-maximality. Linguistics and Philosophy, 44(5), 1131–1178.

  • Löbner, S. (2000). Polarity in natural language: Predication, quantification and negation in particular and characterizing sentences. Linguistics and Philosophy, 23, 213–308.

  • Levinson, S. C. (2000). Presumptive meanings: The theory of generalized conversational implicature. Language, speech, and communication. Cambridge, MA: MIT Press.

  • Mandelkern, M. (2016). A note on the architecture of presupposition. Semantics and Pragmatics, 9, 13–24.

  • Mayr, C., & Sauerland, U. (2016). Accommodation and the strongest meaning hypothesis. In T. Brochhagen, F. Roelofsen, & N. Theiler (Eds.), Proceedings of the 20th Amsterdam Colloquium (pp. 276–285). Amsterdam: ILLC.

  • Pfau, R., & Steinbach, M. (2006). Pluralization in sign and in speech: A cross-modal typological study. Linguistic Typology, 10, 135–182. https://doi.org/10.1515/LINGTY.2006.006.

  • Potts, C. (2004). The logic of conventional implicatures. Oxford: Oxford University Press.

  • R Core Team. (2016). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Version, 3(4), 3.

  • Sauerland, U. (2004). Scalar implicatures in complex sentences. Linguistics and Philosophy, 27, 367–391.

  • Sauerland, U. (2012). The computation of scalar implicatures: Pragmatic, lexical or grammatical? Computation of scalar implicatures. Language and Linguistics Compass, 6, 36–49.

  • Schlenker, P. (2008). Presupposition projection: The new debate. Semantics and Linguistic Theory, 18, 655–693.

  • Schlenker, P. (2010). Presuppositions and local contexts. Mind, 119, 377–391.

  • Schlenker, P. (2016). The semantics-pragmatics interface. In M. Aloni & P. Dekker (Eds.), The Cambridge handbook of formal semantics (pp. 664–727). Cambridge: Cambridge University Press.

  • Schlenker, P. (2017). Outline of music semantics. Music Perception: An Interdisciplinary Journal, 35, 3–37.

  • Schlenker, P. (2018). Gesture projection and cosuppositions. Linguistics and Philosophy, 41, 295–365.

  • Schlenker, P. (2019a). Gestural semantics: Replicating the typology of linguistic inferences with pro- and post-speech gestures. Natural Language & Linguistic Theory, 37, 735–784.

  • Schlenker, P. (2019b). Prolegomena to music semantics. Review of Philosophy and Psychology, 10, 35–111.

  • Schlenker, P. (2020). Gestural grammar. Natural Language & Linguistic Theory, 38, 887–936.

  • Schlenker, P. (2021). Triggering presuppositions. Glossa: A Journal of General Linguistics, 6, 35.

  • Simons, M., Tonhauser, J., Beaver, D., & Roberts, C. (2010). What projects and why. Semantics and Linguistic Theory, 20, 309–327.

  • Spector, B. (2013). Homogeneity and plurals: From the strongest meaning hypothesis to supervaluations. Presentation at Sinn und Bedeutung 18. https://ehutb.ehu.eus/uploads/material/Video/3289/Sinn18_01.pdf.

  • Stalnaker, R. (1974). Pragmatic presuppositions. In R. Stalnaker (Ed.), Context and content (pp. 47–62). Oxford: Oxford University Press.

  • Sudo, Y., Romoli, J., Hackl, M., & Fox, D. (2012). Presupposition projection out of quantified sentences: Strengthening, local accommodation and inter-speaker variation. In M. Aloni, V. Kimmelman, F. Roelofsen, G. W. Sassoon, K. Schulz, & M. Westera (Eds.), Logic, language and meaning (pp. 210–219). Berlin: Springer.

  • Tieu, L., Pasternak, R., Schlenker, P., & Chemla, E. (2018). Co-speech gesture projection: Evidence from inferential judgments. Glossa: A Journal of General Linguistics, 3, 109.

  • Tieu, L., Schlenker, P., & Chemla, E. (2019). Linguistic inferences without words. Proceedings of the National Academy of Sciences, 116, 9796–9801.

  • Tonhauser, J., Beaver, D., Roberts, C., & Simons, M. (2013). Toward a taxonomy of projective content. Language, 89(1), 66–109.

  • van der Sandt, R. A. (1992). Presupposition projection as anaphora resolution. Journal of Semantics, 9, 333–377.

  • van Rooij, R., & Schulz, K. (2004). Exhaustive interpretation of complex sentences. Journal of Logic, Language and Information, 13, 491–519.

  • Zehr, J., Bill, C., Tieu, L., Romoli, J., & Schwarz, F. (2016). Presupposition projection from the scope of None: Universal, existential, or both? Semantics and Linguistic Theory, 26, 754–774.

Acknowledgements

We are greatly indebted to Philippe Schlenker and Emmanuel Chemla for in-depth discussion of virtually every aspect of this project. We also thank Amir Anvari for his helpful comments on the theoretical underpinnings of this work and suggestions for improvement on the first draft. Thank you to Salvador Mascarenhas, Rob Pasternak and Lyn Tieu for discussion of the early versions of this work. Finally, we would like to thank the audiences of the ‘Linguistic investigations beyond language’ workshop at ZAS, GLOW 2019, and the Linguae seminar.

Author information

Corresponding author

Correspondence to Léo Migotti.

Ethics declarations

Ethical Standards

Data collection was approved under Opinion number 20-733 of the Institutional Review Board of the French Institute of medical research and Health (IRB00003888, IORG0003254, FWA00005831).

Informed consent

Online informed consent was obtained for all participants prior to the beginning of the experiment.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This research received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant Agreement No. 788077, Orisem, PI: Schlenker). Research was conducted at Institut d’Etudes Cognitives, Ecole Normale Supérieure - PSL Research University. Institut d’Etudes Cognitives is supported by grants ANR-10-IDEX-0001-02 and FrontCog ANR-17-EURE-0017. This paper is part of the Special Issue “Super Linguistics”, edited by Pritty Patel-Grosz, Emar Maier, and Philippe Schlenker.

Appendices

Appendix I: Musical gestures and vocal gestures

As mentioned in Sect. 3.3, we ran a parallel experiment on a different pool of participants using pro-speech onomatopoeias, i.e. iconic vocal sounds replacing one or several words in sentences, that we call ‘vocal gestures’, instead of musical gestures. The same four inference types were tested: scalar implicatures, presuppositions, supplements and homogeneity inferences, using paradigms and stimuli that were either perfectly analogous to the ones described throughout this paper, or very slightly different when vocal counterparts to musical stimuli could not be found. For instance, wherever an upward scale was used in our stimuli to evoke a rise in space to test for presuppositions, we used a whistle rising in frequency, and wherever a choir singing the French national anthem was used to evoke the action of singing to test for homogeneity inferences, we used a similar stimulus where the very same song was whistled instead of sung. Wherever a drum was used to evoke someone boxing to test for scalar implicatures, we used the onomatopoeia boom vocally pronounced.

We found a difference between the results of the experiment on musical gestures and those of the experiment on vocal gestures, which is surprising as most paradigms were perfectly symmetric. However, the vocal gestures experiment was run on a smaller pool of participants than the musical gestures experiment, and the lack of systematically significant effects of inference type might thus be explained by a lack of power. Indeed, we found that the differences in endorsement rates across the two experiments were mostly non-significant or only marginally significant, i.e. the responses given in the experiment on vocal gestures were not significantly different from the ones given in the experiment on musical gestures. Figure 7 summarizes the comparisons of the distribution of the data between the two experiments. For each inference type, we computed the interaction between inference type and the Experiment factor, whose two levels corresponded to the two experiments (see Note 17).

Fig. 7 Comparison between musical gestures and vocal gestures experiments

Figure 7 reports the statistics assessing the significance of the difference in the distribution of the data collected across experiments. None of the data subsets differed significantly across the two experiments, except for supplements, for which we found no contrast in endorsement due to an unexpected interpretation of the vocal stimulus.

Appendix II: Statistical models

In this appendix, we provide the generalized linear mixed-effects models we used to analyze the data. They were all inspired by those of Tieu et al. (2019), whose analysis script is open source. For each model, we justify the structure of the model, and in particular its random effects structure, by the design [as recommended in Barr et al. (2013)]. We also provide the coefficients found for each model. For the sake of clarity, we report the statistical models in raw form (i.e. as lines of R code). Useful information about the syntax of R, which can be helpful for reading these formulas, is available online at this link: https://github.com/clayford/LMEMInR/blob/master/lme4_cheat_sheet.Rmd.

1.1 Scalar implicatures

In the scalar implicatures paradigms, two main factors could have had an effect on the endorsement of the target inferences: the gesture factor (GestureC), whose two levels correspond to the target and control premises contrasting two alternative realizations of the drum sound mimicking someone boxing, drum\(\times \)1 and drum\(\times \)2; and the inference factor (InferenceC), whose two levels correspond to the target and baseline inferences contrasting two possible interpretations of the drum sounds (‘some’ vs. ‘a lot’ in the positive environment; ‘some’ vs. ‘none’ in the negative environment).

Here, we were interested in the GestureC * InferenceC interaction to ensure that the contrast found for the target premise was not due to a default bias in endorsement for one inference over the other, by comparing the difference between the endorsement of the target and that of its negation for the target premise and the same difference for the control premise. We did not include the environment factor in the model (positive vs negative) because there was no theoretical reason to expect a difference in these interactions between the positive and the negative environment. GestureC, InferenceC, and their interaction were thus used as fixed effects in our model, while this interaction by subject was used as a random effect, accounting for the variability across participants in (i) the interpretation of the premise, (ii) the interpretation of the inference, and (iii) the interaction between (i) and (ii).
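The GestureC * InferenceC interaction described above amounts to a difference of differences in endorsement. The following Python sketch illustrates that quantity on made-up cell means. The ratings and the helper name `interaction_contrast` are hypothetical and are not taken from the experiment; the actual analysis fits a mixed-effects model in R rather than comparing raw means.

```python
from statistics import mean

# Hypothetical endorsement ratings (0-100), keyed by (premise, inference).
# These values are invented for illustration only.
ratings = {
    ("target", "target"): [90, 85, 95],
    ("target", "baseline"): [10, 20, 15],
    ("control", "target"): [40, 50, 45],
    ("control", "baseline"): [55, 45, 50],
}

def interaction_contrast(data):
    """Return (target-premise difference) minus (control-premise difference)."""
    cell = {key: mean(values) for key, values in data.items()}
    diff_target = cell[("target", "target")] - cell[("target", "baseline")]
    diff_control = cell[("control", "target")] - cell[("control", "baseline")]
    return diff_target - diff_control

contrast = interaction_contrast(ratings)
print(contrast)
```

A large positive contrast means that the target inference is preferred over the baseline inference much more strongly after the target premise than after the control premise, which is the pattern that a default bias toward one inference could not produce on its own.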

Model

value \(\sim \) GestureC * InferenceC + (1 + GestureC * InferenceC|SubjID)

This first model failed to converge. We followed the same procedure as Tieu et al. (2019) in simplifying the random effects structure, and removed the interaction between the two factors GestureC and InferenceC from the random effects as follows:

Simplification of random effects structure (as in Tieu et al., 2019)

value \(\sim \) GestureC * InferenceC + (1 + GestureC + InferenceC|SubjID)

$$\begin{aligned} \begin{array}{ c c c c c }
\text {Environment} & \text {(Intercept)} & \text {GestureC} & \text {InferenceC} & \text {GestureC:InferenceC} \\
\text {Positive} & 52.60377 & -0.79245 & -0.07547 & 116.60377 \\
\text {Negative} & 57.005 & 1.557 & 7.575 & 44.698 \\
\end{array} \end{aligned}$$

1.2 Presuppositions

There was no theoretical reason for predicting an interaction in presuppositional behavior between question formation and projection under ‘none’, as the two projection tests are independent. Inferences from questions and from ‘none’ were thus analyzed separately. Rather, we were interested in the effect of the inference type (presupposition vs. no presupposition) on the responses.

To ensure that endorsement of the inference containing the presupposition was not due to a default preference for this kind of inference, the endorsement of the presupposition was contrasted with that of its negation (baseline inference), as a control. The model thus used the inference factor InferenceC, whose levels correspond to presupposition vs. no presupposition, as a fixed effect, and this same factor by subject was used as a random effect, to capture the possibility that each participant may simply tend to endorse presuppositions differently.

Model

This first model failed to converge, leading us to simplify the random effect structure by only keeping the SubjID factor as a random effect, to capture the variability in intercept across participants:

Simplification of random effects structure

$$\begin{aligned} \begin{array}{ c c c }
\text {Environment} & \text {(Intercept)} & \text {InferenceC} \\
\text {Question} & 49.48 & -15.60 \\
\text {`none'} & 51.65 & 2.74 \\
\end{array} \end{aligned}$$
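The simplified random effects structure described above, with only a by-participant intercept, corresponds to a model of roughly the following form, fitted separately to the question and ‘none’ subsets. This is a sketch under the assumption of a linear model on the raw response scale; the symbols \(\beta _0\), \(\beta _1\), \(u_j\) and \(\varepsilon _{ij}\) are introduced here for illustration and do not appear in the original, with \(i\) indexing responses and \(j\) indexing participants.

$$\begin{aligned} \text {value}_{ij} = \beta _0 + \beta _1 \, \text {InferenceC}_{ij} + u_j + \varepsilon _{ij}, \qquad u_j \sim \mathcal {N}(0, \sigma _u^2), \quad \varepsilon _{ij} \sim \mathcal {N}(0, \sigma ^2) \end{aligned}$$

On this reading, the (Intercept) and InferenceC columns of the table would correspond to estimates of \(\beta _0\) and \(\beta _1\), while \(u_j\) absorbs each participant's overall tendency to endorse inferences.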

1.3 Supplements

The case for supplements is symmetric to the paradigm for scalar implicatures in 6: both the gesture factor GestureC, describing the two types of premise (the target premise, where the musical gesture is expected to be interpreted as a non-restrictive relative clause, and the control premise, where the musical gesture is made at-issue by using ‘like this’), and the inference factor InferenceC, describing the two types of inference (supplemental or not), were used as fixed effects in the model, while their interaction by participant was used as a random effect in the maximal model below.

Here, we are interested in the interaction between the Gesture factor, whose two levels represent the two forms of sentences (with a pro-speech gesture standing for a non-restrictive relative clause, and with the deictic ‘like this’), and the Inference factor, whose two levels correspond to the supplemental inference (if X, then X would have happened in a certain way) and the inference without the supplemental information (if X, then X would not have necessarily happened in this same way). As the control inference using ‘not necessarily’ was, for reasons of simplicity, not the exact negation of the target inference, we had no strong prediction as to how differently the target and control inferences would be endorsed for the control premise using ‘like this’; but we expected this difference to be large for the target premise using a post-speech musical gesture (without ‘like this’), which was expected to trigger a supplemental inference just as post-speech gestures do (Tieu et al., 2019; Schlenker, 2018). We thus expected an interaction between the two factors, with a larger difference in endorsement between the target inference and the baseline inference in response to the target premise than in response to the control premise.

Model

value\(\sim \) GestureC * InferenceC + (1 + GestureC * InferenceC|SubjID)

This first model failed to converge, leading us to simplify as before the random effect structure by removing the interaction:

Simplification of random effects structure 1

value\(\sim \) GestureC * InferenceC + (1 + GestureC + InferenceC|SubjID)

This second model converged but did not allow for interaction testing (removing the interaction from the model prevented the model from converging), leading us to simplify the random effect structure even more:

Simplification of random effects structure 2

$$\begin{aligned} \begin{array}{ c c c c }
\text {(Intercept)} & \text {GestureC} & \text {InferenceC} & \text {GestureC:InferenceC} \\
57.175 & -7.085 & -26.858 & 8.774 \\
\end{array} \end{aligned}$$

1.4 Homogeneity inferences—NP and predicate

The case for homogeneity is analogous to that of presuppositions in 6: only inference type was used as a fixed effect in the model, while the effect of inference type by participant (capturing whether each participant had a personal tendency to endorse the inferences) was used as a random effect. As we were interested in whether the homogeneous (‘all’ or ‘none’) inference would be preferred to the non-homogeneous one in each environment (positive and negative), we were here only interested in the effect of inference type (homogeneous vs. non-homogeneous).
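As an illustration of what the homogeneous readings amount to, the Python sketch below encodes a trivalent treatment of plural predication of the kind discussed in the homogeneity literature cited in this paper (e.g. Križ, 2015): a plural predication is true if all individuals satisfy the predicate, false if none do, and undefined otherwise. The function names are hypothetical and the encoding is a deliberate simplification, not part of the analysis scripts.

```python
from typing import Optional, Sequence

def plural_predication(satisfies: Sequence[bool]) -> Optional[bool]:
    """Trivalent value of 'the Xs P': True if all, False if none, None otherwise."""
    if all(satisfies):
        return True
    if not any(satisfies):
        return False
    return None  # mixed case: neither clearly true nor clearly false

def negate(value: Optional[bool]) -> Optional[bool]:
    """Negation swaps True and False and leaves the undefined case alone."""
    return None if value is None else not value

all_play = [True, True, True]
some_play = [True, False, True]
none_play = [False, False, False]

# Positive environment: true only on the universal ('all') scenario.
positive = [plural_predication(s) for s in (all_play, some_play, none_play)]
# Negative environment: true only when no individual satisfies the predicate,
# i.e. the negated content is existential ('at least one').
negative = [negate(plural_predication(s)) for s in (all_play, some_play, none_play)]
```

In both environments the mixed scenario comes out undefined, which is why the homogeneous (‘all’ or ‘none’) inference is the one expected to be endorsed.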

Model

This model did not converge, so we decided, as with presuppositions, to go with the most minimal random effect structure only accounting for the difference in intercepts between participants:

Simplification of random effects structure

$$\begin{aligned} \begin{array}{ c c c c }
\text {Hypothesis tested} & \text {Environment} & \text {(Intercept)} & \text {InferenceC} \\
\text {NP} & \text {Positive} & 51.49 & -59.85 \\
\text {NP} & \text {Negative} & 47.91 & -38.87 \\
\text {Predicate} & \text {Positive} & 54.71 & -61.34 \\
\text {Predicate} & \text {Negative} & 40.31 & -37.34 \\
\end{array} \end{aligned}$$

Note that in our analysis, we were interested in inference type, i.e. whether the homogeneous inference was significantly more endorsed than its non-homogeneous counterpart. However, as the editor pointed out, it would also have made sense to analyze the inference type in terms of universal vs. existential readings. As described in the paradigm in Sect. 3.4, we expected musical gestures (that were punctuated repetitions) to have a universal reading in the positive environment (‘all harps’) and to get an existential reading in the negative environment (‘at least one harp’, i.e. ‘some harps’), just as the reading of gestures and visual animations was shown to flip from universal to existential depending on the context in Tieu et al. (2019). If we perform such an analysis, we are interested in the interaction between environment (positive vs. negative) and inference type (understood as universal vs. existential, and no longer as homogeneous vs. non-homogeneous). We plot the results with this new analysis and report the significance and coefficients of what the model would have been in this situation below:

Fig. 8 Reading flipping under negation

Model

This model failed to converge, leading to a simplification of the random effects structure as follows:

Simplified model

value\(\sim \) InferenceTypeC * Environment + (1 + InferenceTypeC + Environment|SubjID)

The comparison of this model to the same model without the interaction between inference type and environment showed a significant difference between the two models (\(\chi ^2\) = 84, p < 0.001). This supports the idea that the reading of the same musical gesture flips under negation, which is consistent with our first analysis.
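The reported model comparison can be checked by hand. The Python sketch below computes the upper-tail probability of the reported statistic, assuming one degree of freedom; that assumption is ours (the interaction of two two-level factors adds a single fixed-effect parameter) and is not stated in the text. The function name `chi2_sf_df1` is hypothetical, and the authors' own comparison was run in R.

```python
import math

def chi2_sf_df1(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    # For df = 1, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))

chi2_stat = 84.0  # statistic reported in the text
p_value = chi2_sf_df1(chi2_stat)  # assumes df = 1 (see lead-in)
print(p_value < 0.001)
```

Under this assumption the resulting p-value is far below the 0.001 threshold reported above.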

1.5 Iconicity controls

For each iconicity control, we were interested in the difference in endorsement between the inference containing the matching interpretation of the iconic modulation and the inference that contained the opposite non-matching interpretation. For instance, we wanted to know whether the inference containing the interpretation of a long upward scale as a high mountain was more endorsed than the inference containing the interpretation of a long upward scale as a low mountain. We thus counted inference type as a fixed effect. To model the possible differences in the tendencies to endorse both types of inferences in a certain way for each participant, we included inference type by subject as a random effect in the model, as shown:

Model

value\(\sim \) InferenceTypeC + (1 + InferenceTypeC|SubjID)

$$\begin{aligned} \begin{array}{ c c c }
\text {Control} & \text {(Intercept)} & \text {InferenceC} \\
1 & 55.09 & 52.11 \\
2 & 52.75 & -28.37 \\
\end{array} \end{aligned}$$

Appendix III: Inferential mechanisms

Here, we provide a loose description of the underlying mechanisms responsible for the four types of inferences tested. Although an important part of its content is itself subject to debate, the aim of this table is merely to provide some very basic analysis of each inference to facilitate the reading of each section; it does not aim at exhaustively accounting for all possible theoretical models (see Notes 18 and 19).

Fig. 9 Description of each inferential type

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Migotti, L., Guerrini, J. Linguistic inferences from pro-speech music. Linguist and Philos 46, 989–1026 (2023). https://doi.org/10.1007/s10988-022-09376-9

  • DOI: https://doi.org/10.1007/s10988-022-09376-9
