Algebraic Effects for Extensible Dynamic Semantics

Research in dynamic semantics has made strides by studying various aspects of discourse in terms of computational effect systems, for example, monads (Shan, 2002; Charlow, 2014), continuations (Barker and Shan, 2014), and algebraic effects (Maršík, 2016). We provide a system, based on graded monads, that synthesizes insights from these programs by formalizing individual discourse phenomena in terms of separate effects, or grades. Included are effects for introducing and retrieving discourse referents, non-determinism for indefiniteness, and generalized quantifier meanings. We formalize the behavior of individual effects, as well as the interactions between effects, in terms of algebraic laws tailored to the relevant discourse phenomena. The system we propose is thus modular and suggests a novel approach to integrating formal accounts of distinct semantic phenomena. Finally, we give an interpretation of the system into pure λ-calculus that respects the laws. Future work will aim to integrate more discourse phenomena using the same methodology, for example, presupposition and conventional implicature.


Introduction
Julian Grove (julian.grove@gmail.com), Jean-Philippe Bernardy (jean-philippe.bernardy@gu.se)

In the last two decades, research in dynamic semantics has attained a breadth of insights into the relation between compositional semantics and dynamic semantics by viewing prima facie non-compositional phenomena as arising from computational side effects. The effectful approach to meaning allows one to view compositionally recalcitrant features (e.g., discourse referents, intensionality, and scope-taking) as giving rise to a rich, but uniform structure which integrates them with truth-conditional meaning. This integration is done by injecting truth-conditional meaning into the effectful structure in a natural way, such that the structure gives rise to a functor. This pattern of analysis has come in a variety of forms: continuations to study scope (Barker, 2002; Barker & Shan, 2014; de Groote, 2001), graded applicative functors to study quantification (Kobele, 2018a), and monads to study discourse referents, anaphora, and indefiniteness, among other phenomena (Shan, 2002; Giorgolo & Unger, 2009; Giorgolo & Asudeh, 2012; Charlow, 2014, 2020a, 2020b). Progress in understanding dynamic and scopal phenomena in terms of effects, however, has presented two basic methodological questions. On the one hand, given effectful treatments of individual phenomena (say, discourse referents, quantification, and conventional implicature), how does one integrate them into a semantic analysis encompassing them all? On the other hand, how does one study interactions between these phenomena, while simultaneously preserving their individual treatments in the result? In this paper, we address both questions by providing a general framework, based on algebraic effects, for characterizing individual dynamic semantic phenomena, as well as their interactions, in terms of algebraic laws.
Stated in other terms, our goal is to improve the compositionality properties of functor-based theories of dynamic semantics; i.e., by recasting them as algebraic theories:
• At a meta-theoretical level, when two phenomena are described by two distinct theories within our framework, we provide a systematic recipe for obtaining a combined theory of both phenomena. The combination is monotonic, in the sense that the predictions of the original theories, regarding either phenomenon, remain unchanged in the combined theory.
• In individual analyses, when two syntactically adjacent constituents feature two distinct (yet possibly interacting) phenomena, their meanings may always be combined compositionally, in order to obtain a meaning for their combination.
We begin in Sect. 2 with a background to monadic dynamic semantics, and we present the issue of compositionality that pertains to monads and monad transformers. In Sect. 3, we axiomatize our approach in terms of the meta-language we use to describe algebraic theories, and we show how meanings may be provided in terms of this meta-language. Section 4 provides an interpretation of this axiomatization in terms of a simply typed λ-calculus with products. We discuss related work in Sect. 5 before concluding in Sect. 6.

Monadic Dynamic Semantics
Since the work of Shan (2002), monads have provided a popular interface for semantic analyses employing computational effects. Monads have been used to study anaphora (Giorgolo & Unger, 2009) and conventional implicature (Giorgolo & Asudeh, 2012), and have more recently been taken up by Charlow (2014, 2020a) to study the interactions among quantification, anaphora, indefiniteness, and binding in a framework that relies on monad transformers. We assume a general familiarity with monads, but we briefly remind the reader of their structure, in order to introduce notation. A monad M is an endofunctor that takes a given type α onto a type Mα of computations exhibiting structure that encapsulates some desired side effect, e.g., reading and writing to a store, or non-determinism. Each monad M is associated with two operators, η ('return') and ≫= ('bind'), having the following type signatures, for any types α and β:

  η : α → Mα
  (≫=) : Mα → (α → Mβ) → Mβ

The role of η is to inject pure (i.e., non-effectful) values into the structure provided by M, while ≫= sequences a computation of type Mα with an indexed computation of type α → Mβ to produce a sequenced computation of type Mβ.

Using Monad Transformers: Charlow (2014)
Charlow (2014) introduces a monadic dynamic semantics that combines analyses of anaphora, indefiniteness, and quantification by relying on monad transformers. In particular, Charlow uses a Powerset monad to characterize indefiniteness, and then applies a State monad transformer, in order to obtain a system to characterize both indefiniteness and anaphora in the same grammar. He then applies a Continuation monad transformer, in order to provide a setting to study quantification. Crucially, the analyses that he provides for individual phenomena are extended compositionally to obtain analyses of their combinations with new phenomena. The Powerset monad P allows one to analyze indefinite noun phrases (and the expressions with which they compose) as denoting sets, encoded as functions of type α → t:

  Pα = α → t
  η a = {a}
  m ≫= k = ⋃{k x | x ∈ m}

This way, the noun phrase a linguist, for instance, will denote the set {x | ling x} and may be composed with an intransitive verb such as sleeps by injecting the latter into the monad via η: η sleep. To compose them, Charlow employs monadic functional application (which he overloads with forward and backward application, to be disambiguated by the types of arguments). Functional application (FA) is defined as follows for an arbitrary monad M (shown here for backward application, with the argument first; forward application is symmetric):

  FA m n = m ≫= λx. n ≫= λf. η(f x)

Now, a linguist sleeps may be interpreted as FA {x | ling x} (η sleep), which can be reduced to {sleep x | ling x}; that is, a set of truth values containing True iff some linguist sleeps.
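The Powerset analysis above can be sketched executably. The following is a minimal illustration, not Charlow's implementation: it assumes finite Python frozensets in place of characteristic functions of type α → t, and the names `eta`, `bind`, `fa_backward`, and the toy model (`linguists`, `sleepers`) are hypothetical.

```python
# Assumption: finite frozensets stand in for sets encoded as functions α → t.

def eta(a):
    """Inject a pure value into the Powerset monad: the singleton {a}."""
    return frozenset([a])

def bind(m, k):
    """Sequence m with k: the union of k(x) over every x in m."""
    return frozenset(y for x in m for y in k(x))

def fa_backward(m, n):
    """Backward monadic application: run the argument m, then the function n."""
    return bind(m, lambda x: bind(n, lambda f: eta(f(x))))

# Hypothetical toy model: two linguists, one of whom sleeps.
linguists = frozenset({"ann", "bo"})
sleepers = {"ann"}

def sleep(x):
    return x in sleepers

a_linguist = linguists
a_linguist_sleeps = fa_backward(a_linguist, eta(sleep))
```

In this toy model, `a_linguist_sleeps` contains `True` because at least one linguist sleeps, which matches the truth condition stated above.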
To incorporate anaphora, he invokes the following State monad transformer, which takes an underlying monad M onto a new monad S_T M, for some fixed type s of states:

  S_T M α = s → M(α × s)

In this example, M will be instantiated to the Powerset monad, and s to the type of lists of individuals. This transformation of the Powerset monad to provide State functionality allows such lists to be accessed and updated throughout semantic composition as lists of discourse referents. To allow the indefinite a linguist to introduce a discourse referent, for example, Charlow defines the following operation, (·)^▹, for an underlying Powerset monad, which we give, however, for an arbitrary underlying monad M in the presence of State functionality:

  m^▹ = λs. m s ≫= λ⟨x, s′⟩. η⟨x, x :: s′⟩

Here, ≫= and η are those of M, and the operation :: conses a new individual onto a list, thus providing it as a discourse referent. Now, one can associate the sentence a linguist sleeps with a discourse referent by having a linguist introduce it (given an updated instance of FA):

  a linguist: λs.{⟨x, x :: s⟩ | ling x}
  a linguist sleeps: λs.{⟨sleep x, x :: s⟩ | ling x}

Thus the meaning of a linguist has changed, given our use of the State-transformed Powerset monad. The new meaning is, in fact, straightforward to obtain from the old meaning, however, in terms of a function lifting values from Mα to S_T Mα:

  lift_S m = λs. m ≫= λx. η⟨x, s⟩

It is in this sense that the addition of State functionality to meanings stated with respect to the Powerset monad is (in principle) compositional. Both the monadic combinators of the Powerset monad, and the meanings it is used to characterize, may be injected into the State setting.
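A small sketch may help to see how State functionality is layered over the Powerset monad. It assumes the standard State transformer (states threaded alongside returned values), models states as Python tuples of individuals, and uses hypothetical names (`lift_s`, `intro`, `bind_s`) for the lifting function, the referent-introducing operation, and the transformed bind.

```python
# Underlying Powerset monad (assumption: finite frozensets model sets).
def eta_p(a):
    return frozenset([a])

def bind_p(m, k):
    return frozenset(y for x in m for y in k(x))

# State-transformed Powerset monad: a computation maps a state s to a set
# of (value, new_state) pairs. States are tuples of discourse referents.
def eta_s(a):
    return lambda s: eta_p((a, s))

def bind_s(m, k):
    return lambda s: bind_p(m(s), lambda pair: k(pair[0])(pair[1]))

def lift_s(m):
    """Lift a plain set into the State setting, leaving the state untouched."""
    return lambda s: bind_p(m, lambda x: eta_p((x, s)))

def intro(m):
    """Cons each returned value onto the state as a new discourse referent."""
    return lambda s: bind_p(
        m(s), lambda pair: eta_p((pair[0], (pair[0],) + pair[1]))
    )

# Hypothetical toy model.
linguists = frozenset({"ann", "bo"})
sleepers = {"ann"}

a_linguist = intro(lift_s(linguists))
a_linguist_sleeps = bind_s(a_linguist, lambda x: eta_s(x in sleepers))
result = a_linguist_sleeps(())  # start from the empty list of referents
```

Each alternative in `result` pairs a truth value with the list of referents made available by that choice of linguist.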
Charlow uses this strategy to introduce analyses of quantificational noun phrases into the monadic setting. Taking inspiration from the continuation-style treatments of quantifiers of Barker (2002) and Barker and Shan (2014), he employs the following Continuation monad transformer, C_T:

  C_T M α = (α → M t) → M t
  η a = λk. k a
  m ≫= f = λk. m (λx. f x k)

The underlying monad, in this case, is the State-transformed Powerset monad. Like the State monad transformer, the Continuation monad transformer also comes with a lifting function lift_C:

  lift_C m = λk. m ≫= k

Conversely, a meaning of type C_T M t may be lowered into the underlying monad by applying it to η:

  lower_C m = m η

Such lowered meanings may, in turn, be lifted back into the Continuation monad, e.g., in order to further compose them with quantificational meanings. Note that once a quantifier has been lowered, its scope is fixed. Thus Charlow composes lift_C and lower_C in this way, as an operator reset, in order to delimit the scope of quantifiers to finite clause boundaries. At the same time, as he shows, lowering does not affect the capacity of indefinite noun phrases and discourse referents to take scope; their side effects are still potent, as they are represented in terms of the State-transformed Powerset monad. As a consequence, the limited scopal possibilities for quantifiers and the flexible scoping behavior of indefinites and discourse referents may be modeled within the same continuation-based setting.
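The effect of lowering and resetting can be illustrated with a deliberately simplified sketch. As an assumption made for brevity, the underlying monad here is the plain Powerset monad rather than the State-transformed one, and the definitions of `lift_c`, `lower_c`, `reset`, and `every` are the standard continuation-style ones, not transcriptions of Charlow's.

```python
# Underlying Powerset monad (assumption: finite frozensets model sets).
def eta_p(a):
    return frozenset([a])

def bind_p(m, k):
    return frozenset(y for x in m for y in k(x))

# Continuation layer: a computation consumes a continuation k.
def eta_c(a):
    return lambda k: k(a)

def bind_c(m, f):
    return lambda k: m(lambda x: f(x)(k))

def lift_c(m):
    """Lift an underlying computation by binding it to the continuation."""
    return lambda k: bind_p(m, k)

def lower_c(m):
    """Lower a continuized truth value by feeding it the underlying unit."""
    return m(eta_p)

def reset(m):
    """Lower, then lift again: the scope of any quantifier in m is fixed."""
    return lift_c(lower_c(m))

def every(restrictor):
    """Hypothetical universal quantifier over a finite restrictor set."""
    return lambda k: eta_p(all(True in k(x) for x in restrictor))

linguists = frozenset({"ann", "bo"})
sleepers = {"ann"}

every_linguist_sleeps = bind_c(every(linguists), lambda x: eta_c(x in sleepers))

def neg(b):
    """A continuation that negates the truth value it receives."""
    return eta_p(not b)

wide_scope = every_linguist_sleeps(neg)      # every > not
fixed_scope = reset(every_linguist_sleeps)(neg)  # not > every
```

Without `reset`, the quantifier takes scope over the negating continuation; after `reset`, the continuation applies to the already closed-off truth value.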

Monads and Compositionality
The above discussion provides only a schematic presentation of the system of Charlow (2014). What we hope to have conveyed, however, is the manner in which the system, as a theory of indefiniteness, anaphora, and quantification, is monotonic and compositional in the senses introduced earlier. The theory of indefiniteness may be stated on its own, in terms of the Powerset monad, and then embedded into the combined theory of indefiniteness and anaphora, using a monad transformer. This combined theory may likewise be embedded into the combined theory of indefiniteness, anaphora, and quantification. To say that the embedding is monotonic and compositional is to say that it constitutes a (monad) homomorphism. Every lifting function lift has the property, in general, that it preserves the monadic combinators: lift (η a) = η a, and lift (m ≫= k) = lift m ≫= λx. lift (k x). Thus the theory stated with respect to the underlying monad is never truly forgotten and may, in fact, be used when convenient, i.e., before applying a lift.
What we aim to show in this paper is that the algebraic approach that we advocate has this property to an even greater degree. Indeed, the simple monadic approach requires, in many cases, determining a monad ahead of time that combines all of the effects which may occur in a given analysis. For example, say that one wants a theory of quantification on its own, independent of a theory of indefiniteness and anaphora.
Then, one may employ the Continuation monad (akin to Barker, 2002; Barker & Shan, 2014); in this case, the definitions of η and ≫= remain identical to those stated above, except for their types: the result type of the continuation is now simply t, so that Mα = (α → t) → t. In turn, every philosopher may be given its usual generalized-quantifier meaning, i.e., λk.∀x : phil x → k x. Incorporating theories of indefiniteness and anaphora, however, will now prove more difficult. The State monad transformer will provide a monad that takes a type α onto the type s → (α × s → t) → t. Indeed, this result may appear, at first, to be suitable for a combined analysis of quantification and anaphora, but note, for example, that the value returned within the underlying Continuation monad will systematically have the type of a product. As a result, a lower operation will be required to handle continuations over such products, and it is not obvious what the appropriate definition of such an operation would be.⁴ Rather, in order to achieve the desired result, it seems that one must start with the Powerset monad, then incorporate anaphora, and then finally, incorporate quantification. Much more generally, the lifting functions lift_X associated with each monad transformer X are often unidirectional, requiring that a choice of result fixes the underlying monad. Thus, ensuring that the resulting monad has a certain desired behavior will limit the flexibility with which one is able to combine different sources of functionality. In contrast, as we will show, the algebraic approach assigns a type with the minimum required effect to each meaning, and the combination of meanings with different effects systematically computes the appropriate result. There is thus no priority associated with one effect or another.⁵ A second difference between our algebraic approach and the monadic approach is that the types we compute for meanings exhibiting multiple effects are more informative: they yield a linguistically meaningful summary of the effects an expression gives rise to, as we will show.

4 The most obvious candidate would be to throw out the state returned by the surrounding function on continuations; that is, such a lower would feed its argument a continuation that keeps the returned value and discards the returned state. Such an operation, however, would invariably discard the anaphoric potential of its argument, treating, e.g., quantifiers and proper names alike. (In contrast, choosing State to be the underlying monad would allow anaphoric side effects to survive when they do not arise from bona fide quantifiers.)

5 One may wonder if the algebraic approach could be recast in terms of monad transformers. An issue which would arise is that the relevant lifting operation depends both on the meaning to which it is applied and on the context. Thus if we have n different atomic effects, we must consider n × (n − 1) combinations of them (one for each pair of effects). Furthermore, monad transformers are ill-equipped to deal with effect bracketing, which we introduce in Sect. 3.5.

Algebraic Effects via Graded Monads
As a way forward, we propose a double move: to simultaneously make the monadic approach modular and make its types more fine-grained.
First, we propose that semantic side effects be studied algebraically, in terms of equational laws characterizing the individual phenomena, which may then be combined. This move is inspired by Maršík (2016) and Maršík and Amblard (2014, 2016), who develop a typed extension of the λ-calculus to study algebraic effects in semantics. Unlike the approach of Maršík, we show how effects employed by semanticists (e.g., state and non-determinism) may be recast algebraically (while remaining in a pure setting), leading to more extensible grammars.
Second, we propose to track the relevant effects at the level of types, by using a graded monad. In contrast to plain monads, graded monads are indexed with an abstraction of the effect that they perform, hereafter referred to simply as the "grade". The unit η of the monad is associated with the unit grade (1). The grade of the composition of effects under ≫= is the composition of their grades, written with the operator (·) (see Fig. 1). Graded monads have been applied previously in the field of programming language theory to describe the semantics of algebraic effects (Katsumata, 2014; Mycroft et al., 2016; Orchard et al., 2019). In natural language semantics, they have been employed in the analysis of presupposition projection and anaphora (Grove, 2019). In our analysis, different phenomena are assigned different grades independently of each other. This means that the interpretations associated with individual phenomena may be freely composed, in order to yield grammars that combine the relevant effects.
One can then describe the interactions between effects using two sets of laws. The first set concerns the abstract level of grades. The second set concerns the concrete level of λ-terms and operations. These two sets of laws are related: any law between terms generates a corresponding law between grades; that is, any law governing terms is only allowable if there exists a corresponding law governing the behavior of grades. To illustrate, consider the unit and associativity laws on terms, which are part of the definition of a graded monad (Fig. 1). For types to be preserved in the statement of associativity for ≫=, the (·) operator must be associative. Likewise, for types to be preserved in the identity laws regulating the behavior of η, 1 must be the left and right unit of (·). That is, grades must form a monoid.
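The relationship between the monad laws on terms and the monoid laws on grades can be checked on a small runtime model. This is only a sketch under a loud assumption: grades, which the text tracks in types, are carried here as Python tuples next to the returned value, with the empty tuple as the unit grade and concatenation as (·). The helper `effect` and the sample grade names are hypothetical.

```python
UNIT = ()  # the unit grade 1

def compose(g1, g2):
    """Grade composition (·), modelled as tuple concatenation."""
    return g1 + g2

def eta(a):
    """Return a pure value; its grade is the unit grade."""
    return (UNIT, a)

def bind(m, k):
    """Sequence m with k; the resulting grade composes the two grades."""
    g1, a = m
    g2, b = k(a)
    return (compose(g1, g2), b)

def effect(name, value):
    """A hypothetical atomic effect carrying a single named grade."""
    return ((name,), value)

m = effect("Get[d]", "john")
k = lambda x: effect("Put[d']", x.upper())
h = lambda y: effect("Scope", len(y))

left_nested = bind(bind(m, k), h)
right_nested = bind(m, lambda x: bind(k(x), h))
```

Because tuple concatenation is associative and has the empty tuple as its unit, both nestings carry the same grade, mirroring the requirement that grades form a monoid.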
From now on, we develop not just an equational theory of terms and grades, but a theory of reduction. That is, we use a reduction relation between terms written '−→', and one between grades written '⇝'. These relations are the (respective) reflexive, transitive, and congruent closures of the laws that we list below. By definition, two terms t₁ and t₂ are equal if they are inter-reducible; likewise for grades. At this point, our theory encompasses only the graded monad laws. At the introduction of any new law, we will ensure that the reduction relations on both grades and terms are confluent. In particular, we will ensure that the asserted laws are compatible with associativity; i.e., g₁ · (g₂ · g₃) and (g₁ · g₂) · g₃ should always reduce to the same grade. Similarly at the level of terms: any proposed reduction rule should respect the Associativity law. We further discuss the importance of confluence in Sect. 5.1.

Fig. 1 Definition of a graded monad

Compositional Dynamic Semantics
As recalled in Sect. 2, monadic semantics in the style of Shan (2002) aims to augment the interpretation of each syntactic category with an effect. In the present framework, this effect is graded. For example, if a sentence is interpreted as a truth value of type t in a non-effectful semantics, it is interpreted in our framework as a truth value associated with an effect with some grade g, i.e., of type M_g t.
Moreover, whereas in Montague semantics, one uses functional application, we additionally employ the graded applicative functor structure arising from the graded monad, characterized by (either of) its forward and backward application operators. For illustration, we present a small applicative categorial grammar fragment in Table 1, and, in Fig. 2, two rules of interpretation corresponding to functional application and two rules which make use of the applicative functor structure of our system. The need for seemingly redundant rules corresponding to simple functional application (above in Fig. 2), in addition to applicative combination (below in Fig. 2), arises from the fact that some meanings manipulate effectful values directly: their type is of the form M_p α → M_q β, rather than M_q (α → β). As such, they cannot be combined by either of the applicative operators.⁸ We additionally admit a rule (μ) which collapses a meaning of type M_{g₁} (M_{g₂} α) into one of type M_{g₁·g₂} α by sequencing it (via ≫=) with the identity function. In the following pages, we write 'μ m' in place of 'm ≫= λx.x' to be concise. Using only the (\) rule, we may interpret john walks as a value whose grade is 1, i.e., one without any dynamic effect. The definitions of the applicative operators and the monad laws allow this result to be reduced to η(walk j).

Anaphora
We can extend our analysis to account for anaphora. For any type α, we may posit a grade Get[d : α], along with a new primitive, get_d:

  get_d : M_{Get[d:α]} α

8 Indeed, the choice between the simple and applicative variants of (/) in a derivational step is determined by the semantic types of the arguments being combined. Likewise for the choice between the variants of (\). (Semantic types of the form M_p α → M_q β will be encountered in Sect. 3.5.) The same quirk justifies the presence of the μ rule, which we introduce next.
The purpose of get_d is to retrieve a discourse referent d, whose type is α, from the linguistic context.⁹ For instance, one can consider α to be e, the semantic type of entities, although any semantic type is supported, in principle. The grade Get[d : α] records that one presupposes the existence of a discourse referent with label d and type α. For example, get_d may be used to interpret a pronoun, with the typing get_d : M_{Get[d:e]} e. The labels used for discourse referents are equipped with a decidable equality relation, but otherwise, they carry no meaning.¹⁰ It should be noted that labels occur only inside grades; in Sect. 4, we show how the primitives may be interpreted into a label-free calculus. Finally, thanks to the typing rule for ≫=, a phrase which uses some number of discourse referents lists them all in its grade. For example, we might have the type M_{Get[d_masc : e]·Get[d_fem : e]} t for the sentence he likes her.
Our goal is to formalize how grades interact. Since we do not keep track of the order in which discourse referents are introduced, we have the following equality on grades:

  Get[d : α] · Get[d′ : β] = Get[d′ : β] · Get[d : α]   (1)

Whenever we assert such a law on grades, it is important to check that it preserves the overall system's confluence in the presence of the other laws, including the monoid laws. So far, we have asserted only a commutation law, and it is easy to see that no problem arises. Second, we do not keep track of how many references to a single discourse referent occur. Moreover, if two references to the same discourse referent are made, their types should agree. This is captured by the following law:¹¹

  Get[d : α] · Get[d : α] ⇝ Get[d : α]   (2)

To complete the formal definition of the treatment of anaphoric expressions, it suffices to state how two instances of get_d should interact, as guided by the behavior of their grades. We employ two laws on terms (which we label according to the respective corresponding laws on grades), where d ≠ d′ in the first:

  get_d ≫= λx. get_d′ ≫= λy. k x y = get_d′ ≫= λy. get_d ≫= λx. k x y   (1)
  get_d ≫= λx. get_d ≫= λy. k x y −→ get_d ≫= λx. k x x   (2)

The first law states that references to independent discourse referents commute. This law corresponds to law (1) on grades stating that the order of labels in a grade does not matter. The second law states that two references to the same discourse referent collapse to a single reference. This law corresponds to law (2) on grades, which collapses two associations with the same label. Note that, instead of first presenting the laws on grades, we could have stated the algebraic laws on terms and deduced their typing. Correct typing ensures that the behavior of terms, as captured by the algebraic laws, is mirrored by the behavior of grades, as captured by the grade laws.

9 We encode here roughly the notion of discourse referents of Karttunen (1976).

10 This decision procedure tells whether or not there is co-reference. A possible implementation of it would be to match the properties of referents with predicates associated with anaphoric expressions (Bernardy et al., 2021).

11 Our framework is, in principle, agnostic about the type system of the underlying λ-calculus. For instance, rich types, as proposed by Luo (2012), are supported, as is the simply typed λ-calculus. Even though we will avoid rich types in our analysis, we note that they may be particularly beneficial when it comes to tracking discourse referents. For instance, law (2) generalizes as follows:

  Get[d : α] · Get[d : β] ⇝ Get[d : α ∧ β]

(where 'α ∧ β' refers to the meet of types α and β). Thus if two parts of a phrase refer to the same discourse referent, then the type associated with that discourse referent needs to be the meet of the types found in the parts. Additionally, complex relations can be captured within the types of discourse referents. For example, the meaning of john sees his dog could be assigned the type M_{Get[d : (x : dog)(have(j, x))]} t, which records a presupposition of the existence (via a type) of John's dog. In the presence of rich types, one can additionally expect the types of the discourse referents to play a role in resolving anaphora.

Introducing Discourse Referents
As the dual to accessing discourse referents, one can introduce new ones. For this purpose, we add a new grade Put[d : α], along with a new primitive:

  put_d : α → M_{Put[d:α]} ⋄

The returned type, ⋄, is the unit type, thus signifying that put_d makes no significant contribution at the level of values. In terms of this primitive, one can define an operation (·)^{▹d}, which binds its argument to the discourse referent d. (The notation is inspired by the similar notation of Barker and Shan (2014), as well as of Charlow (2014).) This operation performs the dynamic effects associated with its argument, following which it binds the value returned to d:

  m^{▹d} = m ≫= λx. put_d x ≫= λ⋄. η x

The 'λ⋄.' notation indicates that a value of type ⋄ is expected as an argument to the relevant λ-expression.
To illustrate, let us return to our running example, given the updated lexicon in Table 2. We now interpret john walks as follows.
  john, (ηj)^{▹d} :: NP
  walks, η walk :: NP\S
  john walks, (ηj)^{▹d} combined with η walk by the applicative (\) rule :: S

After unfolding the definitions and β-reducing, we obtain put_d j ≫= λ⋄. η(walk j), whose type is M_{Put[d:e]} t, thus capturing that the discourse referent d has been introduced.
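When considered on its own, put_d behaves similarly to get_d. The order of introduction does not matter:

  Put[d : α] · Put[d′ : β] = Put[d′ : β] · Put[d : α]   (3)

Consequently, two discourse referents commute. We can formalize this as the following equation on terms:

  put_d x ≫= λ⋄. put_d′ y ≫= λ⋄. k = put_d′ y ≫= λ⋄. put_d x ≫= λ⋄. k   (3)

Although get_d and put_d arise independently and have interpretations on their own, we can describe their interactions in terms of algebraic laws. We illustrate this fact first on the relations on terms, by adding two laws, where d ≠ d′ in the second:

  put_d x ≫= λ⋄. get_d ≫= k −→ put_d x ≫= λ⋄. k x   (4)
  put_d x ≫= λ⋄. get_d′ ≫= k −→ get_d′ ≫= λy. put_d x ≫= λ⋄. k y   (5)

These laws ensure that get_d uses only the discourse referent d that put_d introduces.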
Assuming that the terms are well typed, the grades on the left should reduce to the grades on the right; consequently, the following laws hold on grades, where d ≠ d′ in the second:

  Put[d : α] · Get[d : α] ⇝ Put[d : α]   (4)
  Put[d : α] · Get[d′ : β] ⇝ Get[d′ : β] · Put[d : α]   (5)

The first law finds a satisfying linguistic justification: when a discourse referent is introduced, it is no longer presupposed. The second law ensures that introductions and uses of distinct discourse referents ignore each other. To illustrate, consider composing the two utterances john walks with he sits. Given the lexicon in Table 2, this miniature discourse receives a meaning whose grade, Put[d : e] · Get[d : e], reduces by law (4). The resulting meaning is of type M_{Put[d:e]} t; it introduces a discourse referent (d), but has no anaphoric presupposition, despite the presence of the pronoun he. That is, its reference is resolved. Checking confluence is a less easy exercise now than before. We can, however, convince ourselves that it holds by checking that the relevant re-associations of Put and Get grades are confluent.
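The behavior of Get and Put grades described in this section (commutation of independent references, collapse of repeated references, and resolution of a presupposition by an earlier introduction) can be sketched as a normalization function. This is an illustrative model only: grades are Python tuples of (kind, label) pairs, the types of discourse referents are ignored, and the function name `normalize` and the sorted canonical order are assumptions.

```python
def normalize(grade):
    """Reduce a sequence of Get/Put grades to a canonical form.

    A Get is dropped when the same label was Put earlier (the referent is
    no longer presupposed); repeated Gets of one label collapse; remaining
    Gets move to the left of all Puts; each group is sorted, since the
    order of independent labels does not matter.
    """
    gets, puts = [], []
    for kind, label in grade:
        if kind == "Put":
            puts.append(label)
        elif label not in puts and label not in gets:
            gets.append(label)
    return tuple(("Get", d) for d in sorted(gets)) + tuple(
        ("Put", d) for d in sorted(puts)
    )

john_walks = (("Put", "d"),)
he_sits = (("Get", "d"),)
discourse = normalize(john_walks + he_sits)

he_likes_her = normalize((("Get", "masc"), ("Get", "fem"), ("Get", "masc")))
he_sits_john_walks = normalize(he_sits + john_walks)
```

Here `discourse` carries only the introduction of d, whereas `he_sits_john_walks` keeps its presupposition, since a referent introduced later cannot resolve an earlier pronoun.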

On the State Monad
Both Charlow (2014) and other work in monadic dynamic semantics have employed the state monad, in order to model anaphora (Giorgolo & Unger, 2009; Unger, 2012). The foregoing formalization vindicates some of the state monad laws (laws (2) and (4)), but to get a full specification of the state monad, one additionally needs the following law:

  get_d ≫= λx. put_d x −→ η⋄   (6)

To preserve types, this law on terms requires the following law to hold on grades:

  Get[d : α] · Put[d : α] ⇝ 1   (6)

Such a law is problematic, however, as it contravenes confluence:

  Get[d : α] · Get[d : α] · Put[d : α] ⇝ Get[d : α] · 1 = Get[d : α]   (by law (6))
  Get[d : α] · Get[d : α] · Put[d : α] ⇝ Get[d : α] · Put[d : α] ⇝ 1   (by law (2), then law (6))

Thus not all of the state monad laws can be imported into our framework, given how we employ graded types. What is responsible for this difference? The state monad is a theory of memory locations. According to the corresponding model of state, such memory locations pre-exist the lifetime of a program, and can be updated any number of times. Using a state monad to model anaphora would thus require that a constant set of referents be handled by the discourse. In comparison, our encoding of discourse referents is more precise: we record at the level of grades the exact discourse referents either introduced or presupposed. For our purposes, there is a fundamental difference between introducing a discourse referent and not introducing it. A contrario, we ought to reject the hypothetical law (6), which implies that using a discourse referent and then introducing it is, in fact, equivalent to doing nothing.
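The failure of confluence can be replayed mechanically. The sketch below is a toy rewriting system, not the formalization used in the paper: grades are tuples of (kind, label) pairs, only the collapse of repeated Gets and the hypothetical Get-then-Put law are modelled, and `steps` and `normal_forms` are hypothetical helper names.

```python
def steps(grade, use_hypothetical_law):
    """All grades reachable from `grade` in exactly one rewriting step."""
    out = set()
    for i in range(len(grade) - 1):
        a, b = grade[i], grade[i + 1]
        # Two adjacent references to the same referent collapse into one.
        if a == b and a[0] == "Get":
            out.add(grade[:i] + (a,) + grade[i + 2:])
        # Hypothetical law: Get[d] followed by Put[d] reduces to the unit.
        if use_hypothetical_law and a[0] == "Get" and b[0] == "Put" and a[1] == b[1]:
            out.add(grade[:i] + grade[i + 2:])
    return out

def normal_forms(grade, use_hypothetical_law):
    """Every irreducible grade reachable from `grade`."""
    successors = steps(grade, use_hypothetical_law)
    if not successors:
        return {grade}
    return set().union(
        *(normal_forms(g, use_hypothetical_law) for g in successors)
    )

get_d, put_d = ("Get", "d"), ("Put", "d")
start = (get_d, get_d, put_d)

with_law = normal_forms(start, True)
without_law = normal_forms(start, False)
```

With the hypothetical law enabled, the same grade reaches two distinct irreducible grades (the unit grade and a lone Get), so the system is not confluent; without it, a single normal form remains.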

Quantification
As a further step, we may introduce another grade, Scope, in order to analyze expressions, such as every, which are commonly taken to denote generalized quantifier meanings. Like those introduced above, this grade is accompanied by its own primitive:

  scope : ((e → t) → t) → M_{Scope} e

Thus given a quantifier q of type (e → t) → t, scope q allows it to act as an entity at the level of values, i.e., in terms of the variable q binds, given that the primitive's return type is e. The scopes of natural language quantifiers have been observed to be restricted in certain ways: one common view is that a quantifier cannot take scope outside the smallest finite clause in which it occurs syntactically. For example, some cat fears every dog will chase it can be understood only to imply the existence of a single highly pessimistic cat. To capture the effect of scope islands, we also introduce an operation ⟨·⟩ on grades and a primitive ⟨·⟩ which introduces it. The intent is that ⟨body⟩ allows one to ensure that a value bound in body using scope q is not available outside of body. This makes it possible to statically limit the scope of a variable bound by a quantifier.
The modularity provided by our approach allows us to import the laws regulating anaphora into the current setting. At the same time, we may describe the interactions between anaphora and quantification. To that end, we state laws (7)–(12) on grades, each of which is reflected by a corresponding law on terms. The occurrences of Get[d : α] inside a bracket can be pulled to its left [laws (7) and (12)]. Doing so, moreover, facilitates its meeting a Put[d : α], which can then eliminate it.
Note that a law commuting Put[d : α] and Scope is absent. Indeed, the grade Scope · Put[d : α] corresponds to introducing an entity which may depend on another entity quantified over. Such a commutation should be rejected, as it would allow the introduced entity to escape its scope. An entity which is introduced inside a bracket, but before any Scope introduction, however, can be pulled out of the bracket, as per law (11).
A Scope introduced at the rightmost point of the body of a bracket can be reduced (law (9)): the operational interpretation of this law is to apply the quantifier to the returned property. If a discourse referent is introduced at the rightmost point of the body, immediately after Scope, then the introduction is simply ignored (law (10)). This should remain true for any number of introduced entities, moreover. To avoid introducing a scheme of reduction laws, we may use a law which coalesces indefinitely many introductions into one (or splits them) as needed.

Indefinites
We now turn to indefinite noun phrases. Here, we pursue the idea of Charlow (2014, 2020a) that the meaning of an indefinite noun phrase is to non-deterministically choose an entity from the set defined by its restriction. To do so, we introduce a new grade, Choose[α], indexed by a type α, and associate it with the following primitive:

  choose : (α → t) → M_{Choose[α]} α

We additionally provide the following law on grades:

  Choose[α] · Choose[β] ⇝ Choose[α × β]

This law is reflected at the level of terms by a corresponding law. Intuitively, what this law says is that choosing two values in sequence is the same as choosing them simultaneously, as a pair. To remain concise, we transcribe only the laws on terms that relate choose and scope [laws (18) and (19)].

Determiners and Donkey Anaphora
The determiner algebra provides a new grade, Det, from which we define a new primitive, det, which introduces a determiner meaning and merely returns it. The utility of including determiners among the grades is manifest, however, when considering their interactions with other effects, in particular Choose[α]. Note that each of the laws governing these interactions has a corresponding law that involves Scope, rather than Det. Indeed, the corresponding laws on terms are analogous, except for laws (23) and (24), which are substantively different. The laws on terms for laws (21) and (22) are realized by feeding a determiner meaning to its continuation. More interesting are laws (23) and (24), each of which can be realized in two ways, i.e., by two alternative laws on terms. The first gives rise to a "weak" existential reading of donkey sentences, while the second gives rise to a "strong" universal reading.

With the lexicon in Table 4, we may derive a meaning for every new yorker who sees a dog pets it in which det every returns the determiner Q, the restrictor new yorker who sees a dog supplies a property P, and scope (Q P) introduces the resulting quantifier. At this point, we have two options, depending on the reduction rule we choose to coincide with law (24). If we opt for the weak reading, we can continue as follows:

  −→ η(every (λy.∃x : dog x ∧ NYer y ∧ see x y) (λy.∃x : dog x ∧ NYer y ∧ see x y ∧ pet x y))

On this reading, every New Yorker who sees a dog pets at least one dog they see. If we opt instead for the strong reading, we can continue as follows:

  −→ η(every (λy.∃x : dog x ∧ NYer y ∧ see x y) (λy.∀x : dog x → ((NYer y ∧ see x y) → pet x y)))

Now, every New Yorker who sees a dog pets every dog they see; i.e., the reading attributed to donkey sentences by most dynamic semantic accounts.
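The difference between the two readings can be checked against a small model. The following sketch evaluates the two final formulas directly; the model (two New Yorkers, two dogs, and the `sees` and `pets` relations) is hypothetical, and relations are stored as (subject, object) pairs.

```python
# Hypothetical model: n1 sees both dogs but pets only d1; n2 sees and pets d1.
new_yorkers = {"n1", "n2"}
dogs = {"d1", "d2"}
sees = {("n1", "d1"), ("n1", "d2"), ("n2", "d1")}
pets = {("n1", "d1"), ("n2", "d1")}
entities = new_yorkers | dogs

def every(restrictor, scope):
    """Generalized quantifier: every entity in the restrictor is in the scope."""
    return all(scope(y) for y in entities if restrictor(y))

def restrictor(y):
    """y is a New Yorker who sees some dog."""
    return any(
        x in dogs and y in new_yorkers and (y, x) in sees for x in entities
    )

def weak_scope(y):
    """y pets at least one dog that y sees."""
    return any(
        x in dogs and y in new_yorkers and (y, x) in sees and (y, x) in pets
        for x in entities
    )

def strong_scope(y):
    """y pets every dog that y sees."""
    return all(
        (y, x) in pets
        for x in entities
        if x in dogs and y in new_yorkers and (y, x) in sees
    )

weak_reading = every(restrictor, weak_scope)
strong_reading = every(restrictor, strong_scope)
```

In this model the weak reading holds while the strong reading fails, because n1 sees a dog that n1 does not pet.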

Realization in Terms of a Pure Calculus
In this section, we provide meanings to the grades, the operations, and their relation in terms of the simply typed λ-calculus with products (hereafter, STLC). We will only provide proof sketches here, but we note that the contents of this section and Sect. 3 have been formalized using the Agda proof assistant.
Proof By case analysis.
Definition 1 (Interpretation of grades) For every graded type M g α, there is a semantic interpretation ⟦M g α⟧ = S g (⟦α⟧) as a type in the STLC (or, more generally, in the underlying typed λ-calculus without effects). ⟦·⟧ preserves STLC types and is defined on graded types as follows.
We stress that this interpretation is entirely modular in the sense that the meanings of the atomic effects are devised independently, without taking into account any interplay between effects. (It is a homomorphism on the grade structure.) As a rule, if the primitive operation associated with an effect takes as input an object of type X , then we take the product with X in the interpretation. Conversely, if such a primitive returns a type Y , then Y is found as the domain of an arrow in the interpretation. A consequence of this modularity is that all the results of this section can be proven in a modular fashion, by case analysis for each atomic grade. For grade composition, a straightforward induction applies.

Lemma 1 S is a graded monad.
The proof relies on the following facts: (1) each atomic grade is interpreted as a functor; (2) the unit grade is interpreted as the identity functor; (3) the composition of grades is interpreted as functor composition.
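The three facts above, together with the product/arrow recipe for atomic grades, can be illustrated at the level of type expressions. This is a minimal sketch under stated assumptions: the atomic grades `Takes` and `Returns` are hypothetical stand-ins (they are not the grades of the paper), composite grades are modeled as tuples of atomic grades, and types are rendered as plain strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Takes:
    """Hypothetical atomic grade whose primitive takes an input of type `ty`."""
    ty: str


@dataclass(frozen=True)
class Returns:
    """Hypothetical atomic grade whose primitive returns a value of type `ty`."""
    ty: str


def interp_atomic(grade, alpha):
    """Interpret one atomic grade as a type constructor applied to `alpha`."""
    if isinstance(grade, Takes):
        # Input of type X: take the product with X.
        return f"({grade.ty} × {alpha})"
    if isinstance(grade, Returns):
        # Output of type Y: Y becomes the domain of an arrow.
        return f"({grade.ty} → {alpha})"
    raise TypeError(f"unknown atomic grade: {grade!r}")


def interp(grade, alpha):
    """Interpret a composite grade (a tuple of atomic grades).

    The empty tuple plays the role of the unit grade (identity functor);
    tuple concatenation plays the role of grade composition
    (functor composition).
    """
    result = alpha
    for atom in reversed(grade):
        result = interp_atomic(atom, result)
    return result
```

For example, `interp((Returns("e"), Takes("t")), "α")` yields `(e → (t × α))`, and interpreting a concatenation `g1 + g2` gives the same result as applying the interpretation of `g1` to the interpretation of `g2`.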
Proof This is a constructive proof done by case analysis. The function f says how (the semantic interpretations of) effects are transformed by reductions. For instance, the law corresponds to functions f : (α × (α → β)) → (α × β), which pass the newly introduced value (of type α) to its continuation, which then uses it. That is, f ⟨x, k⟩ = ⟨x, k x⟩.
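The function f in this example can be written down directly. The sketch below is only an illustration of its type and behavior; the name `pass_to_continuation` is an assumption introduced here.

```python
from typing import Callable, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def pass_to_continuation(pair: Tuple[A, Callable[[A], B]]) -> Tuple[A, B]:
    """f : (α × (α → β)) → (α × β), with f ⟨x, k⟩ = ⟨x, k x⟩."""
    x, k = pair
    # The newly introduced value x is kept and also fed to its continuation k.
    return (x, k(x))
```

Applied to the pair of the value 3 and the successor function, this returns the pair of 3 and 4: the value is preserved, and the continuation is replaced by its result.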
We call the relation induced by such functions ' g 1 g 2 '. (That is, x g 1 g 2 y iff f x = y, where f is a function provided by Theorem 2.) Finally, it bears repeating that the above construction defines the semantics of the reduction relation, and is thus the keystone of the interpretation.
Definition 2 (Interpretation of terms) For every well-typed term t : M g α, we define an interpretation ⟦t⟧ such that ⟦t⟧ : S g (⟦α⟧). The interpretations of η and of the bind operator are given by the graded monadic structure of S (Lemma 1). The recipe for interpreting each atomic grade is based straightforwardly on the type of the primitive giving rise to the grade. For example, ⟦get d⟧ = λx.x, t = t , λφ.φ , etc.
Theorem 3 (Adequacy of the interpretation) The interpretation of terms respects the interpretation of grades and the interpretation of reductions as functions. Formally, if t₁ : M g₁ α, t₂ : M g₂ α, and t₁ −→ t₂, then ⟦t₁⟧ and ⟦t₂⟧ stand in the relation induced for g₁ and g₂; that is, f ⟦t₁⟧ = ⟦t₂⟧, where f is the function provided by Theorem 2.
This theorem essentially tells us that the axiomatization of term reductions exactly fits the interpretations of grades. As a result, if one wishes, one may omit the axiomatization, and use only the interpretation and the corresponding reduction relation. We have chosen to present the axiomatic view to emphasize the operational behavior of terms having effects. If one is interested only in the end product (i.e., pure λ-terms), then one would be better off axiomatizing grades and their relations only. This way, by omitting the axiomatization of operations and algebraic laws, one can describe their compositional meanings (as in Definitions 1 and 2) directly.

Effects and Handlers
To improve compositionality, general effects and handlers systems have been proposed for dynamic semantics by Maršík (2016) and Maršík and Amblard (2014, 2016). In these approaches, new operations, such as get or put, can be declared and defined locally in terms of the ambient calculus. These approaches have much in common with ours, insofar as they provide modular interpretations of the effectful operations they employ. Furthermore, while effectful meanings are defined in a typed extension of λ-calculus, they yield terms of a pure λ-calculus once they are handled. The chief difference between the effects and handlers approach and the one advanced here, which makes algebraic laws central, is that the former approach demands that every occurrence of an operation be interpreted (i.e., handled) independently of the context in which it occurs. This requirement enforces absolute compositionality of interpretation, whereas our method does not. In other words, while our syntax is compositional, the eventual interpretation of a grade may depend on its context. Indeed, our reduction rules are written so that the meaning of an operation can depend on its neighbors. This design allows the interpretation of Scope, for example, to occur only at the rightmost point in a bracket, where it may receive a function of type e → t. Crucially, nevertheless, the results yielded by the applications of laws are compositional: due to associativity and confluence, one may safely apply reduction rules to a term m or a grade g independently of the context in which m or g occurs. When combining m with a continuation k, it suffices to consider their reduced forms: confluence guarantees that the result of combining m with k is the same, regardless of what reductions occur before their combination.

The Underlying Calculus
Even though we have assumed the STLC as our ambient calculus, monadic and algebraic effects approaches (and, more generally, approaches based on computational effects) are agnostic as to the type system used by the underlying λ-calculus, be it Martin-Löf Type Theory (Martin-Löf, 1984) or one of its variants, System F (Girard, 1972), Cooper's TTR (Cooper and Ginzburg, 2015), or Asher's TCL (Asher, 2011). Thus our approach (like others) may be added to such systems without modifying the respective calculi.

Graded Effects
Our treatment of discourse phenomena in terms of grades is partially inspired by the interpretation of Cooper storage in terms of a graded applicative functor due to Kobele (2018a). Kobele employs grades that correspond to stores of quantifier meanings, in order to encode the types of both stored quantifiers and the variables they bind. We employ somewhat richer grades than Kobele, in order to encode, e.g., discourse referents. Such rich grades allow us to describe linguistically meaningful interactions at the level of types that reflect the algebraic laws that apply at the level of terms.

Modalities Instead of Graded Monads
Our presentation relies on the standard structure of λ-calculi to encode dynamic effects as monads. This causes a certain amount of notational weight in the axiomatization. Namely, we have to use a family of dedicated operators instead of simple functional application.
To avoid this overhead, an alternative presentation could use modalities to represent the combination of dynamic effects associated with a value. Several calculi supporting this kind of modality have been developed recently (Petricek et al. 2014; Orchard et al. 2019; Abel and Bernardy 2020).

Conclusion
We have proposed a framework which both unifies and refines approaches to dynamic semantics based on monads. The key idea is to break down effects into atomic grades. The interactions among grades are provided by algebraic laws, which can be presented in a modular fashion. Even though the number of possible laws grows quadratically with the number of possible effects, the number of laws actually required is far below this theoretical maximum if we exclude the mechanical commutation laws.
The process of applying this refinement reveals possible improvements to earlier analyses, for example regarding the interpretation of anaphora using the state monad (Sect. 3.4). The use of a bracketing operation to delimit scope appears to be new, and is an essential device in the interpretation of quantification effects.
Our framework can either be given a purely axiomatic treatment (Sect. 3), or, like many accounts, be provided as part of a pure λ-calculus (Sect. 4). In future work, we intend to describe more effects within the same framework, including presupposition and conventional implicature.
Funding Open access funding provided by University of Gothenburg.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.