Donkey Sentences

In semantics, we typically take a sentence and convert it into a formal logical representation.¹ “All dogs bark” might be represented as \([\forall x : \text{DOG}(x)]\; \text{BARK}(x)\). Formal logical systems tend to use lexical scope: “\(\forall\)” introduces an \(x\), some things are said about \(x\), and then we throw the \(x\) away.

My background is in PL Theory, so this makes me feel pretty warm and fuzzy inside. But… this might be a case of trying to fit a round peg into a square hole, as can be demonstrated by a man and his donkey:

(1) Russell saw a donkey. He smiled at it.

We could be lazy, and just say “he” and “it” come from “context”, and leave it at that. But even then, what does this look like in a logical representation?

\[ \begin{align*} [\exists x : \text{DONKEY}(x) ]\; &\text{SEE}(r, x)\\ [\exists x : \text{DONKEY}(x) ]\; &\text{SMILE}(r, x)\\ \end{align*} \]

That obviously doesn’t work, because the donkey Russell saw was the same one he smiled at. Maybe we could “merge” these sentences together?

\[ [\exists x : \text{DONKEY}(x) ]\; \text{SEE}(r, x) \wedge \text{SMILE}(r, x) \]

But why did the scope of “\(\exists\)” extend to the next sentence? What if we put a sentence between the two? And we still haven’t dealt with “he” and “it”!

This week I want to dive into so-called Donkey Sentences, and the advantages of tossing lexical-scope out the window, using a technique known as Discourse Representation Theory, or DRT.

Anaphora

Did you just say “Donkey Sentences?”

Why yes I did. They’re a thing! Donkey sentences are perfectly grammatical sentences with clear meanings, but they have one small problem: they use anaphora.

Anaphora: Any expression whose interpretation depends on another expression elsewhere (called the antecedent if it occurred before, postcedent if it occurred after).
e.g. “The president was there, and that made everyone uncomfortable.” The meaning of “that” depends on the antecedent “the president was there”.

So in (1) the anaphora were “he” and “it”.

When I say “depends on”, what do I mean exactly? At first glance, you might think you can simply cut-and-paste the antecedent over the anaphora in a sentence.

(2) Russell woke up. He was surprised to see his donkey.
(3) Russell was surprised to see Russell’s donkey.

That would make our life easy, but the simple solution of (3) doesn’t always work. Trying the same method on (1):

(4) Russell saw a donkey. Russell smiled at a donkey.

This isn’t exactly the same sentence — it’s not clear that we’re talking about the same donkey.² It’s also worth mentioning that anaphora don’t just extend across sentences, but can happen within sentences, too!

(5) If Russell had a donkey, he would beat it.
(6) If Russell had a donkey, Russell would beat a donkey.

Yeah, that’s just no good.

An interesting note here is that cutting-and-pasting the word “Russell” seems to be just fine; it’s “a donkey” that’s causing problems. Let’s keep this in the back of our minds for now.

Not just Nouns!

It’s worth pointing out that there are all kinds of anaphora, not just pronouns. These (and others) all occur cross-linguistically.³

Verb Phrase	“I just got a donkey. Irma did too.” (Here “did” refers back to “just got a donkey” in the previous sentence.)
Propositional	“Russell says it was his donkey. I don’t believe that.” (What is it I don’t believe?)
Adjectival	“This was a stubborn donkey. I guess all donkeys are.” (We just dropped the last word of the sentence, “stubborn”!)
Modal	“Russell might buy a donkey. He would pay cash.” (The world in which Russell is paying cash is the same world where he is buying a donkey.)
Temporal	“Russell had a party last Friday, and his donkey got drunk.” (The time at which the donkey got drunk was during the party.)

For more information on anaphora, I’d look at The Stanford Encyclopedia of Philosophy.

Scope Woes

So why was it that “Russell” could be copied-and-pasted in sentences like (4) and (6), but “a donkey” couldn’t? Looking back at the logical representation, we have

\[ [\exists x : \text{DONKEY}(x) ]\; \text{SEE}(r, x) \]

We converted “a donkey” into \([\exists x : \text{DONKEY}(x) ]\), which is pretty reasonable. However, this commits us to a scope for \(x\) — outside this \(\exists\)-expression, \(x\) will be thrown out.

Russell, on the other hand, was simply assigned to \(r\). \(r\) has no quantifier, and therefore no scope. Russell is “global” — there is only one (relevant) Russell to speak of, and we can put him wherever we like in our sentences.

And that’s the main difference: Russell isn’t giving us any trouble, because \(r\) has no scope. “A donkey”, however, is confusing because it’s not clear how big the scope of \(x\) should be.
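For the programmers in the audience, here is a rough Python analogy of the same problem. It is only a sketch: the toy data (`donkeys`, `see`, `smile`) is invented for illustration, and a generator expression is standing in for the quantifier. The variable `x` bound inside the expression is gone once the expression ends, while the global `RUSSELL` can be used anywhere.

```python
# Toy data, invented purely for illustration.
RUSSELL = "russell"                      # like r: a global name, no quantifier
donkeys = {"eeyore", "benjamin"}
see = {("russell", "eeyore")}
smile = {("russell", "eeyore")}

# [exists x : DONKEY(x)] SEE(r, x)
saw_a_donkey = any((RUSSELL, x) in see for x in donkeys)

# x was bound only inside the generator expression above,
# so trying to mention it afterwards fails.
try:
    x
    x_survived = True
except NameError:
    x_survived = False

# Widening the scope by hand, as in the "merged" formula:
# [exists x : DONKEY(x)] SEE(r, x) and SMILE(r, x)
saw_and_smiled = any(
    (RUSSELL, x) in see and (RUSSELL, x) in smile for x in donkeys
)

print(saw_a_donkey, x_survived, saw_and_smiled)
```

The only way to talk about the same donkey twice is to stretch one expression over both claims, which is exactly the awkward “merge” from earlier.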

Dynamic Scoping

Here’s a hot take: language is stateful. When I say something about a donkey, you don’t just throw it away once the sentence is over. Maybe this donkey will make an appearance later on down the road — who’s to say? You incrementally keep a model of what is being said. This is so intuitive, it borders on obvious, yet trying to fit language into lexical scope is wilfully ignoring this reality!

This was the intuition of Kamp (1981), and it caused him to develop Discourse Representation Theory. At its core, DRT and systems like it involve keeping track of a dynamic scope.⁴

Let’s see what this actually looks like.

DRT

DRT revolves around the idea of a Discourse Representation Structure.

Discourse Representation Structure (DRS)

A structure made up of two lists:

A list of discourse referents (the things being talked about)
A list of the conditions (what is being said about them)⁵
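As a data structure, a DRS could be sketched roughly like this in Python. This is my own minimal encoding, not anything from the DRT literature: the class name `DRS`, its two methods, and the choice to store conditions as tuples are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DRS:
    """A Discourse Representation Structure: referents plus conditions."""
    referents: list = field(default_factory=list)
    # Conditions are stored as tuples such as ("SEE", "r", "x").
    conditions: list = field(default_factory=list)

    def add_referent(self, name):
        # A referent is only introduced once.
        if name not in self.referents:
            self.referents.append(name)

    def add_condition(self, predicate, *args):
        self.conditions.append((predicate, *args))


# Start empty, then update with "Russell saw a donkey."
drs = DRS()
drs.add_referent("r")
drs.add_referent("x")
drs.add_condition("RUSSELL", "r")
drs.add_condition("DONKEY", "x")
drs.add_condition("SEE", "r", "x")
print(drs)
```

The important design point is that the object outlives any one sentence: later sentences keep appending to the same two lists.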

As sentences are received, the DRS is updated. Let’s start with the first half of our original sentence.

(7) Russell saw a donkey.

We started with an empty DRS, which is updated after (7).

\[ \drs{r, x}{\text{RUSSELL}(r)\\ \text{DONKEY}(x) \\ \text{SEE}(r, x)} \]

With this state in our brains, we then receive the second sentence.

(8) He smiled at it.

Now we have two new names, “he” and “it” (both of which are anaphora), and one new condition.

\[ \drs{r, x}{\text{RUSSELL}(r)\\ \text{DONKEY}(x) \\ \text{SEE}(r, x)} \Rightarrow \drs{r, x, \hat{y}, \hat{z}} {\text{RUSSELL}(r)\\ \text{DONKEY}(x) \\ \text{SEE}(r, x) \\ \text{SMILE}(\hat{y}, \hat{z})} \]

The anaphora \(\hat{y}\) and \(\hat{z}\) at this point are incomplete — we need to resolve which antecedent goes with which. This is actually trivial in our example. \(\hat{y}\) can only map to \(r\), and \(\hat{z}\) to \(x\).⁶

\[ \drs{r, x, \hat{y}, \hat{z}} {\text{RUSSELL}(r)\\ \text{DONKEY}(x) \\ \text{SEE}(r, x) \\ \text{SMILE}(\hat{y}, \hat{z})} \Rightarrow \drs{r, x, \hat{y}, \hat{z}} {r = \hat{y} \\ x = \hat{z} \\ \text{RUSSELL}(r)\\ \text{DONKEY}(x) \\ \text{SEE}(r, x) \\ \text{SMILE}(\hat{y}, \hat{z})} \Rightarrow \drs{r, x} {\text{RUSSELL}(r)\\ \text{DONKEY}(x) \\ \text{SEE}(r, x) \\ \text{SMILE}(r, x)} \]
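The last two steps above (equate each anaphor with its antecedent, then substitute it away) can be sketched in a few lines of Python. This is a simplification under stated assumptions: the function name `resolve` is hypothetical, the strings `"y^"` and `"z^"` stand in for \(\hat{y}\) and \(\hat{z}\), and the bindings are simply handed over rather than discovered.

```python
def resolve(referents, conditions, bindings):
    """Substitute each anaphor with its antecedent and drop it from the referents.

    `bindings` maps anaphor -> antecedent, e.g. {"y^": "r"}. Choosing the
    bindings is the hard part and is simply given here.
    """
    new_referents = [ref for ref in referents if ref not in bindings]
    new_conditions = [
        (pred, *[bindings.get(arg, arg) for arg in args])
        for pred, *args in conditions
    ]
    return new_referents, new_conditions


# State after "Russell saw a donkey. He smiled at it." with unresolved anaphora.
referents = ["r", "x", "y^", "z^"]
conditions = [
    ("RUSSELL", "r"),
    ("DONKEY", "x"),
    ("SEE", "r", "x"),
    ("SMILE", "y^", "z^"),
]

referents, conditions = resolve(referents, conditions, {"y^": "r", "z^": "x"})
print(referents)
print(conditions)
```

After the call, the referent list is back to `r` and `x`, and the last condition reads `("SMILE", "r", "x")`, matching the final DRS above.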

Some Intuition on New Names

So much of this work revolves around rethinking what quantifiers mean in a dynamic context. Should we be adding discourse referents for every noun we see?

Probably not. Consider if I said:

(9) Russell loves a donkey. The donkey does not love him back.

When you hear “Russell”, your brain says “ok, we’re talking about a guy named Russell”. Then you hear “a donkey” — “ah! A donkey has entered the picture”. “The donkey” however, did not add a new character to our world. Because it is definite⁷, we know not to add a new referent. Any time we hear the indefinite “a farmer” or “another donkey”, however, we know to add a discourse referent.
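Here is one way that rule of thumb might look in Python. Treat it as a sketch: the function `process_noun_phrase`, the fresh-name generator, and the “most recent matching referent wins” strategy for definites are all my own assumptions, and real definite resolution is far subtler.

```python
import itertools

# Supply of fresh referent names: x1, x2, ...
_fresh = (f"x{i}" for i in itertools.count(1))


def process_noun_phrase(referents, conditions, determiner, noun):
    """Return the referent a noun phrase picks out, updating the lists in place.

    Indefinites ("a", "an", "another") introduce a fresh referent; the
    definite "the" looks for an existing referent already described by `noun`.
    """
    pred = noun.upper()
    if determiner in ("a", "an", "another"):
        ref = next(_fresh)
        referents.append(ref)
        conditions.append((pred, ref))
        return ref
    if determiner == "the":
        # Search most recent first; a crude stand-in for real resolution.
        for cond in reversed(conditions):
            if cond[0] == pred and len(cond) == 2:
                return cond[1]
        raise LookupError(f"no antecedent for 'the {noun}'")
    raise ValueError(f"unknown determiner: {determiner}")


referents, conditions = ["r"], [("RUSSELL", "r")]
a = process_noun_phrase(referents, conditions, "a", "donkey")    # "a donkey"
b = process_noun_phrase(referents, conditions, "the", "donkey")  # "the donkey"
print(referents, a == b)
```

“A donkey” grows the referent list by one, while “the donkey” leaves it alone and hands back the referent that was already there.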

Negation of DRSs

It turns out global scope isn’t always what you want.

(10) #There is no donkey. Russell smiles at it.

“It” here certainly can’t refer to “no donkey”! This negative quantifier has a very narrow scope: you can never refer back to the donkey that doesn’t exist. To deal with this, we can nest DRSs!

\[ \drs{r, \hat{y}}{ \text{RUSSELL}(r) \\ \text{SMILE}(r, \hat{y}) \\ \neg \; \drs{x}{\text{DONKEY}(x)}_{R_1} }_{R_0} \]

We can read this as “any extension to \(R_0\) which includes the content of \(R_1\) must be false”.
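The nesting is what blocks the bad reading, and a small Python sketch can show why. The encoding is again an assumption of mine: a negated sub-DRS is stored as a `("NOT", DRS)` condition, and the helper `accessible` (a hypothetical name) lists which referents can be seen from the innermost DRS of a chain.

```python
from dataclasses import dataclass, field


@dataclass
class DRS:
    referents: list = field(default_factory=list)
    # Conditions are tuples; ("NOT", DRS) nests a negated sub-DRS.
    conditions: list = field(default_factory=list)


def accessible(chain):
    """Referents visible from the innermost DRS of a chain [R0, R1, ...].

    A DRS can see its own referents and those of the DRSs enclosing it,
    but never those of a DRS nested inside it.
    """
    return [ref for drs in chain for ref in drs.referents]


# "There is no donkey. Russell smiles at it."
r1 = DRS(referents=["x"], conditions=[("DONKEY", "x")])
r0 = DRS(
    referents=["r", "y^"],
    conditions=[("RUSSELL", "r"), ("SMILE", "r", "y^"), ("NOT", r1)],
)

# y^ lives in R0, so its antecedent must be visible from R0.
print("x" in accessible([r0]))       # x is locked inside R1
print("r" in accessible([r0, r1]))   # but R1 can look outward to r
```

Since `x` never shows up among the referents visible from \(R_0\), there is nothing for “it” to bind to, which matches the intuition that the sentence is odd.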

Conditionals

So how do we model conditionals with this system?

(11) If Russell had a donkey, he would beat it.

Again, we’re going to use a method of nested DRSs.

\[ \drs{r}{ \text{RUSSELL}(r) \\ \drs{x}{ \text{DONKEY}(x) \\ \text{HAS}(r,x) }_{R_1} \Rightarrow \drs{\hat{y}, \hat{z}}{ \text{BEAT}(\hat{y}, \hat{z}) }_{R_2} }_{R_0} \]

A lot like negation, \(R_1 \Rightarrow R_2\) is giving us rules about how \(R_0\) might be expanded. Namely, any extension which includes the contents of \(R_1\) must also include the contents of \(R_2\).
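To make “any extension which includes \(R_1\) must also include \(R_2\)” concrete, here is a tiny model checker in Python. It is a sketch under several assumptions: the model (`entities`, `facts`) is invented, the function names `holds` and `conditional` are hypothetical, and the anaphora in \(R_2\) are taken to be already resolved (\(\hat{y} = r\), \(\hat{z} = x\)), so \(R_2\) adds no referents of its own.

```python
from itertools import product

# A tiny model, invented for illustration: entities and the facts true of them.
entities = {"russell", "eeyore", "benjamin"}
facts = {
    ("DONKEY", "eeyore"), ("DONKEY", "benjamin"),
    ("HAS", "russell", "eeyore"),
    ("BEAT", "russell", "eeyore"),
}


def holds(conditions, assignment):
    """True if every condition is a fact once referents are replaced by entities."""
    return all(
        (pred, *[assignment[a] for a in args]) in facts
        for pred, *args in conditions
    )


def conditional(outer, r1_refs, r1_conds, r2_conds):
    """R1 => R2: every way of extending `outer` that verifies R1 also verifies R2."""
    for values in product(entities, repeat=len(r1_refs)):
        extended = {**outer, **dict(zip(r1_refs, values))}
        if holds(r1_conds, extended) and not holds(r2_conds, extended):
            return False
    return True


# "If Russell had a donkey, he would beat it."
outer = {"r": "russell"}
result = conditional(
    outer,
    r1_refs=["x"],
    r1_conds=[("DONKEY", "x"), ("HAS", "r", "x")],
    r2_conds=[("BEAT", "r", "x")],
)
print(result)
```

Checking every extension gives the sentence a universal flavor: it comes out true only if Russell beats every donkey he has. In this toy model, adding the fact `("HAS", "russell", "benjamin")` without a matching `BEAT` fact would make the check fail.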

Dynamic is Useful

Obviously I’ve only barely introduced DRT⁸, but it may already be apparent how powerful dynamic semantics can be. It seems to address anaphora fairly well, but that’s really just the beginning.

DRT is also something that can be easily modeled by a computer. This has led to a whole swath of fruitful research in AI and Computational Linguistics. For the same reason, it also bridges formal semantics with psychological models of language.

What I find particularly interesting is that dynamic semantics also seems to bring the pragmatic-semantic interface more into focus. Pragmatics, which is loosely the subfield of linguistics dealing with how context contributes to meaning, has been a rather elusive topic. With a DRS, however, we can more clearly see when and how pragmatics jumps into the picture: for example, when searching for an antecedent \(x\) to bind to some anaphor \(\hat{y}\).

Parting Notes

If you’d like a deeper look at DRT, I’d strongly suggest Kamp, van Genabith, and Reyle (2010) and Geurts, Beaver, and Maier (2020), probably in that order. Both are very easy reads, and go much more in-depth on the subject.

I haven’t seen the actual phrase “dynamic scope” used in discussion about these theories, but the idea is there. I’m just using my computer science vocabulary.

I actually originally wanted to write exclusively about Partee (1984), and her use of DRT for the analysis of tense, but… I couldn’t actually find a copy of the paper online! The tragic realities of not being affiliated with a university… I hope to post a mini-post soon, introducing those ideas.

If I were to write a part two, I think I’d like to look at judgments, inferences, and maybe even a computational implementation (this feels like it would be fun to implement in Coq, for example). It’s also worth noting that there are a number of different flavors of DRT, each with their own pros and cons.

Stay tuned later this week for a spiritual sequel, though! I have a fun topic planned related to anaphora.

Thanks as always for reading! Every week I write about something I’m learning to test my own understanding, but also to work on my writing and explanatory skills. If you have questions, critiques, or thoughts, please leave a comment! They might teach me something, and they’ll let me know someone’s reading!

References

Geurts, Bart, David I. Beaver, and Emar Maier. 2020. “Discourse Representation Theory.” In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta, Spring 2020. Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/archives/spr2020/entries/discourse-representation-theory/.

Kamp, Hans. 1981. “A Theory of Truth and Semantic Representation.” Formal Semantics: The Essential Readings, 189–222.

Kamp, Hans, Josef van Genabith, and Uwe Reyle. 2010. “Discourse Representation Theory.” In Handbook of Philosophical Logic, Volume 15, 125–394. Springer Science+Business Media B.V., 2011. ISBN 978-94-007-0484-8. https://doi.org/10.1007/978-94-007-0485-5_3.

Partee, Barbara H. 1984. “Nominal and Temporal Anaphora.” Linguistics and Philosophy 7 (3): 243–86. http://www.jstor.org/stable/25001168.

¹ Specifically, this is the tradition of Montague grammar (which is pretty pervasive). Montague’s thesis is essentially that natural languages and formal languages (like programming languages) can ultimately be treated the same way.
² This is a bit clearer with a sentence added in the middle, e.g. “Russell saw a donkey. In fact, he saw hundreds of donkeys. Russell smiled at a donkey.”
³ This includes sign languages, which I find to be objectively cool. Look up some recent work by Philippe Schlenker for more information.
⁴ If you want a programming example to draw intuition from, I would point to bash. In bash, variables are global by default, and they only come into existence when you first assign something to them.
⁵ This is essentially the same as converting to logical notation, but now we keep a dynamic list of names.
⁶ This is because “he” requires a male antecedent.
⁷ Definite here means “there is only one”. By saying “the \(x\)”, you are stating that there is only one \(x\). Pronouns are another form of definite reference.
⁸ Question for my readers: have my posts been too “introductory”? Would a deep-dive into topics be more interesting than a brief overview? Because I haven’t been writing much about novel work, I’ve been under the assumption that those who wish to learn more will read the source research. There are a number of extensions to DRT, critiques, and we haven’t touched at all on judgments within the system, or computational implementations.