Journal of Psycholinguistic Research

Referential and Non-referential Uses of the Third Person Pronominal Subject in Spanish

  • Marcos García Salido


This paper studies the role of two different types of motivation that have been proposed to explain the use of subject personal pronouns in Spanish, namely their function as indications for the addressee to identify the subject’s referent, and their suitability for expressing informational values such as contrastiveness or focus. This study focuses exclusively on third-person forms and relies on conversational data. The distribution of third-person pronouns is analysed combining qualitative and quantitative approaches. It will be argued that the informational and referential properties of subject personal pronouns are by themselves insufficient to account for their expression, their occurrence depending crucially on their activation through the previous use of units of the same type.


Keywords: Pronominal subjects in Spanish; Referential continuity; Focus; Contrast; Priming


As is well known, the use of subject personal pronouns is not obligatory in Spanish, since verb morphology usually conveys enough information to establish subject reference—a pattern that makes Spanish a pro-drop language in generative terms.1 Accounting for the alternation between the presence of subject personal pronouns (henceforth, SPPs) and their absence has been a classical problem in the field of Spanish linguistics. Two broad types of motivation for the expression of SPPs can be distinguished. Explanations based on the first type rely on the idea that personal pronouns perform a referential function by providing the addressee with information that verb grammatical morphemes lack. This is evident in the case of third person pronouns, which, in contrast to person inflections, are marked for gender (él/ella cantó ‘he/she sang’ vs. cant-ó ‘sing-pst3sg’). First person pronouns can also be viewed as a disambiguating resource when they accompany verb forms that are compatible with both first- and third-person readings (yo/él cantaba, ‘I/s/he was singing’). In addition to the disambiguating power of pronouns, a considerable amount of research carried out from a variationist approach has revealed that pronominal expression is conditioned by referential (dis)continuity with respect to the previous context.

A second type of motivation often mentioned for SPP expression is what can be referred to as non-referential uses, which have to do with conveying informational values (or closely related ones, such as contrastiveness). The rationale behind this type of explanation is the following: in those contexts where verb morphology is rich enough to establish the reference of its subject, and the absence of an overt subject is allowed by the grammar, the presence of a pronoun must entail some additional content, such as emphasis, contrastiveness or the like.

The aim of this paper is to explore the explanatory validity of the two above-mentioned types of motivation for the expression of third-person subject pronouns. The study will rely on conversational data from the Val.Es.Co 2.0 corpus (Cabedo and Pons 2013) for two reasons. First, the factors leading to pronominal expression are hard to establish on the basis of purely introspective reasoning (see in this respect Posio 2011, p. 778 who, using real data, gives the lie to certain assumptions on the distribution of pronouns made by Fernández Soriano 1999). Second, conversation is a discourse genre practised by virtually all speakers that is characterised by a minimum degree of prior planning.

In recent years, a vast amount of literature on pronominal subject expression based on real data has been published, mostly focusing on first person forms. The present study aims to contribute to our current understanding of pronominal subject expression in Spanish by investigating third-person SPPs (3SPPs, henceforth) in this language.

The paper is organised as follows: after this introduction, the next section presents a qualitative analysis of the use of 3SPPs in order to test the explanatory power of informational and referential accounts of their expression. The following section presents a series of factors (namely, shift of reference, verb ambiguity, verb semantics and priming) that have been shown to influence the expression of SPPs as an alternative to their omission. The section after that reports the probability of 3SPP expression under the aforementioned factors and considers their relation to the two types of account under discussion (referential and non-referential motivations). The paper ends with some concluding remarks.

Informational and Referential Motivations for the Use of SPPs: A Qualitative Analysis

As pointed out in the introduction, two different explanations have been given for the expression of SPPs. The first account relies on the referential properties of pronouns, emphasising their disambiguating power. However, some approaches have recently drawn attention to the referential implications derived from the formal differences between pronouns and more attenuated forms—an issue that will be discussed at the end of this section. Before that, I will discuss some informational properties of SPPs such as their contrastive or alleged focal character. Such factors have been claimed to be of critical importance in SPP expression, to the extent that some authors (see Luján 1999, p. 1312) completely identify expressed SPPs—in those contexts where their omission is possible—with focalised and contrastive elements.

Focalisation and SPP Expression

Although Luján’s view may be somewhat extreme, other scholars have also attributed a prominent role in the expression of SPPs to focus and contrastiveness (see Travis and Torres-Cacoullos 2012, p. 714, and references therein). However, it has not been easy to find direct evidence of the focal character of subject pronouns in the corpus sample used for the purposes of this study (the 243 3SPPs found in the Val.Es.Co 2.0 corpus). In fact, features contradicting the focal character of some instances of SPPs are much easier to spot.

In spite of some discrepancies, there seems to be a certain consensus in considering information focus as a means to signal the part of the information unit2 that the speaker wants the addressee to integrate into his set of assumptions. The chunk of information marked by the focus is therefore not presupposed by the addressee (i.e., not shared, new) at the time of being uttered. In other words, the focus signals the asserted part of a proposition.3 Some authors identify information focus with the entire portion of new information, while others claim that information focus signals the last element conveying new information. The present discussion will assume the second approach. Thus, in Spanish, an utterance such as El gato se comió un ratón (‘The cat ate a mouse’), where ratón receives the last stress of the clause, is a valid answer to both ¿Qué pasó?, ‘What happened?’ and ¿Qué se comió el gato?, ‘What did the cat eat?’, with no alteration in the clause’s intonation. In both cases the information focus falls on ratón, but in the former there is no presupposition at all, whereas in the latter, after speaker A utters ¿Qué se comió el gato?, speaker B may safely presuppose that A knows that ‘the cat ate X’. By answering El gato se comió un ratón, B is adding a piece of information missing from A’s knowledge: ‘the X eaten by the cat = the mouse’.4

In Spanish, a neutral focus, i.e. a focus allowing both for a broad (what happened?) and a narrow (what did the cat eat?) interpretation, falls on the last stressed element of a declarative clause (Zubizarreta 1999, p. 4228; Martín Butragueño 2005, p. 126), as in our example above. Speakers can alter this distribution, but at the expense of modifying the intonation pattern of the clause in question. In such a case, instead of the gradual pitch fall usual in Spanish, an abrupt fall after the focalised element would take place.

Taking this into account, it seems that in order to focalise an SPP, speakers of Spanish have at least two basic options: (i) placing the SPP as the last stressed element of the clause, as in (1), or (ii) altering the intonation pattern normally associated with declaratives when the focus is not in final position, as in (2):

The latter strategy, however, is difficult to trace in the examined corpus. Although the transcription reflects some intonation phenomena, whenever an intonation indication has been found after a preverbal 3SPP, it normally represents a pause. Because it is commonly accepted that fronted focalised subjects in Spanish (at least in non-Caribbean varieties) form an intonation unit with their verb and no syntactic constituent can be placed between them (Real Academia Española et al. 2009: Section 40.4k; Zubizarreta 1999, p. 4241), such pauses, sometimes combined with interpolations, are suggestive of topicalisation or left-dislocation instances, rather than of fronting of focalised SPPs:5
A more reliable sign of focalisation is the post-verbal placement of SPPs. Example (5) illustrates this strategy:

In (5), given the previous context, when Edu utters pero navega él, both interlocutors can assume that Lucas and the referent of él spend time on a boat together. The assertion conveyed by the sentence with post-verbal él is the identity of the one who sails (presupposition=‘they are on a boat, someone sails’). Lucas, in turn, specifies the precise nature of the task undertaken by the referent of él: here the presupposition is ‘he sails’, which is further specified by means of the assertion that ‘what he does is steer the boat’.

In spite of the association of SPP postposition and focalisation, one must be cautious in attributing a focal character to every instance of post-verbal SPPs, since it is also possible to find cases of right dislocation with the same distribution, as well as other clause types whose information structure is dubious, as shown by the following examples.

The underlined pronoun in (6) is likely a case of right dislocation. It is preceded by another sentence with a right-dislocated constituent (el tío ese, ‘that guy’) and a very similar argumentative value. Both are explanations for the status acquired by a certain driver. The underlying question of both could be Why has he gotten there?, rather than one asking for the pronoun under discussion (#Who has a lot of money?).

The pronoun in (7) is part of a relative clause. The function of such clauses seems to be to provide the addressees with information they already assume, so that they can identify a given referent by means of such information: that is, once a certain propositional content has been encoded by means of a relative clause, it no longer informs about an event or state, but serves as a referential indication. Therefore, asking about the given-new opposition, or the focal status of a pronoun within a relative clause, is perhaps pointless. In (7), not only the pronoun, but the entire relative clause is the focus of the negative operator no and is negated and replaced by an alternative option. The presupposition is something like ‘the answer is x’ and the utterance of the example asserts that ‘x = A, not B’, where A is encoded by the focalised esa and B is encoded by the entire focalised relative clause; this structure is labelled “replacing focus” in Dik (1997, p. 131ff.).

SPPs in interrogative clauses pose similar problems. On the one hand, it is widely acknowledged that interrogative constituents (i.e., interrogative pronouns and adverbs, or phrases with an interrogative determiner) in interrogative clauses can be viewed as a parallel to the foci of declarative clauses. On the other, it is generally assumed that, unless the clause contains more than one intonation unit, there is only one focus per clause (see, for instance, Halliday 1967, pp. 201–208; Martín Butragueño 2005, p. 120). Considering these two assumptions, it is difficult to admit that the focus of examples such as (8) falls on the subject pronoun.

Taking into account what has been said so far, only 33 cases out of the 243 3SPPs found in the corpus can be considered focalised with a sufficient degree of certainty. In other words, focalisation can only account for 13.6% of the 3SPPs in the sample.

Focus and Contrast

Contrastiveness and focalisation are often considered related phenomena, but two questions arise regarding the issue under discussion: (a) do all instances of focalised SPPs receive a contrastive interpretation? And (b) can non-focalised SPPs be interpreted contrastively?

With respect to question (a), compare examples (9) and (10):

At first sight, both pronouns seem good candidates to bear contrastive readings, since they occur in contexts where other subjects are said to perform related actions (‘the others score against you, but he doesn’t’; ‘we arrived, and then she arrived’). However, the informational interpretations of (9) and (10) are slightly different: whereas (9) allows for an adversative continuation such as sino los otros a ti (‘but the others against you’), (10) does not (llegó ella #pero no nosotros, ‘she arrived, but we didn’t’). One might argue that this continuation in (10) is incoherent because a positive relation is established between the two events described (termed by some “corroborative contrast”, i.e. ‘we arrived and so did she’), but there is more to it than that. The set of presuppositions for the two examples is different. In (9) the speaker is talking about the referent of él playing soccer; one of the presuppositions involved is that ‘X scores’, and another is that there are at least two candidates that could be equated to X (él and his opponents). These are two necessary components that Chafe (1976) attributes to contrastive utterances: some background knowledge, and a limited set of candidates compatible with it. The contrastive focus asserts the right candidate.

Example (10) shows a quite different scenario. A sequence of events/states is narrated (‘we arrived → she wasn’t there → she arrived’). When llegó ella is uttered, there is no presupposition implying someone’s arrival such as ‘X arrived’: the addressee knows that the referent of the subject of llegamos arrived, but this assumption is neither contradicted nor confirmed by the clause under discussion. Consequently, if ‘X arrived’ is not presupposed, neither is there a contrastive closed set of candidates for X. If we had to formulate an underlying question for llegó ella, it would be And then what happened?, rather than Who arrived? In summary, llegó ella acts here as a sort of presentative construction not decomposable into a given-new structure, since all of it represents non-presupposed information.

Other seemingly non-contrastive examples of focalised SPPs are those where some emphasis is placed on the individual responsibility of the subject’s referent for the event described by the verb. In such cases, the pronoun may be modified by solo ‘alone’ or mismo ‘self’.6 It could be argued that in such cases the pronoun has an exhaustive interpretation, according to which its referent is contrasted with all the possible alternatives, but again it is hard to view the rest of the clauses where such a pronoun occurs as conveying presupposed information.

The answer to question (a) must be, then, that not every focalised pronoun receives a contrastive reading.

As far as question (b) is concerned, examining examples of what has sometimes been called “double contrast” (see Travis and Torres-Cacoullos 2012, p. 715ff.) could be clarifying.
In (13) two opposite events are confronted. A contrastive relation can be said to exist between the subjects (la chica vs. él) and between the predicates (se va vs. se queda). However, this type of contrast differs from the one exemplified in (9) in several respects. First, the second clause neither confirms nor contradicts any presupposition generated by the first one, either in the case of the subject or that of the predicate: there is no limited set of subjects nor a limited set of predicates in competition, so that any different continuation is possible—la chica se va y llega una amiga (‘a friend arrives’)/ y él también se va (‘and he goes too’), etc.7 Second, all the information in the latter clause seems, then, to be new, i.e. non-presupposed, and its focal structure is plausibly a neutral one. That becomes apparent in the following example, informationally similar to the previous one, but with transitive verbs:

In the clause pero él lo lleva así it is perfectly possible to insert an explicit object between the allegedly contrastive pronoun and its verb (pero él el pelo lo lleva así). That would not be possible with preverbal focalised elements, in peninsular Spanish at least (cf. eso creo yo/*eso yo creo, ‘that is what I think’ ≠ eso yo lo creo, [pero no lo otro], ‘that I believe, [but not the other thing]’; ¿quién lleva el pelo así?/*¿quién el pelo lleva así?, ‘who wears his hair like that?’).8 In fact, this type of contrastive subject is considered a case of “contrastive theme” in Real Academia Española et al. (2009: Sect. 33.5c–e).

Interestingly, looking for cases of double contrast, Travis and Torres Cacoullos (2012, p. 718) found both expressed and omitted subjects.9 This might be due to the fact that the subject reference is derivable from another syntactic constituent (see Matos-Amaral and Schwenter 2005), but this is also what can be expected if we admit that the informational focus does not fall on the subject, but on the final stressed element, as in other unmarked cases. The following examples, where one or both of the subjects allegedly in contrast are omitted, seem to point in that direction.

In short, the answer to question (b) above must be that some SPPs, often interpreted as contrastive, are in fact non-focal elements.

Referential Motivations

3SPPs convey gender information that is absent when the subject is encoded by means of a verbal suffix alone. Likewise, the use of a 3SPP resolves the ambiguity brought about by homonymous verb forms compatible with first- and third-person readings. It is only logical to suppose that, when competition arises between two third-person antecedents with different genders, or there is a conflicting interpretation involving a third-person antecedent and a possible reference to the speaker, 3SPPs are a means of supplying additional information in order to establish the reference intended by the speaker.

Some scholars have gone further and claimed that the encoding of referents may provide indications about the identity of a given referent, even if it conveys the same contents as other alternatives available. In other words, if a speaker can choose between two encodings differing only in their formal complexity, choosing the more complex one implies referring to an entity less accessible to the addressee than the entity that would have been referred to with the simpler encoding (see, for instance, Ariel 1990 or Givón 2002, p. 232). Example (14) above may illustrate this point, and is therefore reproduced as (17) with additional context and omitting the glosses of the fragment already given.

P is discussing the haircut of the person indexed as i in the example. At some point he introduces a comparison with his cousin (indexed as j in the example). The underlined pronoun él introduces a shift back to i, which is the most distant available antecedent, and therefore less accessible than j.10 The omission of the pronoun would have probably implied a co-referential reading of the two last instances of the verb lleva, since a coherent interpretation of the adversative clause is not ruled out under this reading: it would have been interpreted as a specification (‘black, but with some bald spots’) rather than as a contradiction. The use of one of the two alternatives—SPP plus verbal morphology versus verbal morphology alone—affects the choice of the relevant referent. Since the two antecedents at issue are masculine, the gender mark of the pronoun is irrelevant. The more prolix encoding is interpreted as a referential indication, providing evidence in support of the relation between coding material and referential accessibility.

The above notwithstanding, it seems that reference determination in Spanish does not rely on the encoding of referential expressions alone. Inferences aiming at maximally coherent interpretations also play such an important role in reference determination that the effects of the alternation illustrated in (17) can be cancelled.

Contó in (18) represents a shift back to a referent more distant than that of the previous subject (i, Lucía), but in contrast with él lo lleva in (17), there is no pronominal subject here. Otherwise, the context is equivalent since we have two competing referents compatible with the third person and with the same gender. A co-referential reading of vino Reyes and contó is however excluded as incoherent, as it is Lucía who informs J about her conversation with her mother.

The comparison between (17) and (18) also shows that there are no categorical associations between the expression and omission of an SPP and a given function, even in identical contexts. A probabilistic approach should thus be adopted in order to find different tendencies in the distribution of 3SPPs and their alternatives. The rest of this paper is devoted to this task.

Variation Between the Expression and Omission of 3SPPs

This section presents a corpus study aimed at analysing some of the factors contributing to the expression or omission of 3SPPs. The analysis is based on a sample extracted from the Val.Es.Co corpus, compiled by extracting all the third-person finite verbs in the corpus. Out of those, a set of ca. 1000 instances of verbs either with a 3SPP or lacking an overt subject was randomly selected: in order to do so, ca. 5000 examples were examined. After discarding the cases of focalised 3SPPs—according to the criteria established in “Informational and Referential Motivations for the Use of SPPs: A Qualitative Analysis” section—and certain other examples lacking a sufficiently explicit context for them to be classified, the final sample was narrowed down to 953 instances consisting of 89 3SPPs and 864 omitted subjects.
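As a quick consistency check on the figures just given, the following Python sketch recomputes the size of the final sample and the raw proportion of expressed 3SPPs. The variable names are illustrative assumptions; only the two counts come from the text, and the resulting proportion is a plain relative frequency, not a model estimate.

```python
# Counts reported in the text for the final sample.
expressed_3spp = 89     # clauses with an overt third-person subject pronoun
omitted_subject = 864   # clauses with no overt subject

total = expressed_3spp + omitted_subject
raw_rate = expressed_3spp / total  # relative frequency of expressed 3SPPs

print(f"total instances: {total}")       # 953
print(f"raw 3SPP rate: {raw_rate:.1%}")  # 9.3%
```

The two counts add up to the 953 instances mentioned above, and expressed pronouns account for roughly one clause in eleven.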

In what follows, the groups of factors chosen as explanatory variables are presented and their choice is justified, after which the results of a multivariate analysis performed with the aid of the Goldvarb software (Sankoff et al. 2012; see also Tagliamonte 2006, chs. 8–11) are presented and discussed.

Reference Shift

One factor affecting the expression of pronominal subjects and widely acknowledged in most quantitative or variationist approaches is the shift of reference with respect to the previous subject (Silva-Corvalán 1982; Bentivoglio 1987; Blanco Canales 1999; Cameron and Flores-Ferrán 2004; Travis 2005; Samper Padilla et al. 2006, etc.): pronominal subjects are more likely to occur if their referent is different from that of the subject of the previous clause. The preference for such contexts can be interpreted as related to their referential function. Nevertheless, since most of the variationist literature focuses on the expression of first person pronouns, which by definition and due to their deictic character lack antecedents, their distributional preference suggests that their role is to ease the cognitive task of shifting from one referent to another, irrespective of their accessibility. By contrast, third person pronouns are mostly used anaphorically. A preference for similar contexts could thus be viewed as a means of encoding antecedents with a lower degree of accessibility due to the greater textual distance between them and the anaphor.

In order to determine the effect of reference shift in the expression of 3SPPs, I will adopt a criterion similar to that used by Travis and Torres-Cacoullos (2012), who take into account the presence of a human referent between the subject anaphor at issue and its antecedent, instead of simply considering the change of reference with respect to the subject of the previous sentence. Thus, I will distinguish between third person subjects with and without human subjects intervening between them and their antecedents.11 Third-person plural subjects with generic reference will not count as intervening human subjects. Certain sequences lacking propositional content and in all probability expressing procedural indications (mostly of an epistemic nature) such as (yo) creo (que), ‘I think’, supongo (que) ‘I suppose’,12 yo qué sé ‘what do I know?’, etc. will not be taken into account either, since, due to their lack of propositional content, their alleged subjects are not in competition with the subjects of clauses conveying information about events or states.
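The coding criterion described in this paragraph can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Clause` record, its field names and the function name are hypothetical, and the sketch presupposes that each clause between the antecedent and the anaphor has already been annotated by hand.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Clause:
    """Hypothetical annotation of a clause lying between antecedent and anaphor."""
    subject_is_human: bool                # the clause's subject refers to a human
    generic_third_plural: bool = False    # third-person plural with generic reference
    procedural_formula: bool = False      # e.g. (yo) creo (que), supongo (que), yo qué sé


def has_intervening_human_subject(intervening: List[Clause]) -> bool:
    """True if at least one intervening clause has a competing human subject.

    Generic third-person plurals and procedural formulae are ignored,
    following the exclusions stated in the text.
    """
    return any(
        c.subject_is_human
        and not c.generic_third_plural
        and not c.procedural_formula
        for c in intervening
    )


# A clause such as 'yo creo que' between antecedent and anaphor does not count.
print(has_intervening_human_subject([Clause(True, procedural_formula=True)]))  # False
print(has_intervening_human_subject([Clause(True)]))                           # True
```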

Ambiguity of the Verbal Form

This is a factor whose influence on the expression of subject personal pronouns is clearly related to the latter’s referential function. In the Spanish verbal paradigm, certain forms such as cantaba ‘I/s/he/it sang’, cantaría ‘I/s/he/it would sing’, había cantado ‘I/s/he/it had sung’ etc. can have both first and third person singular subjects. The presence of a pronominal subject resolves the ambiguity by making it explicit whether an ambiguous verb form has a first or a third person reading. Therefore, a higher probability of expression of pronominal subjects is expected with these forms. That assumption, however, has not always been confirmed by quantitative studies; whereas in Rosengren (1974, p. 41), Bentivoglio (1987), Blanco Canales (1999) and Samper Padilla et al. (2006) verbal ambiguity shows up as a significant factor favouring the expression of pronominal subjects, Barrenechea and Alonso (1973) and Enríquez (1984) deny its influence.

For the purposes of my analysis, I have classified as ambiguous all those forms compatible with first and third person readings, as long as a reflexive clitic does not resolve the ambiguity without the necessity of a pronominal subject (e.g. se iba ‘s/he left’ vs. me iba ‘I left’). Such ambiguity does not affect plural forms.
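The classification rule just stated can be illustrated with a short Python sketch. The tense labels, the set `SYNCRETIC_TENSES` and the function name are assumptions made for illustration; the set lists only the paradigms exemplified in the text (cantaba, cantaría, había cantado), so other paradigms with identical first- and third-person singular forms would have to be added in a real coding scheme.

```python
from typing import Optional

# Assumed labels for paradigms whose 1sg and 3sg forms coincide; only the
# paradigms exemplified in the text are listed here.
SYNCRETIC_TENSES = {"imperfect", "conditional", "pluperfect"}


def is_ambiguous(tense: str, number: str, reflexive_clitic: Optional[str] = None) -> bool:
    """Classify a verb form as ambiguous between first and third person.

    A form counts as ambiguous only if it is singular, belongs to a
    syncretic paradigm, and carries no reflexive clitic (se, me) that
    already identifies the person.
    """
    if number != "singular":
        return False  # the ambiguity does not affect plural forms
    if tense not in SYNCRETIC_TENSES:
        return False
    return reflexive_clitic is None


print(is_ambiguous("imperfect", "singular"))        # True  (iba)
print(is_ambiguous("imperfect", "singular", "se"))  # False (se iba)
print(is_ambiguous("imperfect", "plural"))          # False (iban)
```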

Verb Semantics

Research on pronominal expression in Spanish has for many years detected correlations between certain semantic classes of verbs and a higher or lower frequency of pronominal subjects (Enríquez 1984; Fernández Ramírez 1987[1951]; Rosengren 1974), a finding which has been more recently confirmed (Bentivoglio 1987; Posio 2011; Travis 2005; Travis and Torres-Cacoullos 2012). Accounting for such correlations is not easy, especially considering that (i) depending on the grammatical person, the same class of verbs can have different types of effect (cf. Enríquez 1984, pp. 240, 244; Rosengren 1974, p. 224ff); (ii) depending on the sample analysed, a given class of verbs may be associated with different probabilities of SPP expression13; and (iii) the classifications of verb semantics vary from one researcher to another.

In the case of the first person singular pronoun, recent studies (Posio 2012, p. 169ff; Travis and Torres-Cacoullos 2012, p. 738ff.) have noticed the impact of frequently repeated sequences such as yo creo (que) ‘I think (that)’, which act as discourse markers in many respects, on the observed correlation between verbs of thought or opinion and a high frequency of subject expression. A global explanation of the different distribution of SPPs is attempted by Posio (2011, p. 796), for whom SPPs are a means of directing the addressee’s attention toward certain participants in the event depicted by the clause. According to Posio’s hypothesis, in transitive clauses the main focus of attention will be the action or the participants encoded by the object, whereas in intransitive clauses or clauses with lower transitivity, the subject is the constituent that will attract the attention of the addressee. Posio conceives of transitivity as a partly gradual property made up of several features: number of arguments (transitive verbs need at least two), volition and agency (verbs selecting an Agent, such as hacer ‘to make’, or a Volitioner, such as querer ‘to want’, as their subjects are more transitive than others selecting a Cognizer, such as pensar ‘to think’ or creer ‘to believe’), affectedness of the patient, etc. His hypothesis is supported by the data, at least partially, in that two-argument verbs with volitional, agent-like subjects and affected objects (hacer, poner) are those with the lowest rate of SPPs, while those selecting non-volitional subjects (cognizers) and less affected objects (clausal ones that the subject just perceives) show higher rates of SPPs.

A feature discussed by Posio in connection with his hypothesis is the stative character of verbs associated with high rates of pronominal expression (Posio 2011, p. 784), a feature that can also be considered a symptom of low transitivity. Since this feature allows for a straightforward binary classification (stative vs. non-stative verbs), it will be the one used in this analysis, especially considering that more elaborate classifications, such as those used by Bentivoglio (1987), with five classes of verbs, or by Posio himself, with eight, would probably yield classes with very few instances, which could pose problems for a multivariate analysis such as the one presented in this paper.

For the present study, I have classified as stative the copulas (ser, estar) and verbs denoting durative situations involving no change in their internal development, such as tener ‘to have’, pensar ‘to think’, saber ‘to know’, querer ‘to love’, etc.

Priming


Ever since the first attempts to extend the variationist approach beyond the field of phonetics, scholars have been reluctant to accept cases of variation involving lexical or syntactic units as not having any effect on the content of the units involved (see Lavandera 1978 or Romaine 1981, p. 7).14 All the phenomena reviewed so far are in line with the assumption that variation in subject expression implies some content change, since the presence or absence of the pronoun can be regarded as the result of the expression of different referential indications or information structure configurations. One factor having a significant influence on SPP expression, however, could challenge this view. Perseverance (Dell et al. 1997) or structural priming (Bock and Griffin 2000) is a phenomenon whereby the use of a certain structure activates it in the mind of the speaker and triggers subsequent uses of the same structure. In the case of SPPs, their use in a given clause would be primed or motivated by the use of the same structure in previous sentences. In cases of priming, then, the use of an SPP is not motivated by the fact that its features serve some communicative purpose on the speaker’s part, but is simply the speaker’s subconscious response to certain contextual conditions. The priming phenomenon in SPP expression has been observed, under slightly different conditions, by Cameron and Flores-Ferrán (2004), Travis (2005), or Travis and Torres-Cacoullos (2012), among others.

In order to analyse the impact of priming on the expression of SPPs in the corpus sample used in this study, the examples have been classified as follows: (i) those preceded by a sentence with an SPP, irrespective of its person and reference; (ii) those preceded by a sentence with no syntactic subject; and (iii) those preceded by a sentence whose subject is a syntactic element other than a personal pronoun (e.g., a demonstrative pronoun, a noun phrase, etc.).
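The three-way classification above can be expressed as a small Python sketch. The enum, the string label "personal_pronoun" and the function name are hypothetical; the sketch assumes the subject of the preceding sentence has already been annotated with a type label, or with None when there is no syntactic subject.

```python
from enum import Enum
from typing import Optional


class PrecedingSubject(Enum):
    """The three priming contexts distinguished in the text."""
    PERSONAL_PRONOUN = "personal pronoun"   # (i) any SPP, whatever its person or reference
    OMITTED = "omitted subject"             # (ii) no syntactic subject
    OTHER_OVERT = "other overt subject"     # (iii) demonstrative, noun phrase, etc.


def code_preceding_subject(subject_type: Optional[str]) -> PrecedingSubject:
    """Map an annotated subject type of the preceding sentence to a priming context."""
    if subject_type is None:
        return PrecedingSubject.OMITTED
    if subject_type == "personal_pronoun":
        return PrecedingSubject.PERSONAL_PRONOUN
    return PrecedingSubject.OTHER_OVERT


print(code_preceding_subject("personal_pronoun").value)  # personal pronoun
print(code_preceding_subject(None).value)                # omitted subject
print(code_preceding_subject("noun_phrase").value)       # other overt subject
```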

Results and Discussion

Table 1 shows the results of the analysis performed by means of the Goldvarb application. Out of the four groups of factors considered, namely (1) intervening human subjects between the subject and its antecedent, (2) verbal ambiguity, (3) verb semantics and (4) priming, only verbal ambiguity has been discarded as non-significant. These groups are presented in the first column, ordered according to their weight. The second column displays the probability of SPP expression associated with each factor. The third column presents the percentage of SPPs in each context. Finally, the fourth column presents the total number of instances for each context (i.e., it includes cases of pronominal and omitted subjects).
Table 1 Multivariate analysis of the factors contributing to the expression of third-person SPPs in conversational Spanish; factor groups not selected as significant are shown in square brackets

Summary statistics: Corrected mean; Log likelihood (−281.175); Total N
Columns: Factor; Factor weight; % SPP; N
Factor groups, in order of weight:
  Subject of the preceding clause: Personal pronoun; Other overt subjects; Omitted subject
  Verb semantics
  Intervening human subjects
  [Ambiguity of the verbal form]
The low frequency of 3SPPs in the sample translates into a low overall probability (.08). If we compare this result with those obtained in other variationist studies, mostly focused on the expression of first person subjects (cf. the references given above), we can conclude that third person SPPs are a rare resource, showing lower probabilities of occurrence than their first person counterparts. This might have to do with the fact that personal pronouns, relative pronouns and zeroes are the only alternatives available for first person singular subjects, whereas third person expression can draw on a wider set of alternatives (demonstrative pronouns, noun phrases, proper names, etc.).
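For readers less familiar with variable rule analysis, the following sketch shows how an overall probability and a set of factor weights are usually combined. It assumes the standard logistic model used in Varbrul-type programs, in which the log-odds of the overall probability and of each applicable factor weight are summed. Only the overall probability (.08) comes from the text; the factor weights below are hypothetical and are not the values of Table 1.

```python
import math
from typing import Iterable


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Inverse of logit: maps log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def combined_probability(overall: float, factor_weights: Iterable[float]) -> float:
    """Combine the overall probability with one factor weight per factor
    group, assuming the logistic (log-odds additive) model."""
    total = logit(overall) + sum(logit(w) for w in factor_weights)
    return inv_logit(total)


overall = 0.08  # overall probability of 3SPP expression reported in the text

# Weights of .5 are neutral: the prediction stays at the overall probability.
neutral = combined_probability(overall, [0.5, 0.5, 0.5])

# Hypothetical favouring weights (above .5), NOT the values of Table 1.
favouring = combined_probability(overall, [0.7, 0.6, 0.6])
```

Under this model a weight above .5 raises the predicted probability of an SPP and a weight below .5 lowers it, which is the sense in which factors are said to favour or disfavour 3SPPs in the discussion that follows.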

Another striking finding that emerges when these results are compared with those of other studies is that verbal ambiguity is statistically non-significant and, furthermore, the percentage of 3SPPs is lower with ambiguous verbal forms than with non-ambiguous ones. This observation contrasts with the findings of several studies on the first person cited in previous sections. Taking into account that ambiguous verbal forms favour the use of first-person SPPs (see references cited above) but do not have a clear effect on third-person ones, one could hypothesise that a third person reading is the default interpretation for ambiguous verbal forms, and that first person subjects are a means of cancelling such a reading. Additionally, one could ask what the picture would look like if the whole range of possible third person explicit subjects had been considered. A partial answer to this question can be found in García Salido (2013, p. 251), which compares the frequencies of all the subjects contained in the BDS15 for the three persons, including explicit pronominal and non-pronominal subjects, and concludes that ambiguous verbal forms take an explicit subject (pronominal or not) more frequently than non-ambiguous forms.

As for the groups of factors selected as significant, priming is the one with the greatest weight. The presence of a pronominal subject in the previous sentence favours the expression of the subject by means of a unit of the same type, whereas 3SPPs are disfavoured if the subject of the preceding sentence has been omitted, giving rise to clause chains such as those in (19) and (20).

Similarly, a non-pronominal overt subject also seems to slightly favour the use of a pronominal subject in the following sentence. As mentioned above, the influence of priming means that the use of 3SPPs is determined to a considerable extent not by their intrinsic features and referential and informational possibilities, but by the way in which certain structures present in the context condition the speaker’s performance.

The second group of factors in order of importance is verb semantics. Stative verbs favour the use of 3SPPs, whereas non-stative verbs disfavour them. If we take stativeness as an indication of low transitivity, these findings provide further support for Posio’s hypothesis, albeit very indirectly. Further research on the relation between the degree of transitivity and the expression of 3SPPs seems to be in order.

The last group of factors selected as significant is the presence of an intervening human subject between the subject at issue and its antecedent. As expected, the probability that a 3SPP occurs after a non-coreferential intervening human subject is greater than in the opposite case (see Travis and Torres-Cacoullos 2012 for similar results). Since the influence of ambiguous verb forms has been discarded, this is the only significant group of factors in the multivariate analysis that can be related to the referential function of 3SPPs.
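The coding of reference shift set out in note 11 can be sketched as a simple backward search through the preceding subjects. This is only an illustration of that procedure: the `Subject` record and the referent identifiers are hypothetical, and the treatment of cases with no preceding human subject (returning `None`) is an assumption, since the note does not address them.

```python
from typing import NamedTuple, Optional, Sequence


class Subject(NamedTuple):
    referent: str  # hypothetical referent identifier
    human: bool    # whether the subject has a human referent


def reference_shift(preceding: Sequence[Subject], target: str) -> Optional[bool]:
    """Code a clause for reference shift.

    `preceding` lists the subjects of the earlier clauses, oldest first.
    Textual distance is ignored: the search moves backwards, skips clauses
    with non-human subjects, and compares the first human subject found
    with the referent of the subject at issue (`target`).
    Returns True for a shift, False for continuity, and None when no human
    subject precedes (an assumption, not covered by the note)."""
    for subject in reversed(preceding):
        if not subject.human:
            continue  # non-human subject: disregard this clause
        return subject.referent != target
    return None


# Invented context: two human subjects followed by a non-human one.
context = [
    Subject("Juan", True),
    Subject("María", True),
    Subject("pelo", False),
]
continuity = reference_shift(context, "María")  # nearest human subject matches
shift = reference_shift(context, "Juan")        # a human subject intervenes
```

Here a subject referring to María is coded as continuous, because the non-human subject is skipped, whereas a subject referring to Juan is coded as a shift, because María intervenes.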


This paper has reviewed two types of motivation for the expression of 3SPPs in Spanish, the first having to do mostly with informational values, such as focus and contrast, and the second with the possibilities of 3SPPs as referential indications. As for the former, corpus evidence has shown that the use of 3SPPs as focalised elements can only account for a small proportion of their occurrences. As for the latter, a qualitative analysis has revealed that clauses with and without 3SPPs can occur in the same contexts, indicating that a probabilistic analysis of the preference for one structure or the other was required. From this analysis one can conclude that 3SPPs are quite infrequent and that, in the absence of a lexical subject, a third person finite verb with no overt subject is usually used. Of the various factors analysed, three have proven to be significant for the presence of a 3SPP: stative verbs, the presence of an intervening human subject between the antecedent and the anaphoric 3SPP, and the presence of an overt subject (phrasal or pronominal) in the previous clause. In fact, the latter two factors can be regarded as partially contradictory. That an intervening human subject favours the use of a 3SPP can be interpreted in relation to the referential potential of this form, since an intervening subject constitutes a disruption with respect to the pronoun’s antecedent. Thus, this could suggest that 3SPPs are preferentially used when their antecedents are more distant or less accessible than those of zero subjects. On the other hand, the fact that previous overt subjects, irrespective of their reference, favour 3SPPs, whilst previous zeroes, also irrespective of their reference, disfavour them could be taken as evidence for a priming effect, whereby speakers produce overt subjects if they have heard them recently, and vice versa. From the respective weights of these two competing motivations (one with a referential basis, the other with a purely formal one), it seems that the presence or absence of an overt subject plays a more prominent role than the intervention of a disruptive human reference.

In summary, the expression of pronominal subjects in Spanish is a complex phenomenon affected by multiple factors (information structure, referential and anaphoric relations, priming). Further research could help to understand it and shed light on what at first sight appear to be contradictory aspects, such as the competition between purely formal and semantico-referential motivations.


  1.

    The term is mentioned here due to its widespread use, but the process it suggests (i.e., that the expression of subject pronouns is a sort of basic syntactic configuration and, for whatever reasons, they are dropped at some point of the derivation) is by no means assumed in this study.

  2.

    Information units are usually identified with clauses. However, authors such as Halliday (1967, p. 200ff.) warn against the automatic identification of both, in spite of their frequent overlapping. A bi-univocal correspondence that seems to hold with absolute regularity is the one that takes place between information and intonation units.

  3.

    Lambrecht (1994, p. 49) points out that the informational division signalled by foci refers to propositional contents, rather than to individual entities referred to in the discourse. The distinction is then relational, operates with regard to the presupposed and non-presupposed part of a given proposition and is independent of the (non-)identifiability of a certain referent. Similar views have been held by Akmajian (1973, p. 218), Kuno (1972, p. 272) and, less clearly, by Halliday (1967, p. 206), who distinguishes between “inherent” and “structural” givenness.

  4.

    Of course, a much more natural answer would be a phrasal structure (Un ratón ‘a mouse’), rather than a clausal one.

  5.

    The transcription conventions of the Val.Es.Co corpus have sometimes been altered or suppressed for the sake of clarity or to comply with formatting requirements.

  6.

    If the pronoun is modified by solo, its omission seems possible: estudiaba él solo ≈ estudiaba solo (‘he studied on his own’). Both él solo and solo seem to function as predicative complements. Mismo, however, always occurs as a modifier of a noun or a pronoun: lo hizo él mismo/*lo hizo mismo (‘he himself made it’).

  7.

    In other words, in (9) the topic of conversation (a football game) implies a closed set of referents (the opponent teams or their members) and the act of (non-)scoring. Likewise, if the addressee assumes that somebody ate a hamburger and, given a certain situation, supposes a set of candidates who could have done so, including Mary and John, and the speaker states MARY ate the hamburger, he adds information to a given presupposition and denies the participation of John. In the case of (13), the staying of él does not entail other leavings or arrivals, nor does it contradict the leaving of ella.

  8.

    As an anonymous reviewer has pointed out, both structures also differ prosodically: after a fronted topic an intonation break seems possible, whilst fronted foci reject such breaks.

  9.

    Contrastive contexts, nevertheless, are associated with higher rates of subject expression than the rest of their sample.

  10.

    Other third-person antecedents like pelo, ‘hair’, or the subject of se nota, ‘it is evident’, are irrelevant, due to their non-human character, which makes them incompatible with the pronoun él.

  11.

    Textual distance was not taken into account: if the anaphor at issue had the same referent as the previous subject, no reference shift was annotated; if the first preceding subject with a human referent was not the antecedent of the anaphor at issue, it was considered a case of reference shift. If the subject was not human, the clause was disregarded and the reference shift/continuity was established from the next preceding one, and so on.

  12.

    All instances of these verbs in the first person singular of the present were considered to convey epistemic indications, rather than to refer to events/states, irrespective of the presence of the complement conjunction que.

  13.

    Thus, for instance, in Bentivoglio (1987) verbs of thought slightly disfavour the expression of pronominal subjects, whereas in Blanco Canales (1999) and Samper Padilla et al. (2006), who use the same classification as Bentivoglio, they favour it; in Travis (2005) verbs of speech favour pronominal subject expression, whilst in the three other studies cited they have the opposite effect, etc.

  14.

    A similar stance can be found in the Construction Grammar framework, where different surface forms are said to be “typically associated with slightly different semantic and/or discourse functions” (Goldberg 2006, p. 9). It remains unclear, however, whether the alternation between presence and absence of pronominal subjects in Spanish would count as different constructions within this framework.

  15.

    A data base containing syntactic information extracted from a corpus made up of written and spoken genres:



This research has been partially funded by a postdoctoral grant (Xunta de Galicia POS-A/2013/191).

Compliance with Ethical Standards

Conflict of interest

The author declares that he has no conflict of interest.


