Having shown that both L1 and L2 comprehenders allow for pronouns to be linked to QPs inside relative clauses in Experiment 1A, we carried out an eye-movement-monitoring-during-reading experiment to examine whether (and if so, when) non c-commanding QP antecedents are considered during processing.
Method
Participants
Participants were recruited from the University of Potsdam community and from the Berlin area. We tested 63 native German speakers, the data from three of whom were later excluded because of track loss. The remaining 60 participants (22 male) had a mean age of 24.6 years (range: 19–38 years). The non-native participant group consisted of 50 L2 speakers of German with Russian as their L1. The data from three L2 speakers were excluded because of track loss. The remaining 47 participants (7 male) had a mean age of 25.7 years (range: 18–38 years). They had started learning German between the ages of 6 and 25 years (mean: 13.4 years), and all of them had been living in Germany for at least six months at the time of testing (mean: 7.3 years, range: 0.5–23 years). To obtain an indication of our L2 participants' proficiency in German, we asked them to complete the web-based Goethe Institute Placement Test (Goethe Institute, 2010). Participants' mean Goethe test score was 25/30 points, with a range from 18 to 30 points. These scores placed them within the B2–C2 range according to the Common European Framework of Reference for Languages. All participants had normal or corrected-to-normal vision and provided their informed consent to participate in our study.
Materials
Our eye-movement experiment had a 2 × 2 design that was modelled after Cunnings et al.'s (2015) Experiment 4, but with a QP located in object rather than subject position. Twenty-four stimulus quadruplets were constructed by manipulating the gender match between the personal pronoun er ('he') and two potential antecedents, a c-commanding DP and a non c-commanding QP. The sentence containing the critical pronoun was preceded by a short lead-in sentence, which served to set the scene, and followed by a closing sentence. The four experimental conditions are exemplified by (12a–d) below.
(12) LEAD-IN SENTENCE:
     Der Schlosspark war riesig gross.
     'The castle park was extremely large.'

  a. DP Match, QP Match
     Der  König,  der      jeden  Gärtner            kannte,
     the  king    who.NOM  every  gardener.ACC-MASC  knew
     war  überzeugt,  dass  er  mehr  Bäume  pflanzen  sollte.
     was  convinced   that  he  more  trees  plant     should

  b. DP Match, QP Mismatch
     Der  König,  der      jede   Gärtnerin             kannte,
     the  king    who.NOM  every  gardener.NOM/ACC-FEM  knew
     war  überzeugt,  dass  er  mehr  Bäume  pflanzen  sollte.
     was  convinced   that  he  more  trees  plant     should

  c. DP Mismatch, QP Match
     Die  Königin,  die      jeden  Gärtner            kannte,
     the  queen     who.NOM  every  gardener.ACC-MASC  knew
     war  überzeugt,  dass  er  mehr  Bäume  pflanzen  sollte.
     was  convinced   that  he  more  trees  plant     should

  d. DP Mismatch, QP Mismatch
     Die  Königin,  die      jede   Gärtnerin             kannte,
     the  queen     who.NOM  every  gardener.NOM/ACC-FEM  knew
     war  überzeugt,  dass  er  mehr  Bäume  pflanzen  sollte.
     was  convinced   that  he  more  trees  plant     should

     'The king/queen, who knew every (male/female) gardener, was convinced that he should plant more trees.'

     CLOSING SENTENCE:
     Dann würde es mehr Vögel im Park geben.
     'There would be more birds in the park then.'
Our experimental sentences were structurally identical to the stimulus items from Experiment 1A containing object QPs (10a). The experimental items were distributed across different presentation lists in a Latin-square design and mixed with 140 fillers, 80 of which were short stimulus texts from two unrelated experiments, yielding 164 items per list in total. Sixteen of the critical and 33 of the filler items were followed by a yes/no comprehension question. The experimental sentences were spread across three lines such that the critical pronoun appeared roughly in the middle of the second line.
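The Latin-square distribution described above can be illustrated with a minimal Python sketch. It assumes four presentation lists (one per condition; the text does not state the number of lists) and a simple rotation scheme. The function name `latin_square_lists` and the condition labels are hypothetical and chosen only for illustration.

```python
# Minimal sketch of a Latin-square assignment, assuming four lists and
# a rotation scheme; this is not necessarily the procedure actually used.
CONDITIONS = ["a", "b", "c", "d"]  # the four conditions in (12a-d)


def latin_square_lists(n_items=24, conditions=CONDITIONS):
    """Return one {item_number: condition} mapping per presentation list."""
    n_lists = len(conditions)
    lists = []
    for list_idx in range(n_lists):
        assignment = {}
        for item in range(1, n_items + 1):
            # Shift the condition by one step from one list to the next,
            # so every item appears once per condition across lists.
            assignment[item] = conditions[(item - 1 + list_idx) % n_lists]
        lists.append(assignment)
    return lists


lists = latin_square_lists()
```

Under these assumptions each list contains six items per condition, and no participant sees the same item in more than one condition.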
Procedure
All participants were tested individually in a dedicated laboratory room. Their eye movements were recorded using an SR Research Eyelink 1000 system with a sampling rate of 1000 Hz. Although participants read binocularly, only their right eye was tracked unless calibration of the right eye was not possible. The stimulus texts were presented in black Courier New font (18 pt) on a white background, and their presentation order was randomized. The experiment started after the eye calibration and the presentation of three practice items, two of which were followed by a comprehension question. Participants were instructed to read the stimulus texts carefully for meaning at their normal reading pace. The L1 speakers finished the experiment in approximately 45 min and the L2 speakers in approximately 70 min. After completing the experiment, each participant received either course credit or a small monetary compensation as a reward for their contribution.
The analysis regions of primary interest included the critical region containing the pronoun and the complementizer preceding it (dass er 'that he'), and the postcritical region consisting of the two words following the pronoun. We extended the pronoun region backwards by one word so as to be able to capture potential parafoveal viewings of the pronoun, which might be skipped during initial reading (e.g. Sturt, 2003). The following eye-movement measures were analysed: first-pass reading times (the summed duration of all fixations on a region of interest before exiting it to the left or right), right-bound reading times (the summed duration of all fixations on a region of interest before exiting it to the right), rereading times (the summed duration of all fixations on a region of interest after it has been exited to the left or to the right), and total reading times (the summed duration of all fixations on a region of interest). We also analysed the probabilities of first-pass regressions from, and of rereading, the two regions of interest.
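To make the measure definitions concrete, the following Python sketch derives them from a single trial's fixation sequence. It assumes that fixations are given in temporal order as (region index, duration in ms) pairs, with regions numbered from left to right. The function name `reading_measures` and the example data are hypothetical, and special handling of skipped regions is omitted.

```python
# Minimal sketch, assuming fixations are (region_index, duration_ms)
# pairs in temporal order; not the authors' actual processing script.
def reading_measures(fixations, region):
    first_pass = right_bound = total = 0
    entered = False          # the region has been fixated at least once
    left_first_pass = False  # the first run of fixations on it has ended
    passed_right = False     # a region further right was fixated after entry
    fp_regression = False    # the first pass ended with an exit to the left
    for reg, dur in fixations:
        if reg == region:
            entered = True
            total += dur
            if not left_first_pass:
                first_pass += dur   # still within the first pass
            if not passed_right:
                right_bound += dur  # region not yet exited to the right
        elif entered:
            if not left_first_pass:
                left_first_pass = True
                fp_regression = reg < region
            if reg > region:
                passed_right = True
    rereading = total - first_pass
    return {
        "first_pass": first_pass,
        "right_bound": right_bound,
        "rereading": rereading,
        "total": total,
        "first_pass_regression": fp_regression,
        "reread": rereading > 0,
    }


# Made-up trial: region 2 is read, regressed from, revisited, exited
# to the right, and finally revisited once more.
fixations = [(1, 200), (2, 250), (2, 180), (1, 150), (2, 220), (3, 300), (2, 190)]
measures = reading_measures(fixations, region=2)
# first_pass = 430, right_bound = 650, rereading = 410, total = 840
```

In this example the revisit after the leftward regression counts towards right-bound and rereading time but not towards first-pass time, which is where the two early measures diverge.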
The data were analysed in R version 3.6.0 (R Core Team, 2019) using mixed modelling with the lmerTest package version 3.1.2 (Kuznetsova et al., 2017). Linear mixed-effects models were fitted for the continuous measures and mixed-effects logistic regressions for the binomial dependent variables. For each region and measure, the effects of interest were entered into a single model. P-values were calculated from the model output rather than from model comparison. Given that the distribution of fixation durations is often right-skewed, we used the Box-Cox procedure (Box & Cox, 1964) to determine the appropriate data transformation for each of the two regions of interest. As a result, a log transformation was applied to each continuous reading-time measure to satisfy the assumption of normality. Statistical analyses were performed on the log-transformed data.
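The logic of the Box-Cox procedure can be sketched in a few lines of Python. The sketch below is a simplified, intercept-only version that evaluates the profile log-likelihood over a grid of λ values; the analysis reported here was run in R on the actual data, so the function names, the grid, and the made-up reading times are all assumptions for illustration. A maximising λ close to 0 is conventionally taken to favour the log transformation.

```python
import math


def boxcox(y, lam):
    """Box-Cox transform of positive values; lam == 0 gives the natural log."""
    if lam == 0:
        return [math.log(v) for v in y]
    return [(v ** lam - 1) / lam for v in y]


def boxcox_loglik(y, lam):
    """Profile log-likelihood (up to a constant) for an intercept-only model."""
    n = len(y)
    z = boxcox(y, lam)
    mean = sum(z) / n
    var = sum((v - mean) ** 2 for v in z) / n  # ML variance estimate
    return -0.5 * n * math.log(var) + (lam - 1) * sum(math.log(v) for v in y)


def best_lambda(y, grid):
    """Return the grid value of lambda with the highest log-likelihood."""
    return max(grid, key=lambda lam: boxcox_loglik(y, lam))


# Made-up, right-skewed reading times in ms (illustration only).
rts = [182, 205, 214, 230, 251, 268, 290, 315, 352, 410, 498, 640, 905]
grid = [i / 10 for i in range(-20, 21)]  # lambda from -2.0 to 2.0
lam_hat = best_lambda(rts, grid)
```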
We first carried out a between-group analysis and subsequently, as several interactions with the factor Group reached significance, analysed the L1 group and the L2 group separately. The fixed parts of the models for the between-group analysis contained the sum-coded two-level factors QP (match, mismatch), DP (match, mismatch), and Group (L1, L2), the centred predictor Trial, and their interactions. The random parts of the models contained, in their simplest version, by-subject and by-item random intercepts (Formula: log(value) ~ QP.sum * DP.sum * group.sum * c.(Trial) + (1|Subj) + (1|Item)). The within-group models contained the sum-coded two-level factors QP (match, mismatch) and DP (match, mismatch), centred Trial, and their interactions in the fixed parts and by-subject and by-item random intercepts in the simplest version of the random parts (Formula: log(value) ~ QP.sum * DP.sum * c.(Trial) + (1|Subj) + (1|Item)). To find out whether a more complex random structure provided a better fit to our data, we fitted a series of models whose random parts differed in complexity. The full random structures contained by-subject and by-item random slopes for QP, DP, and their interaction (+ (1 + QP.sum * DP.sum|Subj) + (1 + QP.sum * DP.sum|Item)). Likelihood ratio tests were applied to determine the adequate level of complexity; for these tests, the models were refitted with full maximum likelihood estimation. We selected the simplest model unless a more complex one proved to be a significantly better fit at an alpha level of 0.05 (Baayen et al., 2008). In the vast majority of cases, this procedure led to the simplest model being selected (see the formulae above). On the rare occasion that convergence for the simplest structure could not be achieved, we removed first Trial, then Trial and the by-Item intercept, and lastly Trial and the by-Subject intercept. Trial refers to the order of item presentation. It was originally included to account for changes in the effects of the predictors over the course of the experiment due to, for example, tiredness (slow-down) or learning (speed-up). As a numeric variable that does not have a meaningful zero value, Trial was mean-centred. Since no interaction with this variable reached statistical significance, Trial will not be discussed any further.
Predictions
Participants' attempts to link the pronoun to a DP or QP antecedent should be reflected in corresponding gender-mismatch effects. If dependency formation is attempted during the initial reading of the pronoun region, we expect to find gender effects in early eye-movement measures, including first-pass and right-bound reading time. If, on the other hand, dependency formation with the DP or QP antecedent is only attempted at later processing stages, we expect to find gender effects restricted to later (rereading time and/or rereading probability) or composite measures (total viewing time), or to the postcritical region. If the QP antecedent is considered later than the DP antecedent or if it is not considered at all, then QP gender effects should be delayed or absent. This outcome would be expected if telescoping readings are only computed during later comprehension stages, or if pronouns that enter into telescoping dependencies are of a special type that is insensitive to gender congruence, as has been suggested by Moulton and Han (2018).
For both L1 and L2 processing, an antecedent search strategy that is based on surface syntax or that is discourse-based would lead comprehenders to focus on the DP antecedent in matrix subject position. If L2 speakers have more difficulty than L1 speakers computing non-isomorphic syntax-semantics mappings in real time, then QP gender effects should be delayed (relative to the L1 group) or altogether absent in this group.
Results
Both groups' comprehension accuracy was high (L1: 96%, range: 80–100%; L2: 93%, range: 73–100%), confirming that participants read the stimulus items actively for meaning. During reading the L1 group skipped the critical region 14.6% and the postcritical one 4.5% of the time. Skipping rates for the L2 group were 6.5% for the critical and 2.8% for the postcritical region. One experimental item and one condition of another item were excluded because they contained an error. Individual fixations shorter than 40 ms or longer than 1000 ms were removed, comprising 0.50% of the L1 data and 0.45% of the L2 data. Individual trials excluded for track loss comprised 0.21% of the L1 data and 0.27% of the L2 data.
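The fixation-duration trimming described above amounts to a simple range filter, sketched here in Python. The cut-offs of 40 ms and 1000 ms come from the text; the function name `filter_fixations` and the example durations are made up for illustration.

```python
MIN_MS, MAX_MS = 40, 1000  # cut-offs reported in the text


def filter_fixations(durations):
    """Drop fixations shorter than MIN_MS or longer than MAX_MS.

    Returns the retained durations and the proportion removed.
    """
    kept = [d for d in durations if MIN_MS <= d <= MAX_MS]
    removed_share = 1 - len(kept) / len(durations)
    return kept, removed_share


# Made-up fixation durations in ms (illustration only).
durations = [35, 120, 240, 40, 1000, 1250, 310, 198]
kept, share = filter_fixations(durations)
# kept = [120, 240, 40, 1000, 310, 198]; share = 0.25
```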
Table 3 provides an overview of the two participant groups' reading times across the four experimental conditions, and Table 4 shows the probabilities of first-pass regressions and of rereading the two interest regions. Table 5 shows our initial between-groups analysis, which revealed main effects of, and interactions with, the factor Group in several eye-movement measures and across both interest regions. Positive estimates for QP and DP indicate longer reading times/higher probabilities in the gender-mismatch as compared to the gender-match condition. Main effects of Group were found in all continuous reading-time measures and reflected the fact that the L2 group generally read more slowly than the L1 group. At the pronoun region, QP by Group interactions were found in first-pass and right-bound reading times. Marginally significant three-way interactions between QP, DP and Group were seen in total reading times at both interest regions. There was a further QP by Group interaction in participants' first-pass regressions at the postcritical region, and DP by Group interactions were found for rereading probability at both interest regions. As these results indicate group differences in participants' reading-time patterns across our experimental conditions, we went on to analyse the data from the two groups separately.
Table 3. Native (L1) and non-native (L2) speakers' mean reading times in the pronoun and postcritical regions in milliseconds (SDs and 95% CIs in parentheses), Experiment 2
Table 4. Native (L1) and non-native (L2) speakers' proportions of first-pass regressions and of rereading the pronoun and postcritical regions (SEs in parentheses), Experiment 2
Table 5. Summary of the statistical between-group analyses of the eye-movement data for the two interest regions, Experiment 2
Table 6 shows the model outputs for the L1 group. At the critical region, significant main effects of QP gender were found in all reading-time measures reported. They reflect the fact that reading times were shorter if the QP's gender matched the gender of the pronoun compared to when it did not.
Table 6. Summary of the statistical analyses of the L1 group's eye-movement data for the two interest regions, Experiment 2
Main effects of DP gender, reflecting shorter reading times for DP-match than for DP-mismatch conditions, were restricted to later or composite eye-movement measures including rereading and total reading times, and were also found for rereading probability. QP by DP interactions were found for rereading probability and total reading times. Pairwise comparisons showed that gender-mismatching QPs led to a significantly higher probability of participants rereading the pronoun region when no gender-matching DP was available (Est. = 0.271, z = 3.223, p = 0.001) but not when a gender-matching DP was available (Est. = 0.017, z = 0.191, p = 0.848). Similarly, mismatching QPs elicited elevated total reading times in the DP-mismatch conditions (Est. = 0.099, z = 4.745, p < 0.001) but not in the DP-match conditions (Est. = 0.011, t = 0.616, p = 0.538).
At the postcritical region, the L1 group showed main effects of QP gender in both first-pass regressions and rereading probability, and main effects of DP gender in first-pass regressions, rereading probability, and in rereading and total reading times. The two factors interacted in right-bound and total reading times. The pattern was the same as the one we observed at the critical region: pairwise comparisons revealed that gender-mismatching QPs led to significantly longer reading times when no matching DP antecedent was available (right-bound time: Est. = 0.050, t = 2.892, p = 0.004; total reading time: Est. = 0.065, t = 3.361, p = 0.001) but not when a matching DP antecedent was available (right-bound time: Est. = −0.026, t = −1.677, p = 0.094; total reading time: Est. = −0.014, t = −0.744, p = 0.457).
The model output for the L2 group's data is shown in Table 7. For this group, no significant effects or interactions were found in early reading-time measures (i.e., in first-pass or right-bound reading times). At the pronoun region, significant or marginal effects of DP gender were seen in rereading and total reading times, as well as in rereading probability. A main effect of QP gender was found in rereading times only, and there were no interactions. At the postcritical region, only DP gender showed any significant effects, which were again restricted to later eye-movement measures (i.e. rereading times and rereading probability).
Table 7. Summary of the statistical analyses of the L2 group's eye-movement data for the two interest regions, Experiment 2
In summary, our two participant groups showed very different reading-time patterns across the four experimental conditions. The L1 group showed main effects of QP gender in early reading-time measures at the pronoun region, as well as in later eye-movement measures and across both regions of interest. Effects of DP gender only appeared in later measures in the critical region and in the postcritical region. The interactions observed in some later measures and at the postcritical region showed QP-mismatch effects being restricted to the DP-mismatch conditions. The L2 group, in contrast, showed no early effects of either antecedent's gender but showed robust effects of DP gender in later eye-movement measures, and no effects of the QP's gender except during their rereading of the pronoun region.
Discussion
In Experiment 2 we asked whether L1 and/or L2 speakers of German would try to link a pronoun to a non c-commanding object QP during processing. The early main effects of QP gender, in the absence of any effects of DP gender, that were seen in the L1 group indicate that the native German speakers primarily considered the QP antecedent during their initial reading of the pronoun. QP effects persisted across several eye-movement measures and both interest regions, indicating that telescoping dependencies were attempted from early on during processing. Effects of the DP antecedent's gender became visible only with some delay. The L1 speakers' eye-movement record shows that they gradually homed in on the DP antecedent over time, but with QP antecedents still being considered if the DP antecedent mismatched the pronoun in gender. A reviewer points out that the interaction pattern we observed in total reading times and rereading probability in the L1 data resembles a facilitatory interference pattern (Jäger et al., 2017). Note, however, that the present study was not designed to test memory interference, as our stimulus sentences contained two permissible antecedents and no inappropriate distractor. Hence neither consideration of the DP nor of the QP antecedent can serve as a reliable diagnostic for interference here.
As noted above, our L1 group's results contrast with the absence of QP gender effects in previous processing experiments (Cunnings et al., 2015; Kush et al., 2015; Moulton & Han, 2018). To our knowledge, the current study is the first to observe gender-mismatch effects for non c-commanding QP antecedents. One possible reason for this is that previous studies used stimulus materials with QPs in subject position, whereas we used QPs in object position. Syntactic approaches to telescoping out of RCs which assume that the QP must undergo QR in order to take scope over the pronoun would account for this apparent subject/object asymmetry in terms of a grammatical constraint which prohibits or penalises the extraction of subjects. This account must remain speculative, though, as we did not compare sentences with subject and object QPs directly (as did Radó et al., 2019).
Our L2 group, in contrast, did not show any evidence of considering the QP antecedent during processing, except fleetingly during their rereading of the pronoun region. Instead, the L2 speakers showed robust effects of DP gender in later eye-movement measures across both interest regions, suggesting that they tried to resolve the pronoun towards the DP antecedent. The L2 group's reading-time pattern is in line with previous findings indicating that L2 speakers prefer to link pronominal elements to discourse-prominent antecedents during processing and avoid quantified antecedents even if they are potential binders.