Integrating Corpus-Linguistic and Conversation-Analytic Transcription in XML: The Case of Backchannels and Overlap in Storytelling Interaction

  • Original Paper
  • Corpus Pragmatics

Abstract

This paper sketches out and illustrates the research opportunities that come with the recent addition to BNCweb of very large numbers of audio files for the spoken component in the BNC. It aims to demonstrate that the availability of the audio files enables researchers not only to correct the orthographic transcripts, but also to re-transcribe the conversations using conversation-analytic transcription. It also shows that the CA transcripts can be integrated into the BNC’s XML annotation network and illustrates how XML query tools such as XPath and XQuery can be used to efficiently exploit the XML network. The main thrust of the paper is to argue that the integration of corpus-linguistic and conversation-analytic transcription in XML can make major contributions both to CL and CA. CL research into conversation can for the first time be performed on the basis of transcription that is “detailed enough to facilitate the analyst’s quest to discover and describe orderly practices of social action in interaction” (Hepburn and Bolden, in: Sidnell, Stivers (eds) The handbook of conversation analysis, Wiley, Malden, 2013: 58), while CA research can gain a large-scale quantitative basis to substantiate claims about the generalizability of observed regularities and patterns in talk-in-interaction. To illustrate the benefits of doing research on re-transcriptions of the BNC’s audio files, a case study is presented on backchannels occurring in overlap in storytelling interaction. The case study reveals, inter alia, that backchannels produced by story recipients simultaneously with parts of the storyteller’s ongoing turn tend to increase in frequency as the storytelling reaches its climax. Backchannel overlap is thus in synchrony with story organization. This finding adds weight to Goodwin’s observation that recipients attend to the task “not simply of listening to the events being recounted but rather of distinguishing different subcomponents of the talk in terms of the alternative possibilities for action they invoke” (Goodwin, in: Atkinson, Heritage (eds) Structures of social action: studies in conversation analysis, Cambridge University Press, Cambridge, 1984: 243). The case study also presents exploratory evidence to suggest that, arguably due to the extended length of storytelling turns (Ochs and Capps, Living narrative, Harvard University Press, Cambridge, 2001), the proportion of overlap in running speech may be considerably lower in storytelling than in general conversation and telephone conversation.
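As a rough illustration of the kind of XPath-based querying the abstract refers to, the following Python sketch counts overlapped words and recipient backchannels in overlap. It is not the paper's method: the element and attribute names (`story`, `u`, `w`, `role`, `overlap`, `type`) and the sample data are invented for this example and do not reproduce the BNC's XML or the XTranscript tagging scheme. It relies only on the limited XPath subset supported by Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# Hypothetical mini-transcript. All element and attribute names here are
# assumptions made for this sketch, not the BNC or XTranscript markup.
SAMPLE = """
<story>
  <u who="A" role="teller">
    <w>so</w><w>he</w><w overlap="o1">opened</w><w overlap="o1">it</w>
  </u>
  <u who="B" role="recipient">
    <w overlap="o1" type="backchannel">mm</w>
  </u>
  <u who="A" role="teller">
    <w>and</w><w>it</w><w>was</w><w overlap="o2">empty</w>
  </u>
  <u who="B" role="recipient">
    <w overlap="o2" type="backchannel">oh</w>
    <w type="backchannel">right</w>
  </u>
</story>
"""


def overlap_stats(xml_text):
    """Count words, overlapped words, and recipient backchannels in overlap."""
    root = ET.fromstring(xml_text.strip())
    # All word tokens anywhere in the story.
    words = root.findall(".//w")
    # Word tokens carrying an overlap attribute, whoever the speaker is.
    overlapped = root.findall(".//w[@overlap]")
    # Backchannel tokens by recipients that are also marked as overlapped.
    bc_in_overlap = root.findall(
        ".//u[@role='recipient']/w[@type='backchannel'][@overlap]"
    )
    return {
        "words": len(words),
        "overlapped": len(overlapped),
        "backchannels_in_overlap": len(bc_in_overlap),
        "overlap_proportion": len(overlapped) / len(words),
    }


if __name__ == "__main__":
    print(overlap_stats(SAMPLE))
```

On a real corpus, the same pattern would be applied per story and per story section, so that backchannel-in-overlap counts can be compared across positions in the storytelling; a full XQuery processor would allow richer queries than the XPath subset used here.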

Notes

  1. The C-ORAL-ROM project started a tradition continued by corpora for Brazilian Portuguese (C-ORAL-BRASIL), Japanese (C-ORAL-JAPAN), and Chinese (C-ORAL-CHINA).

  2. In Levinson and Torreira (2015), the figure of 5% refers to overlap in turns including all silent parts, i.e., inter- and intra-speaker pauses; when silent parts are excluded, the figure is 3.8%. This proportion is clearly closer to the 3.2% obtained from the model. However, the model’s most critical coefficient, the slope, is calculated on the basis of turn lengths including silent parts, so the 3.2% is best compared with the 5%.

  3. The proportion of 8% is not stated explicitly but can be read off the cumulative distribution (summation) of response times shorter than 0 ms (overlaps) on the right-hand scale (scale b) of Fig. 5 on page 289 of Norwine & Murphy’s (1938) study; I am indebted to Mattias Heldner (personal email communication) for this pointer.

  4. Another contributing factor to greater turn length in storytelling is the significantly greater number of storyteller pauses within storytelling turns (cf. Rühlemann 2013).

References

  • Aijmer, K., & Rühlemann, C. (Eds.). (2015). Corpus pragmatics. A handbook. Cambridge: Cambridge University Press.

  • Albert, S., de Ruiter, L. E., & de Ruiter, J. P. (2015). CABNC: the Jeffersonian transcription of the Spoken British National Corpus. https://saulalbert.github.io/CABNC/.

  • Bolden, G. (2004). The quote and beyond: Defining boundaries of reported speech in conversational Russian. Journal of Pragmatics, 36, 1071–1118.

  • Campillos, L. L. (2014). A Spanish learner oral corpus for computer-aided error analysis. Corpora, 9(2), 207–238.

  • Coleman, J., Baghai-Ravary, L., Pybus, J., & Grau, S. (2012). Audio BNC: The audio edition of the Spoken British National Corpus. Phonetics Laboratory, University of Oxford. http://www.phon.ox.ac.uk/AudioBNC

  • Cresti, E., & Moneglia, M. (Eds.). (2005). C-ORAL-ROM: Integrated reference corpora for spoken Romance languages. Amsterdam: Benjamins.

  • Crowdy, S. (1994). Spoken corpus transcription. Literary and Linguistic Computing, 9(1), 25–28.

  • Crowdy, S. (1995). The BNC spoken corpus. In G. Leech, G. Myers, & J. Thomas (Eds.), Spoken English on computer: Transcription, mark-up and application (pp. 225–234). London: Longman.

  • Gardner, R. (1998). Between speaking and listening: The vocalisation of understandings. Applied Linguistics, 19(2), 204–224.

  • Goffman, E. (1981). Forms of talk. Philadelphia: University of Pennsylvania Press.

  • Goodwin, C. (1984). Notes on story structure and the organization of participation. In J. M. Atkinson & J. Heritage (Eds.), Structures of social action: Studies in conversation analysis (pp. 225–246). Cambridge: Cambridge University Press.

  • Gries, S Th. (2009a). Quantitative corpus linguistics with R. A practical introduction. New York: Routledge.

  • Gries, S Th. (2009b). Statistics for linguistics with R. A practical introduction. Berlin: Mouton de Gruyter.

  • Hardie, A. (2014). Modest XML for corpora: Not a standard, but a suggestion. ICAME Journal, 38, 73–103.

  • Heldner, M., & Edlund, J. (2010). Pauses, gaps and overlaps in conversations. Journal of Phonetics, 38, 555–568. doi:10.1016/j.wocn.2010.08.002.

  • Hepburn, A., & Bolden, G. (2013). The conversation-analytic approach to transcription. In J. Sidnell & T. Stivers (Eds.), The handbook of conversation analysis (pp. 57–76). Malden, MA: Wiley.

  • Hoey, M., & O’Donnell, M. B. (2008). Lexicography, grammar, and textual position. International Journal of Lexicography, 21(3), 293–309.

  • Hoffmann, S., Evert, S., Smith, N., Lee, D., & Berglund Prytz, Y. (2008). Corpus linguistics with BNCweb—A practical guide. Frankfurt am Main: Peter Lang.

  • Holmes, J., & Stubbe, M. (1997). Good listeners: Gender differences in New Zealand conversation. Women and Language, 20(2), 7–14.

  • Holt, E. (1996). Reporting talk: The use of direct reported speech in conversation. Research on Language and Social Interaction, 29(3), 219–245.

  • Jefferson, G. (1973). A case of precision timing in ordinary conversation: Overlapped tag-positioned address terms in closing sequences. Semiotica, 9, 47–96.

  • Jefferson, G. (1979). A technique for inviting laughter and its subsequent acceptance/declination. In G. Psathas (Ed.), Everyday language: Studies in ethnomethodology (pp. 79–95). New York: Irvington Publishers.

  • Jefferson, G. (1985). An exercise in the transcription and analysis of laughter. In T. A. van Dijk (Ed.), Handbook of discourse analysis (Vol. 3, pp. 25–34). London: Academic.

  • Jefferson, G. (1986). Notes on ‘latency’ in overlap onset. Human Studies, 9, 153–183.

  • Kallen, J. L., & Kirk, J. (2012). SPICE-Ireland: A user’s guide. Belfast: Cló Ollscoil na Banríona.

  • Kjellmer, G. (2009). Where do we backchannel? International Journal of Corpus Linguistics, 14(1), 81–112.

  • Lerner, G. (1996). On the “semi-permeable” character of grammatical units in conversation: conditional entry into the turn space of another speaker. In E. Ochs, E. A. Schegloff, & S. A. Thompson (Eds.), Interaction and grammar (pp. 238–276). Cambridge: Cambridge University Press.

  • Levinson, S. C. (1983). Pragmatics. Cambridge: Cambridge University Press.

  • Levinson, S. C., & Torreira, F. (2015). Timing in turn-taking and its implications for processing models of language. Frontiers in Psychology, 6, 731. doi:10.3389/fpsyg.2015.00731.

  • Liddicoat, A. J. (2007). An introduction to conversation analysis. London: Continuum.

  • Mayes, P. (1990). Quotation in spoken English. Studies in Language, 14, 325–363.

  • McCarthy, M. (2003). Talking back: ‘Small’ interactional response tokens in everyday conversation. Research on Language and Social Interaction, 36(1), 33–63.

  • Nichols, T. E., & Holmes, A. P. (2001). Nonparametric permutation test for functional neuroimaging: A primer with examples. Human Brain Mapping, 15, 1–25.

  • Norwine, A. C., & Murphy, O. J. (1938). Characteristic time intervals in telephonic conversation. The Bell System Technical Journal, 17, 281–291.

  • O’Keeffe, A., & Adolphs, S. (2008). Response tokens in British and Irish discourse. Corpus, context and variational pragmatics. In K. P. Schneider & A. Barron (Eds.), Variational pragmatics (pp. 69–98). Amsterdam: John Benjamins.

  • Ochs, E., & Capps, L. (2001). Living narrative. Cambridge, MA: Harvard University Press.

  • Peters, P., & Wong, D. (2015). Turn management and backchannels. In K. Aijmer & C. Rühlemann (Eds.), Corpus pragmatics. A handbook (pp. 408–429). Cambridge: Cambridge University Press.

  • Robinson, J. D. (2007). The role of numbers and statistics within conversation analysis. Communication Methods and Measures, 1(1), 65–75.

  • Rossano, F. (2013). Gaze in conversation. In J. Sidnell & T. Stivers (Eds.), The handbook of conversation analysis (pp. 308–329). Malden, MA & Oxford: Blackwell.

  • Rühlemann, C. (2013). Narrative in English conversation: A corpus analysis. Cambridge: Cambridge University Press.

  • Rühlemann, C., Bagoutdinov, A., & O’Donnell, M. B. (2015). Modest XPath and XQuery for corpora: Exploiting deep XML annotation. ICAME Journal, 39, 47–84.

  • Rühlemann, C., & Gries, S. T. (2015). Turn order and turn distribution in multi-party storytelling. Journal of Pragmatics, 87, 171–191.

  • Rühlemann, C., & O’Donnell, M. B. (2012). Towards a corpus of conversational narrative. Construction and annotation of the Narrative Corpus. Corpus Linguistics and Linguistic Theory, 8(2), 313–350.

  • Sacks, H. (1984). Notes on methodology. In J. M. Atkinson & J. Heritage (Eds.), Structures of social action (pp. 21–27). Cambridge: Cambridge University Press.

  • Sacks, H. (1992). Lectures on conversation. (Vols. I and II). Oxford: Blackwell.

  • Sacks, H., Schegloff, E. A., & Jefferson, G. (1974). A simplest systematics for the organization of turn-taking for conversation. Language, 50(4), 696–735.

  • Schegloff, E. A. (1982). Discourse as an interactional achievement: Some uses of ‘uh huh’ and other things that come between sentences. In D. Tannen (Ed.), Georgetown University round table on languages and linguistics analyzing discourse: Text and talk (pp. 71–93). Washington, DC: Georgetown University Press.

  • Schegloff, E. A. (2000). Overlapping talk and the organization of turn-taking for conversation. Language in Society, 29, 1–63.

  • Schmidt, T., & Schütte, W. (2014). FOLKER: An annotation tool for efficient transcription of natural, multi-party interaction. http://www.lrec-conf.org/proceedings/lrec2010/pdf/18_Paper.pdf.

  • Schmidt, T., & Wörner, K. (2014). EXMARaLDA. In J. Durand, U. Gut, & G. Kristoffersen (Eds.), The Oxford handbook of corpus phonology (pp. 402–419). Oxford: Oxford University Press.

  • Stivers, T. (2008). Stance, alignment, and affiliation during storytelling: When nodding is a token of affiliation. Research on Language and Social Interaction, 41(1), 31–57.

  • Stivers, T. (2013). Sequence organization. In J. Sidnell & T. Stivers (Eds.), The handbook of conversation analysis (pp. 191–209). Malden, MA: Wiley.

  • Stivers, T. (2015). Coding social interaction: A heretical approach in conversation analysis? Research on Language and Social Interaction, 48(1), 1–19.

  • Stivers, T., & Sidnell, J. (2013). Introduction. In J. Sidnell & T. Stivers (Eds.), The handbook of conversation analysis (pp. 1–8). Malden, MA: Wiley.

  • ten Bosch, L., Oostdijk, N., & Boves, L. (2005). On temporal aspects of turn taking in conversational dialogues. Speech Communication, 47, 80–86.

  • Tolins, J., & Fox Tree, J. E. (2014). Addressee backchannels steer narrative development. Journal of Pragmatics, 70, 152–164.

  • Walker, M. A. (1993). Informational redundancy and resource bounds in dialogue (Ph.D. thesis). University of Pennsylvania, Philadelphia, PA.

  • Walmsley, P. (2007). XQuery. Sebastopol, CA: O’Reilly.

  • Watt, A. (2002). XPath essentials. New York: Wiley.

  • Wittenburg, P., Brugman, H., Russel, A., Klassmann, A., & Sloetjes, H. (2006). ELAN: A professional framework for multimodality research. In Proceedings of LREC 2006, Fifth international conference on language resources and evaluation.

  • Wong, D., & Peters, P. (2007). A study of backchannels in regional varieties of English, using corpus mark-up as the means of identification. International Journal of Corpus Linguistics, 12(4), 479–509.

  • Woods, A., Fletcher, P., & Hughes, A. (1986). Statistics in language studies. Cambridge: Cambridge University Press.

  • Yngve, V. (1970). On getting a word in edgewise. In Papers from the sixth regional meeting of the Chicago Linguistic Society (pp. 567–577). University of Chicago.

Author information

Corresponding author

Correspondence to Christoph Rühlemann.

Appendix

See Table 3.

Table 3 XTranscript tagging scheme

About this article

Cite this article

Rühlemann, C. Integrating Corpus-Linguistic and Conversation-Analytic Transcription in XML: The Case of Backchannels and Overlap in Storytelling Interaction. Corpus Pragmatics 1, 201–232 (2017). https://doi.org/10.1007/s41701-017-0018-7

  • DOI: https://doi.org/10.1007/s41701-017-0018-7
