Abstract
This paper sketches and illustrates the research opportunities that come with the recent addition to BNCweb of very large numbers of audio files for the spoken component of the BNC. It aims to demonstrate that the availability of the audio files enables researchers not only to correct the orthographic transcripts but also to re-transcribe the conversations using conversation-analytic (CA) transcription. It also shows that the CA transcripts can be integrated into the BNC’s XML annotation network and illustrates how XML query tools such as XPath and XQuery can be used to exploit that network efficiently. The main thrust of the paper is to argue that the integration of corpus-linguistic (CL) and conversation-analytic transcription in XML can make major contributions to both CL and CA. CL research into conversation can for the first time be performed on the basis of transcription that is “detailed enough to facilitate the analyst’s quest to discover and describe orderly practices of social action in interaction” (Hepburn and Bolden 2013: 58), while CA research can gain a large-scale quantitative basis to substantiate claims about the generalizability of observed regularities and patterns in talk-in-interaction. To illustrate the benefits of doing research on re-transcriptions of the BNC’s audio files, a case study is presented on backchannels occurring in overlap in storytelling interaction. The case study reveals, inter alia, that backchannels produced by story recipients simultaneously with parts of the storyteller’s ongoing turn tend to increase in frequency as the storytelling reaches its climax. Backchannel overlap is thus in synchrony with story organization.
This finding adds weight to Goodwin’s observation that recipients attend to the task “not simply of listening to the events being recounted but rather of distinguishing different subcomponents of the talk in terms of the alternative possibilities for action they invoke” (Goodwin 1984: 243). The case study also presents exploratory evidence to suggest that, arguably due to the extended length of storytelling turns (Ochs and Capps 2001), the proportion of overlap in running speech may be considerably lower in storytelling than in general conversation and telephone conversation.
Notes
The C-ORAL-ROM project started a tradition continued by corpora for Brazilian Portuguese (C-ORAL-BRASIL), Japanese (C-ORAL-JAPAN), and Chinese (C-ORAL-CHINA).
In Levinson and Torreira (2015), the figure of 5% refers to overlap in turns including all silent parts, i.e., inter- and intra-speaker pauses; when silent parts are excluded, the figure is 3.8%. This proportion is clearly closer to the 3.2% obtained from the model. However, the model’s most critical coefficient, the slope, is calculated on the basis of turn lengths including silent parts, so the 3.2% is best compared with the 5%.
The proportion of 8% is not stated explicitly but can be read off the cumulative distribution (summation) of response times shorter than 0 ms (overlaps) on the right-hand scale (scale b) of Fig. 5 on page 289 of Norwine and Murphy (1938). I am indebted to Mattias Heldner (personal email communication) for this pointer.
Another contributing factor to greater turn length in storytelling is the significantly greater number of storyteller pauses within storytelling turns (cf. Rühlemann 2013).
References
Aijmer, K., & Rühlemann, C. (Eds.). (2015). Corpus pragmatics. A handbook. Cambridge: Cambridge University Press.
Albert, S., de Ruiter, L. E., & de Ruiter, J. P. (2015). CABNC: the Jeffersonian transcription of the Spoken British National Corpus. https://saulalbert.github.io/CABNC/.
Bolden, G. (2004). The quote and beyond: Defining boundaries of reported speech in conversational Russian. Journal of Pragmatics, 36, 1071–1118.
Campillos, L. L. (2014). A Spanish learner oral corpus for computer-aided error analysis. Corpora, 9(2), 207–238.
Coleman, J., Baghai-Ravary, L., Pybus, J., & Grau, S. (2012). Audio BNC: The audio edition of the Spoken British National Corpus. Phonetics Laboratory, University of Oxford. http://www.phon.ox.ac.uk/AudioBNC
Cresti, E., & Moneglia, M. (Eds.). (2005). C-ORAL-ROM: Integrated reference corpora for spoken Romance languages. Amsterdam: Benjamins.
Crowdy, S. (1994). Spoken corpus transcription. Literary and Linguistic Computing, 9(1), 25–28.
Crowdy, S. (1995). The BNC spoken corpus. In G. Leech, G. Myers, & J. Thomas (Eds.), Spoken English on computer: Transcription, mark-up and application (pp. 225–234). London: Longman.
Gardner, R. (1998). Between speaking and listening: The vocalisation of understandings. Applied Linguistics, 19(2), 204–224.
Goffman, E. (1981). Forms of talk. Philadelphia: University of Pennsylvania Press.
Goodwin, C. (1984). Notes on story structure and the organization of participation. In J. M. Atkinson & J. Heritage (Eds.), Structures of social action: Studies in conversation analysis (pp. 225–246). Cambridge: Cambridge University Press.
Gries, S. Th. (2009a). Quantitative corpus linguistics with R. A practical introduction. New York: Routledge.
Gries, S. Th. (2009b). Statistics for linguistics with R. A practical introduction. Berlin: Mouton de Gruyter.
Hardie, A. (2014). Modest XML for corpora: Not a standard, but a suggestion. ICAME Journal, 38, 73–103.
Heldner, M., & Edlund, J. (2010). Pauses, gaps and overlaps in conversations. Journal of Phonetics, 38, 555–568. doi:10.1016/j.wocn.2010.08.002.
Hepburn, A., & Bolden, G. (2013). The conversation-analytic approach to transcription. In J. Sidnell & T. Stivers (Eds.), The handbook of conversation analysis (pp. 57–76). Malden, MA: Wiley.
Hoey, M., & O’Donnell, M. B. (2008). Lexicography, grammar, and textual position. International Journal of Lexicography, 21(3), 293–309.
Hoffmann, S., Evert, S., Smith, N., Lee, D., & Berglund Prytz, Y. (2008). Corpus linguistics with BNCweb—A practical guide. Frankfurt am Main: Peter Lang.
Holmes, J., & Stubbe, M. (1997). Good listeners: Gender differences in New Zealand conversation. Women and Language, 20(2), 7–14.
Holt, E. (1996). Reporting talk: The use of direct reported speech in conversation. Research on Language and Social Interaction, 29(3), 219–245.
Jefferson, G. (1973). A case of precision timing in ordinary conversation: Overlapped tag-positioned address terms in closing sequences. Semiotica, 9, 47–96.
Jefferson, G. (1979). A technique for inviting laughter and its subsequent acceptance/declination. In G. Psathas (Ed.), Everyday language: Studies in ethnomethodology (pp. 79–95). New York: Irvington Publishers.
Jefferson, G. (1985). An exercise in the transcription and analysis of laughter. In T. A. van Dijk (Ed.), Handbook of discourse analysis (Vol. 3, pp. 25–34). London: Academic Press.
Jefferson, G. (1986). Notes on ‘latency’ in overlap onset. Human Studies, 9, 153–183.
Kallen, J. L., & Kirk, J. (2012). SPICE-Ireland: A user’s guide. Belfast: Cló Ollscoil na Banríona.
Kjellmer, G. (2009). Where do we backchannel? International Journal of Corpus Linguistics, 14(1), 81–112.
Lerner, G. (1996). On the “semi-permeable” character of grammatical units in conversation: Conditional entry into the turn space of another speaker. In E. Ochs, E. A. Schegloff, & S. A. Thompson (Eds.), Interaction and grammar (pp. 238–276). Cambridge: Cambridge University Press.
Levinson, S. C. (1983). Pragmatics. Cambridge: Cambridge University Press.
Levinson, S. C., & Torreira, F. (2015). Timing in turn-taking and its implications for processing models of language. Frontiers in Psychology, 6, 731. doi:10.3389/fpsyg.2015.00731.
Liddicoat, A. J. (2007). An introduction to conversation analysis. London: Continuum.
Mayes, P. (1990). Quotation in spoken English. Studies in Language, 14, 325–363.
McCarthy, M. (2003). Talking back: ‘Small’ interactional response tokens in everyday conversation. Research on Language and Social Interaction, 36(1), 33–63.
Nichols, T. E., & Holmes, A. P. (2001). Nonparametric permutation tests for functional neuroimaging: A primer with examples. Human Brain Mapping, 15, 1–25.
Norwine, A. C., & Murphy, O. J. (1938). Characteristic time intervals in telephonic conversation. The Bell System Technical Journal, 17, 281–291.
O’Keeffe, A., & Adolphs, S. (2008). Response tokens in British and Irish discourse. Corpus, context and variational pragmatics. In K. P. Schneider & A. Barron (Eds.), Variational pragmatics (pp. 69–98). Amsterdam: John Benjamins.
Ochs, E., & Capps, L. (2001). Living narrative. Cambridge, MA: Harvard University Press.
Peters, P., & Wong, D. (2015). Turn management and backchannels. In K. Aijmer & C. Rühlemann (Eds.), Corpus pragmatics. A handbook (pp. 408–429). Cambridge: Cambridge University Press.
Robinson, J. D. (2007). The role of numbers and statistics within conversation analysis. Communication Methods and Measures, 1(1), 65–75.
Rossano, F. (2013). Gaze in conversation. In J. Sidnell & T. Stivers (Eds.), The handbook of conversation analysis (pp. 308–329). Malden, MA: Wiley.
Rühlemann, C. (2013). Narrative in English conversation: A corpus analysis. Cambridge: Cambridge University Press.
Rühlemann, C., Bagoutdinov, A., & O’Donnell, M. B. (2015). Modest XPath and XQuery for corpora: Exploiting deep XML annotation. ICAME Journal, 39, 47–84.
Rühlemann, C., & Gries, S. T. (2015). Turn order and turn distribution in multi-party storytelling. Journal of Pragmatics, 87, 171–191.
Rühlemann, C., & O’Donnell, M. B. (2012). Towards a corpus of conversational narrative. Construction and annotation of the Narrative Corpus. Corpus Linguistics and Linguistic Theory, 8(2), 313–350.
Sacks, H. (1984). Notes on methodology. In J. M. Atkinson & J. Heritage (Eds.), Structures of social action (pp. 21–27). Cambridge: Cambridge University Press.
Sacks, H. (1992). Lectures on conversation. (Vols. I and II). Oxford: Blackwell.
Sacks, H., Schegloff, E. A., & Jefferson, G. (1974). A simplest systematics for the organization of turn-taking for conversation. Language, 50(4), 696–735.
Schegloff, E. A. (1982). Discourse as an interactional achievement: Some uses of ‘uh huh’ and other things that come between sentences. In D. Tannen (Ed.), Analyzing discourse: Text and talk. Georgetown University round table on languages and linguistics (pp. 71–93). Washington, DC: Georgetown University Press.
Schegloff, E. A. (2000). Overlapping talk and the organization of turn-taking for conversation. Language in Society, 29, 1–63.
Schmidt, T., & Schütte, W. (2014). FOLKER: An annotation tool for efficient transcription of natural, multi-party interaction. http://www.lrec-conf.org/proceedings/lrec2010/pdf/18_Paper.pdf.
Schmidt, T., & Wörner, K. (2014). EXMARaLDA. In J. Durand, U. Gut, & G. Kristoffersen (Eds.), The Oxford handbook of corpus phonology (pp. 402–419). Oxford: Oxford University Press.
Stivers, T. (2008). Stance, alignment, and affiliation during storytelling: When nodding is a token of affiliation. Research on Language and Social Interaction, 41(1), 31–57.
Stivers, T. (2013). Sequence organization. In J. Sidnell & T. Stivers (Eds.), The handbook of conversation analysis (pp. 191–209). Malden, MA: Wiley.
Stivers, T. (2015). Coding social interaction: A heretical approach in conversation analysis? Research on Language and Social Interaction, 48(1), 1–19.
Stivers, T., & Sidnell, J. (2013). Introduction. In J. Sidnell & T. Stivers (Eds.), The handbook of conversation analysis (pp. 1–8). Malden, MA: Wiley.
ten Bosch, L., Oostdijk, N., & Boves, L. (2005). On temporal aspects of turn taking in conversational dialogues. Speech Communication, 47, 80–86.
Tolins, J., & Fox Tree, J. E. (2014). Addressee backchannels steer narrative development. Journal of Pragmatics, 70, 152–164.
Walker, M. A. (1993). Informational redundancy and resource bounds in dialogue (Ph.D. thesis). University of Pennsylvania, Philadelphia, PA.
Walmsley, P. (2007). XQuery. Sebastopol, CA: O’Reilly.
Watt, A. (2002). XPath essentials. New York: Wiley.
Wittenburg, P., Brugman, H., Russel, A., Klassmann, A., & Sloetjes, H. (2006). ELAN: A professional framework for multimodality research. In Proceedings of LREC 2006, Fifth international conference on language resources and evaluation.
Wong, D., & Peters, P. (2007). A study of backchannels in regional varieties of English, using corpus mark-up as the means of identification. International Journal of Corpus Linguistics, 12(4), 479–509.
Woods, A., Fletcher, P., & Hughes, A. (1986). Statistics in language studies. Cambridge: Cambridge University Press.
Yngve, V. (1970). On getting a word in edgewise. In Papers from the sixth regional meeting of the Chicago Linguistic Society (pp. 567–577). University of Chicago.
Appendix
See Table 3.
Cite this article
Rühlemann, C. Integrating Corpus-Linguistic and Conversation-Analytic Transcription in XML: The Case of Backchannels and Overlap in Storytelling Interaction. Corpus Pragmatics 1, 201–232 (2017). https://doi.org/10.1007/s41701-017-0018-7
DOI: https://doi.org/10.1007/s41701-017-0018-7