Abstract
This chapter presents details of the Annotation Task of the Big Australian Speech Corpus (Big ASC) project, in which AusTalk, a large audio-visual corpus of Australian English, was collected. We describe the scope of the task and its implementation, and give an overview of the results so far. When complete, AusTalk will consist of 3 h of audio-visual recordings from each of 1000 speakers of Australian English, across a wide range of tasks including scripted (read) speech, spontaneous speech and dialogue. The read speech of 100 participants has now been manually annotated, but producing transcriptions for the unscripted (spontaneous) speech data was a challenge for the project. We report on several avenues that have been explored for automating this task, and describe the annotation challenges, the processes that were adopted and the limitations of automated transcription.
Copyright information
© 2017 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Cassidy, S., Estival, D., Cox, F. (2017). Case Study: The AusTalk Corpus. In: Ide, N., Pustejovsky, J. (eds) Handbook of Linguistic Annotation. Springer, Dordrecht. https://doi.org/10.1007/978-94-024-0881-2_49
DOI: https://doi.org/10.1007/978-94-024-0881-2_49
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-024-0879-9
Online ISBN: 978-94-024-0881-2
eBook Packages: Social Sciences (R0)