Discourse prosody planning in native (L1) and nonnative (L2) (L1-Bengali) English: a comparative study

International Journal of Speech Technology

Abstract

This paper compares discourse-level speech planning in L1 English and L2 English (L1 Bengali) to investigate how the two speaker groups differ in organizing discourse-level speech planning. To this end, English speech from 10 L1 English speakers and 40 L1 Bengali speakers producing the same discourse is analyzed in terms of prosodic and acoustic cues within the hierarchical discourse prosody framework. The analysis reveals between-group differences in speech rate, in the locations of discourse boundary breaks, and in the size and scope of speech planning and chunking units. L1 English speakers speak at a higher rate than L2 English speakers, and L2 English speech contains more boundary breaks than L1 English speech at every discourse level, which indicates that L2 English speakers use more intermediate chunking units and larger-scale planning units than L1 English speakers. Between-group differences are also found in the phrase component at the prosodic phrase level and in the accent component at the prosodic word level. These findings can be attributed to L2 English speakers’ improper phrasing, improper word-level prominence, and an unclear distinction between content words and function words. The study concludes that the deficiencies in L1 Bengali speakers’ discourse-level English speech planning, relative to L1 English speakers, are due to the influence of L1 (Bengali) prosody at the L2 discourse level.

Author information

Corresponding author

Correspondence to Shambhu Nath Saha.

Appendices

Appendix 1: detailed information of L1 American English speakers

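| ID | Gender | Age (years) | Educational level |
|----|--------|-------------|-------------------|
| 1  | Male   | 23 | Undergraduate |
| 2  | Male   | 25 | Postgraduate  |
| 3  | Male   | 21 | Undergraduate |
| 4  | Male   | 22 | Undergraduate |
| 5  | Male   | 27 | Postgraduate  |
| 6  | Female | 28 | Postgraduate  |
| 7  | Female | 24 | Postgraduate  |
| 8  | Female | 26 | Postgraduate  |
| 9  | Female | 22 | Undergraduate |
| 10 | Female | 23 | Undergraduate |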

Appendix 2: detailed information of L1 Bengali speakers

| ID | Gender | Age (years) | Educational level | Discipline | Number of years studied English |
|----|--------|-------------|-------------------|------------|---------------------------------|
| 7  | Male   | 25 | Postgraduate  | Engineering    | 10 |
| 18 | Male   | 27 | Postgraduate  | Engineering    | 12 |
| 32 | Male   | 22 | Undergraduate | Science        | 10 |
| 39 | Female | 20 | Undergraduate | Engineering    | 11 |
| 40 | Female | 23 | Undergraduate | Arts           | 14 |
| 9  | Male   | 22 | Undergraduate | Science        | 12 |
| 31 | Male   | 31 | PhD           | Education      | 21 |
| 10 | Male   | 24 | Undergraduate | Engineering    | 16 |
| 25 | Male   | 34 | PhD           | Engineering    | 25 |
| 38 | Female | 32 | PhD           | Engineering    | 24 |
| 8  | Female | 27 | Postgraduate  | Humanity       | 19 |
| 14 | Male   | 26 | Postgraduate  | Arts           | 20 |
| 1  | Male   | 35 | PhD           | Engineering    | 27 |
| 26 | Male   | 28 | Postgraduate  | Science        | 21 |
| 3  | Female | 29 | Postgraduate  | Social Science | 23 |
| 13 | Female | 26 | Postgraduate  | Engineering    | 16 |
| 19 | Female | 32 | PhD           | Engineering    | 23 |
| 20 | Male   | 35 | PhD           | Engineering    | 25 |
| 21 | Female | 28 | Postgraduate  | Science        | 20 |
| 4  | Male   | 25 | Undergraduate | Engineering    | 17 |
| 2  | Male   | 27 | Postgraduate  | Engineering    | 18 |
| 23 | Male   | 21 | Undergraduate | Law            | 11 |
| 28 | Female | 32 | PhD           | Engineering    | 22 |
| 36 | Male   | 25 | Undergraduate | Engineering    | 16 |
| 35 | Female | 20 | Undergraduate | Humanity       | 12 |
| 37 | Female | 25 | Postgraduate  | Arts           | 17 |
| 6  | Male   | 29 | PhD           | Science        | 23 |
| 24 | Female | 31 | Undergraduate | Science        | 26 |
| 34 | Female | 27 | Postgraduate  | Engineering    | 21 |
| 16 | Male   | 33 | PhD           | Engineering    | 27 |
| 30 | Female | 35 | PhD           | Social Science | 28 |
| 33 | Female | 24 | Undergraduate | Social Science | 17 |
| 12 | Male   | 29 | PhD           | Engineering    | 22 |
| 11 | Female | 35 | PhD           | Social Science | 23 |
| 17 | Female | 30 | Postgraduate  | Science        | 20 |
| 15 | Female | 29 | Postgraduate  | Engineering    | 21 |
| 29 | Female | 22 | Undergraduate | Engineering    | 15 |
| 22 | Male   | 26 | Postgraduate  | Engineering    | 19 |
| 27 | Female | 21 | Undergraduate | Science        | 14 |
| 5  | Male   | 35 | Postgraduate  | Social Science | 29 |

Appendix 3: AESOP’s specified recording setup

The speech of the L1 and L2 (L1 Bengali) English speakers was recorded using AESOP’s recording tool kit on AESOP’s specified recording platform. The recording setup is described in detail below.

  • Recording environment

Recording can be conducted in a quiet room, such as a seminar room, lab, or classroom.

  • Sound card

The Sennheiser PC155 headset comes equipped with a built-in sound card. No driver installation is required.

  • Recording machine

Either a desktop PC or a laptop may be used. Connect the headset to the PC or laptop. The CUHK-SIAT recording tool is compatible with the following operating system: Windows XP Service Pack 2.

  • Recording tool kit

The CUHK-SIAT recording tool was developed by the Chinese University of Hong Kong in collaboration with the Shenzhen Institutes of Advanced Technology. It was subsequently modified to fit the requirements of this project.

  • Audio file format

Source: microphone

Sampling rate: 16 kHz

Bit depth: 16-bit

Channel: mono
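As an illustration only, the audio format listed above (16 kHz, 16-bit, mono) can be checked programmatically before analysis. The sketch below assumes the recordings are stored as uncompressed WAV files, which the appendix does not state; the function name `check_wav_format` is hypothetical and is not part of the AESOP or CUHK-SIAT tools.

```python
import wave

# Target format taken from the "Audio file format" list in Appendix 3.
EXPECTED_RATE_HZ = 16000         # sampling rate: 16 kHz
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16-bit samples
EXPECTED_CHANNELS = 1            # mono


def check_wav_format(path):
    """Return a list of mismatch messages; an empty list means the file conforms.

    Assumption: the recording is an uncompressed (PCM) WAV file.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sampling rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH_BYTES:
            problems.append(f"sample width is {8 * wav.getsampwidth()} bits")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
    return problems
```

Calling `check_wav_format("speaker01.wav")` on a hypothetical recording returns an empty list when the file matches the stated format and a list of human-readable mismatches otherwise.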

About this article

Cite this article

Saha, S.N., Mandal, S.K.D. Discourse prosody planning in native (L1) and nonnative (L2) (L1-Bengali) English: a comparative study. Int J Speech Technol 20, 305–326 (2017). https://doi.org/10.1007/s10772-017-9409-1

  • DOI: https://doi.org/10.1007/s10772-017-9409-1
