New Machine Scores and Their Combinations for Automatic Mandarin Phonetic Pronunciation Quality Assessment

Pan, Fuping; Zhao, Qingwei; Yan, Yonghong

doi:10.1007/978-3-540-74819-9_101

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4692)

Included in the following conference series:

International Conference on Knowledge-Based and Intelligent Information and Engineering Systems

Abstract

This paper discusses Mandarin vowel pronunciation quality assessment. Phonetic pronunciation quality is traditionally evaluated within the speech recognition framework using the phonetic posterior probability score, which can be computed either by normalizing the frame-based posterior probability or directly on the phone segment. With the first method, we achieve a human-machine scoring correlation coefficient (CC) of 0.832 for vowels; with the second, the CC can be up to 0.847. This paper also proposes a novel formant feature and applies it to vowel evaluation: we transform the formant plots on the time-frequency plane into a bitmap and extract Gabor features from it for pattern classification. Using the classification probability for pronunciation assessment, we obtain a CC of 0.842. Finally, we combine the three scores with various linear and nonlinear methods; the best CC, 0.913, is obtained with a neural network.
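
As a rough illustration of the two posterior-score variants and the CC metric mentioned in the abstract, the following Python sketch computes a frame-normalized score, a segment-level score, and a Pearson correlation coefficient. It is not the authors' implementation: the function names, the equal phone priors, the frame-independence assumption used for the segment likelihood, and the toy numbers are all assumptions made here for illustration.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def frame_posterior_score(loglik, target):
    """Average per-frame log posterior of the target phone.

    loglik[t][j] is log p(o_t | phone j); equal priors are assumed,
    so the posterior is the likelihood normalized over all phones.
    """
    per_frame = [frame[target] - logsumexp(frame) for frame in loglik]
    return sum(per_frame) / len(per_frame)


def segment_posterior_score(loglik, target):
    """Duration-normalized log posterior computed on the whole segment.

    Assumption: the segment log-likelihood of each phone is the sum of
    its frame log-likelihoods (a real system would use HMM scores).
    """
    n_phones = len(loglik[0])
    seg = [sum(frame[j] for frame in loglik) for j in range(n_phones)]
    return (seg[target] - logsumexp(seg)) / len(loglik)


def pearson_cc(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy data: 3 frames, 3 candidate phones; phone 0 is the target.
    toy_loglik = [
        [-1.0, -3.0, -4.0],
        [-1.5, -2.5, -5.0],
        [-0.8, -3.5, -4.2],
    ]
    print("frame score:  ", frame_posterior_score(toy_loglik, 0))
    print("segment score:", segment_posterior_score(toy_loglik, 0))

    # Hypothetical machine scores and human ratings for four utterances.
    machine = [-0.9, -0.4, -0.2, -0.1]
    human = [1.0, 2.0, 3.0, 4.0]
    print("CC:", pearson_cc(machine, human))
```

Both scores are log posteriors, so they are at most zero, and values closer to zero indicate a better match to the target phone. The CC here plays the same role as in the abstract: it measures how well a machine score tracks the human ratings.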

Author information

Authors and Affiliations

ThinkIT Laboratory, Institute of Acoustics, Chinese Academy of Sciences, Beijing, China
Fuping Pan, Qingwei Zhao & Yonghong Yan

Editor information

Bruno Apolloni, Robert J. Howlett, Lakhmi Jain

Cite this paper

Pan, F., Zhao, Q., Yan, Y. (2007). New Machine Scores and Their Combinations for Automatic Mandarin Phonetic Pronunciation Quality Assessment. In: Apolloni, B., Howlett, R.J., Jain, L. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2007. Lecture Notes in Computer Science, vol 4692. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74819-9_101

DOI: https://doi.org/10.1007/978-3-540-74819-9_101
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74817-5
Online ISBN: 978-3-540-74819-9
eBook Packages: Computer Science, Computer Science (R0)
