Phonetic Alignment and Similarity

Kondrak, Grzegorz

doi:10.1023/A:1025071200644

Published: August 2003

Volume 37, pages 273–291 (2003)

Computers and the Humanities

Abstract

The computation of the optimal phonetic alignment and the phonetic similarity between words is an important step in many applications in computational phonology, including dialectometry. After discussing several related algorithms, I present a novel approach to the problem that employs a scoring scheme for computing phonetic similarity between phonetic segments on the basis of multivalued articulatory phonetic features. The scheme incorporates the key concept of feature salience, which is necessary to properly balance the importance of various features. The new algorithm combines several techniques developed for sequence comparison: an extended set of edit operations, local and semiglobal modes of alignment, and the capability of retrieving a set of near-optimal alignments. On a set of 82 cognate pairs, it performs better than comparable algorithms reported in the literature.
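
The idea of salience-weighted feature scoring combined with local alignment can be illustrated with a minimal sketch. It is not the article's algorithm: the feature table, the salience weights, the constants MAX_SCORE and SKIP, and the function names are all hypothetical values chosen for illustration, and only plain substitutions and single-segment skips are modelled (no extended edit operations, no semiglobal mode, no near-optimal alignments). The alignment step is a standard Smith-Waterman style local alignment.

```python
from typing import Dict, List, Tuple

# Hypothetical multivalued features on a 0..100 scale (illustrative only,
# not the values used in the article).
FEATURES: Dict[str, Dict[str, int]] = {
    "p": {"place": 100, "manner": 100, "voice": 0},
    "b": {"place": 100, "manner": 100, "voice": 100},
    "t": {"place": 85, "manner": 100, "voice": 0},
    "d": {"place": 85, "manner": 100, "voice": 100},
    "s": {"place": 85, "manner": 80, "voice": 0},
    "a": {"place": 0, "manner": 0, "voice": 100},
}
# Assumed salience weights: how much each feature matters in the score.
SALIENCE: Dict[str, int] = {"place": 40, "manner": 50, "voice": 10}
MAX_SCORE = 35   # assumed score for two identical segments
SKIP = -10       # assumed penalty for leaving a segment unmatched


def similarity(a: str, b: str) -> int:
    """Score two segments: maximum score minus salience-weighted feature distance."""
    fa, fb = FEATURES[a], FEATURES[b]
    penalty = sum(SALIENCE[f] * abs(fa[f] - fb[f]) for f in SALIENCE) // 100
    return MAX_SCORE - penalty


def local_align(x: str, y: str) -> Tuple[int, List[Tuple[str, str]]]:
    """Return the best local alignment score and the aligned segment pairs."""
    n, m = len(x), len(y)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best, bi, bj = 0, 0, 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            H[i][j] = max(
                0,                                              # start a new local alignment
                H[i - 1][j - 1] + similarity(x[i - 1], y[j - 1]),  # match or substitute
                H[i - 1][j] + SKIP,                             # skip a segment of x
                H[i][j - 1] + SKIP,                             # skip a segment of y
            )
            if H[i][j] > best:
                best, bi, bj = H[i][j], i, j
    # Trace back from the best-scoring cell until the score drops to zero.
    pairs: List[Tuple[str, str]] = []
    i, j = bi, bj
    while i > 0 and j > 0 and H[i][j] > 0:
        if H[i][j] == H[i - 1][j - 1] + similarity(x[i - 1], y[j - 1]):
            pairs.append((x[i - 1], y[j - 1]))
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + SKIP:
            pairs.append((x[i - 1], "-"))
            i -= 1
        else:
            pairs.append(("-", y[j - 1]))
            j -= 1
    pairs.reverse()
    return best, pairs


if __name__ == "__main__":
    print(local_align("spat", "bad"))
```

In this toy setting the voicing difference between "p" and "b" costs less than a place or manner difference would, because voice is given the lowest salience. The local mode also lets the unmatched initial "s" of "spat" fall outside the alignment rather than being penalised.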

Author information

Authors and Affiliations

Department of Computing Science, University of Alberta, Edmonton, AB T6G 2E8, Canada
Grzegorz Kondrak

About this article

Cite this article

Kondrak, G. Phonetic Alignment and Similarity. Computers and the Humanities 37, 273–291 (2003). https://doi.org/10.1023/A:1025071200644

Issue Date: August 2003
DOI: https://doi.org/10.1023/A:1025071200644
