Reconstructing Evolution of Natural Languages: Complexity and Parameterized Algorithms

  • Iyad A. Kanj
  • Luay Nakhleh
  • Ge Xia
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4112)


In a recent article, Nakhleh, Ringe and Warnow introduced perfect phylogenetic networks, a model of language evolution in which languages do not evolve via clean speciation, and formulated a set of problems for their accurate reconstruction. Their methodology assumes networks, rather than trees, as the correct model of the evolutionary history of natural languages. They proved that testing whether a network is a perfect phylogenetic network is NP-hard for characters exhibiting at least three states, leaving open the case of binary characters, and gave a straightforward brute-force parameterized algorithm for the problem running in time O(3^k n), where k is the number of bidirectional edges in the network and n is its size. In this paper, we first establish the NP-hardness of the binary case of the problem. We then provide a more efficient parameterized algorithm for this case, running in time O(2^k n^2). The algorithm is simple and relies on structural results and operations developed in this paper that can be useful on their own in the design of heuristic algorithms for the problem. The analysis uses amortized techniques to show that the upper bound on the running time is much tighter than the bound obtained under a conservative worst-case assumption. Our results have a significant impact on reconstructing the evolutionary histories of languages, particularly from phonological and morphological character data, most of which exhibit at most two states (i.e., are binary), as well as on the design and analysis of parameterized algorithms.
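
The abstract does not spell out the brute-force algorithm. One plausible reading, assumed here and not taken from the paper, is that each bidirectional edge {u, v} is either unused, lets v inherit from u, or lets u inherit from v, which gives 3^k candidate trees for one character. The Python sketch below enumerates those candidates on a toy model and tests a binary character on each. All names (`parent`, `bidirectional`, `state`, and the three functions) are hypothetical, the network model is a simplification, and the per-candidate check is not tuned to run in O(n).

```python
from itertools import product


def is_rooted_tree(parent):
    """True if every node reaches the root via parent pointers without a cycle."""
    for start in parent:
        seen = set()
        v = start
        while v is not None:
            if v in seen:
                return False
            seen.add(v)
            v = parent[v]
    return True


def binary_compatible(parent, state):
    """True if the unlabeled nodes can be labeled so that each of the two
    states forms a connected part of the tree. For two states this holds
    exactly when some subtree contains all nodes of one state and none of
    the other (or when only one state occurs)."""
    total = [0, 0]
    for s in state.values():
        total[s] += 1
    if 0 in total:
        return True
    # count[v] = [zeros, ones] among labeled nodes in the subtree rooted at v
    count = {v: [0, 0] for v in parent}
    for node, s in state.items():
        v = node
        while v is not None:
            count[v][s] += 1
            v = parent[v]
    return any(
        (c[0] == total[0] and c[1] == 0) or (c[1] == total[1] and c[0] == 0)
        for c in count.values()
    )


def compatible_with_network(parent, bidirectional, state):
    """Try all 3^k orientations of the k bidirectional edges (assumed model)."""
    for choice in product(range(3), repeat=len(bidirectional)):
        rewired = dict(parent)
        for (u, v), c in zip(bidirectional, choice):
            if c == 1:      # v inherits the character from u
                rewired[v] = u
            elif c == 2:    # u inherits the character from v
                rewired[u] = v
            # c == 0: the edge is not used for this character
        if is_rooted_tree(rewired) and binary_compatible(rewired, state):
            return True
    return False


# Toy data: root r with children a and b, each with two labeled leaves.
parent = {"r": None, "a": "r", "b": "r",
          "a1": "a", "a2": "a", "b1": "b", "b2": "b"}
state = {"a1": 0, "a2": 1, "b1": 0, "b2": 1}
```

On this toy input the character does not fit the underlying tree, so `compatible_with_network(parent, [], state)` is false, while adding the single contact edge `("a1", "b1")` makes it true. The loop runs 3^k times, which is where the exponential factor in a bound of the form O(3^k n) would come from under this reading.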


Keywords: Search Tree, Internal Node, Parameterized Algorithm, Qualitative Character, Connected Subgraph




References

  1. Bodlaender, H., Fellows, M., Warnow, T.: Two strikes against perfect phylogeny. In: Kuich, W. (ed.) ICALP 1992. LNCS, vol. 623, pp. 273–283. Springer, Heidelberg (1992)
  2. Dobson, A.J.: Unrooted trees for numerical taxonomy (Unpublished manuscript)
  3. Dobson, A.J.: Lexicostatistical grouping. Anthropological Linguistics 11, 216–221 (1969)
  4. Downey, R., Fellows, M.: Parameterized Complexity. Springer, New York (1999)
  5. Gleason, H.A.: Counting and calculating for historical reconstruction. Anthropological Linguistics 1, 22–32 (1959)
  6. Gray, R.D., Atkinson, Q.D.: Language-tree divergence times support the Anatolian theory of Indo-European origin. Nature 426(6965), 435–439 (2003)
  7. Mallory, J.P.: In Search of the Indo-Europeans. Thames and Hudson, London (1989)
  8. Nakhleh, L.: Phylogenetic Networks. PhD thesis, The University of Texas at Austin (2004)
  9. Nakhleh, L., Ringe, D., Warnow, T.: Perfect phylogenetic networks: A new methodology for reconstructing the evolutionary history of natural languages. Language (in press, 2005)
  10. Ringe, D.: Some consequences of a new proposal for subgrouping the IE family. In: Bergen, B.K., Plauche, M.C., Bailey, A. (eds.) 24th Annual Meeting of the Berkeley Linguistics Society, Special Session on Indo-European Subgrouping and Internal Relations, pp. 32–46 (1998)
  11. Ringe, D., Warnow, T., Taylor, A.: Indo-European and computational cladistics. Transactions of the Philological Society 100(1), 59–129 (2002)
  12. Ringe, D., Warnow, T., Taylor, A., Michailov, A., Levison, L.: Computational cladistics and the position of Tocharian. In: Mair, V. (ed.) The Bronze Age and early Iron Age peoples of Eastern Central Asia, pp. 391–414 (1998)
  13. Roberts, R.G., Jones, R., Smith, M.A.: Thermoluminescence dating of a 50,000-year-old human occupation site in Northern Australia. Science 345, 153–156 (1990)
  14. Taylor, A., Warnow, T., Ringe, D.: Character-based reconstruction of a linguistic cladogram. In: Smith, J.C., Bentley, D. (eds.) Historical Linguistics 1995, Vol. I: General issues and non-Germanic languages, pp. 393–408. Benjamins, Amsterdam (2000)
  15. Warnow, T.: Mathematical approaches to comparative linguistics. Proc. Natl. Acad. Sci. 94, 6585–6590 (1997)
  16. Warnow, T., Ringe, D., Taylor, A.: Reconstructing the evolutionary history of natural languages. Technical Report 95-16, Institute for Research in Cognitive Science, Univ. of Pennsylvania (1995)
  17. Warnow, T., Ringe, D., Taylor, A.: Reconstructing the evolutionary history of natural languages. In: ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 314–322 (1996)
  18. White, J.P., O’Connell, J.F.: A Prehistory of Australia, New Guinea, and Sahul. Academic Press, New York (1982)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Iyad A. Kanj (1)
  • Luay Nakhleh (2)
  • Ge Xia (3)

  1. School of Computer Science, Telecommunications and Information Systems, DePaul University, Chicago, USA
  2. Department of Computer Science, Rice University, Houston, USA
  3. Department of Computer Science, Lafayette College, Easton, USA
