Agreement Subtree Mapping Kernel for Phylogenetic Trees

  • Issei Hamada
  • Takaharu Shimada
  • Daiki Nakata
  • Kouichi Hirata
  • Tetsuji Kuboyama
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8417)

Abstract

In this paper, we introduce an agreement subtree mapping kernel, which counts all agreement subtree mappings between two trees, and design an algorithm that computes it in quadratic time for phylogenetic trees, that is, unordered leaf-labeled full binary trees. We then apply the kernel to trimmed phylogenetic trees obtained from all positions in the nucleotide sequences of influenza A (H1N1) viruses, and use it to distinguish pandemic viruses from non-pandemic viruses and viruses in one region from those in other regions. In contrast, for general leaf-labeled trees, we show that the problem of counting all agreement subtree mappings is #P-complete.
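
As a rough illustration of the kind of quantity such a kernel counts, the following Python sketch enumerates leaf subsets on which two small trees agree. It rests on several assumptions that are not taken from the paper: leaf labels are distinct within each tree, trees are written as nested pairs of strings, and "agreement" means that the two trees restricted to the same leaf subset are isomorphic as unordered trees. The abstract does not give the exact definition of an agreement subtree mapping, and this brute-force enumeration is exponential in the number of leaves, so it is not the quadratic-time algorithm described above. The function names are hypothetical.

```python
from itertools import combinations


def leaves(tree):
    """Return the set of leaf labels of a tree given as nested pairs of strings."""
    if isinstance(tree, str):
        return {tree}
    left, right = tree
    return leaves(left) | leaves(right)


def restrict(tree, keep):
    """Canonical string of `tree` restricted to the labels in `keep`.

    Returns None when no leaf of `tree` is in `keep`. Internal nodes left
    with a single child are contracted, and children are sorted so that
    the result does not depend on child order (unordered trees).
    """
    if isinstance(tree, str):
        return tree if tree in keep else None
    parts = [r for r in (restrict(child, keep) for child in tree) if r is not None]
    if not parts:
        return None
    if len(parts) == 1:
        return parts[0]
    return "(" + ",".join(sorted(parts)) + ")"


def count_agreeing_leaf_sets(t1, t2):
    """Count non-empty subsets of shared labels on which t1 and t2 agree.

    Assumption (not from the paper): labels are distinct within each tree,
    so comparing canonical strings decides isomorphism of the restrictions.
    """
    common = sorted(leaves(t1) & leaves(t2))
    count = 0
    for size in range(1, len(common) + 1):
        for subset in combinations(common, size):
            keep = set(subset)
            if restrict(t1, keep) == restrict(t2, keep):
                count += 1
    return count


if __name__ == "__main__":
    t1 = (("a", "b"), "c")
    t2 = (("a", "c"), "b")
    print(count_agreeing_leaf_sets(t1, t2))  # 6: every subset except {a, b, c}
    print(count_agreeing_leaf_sets(t1, t1))  # 7: all non-empty subsets
```

The count can serve as a similarity score between two trees: identical trees agree on every leaf subset, while trees with conflicting topologies agree on fewer.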

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Issei Hamada (1)
  • Takaharu Shimada (1, 4)
  • Daiki Nakata (1)
  • Kouichi Hirata (2)
  • Tetsuji Kuboyama (3)

  1. Graduate School of Computer Science and Systems Engineering, Kyushu Institute of Technology, Iizuka, Japan
  2. Department of Artificial Intelligence, Kyushu Institute of Technology, Iizuka, Japan
  3. Computer Center, Gakushuin University, Toshima, Japan
  4. Mazda Motor Corporation, Hiroshima, Japan