Analysis and Comparison of Genomes of HIV-1 and HIV-2 Using Apriori Algorithm, Decision Tree, and Support Vector Machine

  • Yihyun Roh
  • Seokhyun Yoon
  • Min Young Lee
  • Seongpil Jang
  • Taeseon Yoon
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9771)


AIDS is caused by HIV, which is divided into two types: HIV-1 and HIV-2. HIV-1 is distributed around the world and is the major cause of global infections, whereas HIV-2 is less infectious and less transmissible and is therefore largely confined to West Africa. This study aims to account for that difference by analyzing the genome sequences of HIV-1 and HIV-2 with three methods: the Apriori algorithm, decision trees, and support vector machines (SVM). The Apriori analysis shows that lysine, arginine, and serine are the characteristic amino acids of HIV-1, whereas glycine, lysine, leucine, and arginine are characteristic of HIV-2. The decision tree identifies the amino acid positions that best distinguish the two viruses: position 5 in the window of 9, position 13 in the window of 13, and position 16 in the window of 19. The SVM results indicate that the two viruses, although seemingly similar, are in fact different. Taken together, the results provide a biologically verifiable basis for developing effective HIV vaccines, especially against HIV-2.
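
As a rough illustration of the kind of analysis summarized above, the following minimal sketch cuts an amino acid sequence into fixed-width windows and runs an Apriori-style frequent-itemset search over them. It is not the authors' pipeline: the function names `sliding_windows` and `frequent_itemsets`, the support threshold, and the toy sequence are all assumptions made for illustration, and the sequence is not real HIV data.

```python
from collections import Counter
from itertools import combinations


def sliding_windows(sequence, width):
    """Return every contiguous window of `width` residues in `sequence`."""
    return [sequence[i:i + width] for i in range(len(sequence) - width + 1)]


def frequent_itemsets(transactions, min_support, max_size=2):
    """Apriori-style search over sets of amino acids.

    Each transaction is treated as the set of residues it contains.
    A k-itemset becomes a candidate only if all of its (k-1)-subsets
    were frequent at the previous level (the Apriori pruning step).
    Returns a dict mapping frozenset -> support (fraction of transactions).
    """
    sets = [frozenset(t) for t in transactions]
    n = len(sets)

    # Level 1: support of each single residue.
    counts = Counter(item for s in sets for item in s)
    frequent = {
        frozenset([item]): count / n
        for item, count in counts.items()
        if count / n >= min_support
    }
    result = dict(frequent)

    k = 2
    while frequent and k <= max_size:
        items = sorted({item for itemset in frequent for item in itemset})
        candidates = [
            frozenset(combo)
            for combo in combinations(items, k)
            if all(frozenset(sub) in frequent for sub in combinations(combo, k - 1))
        ]
        frequent = {}
        for candidate in candidates:
            support = sum(1 for s in sets if candidate <= s) / n
            if support >= min_support:
                frequent[candidate] = support
        result.update(frequent)
        k += 1
    return result


if __name__ == "__main__":
    toy_sequence = "KRSKRSGL"  # hypothetical toy input, not real viral data
    windows = sliding_windows(toy_sequence, 3)
    for itemset, support in frequent_itemsets(windows, min_support=0.5).items():
        print(sorted(itemset), round(support, 3))
```

In a comparison like the one described, such a search would be run separately on windows drawn from each virus and the resulting frequent residues compared. The window widths 9, 13, and 19 come from the abstract, while the width of 3 used here is only to keep the toy example small.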


HIV-1; HIV-2; Amino acids; Bioinformatics; Data mining; Apriori algorithm; Decision tree; Support vector machine (SVM)



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Yihyun Roh (1)
  • Seokhyun Yoon (1)
  • Min Young Lee (1)
  • Seongpil Jang (2)
  • Taeseon Yoon (2)

  1. Hankuk Academy of Foreign Studies, Yongin, Republic of Korea
  2. Department of Science, Hankuk Academy of Foreign Studies, Yongin, Republic of Korea
