A Hybrid Grid and Its Application to Orthologous Groups Clustering
Orthologous groups are useful in genome annotation, studies of gene evolution, and comparative genomics. However, their construction is difficult to automate and becomes increasingly time-consuming as the number of genome sequences grows. It is also hard to guarantee the accuracy of automatically constructed orthologous groups. We propose an automatic orthologous group construction system for a large number of genomes. A hybrid grid computer system consisting of 40 PCs has been devised for fast construction of orthologous groups from a large number of genome sequences. The grid system constructs orthologous groups for 89 complete prokaryotic genomes in just one week, a task that takes 8 months on a single computer system. The system is also easily extended to incorporate new genomes into the existing orthologous groups. In an experiment on orthologous group construction, more than 85% of the constructed orthologous groups coincide with those of KO (KEGG Orthology) and COGs (Clusters of Orthologous Groups of proteins). Note that KO and COGs have been constructed manually or semi-automatically, at the expense of extensibility to newly completed genomes.
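As a rough check on the timing claim, the speedup and parallel efficiency implied by the reported figures can be estimated as below. This is a back-of-the-envelope sketch, not a measurement from the paper: it assumes an average month of 30.44 days and that "a week" means 7 days, and the function name `speedup_and_efficiency` is chosen here purely for illustration.

```python
def speedup_and_efficiency(serial_days, parallel_days, workers):
    """Return (speedup, efficiency) for a job run serially and on `workers` machines."""
    speedup = serial_days / parallel_days   # how many times faster the grid run is
    efficiency = speedup / workers          # fraction of ideal linear speedup
    return speedup, efficiency


# Figures from the abstract; the days-per-month conversion is an assumption.
DAYS_PER_MONTH = 30.44
serial_days = 8 * DAYS_PER_MONTH   # about 8 months on a single computer
parallel_days = 7                  # about one week on the grid
workers = 40                       # PCs in the hybrid grid

speedup, efficiency = speedup_and_efficiency(serial_days, parallel_days, workers)
print(f"speedup ~ {speedup:.1f}x, efficiency ~ {efficiency:.0%}")
```

Under these assumptions the reported times correspond to a speedup of roughly 35x on 40 machines. Because both durations are stated only approximately, the result should be read as an order-of-magnitude indication rather than a benchmark.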
Keywords: Grid Computing; Orthologous Group; Master Node; Orthologous Cluster; Propose Cluster Algorithm