Grid-Based and Outlier Detection-Based Data Clustering and Classification

  • Kyu Cheol Cho
  • Jong Sik Lee
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 150)

Abstract

Grid computing has attracted attention as a way to solve the complex problems of large-scale bioinformatics applications, and it helps improve data accuracy and processing speed across multiple computation platforms. Outlier detection helps keep the classification success rate high and reduces processing time. This paper focuses on data clustering and classification with outlier detection, an important bioinformatics application in a grid environment. It proposes grid-based and outlier detection-based data clustering and classification (GODDCC), which uses grid computational resources with geographically distributed bioinformatics data sets. GODDCC can run large-scale bioinformatics applications while guaranteeing high bio-data accuracy with reasonable grid resources. The paper evaluates the performance of GODDCC against data clustering and classification (DCC) without outlier detection. The GODDCC model records the lowest average processing time and the highest resource utilization among the DCC models compared. The outlier detection method reduces processing time for DCC models while maintaining a high classification success rate, and grid computing shows great promise for high-performance processing of geographically distributed, large-scale bio-data sets in bioinformatics applications.

Keywords

Grid Computing, Outlier Detection, Data Cluster, Grid Resource, High Level Architecture
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Kyu Cheol Cho
    • 1
  • Jong Sik Lee
    • 1
  1. School of Computer Science and Engineering, Inha University, Incheon, South Korea