Hybrid Biclustering Algorithms for Data Mining

  • Patryk Orzechowski
  • Krzysztof Boryczko
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9597)

Abstract

Hybrid methods are a branch of biclustering algorithms that arise from combining selected aspects of pre-existing approaches. The syncretic nature of their construction enriches the existing methods, providing them with new properties. In this paper, the concept of hybrid biclustering algorithms is explained. A representative hybrid biclustering algorithm, inspired by neural networks and associative artificial intelligence, is introduced, and the results of its application to microarray data are presented. Finally, the scope and application potential of hybrid biclustering algorithms are discussed.

Keywords

Data mining, Biclustering techniques, Gene expression data, Microarray analysis

Notes

Acknowledgements

This research was funded by the Polish National Science Center (NCN), grant No. 2013/11/N/ST6/03204. This research was supported in part by PL-Grid Infrastructure.

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Department of Automatics and Bioengineering, AGH University of Science and Technology, Cracow, Poland
