Using Entropy Cluster-Based Clustering for Finding Potential Protein Complexes

Le, Viet-Hoang; Kim, Sung-Ryul

doi:10.1007/978-3-319-16483-0_51

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 9043)

Abstract

Protein complexes, formed by proteins that interact with each other to perform specific biological functions, play a significant role in biology, and many researchers have therefore studied them. A few years ago, E. C. Kenley and Y. R. Cho introduced algorithms that use graph entropy to cluster protein-protein interaction networks [2,3].

In our study, we extend their work to find potential protein complexes while overcoming weaknesses of their algorithms to make the results more reliable. We first clean the dataset and build a graph from the protein-protein interactions. We then determine locally optimal clusters by growing an initial cluster formed from two selected seeds while keeping the cluster's entropy minimal. A cluster is complete when its entropy cannot be decreased any further. Finally, overlapping clusters are refined to improve their quality and compared against a curated dataset of protein complexes. The results show that the quality of the clusters generated by our algorithm, measured by the average cluster size with respect to F1-score, is high and that the running time is reduced.
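The cluster-growing step can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the graph entropy definition commonly attributed to Kenley and Cho: for each vertex, take the fraction of its neighbors that lie inside the cluster, compute the binary entropy of that fraction, and sum over all vertices. The function names (`build_graph`, `graph_entropy`, `grow_cluster`), the alphabetical visiting order, and the toy edge list are hypothetical, and seed selection, dataset cleaning, and overlap refinement are omitted.

```python
import math
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def vertex_entropy(graph, cluster, v):
    """Binary entropy of the share of v's neighbors inside the cluster."""
    neighbors = graph[v]
    if not neighbors:
        return 0.0
    p_in = len(neighbors & cluster) / len(neighbors)
    if p_in in (0.0, 1.0):
        return 0.0  # all links inside or all outside: no uncertainty
    p_out = 1.0 - p_in
    return -p_in * math.log2(p_in) - p_out * math.log2(p_out)


def graph_entropy(graph, cluster):
    """Sum of vertex entropies over every vertex in the graph."""
    return sum(vertex_entropy(graph, cluster, v) for v in graph)


def grow_cluster(graph, seed_a, seed_b):
    """Greedily add neighboring vertices while the graph entropy decreases."""
    cluster = {seed_a, seed_b}
    best = graph_entropy(graph, cluster)
    improved = True
    while improved:
        improved = False
        # Candidates are vertices adjacent to the current cluster.
        frontier = set().union(*(graph[v] for v in cluster)) - cluster
        for v in sorted(frontier):  # fixed order, an assumption for determinism
            candidate = graph_entropy(graph, cluster | {v})
            if candidate < best:
                cluster.add(v)
                best = candidate
                improved = True
    return cluster, best


# Toy network: two triangles joined by the single edge c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"), ("d", "e"), ("e", "f"), ("d", "f")]
graph = build_graph(edges)
cluster, entropy = grow_cluster(graph, "a", "b")
print(sorted(cluster), round(entropy, 3))
```

On this toy network the cluster grown from seeds `a` and `b` stops at the first triangle, because adding `d` would raise the entropy again. This matches the stopping rule in the abstract: a cluster is complete once its entropy cannot be decreased further.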

References

Kenley, E.C., Cho, Y.-R.: Entropy-Based Graph Clustering: Application to Biological and Social Networks. In: 2011 IEEE 11th International Conference on Data Mining (ICDM), pp. 1550–4786 (December 2011), doi:10.1109/ICDM.2011.64
Chiam, T.C., Cho, Y.-R.: Accuracy improvement in protein complex prediction from protein interaction networks by refining cluster overlaps. Proteome Sci. 10(Suppl 1), S3 (2012), doi:10.1186/1477-5956-10-S1-S3
IntAct Curated Yeast, Protein-protein Interaction Datasets, ftp://ftp.ebi.ac.uk/pub/databases/intact/current/psi25/species/yeast.zip
IntAct Curated Yeast, Complexes Datasets, ftp://ftp.ebi.ac.uk/pub/databases/intact/complex/current/psi25
Razick, S., Magklaras, G., Donaldson, I.M.: iRefIndex: A consolidated protein interaction database with provenance. BMC Bioinformatics 9, 405 (2008), doi:10.1186/1471-2105-9-405
Simonyi, G.: Graph Entropy – A Survey, http://www.renyi.hu/~simonyi/grams.pdf
Van Dongen, S.: A new clustering algorithm for graphs, National Research Institute for Mathematics and Computer Science in the Netherlands, Tech. Rep. INS-R0010 (2000)
Bader, G.D., Hogue, C.W.: An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics 4, 2 (2003)
Clauset, A., Newman, M.E.J., Moore, C.: Finding community structure in very large networks. Physical Review E 70, 66111 (2004)

Author information

Authors and Affiliations

AIS Lab, Department of Internet Multimedia, Konkuk University, South Korea
Viet-Hoang Le & Sung-Ryul Kim

Editor information

Editors and Affiliations

Dpto. de Arquitectura y Tecnología de Computadores (ATC), E.T.S. de Ingenierías en Informática y Telecomunicación, CITIC-UGR, Universidad de Granada, c/ Periodista Daniel Saucedo Aranda s/n, 18071, Granada, Spain
Francisco Ortuño
E.T.S. Ingenierías Informática y de Telecomunicación, Dpto. Arquitectura y Tecnología de Computadores, CITIC-UGR, Universidad de Granada, C Periodista Rafael Gómez Montero, 18071, Granada, Spain
Ignacio Rojas

About this paper

Cite this paper

Le, VH., Kim, SR. (2015). Using Entropy Cluster-Based Clustering for Finding Potential Protein Complexes. In: Ortuño, F., Rojas, I. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2015. Lecture Notes in Computer Science, vol 9043. Springer, Cham. https://doi.org/10.1007/978-3-319-16483-0_51

DOI: https://doi.org/10.1007/978-3-319-16483-0_51
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16482-3
Online ISBN: 978-3-319-16483-0
eBook Packages: Computer Science, Computer Science (R0)
