Using Entropy Cluster-Based Clustering for Finding Potential Protein Complexes

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 9043)

Abstract

Protein complexes, formed by proteins that interact with each other to perform specific biological functions, play a significant role in biology and have therefore been studied by many researchers. A few years ago, E. C. Kenley and Y. R. Cho introduced algorithms that use graph entropy to cluster protein-protein interaction networks [2,3].

In our study, we extend that work to find potential protein complexes while overcoming weaknesses of the existing algorithms, making the results more reliable. We first clean the dataset and build a graph from the protein-protein interactions. We then determine locally optimal clusters by growing an initial cluster formed from two selected seeds while minimizing the cluster's entropy; a cluster is complete when its entropy can no longer be decreased. Finally, overlapping clusters are refined to improve their quality and compared with a curated dataset of protein complexes. The results show that the clusters generated by our algorithm are of high quality, as measured by average cluster size and F1-score, and that the running time is improved.
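The cluster-growing step can be illustrated with a minimal sketch. It assumes the graph-entropy definition commonly used in entropy-based graph clustering: for each vertex, the fraction of its neighbors inside the cluster gives a binary entropy, and the entropy of the graph with respect to the cluster is the sum over all vertices. The function names, the greedy visiting order, and the way seeds are supplied are illustrative assumptions, not the authors' implementation; seed selection, dataset cleaning, and overlap refinement are not shown.

```python
import math
from collections import defaultdict


def build_graph(edges):
    """Build undirected adjacency sets from (u, v) pairs, skipping self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def vertex_entropy(adj, cluster, v):
    """Binary entropy of the share of v's neighbors that lie inside the cluster."""
    neighbors = adj[v]
    if not neighbors:
        return 0.0
    p_in = len(neighbors & cluster) / len(neighbors)
    if p_in in (0.0, 1.0):
        return 0.0  # all neighbors on one side: no uncertainty
    p_out = 1.0 - p_in
    return -p_in * math.log2(p_in) - p_out * math.log2(p_out)


def graph_entropy(adj, cluster):
    """Sum of vertex entropies over every vertex in the graph."""
    return sum(vertex_entropy(adj, cluster, v) for v in adj)


def grow_cluster(adj, seed_a, seed_b, eps=1e-12):
    """Greedily add boundary vertices while the graph entropy keeps decreasing.

    Assumption: the two seeds are chosen by the caller; candidates are
    visited in sorted order only to make the sketch deterministic.
    """
    cluster = {seed_a, seed_b}
    best = graph_entropy(adj, cluster)
    improved = True
    while improved:
        improved = False
        boundary = set().union(*(adj[v] for v in cluster)) - cluster
        for cand in sorted(boundary):
            trial = graph_entropy(adj, cluster | {cand})
            if trial < best - eps:
                cluster.add(cand)
                best = trial
                improved = True
    return cluster, best
```

On a toy graph of two 4-cliques joined by a single edge, starting from two seeds in one clique, this sketch stops once the whole clique is covered, because adding the vertex across the bridge would raise the entropy again. The sketch recomputes the full entropy for every candidate for clarity; a practical version would update only the affected vertices.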

References

  1. Kenley, E.C., Cho, Y.-R.: Entropy-Based Graph Clustering: Application to Biological and Social Networks. In: 2011 IEEE 11th International Conference on Data Mining (ICDM), pp. 1550–4786 (December 2011), doi:10.1109/ICDM.2011.64

  2. Chiam, T.C., Cho, Y.-R.: Accuracy improvement in protein complex prediction from protein interaction networks by refining cluster overlaps. Proteome Sci. 10(Suppl 1), S3 (2012), doi:10.1186/1477-5956-10-S1-S3

  3. IntAct Curated Yeast, Protein-protein Interaction Datasets, ftp://ftp.ebi.ac.uk/pub/databases/intact/current/psi25/species/yeast.zip

  4. IntAct Curated Yeast, Complexes Datasets, ftp://ftp.ebi.ac.uk/pub/databases/intact/complex/current/psi25

  5. Razick, S., Magklaras, G., Donaldson, I.M.: iRefIndex: A consolidated protein interaction database with provenance. BMC Bioinformatics 9, 405 (2008), doi:10.1186/1471-2105-9-405.

  6. Simonyi, G.: Graph Entropy – A Survey, http://www.renyi.hu/~simonyi/grams.pdf

  7. Van Dongen, S.: A new clustering algorithm for graphs, National Research Institute for Mathematics and Computer Science in the Netherlands, Tech. Rep. INS-R0010 (2000)

  8. Bader, G.D., Hogue, C.W.: An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics 4, 2 (2003)

  9. Clauset, A., Newman, M.E.J., Moore, C.: Finding community structure in very large networks. Physical Review E 70, 66111 (2004)

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Le, VH., Kim, SR. (2015). Using Entropy Cluster-Based Clustering for Finding Potential Protein Complexes. In: Ortuño, F., Rojas, I. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2015. Lecture Notes in Computer Science, vol 9043. Springer, Cham. https://doi.org/10.1007/978-3-319-16483-0_51

  • DOI: https://doi.org/10.1007/978-3-319-16483-0_51

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16482-3

  • Online ISBN: 978-3-319-16483-0

  • eBook Packages: Computer Science, Computer Science (R0)
