Data Mining Via Entropy and Graph Clustering

Okafor, Anthony; Pardalos, Panos; Ragle, Michelle

doi:10.1007/978-0-387-69319-4_7

Part of the book series: Springer Optimization and Its Applications (SOIA, volume 7)

Abstract

Data analysis often requires the unsupervised partitioning of a data set into clusters. Clustering is an important but difficult problem. In the absence of prior knowledge about the shape of the clusters, similarity measures for a clustering technique are hard to specify. In this work, we propose a framework that learns from the structure of the data. Learning is accomplished by applying the K-means algorithm to the data multiple times with varying initial centers, via entropy minimization. The result is an expected number of clusters and a new similarity measure matrix that gives the proportion of occurrence between each pair of patterns. Using the expected number of clusters, the final clustering of the data is obtained by clustering a sparse graph of this matrix.
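The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: the entropy-minimization step that estimates the number of clusters is not reproduced, so k is simply fixed by hand, and the sparse graph is built by thresholding the co-occurrence proportions at 0.5 and taking connected components, which is an assumption made here for illustration. The function names kmeans_labels, cooccurrence, and graph_clusters are hypothetical.

```python
import random


def kmeans_labels(points, k, rng, iters=20):
    """Plain Lloyd's K-means; returns one cluster label per point."""
    centers = rng.sample(points, k)  # initial centers vary from run to run
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels


def cooccurrence(points, k, runs=30, seed=0):
    """Proportion of runs in which each pair of points shares a cluster."""
    rng = random.Random(seed)
    n = len(points)
    counts = [[0] * n for _ in range(n)]
    for _ in range(runs):
        labels = kmeans_labels(points, k, rng)
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / runs for c in row] for row in counts]


def graph_clusters(sim, threshold=0.5):
    """Connected components of the graph that keeps edges with
    sim > threshold (an assumed stand-in for the graph clustering step)."""
    n = len(sim)
    cluster = [-1] * n
    next_id = 0
    for start in range(n):
        if cluster[start] != -1:
            continue
        cluster[start] = next_id
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if cluster[j] == -1 and sim[i][j] > threshold:
                    cluster[j] = next_id
                    stack.append(j)
        next_id += 1
    return cluster


# Toy data (made up for this example): two well-separated groups.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
sim = cooccurrence(points, k=2)
final = graph_clusters(sim)
print(final)
```

Each entry sim[i][j] is the fraction of K-means runs in which patterns i and j were assigned to the same cluster, which is one reading of the "proportion of occurrence between each pair of patterns". Dropping low-proportion edges is what makes the graph sparse in this sketch.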

Author information

Authors and Affiliations

Department of Industrial and Systems Engineering, University of Florida, Gainesville, FL, 32611
Anthony Okafor, Panos Pardalos & Michelle Ragle

Editor information

Editors and Affiliations

University of Florida, Gainesville, FL
Panos M. Pardalos
Florida State University, Tallahassee, FL
Vladimir L. Boginski
Dash Optimization, Englewood Cliffs, NJ
Alkis Vazacopoulos

About this chapter

Cite this chapter

Okafor, A., Pardalos, P., Ragle, M. (2007). Data Mining Via Entropy and Graph Clustering. In: Pardalos, P.M., Boginski, V.L., Vazacopoulos, A. (eds) Data Mining in Biomedicine. Springer Optimization and Its Applications, vol 7. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-69319-4_7

DOI: https://doi.org/10.1007/978-0-387-69319-4_7
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-69318-7
Online ISBN: 978-0-387-69319-4
eBook Packages: Biomedical and Life Sciences, Biomedical and Life Sciences (R0)
