An Algorithm for High-Dimensional Traffic Data Clustering

Zheng, Pengjun; McDonald, Mike

doi:10.1007/11881599_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4223)

Included in the following conference series:

International Conference on Fuzzy Systems and Knowledge Discovery

Abstract

High-dimensional fuzzy clustering may converge to a local optimum that is significantly inferior to the globally optimal partition. In this paper, a two-stage fuzzy clustering method is proposed. In the first stage, clustering is applied to compact data obtained from the full-dimensional data by dimensionality reduction. The optimal partition identified from the compact data is then used as the initial partition for the second-stage clustering on the full-dimensional data, which effectively reduces the chance of converging to a local optimum. It is found that the proposed two-stage clustering method can generally avoid local optima without additional computational overhead. The proposed method has been applied to identify optimal day groups for traffic profiling using operational traffic data. The identified day groups are found to be intuitively reasonable and meaningful.
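
The two-stage idea in the abstract can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the abstract does not name the dimensionality reduction technique or the clustering algorithm, so the sketch assumes a truncated SVD projection and plain fuzzy c-means with fuzzifier `m = 2`. The function names `fcm`, `two_stage_fcm` and `make_synthetic_days`, and the synthetic 96-point daily profiles, are hypothetical and chosen only for the example.

```python
import numpy as np

def _sq_dists(X, centers):
    # Squared Euclidean distance from every point to every centre, shape (n, c).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.maximum(d2, 1e-12)  # guard against division by zero

def fcm(X, c, U0=None, m=2.0, tol=1e-5, max_iter=300, seed=0):
    """Plain fuzzy c-means. Returns (memberships, centres, objective)."""
    rng = np.random.default_rng(seed)
    if U0 is None:
        # Random initial partition; each row sums to 1.
        U = rng.random((X.shape[0], c))
        U /= U.sum(axis=1, keepdims=True)
    else:
        U = np.array(U0, dtype=float)
    for _ in range(max_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        inv = _sq_dists(X, centers) ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        shift = np.abs(U_new - U).max()
        U = U_new
        if shift < tol:
            break
    W = U ** m
    centers = (W.T @ X) / W.sum(axis=0)[:, None]
    objective = float((W * _sq_dists(X, centers)).sum())
    return U, centers, objective

def two_stage_fcm(X, c, k, m=2.0, seed=0):
    """Stage 1: cluster a k-dimensional projection of X.
    Stage 2: cluster the full data, starting from the stage-1 partition."""
    Xc = X - X.mean(axis=0)
    # ASSUMPTION: truncated SVD as the dimensionality reduction step.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:k].T
    U_low, _, _ = fcm(Z, c, m=m, seed=seed)
    return fcm(X, c, U0=U_low, m=m)

def make_synthetic_days(n_per_group=20, seed=1):
    # Hypothetical data: two daily flow shapes sampled at 96 time points.
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 24.0, 96)
    commuter = np.exp(-((t - 8.0) ** 2) / 2.0) + np.exp(-((t - 17.5) ** 2) / 2.0)
    leisure = np.exp(-((t - 13.0) ** 2) / 8.0)
    groups = [base + 0.05 * rng.standard_normal((n_per_group, 96))
              for base in (commuter, leisure)]
    return np.vstack(groups)
```

Calling `two_stage_fcm(make_synthetic_days(), c=2, k=3)` returns the full-dimensional membership matrix, the cluster centres and the final objective value. Because the membership matrix has one row per day regardless of the number of features, the stage-1 partition can be passed to stage 2 unchanged, which is what makes the hand-off between the two stages straightforward.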

Author information

Authors and Affiliations

Transportation Research Group, University of Southampton, Highfield, Southampton, SO17 1BJ, United Kingdom
Pengjun Zheng & Mike McDonald

Editor information

Editors and Affiliations

School of Electrical and Electronic Engineering, Nanyang Technological University, Block S1, Nanyang Avenue, 639798, Singapore
Lipo Wang
Life Science Research Center, School of Electronic Engineering, Xidian University, 710071, Xi’an, Shaanxi, China
Licheng Jiao
School of Electrical and Electronic Engineering, Xidian University, 710071, Xi’an, China
Guanming Shi
School of Information Technology and Electrical Engineering, The University of Queensland, 4072, Brisbane, Queensland, Australia
Xue Li
College of Mathematics and Information Science, Hebei Normal University, 050016, Shijiazhuang, Hebei, P.R. China
Jing Liu

About this paper

Cite this paper

Zheng, P., McDonald, M. (2006). An Algorithm for High-Dimensional Traffic Data Clustering. In: Wang, L., Jiao, L., Shi, G., Li, X., Liu, J. (eds) Fuzzy Systems and Knowledge Discovery. FSKD 2006. Lecture Notes in Computer Science, vol 4223. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11881599_8

DOI: https://doi.org/10.1007/11881599_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45916-3
Online ISBN: 978-3-540-45917-0
eBook Packages: Computer Science, Computer Science (R0)
