An Effective Algorithm for Subdimensional Clustering of High Dimensional Data

The KIPS Transactions:PartD, Vol. 10, No. 3, pp. 417-426, Jun. 2003
DOI: 10.3745/KIPSTD.2003.10.3.417


Finding clusters in high dimensional data is a well-known and important problem in data mining, because cluster analysis is widely used in numerous applications, including pattern recognition, data analysis, and market analysis. Recently, a new framework, projected clustering, was suggested to solve the problem: it first selects the subdimensions of each candidate cluster, and then assigns each input point to the nearest cluster according to a distance function based on the chosen subdimensions of the clusters. We propose a new algorithm for subdimensional clustering of high dimensional data that consists of three major steps: it partitions the input points into several candidate clusters with proper numbers of points, filters out the clusters that cannot be useful in the next steps, and then merges the remaining clusters into the predefined number of clusters using a closeness function. The results of extensive experiments show that the proposed algorithm exhibits better performance than other existing clustering algorithms.
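
The assignment step of the projected clustering framework described above can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the distance function, so the average absolute difference over the selected dimensions is used here as an assumption, and the names `subspace_distance` and `assign_points`, as well as the sample data, are hypothetical.

```python
from typing import List, Sequence, Tuple

# A cluster is modeled as (centroid, indices of its selected subdimensions).
# This representation is an assumption made for illustration only.
Cluster = Tuple[Sequence[float], Sequence[int]]


def subspace_distance(point: Sequence[float],
                      centroid: Sequence[float],
                      dims: Sequence[int]) -> float:
    """Average absolute difference, computed over the selected dimensions only."""
    return sum(abs(point[d] - centroid[d]) for d in dims) / len(dims)


def assign_points(points: Sequence[Sequence[float]],
                  clusters: Sequence[Cluster]) -> List[int]:
    """Return, for each point, the index of its nearest cluster."""
    labels = []
    for p in points:
        distances = [subspace_distance(p, centroid, dims)
                     for centroid, dims in clusters]
        labels.append(distances.index(min(distances)))
    return labels


# Two hypothetical clusters in 3-dimensional space, each with its own subdimensions.
clusters = [
    ((0.0, 0.0, 50.0), (0, 1)),    # compact in dimensions 0 and 1
    ((10.0, 99.0, 10.0), (0, 2)),  # compact in dimensions 0 and 2
]
points = [(0.5, -0.5, 900.0), (9.0, -300.0, 11.0)]
labels = assign_points(points, clusters)
print(labels)
```

Each point is far from its cluster's centroid in the dimension that the cluster ignores, yet it is still assigned to that cluster because only the chosen subdimensions enter the distance.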

Cite this article
[IEEE Style]
J. S. Park and D. H. Kim, "An Effective Algorithm for Subdimensional Clustering of High Dimensional Data," The KIPS Transactions:PartD, vol. 10, no. 3, pp. 417-426, 2003. DOI: 10.3745/KIPSTD.2003.10.3.417.

[ACM Style]
Jong Soo Park and Do Hyung Kim. 2003. An Effective Algorithm for Subdimensional Clustering of High Dimensional Data. The KIPS Transactions:PartD, 10, 3, (2003), 417-426. DOI: 10.3745/KIPSTD.2003.10.3.417.