A Hybrid Clustering Technique for Processing Large Data


The KIPS Transactions:PartB, Vol. 10, No. 1, pp. 33-40, Feb. 2003
DOI: 10.3745/KIPSTB.2003.10.1.33

Abstract

Data mining plays an important role in the knowledge discovery process, and various data mining algorithms can be selected for a specific purpose. Most traditional hierarchical clustering methods are suited to small data sets, and they have difficulty handling large data sets because of limited resources and insufficient efficiency. In this study we propose a hybrid neural network clustering technique, called PPC (Pre-Post Clustering), that can be applied to large data sets to find unknown patterns. PPC combines an artificial intelligence method, the self-organizing map (SOM), with a statistical method, hierarchical clustering, and clusters data in two stages. In the pre-clustering stage, PPC condenses a large data set using SOM. In the post-clustering stage, PPC measures similarity values from cohesive distances, which reflect the internal features of a cluster, and adjacent distances, which reflect the external distances between clusters. Finally, PPC clusters the large data set using these similarity values. Experiments with data from the UCI repository showed that PPC achieved better cohesion values than the other clustering techniques.
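
The two-stage idea can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no formulas, so the 1-D SOM, the function names (train_som, assign, post_cluster), the definitions of cohesive and adjacent distance, and the way they are summed into one score are all assumptions made for illustration only.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def train_som(data, n_units=6, epochs=30, lr0=0.5, seed=0):
    """Pre-clustering: fit a small 1-D SOM and return its prototype vectors."""
    rng = random.Random(seed)
    units = [list(rng.choice(data)) for _ in range(n_units)]
    radius0 = n_units / 2.0
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs           # decays linearly toward 0
        lr = lr0 * frac
        radius = max(radius0 * frac, 0.5)
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            bmu = min(range(n_units), key=lambda i: dist(units[i], x))
            for i in range(n_units):
                # Gaussian neighborhood on the 1-D grid around the best unit
                h = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
                for d in range(len(x)):
                    units[i][d] += lr * h * (x[d] - units[i][d])
    return units

def assign(data, units):
    """Group the points by their nearest SOM unit; empty units are dropped."""
    groups = {}
    for x in data:
        bmu = min(range(len(units)), key=lambda i: dist(units[i], x))
        groups.setdefault(bmu, []).append(x)
    return list(groups.values())

def centroid(points):
    return [sum(c) / len(points) for c in zip(*points)]

def cohesion(points):
    """Assumed cohesive distance: mean distance of members to their centroid."""
    c = centroid(points)
    return sum(dist(p, c) for p in points) / len(points)

def post_cluster(groups, k):
    """Post-clustering: merge the SOM groups agglomeratively until k remain."""
    clusters = [list(g) for g in groups]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Assumed adjacent distance: gap between the two centroids
                adjacent = dist(centroid(clusters[i]), centroid(clusters[j]))
                cohesive = cohesion(clusters[i] + clusters[j])
                score = adjacent + cohesive    # assumed: lower = more similar
                if best is None or score < best[0]:
                    best = (score, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Synthetic demo data (not the UCI data used in the paper): two 2-D blobs.
rng = random.Random(1)
blob_a = [(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(50)]
blob_b = [(rng.gauss(5, 0.3), rng.gauss(5, 0.3)) for _ in range(50)]
data = blob_a + blob_b

units = train_som(data)
groups = assign(data, units)
clusters = post_cluster(groups, k=2)
print([len(c) for c in clusters])
```

The point of the design is that the quadratic pairwise comparison in the post-clustering step runs over a handful of SOM groups rather than over every data point, which is what makes the hierarchical stage affordable on large inputs.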


Cite this article
[IEEE Style]
M. S. Kim and S. Y. Lee, "A Hybrid Clustering Technique for Processing Large Data," The KIPS Transactions:PartB, vol. 10, no. 1, pp. 33-40, 2003. DOI: 10.3745/KIPSTB.2003.10.1.33.

[ACM Style]
Man Sun Kim and Sang Yong Lee. 2003. A Hybrid Clustering Technique for Processing Large Data. The KIPS Transactions:PartB, 10, 1, (2003), 33-40. DOI: 10.3745/KIPSTB.2003.10.1.33.