XML Document Clustering Based on Sequential Pattern


The KIPS Transactions:PartD, Vol. 10, No. 7, pp. 1093-1102, Dec. 2003
DOI: 10.3745/KIPSTD.2003.10.7.1093

Abstract

As internet use grows, the amount of information is increasing rapidly, and XML, a standard for web data, offers flexible data representation. Web-based electronic document systems such as EDMS (Electronic Document Management System) and ebXML (Electronic Business using eXtensible Markup Language) have therefore been adopting XML as their standard format for documents and document exchange. This creates a need for methods that can manage and search structured XML documents effectively. In this paper we propose a clustering method based on structural similarity among large numbers of XML documents; in a pre-clustering step, it uses sequential pattern mining to extract typical structures from each document. The proposed algorithm improves clustering accuracy by computing a cost that takes into account both cluster cohesion and inter-cluster similarity.
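The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, under several assumptions that are not taken from the paper: each document is reduced to its root-to-leaf tag paths, a "typical structure" is any ordered tag subsequence found in at least `min_support` paths of that document, and document similarity is the Jaccard index of those pattern sets. All function names, parameters, and sample documents are hypothetical, and the paper's actual cost function is not reproduced; the sketch only computes the two quantities the abstract says the cost considers.

```python
import xml.etree.ElementTree as ET
from itertools import combinations


def root_to_leaf_paths(xml_text):
    """Return every root-to-leaf path of element tags as a tuple."""
    paths = []

    def walk(node, prefix):
        path = prefix + (node.tag,)
        children = list(node)
        if not children:
            paths.append(path)
        for child in children:
            walk(child, path)

    walk(ET.fromstring(xml_text), ())
    return paths


def is_subsequence(pattern, sequence):
    """True if pattern occurs in sequence in order (gaps allowed)."""
    it = iter(sequence)
    return all(tag in it for tag in pattern)


def typical_structures(paths, min_support=2, max_len=3):
    """Assumed stand-in for sequential pattern mining: keep ordered tag
    subsequences (length 2..max_len) found in >= min_support paths."""
    candidates = set()
    for path in paths:
        for n in range(2, max_len + 1):
            candidates.update(combinations(path, n))
    return {
        c for c in candidates
        if sum(is_subsequence(c, p) for p in paths) >= min_support
    }


def jaccard(a, b):
    """Assumed similarity measure between two pattern sets."""
    return len(a & b) / len(a | b) if (a or b) else 0.0


def cohesion(cluster):
    """Mean pairwise similarity inside one cluster (1.0 for a singleton)."""
    pairs = list(combinations(cluster, 2))
    if not pairs:
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def inter_cluster_similarity(c1, c2):
    """Mean similarity over all cross-cluster document pairs."""
    return sum(jaccard(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))


# Hypothetical sample documents, for illustration only.
BOOK_A = "<book><title/><author><name/></author><author><name/></author></book>"
BOOK_B = "<book><author><name/></author><author><name/></author><year/></book>"
ORDER = "<order><item><sku/></item><item><sku/></item></order>"

books = [typical_structures(root_to_leaf_paths(d)) for d in (BOOK_A, BOOK_B)]
orders = [typical_structures(root_to_leaf_paths(ORDER))]

print(cohesion(books))                          # high for a good cluster
print(inter_cluster_similarity(books, orders))  # low for well-separated clusters
```

In this sketch a good clustering is one where cohesion is high and inter-cluster similarity is low; how the paper combines the two into a single cost is not stated in the abstract.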




Cite this article
[IEEE Style]
H. J. Hui and L. G. Ho, "XML Document Clustering Based on Sequential Pattern," The KIPS Transactions:PartD, vol. 10, no. 7, pp. 1093-1102, 2003. DOI: 10.3745/KIPSTD.2003.10.7.1093.

[ACM Style]
Hwang Jeong Hui and Lyu Geun Ho. 2003. XML Document Clustering Based on Sequential Pattern. The KIPS Transactions:PartD, 10, 7, (2003), 1093-1102. DOI: 10.3745/KIPSTD.2003.10.7.1093.