Structure-based Clustering for XML Document Retrieval


KIPS Transactions on Software and Data Engineering, Vol. 11, No. 7, pp. 1357-1366, Jul. 2004
DOI: 10.3745/KIPSTD.2004.11.7.1357

Abstract

As XML becomes increasingly important for managing information and exchanging data efficiently on the web, work on structural integration and retrieval of XML documents is ongoing. For an XML document with a defined structure, the structure can be retrieved through its DTD or XML Schema, but existing methods cannot be applied to XML documents that lack structure information. In this paper, we therefore propose a new clustering technique, as basic research, that enables fast structure retrieval for XML documents without structure information. We first extract frequent structures as features from each XML document. We then cluster documents with similar structures, treating the frequent structure as the representative structure of each XML document; this makes retrieval faster than processing the whole collection of documents with differing structures. We also perform structure retrieval on XML documents using the resulting clusters, each of which is a group of similarly structured documents. Finally, we demonstrate the efficiency of the proposed method by describing how it is applied to structure retrieval and by presenting an example of the application results.
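The pipeline outlined in the abstract (extract frequent structures, cluster by structural similarity, then retrieve through the clusters) can be illustrated with a minimal Python sketch. The abstract does not define "frequent structure", the similarity measure, or the clustering algorithm, so everything specific here is an assumption made for illustration: a structure is taken to be a root-to-element tag path, a path is "frequent" if it occurs at least `min_count` times in a document, similarity is Jaccard overlap, and clustering is a greedy single pass with a `threshold`. The function names and sample documents are hypothetical and are not taken from the paper.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def tag_paths(xml_text):
    """Count root-to-element tag paths, e.g. 'lib/book/title'."""
    counts = Counter()

    def walk(node, prefix):
        path = f"{prefix}/{node.tag}" if prefix else node.tag
        counts[path] += 1
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return counts


def frequent_structure(xml_text, min_count=2):
    """Assumed definition: paths occurring at least min_count times."""
    counts = tag_paths(xml_text)
    frequent = {p for p, c in counts.items() if c >= min_count}
    # Fall back to all paths so no document has an empty feature set.
    return frequent or set(counts)


def jaccard(a, b):
    """Jaccard similarity of two path sets (assumed similarity measure)."""
    return len(a & b) / len(a | b) if a | b else 1.0


def cluster(docs, threshold=0.5, min_count=2):
    """Greedy single-pass clustering on frequent-structure similarity."""
    clusters = []  # each cluster: {"paths": set, "members": [doc ids]}
    for doc_id, xml_text in docs.items():
        feats = frequent_structure(xml_text, min_count)
        best = max(clusters, key=lambda c: jaccard(c["paths"], feats),
                   default=None)
        if best is not None and jaccard(best["paths"], feats) >= threshold:
            best["members"].append(doc_id)
            best["paths"] |= feats  # cluster keeps the union of member paths
        else:
            clusters.append({"paths": set(feats), "members": [doc_id]})
    return clusters


def retrieve(clusters, query_path):
    """Return candidate documents from clusters containing query_path."""
    hits = []
    for c in clusters:
        if query_path in c["paths"]:
            hits.extend(c["members"])
    return hits


# Hypothetical sample documents, none of which carries a DTD or schema.
docs = {
    "a": "<lib><book><title/><author/></book>"
         "<book><title/><author/></book></lib>",
    "b": "<lib><book><title/><author/></book>"
         "<book><title/><author/><year/></book></lib>",
    "c": "<shop><item><price/></item><item><price/></item></shop>",
}
clusters = cluster(docs)
print([c["members"] for c in clusters])
print(retrieve(clusters, "shop/item/price"))
```

In this sketch a structural query is checked against one path set per cluster rather than against every document, which mirrors the speed-up the abstract claims. Because each cluster stores the union of its members' paths, the result is a set of candidate documents that would still need to be verified individually.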


Cite this article
[IEEE Style]
J. H. Hwang and K. H. Ryu, "Structure-based Clustering for XML Document Retrieval," KIPS Journal D (2001 ~ 2012), vol. 11, no. 7, pp. 1357-1366, 2004. DOI: 10.3745/KIPSTD.2004.11.7.1357.

[ACM Style]
Jeong Hee Hwang and Keun Ho Ryu. 2004. Structure-based Clustering for XML Document Retrieval. KIPS Journal D (2001 ~ 2012), 11, 7, (2004), 1357-1366. DOI: 10.3745/KIPSTD.2004.11.7.1357.