Document Classification of Small Size Documents Using Extended Relief-F Algorithm


The KIPS Transactions:PartB, Vol. 16, No. 3, pp. 233-238, Jun. 2009
DOI: 10.3745/KIPSTB.2009.16.3.233

Abstract

This paper presents an approach to classifying small documents using the instance-based feature filtering algorithm Relief-F. Document classification often performs poorly on small documents that contain only a few features. The total number of features in the document set is large, but the number of features in each document is relatively very small, so the similarities between documents are very low under general similarity measures and classifiers. Performance is especially poor when classifying web documents in a directory service and when classifying sectors that cannot be linked back to their original file after hard-disk recovery. We therefore propose the Extended Relief-F (ERelief-F) algorithm, which builds on the instance-based feature filtering algorithm Relief-F and addresses its problems, as a preprocessing step for classification. For the performance comparison, we tested information gain, odds ratio, and Relief-F for feature filtering and for computing feature values, and used kNN and SVM classifiers. In the experiments, the Extended Relief-F (ERelief-F) algorithm performed best among these methods on all of the datasets and removed many irrelevant features from the document sets.
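For readers unfamiliar with the baseline, the following is a minimal sketch of plain Relief-F feature weighting, the algorithm the abstract names as its starting point. It is not the ERelief-F extension, which the abstract does not specify. The function name `relieff`, its parameters, the dense numeric feature vectors, and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def relieff(X, y, k=2, n_samples=None, seed=0):
    """Baseline Relief-F weights for dense numeric features (illustrative only)."""
    n, d = len(X), len(X[0])

    # Per-feature value range, used to scale differences into [0, 1].
    spans = []
    for a in range(d):
        col = [row[a] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(a, i, j):
        return abs(X[i][a] - X[j][a]) / spans[a]

    def dist(i, j):
        # Manhattan distance over the scaled features.
        return sum(diff(a, i, j) for a in range(d))

    prior = {c: cnt / n for c, cnt in Counter(y).items()}
    by_class = defaultdict(list)
    for i, c in enumerate(y):
        by_class[c].append(i)

    # Use every instance unless a smaller sample size is requested.
    rng = random.Random(seed)
    if n_samples is None or n_samples >= n:
        picks = list(range(n))
    else:
        picks = rng.sample(range(n), n_samples)

    w = [0.0] * d
    for i in picks:
        for c, members in by_class.items():
            # k nearest neighbours of instance i within class c.
            near = sorted((j for j in members if j != i),
                          key=lambda j: dist(i, j))[:k]
            if not near:
                continue
            if c == y[i]:
                scale = -1.0  # nearest hits: penalise features that differ
            else:
                # nearest misses: reward features that differ, weighted by prior
                scale = prior[c] / (1.0 - prior[y[i]])
            for a in range(d):
                total = sum(diff(a, i, j) for j in near)
                w[a] += scale * total / (len(picks) * len(near))
    return w


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[1, 0], [1, 1], [1, 0], [0, 1], [0, 0], [0, 1]]
y = ["a", "a", "a", "b", "b", "b"]
weights = relieff(X, y, k=2)
```

On this toy set the class-separating first feature receives the larger weight. A filter would then keep only the features whose weight exceeds some threshold before passing the documents to a classifier such as kNN or SVM.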


Cite this article
[IEEE Style]
H. Park, "Document Classification of Small Size Documents Using Extended Relief-F Algorithm," The KIPS Transactions:PartB, vol. 16, no. 3, pp. 233-238, 2009. DOI: 10.3745/KIPSTB.2009.16.3.233.

[ACM Style]
Heum Park. 2009. Document Classification of Small Size Documents Using Extended Relief-F Algorithm. The KIPS Transactions:PartB, 16, 3, (2009), 233-238. DOI: 10.3745/KIPSTB.2009.16.3.233.