Named Entity Recognition for Patent Documents Based on Conditional Random Fields


KIPS Transactions on Software and Data Engineering, Vol. 5, No. 9, pp. 419-424, Sep. 2016
DOI: 10.3745/KTSDE.2016.5.9.419
Keywords: Conditional Random Fields, Named Entity Recognition, Patent Corpus, Kappa Coefficient, 10-Fold Cross Validation
Abstract

Named entity recognition in claims and patent descriptions is needed to improve retrieval accuracy for patent documents and similar patents. In this paper, we propose an automatic named entity recognition method for patents that uses conditional random fields, one of the best-performing methods in machine learning research. The named entity recognition system was built from a tagged training corpus of 660,000 words, and 70,000 words were used as a test set for evaluation. The experiment shows an accuracy of 93.6% and a Kappa coefficient of 0.67 between manual tagging and the automatic tagging system. This figure is better than the Kappa coefficient of 0.6 for manually tagged results, which shows that the automatic named entity tagging system can serve as a practical replacement for manual tagging of patent documents.
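The two figures reported in the abstract, accuracy and the Kappa coefficient, can be illustrated with a small sketch. It assumes the Kappa coefficient is Cohen's kappa computed over token-level tags, which the abstract does not state explicitly. The function names, the tag labels `O` and `TECH`, and the ten-token example are hypothetical and are not taken from the paper's corpus.

```python
from collections import Counter


def accuracy(manual, automatic):
    """Fraction of tokens on which the two taggings assign the same label."""
    if len(manual) != len(automatic):
        raise ValueError("tag sequences must have the same length")
    if not manual:
        raise ValueError("tag sequences must not be empty")
    matches = sum(m == a for m, a in zip(manual, automatic))
    return matches / len(manual)


def cohen_kappa(manual, automatic):
    """Cohen's kappa: observed agreement corrected for chance agreement.

    Assumption: the paper's "Kappa coefficient" is Cohen's kappa over
    token-level tags; the abstract does not confirm this.
    """
    n = len(manual)
    p_observed = accuracy(manual, automatic)
    manual_counts = Counter(manual)
    auto_counts = Counter(automatic)
    # Chance agreement: probability that both taggings pick the same label
    # if each chose labels independently at its own observed label rates.
    p_chance = sum(
        manual_counts[label] * auto_counts[label] for label in manual_counts
    ) / (n * n)
    if p_chance == 1.0:
        # Both taggings use one identical label everywhere.
        return 1.0
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical ten-token example: "TECH" marks a named entity, "O" anything else.
manual_tags = ["O", "O", "TECH", "O", "TECH", "O", "O", "O", "TECH", "O"]
auto_tags = ["O", "O", "TECH", "O", "O", "O", "O", "O", "TECH", "TECH"]

print(accuracy(manual_tags, auto_tags))
print(cohen_kappa(manual_tags, auto_tags))
```

In this toy example the two taggings agree on 8 of 10 tokens, yet kappa comes out near 0.52, because most of that agreement is on the frequent `O` label and would be expected by chance. The same effect explains how a tagger can reach high accuracy while its Kappa coefficient stays considerably lower.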


Cite this article
[IEEE Style]
L. T. Seok, S. S. Mi, and K. S. Shik, "Named Entity Recognition for Patent Documents Based on Conditional Random Fields," KIPS Transactions on Software and Data Engineering, vol. 5, no. 9, pp. 419-424, 2016. DOI: 10.3745/KTSDE.2016.5.9.419.

[ACM Style]
Lee Tae Seok, Shin Su Mi, and Kang Seung Shik. 2016. Named Entity Recognition for Patent Documents Based on Conditional Random Fields. KIPS Transactions on Software and Data Engineering, 5, 9, (2016), 419-424. DOI: 10.3745/KTSDE.2016.5.9.419.