Staff Publications

Automatic keyword extraction from documents using conditional random fields

Chengzhi ZHANG, Institute of Scientific and Technical Information of China, China; Department of Information Management, Nanjing University of Science and Technology, China
Huilin WANG, Institute of Scientific and Technical Information of China, China
Yao LIU, Institute of Scientific and Technical Information of China, China
Dan WU, Institute of Scientific and Technical Information of China, China; Department of Information Management, Peking University, China
Yi LIAO, Department of Management, Lingnan University
Bo WANG, Department of Computer Science and Technology, Peking University, China

Document Type

Journal article

Source Publication

Journal of Computational Information Systems

Publication Date

3-1-2008

Volume

4

Issue

3

First Page

1169

Last Page

1180

Keywords

Automatic indexing, Conditional random fields, Keywords extraction, Machine learning

Abstract

Keywords are a subset of the words or phrases in a document that describe its meaning, and many text mining applications can take advantage of them. Unfortunately, a large portion of documents still have no keywords assigned, and manual assignment of high-quality keywords is expensive, time-consuming, and error-prone. Many algorithms and systems have therefore been proposed to help people extract keywords automatically. The Conditional Random Fields (CRF) model is a state-of-the-art sequence labeling method that can exploit the features of documents more fully and effectively. At the same time, keyword extraction can be treated as a string labeling task. In this paper, keyword extraction based on CRF is proposed and implemented. As far as we know, the use of the CRF model for keyword extraction has not been investigated previously. Experimental results show that the CRF model outperforms other machine learning methods, such as support vector machines and multiple linear regression, on the keyword extraction task.
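
The abstract's framing of keyword extraction as a labeling problem can be illustrated with a minimal sketch. The B-KW/I-KW/O tag set, the function names, and the features below are assumptions chosen for illustration, not the label scheme or feature set used in the paper. The sketch only prepares per-token labels and features of the kind a CRF toolkit could be trained on; it does not train a model.

```python
from typing import Dict, List, Sequence


def label_tokens(tokens: Sequence[str], keywords: Sequence[Sequence[str]]) -> List[str]:
    """Tag each token B-KW (starts a keyword), I-KW (continues one), or O."""
    labels = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    # Match longer keywords first so a phrase wins over a word it contains.
    for kw in sorted(keywords, key=len, reverse=True):
        kw_l = [w.lower() for w in kw]
        n = len(kw_l)
        if n == 0:
            continue
        for i in range(len(tokens) - n + 1):
            span_free = all(lab == "O" for lab in labels[i:i + n])
            if span_free and lowered[i:i + n] == kw_l:
                labels[i] = "B-KW"
                for j in range(i + 1, i + n):
                    labels[j] = "I-KW"
    return labels


def token_features(tokens: Sequence[str], i: int) -> Dict[str, object]:
    """Hypothetical per-token features (illustrative, not the paper's)."""
    return {
        "word": tokens[i].lower(),
        "position": i / len(tokens),  # relative position in the document
        "prev": tokens[i - 1].lower() if i > 0 else "<s>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
    }


tokens = "Conditional random fields label keywords in text".split()
keywords = [["conditional", "random", "fields"], ["keywords"]]
labels = label_tokens(tokens, keywords)
features = [token_features(tokens, i) for i in range(len(tokens))]
```

Each document then becomes one sequence of (features, label) pairs. A trained sequence model would predict the labels for unseen text, and contiguous B-KW/I-KW runs would be read off as the extracted keywords.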

Print ISSN

15539105

Publisher Statement

Language

English

Recommended Citation

Zhang, C., Wang, H., Liu, Y., Wu, D., Liao, Y., & Wang, B. (2008). Automatic keyword extraction from documents using conditional random fields. Journal of Computational Information Systems, 4(3), 1169-1180.

This document is currently not available here.
