Identification and analysis of Chinese organization and institution names
中文信息学报 = Journal of Chinese Information Processing
机构名称, 专有名词, 短语分析, 自然语言处理, Organization and institution names, Proper nouns, Phrase analysis, Natural language processing
As important proper nouns, Chinese names of organizations and institutions play an indispensable role in communication. However, because they are effectively unlimited in number, are constantly created and discarded, and tend to be long and complex, most of these names are absent from the Chinese dictionaries used by computer systems. Linguistically, these proper nouns can be viewed as a special group of compound nouns and as a simple category of noun phrase, with their own formation rules and physical markers. This paper presents a pioneering discussion of the analysis of Chinese names of organizations and institutions from a computational point of view. Useful linguistic rules have been drawn from the discussion and applied to the identification of names of organizations and institutions in the 6,000,000-character Mainland-Hong Kong-Taiwan corpus of modern Chinese developed by Hong Kong Polytechnic University. Preliminary experiments show that both precision and recall for identifying names of colleges and universities are over 96%.
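
The abstract does not spell out the rules themselves, so the following is only a minimal sketch of how marker-based identification and its evaluation might look. It assumes the text is already segmented into words, that a name ends in a head-word marker such as 大学 or 学院, and that the left boundary is found by scanning backwards until a word that cannot belong to a name is reached. The marker set, the stop-word set, the `max_len` limit, and the function names are all hypothetical illustrations, not the paper's actual rules.

```python
# Hypothetical head-word markers that close a college or university name.
MARKERS = {"大学", "学院"}
# Hypothetical words that cannot be part of a name (function words, punctuation).
STOP = {"在", "的", "和", "他", "去", "，", "。"}


def find_names(tokens, max_len=4):
    """Return candidate names from a segmented sentence.

    A candidate is a marker word plus the words immediately to its left,
    up to max_len words in total, stopping at the first STOP word.
    """
    names = []
    for i, tok in enumerate(tokens):
        if tok not in MARKERS:
            continue
        start = i
        # Extend leftwards while the name stays within max_len words.
        while start > 0 and i - (start - 1) < max_len and tokens[start - 1] not in STOP:
            start -= 1
        if start < i:  # require at least one modifier before the marker
            names.append("".join(tokens[start:i + 1]))
    return names


def precision_recall(predicted, gold):
    """Precision and recall over sets of distinct names."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


tokens = ["他", "在", "香港", "理工", "大学", "读书", "。"]
print(find_names(tokens))  # ['香港理工大学']
```

Precision here is the share of identified names that are correct, and recall is the share of true names that were identified; the figures reported in the abstract would come from comparing system output against a hand-checked list in this way.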