An Approach for Unknown Word Processing based on Japanese Phonological Rules

Tokiya Baba, Kazuma Kusu, Kenji Hatano
Journal / proceedings title: Proceedings of the 20th International Conference on Information Integration and Web-based Applications & Services
Venue (prefecture): Yogyakarta
Country (English): Indonesia
Language: English
Publisher: ACM
Pages: 350-354
Publication year: 2018
Publication month: 11
Publication date: 2018-11-21
DOI: 10.1145/3282373.3282853

Abstract

A morphological analyzer is one of the essential tools for natural language processing of Asian languages. One source of errors in morphological analysis is unknown words, that is, words not contained in the dictionaries of the morphological analyzer. Consequently, many researchers construct rules for detecting morphemes in unknown words, based on knowledge obtained from error analyses. However, the rules constructed in previous research might be insufficient in both quality and quantity, because they were usually found through computational or manual visual inspection. For this reason, we propose a method for constructing rules based on phonology, which is concerned with the systematic organization of sounds in languages. Phonology covers linguistic analysis at every level of language where sound is considered to be structured for conveying linguistic meaning. We deal with well-established rules in phonology such as vowel coalescence, haplology, prosodic shortening, prosodic lengthening, and the generation patterns of onomatopoeia. We evaluate whether, with our method, a morphological analyzer accurately detects known words that have been converted into unknown words.
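
To make the evaluation setup in the abstract more concrete, the following is a minimal sketch of how known words might be converted into unknown variants by phonological rules. It is not the paper's rule set or implementation: the function names, the romanized input, the two simplified rules (word-final vowel coalescence and lengthening of the second-to-last vowel), and the toy dictionary are all assumptions made for illustration. Haplology, prosodic shortening, and onomatopoeia patterns are omitted.

```python
# Illustrative sketch only: simplified, hypothetical rules over romanized
# Japanese words. These are NOT the rules defined in the paper.

VOWELS = "aiueo"

# Assumed coalescence patterns for word-final vowel sequences,
# e.g. colloquial "takai" -> "takee", "sugoi" -> "sugee".
COALESCENCE = {"ai": "ee", "oi": "ee"}


def vowel_coalescence(word: str) -> str:
    """Merge a word-final vowel sequence into a long vowel."""
    for src, dst in COALESCENCE.items():
        if word.endswith(src):
            return word[: -len(src)] + dst
    return word


def prosodic_lengthening(word: str) -> str:
    """Double the second-to-last vowel, e.g. "sugoi" -> "sugooi"."""
    positions = [i for i, ch in enumerate(word) if ch in VOWELS]
    if len(positions) < 2:
        return word  # too short to lengthen under this toy rule
    i = positions[-2]
    return word[: i + 1] + word[i] + word[i + 1:]


def generate_variants(word: str) -> dict:
    """Apply each rule and keep only the variants that differ from the input."""
    rules = {
        "vowel_coalescence": vowel_coalescence,
        "prosodic_lengthening": prosodic_lengthening,
    }
    variants = {}
    for name, rule in rules.items():
        variant = rule(word)
        if variant != word:
            variants[name] = variant
    return variants


# Toy stand-in for an analyzer dictionary (hypothetical contents).
KNOWN_WORDS = {"sugoi", "takai"}

if __name__ == "__main__":
    for known in sorted(KNOWN_WORDS):
        for rule_name, variant in generate_variants(known).items():
            status = "known" if variant in KNOWN_WORDS else "unknown"
            print(f"{known} --{rule_name}--> {variant} ({status})")
```

In this sketch every generated variant falls outside the toy dictionary, which mirrors the situation the abstract describes: a word the analyzer knows becomes an unknown word once a phonological change is applied, and the evaluation asks whether the analyzer can still recover it.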

Citation

Tokiya Baba, Kazuma Kusu, Kenji Hatano, An Approach for Unknown Word Processing based on Japanese Phonological Rules, Proceedings of the 20th International Conference on Information Integration and Web-based Applications & Services, pp.350-354, 2018-11-21, DOI: 10.1145/3282373.3282853.
