Multi-task Learning-based Text Classification with Subword-Phrase Extraction
Abstract
Text classification using deep learning, trained on a tremendous amount of text, has achieved better performance than traditional methods. Alongside this success, multi-task learning (MTL) has become a promising approach for text classification; for instance, one MTL approach employs named entity recognition as an auxiliary task for text classification. Existing MTL-based text classification methods depend on auxiliary tasks that use supervised labels, which require considerable human and/or financial effort to create. To address this, this paper proposes an MTL-based text classification framework that reduces the additional effort of creating supervised labels. The basic idea is to utilize phrasal expressions consisting of subwords (called subword-phrases). To the best of our knowledge, no text classification approach has been built on top of subword-phrases, because subwords do not always express a coherent set of meanings. The novelty of the proposed framework lies in adding subword-phrase recognition as an auxiliary task and in utilizing subword-phrases for text classification. To keep the auxiliary recognition task low-cost, the framework extracts subword-phrases in an unsupervised manner. An experimental evaluation on five popular text classification datasets demonstrates the effectiveness of including subword-phrase recognition as an auxiliary task. It also presents comparative results against the state-of-the-art method.
Citation
Yusuke Kimura, Takahiro Komamizu, Kenji Hatano, Multi-task Learning-based Text Classification with Subword-Phrase Extraction, Proceedings of the 11th International Symposium on Information and Communication Technology, pp. 23-30, 2022-12-01, DOI: 10.1145/3568562.3568635.