CNN-encoded radical-level representation for Japanese processing

Yuanzhi Ke, Masafumi Hagiwara

Research output: Contribution to journal, Article

1 Citation (Scopus)

Abstract

Although word embeddings are powerful, their weakness on rare and unknown words and the problems of large vocabularies have motivated the exploration of alternative representations. While character embeddings have been successful for alphabetic languages, Japanese is also difficult to process at the character level because of the large vocabulary of kanji, which are written in Chinese characters. To achieve fewer parameters and better generalization on infrequent words and characters, we propose a model that encodes Japanese text from a radical-level representation, inspired by experimental findings in psycholinguistics. The proposed model comprises a convolutional local encoder and a recurrent global encoder. For the convolutional encoder, we propose a novel combination of two kinds of convolutional filters with different strides in one layer to extract information at different levels. We compare the proposed radical-level model with state-of-the-art word and character embedding-based models on a sentiment classification task. The proposed model outperformed the state-of-the-art models on randomly sampled texts and on texts that contain unknown characters, with 91% and 12% fewer parameters than the word embedding-based and character embedding-based models, respectively. On the test sets with unknown characters in particular, the results of the proposed model were 4.01% and 2.38% above the word embedding-based and character embedding-based baselines, respectively. The proposed model is powerful at a lower computational and storage cost, and can be used on devices with limited storage and to process texts containing rare characters.
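
The idea of placing filters with two different strides in one convolutional layer can be illustrated with a small sketch. The abstract does not state the filter widths, the strides, or how the two outputs are merged, so everything concrete below is an assumption made for illustration: each character is taken to occupy a fixed number of radical slots (RADICALS_PER_CHAR), the fine filters use stride 1, the coarse filters step one character at a time, and the two groups are merged by max-pooling over positions and concatenating. All names (conv1d, two_stride_layer, the constants) are hypothetical and are not taken from the paper, and the recurrent global encoder is not shown.

```python
import random


def conv1d(seq, filt, stride):
    """Valid 1-D convolution of a single filter over a sequence of vectors.

    seq:  list of T vectors, each of dimension D (e.g. radical embeddings)
    filt: list of W vectors, each of dimension D (one filter of width W)
    Returns one ReLU activation per window position.
    """
    width = len(filt)
    out = []
    for start in range(0, len(seq) - width + 1, stride):
        acc = 0.0
        for k in range(width):
            acc += sum(x * w for x, w in zip(seq[start + k], filt[k]))
        out.append(max(acc, 0.0))  # ReLU
    return out


def two_stride_layer(seq, fine_filters, coarse_filters, coarse_stride):
    """Apply stride-1 filters and stride-`coarse_stride` filters in one layer.

    Assumed merge rule: max-pool each filter's activations over positions,
    then concatenate the pooled values of both filter groups.
    """
    fine = [max(conv1d(seq, f, 1)) for f in fine_filters]
    coarse = [max(conv1d(seq, f, coarse_stride)) for f in coarse_filters]
    return fine + coarse


# Hypothetical sizes: 4 characters, 3 radical slots each, 5-dim embeddings.
RADICALS_PER_CHAR = 3
EMBED_DIM = 5
NUM_CHARS = 4
rng = random.Random(0)


def rand_vec():
    """Random stand-in for a learned embedding or filter row."""
    return [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]


# Flattened radical sequence for the whole text.
sequence = [rand_vec() for _ in range(NUM_CHARS * RADICALS_PER_CHAR)]

# Six fine filters (width 2, stride 1) slide radical by radical.
fine_filters = [[rand_vec() for _ in range(2)] for _ in range(6)]

# Four coarse filters (width = stride = RADICALS_PER_CHAR) see one
# character at a time without straddling character boundaries.
coarse_filters = [
    [rand_vec() for _ in range(RADICALS_PER_CHAR)] for _ in range(4)
]

features = two_stride_layer(
    sequence, fine_filters, coarse_filters, RADICALS_PER_CHAR
)
print(len(features))
```

Under these assumptions the stride-1 filters respond to radical-level patterns, including ones that cross character boundaries, while the filters whose stride equals the number of radical slots respond to character-level patterns. A learned model would replace the random vectors with trained parameters.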

Original language: English
Journal: Transactions of the Japanese Society for Artificial Intelligence
Volume: 33
Issue number: 4
Publication status: Published - 2018 Jan 1

Keywords

  • Convolutional neural networks
  • Deep learning
  • Natural language processing
  • Sub-character language modeling
  • Text classification

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
