Language patterns in Japanese patients with Alzheimer disease: A machine learning approach

Yuki Momota, Kuo ching Liang, Toshiro Horigome, Momoko Kitazawa, Yoko Eguchi, Akihiro Takamiya, Akiko Goto, Masaru Mimura, Taishiro Kishimoto

Research output: Contribution to journal › Article › peer-review


Aim: The authors applied natural language processing and machine learning to explore disease-related language patterns that could support objective measures for assessing language ability in Japanese patients with Alzheimer disease (AD); most previous studies have used large, publicly available data sets in Euro-American languages.

Methods: The authors obtained 276 speech samples from 42 patients with AD and 52 healthy controls, aged 50 years or older. The samples were processed with spaCy, a natural language processing library for Python, together with the add-on library GiNZA, a Japanese parser based on Universal Dependencies, a framework designed to facilitate multilingual parser development. eXtreme Gradient Boosting (XGBoost) was used as the classification algorithm. Each part-of-speech and dependency unit was tagged and counted to create features such as tag frequency and tag-to-tag transition frequency. Each feature's importance was computed in every round of a 100-fold repeated random subsampling validation and then averaged.

Results: The model achieved an accuracy of 0.84 (SD = 0.06) and an area under the curve of 0.90 (SD = 0.03). Of the 10 most important features, seven were related to part-of-speech and the remaining three to dependency. A box plot analysis showed that the appearance rates of content word–related features were lower among the patients, whereas those of stagnation-related features were higher.

Conclusion: The study demonstrated a promising level of accuracy for predicting AD and identified language patterns corresponding to the type of lexical-semantic decline known as ‘empty speech’, which is regarded as a characteristic of AD.
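The feature construction and validation scheme described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the tag sequence for each sample has already been produced by a parser (for example, the `pos_` or `dep_` attribute of spaCy tokens), that counts are normalized to rates, and that each validation round holds out a random 20% of samples; the function names `tag_features` and `random_splits`, the feature-name prefixes, and the 20% fraction are all hypothetical choices made for illustration.

```python
from collections import Counter
import random


def tag_features(tags):
    """Turn one sample's tag sequence into frequency features.

    Assumption: counts are normalized to rates. The abstract does not
    state the exact normalization used in the study.
    """
    features = {}
    if not tags:
        return features
    # Tag frequency: share of tokens carrying each tag.
    for tag, count in Counter(tags).items():
        features[f"freq:{tag}"] = count / len(tags)
    # Tag-to-tag transition frequency: share of adjacent pairs (a, b).
    pairs = list(zip(tags, tags[1:]))
    for (a, b), count in Counter(pairs).items():
        features[f"trans:{a}->{b}"] = count / len(pairs)
    return features


def random_splits(n_samples, n_repeats=100, test_fraction=0.2, seed=0):
    """Yield (train_indices, test_indices) for repeated random subsampling.

    Assumption: test_fraction=0.2 is illustrative only.
    """
    rng = random.Random(seed)
    n_test = max(1, round(n_samples * test_fraction))
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        yield indices[n_test:], indices[:n_test]


# Hypothetical tag sequence for one short utterance.
example = tag_features(["NOUN", "ADP", "VERB", "NOUN"])
```

In a full pipeline, a classifier would be fitted on the training indices of each split, and the per-round feature importances would be averaged over all 100 rounds. Because the study collected several samples per participant, a real implementation would also need to decide whether to split by sample or by participant; the abstract does not say which was used.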

Original language: English
Pages (from-to): 273-281
Number of pages: 9
Journal: Psychiatry and Clinical Neurosciences
Issue number: 5
Publication status: Published - 2023 May


Keywords

  • Alzheimer disease
  • dementia
  • machine learning
  • natural language processing
  • speech-language pathology

ASJC Scopus subject areas

  • Neuroscience(all)
  • Neurology
  • Clinical Neurology
  • Psychiatry and Mental health

