Word vectorization using relations among words for neural network

Hajime Hotta, Masanobu Kittaka, Masafumi Hagiwara

Research output: Contribution to journal › Article

9 Citations (Scopus)

Abstract

In this paper, we propose a new vectorization method for a new generation of computational intelligence, including neural networks and natural language processing. In recent years, various word vectorization techniques have been proposed, many of which rely on prepared dictionaries. However, these techniques do not consider the symbol grounding problem for unknown types of data, which is one of the most fundamental issues in artificial intelligence. To avoid the symbol grounding problem, pattern-processing-based methods such as neural networks are often used in studies on self-directive systems and algorithms, and natural language processing is no exception in benefiting from neural networks. The proposed method converts one input word into one real-valued vector, using an algorithm inspired by neural network architecture. The merits of the method are as follows: (1) it requires no specific knowledge of linguistics, e.g., word classes or grammar; (2) it is a sequence learning technique and can learn additional knowledge. The experiment showed the efficiency of the word vectorization in terms of similarity measurement.

Original language: English
Pages (from-to): 75-82
Number of pages: 8
Journal: IEEJ Transactions on Electronics, Information and Systems
Volume: 130
Issue number: 1
DOI: 10.1541/ieejeiss.130.75
Publication status: Published - 2010

Fingerprint

Neural networks
Electric grounding
Artificial intelligence
Processing
Glossaries
Network architecture
Linguistics
Experiments

Keywords

  • Natural language processing
  • Neural network
  • Self-organizing map
  • Thesaurus
  • Vectorization

ASJC Scopus subject areas

  • Electrical and Electronic Engineering

Cite this

Word vectorization using relations among words for neural network. / Hotta, Hajime; Kittaka, Masanobu; Hagiwara, Masafumi.

In: IEEJ Transactions on Electronics, Information and Systems, Vol. 130, No. 1, 2010, p. 75-82.

@article{1c4989127ee744538cea7e920a11829a,
title = "Word vectorization using relations among words for neural network",
abstract = "In this paper, we propose a new vectorization method for a new generation of computational intelligence, including neural networks and natural language processing. In recent years, various word vectorization techniques have been proposed, many of which rely on prepared dictionaries. However, these techniques do not consider the symbol grounding problem for unknown types of data, which is one of the most fundamental issues in artificial intelligence. To avoid the symbol grounding problem, pattern-processing-based methods such as neural networks are often used in studies on self-directive systems and algorithms, and natural language processing is no exception in benefiting from neural networks. The proposed method converts one input word into one real-valued vector, using an algorithm inspired by neural network architecture. The merits of the method are as follows: (1) it requires no specific knowledge of linguistics, e.g., word classes or grammar; (2) it is a sequence learning technique and can learn additional knowledge. The experiment showed the efficiency of the word vectorization in terms of similarity measurement.",
keywords = "Natural language processing, Neural network, Self-organizing map, Thesaurus, Vectorization",
author = "Hajime Hotta and Masanobu Kittaka and Masafumi Hagiwara",
year = "2010",
doi = "10.1541/ieejeiss.130.75",
language = "English",
volume = "130",
pages = "75--82",
journal = "IEEJ Transactions on Electronics, Information and Systems",
issn = "0385-4221",
publisher = "The Institute of Electrical Engineers of Japan",
number = "1",
}

TY - JOUR

T1 - Word vectorization using relations among words for neural network

AU - Hotta, Hajime

AU - Kittaka, Masanobu

AU - Hagiwara, Masafumi

PY - 2010

Y1 - 2010

AB - In this paper, we propose a new vectorization method for a new generation of computational intelligence, including neural networks and natural language processing. In recent years, various word vectorization techniques have been proposed, many of which rely on prepared dictionaries. However, these techniques do not consider the symbol grounding problem for unknown types of data, which is one of the most fundamental issues in artificial intelligence. To avoid the symbol grounding problem, pattern-processing-based methods such as neural networks are often used in studies on self-directive systems and algorithms, and natural language processing is no exception in benefiting from neural networks. The proposed method converts one input word into one real-valued vector, using an algorithm inspired by neural network architecture. The merits of the method are as follows: (1) it requires no specific knowledge of linguistics, e.g., word classes or grammar; (2) it is a sequence learning technique and can learn additional knowledge. The experiment showed the efficiency of the word vectorization in terms of similarity measurement.

KW - Natural language processing

KW - Neural network

KW - Self-organizing map

KW - Thesaurus

KW - Vectorization

UR - http://www.scopus.com/inward/record.url?scp=77956798167&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77956798167&partnerID=8YFLogxK

U2 - 10.1541/ieejeiss.130.75

DO - 10.1541/ieejeiss.130.75

M3 - Article

VL - 130

SP - 75

EP - 82

JO - IEEJ Transactions on Electronics, Information and Systems

JF - IEEJ Transactions on Electronics, Information and Systems

SN - 0385-4221

IS - 1

ER -