Lip reading system using novel Japanese visemes classification and hierarchical weighted discrimination

Shinsuke Okita, Yasue Mitsukura, Nozomu Hamada

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In recent years, automatic lip reading based on 'visemes' has been studied as a means of realizing human-machine interactive communication systems in many applications. However, many problems remain, such as how many viseme classes to define, how to discriminate between visemes, and how to perform speech recognition based on visemes. In this paper, a novel classification of Japanese visemes and a hierarchical weighted discrimination method for speech recognition are proposed to address these problems. We increased the number of viseme classes from 6 (conventional) to 9 so that words can be represented in more detail by visemes. In addition, because discrimination becomes more difficult as the number of visemes increases, a hierarchical weighted discrimination method is proposed. To compare with the conventional method, the ATR phonetically balanced word group, which has a large vocabulary and includes various visemes, was used in word recognition experiments. From these results, we confirmed that the proposed method worked well.

Original language: English
Title of host publication: ISPACS 2013 - 2013 International Symposium on Intelligent Signal Processing and Communication Systems
Pages: 531-536
Number of pages: 6
DOIs: https://doi.org/10.1109/ISPACS.2013.6704608
Publication status: Published - 2013
Event: 2013 21st International Symposium on Intelligent Signal Processing and Communication Systems, ISPACS 2013 - Naha, Okinawa, Japan
Duration: 2013 Nov 12 to 2013 Nov 15

Other

Other: 2013 21st International Symposium on Intelligent Signal Processing and Communication Systems, ISPACS 2013
Country: Japan
City: Naha, Okinawa
Period: 13/11/12 to 13/11/15

Fingerprint

Speech recognition
Communication systems
Experiments

Keywords

  • Image processing
  • lip reading
  • Pattern recognition
  • visemes mouth-shape code
  • visual speech recognition

ASJC Scopus subject areas

  • Artificial Intelligence
  • Signal Processing

Cite this

Okita, S., Mitsukura, Y., & Hamada, N. (2013). Lip reading system using novel Japanese visemes classification and hierarchical weighted discrimination. In ISPACS 2013 - 2013 International Symposium on Intelligent Signal Processing and Communication Systems (pp. 531-536). [6704608] https://doi.org/10.1109/ISPACS.2013.6704608

@inproceedings{dbc52bb9ac614a229d5174d6de3687b7,
title = "Lip reading system using novel Japanese visemes classification and hierarchical weighted discrimination",
abstract = "In recent years, automatic lip reading based on 'visemes' has been studied as a means of realizing human-machine interactive communication systems in many applications. However, many problems remain, such as how many viseme classes to define, how to discriminate between visemes, and how to perform speech recognition based on visemes. In this paper, a novel classification of Japanese visemes and a hierarchical weighted discrimination method for speech recognition are proposed to address these problems. We increased the number of viseme classes from 6 (conventional) to 9 so that words can be represented in more detail by visemes. In addition, because discrimination becomes more difficult as the number of visemes increases, a hierarchical weighted discrimination method is proposed. To compare with the conventional method, the ATR phonetically balanced word group, which has a large vocabulary and includes various visemes, was used in word recognition experiments. From these results, we confirmed that the proposed method worked well.",
keywords = "Image processing, lip reading, Pattern recognition, visemes mouth-shape code, visual speech recognition",
author = "Shinsuke Okita and Yasue Mitsukura and Nozomu Hamada",
year = "2013",
doi = "10.1109/ISPACS.2013.6704608",
language = "English",
isbn = "9781467363617",
pages = "531--536",
booktitle = "ISPACS 2013 - 2013 International Symposium on Intelligent Signal Processing and Communication Systems",

}

TY - GEN

T1 - Lip reading system using novel Japanese visemes classification and hierarchical weighted discrimination

AU - Okita, Shinsuke

AU - Mitsukura, Yasue

AU - Hamada, Nozomu

PY - 2013

Y1 - 2013

AB - In recent years, automatic lip reading based on 'visemes' has been studied as a means of realizing human-machine interactive communication systems in many applications. However, many problems remain, such as how many viseme classes to define, how to discriminate between visemes, and how to perform speech recognition based on visemes. In this paper, a novel classification of Japanese visemes and a hierarchical weighted discrimination method for speech recognition are proposed to address these problems. We increased the number of viseme classes from 6 (conventional) to 9 so that words can be represented in more detail by visemes. In addition, because discrimination becomes more difficult as the number of visemes increases, a hierarchical weighted discrimination method is proposed. To compare with the conventional method, the ATR phonetically balanced word group, which has a large vocabulary and includes various visemes, was used in word recognition experiments. From these results, we confirmed that the proposed method worked well.

KW - Image processing

KW - lip reading

KW - Pattern recognition

KW - visemes mouth-shape code

KW - visual speech recognition

UR - http://www.scopus.com/inward/record.url?scp=84894158868&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84894158868&partnerID=8YFLogxK

U2 - 10.1109/ISPACS.2013.6704608

DO - 10.1109/ISPACS.2013.6704608

M3 - Conference contribution

AN - SCOPUS:84894158868

SN - 9781467363617

SP - 531

EP - 536

BT - ISPACS 2013 - 2013 International Symposium on Intelligent Signal Processing and Communication Systems

ER -