The estimation of heights and occupied areas of humans from two orthogonal views for fall detection

Dao Huu Hung, Hideo Saito

Research output: Contribution to journal (Article)

7 Citations (Scopus)

Abstract

In this paper, we present a video-based method for detecting fall incidents among elderly people living alone. We propose using measures of a person's height and occupied area to distinguish three typical states: standing, sitting, and lying. Two relatively orthogonal views are used, which simplifies the estimation of the occupied area to the product of the widths of the same person observed in the two cameras. However, features estimated from silhouette sizes vary across the viewing window because of camera perspective. To deal with this, we suggest using Local Empirical Templates (LET), defined as the sizes of standing people in local image patches. LET have two important characteristics: (1) LET in unknown scenes can be extracted easily and automatically, and (2) by their nature, LET hold the perspective information that can be used for feature normalization. The normalization not only cancels the perspective effect but also takes the features of standing people as baselines. We observe that standing people are taller than sitting and lying people, and that they occupy smaller areas than people who are sitting or lying. Thus, the three states fall into three separable regions of the proposed feature space, which is composed of normalized heights and normalized occupied areas. Fall incidents can be inferred from time-series analysis of human state transitions. We test our method on 24 video samples from the Multi-view Fall Dataset (1), obtaining high detection rates and low false alarm rates, which outperform the state-of-the-art methods (2) (3) tested on the same benchmark dataset.
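The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names, the choice of a single height measurement, every threshold value, and the "standing to lying within a short time" fall rule are hypothetical assumptions made only for illustration. What is taken from the abstract is the idea of dividing a person's height and the product of the two view widths by the sizes of a standing person in the same local patch (the LET), and then reading falls off the sequence of states.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    t: float           # timestamp in seconds
    height_px: float   # silhouette height in pixels (assumed taken from one view)
    width_a_px: float  # silhouette width in camera A
    width_b_px: float  # silhouette width in camera B


@dataclass
class Template:
    """Hypothetical LET entry: sizes of a standing person in the local patch."""
    height_px: float
    width_a_px: float
    width_b_px: float


def normalized_features(obs, let):
    """Return (normalized height, normalized occupied area).

    The occupied area is approximated as the product of the two view widths;
    dividing by the standing-person template makes standing people map to
    roughly (1.0, 1.0) regardless of where they are in the image.
    """
    height = obs.height_px / let.height_px
    area = (obs.width_a_px * obs.width_b_px) / (let.width_a_px * let.width_b_px)
    return height, area


def classify(height, area, h_stand=0.85, a_stand=1.5, h_lie=0.5, a_lie=2.5):
    """Map a feature point to a state using illustrative (assumed) thresholds."""
    if height >= h_stand and area <= a_stand:
        return "standing"
    if height <= h_lie and area >= a_lie:
        return "lying"
    return "sitting"


def detect_falls(timed_states, max_transition_s=1.5):
    """Flag a fall when 'lying' begins soon after the last 'standing' sample.

    This rule is an assumption; the abstract only says that falls are inferred
    from time-series analysis of state transitions.
    """
    falls = []
    last_standing_t = None
    previous = None
    for t, state in timed_states:
        if state == "standing":
            last_standing_t = t
        elif state == "lying" and previous != "lying":
            if last_standing_t is not None and t - last_standing_t <= max_transition_s:
                falls.append(t)
        previous = state
    return falls


if __name__ == "__main__":
    let = Template(height_px=200, width_a_px=60, width_b_px=60)
    track = [
        Observation(0.0, 200, 60, 60),
        Observation(0.5, 150, 80, 80),
        Observation(1.0, 60, 150, 140),
    ]
    states = [(o.t, classify(*normalized_features(o, let))) for o in track]
    print(states)
    print(detect_falls(states))
```

In practice the template would be looked up per image patch, since the abstract defines LET locally; the sketch uses a single template to stay short.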

Original language: English
Pages (from-to): 117-127
Number of pages: 11
Journal: IEEJ Transactions on Electronics, Information and Systems
Volume: 133
Issue number: 1
DOI: 10.1541/ieejeiss.133.117
Publication status: Published - 2013

Keywords

  • Time-series analysis of human state transition
  • Fall detection
  • Local empirical templates
  • Normalized height
  • Normalized occupied area
  • Orthogonal views

ASJC Scopus subject areas

  • Electrical and Electronic Engineering

Cite this

@article{f304b7aa89f545dbbe6da09a32dce23d,
title = "The estimation of heights and occupied areas of humans from two orthogonal views for fall detection",
abstract = "In this paper, we present a video-based method for detecting fall incidents among elderly people living alone. We propose using measures of a person's height and occupied area to distinguish three typical states: standing, sitting, and lying. Two relatively orthogonal views are used, which simplifies the estimation of the occupied area to the product of the widths of the same person observed in the two cameras. However, features estimated from silhouette sizes vary across the viewing window because of camera perspective. To deal with this, we suggest using Local Empirical Templates (LET), defined as the sizes of standing people in local image patches. LET have two important characteristics: (1) LET in unknown scenes can be extracted easily and automatically, and (2) by their nature, LET hold the perspective information that can be used for feature normalization. The normalization not only cancels the perspective effect but also takes the features of standing people as baselines. We observe that standing people are taller than sitting and lying people, and that they occupy smaller areas than people who are sitting or lying. Thus, the three states fall into three separable regions of the proposed feature space, which is composed of normalized heights and normalized occupied areas. Fall incidents can be inferred from time-series analysis of human state transitions. We test our method on 24 video samples from the Multi-view Fall Dataset (1), obtaining high detection rates and low false alarm rates, which outperform the state-of-the-art methods (2) (3) tested on the same benchmark dataset.",
keywords = "Time-series analysis of human state transition, Fall detection, Local empirical templates, Normalized height, Normalized occupied area, Orthogonal views",
author = "Hung, {Dao Huu} and Saito, Hideo",
year = "2013",
doi = "10.1541/ieejeiss.133.117",
language = "English",
volume = "133",
pages = "117--127",
journal = "IEEJ Transactions on Electronics, Information and Systems",
issn = "0385-4221",
publisher = "The Institute of Electrical Engineers of Japan",
number = "1",

}

TY - JOUR

T1 - The estimation of heights and occupied areas of humans from two orthogonal views for fall detection

AU - Hung, Dao Huu

AU - Saito, Hideo

PY - 2013

Y1 - 2013

AB - In this paper, we present a video-based method for detecting fall incidents among elderly people living alone. We propose using measures of a person's height and occupied area to distinguish three typical states: standing, sitting, and lying. Two relatively orthogonal views are used, which simplifies the estimation of the occupied area to the product of the widths of the same person observed in the two cameras. However, features estimated from silhouette sizes vary across the viewing window because of camera perspective. To deal with this, we suggest using Local Empirical Templates (LET), defined as the sizes of standing people in local image patches. LET have two important characteristics: (1) LET in unknown scenes can be extracted easily and automatically, and (2) by their nature, LET hold the perspective information that can be used for feature normalization. The normalization not only cancels the perspective effect but also takes the features of standing people as baselines. We observe that standing people are taller than sitting and lying people, and that they occupy smaller areas than people who are sitting or lying. Thus, the three states fall into three separable regions of the proposed feature space, which is composed of normalized heights and normalized occupied areas. Fall incidents can be inferred from time-series analysis of human state transitions. We test our method on 24 video samples from the Multi-view Fall Dataset (1), obtaining high detection rates and low false alarm rates, which outperform the state-of-the-art methods (2) (3) tested on the same benchmark dataset.

KW - Time-series analysis of human state transition

KW - Fall detection

KW - Local empirical templates

KW - Normalized height

KW - Normalized occupied area

KW - Orthogonal views

UR - http://www.scopus.com/inward/record.url?scp=84873831330&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84873831330&partnerID=8YFLogxK

U2 - 10.1541/ieejeiss.133.117

DO - 10.1541/ieejeiss.133.117

M3 - Article

AN - SCOPUS:84873831330

VL - 133

SP - 117

EP - 127

JO - IEEJ Transactions on Electronics, Information and Systems

JF - IEEJ Transactions on Electronics, Information and Systems

SN - 0385-4221

IS - 1

ER -