Comput Vis Image Underst. 2009 Jan;113(1):80-89. doi: 10.1016/j.cviu.2008.07.006.

Linguistic Summarization of Video for Fall Detection Using Voxel Person and Fuzzy Logic

Derek Anderson et al.

Abstract

In this paper, we present a method for recognizing human activity from linguistic summarizations of temporal fuzzy inference curves representing the states of a three-dimensional object called voxel person. A hierarchy of fuzzy logic is used, where the output from each level is summarized and fed into the next level. We present a two-level model for fall detection. The first level infers the state of the person in each frame. The second level operates on linguistic summarizations of voxel person's states and performs inference about activity. The rules used for fall detection were designed under the supervision of nurses to ensure that they reflect the manner in which elders perform these activities. The proposed framework is highly flexible: rules can be modified, added, or removed, allowing per-resident customization based on knowledge of each resident's cognitive and physical abilities.
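As a rough illustration of the first inference level, the sketch below maps a single per-frame feature to the three state memberships using trapezoidal membership functions. The choice of centroid height as the feature, and all breakpoints, are illustrative assumptions, not the paper's actual features or parameters.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership function: rises on [a, b], is 1 on [b, c],
    and falls on [c, d]."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

def state_memberships(centroid_height_m):
    """Level-one inference sketch: map one per-frame feature (here, an
    assumed voxel person centroid height in meters) to memberships in the
    three states. All breakpoints are illustrative, not the paper's values."""
    return {
        "on-the-ground": trapezoid(centroid_height_m, -1.0, -0.5, 0.3, 0.6),
        "in-between":    trapezoid(centroid_height_m, 0.3, 0.6, 0.8, 1.1),
        "upright":       trapezoid(centroid_height_m, 0.8, 1.1, 2.5, 3.0),
    }
```

Evaluating these memberships frame by frame yields temporal curves like those in the figures below, which the second level then summarizes linguistically.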


Figures

Fig. 1
Voxel person construction. Cameras capture the raw video from different viewpoints, silhouette extraction is performed for each camera, voxel sets are calculated from the silhouettes for each camera, and the voxel sets are intersected to calculate voxel person.
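The final intersection step in Fig. 1 can be sketched as plain set intersection over the per-camera voxel sets; representing voxels as integer grid coordinates is an assumption for illustration.

```python
def voxel_person(voxel_sets):
    """Construct voxel person (Fig. 1): intersect the voxel sets obtained
    by back-projecting each camera's silhouette into the common 3D grid.
    Each voxel set is a set of (x, y, z) integer grid coordinates."""
    if not voxel_sets:
        return set()
    person = set(voxel_sets[0])
    for s in voxel_sets[1:]:
        person &= set(s)  # keep only voxels seen consistent with every view
    return person
```

Intersection keeps only voxels that project inside every camera's silhouette, which is what makes the reconstruction a conservative volumetric estimate of the person.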
Fig. 2
Fuzzy inference outputs plotted for a voxel person fall. The x-axis is time, measured in frames, and the y-axis is the fuzzy inference output. The red curve is upright, the blue curve is in-between, and the green curve is on-the-ground. The frame rate was 3 frames per second, so the plot spans approximately 23 seconds of activity.
Fig. 3
Color-coding of voxel person according to the membership output values. Voxel person's color is a mixture of the fuzzy rule system outputs: the upright state determines the amount of red, in-between the amount of green, and on-the-ground the amount of blue.
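The color mixture described in the Fig. 3 caption can be sketched as a direct mapping of the three memberships onto RGB channels; the 8-bit scaling and clamping here are illustrative choices, not the paper's stated implementation.

```python
def voxel_color(upright, in_between, on_ground):
    """Mix the three state memberships into an RGB triple (Fig. 3):
    upright -> red channel, in-between -> green, on-the-ground -> blue.
    Memberships are clamped to [0, 1] and scaled to 8-bit values."""
    clamp = lambda m: min(max(m, 0.0), 1.0)
    return (round(255 * clamp(upright)),
            round(255 * clamp(in_between)),
            round(255 * clamp(on_ground)))
```

A voxel person mid-fall would thus blend from red toward blue, making the inferred state visible directly in the rendering.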
Fig. 4
Activity recognition framework, which utilizes a hierarchy of fuzzy logic based on the voxel person representation. The first level reasons about the state of the individual; linguistic summarizations are then produced, and fuzzy logic is applied again to reason about human activity.
Fig. 5
Detection of a large recent change in voxel person’s speed. (a) Motion vector magnitudes are computed, (b) a fixed size window, placed directly before the start of the summarization, is smoothed with a mean filter, and (c) the maximum of the derivative of the filtered motion vector magnitudes is found in the first and second halves of the window. The feature is the ratio of the two maximum values.
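Steps (a)–(c) in the Fig. 5 caption can be sketched as below; the window length, the 3-tap mean filter, and the edge handling are assumptions for illustration, not the paper's parameters.

```python
def speed_change_feature(magnitudes, window=20):
    """Sketch of the Fig. 5 feature for detecting a large recent change in
    voxel person's speed. `magnitudes` are per-frame motion vector magnitudes.
    (a) take a fixed-size window directly before the summarization start,
    (b) smooth it with a mean filter,
    (c) return the ratio of the maximum derivative magnitude in the second
        half of the window to that in the first half."""
    w = list(magnitudes[-window:])
    # (b) 3-tap mean filter, with the neighborhood shrinking at the edges
    smooth = []
    for i in range(len(w)):
        seg = w[max(0, i - 1):i + 2]
        smooth.append(sum(seg) / len(seg))
    # (c) derivative magnitudes, then the maximum in each half-window
    deriv = [abs(smooth[i + 1] - smooth[i]) for i in range(len(smooth) - 1)]
    half = len(deriv) // 2
    first = max(deriv[:half])
    second = max(deriv[half:])
    return second / first if first > 0 else float("inf")
```

A ratio well above 1 indicates a sudden recent change in speed, as would occur at the onset of a fall.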
Fig. 6
Example images and their corresponding silhouettes from the fall data set. Activities such as lying on the couch and sitting on a chair with the feet up, which could be misinterpreted as falls, are not recognized as falls by our system, an advantage of rule-based reasoning and knowledge about the three-dimensional voxel person.
Fig. 7
Approximately 11 minutes of video analysis, 2,042 frames in total. Four falls occurred and 38 linguistic summarizations were produced. The upright membership is shown in red, in-between in blue, and on-the-ground in green. Dashed vertical purple lines mark the manually annotated moments where a fall occurred.
Fig. 8
Fifty-eight frames (approximately 19 seconds) from a sequence where the person fell and was able to get back up. Red is upright, blue is in-between, and green is on-the-ground.
Fig. 9
Sixty-three frames (approximately 21 seconds) where the person fell and tried to get back up three times. Red is upright, blue is in-between, and green is on-the-ground.
