A Survey on Vision-based Fall Detection
Citations
58 | Activity summarisation and fall detection in a supportive home environment
- Nait-Charif, McKenna
- 2004
Citation Context ...ction. Feng et al. [14] propose a novel vision-based fall detection method for monitoring elderly people in a house care environment. The foreground human silhouette is extracted via background modeling and tracked throughout the video sequence. The human body is represented with ellipse fitting, and the silhouette motion is modeled by an integrated normalized motion energy image computed over a short-term video sequence. Then, the shape deformation quantified from the fitted silhouettes is used as the features to distinguish different postures of the human. Inactivity detection is adopted by [28] to detect falls. In [28] ceiling-mounted, wide-angle cameras with vertically oriented optical axes are used to reduce the influence of occlusion. Nait-Charif et al. [28] use learned models of spatial context, which are used in conjunction with a tracker to achieve these goals. Nater et al. [30] present an approach for unusual event detection based on a tree of trackers. Each tracker is specialized for a specific type of activity. Falls are detected when none of the specialized trackers for “normal” activities can explain the observation. Charfi et al. [11, 10] introduce a spatio-temporal huma... |
43 | Linguistic summarization of video for fall detection using voxel person and fuzzy logic.
- Anderson, Luke, et al.
- 2009
Citation Context ...or a specific type of activity. Falls are detected when none of the specialized trackers for “normal” activities can explain the observation. Charfi et al. [11, 10] introduce a spatio-temporal human fall descriptor, named STHF, that uses several combinations of transformations of geometrical features. The well-known SVM classifier is applied to the STHF descriptor to classify falls and normal activities. 3.2 3D-based Methods Using Multiple RGB Cameras Another category of vision-based methods for fall detection is 3D-based methods using multiple RGB cameras. The calibrated multi-camera systems [6, 4, 5, 1, 2] allow 3D reconstruction of the object but require a careful and time-consuming calibration process. Auvinet et al. [6, 4] use a network of multiple calibrated cameras to reconstruct the 3D shape of the person. Fall events are detected by analyzing the volume distribution along the vertical axis, and an alarm is triggered when the major part of this distribution is abnormally near the floor. In a later work [5], the fall alarm is triggered only when the major part of this distribution remains abnormally near the floor for a predefined period of time. Anderson et al. [1, 2] employ multiple camera... |
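The vertical-distribution rule described in this excerpt can be illustrated with a minimal sketch. It is not the implementation of Auvinet et al.; it assumes the 3D reconstruction is already available as a boolean voxel grid indexed (x, y, z) with z = 0 at the floor. The names `fraction_near_floor` and `fall_alarm` and the parameters `floor_layers`, `majority` and `hold_seconds` are hypothetical and chosen only for illustration.

```python
import numpy as np


def fraction_near_floor(voxels: np.ndarray, floor_layers: int) -> float:
    """Share of the occupied volume lying in the lowest `floor_layers` layers.

    `voxels` is assumed to be a boolean occupancy grid indexed (x, y, z),
    with z = 0 at floor level (an assumption of this sketch).
    """
    per_layer = voxels.sum(axis=(0, 1))  # occupied voxels in each horizontal layer
    total = per_layer.sum()
    if total == 0:
        return 0.0  # nothing reconstructed in this frame
    return float(per_layer[:floor_layers].sum() / total)


def fall_alarm(fractions, fps: float, majority: float = 0.5,
               hold_seconds: float = 5.0) -> bool:
    """True once most of the volume stays near the floor for `hold_seconds`.

    `majority` and `hold_seconds` are illustrative values, not taken from
    the cited papers.
    """
    needed = max(1, int(round(hold_seconds * fps)))
    run = 0
    for f in fractions:
        run = run + 1 if f > majority else 0  # count consecutive "near floor" frames
        if run >= needed:
            return True
    return False


# Toy frames: a standing column (17 layers tall) and a lying slab (2 layers tall).
standing = np.zeros((4, 4, 18), dtype=bool)
standing[1:3, 1:3, 0:17] = True
lying = np.zeros((4, 4, 18), dtype=bool)
lying[:, :, 0:2] = True
```

Requiring the condition to hold for a run of consecutive frames corresponds to the "predefined period of time" mentioned for the later work [5]; a single low frame, such as a person bending down briefly, does not raise the alarm.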
32 | A Customized Human Fall Detection System Using OmniCamera Images and
- Miaou
- 2006
Citation Context ...ultiple actions simultaneously with less intrusion. Vision-based methods can be broadly divided into three categories: fall detection using a single RGB camera, 3D-based methods using multiple cameras, and 3D-based methods using depth cameras. 3.1 Fall Detection Using a Single RGB Camera Fall detection using a single RGB camera has been extensively studied, as such systems are easy to set up and inexpensive. Shape-related features, inactivity detection and human motion analysis are the most commonly used cues for detecting falls. Shape-related features are widely used for fall detection [26, 34, 38, 25, 36, 14, 29]. In [36, 25], fall detection is based on the width-to-height aspect ratio of the person. Mirmahboub et al. [26] use a simple background separation method to create the silhouette of the person, and several features are then extracted from the silhouette area. Finally, an SVM classifier is employed to perform the classification based on these silhouette-related features. Rougier et al. [34] use a shape matching technique to track the silhouette of the person in the target video clip. The shape deformation is then quantified from these silhouettes, and the classification is based on the shape deform... |
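As an illustration of the width-to-height aspect-ratio cue mentioned in this excerpt, the following minimal sketch computes the ratio from a binary silhouette mask. It assumes foreground segmentation has already produced the mask; the function names and the `ratio_threshold` value are hypothetical and are not taken from the cited works [36, 25].

```python
import numpy as np


def bbox_aspect_ratio(mask: np.ndarray) -> float:
    """Width-to-height ratio of the tight bounding box around a silhouette.

    `mask` is assumed to be a 2D boolean array (rows = image height).
    """
    rows = np.flatnonzero(mask.any(axis=1))  # image rows containing foreground
    cols = np.flatnonzero(mask.any(axis=0))  # image columns containing foreground
    if rows.size == 0:
        raise ValueError("empty silhouette")
    height = rows[-1] - rows[0] + 1
    width = cols[-1] - cols[0] + 1
    return float(width / height)


def looks_fallen(mask: np.ndarray, ratio_threshold: float = 1.0) -> bool:
    """Flag a frame whose silhouette is wider than it is tall.

    `ratio_threshold` is an illustrative value only.
    """
    return bbox_aspect_ratio(mask) > ratio_threshold


# Toy silhouettes: an upright figure (8 px tall, 2 px wide) and the same shape lying down.
standing = np.zeros((10, 10), dtype=bool)
standing[1:9, 4:6] = True
lying = standing.T
```

A single-frame ratio of this kind cannot by itself separate a fall from deliberately lying down, which is one reason the surveyed methods combine shape features with inactivity detection or motion analysis.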
27 | A multi-camera vision system for fall detection and alarm generation.
- Cucchiara, Prati, et al.
- 2007
Citation Context ...defined period of time. Anderson et al. [1, 2] employ multiple cameras and a hierarchy of fuzzy logic to detect falls. Overall, using multiple cameras offers the advantage of allowing 3D reconstruction and extraction of 3D features for fall detection. The proposed method introduces the voxel person, which is a linguistic summarization of temporal fuzzy inference curves, to represent the states of a three-dimensional object. Aside from providing 3D reconstruction, multi-camera systems can also be used for other purposes, like viewpoint independent [37] fall detection, monitoring multiple rooms [12] and fusion of different cameras’ results [44]. Thome et al. [37] propose a multi-view fall detection system by which motion is modeled by a layered hidden Markov model (LHMM). The proposed method uses a multi-view setting, where the low-level steps are (mainly) performed independently in each view, leading to the extraction of simple image features compatible with real-time achievement. Then, a fusion unit merges the output of each camera to provide a viewpoint-independent pose classification. Cucchiara et al. [12] use multiple cameras to monitor different rooms. A single room is monitored by... |
27 | A survey on fall detection: Principles and approaches. Neurocomputing 2013
- Mubashir, Shao, et al.
Citation Context ...sors are accelerometers), and being exclusively vision-based. Generally speaking, the sensor-based methods require subjects to actively cooperate by wearing the sensors, which can be problematic and possibly uncomfortable (e.g., wearing sensors while sleeping, to detect falls during a night trip to the restroom). The vision-based methods are less intrusive, as all information is collected from cameras. This literature review focuses on vision-based research work and contains a comprehensive study of recently proposed fall detection methods using depth cameras. Compared to existing review papers [27, 18], this literature review makes the following contributions: (1) we focus on recent vision-based fall detection techniques. Specifically, recent depth-camera-based fall detection methods are extensively summarized in this survey. (2) we are not aware of any literature discussing the publicly available fall datasets. However, establishing several benchmark datasets is extremely important for the fall detection community, as it enables researchers to fairly compare their methods with others. This literature review introduces several publicly available fall datasets that can serve as benchmarks... |
26 | An acoustic fall detector system that uses sound height information to reduce the false alarm rate
- Popescu, Li, et al.
- 2008
Citation Context ...all detection system should have both high sensitivity and good specificity. Unfortunately, the existing vision-based fall detection methods cannot satisfy both accuracy and robustness. Although we can develop more sophisticated vision-based techniques, it is also worth emphasizing that vision-based methods need not form a standalone fall detection system; they can also be combined with other modules to form a more general system. The overall fall detection system may contain additional modules, both to improve accuracy, and to include additional functionality, such as an acoustic module [31, 43] and a module sending an alert about the detected fall. We believe that including sound processing and speech recognition would help significantly towards obtaining a robust system. Sound processing may produce additional features to be used for classifying a candidate fall event. Speech recognition can be used so that the system initiates a dialog with the subject, in the case where a fall has been detected. For example, the system can ask “Are you OK?” and the user can respond to indicate that there was no actual fall and no need to issue an alert. 5. ACKNOWLEDGMENTS This work was partially ... |
23 | Intelligent video surveillance for monitoring elderly in home environments.
- Nasution, Emmanuel
- 2007
Citation Context ...ultiple actions simultaneously with less intrusion. Vision-based methods can be broadly divided into three categories: fall detection using a single RGB camera, 3D-based methods using multiple cameras, and 3D-based methods using depth cameras. 3.1 Fall Detection Using a Single RGB Camera Fall detection using a single RGB camera has been extensively studied, as such systems are easy to set up and inexpensive. Shape-related features, inactivity detection and human motion analysis are the most commonly used cues for detecting falls. Shape-related features are widely used for fall detection [26, 34, 38, 25, 36, 14, 29]. In [36, 25], fall detection is based on the width-to-height aspect ratio of the person. Mirmahboub et al. [26] use a simple background separation method to create the silhouette of the person, and several features are then extracted from the silhouette area. Finally, an SVM classifier is employed to perform the classification based on these silhouette-related features. Rougier et al. [34] use a shape matching technique to track the silhouette of the person in the target video clip. The shape deformation is then quantified from these silhouettes, and the classification is based on the shape deform... |
20 | Modeling human activity from voxel person using fuzzy logic. Fuzzy Systems,
- Anderson, Luke, et al.
- 2009
Citation Context ...or a specific type of activity. Falls are detected when none of the specialized trackers for “normal” activities can explain the observation. Charfi et al. [11, 10] introduce a spatio-temporal human fall descriptor, named STHF, that uses several combinations of transformations of geometrical features. The well-known SVM classifier is applied to the STHF descriptor to classify falls and normal activities. 3.2 3D-based Methods Using Multiple RGB Cameras Another category of vision-based methods for fall detection is 3D-based methods using multiple RGB cameras. The calibrated multi-camera systems [6, 4, 5, 1, 2] allow 3D reconstruction of the object but require a careful and time-consuming calibration process. Auvinet et al. [6, 4] use a network of multiple calibrated cameras to reconstruct the 3D shape of the person. Fall events are detected by analyzing the volume distribution along the vertical axis, and an alarm is triggered when the major part of this distribution is abnormally near the floor. In a later work [5], the fall alarm is triggered only when the major part of this distribution remains abnormally near the floor for a predefined period of time. Anderson et al. [1, 2] employ multiple camera... |
20 | Robust video surveillance for fall detection based on human shape deformation
- Rougier, Meunier, et al.
- 2011
Citation Context ...ultiple actions simultaneously with less intrusion. Vision-based methods can be broadly divided into three categories: fall detection using a single RGB camera, 3D-based methods using multiple cameras, and 3D-based methods using depth cameras. 3.1 Fall Detection Using a Single RGB Camera Fall detection using a single RGB camera has been extensively studied, as such systems are easy to set up and inexpensive. Shape-related features, inactivity detection and human motion analysis are the most commonly used cues for detecting falls. Shape-related features are widely used for fall detection [26, 34, 38, 25, 36, 14, 29]. In [36, 25], fall detection is based on the width-to-height aspect ratio of the person. Mirmahboub et al. [26] use a simple background separation method to create the silhouette of the person, and several features are then extracted from the silhouette area. Finally, an SVM classifier is employed to perform the classification based on these silhouette-related features. Rougier et al. [34] use a shape matching technique to track the silhouette of the person in the target video clip. The shape deformation is then quantified from these silhouettes, and the classification is based on the shape deform... |
19 | Monocular 3-D head tracking to detect falls of elderly people
- Rougier
- 2006
Citation Context ...activities. Arie et al. [29] present a novel method to distinguish different postures including standing, sitting, bending/squatting, lying on the side and lying toward the camera. The proposed method extracts the projection histograms of the segmented human body silhouette as the main feature vector. Posture classification is performed by the k-Nearest Neighbor (k-NN) algorithm and an evidence accumulation technique. The motion pattern differences between falls and other daily activities, such as walking, sitting down and drinking, are significant. Much of the research is based on motion analysis [15, 22, 40, 33]. Liao et al. [22] use human motion analysis and human silhouette shape variations to detect slip-only and fall events. The motion measure is obtained by analyzing the energy of the motion active (MA) area in the integrated spatio-temporal energy (ISTE) map. Homa et al. [15] discuss applying Integrated Time Motion Image (ITMI) to fall detection. ITMI is a type of spatio-temporal database that includes motion and time of motion occurrence. Given a video clip, the integrated time motion images are calculated to represent the motion pattern that occurred in the vide... |
18 | Challenges, issues and trends in fall detection systems.
- Igual, Medrano, et al.
- 2013
Citation Context ...sors are accelerometers), and being exclusively vision-based. Generally speaking, the sensor-based methods require subjects to actively cooperate by wearing the sensors, which can be problematic and possibly uncomfortable (e.g., wearing sensors while sleeping, to detect falls during a night trip to the restroom). The vision-based methods are less intrusive, as all information is collected from cameras. This literature review focuses on vision-based research work and contains a comprehensive study of recently proposed fall detection methods using depth cameras. Compared to existing review papers [27, 18], this literature review makes the following contributions: (1) we focus on recent vision-based fall detection techniques. Specifically, recent depth-camera-based fall detection methods are extensively summarized in this survey. (2) we are not aware of any literature discussing the publicly available fall datasets. However, establishing several benchmark datasets is extremely important for the fall detection community, as it enables researchers to fairly compare their methods with others. This literature review introduces several publicly available fall datasets that can serve as benchmarks... |
16 | An Automated Active Vision System for Fall Detection and Posture Analysis in Ambient Assisted Living Applications, October
- Leone, Diraco, et al.
- 2010
Citation Context ...esults in a separate fall confidence (so no external camera calibration is needed). These confidences are then combined into an overall decision. Hung et al. [16, 17] propose using the measures of humans’ heights and occupied areas to distinguish three typical states of humans: standing, sitting and lying. Two relatively orthogonal views are utilized, which in turn simplifies the estimation of occupied areas as the product of the widths of the same person observed in the two cameras. 3.3 3D-based Methods Using Depth Cameras The earliest depth camera used for fall detection is the Time-Of-Flight 3D camera [13]. Since a Time-Of-Flight 3D camera is expensive, very few researchers adopted it for fall detection. But this situation has changed since the advent of affordable depth sensing technology, such as the Microsoft Kinect. With depth cameras, the distance from the top of the person to the floor is simple to calculate and can then be used as a feature to detect falls [13, 21, 32, 19]. Diraco et al. [13] use a wall-mounted Time-Of-Flight 3D camera to monitor the scene. The system identifies a fall event when the human centroid gets closer than a certain threshold to... |
15 | Fall detection with multiple cameras: An occlusion-resistant method based on 3D silhouette vertical distribution
- Auvinet, Multon, et al.
- 2010
Citation Context ...or a specific type of activity. Falls are detected when none of the specialized trackers for “normal” activities can explain the observation. Charfi et al. [11, 10] introduce a spatio-temporal human fall descriptor, named STHF, that uses several combinations of transformations of geometrical features. The well-known SVM classifier is applied to the STHF descriptor to classify falls and normal activities. 3.2 3D-based Methods Using Multiple RGB Cameras Another category of vision-based methods for fall detection is 3D-based methods using multiple RGB cameras. The calibrated multi-camera systems [6, 4, 5, 1, 2] allow 3D reconstruction of the object but require a careful and time-consuming calibration process. Auvinet et al. [6, 4] use a network of multiple calibrated cameras to reconstruct the 3D shape of the person. Fall events are detected by analyzing the volume distribution along the vertical axis, and an alarm is triggered when the major part of this distribution is abnormally near the floor. In a later work [5], the fall alarm is triggered only when the major part of this distribution remains abnormally near the floor for a predefined period of time. Anderson et al. [1, 2] employ multiple camera... |
15 | Fall detection from depth map video sequences
- Rougier, Auvinet, et al.
- 2011
Citation Context ...reas as the product of widths of the same person, observed in two cameras. 3.3 3D-based Methods Using Depth Cameras The earliest depth camera used for fall detection is the Time-Of-Flight 3D camera [13]. Since a Time-Of-Flight 3D camera is expensive, very few researchers adopted it for fall detection. But this situation has changed since the advent of affordable depth sensing technology, such as the Microsoft Kinect. With depth cameras, the distance from the top of the person to the floor is simple to calculate and can then be used as a feature to detect falls [13, 21, 32, 19]. Diraco et al. [13] use a wall-mounted Time-Of-Flight 3D camera to monitor the scene. The system identifies a fall event when the human centroid gets closer than a certain threshold to the floor, and the person does not move for a certain number of seconds once close to the floor. In a related approach, Leone et al. [21] employ a 3D range camera. A fall event is detected based on two rules: (1) the distance of the person’s center-of-mass from the floor plane decreases below a threshold within a time window of about 900 ms; (2) the person’s motion remains negligible within a time window of abou... |
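The two-rule scheme described in this excerpt (a rapid drop of the centroid toward the floor, followed by a period of negligible motion) might be sketched as follows. This is one possible reading, not the method of Diraco et al. or Leone et al. Every name and default value is hypothetical: the excerpt gives only the drop window of about 900 ms, and the length of the stillness window is cut off, so it is left as a parameter. Negligible motion is approximated here by the centroid height staying within a small band, which is an assumption of this sketch.

```python
def detect_fall(times_s, heights_m, low_m=0.4, upright_m=0.8,
                drop_window_s=0.9, still_window_s=4.0, still_tol_m=0.05):
    """Return the time (s) at which the centroid dropped below `low_m` for a
    detected fall, or None if no fall is found.

    times_s / heights_m: centroid height above the floor plane per frame.
    All thresholds are illustrative values, not taken from the cited papers.
    """
    n = len(times_s)
    for i in range(1, n):
        if not (heights_m[i] < low_m <= heights_m[i - 1]):
            continue  # no downward crossing of the height threshold here
        t0 = times_s[i]
        # Rule 1: the centroid was still upright somewhere in the drop window,
        # i.e. the descent was fast rather than a slow sit-down.
        recent = [h for t, h in zip(times_s[:i], heights_m[:i])
                  if t0 - t <= drop_window_s]
        if not recent or max(recent) < upright_m:
            continue
        # Rule 2: once low, the centroid stays low and nearly motionless
        # for at least `still_window_s`.
        run_start, lo, hi = i, heights_m[i], heights_m[i]
        for j in range(i + 1, n):
            h = heights_m[j]
            if h >= low_m:
                break  # the person rose again
            lo, hi = min(lo, h), max(hi, h)
            if hi - lo > still_tol_m:
                run_start, lo, hi = j, h, h  # motion seen: restart the stillness run
            elif times_s[j] - times_s[run_start] >= still_window_s:
                return t0
    return None


# Toy 10 Hz traces: an abrupt drop followed by stillness, and a slow sit-down.
fall_times = [round(0.1 * k, 1) for k in range(80)]
fall_heights = [0.9] * 10 + [0.76, 0.62, 0.48, 0.34, 0.20] + [0.2] * 65
sit_times = [round(0.1 * k, 1) for k in range(91)]
sit_heights = [0.9 - 0.02 * k for k in range(31)] + [0.3] * 60
```

Restarting the stillness run whenever the height band is exceeded is a simple greedy choice; a sliding-window check would be a stricter alternative.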
14 | Fall incidents detection for intelligent video surveillance
- Tao, Turjo, et al.
Citation Context ...ultiple actions simultaneously with less intrusion. Vision-based methods can be broadly divided into three categories: fall detection using a single RGB camera, 3D-based methods using multiple cameras, and 3D-based methods using depth cameras. 3.1 Fall Detection Using a Single RGB Camera Fall detection using a single RGB camera has been extensively studied, as such systems are easy to set up and inexpensive. Shape-related features, inactivity detection and human motion analysis are the most commonly used cues for detecting falls. Shape-related features are widely used for fall detection [26, 34, 38, 25, 36, 14, 29]. In [36, 25], fall detection is based on the width-to-height aspect ratio of the person. Mirmahboub et al. [26] use a simple background separation method to create the silhouette of the person, and several features are then extracted from the silhouette area. Finally, an SVM classifier is employed to perform the classification based on these silhouette-related features. Rougier et al. [34] use a shape matching technique to track the silhouette of the person in the target video clip. The shape deformation is then quantified from these silhouettes, and the classification is based on the shape deform... |
12 | An eigenspace-based approach for human fall detection using integrated time motion image and neural network.
- Foroughi, Naseri, et al.
- 2008
Citation Context ...activities. Arie et al. [29] present a novel method to distinguish different postures including standing, sitting, bending/squatting, lying on the side and lying toward the camera. The proposed method extracts the projection histograms of the segmented human body silhouette as the main feature vector. Posture classification is performed by the k-Nearest Neighbor (k-NN) algorithm and an evidence accumulation technique. The motion pattern differences between falls and other daily activities, such as walking, sitting down and drinking, are significant. Much of the research is based on motion analysis [15, 22, 40, 33]. Liao et al. [22] use human motion analysis and human silhouette shape variations to detect slip-only and fall events. The motion measure is obtained by analyzing the energy of the motion active (MA) area in the integrated spatio-temporal energy (ISTE) map. Homa et al. [15] discuss applying Integrated Time Motion Image (ITMI) to fall detection. ITMI is a type of spatio-temporal database that includes motion and time of motion occurrence. Given a video clip, the integrated time motion images are calculated to represent the motion pattern that occurred in the vide... |
12 | Acoustic fall detection using Gaussian mixture models and GMM supervectors.
- Zhuang, Huang, et al.
- 2009
Citation Context ...all detection system should have both high sensitivity and good specificity. Unfortunately, the existing vision-based fall detection methods cannot satisfy both accuracy and robustness. Although we can develop more sophisticated vision-based techniques, it is also worth emphasizing that vision-based methods need not form a standalone fall detection system; they can also be combined with other modules to form a more general system. The overall fall detection system may contain additional modules, both to improve accuracy, and to include additional functionality, such as an acoustic module [31, 43] and a module sending an alert about the detected fall. We believe that including sound processing and speech recognition would help significantly towards obtaining a robust system. Sound processing may produce additional features to be used for classifying a candidate fall event. Speech recognition can be used so that the system initiates a dialog with the subject, in the case where a fall has been detected. For example, the system can ask “Are you OK?” and the user can respond to indicate that there was no actual fall and no need to issue an alert. 5. ACKNOWLEDGMENTS This work was partially ... |
11 | A real-time, multiview fall detection system: A LHMM-based approach.
- Thome, Miguet, et al.
- 2008
Citation Context ...ution is abnormally near the floor during a predefined period of time. Anderson et al. [1, 2] employ multiple cameras and a hierarchy of fuzzy logic to detect falls. Overall, using multiple cameras offers the advantage of allowing 3D reconstruction and extraction of 3D features for fall detection. The proposed method introduces the voxel person, which is a linguistic summarization of temporal fuzzy inference curves, to represent the states of a three-dimensional object. Aside from providing 3D reconstruction, multi-camera systems can also be used for other purposes, like viewpoint independent [37] fall detection, monitoring multiple rooms [12] and fusion of different cameras’ results [44]. Thome et al. [37] propose a multi-view fall detection system by which motion is modeled by a layered hidden Markov model (LHMM). The proposed method uses a multi-view setting, where the low-level steps are (mainly) performed independently in each view, leading to the extraction of simple image features compatible with real-time achievement. Then, a fusion unit merges the output of each camera to provide a viewpoint-independent pose classification. Cucchiara et al. [12] use multiple cameras to monitor... |
9 | Visual Sensor Based Abnormal Event Detection with Moving Shadow Removal in Home Healthcare Applications. Sensors
- Lee, Chung
- 2012
Citation Context ...d to recognize five activities: standing, falling from standing, falling from chair, sitting on chair, and sitting on floor. The main analysis is based on 3D depth information. If the person goes out of the range of the depth camera, RGB video is used to analyze the activities. The kinematic model features that are extracted from 3D depth information include two components: structure similarity and vertical height of the person. For RGB video analysis, the width-height ratio of the detected human bounding box is used to recognize different activities. Other methods using depth cameras include [23, 3, 20]. Xin et al. [23] combine shape-based fall characterization and a learning-based classifier to distinguish falls from other daily actions. Curvature scale space (CSS) features of human silhouettes are extracted at each frame and then an action is represented by a bag of CSS words (BoCSS). The BoCSS representation of a fall is distinguished from those of other actions by the pre-trained extreme learning machine (ELM) classifier. In a later work [3], instead of representing an action as a bag of CSS words, Fisher Vector (FV) encoding is used to describe the action based on CSS features. A pre-tr... |
8 | Multiple Cameras Fall Dataset.
- Auvinet, Rougier, et al.
- 2010
Citation Context ...multiple calibrated RGB cameras. SDUFall [23] (footnote 1: http://www.sucro.org/homepage/wanghaibo/SDUFall.html): A Kinect camera was set up to record the dataset. Three channels were collected: RGB video (.avi), depth video (.avi), and 20 skeleton joint positions (.txt). All videos were recorded at a resolution of 320x240 pixels per frame and 30 frames per second in AVI format. Twenty subjects participated in the data recording. Each subject performed 6 actions 10 times each: falling down, bending, squatting, sitting, lying and walking. Each action was recorded under certain conditions. These conditions include carrying or not... |
Table 1: Five publicly available fall datasets
Dataset | SDUFall | EDF | OCCU | Dataset introduced in [9] | Dataset introduced in [7]
camera type | one Kinect | two Kinects | two Kinects | one RGB camera | eight calibrated RGB cameras
camera viewpoints | one | two | two | NaN | eight
fall type | falls with different directions | eight fall directions | occluded falls | falls with different directions | forward, backward falls, falls from sitting down and loss of balance
number of falls | 200 | 320 | 60 | 192 | 200
activities of daily life | Yes | Yes | Yes | Yes | Yes
simulated scenarios | 1 | 1 | 1 | 4 (home, coffee room, office, lecture room) | 24
8 | Tracker trees for unusual event detection.
- Nater, Grabner, et al.
- 2009
Citation Context ..., and the silhouette motion is modeled by an integrated normalized motion energy image computed over a short-term video sequence. Then, the shape deformation quantified from the fitted silhouettes is used as the features to distinguish different postures of the human. Inactivity detection is adopted by [28] to detect falls. In [28] ceiling-mounted, wide-angle cameras with vertically oriented optical axes are used to reduce the influence of occlusion. Nait-Charif et al. [28] use learned models of spatial context, which are used in conjunction with a tracker to achieve these goals. Nater et al. [30] present an approach for unusual event detection based on a tree of trackers. Each tracker is specialized for a specific type of activity. Falls are detected when none of the specialized trackers for “normal” activities can explain the observation. Charfi et al. [11, 10] introduce a spatio-temporal human fall descriptor, named STHF, that uses several combinations of transformations of geometrical features. The well-known SVM classifier is applied to the STHF descriptor to classify falls and normal activities. 3.2 3D-based Methods Using Multiple RGB Cameras Another category of vision-based meth... |
8 | Fall detection in homes of older adults using the Microsoft Kinect
- Stone, Skubic
Citation Context ...he human centroid height relative to the ground and the body velocity are used to determine if a fall has occurred. Michal et al. [19] use a ceiling-mounted 3D depth camera to detect falls. A k-NN classifier is used to distinguish the lying pose from common daily activities based on features including head-to-floor distance, person area and the ratio of the shape’s major length to its width. Human motion analysis is further employed to distinguish intentional lying postures from accidental falls. Analyzing how a human has moved during the last frames in a world coordinate system is another commonly used method [35, 42, 41, 24]. Eric et al. [35] propose a two-stage fall detection method. The first stage of the system characterizes the vertical state of a segmented 3D object for each frame, and then identifies on-ground events through temporal segmentation of the vertical state time series of tracked 3D objects. The second stage of the system utilizes an ensemble of decision trees and features extracted from an on-ground event to compute a confidence that a fall preceded it. Zhong et al. [42] propose a statistical method based on Kinect depth cameras that makes a decision based on information about how the human move... |
6 | Detecting falls with 3D range camera in ambient assisted living applications: A preliminary study.
- Leone, Diraco, et al.
- 2011
Citation Context ...reas as the product of widths of the same person, observed in two cameras. 3.3 3D-based Methods Using Depth Cameras The earliest depth camera used for fall detection is the Time-Of-Flight 3D camera [13]. Since a Time-Of-Flight 3D camera is expensive, very few researchers adopted it for fall detection. But this situation has changed since the advent of affordable depth sensing technology, such as the Microsoft Kinect. With depth cameras, the distance from the top of the person to the floor is simple to calculate and can then be used as a feature to detect falls [13, 21, 32, 19]. Diraco et al. [13] use a wall-mounted Time-Of-Flight 3D camera to monitor the scene. The system identifies a fall event when the human centroid gets closer than a certain threshold to the floor, and the person does not move for a certain number of seconds once close to the floor. In a related approach, Leone et al. [21] employ a 3D range camera. A fall event is detected based on two rules: (1) the distance of the person’s center-of-mass from the floor plane decreases below a threshold within a time window of about 900 ms; (2) the person’s motion remains negligible within a time window of abou... |
6 | Slip and fall event detection using Bayesian Belief Network
- Liao, Huang, et al.
Citation Context ...activities. Arie et al. [29] present a novel method to distinguish different postures including standing, sitting, bending/squatting, lying on the side and lying toward the camera. The proposed method extracts the projection histograms of the segmented human body silhouette as the main feature vector. Posture classification is performed by the k-Nearest Neighbor (k-NN) algorithm and an evidence accumulation technique. The motion pattern differences between falls and other daily activities, such as walking, sitting down and drinking, are significant. Much of the research is based on motion analysis [15, 22, 40, 33]. Liao et al. [22] use human motion analysis and human silhouette shape variations to detect slip-only and fall events. The motion measure is obtained by analyzing the energy of the motion active (MA) area in the integrated spatio-temporal energy (ISTE) map. Homa et al. [15] discuss applying Integrated Time Motion Image (ITMI) to fall detection. ITMI is a type of spatio-temporal database that includes motion and time of motion occurrence. Given a video clip, the integrated time motion images are calculated to represent the motion pattern that occurred in the vide... |
6 | Privacy preserving automatic fall detection for elderly using rgbd cameras.
- Zhang, Tian, et al.
- 2012
Citation Context ...change and then finalized by the inactivity detection. The key novelty of the proposed method lies in calculating the velocity based on the contraction or expansion of the width, height and depth of the 3D bounding box. By explicitly using the 3D bounding box, the proposed algorithm does not require any prior knowledge of the scene (i.e., the floor). Inactivity is defined as a lack of motion of the monitored human within a pre-defined time window. The Microsoft Kinect SDK provides skeletal joint tracking, which enables researchers to analyze key joints of the human body for fall detection [8, 39]. Zhen-Peng et al. [8] propose a single depth camera based fall detection method by analyzing the human body key joints. In the proposed approach, a pose-invariant and efficient randomized decision tree (RDT) algorithm is employed to extract the 3D body joints at each frame. Then, the 3D trajectory of the head joint is fed to a pre-trained support vector machine (SVM) classifier to determine whether or not a fall has occurred. Chenyang et al. [39] propose an RGB-D camera based method to recognize five activities: standing, falling from standing, falling from chair, sitting on chair, and ... |
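The bounding-box velocity idea in the context above can be illustrated with a short sketch. It covers only the velocity stage, not the inactivity check that finalizes the decision. Combining width and depth through a diagonal, the names `Box3D`, `box_velocities` and `is_fall_candidate`, and both thresholds are assumptions made for this example; the cited method may define the velocities differently.

```python
import math
from typing import NamedTuple, Tuple


class Box3D(NamedTuple):
    """Axis-aligned 3D bounding box of the person at one time instant."""
    t: float       # timestamp (s)
    width: float   # extent along x (m)
    height: float  # extent along the vertical axis (m)
    depth: float   # extent along z (m)


def box_velocities(prev: Box3D, curr: Box3D) -> Tuple[float, float]:
    """Return (height velocity, width-depth velocity) in m/s.

    A fall contracts the box vertically (negative height velocity) while
    expanding it horizontally (positive width-depth velocity).
    """
    dt = curr.t - prev.t
    if dt <= 0:
        raise ValueError("boxes must be in increasing time order")
    v_height = (curr.height - prev.height) / dt
    # Assumed choice: summarise width and depth by the horizontal diagonal.
    v_wd = (math.hypot(curr.width, curr.depth)
            - math.hypot(prev.width, prev.depth)) / dt
    return v_height, v_wd


def is_fall_candidate(prev: Box3D, curr: Box3D,
                      v_height_thresh: float = -1.0,  # assumed (m/s)
                      v_wd_thresh: float = 1.0) -> bool:  # assumed (m/s)
    """Flag a fall candidate when the box shrinks fast in height and grows fast sideways."""
    v_height, v_wd = box_velocities(prev, curr)
    return v_height <= v_height_thresh and v_wd >= v_wd_thresh
```

Because only the box dimensions are used, no floor plane has to be estimated, which matches the claim in the context that prior knowledge of the scene is unnecessary.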
6 |
Introducing a statistical behavior model into camera-based fall detection
- Zweng, Zambanini, et al.
Citation Context ... employ multiple cameras and a hierarchy of fuzzy logic to detect falls. Overall, using multiple cameras offers the advantage of allowing 3D reconstruction and extraction of 3D features for fall detection. The proposed method introduces the voxel person, which is a linguistic summarization of temporal fuzzy inference curves, to represent the states of a three-dimensional object. Aside from providing 3D reconstruction, multi-camera systems can also be used for other purposes, such as viewpoint-independent fall detection [37], monitoring multiple rooms [12] and fusion of different cameras’ results [44]. Thome et al. [37] propose a multi-view fall detection system in which motion is modeled by a layered hidden Markov model (LHMM). The proposed method uses a multi-view setting, where the low-level steps are (mainly) performed independently in each view, leading to the extraction of simple image features compatible with real-time operation. Then, a fusion unit merges the output of each camera to provide a viewpoint-independent pose classification. Cucchiara et al. [12] use multiple cameras to monitor different rooms. A single room is monitored by a single camera. Multiple cameras are used to... |
5 | Experiments with computer vision methods for fall detection.
- Zhang, Becker, et al.
- 2010
Citation Context ...activities. Arie et al. [29] present a novel method to distinguish different postures including standing, sitting, bending/squatting, lying on the side and lying toward the camera. The proposed method extracts the projection histograms of the segmented human body silhouette as the main feature vector. Posture classification is completed by the k-Nearest Neighbor (k-NN) algorithm and an evidence accumulation technique. The motion patterns of falls differ significantly from those of other daily activities such as walking, sitting down and drinking. Much of the research is based on motion analysis [15, 22, 40, 33]. Liao et al. [22] use human motion analysis and human silhouette shape variations to detect slip-only and fall events. The motion measure is obtained by analyzing the energy of the motion active (MA) area in the integrated spatio-temporal energy (ISTE) map. Homa et al. [15] discuss applying Integrated Time Motion Image (ITMI) to fall detection. ITMI is a type of spatio-temporal database that includes motion and time of motion occurrence. Given a video clip, the integrated time motion images are calculated to represent the motion pattern that occurred in the vide... |
5 | A viewpoint-independent statistical method for fall detection.
- Zhang, Liu, et al.
- 2012
Citation Context ...he human centroid height relative to the ground and the body velocity are used to determine if a fall has occurred. Michal et al. [19] use a ceiling-mounted 3D depth camera to detect falls. A KNN classifier is used to distinguish the lying pose from common daily activities based on features including head to floor distance, person area and shape’s major length to width. Human motion analysis is further employed to distinguish between intentional lying postures and accidental falls. Analyzing how a human has moved during the last frames in a world coordinate system is another commonly used method [35, 42, 41, 24]. Eric et al. [35] propose a two-stage fall detection method. The first stage of the system characterizes the vertical state of a segmented 3D object for each frame, and then identifies on-ground events through temporal segmentation of the vertical state time series of tracked 3D objects. The second stage of the system utilizes an ensemble of decision trees and features extracted from an on-ground event to compute a confidence that a fall preceded it. Zhong et al. [42] propose a statistical method based on Kinect depth cameras that makes a decision based on information about how the human move... |
4 |
Fall detection using multiple cameras.
- Auvinet, Reveret, et al.
- 2008
Citation Context ...or a specific type of activity. Falls are detected when none of the specialized trackers for “normal” activities can explain the observation. Charfi et al. [11, 10] introduce a spatio-temporal human fall descriptor, named STHF, that uses several combinations of transformations of geometrical features. The well-known SVM classifier is applied to the STHF descriptor to classify falls and normal activities. 3.2 3D-based Methods Using Multiple RGB Cameras Another category of vision-based methods for fall detection is 3D-based methods using multiple RGB cameras. The calibrated multi-camera systems [6, 4, 5, 1, 2] allow 3D reconstruction of the object but require a careful and time-consuming calibration process. Auvinet et al. [6, 4] use a network of multiple calibrated cameras to reconstruct the 3D shape of the person. Fall events are detected by analyzing the volume distribution along the vertical axis, and an alarm is triggered when the major part of this distribution is abnormally near the floor. In a later work [5], the fall alarm is triggered when the major part of this distribution is abnormally near the floor for a predefined period of time. Anderson et al. [1, 2] employ multiple camera... |
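The vertical volume distribution rule described in the context above can be sketched as follows, assuming the 3D reconstruction has already been reduced to a list of voxel heights per frame. The function names, the 0.4 m floor band, the 0.5 ratio threshold and the frame count are all hypothetical values for illustration, not those used in [6, 4, 5].

```python
from typing import Optional, Sequence


def low_volume_ratio(voxel_heights: Sequence[float],
                     floor_band: float = 0.4) -> float:
    """Fraction of the reconstructed volume lying below `floor_band` metres."""
    if not voxel_heights:
        return 0.0
    low = sum(1 for z in voxel_heights if z < floor_band)
    return low / len(voxel_heights)


def alarm_frame(frames: Sequence[Sequence[float]],
                floor_band: float = 0.4,    # assumed height band (m)
                ratio_thresh: float = 0.5,  # "major part" of the volume
                min_frames: int = 3) -> Optional[int]:
    """Index of the frame where the alarm fires, or None.

    The alarm fires once the major part of the volume has stayed near the
    floor for `min_frames` consecutive frames (the "predefined period").
    """
    run = 0
    for i, heights in enumerate(frames):
        if low_volume_ratio(heights, floor_band) > ratio_thresh:
            run += 1
            if run >= min_frames:
                return i
        else:
            run = 0  # the person is upright again; restart the count
    return None
```

Setting `min_frames=1` corresponds to the earlier variant, which raises the alarm as soon as the distribution is near the floor; a larger value corresponds to the later variant with a predefined period.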
3 |
Fall detection for elderly person care in a vision-based home surveillance environment using a monocular camera. Signal Image Video Process
- Feng, Liu, et al.
Citation Context ...ultiple actions simultaneously with less intrusion. Vision-based methods can be broadly divided into three categories: fall detection using a single RGB camera, 3D-based methods using multiple cameras, and 3D-based methods using depth cameras. 3.1 Fall Detection Using a Single RGB Camera Fall detection using a single RGB camera has been extensively studied, as such systems are easy to set up and inexpensive. Shape-related features, inactivity detection and human motion analysis are the most commonly used cues for detecting falls. Shape-related features are widely used for fall detection [26, 34, 38, 25, 36, 14, 29]. In [36, 25], fall detection is based on the width-to-height aspect ratio of the person. Mirmahboub et al. [26] use a simple background separation method to create the silhouette of the person, and several features are then extracted from the silhouette area. Finally, an SVM classifier is employed to perform the classification based on these silhouette-related features. Rougier et al. [34] use a shape matching technique to track the silhouette of the person in the target video clip. The shape deformation is then quantified from these silhouettes, and the classification is based on the shape deform... |
3 |
Fall detection using ceiling-mounted 3d depth camera.
- Kepski, Kwolek
- 2014
Citation Context ...reas as the product of widths of the same person, observed in two cameras. 3.3 3D-based Methods Using Depth Cameras The earliest depth camera used for fall detection is the Time-Of-Flight 3D camera [13]. Since Time-Of-Flight 3D cameras are expensive, very few researchers adopted them for fall detection. This situation has changed with the advent of affordable depth sensing technology, such as Microsoft Kinect. With a depth camera, the distance from the top of the person to the floor is simple to calculate and can then be used as a feature to detect falls [13, 21, 32, 19]. Diraco et al. [13] use a wall-mounted Time-Of-Flight 3D camera to monitor the scene. The system identifies a fall event when the human centroid gets closer than a certain threshold to the floor, and the person does not move for a certain number of seconds once close to the floor. In a related approach, Leone et al. [21] employ a 3D range camera. A fall event is detected based on two rules: (1) the distance of the person’s center-of-mass from the floor plane decreases below a threshold within a time window of about 900 ms; (2) the person’s motion remains negligible within a time window of abou... |
3 |
Automatic monocular system for human fall detection based on variations in silhouette area
- Mirmahboub, Samavi, et al.
- 2013
Citation Context ...ultiple actions simultaneously with less intrusion. Vision-based methods can be broadly divided into three categories: fall detection using a single RGB camera, 3D-based methods using multiple cameras, and 3D-based methods using depth cameras. 3.1 Fall Detection Using a Single RGB Camera Fall detection using a single RGB camera has been extensively studied, as such systems are easy to set up and inexpensive. Shape-related features, inactivity detection and human motion analysis are the most commonly used cues for detecting falls. Shape-related features are widely used for fall detection [26, 34, 38, 25, 36, 14, 29]. In [36, 25], fall detection is based on the width-to-height aspect ratio of the person. Mirmahboub et al. [26] use a simple background separation method to create the silhouette of the person, and several features are then extracted from the silhouette area. Finally, an SVM classifier is employed to perform the classification based on these silhouette-related features. Rougier et al. [34] use a shape matching technique to track the silhouette of the person in the target video clip. The shape deformation is then quantified from these silhouettes, and the classification is based on the shape deform... |
2 |
Definition and performance evaluation of a robust svm based fall detection solution.
- Charfi, Miteran, et al.
- 2012
Citation Context ...era and one was made with multiple calibrated RGB cameras. SDUFall [23] (footnote 1: http://www.sucro.org/homepage/wanghaibo/SDUFall.html): A Kinect camera was set up to record the dataset. Three channels were collected: RGB video (.avi), depth video (.avi), and 20 skeleton joint positions (.txt). All videos were recorded at a resolution of 320x240 pixels per frame and 30 frames per second in AVI format. Twenty subjects participated in the data recording. Each subject performed 6 actions 10 times each: falling down, bending, squatting, sitting, lying and walking. Each action was recorded under certain conditions. These conditio... |
Table 1: Five publicly available fall datasets
Dataset | SDUFall | EDF | OCCU | Dataset introduced in [9] | Dataset introduced in [7]
Camera type | one Kinect | two Kinects | two Kinects | one RGB camera | eight calibrated RGB cameras
Camera viewpoints | one | two | two | NaN | eight
Fall type | falls with different directions | eight fall directions | occluded falls | falls with different directions | forward, backward falls, falls from sitting down and loss of balance
Number of falls | 200 | 320 | 60 | 192 | 200
Activities of daily life | Yes | Yes | Yes | Yes | Yes
Simulated scenarios | 1 | 1 | 1 | 4 (home, coffee room, office, lecture room) | 24
2 |
The estimation of heights and occupied areas of humans from two orthogonal views for fall detection.
- Hung, Saito
- 2013
Citation Context ...the output of each camera to provide a viewpoint-independent pose classification. Cucchiara et al. [12] use multiple cameras to monitor different rooms. A single room is monitored by a single camera. Multiple cameras are used to cover different rooms and the camera handoff is treated by warping the person’s appearance in the new view by means of homography. Zweng et al. [44] detect falls using multiple cameras. Each of the camera inputs results in a separate fall confidence (so no external camera calibration is needed). These confidences are then combined into an overall decision. Hung et al. [16, 17] propose using the measures of humans’ heights and occupied areas to distinguish three typical states of humans: standing, sitting and lying. Two relatively orthogonal views are utilized, which simplifies the estimation of occupied areas as the product of the widths of the same person observed in the two cameras. 3.3 3D-based Methods Using Depth Cameras The earliest depth camera used for fall detection is the Time-Of-Flight 3D camera [13]. Since Time-Of-Flight 3D cameras are expensive, very few researchers adopted them for fall detection. But this situation has changed since the advent o... |
2 |
Depth-based human fall detection via shape features and improved extreme learning machine.
- Ma, Wang, et al.
- 2014
Citation Context ... that can serve as benchmarks. The rest of the paper is organized as follows. In Section 2, several publicly available fall datasets are introduced. In Section 3, we first discuss the classification of vision-based fall detection methods and then introduce each category separately. Finally, we conclude and discuss future directions of research in Section 4. 2. PUBLIC FALL DATASETS We introduce five publicly available fall datasets in this section. Three of them were recorded using Kinect cameras, one was collected by a single RGB camera and one was made with multiple calibrated RGB cameras. SDUFall [23] (footnote 1: http://www.sucro.org/homepage/wanghaibo/SDUFall.html): A Kinect camera was set up to record the dataset. Three channels were collected: RGB video (.avi), depth video (.avi), and 20 skeleton joint positions (.txt). All videos were recorded at a resolution of 320x240 pixels per frame and 30 frames per second in AVI format. Twenty subjects participated in the data recording. Each subject performed 6 actions 10 times each: falling down, bend... Table 1: Five publicly available fall datasets. Columns: SDUFall, EDF, OCCU, Dataset introduced in [9], Dataset introduced in [7]. Camera type: one Kinect, two Kinects, two Kin... |
2 |
Human fall detection based on adaptive background mixture model and hmm.
- Tra, Pham
- 2013
Citation Context ...ultiple actions simultaneously with less intrusion. Vision-based methods can be broadly divided into three categories: fall detection using a single RGB camera, 3D-based methods using multiple cameras, and 3D-based methods using depth cameras. 3.1 Fall Detection Using a Single RGB Camera Fall detection using a single RGB camera has been extensively studied, as such systems are easy to set up and inexpensive. Shape-related features, inactivity detection and human motion analysis are the most commonly used cues for detecting falls. Shape-related features are widely used for fall detection [26, 34, 38, 25, 36, 14, 29]. In [36, 25], fall detection is based on the width-to-height aspect ratio of the person. Mirmahboub et al. [26] use a simple background separation method to create the silhouette of the person, and several features are then extracted from the silhouette area. Finally, an SVM classifier is employed to perform the classification based on these silhouette-related features. Rougier et al. [34] use a shape matching technique to track the silhouette of the person in the target video clip. The shape deformation is then quantified from these silhouettes, and the classification is based on the shape deform... |
1 |
Shape feature encoding via fisher vector for efficient fall detection in depth-videos. Applied Soft Computing,
- Aslan, Sengur, et al.
- 2015
Citation Context ...d to recognize five activities: standing, falling from standing, falling from chair, sitting on chair, and sitting on floor. The main analysis is based on 3D depth information. If the person goes out of the range of the depth camera, RGB video is used to analyze the activities. The kinematic model features that are extracted from 3D depth information include two components: structure similarity and vertical height of the person. For RGB video analysis, the width-height ratio of the detected human bounding box is used to recognize different activities. Other methods using depth cameras include [23, 3, 20]. Xin et al. [23] combine shape-based fall characterization and a learning-based classifier to distinguish falls from other daily actions. Curvature scale space (CSS) features of human silhouettes are extracted at each frame and then an action is represented by a bag of CSS words (BoCSS). The BoCSS representation of a fall is distinguished from those of other actions by the pre-trained extreme learning machine (ELM) classifier. In a later work [3], instead of representing an action as a bag of CSS words, Fisher Vector (FV) encoding is used to describe the action based on CSS features. A pre-tr... |
1 |
Fall detection using body volume reconstruction and vertical repartition analysis.
- Auvinet, Multon, et al.
- 2010
Citation Context ...or a specific type of activity. Falls are detected when none of the specialized trackers for “normal” activities can explain the observation. Charfi et al. [11, 10] introduce a spatio-temporal human fall descriptor, named STHF, that uses several combinations of transformations of geometrical features. The well-known SVM classifier is applied to the STHF descriptor to classify falls and normal activities. 3.2 3D-based Methods Using Multiple RGB Cameras Another category of vision-based methods for fall detection is 3D-based methods using multiple RGB cameras. The calibrated multi-camera systems [6, 4, 5, 1, 2] allow 3D reconstruction of the object but require a careful and time-consuming calibration process. Auvinet et al. [6, 4] use a network of multiple calibrated cameras to reconstruct the 3D shape of the person. Fall events are detected by analyzing the volume distribution along the vertical axis, and an alarm is triggered when the major part of this distribution is abnormally near the floor. In a later work [5], the fall alarm is triggered when the major part of this distribution is abnormally near the floor for a predefined period of time. Anderson et al. [1, 2] employ multiple camera... |
1 |
Fall detection based on body part tracking using a depth camera. IEEE journal of biomedical and health informatics,
- Bian, Hou, et al.
- 2014
Citation Context ...change and then finalized by the inactivity detection. The key novelty of the proposed method lies in calculating the velocity based on the contraction or expansion of the width, height and depth of the 3D bounding box. By explicitly using the 3D bounding box, the proposed algorithm does not require any prior knowledge of the scene (i.e., the floor). Inactivity is defined as a lack of motion of the monitored human within a pre-defined time window. The Microsoft Kinect SDK provides skeletal joint tracking, which enables researchers to analyze key joints of the human body for fall detection [8, 39]. Zhen-Peng et al. [8] propose a single depth camera based fall detection method by analyzing the human body key joints. In the proposed approach, a pose-invariant and efficient randomized decision tree (RDT) algorithm is employed to extract the 3D body joints at each frame. Then, the 3D trajectory of the head joint is fed to a pre-trained support vector machine (SVM) classifier to determine whether or not a fall has occurred. Chenyang et al. [39] propose an RGB-D camera based method to recognize five activities: standing, falling from standing, falling from chair, sitting on chair, and ... |
1 |
Optimized spatio-temporal descriptors for real-time fall detection: comparison of support vector machine and adaboost-based classification.
- Charfi, Miteran, et al.
- 2013
Citation Context .... Inactivity detection is adopted by [28] to detect falls. In [28], ceiling-mounted, wide-angle cameras with vertically oriented optical axes are used to reduce the influence of occlusion. Nait-Charif et al. [28] use learned models of spatial context, which are used in conjunction with a tracker to achieve these goals. Nater et al. [30] present an approach for unusual event detection based on a tree of trackers. Each tracker is specialized for a specific type of activity. Falls are detected when none of the specialized trackers for “normal” activities can explain the observation. Charfi et al. [11, 10] introduce a spatio-temporal human fall descriptor, named STHF, that uses several combinations of transformations of geometrical features. The well-known SVM classifier is applied to the STHF descriptor to classify falls and normal activities. 3.2 3D-based Methods Using Multiple RGB Cameras Another category of vision-based methods for fall detection is 3D-based methods using multiple RGB cameras. The calibrated multi-camera systems [6, 4, 5, 1, 2] allow 3D reconstruction of the object but require a careful and time-consuming calibration process. Auvinet et al. [6, 4] use a network of multiple c... |
1 |
Robust spatio-temporal descriptors for real-time svm-based fall detection.
- Charfi, Miteran, et al.
- 2014
Citation Context .... Inactivity detection is adopted by [28] to detect falls. In [28], ceiling-mounted, wide-angle cameras with vertically oriented optical axes are used to reduce the influence of occlusion. Nait-Charif et al. [28] use learned models of spatial context, which are used in conjunction with a tracker to achieve these goals. Nater et al. [30] present an approach for unusual event detection based on a tree of trackers. Each tracker is specialized for a specific type of activity. Falls are detected when none of the specialized trackers for “normal” activities can explain the observation. Charfi et al. [11, 10] introduce a spatio-temporal human fall descriptor, named STHF, that uses several combinations of transformations of geometrical features. The well-known SVM classifier is applied to the STHF descriptor to classify falls and normal activities. 3.2 3D-based Methods Using Multiple RGB Cameras Another category of vision-based methods for fall detection is 3D-based methods using multiple RGB cameras. The calibrated multi-camera systems [6, 4, 5, 1, 2] allow 3D reconstruction of the object but require a careful and time-consuming calibration process. Auvinet et al. [6, 4] use a network of multiple c... |
1 | Detecting fall incidents of the elderly based on human-ground contact areas.
- Hung, Saito, et al.
- 2013
Citation Context ...the output of each camera to provide a viewpoint-independent pose classification. Cucchiara et al. [12] use multiple cameras to monitor different rooms. A single room is monitored by a single camera. Multiple cameras are used to cover different rooms and the camera handoff is treated by warping the person’s appearance in the new view by means of homography. Zweng et al. [44] detect falls using multiple cameras. Each of the camera inputs results in a separate fall confidence (so no external camera calibration is needed). These confidences are then combined into an overall decision. Hung et al. [16, 17] propose using the measures of humans’ heights and occupied areas to distinguish three typical states of humans: standing, sitting and lying. Two relatively orthogonal views are utilized, which simplifies the estimation of occupied areas as the product of the widths of the same person observed in the two cameras. 3.3 3D-based Methods Using Depth Cameras The earliest depth camera used for fall detection is the Time-Of-Flight 3D camera [13]. Since Time-Of-Flight 3D cameras are expensive, very few researchers adopted them for fall detection. But this situation has changed since the advent o... |
1 |
Fall detection system using Kinect’s infrared sensor.
- Mastorakis, Makris
- 2014
Citation Context ...he human centroid height relative to the ground and the body velocity are used to determine if a fall has occurred. Michal et al. [19] use a ceiling-mounted 3D depth camera to detect falls. A KNN classifier is used to distinguish the lying pose from common daily activities based on features including head to floor distance, person area and shape’s major length to width. Human motion analysis is further employed to distinguish between intentional lying postures and accidental falls. Analyzing how a human has moved during the last frames in a world coordinate system is another commonly used method [35, 42, 41, 24]. Eric et al. [35] propose a two-stage fall detection method. The first stage of the system characterizes the vertical state of a segmented 3D object for each frame, and then identifies on-ground events through temporal segmentation of the vertical state time series of tracked 3D objects. The second stage of the system utilizes an ensemble of decision trees and features extracted from an on-ground event to compute a confidence that a fall preceded it. Zhong et al. [42] propose a statistical method based on Kinect depth cameras that makes a decision based on information about how the human move... |
1 | Evaluating depth-based computer vision methods for fall detection under occlusions.
- Zhang, Conly, et al.
- 2014
Citation Context ...wo falls along each direction in each viewpoint in the EDF dataset. So, there are 160 falls in each viewpoint and 320 falls in total. In the EDF dataset, subjects also performed a total of 100 actions that tend to produce features similar to those of a fall event, namely: 20 examples of picking up something from the floor, 20 examples of sitting on the floor, 20 examples of lying down on the floor, 20 examples of tying shoelaces and 20 examples of doing plank exercise. The dataset was recorded at a resolution of 320x240 pixels per frame and at a frame rate of about 25 frames per second. OCCU [41]: Two Kinect depth cameras were set up at two corners of a simulated apartment to collect occluded falls. An occluded fall is a fall whose end can be completely occluded by an object, such as a bed. Each of the 5 subjects performed six occluded falls in each viewpoint in the OCCU dataset. The OCCU dataset includes 25,618 frames and 30 totally occluded falls in videos from the first viewpoint, and 23,703 frames and 30 totally occluded falls in videos from the second viewpoint performed by the same subjects. Each viewpoint was recorded at separate times from the other viewpoint, an... |