Object segmentation from motion discontinuities and temporal occlusions--a biologically inspired model

Cornelia Beck et al. PLoS One. 2008;3(11):e3807. doi: 10.1371/journal.pone.0003807. Epub 2008 Nov 27.
Abstract

Background: Optic flow is an important cue for object detection. Humans can perceive objects in a scene from kinetic boundaries alone, even when no other shape cues are provided. These kinetic boundaries are characterized by motion discontinuities in a local neighbourhood. In addition, temporal occlusions appear along the boundaries as the object in front covers the background and any objects spatially behind it.

Methodology/principal findings: From a technical point of view, the detection of motion boundaries for segmentation based on optic flow is a difficult task, because the flow estimated along such boundaries is generally unreliable. We propose a model derived from mechanisms found in visual areas V1, MT, and MSTl of human and primate cortex that achieves robust detection along motion boundaries. It includes two separate mechanisms: the detection of motion discontinuities, based on neurons that respond to spatial contrast, and the detection of occlusion regions, based on neurons that respond to temporal contrast. The mechanisms are embedded in a biologically inspired architecture that integrates information from different model components of visual processing via feedback connections. In particular, mutual interactions between the detection of motion discontinuities and temporal occlusions considerably improve kinetic boundary detection.

Conclusions/significance: A new model is proposed that uses optic flow cues to detect motion discontinuities and object occlusions. We suggest that combining the results for motion discontinuities and object occlusions improves object segmentation within the model; this idea could also be applied in other models for object segmentation. In addition, we discuss how the model relates to neurophysiological findings. The model was successfully tested on both artificial and real sequences that include self-motion and object motion.


Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. 3D scenario with two objects.
This figure depicts a typical scenario for a person moving in a room. A static object (green) and a moving object (blue) are located in the room in front of the background. On the left, static occlusion regions with respect to the observer's perspective are marked with a gray overlay. Due to the spatial configuration, the green object partly covers the blue one, and both objects occlude the background texture. When the observer moves forward, an expanding flow field is generated, onto which the translational movement of the blue object is superimposed. The optic flow, i.e. the projection of the 3D flow onto the projection plane, is shown. The alignment of the objects in the 2D projection is shown on the right, together with the kinetic occlusions generated by the movement of the blue object: on its left side, background texture is uncovered (disocclusion); on its right side, it is temporarily covered (occlusion). Note that the expanding flow leads to further kinetic occlusion regions along the outlines of both objects; for simplicity, these are not included in the sketch.
Figure 2. Sketch of the biologically inspired model.
V1Model Motion and MTModel Motion represent the basic modules for optic flow estimation. In TOModel, regions that have been occluded or disoccluded are estimated. In MSTlModel, motion discontinuities are computed from MTModel input using spatial on-center-off-surround receptive fields. The information from areas MSTlModel, TOModel, and V2Model is combined in a higher-level processing area (HLPModel). Feedforward connections are depicted with dark blue arrows, feedback connections with light blue arrows. The interactions between MSTlModel and TOModel are depicted with green arrows.
Figure 3. Optic flow estimation at occlusions.
Occlusions cause problems for motion estimation algorithms based on the correlation between only two frames: parts of the image are visible in only one of the frames, so no corresponding image positions can be found at these locations. This problem can be solved with a single additional, temporally forward-looking step (future step).
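To make the role of the future step concrete, here is a minimal Python sketch. It stands in for the model's correlation stage with simple SSD block matching (the paper's actual mechanism is a V1/MT motion-energy detector, not block matching); the function names, patch radius, and bounds handling are illustrative assumptions.

```python
import numpy as np

def ssd_cost(ref, tgt, y, x, dy, dx, r=3):
    # Sum of squared differences between a patch in `ref` centered at (y, x)
    # and the patch in `tgt` displaced by (dy, dx); inf if out of bounds.
    h, w = ref.shape
    if not (r <= y < h - r and r <= x < w - r):
        return np.inf
    if not (r <= y + dy < h - r and r <= x + dx < w - r):
        return np.inf
    a = ref[y - r:y + r + 1, x - r:x + r + 1].astype(float)
    b = tgt[y + dy - r:y + dy + r + 1, x + dx - r:x + dx + r + 1].astype(float)
    return float(np.sum((a - b) ** 2))

def match_with_future_step(prev, cur, nxt, y, x, max_d=2):
    # Score each candidate displacement against BOTH the past pair
    # (cur -> prev, sign-flipped) and the future pair (cur -> nxt), and keep
    # the better score: a pixel covered in `nxt` (occlusion) can still be
    # matched backwards into `prev`, and a pixel missing from `prev`
    # (disocclusion) can be matched forwards into `nxt`.
    best, best_cost = (0, 0), np.inf
    for dy in range(-max_d, max_d + 1):
        for dx in range(-max_d, max_d + 1):
            cost = min(ssd_cost(cur, prev, y, x, -dy, -dx),
                       ssd_cost(cur, nxt, y, x, dy, dx))
            if cost < best_cost:
                best_cost, best = cost, (dy, dx)
    return best  # (dy, dx) displacement estimate at (y, x)
```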
Figure 4. Detection of motion discontinuities.
Some examples of motion discontinuities are given at the bottom left. We use a motion discontinuity detector built from an on-center-off-surround receptive field (RF) that responds strongly when center and surround motion differ; a homogeneous flow field produces only a weak response.
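A minimal sketch of such a detector, assuming a dense flow field is already available (e.g., from the MTModel stage). The center/surround sizes and the box-filter approximation of the off-surround are illustrative choices, not the paper's parameters.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def motion_discontinuity(flow, center=3, surround=11):
    # `flow` has shape (H, W, 2) with per-pixel (vy, vx). The response is the
    # magnitude of the difference between a small center average and a larger
    # surround average of the flow: large where center and surround motion
    # differ, near zero on a homogeneous flow field.
    c = np.stack([uniform_filter(flow[..., i], size=center)
                  for i in range(2)], axis=-1)
    s = np.stack([uniform_filter(flow[..., i], size=surround)
                  for i in range(2)], axis=-1)
    return np.linalg.norm(c - s, axis=-1)

# Example: the right half of the field moves right, the left half is static;
# the response peaks along the vertical kinetic boundary at x = 16.
flow = np.zeros((32, 32, 2))
flow[:, 16:, 1] = 1.0
resp = motion_discontinuity(flow)
```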
Figure 5. Detection of occlusion regions.
To detect occlusions and disocclusions in the motion sequence, we compare, at each spatial position, the motion energy estimated from the past frame pair t−1/t0 with that estimated from the future frame pair t0/t1. A large difference typically occurs at occlusion and disocclusion positions, because regions visible only in t−1 or t1 yield very ambiguous motion estimates.
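The comparison can be sketched as follows. As a stand-in for the model's motion energy, this sketch uses a local zero-displacement matching residual, which is an assumption rather than the paper's mechanism; the sign convention (positive for disocclusion, negative for occlusion) is likewise just a choice for the sketch.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def temporal_contrast(prev, cur, nxt, size=5):
    # How well is each pixel of the current frame explained by the past pair
    # (t-1, t0) versus the future pair (t0, t1)? Regions just uncovered
    # (disocclusions) are absent from the past frame, so the past residual
    # is large -> positive response. Regions about to be covered (occlusions)
    # are absent from the future frame, so the future residual is large ->
    # negative response. Responses near zero mean both pairs explain the
    # pixel equally well.
    past_err = uniform_filter((cur.astype(float) - prev) ** 2, size=size)
    future_err = uniform_filter((cur.astype(float) - nxt) ** 2, size=size)
    return past_err - future_err
```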
Figure 6. Overview of mechanisms for scene interpretation.
Top row: The optic flow of the input image is computed in V1Model and MTModel, and spatial contrast neurons in MSTlModel compute the motion discontinuities. Based on the detected motion boundaries, a simple filling-in mechanism provides a scene segmentation. Bottom row: In TOModel, input from V1Model neurons is used in a temporal on-center-off-surround processing step to detect occlusion and disocclusion regions. In HLPModel, these regions are restricted to the motion discontinuities or to the luminance contours provided by V2Model, in order to find the object adjacent to the occlusion region, namely the occluder. The results of the object segmentation are used to find the label of the corresponding object (indicated by the arrow from the top row, third column). From these data, the corresponding depth order can be computed. Interactions between MSTlModel and TOModel are not depicted in this figure.
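The filling-in step can be illustrated with a toy version: threshold the discontinuity map and label each connected non-boundary region as one segment. The threshold value and the use of SciPy's connected-component labeling are assumptions for this sketch; the model itself uses a neural filling-in mechanism.

```python
import numpy as np
from scipy.ndimage import label

def fill_in_segmentation(discontinuity, thresh=0.5):
    # Positions whose motion-discontinuity response exceeds `thresh` act as
    # boundaries; every connected region of non-boundary positions receives
    # one label, so the background and each object become separate segments.
    interior = discontinuity < thresh
    segments, n_segments = label(interior)
    return segments, n_segments
```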
Figure 7. Experiment 1: Flowergarden sequence.
A) Input image. B) Optic flow estimated in area MTModel; direction is indicated by a color code, speed by the corresponding saturation. C) Motion discontinuities appear due to the faster optic flow on the tree and along regions where no movement is indicated, such as the sky. D) TOModel responds strongly along the contours of the tree trunk, as the trunk occludes parts of the background during the translational self-motion (white indicates disocclusion areas, black occlusion areas). The results shown here include feedback from MSTlModel neurons.
Figure 8. Experiment 2: Moving boxes.
Results for an input sequence with 5 boxes and the background all moving in different directions. A) Input image with arrows indicating the movement of the objects; the background is slowly moving to the left. B) Mean optic flow estimated in area MTModel, marked with a color code superimposed on the input image. C) The detected occlusion (black) and disocclusion (white) regions. Note that, depending on the direction of the object movement, these regions appear all along the object boundaries or only on two sides (for movement in the vertical or horizontal direction). D) Contours of the objects as provided by V2Model Form. This activity is used to clearly localize each occlusion boundary to the corresponding occluder. E) A clear segmentation of the object boundaries is achieved using the motion discontinuities detected with MSTlModel on-center-off-surround neurons. F) After the detected boundaries have been grouped and filled in, the image is segmented into regions representing the objects of the scene. G) Classification of object movement. The difference between object and background motion is computed as explained in the Methods section. Light object boundaries indicate a strong difference; darker outlines represent a movement similar to the background. Note that objects 2 and 5 have a strong motion contrast to the background despite their similar movement direction, because their speed is much higher than that of the background. H) The relative depth order derived automatically from the scene. A confidence value gives the probability that the depth order is correct (indicated in percent). It is derived from the number of positions belonging to the object that indicate the object is in front (#posfront) and the number that indicate it is in the background (#posbg): conf = max(#posfront, #posbg)/(#posfront + #posbg).
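The confidence value from the caption is straightforward to compute; the function name and the example vote counts below are illustrative.

```python
def depth_confidence(pos_front, pos_bg):
    # conf = max(#posfront, #posbg) / (#posfront + #posbg), as defined in
    # the Figure 8 caption: the fraction of an object's positions that
    # agree with the winning depth-order vote.
    return max(pos_front, pos_bg) / (pos_front + pos_bg)

# Example: 45 positions vote "in front", 5 vote "behind" -> 45/50 = 90%.
print(depth_confidence(45, 5))  # 0.9
```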
Figure 9. Experiment 3: Independently moving object in a scene with a moving observer.
A) Input image of the sequence (generated in the XVR environment, download at www.vrmedia.it); the gray arrow indicates the movement of the independently moving object. B) The optic flow in area MTModel; the object movement correctly indicates a translation to the right. C) Occlusions and disocclusions are correctly detected on the right and left sides of the object, respectively. The results shown here include feedback from MSTlModel. D) Motion discontinuities computed by MSTlModel on-center-off-surround neurons show the object boundary. E) After the grouping and filling-in step, the object can be segmented.
Figure 10. Experiment 4: City view through a window.
Artificially generated scene with a background moving to the left while the aperture is fixed. A) One image of the input sequence. B) The mean optic flow as detected in MTModel. C) The movement generates occlusions on the left (black positions) and disocclusions on the right side (white positions). D) The motion discontinuities show the complete object boundary. E) After segmentation, two objects are detected, depicted in different colors: the aperture (gray) and the region within the window (white). F) The occluder corresponding to each occlusion position, with respect to the objects segmented as shown in E); the colors indicate the assignment. Most positions correctly indicate the aperture as the object causing the occlusion.
Figure 11. Experiment 5: Rotating rectangle.
A bar is rotating around its center in front of a stationary background. A) Input image of the sequence. B) The motion estimates of area MTModel. C) Disocclusion regions appear at the upper left and the lower right; occlusions, in contrast, are found at the lower left and the upper right. This diagonal pattern is due to the rotational movement of the object. The result shown here is without feedback from motion discontinuities. D) The motion boundary is correctly detected using the motion discontinuities; however, MSTlModel neurons also respond strongly in the object center, where the movement switches from zero to the smallest movement the model can detect. E) When the interaction between occlusion and motion discontinuity detection is included, the erroneously detected central part is erased. F) Occlusion regions are correctly restricted due to feedback from the motion discontinuity neurons shown in D. The feedback is slightly blurred, as occlusion regions may be significantly bigger than motion discontinuities.
Figure 12. Experiment 6: Detection of moving objects in a real sequence.
A) Input image of the sequence, showing two objects moving in opposite directions and a translational camera movement upwards. B) Mean optic flow estimated in area MTModel; the direction of movement is depicted with the color code shown in the top right corner. C) In the movement direction of the objects, the dark regions represent the detected occlusions; behind the objects, white positions indicate the disoccluded regions. Due to the higher object speed, these regions are bigger than in the other experiments. Because of the noise in the scene, the estimates also get noisier, but the overall response still reflects the correct occlusion and disocclusion regions. D) The motion discontinuities, including temporal integration (three frames used), clearly indicate the object boundaries. E) After grouping, the scene is segmented into background (black) and the two objects (gray and white). The motion discontinuities in D) in the upper left and lower right parts are not consistent with the detected kinetic occlusions; the results in E), after the interaction with TOModel, thus correctly indicate only 2 objects. F) Comparison of motion discontinuity results without (left column) and with (right column) temporal integration. Without temporal integration, the quality of the motion discontinuities is reduced: for example, the gap in the smaller object at the lower left corner can only be closed using temporal integration (first row, position indicated in light blue in D). The outline of the other object also becomes straighter (second row, position indicated in red in D).
