[Preprint]. 2025 May 13:arXiv:2505.06686v2.

Detection of moving objects using self-motion constraints on optic flow


Hope Lutwak et al. arXiv.

Abstract

As we move through the world, the pattern of light projected on our eyes is complex and dynamic, yet we are still able to distinguish between moving and stationary objects. We propose that humans accomplish this by exploiting constraints that self-motion imposes on retinal velocities. When an eye translates and rotates in a stationary 3D scene, the velocity at each retinal location is constrained to a line segment in the 2D space of retinal velocities. The slope and intercept of this segment are determined by the eye's translation and rotation, and the position along the segment is determined by local scene depth. Since all velocities arising from a stationary scene must lie on this segment, any velocity that does not must correspond to an object moving within the scene. We hypothesize that humans detect moving objects by exploiting these constraints, using deviations of local velocity from the constraint lines. To test this, we used a virtual reality headset to present rich wide-field stimuli, simulating the visual experience of translating forward through several virtual environments that varied in the precision of available depth information. Participants had to determine whether a cued object moved relative to the scene. Consistent with the hypothesis, we found that performance depended on the deviation of the object velocity from the constraint segment, rather than on the difference between the retinal velocities of the object and its local surround. We also found that the endpoints of the constraint segment reflected the precision of depth information available in the different virtual environments.
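To make the geometry concrete, the constraint can be written with the standard motion-field equations of Longuet-Higgins and Prazdny (a sketch in our notation, with focal length f; not taken verbatim from the paper): a stationary point at depth Z, seen at retinal position (x, y) by an eye translating with T and rotating with Ω, has image velocity

    \mathbf{v}(x,y) \;=\; \frac{1}{Z}
    \begin{pmatrix} -f & 0 & x \\ 0 & -f & y \end{pmatrix} \mathbf{T}
    \;+\;
    \begin{pmatrix} xy/f & -(f + x^2/f) & y \\ f + y^2/f & -xy/f & -x \end{pmatrix} \boldsymbol{\Omega}

For fixed T and Ω, this expression is affine in the single scalar 1/Z: the rotational term sets the segment's offset, the translational term sets its orientation, and sweeping 1/Z over the plausible depth range traces out the constraint segment described above.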

Keywords: ego-motion; local motion; moving object detection; optic flow; virtual reality.


Figures

Figure 1: Simulated optic flow field for an observer moving in a rigid environment.
(A) Photograph of an outdoor scene, taken from Burge and Geisler (2011). (B) Depth map of the image in panel A (intensity proportional to log depth), as measured with LiDAR. (C) Calculated optic flow vectors for an observer moving forward and upward. Two highlighted vectors correspond to an artificially inserted, independently moving object (red) and a location in the scene adjacent to a large depth discontinuity. Both highlighted velocities differ substantially (in both speed and direction) from their average surrounding velocity.
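A flow field like the one in panel C can be computed directly from a depth map. The following is a minimal sketch (our Python/NumPy code, not the authors'), assuming perspective projection with focal length f, image coordinates centered on the optic axis, and the standard motion-field equations:

    import numpy as np

    def flow_from_depth(Z, f, T, Omega):
        """Optic flow (vx, vy) at every pixel, given a depth map Z (meters),
        focal length f (pixels), observer translation T = (Tx, Ty, Tz), and
        rotation Omega = (Ox, Oy, Oz)."""
        h, w = Z.shape
        # Pixel coordinates with the optic axis through the image center.
        x, y = np.meshgrid(np.arange(w) - w / 2, np.arange(h) - h / 2)
        Tx, Ty, Tz = T
        Ox, Oy, Oz = Omega
        # Translational component scales with inverse depth; rotational
        # component is depth-independent.
        vx = (-f * Tx + x * Tz) / Z + (x * y / f) * Ox - (f + x**2 / f) * Oy + y * Oz
        vy = (-f * Ty + y * Tz) / Z + (f + y**2 / f) * Ox - (x * y / f) * Oy - x * Oz
        return vx, vy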
Figure 2: Eye-centered projective geometry defines the depth constraint line on image velocity.
(A) Schematic of a single fixating observer undergoing 3D translation T and rotation Ω. A stationary object in the world (teacup) is projected onto an image plane (gray skewed rectangle) at distance f from the eye. The velocity v = (vx, vy) of the projected object on the image plane is determined by both the translation and the rotation of the observer. If the object is moving independently in the world, that motion also contributes to its velocity on the image plane. (B) The velocity at each image location depends on the distance (depth) of the corresponding 3D world location, and the set of possibilities lies along a single depth constraint line segment (solid black line) in the 2D space of velocities.
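The constraint segment of panel B follows from the same geometry. A minimal sketch, assuming the standard motion-field parameterization above (the function names and the depth bounds Z_near, Z_far are our illustrative choices):

    import numpy as np

    def velocity_at_depth(x, y, f, T, Omega, inv_Z):
        """Image velocity of a stationary point at inverse depth inv_Z = 1/Z,
        seen at image position (x, y), for eye translation T and rotation Omega."""
        Tx, Ty, Tz = T
        Ox, Oy, Oz = Omega
        vx = inv_Z * (-f * Tx + x * Tz) + (x * y / f) * Ox - (f + x**2 / f) * Oy + y * Oz
        vy = inv_Z * (-f * Ty + y * Tz) + (f + y**2 / f) * Ox - (x * y / f) * Oy - x * Oz
        return np.array([vx, vy])

    def constraint_segment(x, y, f, T, Omega, Z_near, Z_far):
        """Endpoints of the depth constraint segment in 2D velocity space:
        the image velocities of stationary points at the nearest and farthest
        plausible depths."""
        return (velocity_at_depth(x, y, f, T, Omega, 1.0 / Z_near),
                velocity_at_depth(x, y, f, T, Omega, 1.0 / Z_far))

Because v is linear in 1/Z, every depth between Z_near and Z_far maps to a point on the straight segment between these endpoints; as Z_far grows toward infinity, the translational term vanishes and the endpoint approaches the purely rotational velocity.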
Figure 3: Binocular views of VR environments.
(A) Full stimulus with a 1/f textured ground plane and randomly placed cubes. (B) Monocular stimulus with textured ground plane and cubes, shown to the right eye. (C) Spheres stimulus with 0.025 m diameter spheres scattered on the ground plane, as well as one sphere placed at the location of each cube from the full and monocular conditions.
Figure 4: Target object speeds and directions tested.
(A) Bird’s eye view of target motions in the X-Z plane. Z+ directions point away from the observer; Z- directions point toward the observer. (B) Optic flow for a translating, fixating observer in an environment consisting of a textured ground plane and cubes, with a target cube moving independently at 0.3 m/s away from the observer (90°; outlined red point in panel A). Velocities of points on the target cube (red) and surround velocities within a 3.5° radius (dark gray) are shown; velocity vectors are scaled up by a factor of 20 for visualization. (C) Moving target (red) and surround (gray) velocity vectors, redrawn in a common coordinate system, along with the constraint segments associated with the target locations. For this target motion, the target velocities are similar to those of the surround but far from the constraint segments. (D) Target and surround velocities for a target moving at 0.1 m/s largely toward the observer (285°; outlined blue point in panel A). In this condition, target velocities lie near the constraint segments but far from the surround velocities.
Figure 5: Distance to the constraint versus distance to the surround.
(A) Left: velocities of target points (red) for a target moving 0.3 m/s away from the observer, along with the constraint segments for each velocity (black). Right: the psychometric fit, with data from all target motion conditions (inset) arranged according to distance to the constraint. The circled data point corresponds to the target velocities plotted on the left. (B) Left: same target velocities as in panel A (red), now shown with the corresponding surround velocities (gray). Right: psychometric fit based on the distance to the surround. (C) Comparison of the deviance of fits based on distance to the constraint versus distance to the surround, for each subject and stimulus condition. In every comparison, the distance to the constraint yields a much better fit (lower deviance) than the distance to the surround.
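The two decision variables compared here can be operationalized as point-to-segment and point-to-mean distances in 2D velocity space. A sketch under our assumptions (the paper's exact definitions, e.g., whether the surround is summarized by its mean, may differ):

    import numpy as np

    def dist_to_constraint(v, p_near, p_far):
        """Euclidean distance from an observed 2D velocity v to the constraint
        segment with endpoints p_near and p_far (all shape-(2,) arrays)."""
        d = p_far - p_near
        denom = float(np.dot(d, d))
        # Degenerate segment (e.g., no translation): distance to a single point.
        t = 0.0 if denom == 0.0 else float(np.clip(np.dot(v - p_near, d) / denom, 0.0, 1.0))
        return float(np.linalg.norm(v - (p_near + t * d)))

    def dist_to_surround(v, surround_velocities):
        """Distance from v to the mean of the surrounding velocities (the
        competing local-contrast decision variable)."""
        return float(np.linalg.norm(v - np.mean(surround_velocities, axis=0)))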
Figure 6: Varying the mean depth and depth range.
(A) Illustration of the effect of the mean depth estimate on the velocity constraint segment. Shifting the mean depth of the target farther away yields constraint segments that are closer to the target velocity (i.e., the depth-dependent velocity component of Eq. 4 is reduced). (B) Illustration of the effect of increasing the depth range from a factor of 1.05 to a factor of 1.3. (C) Optimal depth constraint segments for all 8 subjects (identified by color) in each of the three VR environments. For all subjects, the fitted depth range was broader and shifted farther away for the spheres stimulus.
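One plausible way to parameterize the fitted segment (our notation; not reproduced from the paper's Eq. 4) is by a mean depth \bar{Z} and a range factor \rho \ge 1, with endpoints at the velocities of the nearest and farthest assumed depths:

    Z_{\text{near}} = \bar{Z}/\rho, \qquad Z_{\text{far}} = \bar{Z}\,\rho

Because the translational component of image velocity scales with 1/Z, increasing \bar{Z} shrinks that component and pulls the whole segment toward the purely rotational velocity (panel A), while increasing \rho from 1.05 to 1.3 spreads the endpoint inverse depths apart and lengthens the segment (panel B).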


