Proc Natl Acad Sci U S A. 2019 Nov 12;116(46):22959-22965.
doi: 10.1073/pnas.1912154116. Epub 2019 Oct 28.

Compact single-shot metalens depth sensors inspired by eyes of jumping spiders

Qi Guo et al. Proc Natl Acad Sci U S A. 2019.

Abstract

Jumping spiders (Salticidae) rely on accurate depth perception for predation and navigation. They accomplish depth perception, despite their tiny brains, by using specialized optics. Each principal eye includes a multitiered retina that simultaneously receives multiple images with different amounts of defocus, and from these images, distance is decoded with relatively little computation. We introduce a compact depth sensor that is inspired by the jumping spider. It combines metalens optics, which modify the phase of incident light at a subwavelength scale, with efficient computations to measure depth from image defocus. Instead of using a multitiered retina to transduce multiple simultaneous images, the sensor uses a metalens to split the light that passes through an aperture and concurrently form 2 differently defocused images at distinct regions of a single planar photosensor. We demonstrate a system that deploys a 3-mm-diameter metalens to measure depth over a 10-cm distance range, using fewer than 700 floating point operations per output pixel. Compared with previous passive depth sensors, our metalens depth sensor is compact, single-shot, and requires a small amount of computation. This integration of nanophotonics and efficient computation brings artificial depth sensing closer to being feasible on millimeter-scale, microwatt platforms such as microrobots and microsensor networks.
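The defocus cue the sensor exploits follows from the standard thin-lens relation (the lens equation referenced as Eq. 1 in Fig. 2): blur width grows with the mismatch between the object distance and the in-focus distance. A minimal Python sketch of that relation, using illustrative parameter values rather than the paper's calibration:

```python
def blur_radius(Z, Z_f, Z_s, Sigma):
    """Geometric defocus blur radius on the photosensor.

    For a thin lens of aperture radius Sigma with the sensor at distance
    Z_s behind the lens and focused at distance Z_f, a point at distance
    Z blurs to a disk of radius
        sigma = Sigma * Z_s * |1/Z_f - 1/Z|
    (all lengths in meters). This follows from the lens equation by
    similar triangles.
    """
    return Sigma * Z_s * abs(1.0 / Z_f - 1.0 / Z)

# Illustrative numbers only (not the paper's design): 1.5-mm aperture
# radius, sensor 3.3 mm behind the lens, focused at 0.35 m.
Sigma, Z_s, Z_f = 1.5e-3, 3.3e-3, 0.35
for Z in (0.30, 0.35, 0.40):
    print(f"Z = {Z:.2f} m -> sigma = {blur_radius(Z, Z_f, Z_s, Sigma) * 1e6:.2f} um")
```

The blur vanishes at the in-focus distance and grows on either side of it, which is what makes defocus usable as a depth cue.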

Keywords: depth sensor; jumping spider; metalens.

Conflict of interest statement

Competing interest statement: F.C. is a cofounder of Metalenz. D.S. is on the Scientific Advisory Board of Metalenz.

Figures

Fig. 1.
Jumping spider and metalens depth sensor. (A) Jumping spiders can sense depth using either 1 of their 2 front-facing principal eyes (highlighted). Unlike the single retina found in human eyes, jumping spiders have multiple retinae that are layered and semitransparent. The layered-retinae structure can simultaneously measure multiple images of the same scene with different amounts of defocus, and behavioral evidence suggests that spiders measure depth using the defocus cues that are available in these images (26). (B) The metalens depth sensor estimates depth by mimicking the jumping spider. It uses a metalens to simultaneously capture 2 images with different defocus, and it uses efficient calculations to produce depth from these images. The jumping spider’s depth perception operates normally under green light (26), and we similarly designed the metalens to operate at a wavelength of 532 nm. We coupled the metalens with a spectral filter to limit the spectral bandwidth and with a rectangular aperture to prevent overlap between the 2 adjacent images. The images depicted on the photosensor were taken from experiments and show 2 fruit flies located at different distances. The corresponding depth map computed by the sensor is shown on the right, with color used to represent object distance. The closer and farther flies are colored red and blue, respectively.
Fig. 2.
Operating principle. (A) A conventional thin-lens camera, in which the PSF width σ on the photosensor is determined by the optics and the depth Z (the object distance) according to the lens equation (Eq. 1). Z_s is the distance between the lens and the photosensor. Z_f is the in-focus distance. Σ is the entrance pupil (lens) radius. The solid black curve next to the photosensor represents a vertical cut of the PSF h, which is drawn here with a Gaussian shape. (B) The metalens depth sensor encodes the phase profiles of 2 thin lenses in 1 aperture. The 2 effective lenses have distinct in-focus distances (Z_f⁺, Z_f⁻) (red and blue) and off-axis alignments that create 2 adjacent images (I⁺, I⁻) with different PSF widths (σ⁺, σ⁻). The effective image centers are shifted from the optical axis by ±D. The dashed red and blue curves next to the metalens show the transmitted wavefronts. Due to spatial multiplexing, the overall phase profile is highly discontinuous and therefore cannot be easily achieved with conventional (Fresnel) diffractive optical elements. (C) From a pair of input images (I⁺, I⁻), a small set of calculations was used to produce the depth at each pixel across the image, generating a depth map Z(x, y) according to Eq. 5. A confidence map C(x, y) that indicates the precision of the depth prediction at each pixel was computed alongside, according to Eq. 6. The computation flows from left to right, beginning with the per-pixel mean Ī = ½(I⁺ + I⁻) and difference δI = I⁺ − I⁻; the Laplacian of the average image, ∇²Ī, computed by convolving the average image with a discrete Laplacian filter; and convolution with a band-pass filter F to attenuate noise and vignetting. From F∗∇²Ī and F∗δI, the depth and confidence maps Z and C were computed by Eqs. 5 and 6. Parameters α, β, γ₁, γ₂, and γ₃ were determined by the optics and were precalibrated. To eliminate large errors in the depth map, we thresholded it by showing only pixels with confidence values greater than 0.5.
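The per-pixel computation described in C can be sketched in Python with NumPy. The exact forms of Eqs. 5 and 6 and the precalibrated parameters are not reproduced in the caption, so the rational depth formula and the magnitude-based confidence below are placeholders, as is the box filter standing in for the band-pass filter F:

```python
import numpy as np

def conv2(img, k):
    """Same-size 2-D correlation with a small odd-sized kernel, zero padding.
    (Identical to convolution for the symmetric kernels used here.)"""
    kh, kw = k.shape
    p = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros(img.shape, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * p[i:i + img.shape[0], j:j + img.shape[1]]
    return out

# Discrete 5-point Laplacian filter.
LAPLACIAN = np.array([[0., 1., 0.],
                      [1., -4., 1.],
                      [0., 1., 0.]])

def depth_and_confidence(I_plus, I_minus, F, alpha, beta, eps=1e-9):
    """Per-pixel depth and confidence from a differently defocused image pair."""
    I_avg = 0.5 * (I_plus + I_minus)        # per-pixel mean image
    dI = I_plus - I_minus                   # per-pixel difference image
    A = conv2(conv2(I_avg, LAPLACIAN), F)   # F * (Laplacian of mean image)
    B = conv2(dI, F)                        # F * (difference image)
    Z = alpha + beta * B / (A + eps)        # placeholder form for Eq. 5
    C = np.abs(A)                           # placeholder confidence for Eq. 6
    return Z, C

rng = np.random.default_rng(0)
I_plus, I_minus = rng.random((32, 32)), rng.random((32, 32))
F = np.ones((5, 5)) / 25.0                  # box filter standing in for F
Z, C = depth_and_confidence(I_plus, I_minus, F, alpha=0.35, beta=0.05)
print(Z.shape, C.shape)
```

The per-pixel cost is a handful of small convolutions and one division, consistent with the abstract's point that only a few hundred floating point operations per output pixel are needed.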
Fig. 3.
Metalens design. (A) Transmission efficiency and phase shift as a function of the nanopillar width. A, Inset shows the schematic of the metalens building block: a square titanium dioxide (TiO2) nanopillar on a glass substrate. Pillar height: H = 600 nm. Lattice unit cell size (center-to-center distance between neighboring nanopillars): U = 230 nm. By varying the pillar width (W) from 90 to 190 nm, the phase shift changes from 0 to 2π, and the transmission remains high. (B) Top-view SEM image of the right portion of a fabricated metalens. (Scale bar: 2 μm.) (C) Enlarged view of the highlighted region in B, with nanopillars corresponding to the 2 lens-phase profiles marked with red and blue. (Scale bar: 500 nm.) (D) Side-view SEM image of the edge of the metalens showing that the nanopillars have vertical sidewalls. (Scale bar: 200 nm.)
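The spatial multiplexing shown in Fig. 2B and the red/blue pillar assignment in C amount to interleaving 2 off-axis lens phase profiles within 1 aperture. A Python sketch, assuming ideal hyperboloidal lens phases and a checkerboard interleaving pattern; the actual layout and design parameters are not given in the caption, so the focal lengths and shifts below are illustrative only:

```python
import numpy as np

LAM = 532e-9  # design wavelength (m), matching the green operating light

def lens_phase(x, y, f, x0=0.0):
    """Ideal (hyperboloidal) lens phase for focal length f with the
    optical axis shifted to x = x0, returned modulo 2*pi -- the range a
    single nanopillar covers as its width varies from 90 to 190 nm."""
    r2 = (x - x0) ** 2 + y ** 2
    phi = -(2.0 * np.pi / LAM) * (np.sqrt(r2 + f ** 2) - f)
    return np.mod(phi, 2.0 * np.pi)

# Coarse sampling of the 3-mm aperture (the real 230-nm lattice would
# require a grid of roughly 13,000 x 13,000 pillars).
n = 201
x = np.linspace(-1.5e-3, 1.5e-3, n)
X, Y = np.meshgrid(x, x)

# Two effective lenses with different focal lengths and opposite
# off-axis shifts (illustrative values, not the paper's design).
phi_a = lens_phase(X, Y, f=3.0e-3, x0=+0.4e-3)
phi_b = lens_phase(X, Y, f=3.3e-3, x0=-0.4e-3)

# Checkerboard interleaving of the two profiles (assumed layout).
mask = (np.add.outer(np.arange(n), np.arange(n)) % 2) == 0
phi = np.where(mask, phi_a, phi_b)
print(phi.shape)
```

Because neighboring pillars alternate between the two profiles, the combined phase map is highly discontinuous, which is why the caption of Fig. 2 notes it cannot be realized with conventional Fresnel diffractive optics.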
Fig. 4.
Performance analysis. (A) PSFs corresponding to the 2 images (I⁺, I⁻), measured by using green LED point light sources placed at different distances Z in front of the metalens. A spectral filter was used to limit the light bandwidth (10-nm bandwidth centered at 532 nm). The asymmetry in the PSFs results from chromatic aberration and can be eliminated by using a monochromatic laser source (SI Appendix, Fig. S8). (B) Depth Z measured by the metalens sensor as a function of known object distance. Different colors correspond to different confidence thresholds. The solid curves are the mean Z̄ of the measured depth over many different object points located at the same known distance. The upper and lower boundaries of the shaded regions are the corresponding mean deviations of the measured depth, |Z − Z̄| averaged over the same points. In obtaining both Z̄ and the mean deviation, only pixels whose confidence values are above the threshold are counted; the mean deviation is thus smaller for larger confidence thresholds. The solid black line represents the ideal depth measurements (i.e., those equal to the known distances), and the dashed black lines represent ±5% relative differences between the measured depth and the known object distance. Within the distance range of 0.3 to 0.4 m, the measured depth is close to the ideal measurements; the mean deviation over this range is around 5% of the object distance for a confidence threshold of 0.5. Beyond this range, the measured depth trends toward constant values that do not depend on object distance, as indicated by the plateaus on the left and right. At these distances, the captured images I⁺, I⁻ are too blurry to provide useful contrast information for the depth measurement.
Fig. 5.
Input images and output depth maps. The sensor produces real-time depth and confidence maps of 400 × 400 pixels at >100 frames per second. (A and B) It can measure fast-moving objects such as fruit flies (A) and water streams (B) because the 2 images (I⁺, I⁻) are captured in a single shot instead of sequentially over time. (B and C) It can also measure translucent structures such as water streams (B) and flames (C) because it relies only on ambient light instead of reflections from a controlled light source. (D) A slanted plane with text illustrates the difference in defocus between the 2 images (I⁺, I⁻). The color bar is in meters. The images and depth maps for scenes A, B, and D were produced by illuminating the scenes with a green LED. Depth maps were thresholded at confidence greater than 0.5, the threshold that yields a mean deviation of about 5% of object distance between 0.3 and 0.4 m in Fig. 4B. Additional images and videos are available in SI Appendix.
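The confidence thresholding applied to all of the depth maps above amounts to a per-pixel mask. A minimal sketch, with toy values:

```python
import numpy as np

def threshold_depth(Z, C, c_min=0.5):
    """Keep depth only where confidence exceeds c_min; mask the rest
    with NaN so low-confidence pixels are omitted from the display."""
    return np.where(C > c_min, Z, np.nan)

# Toy 2 x 2 depth map (meters) and confidence map.
Z = np.array([[0.32, 0.55],
              [0.38, 0.30]])
C = np.array([[0.9, 0.2],
              [0.6, 0.4]])
Zt = threshold_depth(Z, C)
print(Zt)  # low-confidence pixels become nan
```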

References

    1. McManamon P. F., Field Guide to Lidar (SPIE Press, Bellingham, WA, 2015).
    2. Achar S., Bartels J. R., Whittaker W. L., Kutulakos K. N., Narasimhan S. G., Epipolar time-of-flight imaging. ACM Trans. Graph. 36, 37 (2017).
    3. Gupta M., Nayar S. K., Hullin M. B., Martin J., Phasor imaging: A generalization of correlation-based time-of-flight imaging. ACM Trans. Graph. 34, 156 (2015).
    4. Hansard M., Lee S., Choi O., Horaud R. P., Time-of-Flight Cameras: Principles, Methods and Applications (Springer Science & Business Media, New York, 2012).
    5. Heide F., Heidrich W., Hullin M., Wetzstein G., Doppler time-of-flight imaging. ACM Trans. Graph. 34, 36 (2015).
