Foods. 2023 Nov 28;12(23):4293. doi: 10.3390/foods12234293.

DPF-Nutrition: Food Nutrition Estimation via Depth Prediction and Fusion

Yuzhe Han et al.

Abstract

A reasonable and balanced diet is essential for maintaining good health. With advancements in deep learning, automated nutrition estimation from food images offers a promising solution for monitoring daily nutritional intake and promoting dietary health. While monocular image-based nutrition estimation is convenient, efficient, and economical, its limited accuracy remains a significant concern. To tackle this issue, we propose DPF-Nutrition, an end-to-end nutrition estimation method based on monocular images. In DPF-Nutrition, we introduce a depth prediction module that generates depth maps, thereby improving the accuracy of food portion estimation. Additionally, we design an RGB-D fusion module that combines the monocular image with the predicted depth information, yielding better nutrition estimation performance. To the best of our knowledge, this is the first work to integrate depth prediction and RGB-D fusion techniques in food nutrition estimation. Comprehensive experiments on the Nutrition5k dataset evaluate the effectiveness and efficiency of DPF-Nutrition.
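The abstract describes the pipeline only at a high level. The sketch below illustrates, in minimal PyTorch, the general "predict depth, then fuse" idea: a monocular RGB image is mapped to a predicted depth map, RGB and depth features are fused, and pooled features are regressed to nutrition values. The toy convolutional modules, their names, and the five regression targets (calories, mass, fat, carbohydrate, protein, following Nutrition5k) are illustrative assumptions, not the authors' implementation, which uses a depth prediction transformer and a cross-modal attention block as described in the figures below.

```python
# Minimal sketch of a "depth prediction + RGB-D fusion" nutrition estimator.
# Toy encoders stand in for the paper's DPT depth predictor and CAB fusion;
# the five outputs (calories, mass, fat, carbs, protein) follow Nutrition5k.
import torch
import torch.nn as nn


class ToyDepthPredictor(nn.Module):
    """Stand-in for the depth prediction module: RGB (B,3,H,W) -> depth (B,1,H,W)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, rgb):
        return self.net(rgb)


class ToyFusionBackbone(nn.Module):
    """Stand-in for the RGB-D fusion module: fuses RGB and predicted-depth features."""
    def __init__(self, dim=64):
        super().__init__()
        self.rgb_enc = nn.Sequential(nn.Conv2d(3, dim, 3, stride=2, padding=1), nn.ReLU(inplace=True))
        self.depth_enc = nn.Sequential(nn.Conv2d(1, dim, 3, stride=2, padding=1), nn.ReLU(inplace=True))
        self.fuse = nn.Conv2d(2 * dim, dim, 1)

    def forward(self, rgb, depth):
        f = torch.cat([self.rgb_enc(rgb), self.depth_enc(depth)], dim=1)
        return self.fuse(f)


class DPFNutritionSketch(nn.Module):
    """RGB image -> predicted depth -> fused features -> per-nutrient regression."""
    def __init__(self, dim=64, num_targets=5):
        super().__init__()
        self.depth_predictor = ToyDepthPredictor()
        self.backbone = ToyFusionBackbone(dim)
        self.pool = nn.AdaptiveAvgPool2d(1)      # global average pooling, as in Figure 3
        self.head = nn.Linear(dim, num_targets)  # calories, mass, fat, carbs, protein

    def forward(self, rgb):
        depth = self.depth_predictor(rgb)        # monocular depth prediction
        feat = self.backbone(rgb, depth)         # RGB-D fusion
        return self.head(self.pool(feat).flatten(1))


if __name__ == "__main__":
    model = DPFNutritionSketch()
    out = model(torch.randn(2, 3, 224, 224))
    print(out.shape)  # torch.Size([2, 5])
```

The design intuition is that the predicted depth map supplies geometric cues about food volume that a single RGB image lacks, which is why portion (and hence nutrition) estimates benefit from the fused representation.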

Keywords: RGB-D fusion; deep learning; depth prediction; nutrition estimation.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Example images from the Nutrition5k dataset. (a) RGB images. (b) Depth maps. (c) Nutritional annotations.
Figure 2
Incorrect image samples. (a) The food is not fully included in the image. (b) Dishes are overlapping. (c) Non-food image.
Figure 3
The overall framework of DPF-Nutrition, which consists of a depth prediction module and an RGB-D fusion module. We adopt a depth prediction transformer (DPT) to generate the predicted depth map and design a cross-modal attention block (CAB) to extract and integrate the complementary features of the RGB and depth images. ⨁ indicates element-wise addition; Ⓖ denotes global average pooling.
Figure 4
(a) The structure of the depth prediction module. The input image is transformed into feature vectors by a ResNet-50 feature extractor and subsequently embedded into two-dimensional tokens. The tokens are then fed into the transformer encoder. Tokens from different transformer stages are reassembled into image-like feature maps at various resolutions. Finally, the image-like feature maps are fused progressively to generate the depth prediction. (b) The structure of the transformer encoder. ⨁ indicates element-wise addition.
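As a rough illustration of the "reassemble and fuse" steps described in this caption, the sketch below reshapes a sequence of transformer tokens back into an image-like feature map and then merges a coarser map into a finer one by element-wise addition. The patch-grid size, channel widths, and the single refinement step are illustrative assumptions, not the paper's exact DPT configuration.

```python
# Sketch of the "reassemble" operation: transformer tokens -> image-like feature map,
# followed by one progressive fusion step. Dimensions here are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Reassemble(nn.Module):
    """(B, N, C) tokens from a square patch grid -> (B, out_ch, H, W) feature map."""
    def __init__(self, token_dim=768, out_ch=256, grid=16, scale=2.0):
        super().__init__()
        self.grid = grid
        self.scale = scale
        self.project = nn.Conv2d(token_dim, out_ch, kernel_size=1)

    def forward(self, tokens):
        b, n, c = tokens.shape
        assert n == self.grid * self.grid, "expects a square patch grid"
        # Arrange tokens spatially, project channels, then resample the resolution.
        x = tokens.transpose(1, 2).reshape(b, c, self.grid, self.grid)
        x = self.project(x)
        return F.interpolate(x, scale_factor=self.scale, mode="bilinear", align_corners=False)


class FuseBlock(nn.Module):
    """Merges a coarser map into a finer one (one step of the progressive decoder)."""
    def __init__(self, ch=256):
        super().__init__()
        self.refine = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True))

    def forward(self, finer, coarser):
        coarser = F.interpolate(coarser, size=finer.shape[-2:], mode="bilinear", align_corners=False)
        return self.refine(finer + coarser)   # element-wise addition, then refinement


if __name__ == "__main__":
    tokens_deep = torch.randn(1, 256, 768)      # tokens from a deeper transformer stage
    tokens_shallow = torch.randn(1, 256, 768)   # tokens from an earlier stage
    coarse = Reassemble(scale=1.0)(tokens_deep)
    fine = Reassemble(scale=2.0)(tokens_shallow)
    fused = FuseBlock()(fine, coarse)
    print(fused.shape)  # torch.Size([1, 256, 32, 32])
```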
Figure 5
The structures of the RGB-D fusion paradigms. (a) Fusion–enhancement. (b) Enhancement–fusion. (c) Our proposed paradigm. ⨁ denotes element-wise addition; ⨂ indicates pixel-wise multiplication; Ⓒ represents cross-channel concatenation.
Figure 6
The structure of the CAB. GAP indicates global average pooling; ⨁ denotes element-wise addition; ⨂ indicates pixel-wise multiplication; Ⓒ represents cross-channel concatenation; Mean represents the mean along the channel dimension.
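The caption lists the operations inside the CAB (global average pooling, mean along the channel dimension, element-wise addition, pixel-wise multiplication, and cross-channel concatenation) but not their exact wiring. The sketch below assembles one plausible cross-modal attention block from those operations: channel and spatial attention derived from one modality re-weight the other, and the enhanced features are concatenated. This is an assumption-laden reading of the figure, not the authors' exact block.

```python
# One plausible cross-modal attention block built from the operations in Figure 6:
# channel attention (GAP -> MLP -> sigmoid) and spatial attention (channel-wise mean
# -> conv -> sigmoid) computed from one modality and applied to the other.
# The exact wiring inside the paper's CAB may differ.
import torch
import torch.nn as nn


class CrossModalAttentionSketch(nn.Module):
    def __init__(self, ch=64, reduction=4):
        super().__init__()
        self.gap = nn.AdaptiveAvgPool2d(1)                       # GAP in the caption
        self.channel_mlp = nn.Sequential(
            nn.Conv2d(ch, ch // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch // reduction, ch, 1), nn.Sigmoid(),
        )
        self.spatial_conv = nn.Sequential(nn.Conv2d(1, 1, 7, padding=3), nn.Sigmoid())

    def enhance(self, target, source):
        """Re-weight `target` with channel and spatial attention derived from `source`."""
        ca = self.channel_mlp(self.gap(source))                   # (B, C, 1, 1) channel weights
        sa = self.spatial_conv(source.mean(dim=1, keepdim=True))  # (B, 1, H, W) spatial weights
        return target + target * ca * sa                          # pixel-wise multiplication + addition

    def forward(self, rgb_feat, depth_feat):
        rgb_enhanced = self.enhance(rgb_feat, depth_feat)
        depth_enhanced = self.enhance(depth_feat, rgb_feat)
        return torch.cat([rgb_enhanced, depth_enhanced], dim=1)   # cross-channel concatenation


if __name__ == "__main__":
    rgb_feat = torch.randn(2, 64, 28, 28)
    depth_feat = torch.randn(2, 64, 28, 28)
    fused = CrossModalAttentionSketch(ch=64)(rgb_feat, depth_feat)
    print(fused.shape)  # torch.Size([2, 128, 28, 28])
```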
Figure 7
Sample results of depth estimation. (a) RGB images. (b) Estimated depth maps. (c) Actual depth maps.
Figure 8
The visualization results. (a) The ROI heat-maps of different nutrients. (b) The nutrition facts.
