Neural Netw. 2025 Sep 7:193:108057.
doi: 10.1016/j.neunet.2025.108057. Online ahead of print.

Inter-modality feature prediction through multimodal fusion for 3D shape defect detection

Mujtaba Asad et al. Neural Netw.

Abstract

3D shape defect detection plays an important role in autonomous industrial inspection. However, accurate detection of anomalies remains challenging due to the complexity of multimodal sensor data, especially when both color and structural information are required. In this work, we propose a lightweight inter-modality feature prediction framework that uses fused multimodal features from RGB, depth, and point cloud inputs for efficient 3D shape defect detection. The framework consists of three main components: (1) modality-specific pre-trained feature extractor networks; (2) a multi-level Adaptive Dual-Modal Gated Fusion (ADMGF) module that combines the RGB and depth features to obtain rich spatial and contextual information; and (3) a lightweight inter-modal feature prediction network that uses the fused RGB-depth features to predict the corresponding point cloud features and vice versa, forming a bidirectional learning mechanism over the tri-modal inputs. Our model eliminates the need for large memory banks or pixel-level reconstructions. Comprehensive experiments on the MVTec3D-AD and Eyecandies datasets show significant performance improvements over state-of-the-art methods.
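
The abstract does not give the gate equations, the predictor architecture, or the anomaly score, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a per-channel sigmoid gate for the RGB-depth fusion, single linear layers as the two cross-modal predictors, and the sum of cosine distances between predicted and observed features as the anomaly score. All function names and the `params` keys are hypothetical and are not taken from the paper.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_fusion(rgb, depth, gate_weights, gate_bias):
    """Assumed gate form: g_i = sigmoid(w_r * rgb_i + w_d * depth_i + b_i).

    The fused channel is the convex combination g_i * rgb_i + (1 - g_i) * depth_i.
    """
    fused = []
    for r, d, (w_r, w_d), b in zip(rgb, depth, gate_weights, gate_bias):
        g = sigmoid(w_r * r + w_d * d + b)
        fused.append(g * r + (1.0 - g) * d)
    return fused

def linear(x, weight, bias):
    """Single linear layer, standing in for a lightweight predictor network."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]

def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

def anomaly_score(rgb, depth, point_cloud, params):
    """Bidirectional prediction error for one spatial location (assumed scoring)."""
    fused = gated_fusion(rgb, depth, params["gate_w"], params["gate_b"])
    # Direction 1: fused RGB-depth features predict the point cloud features.
    pred_pc = linear(fused, params["w_fused_to_pc"], params["b_fused_to_pc"])
    # Direction 2: point cloud features predict the fused RGB-depth features.
    pred_fused = linear(point_cloud, params["w_pc_to_fused"], params["b_pc_to_fused"])
    return cosine_distance(pred_pc, point_cloud) + cosine_distance(pred_fused, fused)

# Toy parameters: a neutral gate (g = 0.5) and identity predictors.
identity = [[1.0, 0.0], [0.0, 1.0]]
params = {
    "gate_w": [(0.0, 0.0), (0.0, 0.0)],
    "gate_b": [0.0, 0.0],
    "w_fused_to_pc": identity, "b_fused_to_pc": [0.0, 0.0],
    "w_pc_to_fused": identity, "b_pc_to_fused": [0.0, 0.0],
}

consistent = anomaly_score([1.0, 0.0], [0.0, 1.0], [0.5, 0.5], params)
inconsistent = anomaly_score([1.0, 0.0], [0.0, 1.0], [1.0, -1.0], params)
print(consistent, inconsistent)
```

In this toy setup the score is near zero when the point cloud feature agrees with the fused RGB-depth feature and grows when the two disagree. In a trained model of this kind, the predictors would be fitted on defect-free samples so that large prediction errors flag anomalous regions, which is consistent with the abstract's claim that no memory bank or pixel-level reconstruction is needed.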

Keywords: Anomaly detection; Cross-attention; Industrial automation; Inter-modality representation learning; Multi-level feature fusion.

Conflict of interest statement

Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
