PeerJ Comput Sci. 2019 Oct 14;5:e222. doi: 10.7717/peerj-cs.222. eCollection 2019.

Synthetic dataset generation for object-to-model deep learning in industrial applications

Matthew Z Wong et al. PeerJ Comput Sci. 2019.

Abstract

The availability of large image data sets has been a crucial factor in the success of deep learning-based classification and detection methods. Yet, while data sets for everyday objects are widely available, data for specific industrial use-cases (e.g., identifying packaged products in a warehouse) remain scarce. In such cases, the data sets have to be created from scratch, which creates a major bottleneck for the deployment of deep learning techniques in industrial applications. We present work carried out in collaboration with a leading UK online supermarket, with the aim of creating a computer vision system capable of detecting and identifying unique supermarket products in a warehouse setting. To this end, we demonstrate a framework for using data synthesis to create an end-to-end deep learning pipeline, beginning with real-world objects and culminating in a trained model. Our method is based on the generation of a synthetic dataset from 3D models obtained by applying photogrammetry techniques to real-world objects. Using 100K synthetic images across 10 classes, we trained an InceptionV3 convolutional neural network, which achieved an accuracy of 96% on a separately acquired test set of real supermarket product images. The image generation process supports automatic pixel annotation, which eliminates the prohibitively expensive manual annotation typically required for detection tasks. Based on this readily available data, a one-stage RetinaNet detector was trained on the synthetic, annotated images, producing a detector that can accurately localize and classify the specimen products in real time.
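The automatic annotation described in the abstract is possible because each synthetic image is rendered from a known 3D model, so a per-object pixel mask comes for free from the renderer. A minimal sketch of deriving a detection bounding box from such a mask (the helper name and NumPy-based approach are illustrative assumptions, not the authors' code):

```python
import numpy as np

def bbox_from_mask(mask):
    """Return (x_min, y_min, x_max, y_max) for a binary object mask.

    In a synthetic rendering pipeline the mask is produced by the
    renderer itself, so no manual annotation is needed.
    """
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None  # object not visible in this render
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Example: a 5x5 mask with the object occupying rows 1-3, columns 2-4
mask = np.zeros((5, 5), dtype=bool)
mask[1:4, 2:5] = True
print(bbox_from_mask(mask))  # (2, 1, 4, 3)
```

Boxes obtained this way can be written out alongside each rendered image in whatever annotation format the detector's training code expects.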

Keywords: 3D Modelling; Computer science applications; Convolutional neural network; Deep learning with limited data; Industrial computer vision; Photogrammetry; Synthetic data.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1. Overall system design.
Figure 2. Visualization of the steps in photogrammetry: (A) camera calibration and point cloud generation; (B) after mesh generation; (C) after texture generation.
Figure 3. Flow diagram for synthetic image rendering.
Figure 4. (A) The rings on the shell (red rings) from which random normalized camera locations are sampled (blue points); (B) the uniform distribution of lamp locations on the sphere.
Figure 5. Examples of generated synthetic images (A–T).
Figure 6. Example subset of images captured for the modelling of a yogurt pot at different elevations and angles (A–D). Note how a magazine was used as a base to create more keypoints for camera alignment.
Figure 7. Sample of proprietary general environment test set (A–H).
Figure 8. Confusion matrix for InceptionV3 using synthetic data on the general environment test set.
Figure 9. Confusion matrix for the InceptionV3 benchmark experiment on the general environment test set.
Figure 10. Detection bounding boxes and confidence scores for classification (A–J).
Figure 11. Real and rendered image validation scores for product detection, logged every 200 training steps. The DAC accuracy for real images deviates from the rendered accuracy at around 20K steps.
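The camera and lamp placement shown in Figure 4 can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the specific elevation value and the Gaussian-normalization trick for uniform sphere sampling are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_camera(elevation_deg):
    """Sample a unit-length camera location on a ring at a fixed
    elevation (cf. Fig. 4A): azimuth drawn uniformly from [0, 2*pi)."""
    az = rng.uniform(0.0, 2.0 * np.pi)
    el = np.deg2rad(elevation_deg)
    return np.array([np.cos(el) * np.cos(az),
                     np.cos(el) * np.sin(az),
                     np.sin(el)])

def sample_lamp():
    """Sample a lamp location uniformly on the unit sphere (cf. Fig. 4B)
    by normalizing an isotropic Gaussian draw."""
    v = rng.normal(size=3)
    return v / np.linalg.norm(v)

cam = sample_camera(30.0)  # one hypothetical shell ring
lamp = sample_lamp()
print(np.linalg.norm(cam), np.linalg.norm(lamp))  # both ~1.0
```

Normalizing an isotropic Gaussian draw is a standard way to obtain a uniform distribution on the sphere, which matches the lamp placement the caption describes; the ring construction fixes elevation and randomizes only azimuth, matching the camera placement.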
