J Imaging. 2022 Jan 19;8(2):18.
doi: 10.3390/jimaging8020018.

Few-Shot Object Detection: Application to Medieval Musicological Studies

Bekkouch Imad Eddine Ibrahim et al. J Imaging. 2022.

Abstract

Detecting objects that are represented by only a few examples is a challenging task, especially when the style of the images differs greatly from that of modern photographs, as is the case for cultural heritage datasets. This problem is commonly known as few-shot object detection and is still a new field of research. This article presents a simple and effective method for black-box few-shot object detection that works with all current state-of-the-art object detection models. We also present a new dataset for medieval musicological studies, MMSD, which contains five classes and 693 samples manually annotated by a group of musicology experts. Because of the significant diversity of styles and the considerable disparities between artistic representations of the objects, our dataset is more challenging than the current standards. We evaluate our method on YOLOv4 (m/s), (Mask/Faster) RCNN, and ViT/Swin-t. We present two ways of benchmarking these models: one based on the overall data size and one based on the worst-case scenario for object detection. The experimental results show that our method always improves object detector results compared to traditional transfer learning, regardless of the underlying architecture.

Keywords: cultural heritage; few-shot image classification; few-shot object detection; medieval singing; musical iconography; transfer learning.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1. Examples of medieval singing illuminations.
Figure 2. Examples of object annotations.
Figure 3. Inference of the Mask RCNN architecture for few-shot instance segmentation on our medieval singing dataset.

