Few-Shot Object Detection: Application to Medieval Musicological Studies
- PMID: 35200721
- PMCID: PMC8880595
- DOI: 10.3390/jimaging8020018
Abstract
Detecting objects that are represented by only a few examples is a challenging task, especially when the style of the images is very different from recent photos, which is the case for cultural heritage datasets. This problem is commonly known as few-shot object detection and is still a new field of research. This article presents a simple and effective method for black-box few-shot object detection that works with all the current state-of-the-art object detection models. We also present a new dataset called MMSD for medieval musicological studies that contains five classes and 693 samples, manually annotated by a group of musicology experts. Due to the significant diversity of styles and considerable disparities between the artistic representations of the objects, our dataset is more challenging than the current standards. We evaluate our method on YOLOv4 (m/s), (Mask/Faster) RCNN, and ViT/Swin-t. We present two methods of benchmarking these models, based on the overall data size and on the worst-case scenario for object detection. The experimental results show that our method always improves object detector results compared to traditional transfer learning, regardless of the underlying architecture.
Keywords: cultural heritage; few-shot image classification; few-shot object detection; medieval singing; musical iconography; transfer learning.
Conflict of interest statement
The authors declare no conflict of interest.