Interactive video retrieval evaluation at a distance: comparing sixteen interactive video search systems in a remote setting at the 10th Video Browser Showdown

Silvan Heller et al. Int J Multimed Inf Retr. 2022;11(1):1-18. doi: 10.1007/s13735-021-00225-2. Epub 2022 Jan 26.

Abstract

The Video Browser Showdown addresses difficult video search challenges through an annual interactive evaluation campaign attracting research teams focusing on interactive video retrieval. The campaign aims to provide insights into the performance of participating interactive video retrieval systems, tested by selected search tasks on large video collections. For the first time in its ten-year history, the Video Browser Showdown 2021 was organized in a fully remote setting and hosted a record number of sixteen scoring systems. In this paper, we describe the competition setting, tasks, and results, and give an overview of state-of-the-art methods used by the competing systems. Based on query result logs provided by ten systems, we analyze differences in retrieval model performance and browsing times before a correct submission. Through advances in data-gathering methodology and tools, we provide a comprehensive analysis of ad-hoc video search tasks and discuss results, task design, and methodological challenges. We highlight that almost all top-performing systems use some form of joint embedding for text-image retrieval and enable specification of temporal context in queries for known-item search. While a combination of these techniques drives the currently top-performing systems, we identify several future challenges for interactive video search engines and the Video Browser Showdown competition itself.
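
The two techniques highlighted above, joint text-image embeddings for free-text search and temporal context in known-item queries, can be illustrated with a minimal sketch. The snippet below is not taken from any competing system: the encode_text and encode_frames functions are hypothetical placeholders for a CLIP-style joint embedding model (real systems use trained encoders, not random vectors), and the temporal scoring simply rewards keyframe pairs where the second sub-query matches shortly after the first.

```python
# Minimal sketch of joint-embedding retrieval with a temporal query.
# Assumption: encode_text() and encode_frames() stand in for a trained
# text-image embedding model mapping queries and keyframes into one space.
import numpy as np

RNG = np.random.default_rng(0)
DIM = 512  # assumed joint-embedding dimensionality


def encode_text(query: str) -> np.ndarray:
    """Placeholder text encoder: returns a unit-norm vector."""
    v = RNG.standard_normal(DIM)
    return v / np.linalg.norm(v)


def encode_frames(n_frames: int) -> np.ndarray:
    """Placeholder keyframe encoder: one unit-norm vector per keyframe."""
    m = RNG.standard_normal((n_frames, DIM))
    return m / np.linalg.norm(m, axis=1, keepdims=True)


def rank_frames(query: str, frame_embs: np.ndarray) -> np.ndarray:
    """Rank keyframes by cosine similarity to the text query (descending)."""
    sims = frame_embs @ encode_text(query)
    return np.argsort(-sims)


def temporal_score(q1: str, q2: str, frame_embs: np.ndarray, max_gap: int = 10):
    """Score keyframe pairs (i, j) where the second sub-query should match
    within max_gap frames after the first (temporal known-item query)."""
    s1 = frame_embs @ encode_text(q1)
    s2 = frame_embs @ encode_text(q2)
    pairs = []
    for i in range(len(frame_embs)):
        for j in range(i + 1, min(i + 1 + max_gap, len(frame_embs))):
            pairs.append((s1[i] + s2[j], i, j))
    return sorted(pairs, reverse=True)[:5]


frames = encode_frames(200)  # e.g. keyframes of one video
print(rank_frames("red car on a bridge", frames)[:10])
print(temporal_score("person opens a door", "person sits down", frames))
```

In practice, participating systems differ mainly in how the embeddings are produced and in how temporal constraints are expressed in the query interface; the ranking step itself is typically a nearest-neighbor search of the kind sketched here.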

Keywords: Content-based retrieval; Evaluations; Interactive video retrieval; Video browsing; Video content analysis.


Figures

Fig. 1
VBS2021 was organized as a fully virtual session.
Fig. 2
Keyframes from a Visual KIS task; the total duration of the video shown was 25 s. The task was solved by 13 of 16 teams.
Fig. 3
Time deltas per team between the first and last appearance of the correct item in the result logs (on shot level) before submission.
Fig. 4
Relation between the rank of first occurrence of a shot in the result logs and the time delta to the correct submission. As expected, the time delta increases with rank, and so does its variance.
Fig. 5
Best rank of the correct item appearing in the result log. Teams are ordered by descending score on the x-axis. Different teams used different thresholds for logging results, for UX and performance reasons.
Fig. 6
Green cells show the best logged rank r_s (between 1 and 300) of a correct scene frame in a task, achieved at time t. The best rank r_v of a correct video frame from the same result log is also given, and t_cs is the time of the tool's correct submission. Red values give the best detected ranks of searched video frames when the searched scene frames were not present in the logged result sets for a task. Red or orange cells indicate a browsing failure: the frame or video was retrieved, but the team did not submit a correct result.
Fig. 7
Share of AVS submissions judged as correct over time during an AVS task.
Fig. 8
AVS submissions over time, smoothed with a mean sliding window of size 25.
Fig. 9
Cumulative unique correct video submissions over time during an AVS task.
Fig. 10
Selected AVS metrics per task. Higher y-axis values indicate that teams found it easier to find results to submit for a task.
Fig. 11
Selected AVS metrics per task, restricted to correct submissions. Higher y-axis values indicate that, for a given task, it was easier to find results which judges deemed correct.
Fig. 12
Share of overall submissions per task plotted against precision, per team and task, with color indicating the evaluation metric score normalized by the best score for that task. Each team is represented as one dot per task.
Fig. 13
Similar but differently judged AVS submissions: judged as incorrect (red border, left) vs. judged as correct (green border, right).
