Sci Rep. 2020 Sep 4;10(1):14671. doi: 10.1038/s41598-020-71639-x.

A realistic fish-habitat dataset to evaluate algorithms for underwater visual analysis


Alzayat Saleh et al. Sci Rep. 2020.

Abstract

Visual analysis of complex fish habitats is an important step towards sustainable fisheries for human consumption and environmental protection. Deep learning methods have shown great promise for scene analysis when trained on large-scale datasets. However, current datasets for fish analysis tend to focus on the classification task within constrained, plain environments that do not capture the complexity of underwater fish habitats. To address this limitation, we present DeepFish, a benchmark suite with a large-scale dataset for training and testing methods on several computer vision tasks. The dataset consists of approximately 40,000 images collected underwater from 20 habitats in the marine environments of tropical Australia. The dataset originally contained only classification labels; we therefore collected point-level and segmentation labels to build a more comprehensive fish analysis benchmark. These labels enable models to learn to automatically count fish, identify their locations, and estimate their sizes. Our experiments provide an in-depth analysis of the dataset's characteristics and a performance evaluation of several state-of-the-art approaches on our benchmark. Although models pre-trained on ImageNet perform well on this benchmark, there is still room for improvement. This benchmark therefore serves as a testbed to motivate further development in this challenging domain of underwater computer vision.
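The three label types described above map onto simple per-image quantities. As a minimal sketch (the helper names and annotation formats below are illustrative assumptions, not the actual DeepFish release format): point-level labels yield a fish count, a binary segmentation mask yields a size estimate in pixels, and either can be collapsed into a presence/absence classification label.

```python
# Hedged sketch of the three label types; data formats are assumptions,
# not the DeepFish release format.

def count_from_points(points):
    """Fish count is simply the number of point-level (x, y) annotations."""
    return len(points)

def size_from_mask(mask):
    """Estimate total fish size as the foreground area (pixel count) of a
    binary segmentation mask given as a list of rows of 0/1 values."""
    return sum(sum(row) for row in mask)

def classification_label(points):
    """Presence/absence label: 1 if any fish annotation exists, else 0."""
    return 1 if points else 0
```

For example, an image with point annotations [(3, 4), (10, 2)] has a count of 2 and a classification label of 1.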


Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
A comparison of fish datasets. (a) QUT, (b) Fish4Knowledge, (c) Rockfish, and (d) our proposed dataset, DeepFish. Datasets (a–c) were acquired in constrained environments, whereas DeepFish covers more realistic and challenging environments. (Figures a–c were obtained from the corresponding open-source datasets.)
Figure 2
Locations where the DeepFish images were acquired. Most of DeepFish was acquired in the Hinchinbrook/Palm Islands region of north-eastern Australia; the rest comes from Western Australia (not shown on the map). (The map was created using QGIS version 3.8, available at https://qgis.org).
Figure 3
DeepFish image samples across 20 different habitats.
Figure 4
Deep learning methods. The architecture used for the four computer vision tasks of classification, counting, localization, and segmentation consists of two components. The first component is the ResNet-50 backbone which is used to extract features from the input image. The second component is either a feed-forward network that outputs a scalar value for the input image or an upsampling path that outputs a value for each pixel in the image.
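The two output heads described in the caption differ only in where the spatial resolution is collapsed: the feed-forward head pools the feature map to one scalar per image, while the upsampling path restores one value per pixel. A framework-free sketch of that distinction (pure Python on 2-D lists; the function names are illustrative, not from the paper's code):

```python
def scalar_head(feature_map):
    """Feed-forward head, sketched as a global average pool:
    collapses a 2-D feature map to a single scalar per image."""
    n = sum(len(row) for row in feature_map)
    return sum(sum(row) for row in feature_map) / n

def upsample_nearest(feature_map, factor):
    """Upsampling path, sketched as nearest-neighbour upsampling:
    expands a low-resolution map so there is one value per output pixel."""
    out = []
    for row in feature_map:
        wide = [v for v in row for _ in range(factor)]
        out.extend([wide[:] for _ in range(factor)])
    return out
```

In the actual architecture the upsampling path would use learned layers (e.g. transposed convolutions) on ResNet-50 features rather than nearest-neighbour copying; the sketch only shows the scalar-per-image versus value-per-pixel contrast.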
Figure 5
Qualitative results on counting, localization, and segmentation. (a) Prediction results of the model trained with the LCFCN loss. (b) Annotations representing the (x, y) coordinates of each fish within the images. (c) Prediction results of the model trained with the focal loss. (d) Annotations representing the full segmentation masks of the corresponding fish.
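An LCFCN-style model predicts a small blob per fish, so the per-image count can be read off as the number of connected components in the binarized prediction. A hedged, framework-free sketch of that read-out step and of the mean absolute counting error (binary grids as lists of lists; this illustrates the general blob-counting idea, not the paper's implementation):

```python
from collections import deque

def count_blobs(grid):
    """Count 4-connected foreground blobs in a binary grid via flood fill.
    Each blob is taken to correspond to one detected fish."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    blobs = 0
    for i in range(h):
        for j in range(w):
            if grid[i][j] and not seen[i][j]:
                blobs += 1
                seen[i][j] = True
                q = deque([(i, j)])
                while q:
                    y, x = q.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w \
                                and grid[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
    return blobs

def mae(pred_counts, true_counts):
    """Mean absolute counting error over a set of images."""
    return sum(abs(p - t) for p, t in zip(pred_counts, true_counts)) / len(pred_counts)
```

For instance, a prediction grid with two separate blobs yields a count of 2, which is then compared against the number of point annotations for that image.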


References

    1. Anantharajah, K., Ge, Z., McCool, C., Denman, S., Fookes, C., Corke, P. I., Tjondronegoro, D. & Sridharan, S. Local inter-session variability modelling for object classification. In IEEE Winter Conference on Applications of Computer Vision, 309–316 (2014).
    2. Barnes LM, Bellwood DR, Sheaves M, Tanner JK. The use of clear-water non-estuarine mangroves by reef fishes on the Great Barrier Reef. Mar. Biol. 2012;159:211–220. doi: 10.1007/s00227-011-1801-9.
    3. Boom B, He J, Palazzo S, Huang PX, Beyan C, Chou H-M, Lin F-P, Spampinato C, Fisher RB. A research tool for long-term and continuous analysis of fish assemblage in coral-reefs using underwater camera footage. Ecol. Inform. 2014;23:83–97. doi: 10.1016/j.ecoinf.2013.10.006.
    4. Bradley M, Baker RW, Nagelkerken I, Sheaves M. Context is more important than habitat type in determining use by juvenile fish. Landsc. Ecol. 2019;34:427–442. doi: 10.1007/s10980-019-00781-3.
    5. Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S. & Schiele, B. The Cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3213–3223 (2016).
