Sensors (Basel). 2022 Oct 28;22(21):8268.
doi: 10.3390/s22218268.

Class-Aware Fish Species Recognition Using Deep Learning for an Imbalanced Dataset

Simegnew Yihunie Alaba et al. Sensors (Basel).

Abstract

Fish species recognition is crucial for estimating the abundance of fish species in a specific area, managing production, and monitoring the ecosystem, in particular for identifying endangered species. In this work, fish species recognition is formulated as an object detection problem so that multiple fish in a single image can be handled, a case that is difficult for a simple classification network. The proposed model pairs an SSD detection head with either a MobileNetv3-large or a VGG16 backbone network. Moreover, a class-aware loss function is proposed to address the class imbalance in our dataset. The class-aware loss takes the number of instances of each species into account and gives more weight to species with fewer instances. This loss function can be applied to any classification or object detection task with an imbalanced dataset. Experimental results on the large-scale reef fish dataset SEAMAPD21 show that the class-aware loss improves the model over the original loss by up to 79.7%. Experimental results on the Pascal VOC dataset also show that the model outperforms the original SSD object detection model.

Keywords: class-aware loss; deep learning; fish recognition; imbalanced data; object detection; species classification.
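
The idea of weighting species by their instance counts can be illustrated with a minimal sketch. The abstract does not give the authors' formula, so the inverse-frequency weighting below is an assumption chosen for illustration, and the species names, counts, and function names are hypothetical.

```python
import math

def class_weights(counts):
    """Inverse-frequency weights (an assumed scheme, not the paper's formula).

    Each class gets total / (num_classes * count), so classes with fewer
    instances get larger weights and the count-weighted mean weight is 1.
    """
    total = sum(counts.values())
    num_classes = len(counts)
    return {name: total / (num_classes * n) for name, n in counts.items()}

def weighted_cross_entropy(p_true, label, weights):
    """Cross-entropy for one sample, scaled by the weight of its true class.

    p_true is the predicted probability of the true class.
    """
    return -weights[label] * math.log(p_true)

# Hypothetical instance counts for three species (not taken from SEAMAPD21).
counts = {"common": 900, "uncommon": 90, "rare": 10}
weights = class_weights(counts)

# The same prediction quality is penalised more for the rare species.
rare_loss = weighted_cross_entropy(0.5, "rare", weights)
common_loss = weighted_cross_entropy(0.5, "common", weights)
```

With weights of this kind, a misclassified rare species contributes more to the total loss than an equally misclassified common one, which is the behavior the abstract describes.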

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Proposed architecture. The architecture comprises the MobileNetv3-large [24] backbone and SSD [37] detection head. The MobileNetv3 feature extraction network is a lightweight model with good feature extraction capability. The extracted high-level features are input to the SSD detection head for classification and regression tasks. The network outputs fish species type and bounding box information for each image. The model is also trained with the VGG16 [25] backbone network.
Figure 2
Overview of SSD detection [37]. (a) During training, SSD takes an input image and ground truth boxes for each object. In a convolutional fashion, a small set (e.g., 4) of default boxes with different aspect ratios is evaluated at each location in several feature maps of different scales (e.g., 8 × 8 and 4 × 4 in (b,c)). For each default box, the shape offsets and the confidences for all object categories (c1, c2, …, cp) are predicted; cx, cy, w, and h refer to the x center, y center, width, and height of the bounding box, respectively.
Figure 3
Sample distribution of the number of occurrences per species in SEAMAPD21 [7], showing a highly imbalanced structure.
Figure 4
Sample images from the SEAMAPD21 dataset. The fish closely resemble the background, which makes identification more challenging. Some of the fish are difficult even for humans to detect. Fish may also be occluded by vertical bars or by other fish.
Figure 5
VGG300 backbone qualitative outputs. All fish are detected in the sample images except in the bottom left image, where one fish on the middle right side is missed. The missed fish is not easy to spot, even for humans.
Figure 6
MobileNetv3 backbone qualitative outputs. These sample outputs show detections that the MobileNetv3 backbone misses but the VGG backbone finds. In the first row, one fish is missed in the first image and two in the second. The same numbers of fish are missed in the second row. In the last row, two and three fish are missed, respectively.
Figure 7
VGG512 backbone qualitative outputs. All fish species in each image are detected with high confidence.
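
The default-box scheme described in the Figure 2 caption can be sketched as follows. This is a minimal illustration rather than the authors' code: the scale values and the list of aspect ratios are assumptions, and the function name is hypothetical. Boxes are given as (cx, cy, w, h) in coordinates normalised to the image size.

```python
import math

def default_boxes(fmap_size, scale, aspect_ratios):
    """Generate default boxes for one square feature map.

    One box per aspect ratio is centred on every cell of the feature map.
    Width and height are scale * sqrt(ratio) and scale / sqrt(ratio), so
    every box on this map has the same area (scale squared).
    """
    boxes = []
    for row in range(fmap_size):
        for col in range(fmap_size):
            cx = (col + 0.5) / fmap_size  # x center of the cell
            cy = (row + 0.5) / fmap_size  # y center of the cell
            for ratio in aspect_ratios:
                w = scale * math.sqrt(ratio)
                h = scale / math.sqrt(ratio)
                boxes.append((cx, cy, w, h))
    return boxes

# Illustrative values: 4 boxes per location on 8 x 8 and 4 x 4 feature maps.
ratios = [1.0, 2.0, 0.5, 3.0]
boxes_8 = default_boxes(8, 0.2, ratios)  # finer map, smaller boxes
boxes_4 = default_boxes(4, 0.4, ratios)  # coarser map, larger boxes
```

In a detector of this kind, the network would predict shape offsets and per-class confidences for each of these boxes; that prediction step is not shown here.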

References

    1. Chang C., Fang W., Jao R.C., Shyu C., Liao I.C. Development of an intelligent feeding controller for indoor intensive culturing of eel. Aquac. Eng. 2005;32:343–353. doi: 10.1016/j.aquaeng.2004.07.004.
    2. Cabreira A.G., Tripode M., Madirolas A. Artificial neural networks for fish-species identification. ICES J. Mar. Sci. 2009;66:1119–1129.
    3. Churnside J.H., Wells R., Boswell K.M., Quinlan J.A., Marchbanks R.D., McCarty B.J., Sutton T.T. Surveying the distribution and abundance of flying fishes and other epipelagics in the northern Gulf of Mexico using airborne lidar. Bull. Mar. Sci. 2017;93:591–609. doi: 10.5343/bms.2016.1039.
    4. Jalali M.A., Ierodiaconou D., Monk J., Gorfine H., Rattray A. Predictive mapping of abalone fishing grounds using remotely-sensed LiDAR and commercial catch data. Fish. Res. 2015;169:26–36. doi: 10.1016/j.fishres.2015.04.009.
    5. Boswell K.M., Wilson M.P., Cowan J.H., Jr. A semiautomated approach to estimating fish size, abundance, and behavior from dual-frequency identification sonar (DIDSON) data. N. Am. J. Fish. Manag. 2008;28:799–807. doi: 10.1577/M07-116.1.
