AI for fully automated identification of southern right whales to improve research and conservation outcomes

South Africa
Biology
DOI: 10.18258/72812
$11,740 raised of $10,000 goal (117%)
Successfully funded on 11/15/24

Methods

Summary

Phase 1: A standardised, annotated southern right whale dataset must be developed from the image data held in the three largest datasets available worldwide (Argentina, South Africa and Australia). The resulting dataset will support the development, training and testing of several AI models, both for this project and for future ones.

The datasets from each research region will be combined into a single standardised dataset of correctly oriented, annotated images ready for algorithm training (Phase 2). Each of the three datasets will go through the steps below:

1) Data preparation

2) Infrastructure setup

3) Algorithm development - detections/orientations

4) Data annotation

5) Evaluation and output
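
As a rough illustration of what a standardised record for the combined dataset might look like, the sketch below maps one regional catalogue entry onto a common format. It is an assumption made for illustration, not the project's schema: the `Annotation` fields, the `standardise` helper and the layout of the input dictionary are all hypothetical.

```python
from dataclasses import dataclass

# The three source regions named in the project description.
REGIONS = {"Argentina", "South Africa", "Australia"}


@dataclass(frozen=True)
class Annotation:
    """Hypothetical standardised record; field names are illustrative only."""
    region: str                      # source research region
    image_id: str                    # image identifier, namespaced by region
    whale_id: str                    # individual ID assigned by the regional group
    bbox: tuple[int, int, int, int]  # x, y, width, height of the whale in pixels
    rotation_deg: float              # rotation that brings the image to a common orientation


def standardise(region: str, record: dict) -> Annotation:
    """Convert one regional catalogue entry (hypothetical layout) to an Annotation."""
    if region not in REGIONS:
        raise ValueError(f"unknown region: {region!r}")
    x, y, w, h = (int(v) for v in record["bbox"])
    if w <= 0 or h <= 0:
        raise ValueError("bounding box must have positive width and height")
    return Annotation(
        region=region,
        # Prefix with the region so image IDs from different catalogues cannot collide.
        image_id=f"{region}/{record['image_id']}",
        whale_id=str(record["whale_id"]),
        bbox=(x, y, w, h),
        # Normalise the rotation into the range [0, 360).
        rotation_deg=float(record["rotation_deg"]) % 360,
    )
```

Keeping the region on every record would let later evaluation be reported per dataset as well as overall.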

Phase 2: This phase will reduce the time needed for photo-ID matching by developing AI algorithms and training models on southern right whale (SRW) image data from the same three datasets. The target benchmark for automated photo-ID is 90% top-1 accuracy, that is, the correct individual ranked first for 90% of the potential matches. Specific steps are:

1) Algorithm development - identification

2) Data augmentation

3) Infrastructure setup

4) Model training

5) Model testing

6) Collation and presentation of results
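
The top-1 benchmark mentioned for this phase can be made concrete with a small sketch. It assumes that the identification model returns, for each query image, a list of candidate whale IDs ranked from most to least likely, and that each query has a known true ID. The function name `top_k_accuracy` and the toy IDs are hypothetical, and this is one common way to compute the metric rather than the project's own evaluation code.

```python
def top_k_accuracy(ranked_candidates, true_ids, k=1):
    """Fraction of queries whose true ID appears in the first k ranked candidates."""
    if len(ranked_candidates) != len(true_ids):
        raise ValueError("need exactly one ranked list per true ID")
    if not true_ids:
        raise ValueError("no queries to evaluate")
    hits = sum(
        1
        for ranked, truth in zip(ranked_candidates, true_ids)
        if truth in ranked[:k]
    )
    return hits / len(true_ids)


# Toy example: four query images, each with two ranked candidate IDs.
ranked = [["A", "B"], ["C", "A"], ["B", "C"], ["D", "A"]]
truth = ["A", "A", "B", "A"]

top1 = top_k_accuracy(ranked, truth, k=1)  # correct whale ranked first
top2 = top_k_accuracy(ranked, truth, k=2)  # correct whale within the first two
```

Under this reading, the 90% goal would correspond to `top_k_accuracy(..., k=1) >= 0.9` on a held-out test set. Reporting larger values of `k` alongside it shows how often the correct whale appears in a short list that a human reviewer could check.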

Once an automated photo-identification algorithm is available, it is envisioned that it will be integrated into a web-based system, with an associated database management system, used by all SRW research groups worldwide. The matching algorithms developed under this proposal will be open source, with the code uploaded to a GitHub repository.

Challenges

While the project will reduce image matching times, human review will still be required to verify the predictions by checking annotations, orientations and ID results. A working result is achievable, but the confidence level of its predictions cannot be guaranteed in advance. Trials of the selected models on small datasets have nevertheless been promising.

Protocols

This project has not yet shared any protocols.