Methods
Summary
Phase 1: A standardised, annotated southern right whale (SRW) dataset will be developed using image data from the three largest datasets available globally (Argentina, South Africa and Australia). This dataset will support the development, training and testing of AI models for this and future projects.
Images from each research region will be combined into a single standardised dataset of correctly oriented, annotated images ready for algorithm training (Phase 2). Each of the three datasets will go through the steps below:
1) Data preparation
2) Infrastructure setup
3) Algorithm development - detections/orientations
4) Data annotation
5) Evaluation and output
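To illustrate what a "standardised" annotation could look like once the Phase 1 steps above have been run, here is a minimal sketch of one record per image. The field names, the bounding-box convention and the orientation convention are all assumptions made for illustration; the project has not published its schema.

```python
from dataclasses import dataclass, asdict
from typing import Tuple

# The three source regions named in the proposal.
REGIONS = {"Argentina", "South Africa", "Australia"}


@dataclass
class Annotation:
    """One standardised annotation record (hypothetical schema)."""
    image_path: str
    region: str
    bbox: Tuple[int, int, int, int]  # assumed convention: x, y, width, height in pixels
    orientation_deg: float           # assumed: rotation that brings the whale to a common heading

    def __post_init__(self) -> None:
        if self.region not in REGIONS:
            raise ValueError(f"unknown region: {self.region!r}")
        _, _, width, height = self.bbox
        if width <= 0 or height <= 0:
            raise ValueError("bounding box must have positive width and height")
        # Normalise the angle into [0, 360) so all three datasets agree.
        self.orientation_deg = self.orientation_deg % 360.0


# Hypothetical example record; the file name is invented for illustration.
record = Annotation("example_0001.jpg", "Australia", (120, 80, 640, 320), -90.0)
record_as_dict = asdict(record)
```

Validating and normalising at construction time means that records from all three regions follow the same conventions before they reach model training.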
Phase 2: This phase will reduce matching processing time by developing AI algorithms and training models on SRW image data from the same three datasets. The target accuracy benchmark for automated photo-ID is 90% top-1, i.e. the correct individual is the top-ranked candidate for 90% of potential matches. Specific steps are:
1) Algorithm development - identification
2) Data augmentation
3) Infrastructure setup
4) Model training
5) Model testing
6) Collation and presentation of results
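The 90% top-1 benchmark stated for Phase 2 can be made concrete with a short sketch of how top-k accuracy is typically computed during model testing. The function name, the data layout (a ranked list of candidate IDs per query image) and the example IDs are assumptions for illustration, not the project's actual evaluation code.

```python
from typing import Dict, Sequence


def top_k_accuracy(ranked: Dict[str, Sequence[str]],
                   truth: Dict[str, str],
                   k: int = 1) -> float:
    """Fraction of query images whose true whale ID is among the first k ranked candidates."""
    if not truth:
        raise ValueError("no ground-truth labels supplied")
    hits = 0
    for query, true_id in truth.items():
        candidates = ranked.get(query, [])
        if true_id in candidates[:k]:
            hits += 1
    return hits / len(truth)


# Hypothetical model output: candidate IDs per query image, best match first.
ranked = {
    "query_a.jpg": ["whale_017", "whale_003", "whale_101"],
    "query_b.jpg": ["whale_003", "whale_042", "whale_017"],
    "query_c.jpg": ["whale_101", "whale_042", "whale_003"],
    "query_d.jpg": ["whale_008", "whale_017", "whale_042"],
}
# Hypothetical ground truth confirmed by a human matcher.
truth = {
    "query_a.jpg": "whale_017",
    "query_b.jpg": "whale_042",
    "query_c.jpg": "whale_101",
    "query_d.jpg": "whale_008",
}

top1 = top_k_accuracy(ranked, truth, k=1)  # 3 of 4 queries correct at rank 1
top3 = top_k_accuracy(ranked, truth, k=3)  # all 4 correct within the top 3
meets_benchmark = top1 >= 0.90             # compare against the 90% top-1 target
```

Reporting top-k for several values of k alongside top-1 shows how often the correct individual is near the top of the list, which matters when a human still verifies the shortlist.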
Once an automated photo-identification algorithm is available, it is envisioned that it will be integrated into a web-based system and associated database management system used by all SRW research groups worldwide. The matching algorithms developed under this proposal will be open source, with the code uploaded to a GitHub repository.
Challenges
Although the project will reduce image matching times, human review will still be required to verify predictions by checking annotations, orientations and ID results. A working result is achievable, but confidence levels cannot be guaranteed. Trials of the selected models on small datasets have, however, been promising.
Protocols
This project has not yet shared any protocols.