Abstract:
Identifying fish species is difficult for people all around the world, since it usually
requires access to scientific expertise, and the situation is no different for
Mauritians. An automated means to identify fish species would prove to be a real advantage
to different stakeholders, namely the government, marine managers, fish farmers, fishermen,
fishmongers, boat owners, seafood industrialists, marine biologists, oceanographers, tourists,
students and to the public at large. Thus, in this project, an innovative smartphone application
has been developed for the identification of fish species that are commonly found in the lagoons
and coastal areas, including estuaries and the outer reef zones of Mauritius. Our dataset
consists of 1520 images, with 40 images for each of the 38 fish species that were studied. Eighty percent
of the data was used for training, ten percent was used for validation and the remaining
ten percent was used for testing. All the images were first converted to the grayscale format
before a Gaussian blur was applied to reduce noise. A thresholding operation was then
performed on the images in order to separate the fish from the background. This enabled us to
draw a contour around the fish from which several features were extracted. These include:
width of the fish, height of the fish, ratio of height to width, minimum height at the start of the
tail, ratio of this minimum height to the height of the fish, distance of this minimum height
from the mouth, ratio of this distance to the width of the fish, area of the fish, ratio of this area
to the area of the bounding rectangle, perimeter of the fish contour, ratio of this perimeter to
the perimeter of the bounding rectangle, ratio of area to perimeter, mean RGB values for each
channel (extracted from the original images) and the proportions of pixels in which the red,
green or blue channel, respectively, is highest. A number of classifiers such as kNN, Support Vector
Machines, neural networks, decision trees and random forests were compared to find the
best-performing one. Among these, the kNN algorithm achieved the highest accuracy
of 96%. A second recognition model, built with the TensorFlow framework,
achieved an accuracy of 98%. These results demonstrate the effectiveness of the software
in fish identification. In the future, we intend to increase the number of fish species in our
dataset and to tackle challenging issues such as partial occlusions and pose variations through
techniques such as data augmentation.
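
The 80/10/10 split and a few of the simpler features listed above can be illustrated with a minimal pure-Python sketch. It is not the project's implementation: the function names are hypothetical, the fish is assumed to be already segmented into a binary mask, area is approximated as a pixel count rather than a contour area, and ties between colour channels are resolved in R, G, B order.

```python
# Illustrative sketch only; all names and toy conventions here are assumptions,
# not taken from the original application.

def split_counts(total, train_pct=80, val_pct=10):
    """Return (train, validation, test) image counts for a percentage split."""
    n_train = total * train_pct // 100
    n_val = total * val_pct // 100
    return n_train, n_val, total - n_train - n_val


def shape_features(mask):
    """Compute simple shape features from a binary mask (1 = fish, 0 = background).

    'width' and 'height' are the sides of the axis-aligned bounding rectangle;
    'area' is the number of fish pixels (an approximation of the contour area).
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    width = cols[-1] - cols[0] + 1
    height = rows[-1] - rows[0] + 1
    area = sum(sum(row) for row in mask)
    return {
        "width": width,
        "height": height,
        "height_to_width": height / width,
        "area": area,
        # Ratio of fish area to the area of the bounding rectangle.
        "area_to_rect_area": area / (width * height),
    }


def dominant_channel_proportions(pixels):
    """Return the fraction of (r, g, b) pixels in which R, G or B is highest.

    Ties go to the first channel in R, G, B order (an arbitrary choice).
    """
    counts = [0, 0, 0]
    for p in pixels:
        counts[max(range(3), key=lambda i: p[i])] += 1
    n = len(pixels)
    return tuple(c / n for c in counts)
```

With the dataset size stated above, `split_counts(1520)` gives 1216 training, 152 validation and 152 test images. Whether the split was stratified per species is not stated in the abstract.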