TY - JOUR
T1 - Image-based insect species and gender classification by trained supervised machine learning algorithms
AU - Tuda, Midori
AU - Luna-Maldonado, Alejandro Isabel
N1 - Funding Information:
MT was supported by JSPS KAKENHI (JP23405008, JP25430194, JP26304016 and 19K06840) from the Japan Society for the Promotion of Science (JSPS), and AILM by PAICYT (Programa de Apoyo a la Investigación Científica y Tecnológica) (367) from Universidad Autónoma de Nuevo León. We thank the Laboratory of Insect Natural Enemies, Faculty of Agriculture, Kyushu University; Universidad Autónoma de Nuevo León; the Mexican Council for Science and Technology; and the Mexican Ministry of Education for their support. AILM thanks Michael Mayo for his advice on Weka at an earlier stage of the study.
Publisher Copyright:
© 2020 Elsevier B.V.
PY - 2020/11
Y1 - 2020/11
N2 - Classification of specimens is an important first step in characterizing populations and species assemblages. Although species-level classification has been a popular goal, sex differences and sex ratio are also important properties in ecology and pest control. Here we focus on images of mixed-sex specimens of a stored-product pest beetle (Callosobruchus chinensis) and its parasitoids (parasitic wasps; Anisopteromalus and Heterospilus) in various postures and classify them by species and sex by training supervised machine learning algorithms: logistic model trees (LMT), random forest, support vector machine (SVM), simple logistic regression, multilayer perceptron and AdaBoost (adaptive boosting). Both object-based and pixel-based features were extracted from each image. Simple logistic regression, LMT and AdaBoost (with simple logistic regression as the base learner) performed well in classifying sexes or species/sexes; average true positive rates (prediction accuracy) of 88.5–98.5% were achieved for within-species sexing of beetles or wasps, 97.3% for two-species sexing and 93.3% for three-species sexing. For most datasets, the best-performing models incorporated both object-based and pixel-based features. LMT models were identical to simple logistic regression models in most cases. Simple logistic regression showed robust performance and small variation in prediction accuracy irrespective of classification target (sexes or species), probably because of the efficient feature selection implemented in the algorithm. This study is one of the earliest to classify the gender of insects using machine learning based on still images.
UR - http://www.scopus.com/inward/record.url?scp=85089822212&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85089822212&partnerID=8YFLogxK
U2 - 10.1016/j.ecoinf.2020.101135
DO - 10.1016/j.ecoinf.2020.101135
M3 - Article
AN - SCOPUS:85089822212
SN - 1574-9541
VL - 60
JO - Ecological Informatics
JF - Ecological Informatics
M1 - 101135
ER -