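Comparative study of class-imbalanced image classification algorithms for phytoplankton

Funding:

Anhui Provincial Science and Technology Major Project (202203a07020002); Research Team Construction Project of the Institute of Environment, Hefei Comprehensive National Science Center (HYKYTD2024004); Anhui Provincial Ecological Environment Research Project (2023hb0011); President's Fund of the Hefei Institutes of Physical Science, Chinese Academy of Sciences (YZJJ2024QN01)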

Abstract:

Phytoplankton in natural waters are species-rich and unevenly distributed across classes, so collected microscopic images contain far more samples of dominant (majority) classes than of minority classes, and deep learning methods consequently classify minority classes with low accuracy. To address the classification errors that class imbalance causes in deep learning models for phytoplankton, this study analyzes a range of methods and strategies developed for class imbalance in macro-domain images and examines how practical they are for phytoplankton microscopic images. A dataset with severe class imbalance was constructed from 18044 images of 29 algal genera common in the Lake Chaohu basin, and micro-averaged and macro-averaged metrics are proposed for jointly evaluating a model's classification ability. Experimental results show that a model trained with the conventional method yields low F1 values on minority classes, whereas a model trained with square-root sampling, a method in the re-sampling category, improves markedly on both metrics, with micro-averaged and macro-averaged F1 values reaching 0.932 and 0.852, respectively. In particular, on the 10 classes with the fewest samples, the micro-averaged and macro-averaged F1 values increased by 9.64% and 15.94%, respectively. The study provides a more effective way of training deep learning models for the automated detection of phytoplankton community structure in natural waters.
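
The abstract names two ingredients without showing them: square-root sampling and the micro-averaged and macro-averaged F1 values. The sketch below illustrates both under stated assumptions. It uses the common definition of square-root sampling, in which a class with n_j images is drawn with probability proportional to sqrt(n_j); the abstract does not give the authors' exact implementation, so this may differ from theirs. The function names and the genus labels are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt


def sqrt_class_probs(class_counts):
    """Probability of drawing each class, proportional to sqrt(n_j).

    Assumed definition of square-root sampling; not taken from the paper.
    """
    roots = {c: sqrt(n) for c, n in class_counts.items()}
    total = sum(roots.values())
    return {c: r / total for c, r in roots.items()}


def sqrt_sample_weights(labels):
    """Per-image weights whose per-class totals equal sqrt_class_probs."""
    counts = Counter(labels)
    probs = sqrt_class_probs(counts)
    # Every image of a class shares that class's probability equally.
    return [probs[y] / counts[y] for y in labels]


def micro_macro_f1(y_true, y_pred):
    """Return (micro-averaged F1, macro-averaged F1) for single-label data.

    Expects non-empty label sequences of equal length.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # the predicted class gains a false positive
            fn[t] += 1  # the true class gains a false negative
    classes = sorted(set(y_true) | set(y_pred))
    per_class = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        per_class.append(2 * tp[c] / denom if denom else 0.0)
    # Macro: every class counts equally, however few images it has.
    macro = sum(per_class) / len(per_class)
    # Micro: pool the counts over all classes before computing F1.
    tp_all, fp_all, fn_all = sum(tp.values()), sum(fp.values()), sum(fn.values())
    micro = 2 * tp_all / (2 * tp_all + fp_all + fn_all)
    return micro, macro


if __name__ == "__main__":
    # Hypothetical two-genus example, not data from the paper.
    print(sqrt_class_probs({"genus_A": 900, "genus_B": 100}))
    y_true = ["genus_A"] * 8 + ["genus_B"] * 2
    y_pred = ["genus_A"] * 10  # a model that ignores the minority genus
    print(micro_macro_f1(y_true, y_pred))
```

In the hypothetical 900-versus-100 example, sampling images uniformly would draw the minority genus 10% of the time, while square-root sampling raises that to 25%. The second example shows why the two averages are reported together: a model that never predicts the minority genus still reaches a micro-averaged F1 of 0.8, but its macro-averaged F1 falls to about 0.44. The weights from `sqrt_sample_weights` could be passed to a weighted sampler such as Python's `random.choices`.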

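Cite this article

Liang Tianhong, Yin Gaofang, Shao Xintong, Zhao Nanjing, Zhang Xiaoling, Jia Renqing, Xu Min, Zhang Zihao, Hu Xiang, Huang Peng, Dong Ming, Chen Xiaowei. Comparative study of class-imbalanced image classification algorithms for phytoplankton. Acta Ecologica Sinica (生态学报), 2025, 45(7): 3534-3543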
