Journal of Agricultural Big Data, 2023, Vol. 5, Issue (4): 47-55. doi: 10.19788/j.issn.2096-6369.230405

• Data Paper •

A Training Dataset for Deep Neural Network Model Recognition of Common Cotton Diseases

ZHAO HongXin1,2, SHAO MingYue1,2, PAN Pan1,2, WANG ZhiAo1,2, MU Qiang1,2, HE ZiKang1,2, ZHANG JianHua1,2,*

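  1. Agricultural Information Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
  2. National Nanfan Research Institute (Sanya), Chinese Academy of Agricultural Sciences, Sanya 572024, Hainan, China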
  • Received: 2023-08-18 Accepted: 2023-09-05 Online: 2023-12-26 Published: 2024-01-05
  • Corresponding author: ZHANG JianHua, E-mail: zhangjianhua@caas.cn
  • About the author: ZHAO HongXin, E-mail: 865090195@qq.com
  • Funding:
    National Natural Science Foundation of China (31971792, 32160421); National Key R&D Program of China (2022YFF0711805); Sanya Yazhou Bay Science and Technology City Science and Technology Special Fund (SCKJ-JYRC-2023-45); Innovation Program of the Chinese Academy of Agricultural Sciences (CAAS-ASTIP-2023-AII); Central Public-interest Scientific Institution Basal Research Fund (Y2022XK24, Y2022QC17, JBYW-AII-2022-14, JBYW-AII-2023-06); Nanfan Special Project of the National Nanfan Research Institute (Sanya), Chinese Academy of Agricultural Sciences (ZDXM23011, YDLH01, YDLH07, YBXM10, YBXM2312)

A Training Dataset for Deep Neural Network Model Recognition of Common Cotton Diseases

ZHAO HongXin1,2, SHAO MingYue1,2, PAN Pan1,2, WANG ZhiAo1,2, MU Qiang1,2, HE ZiKang1,2, ZHANG JianHua1,2,*

  1. Agricultural Information Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
  2. Hainan National Breeding and Multiplication Institute at Sanya, Chinese Academy of Agricultural Sciences, Sanya 572024, Hainan, China
  • Received: 2023-08-18 Accepted: 2023-09-05 Online: 2023-12-26 Published: 2024-01-05

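Abstract:

Deep neural networks are an important approach to the intelligent recognition of cotton diseases. Scientific data covering more diseases, soils, and environmental conditions are both the foundation for developing such methods and one of the key constraints at present. The cotton disease data presented in this paper were collected from cotton fields at the Potianyang High-standard Farmland Demonstration Base in Sanya, Hainan Province, China. The dataset covers four common cotton diseases (anthracnose, bacterial angular leaf spot, brown spot, and Fusarium wilt) and contains 3453 high-resolution images of healthy leaves and of diseased leaves at different growth stages. All samples were obtained by random sampling in the field. After screening, they were identified and annotated by 10 cotton pathology experts, and a further 20 annotators randomly re-annotated labeled images to check quality. A Vision Transformer model was then used to further verify the stability of the dataset. Compared with similar datasets, this dataset was collected in complex field environments, covers the common cotton diseases, and offers high resolution, so it can better serve the research, training, and validation of intelligent cotton disease recognition models and algorithms.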

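Data summary:

Item: Description
Dataset name: A Training Dataset for Deep Neural Network Model Recognition of Common Cotton Diseases
Subject area: Agricultural Science, Computer Science
Time range: December 2021 to August 2023
Geographical coverage: plain planting area of the Potianyang Base, Sanya City, Hainan Province; central longitude and latitude (109.165497, 18.3931609999999)
Data types and technical formats: cotton images, *.png; cotton disease classification standard, *.TXT
Dataset composition: The dataset consists of 3453 image files and one text file. The image files are in a folder named 棉花病害数据 (Cotton Disease Data), and all of them are *.png files. The text file is in a folder named 棉花病害数据集 (Cotton Disease Dataset), and all files in it are *.TXT.
Data volume: 2.74 GB
Data availability: CSTR: 17058.11.sciencedb.agriculture.00029; DOI: 10.57760/sciencedb.agriculture.00029
Funding: National Natural Science Foundation of China (31971792, 32160421); National Key R&D Program of China (2022YFF0711805); Sanya Yazhou Bay Science and Technology City Science and Technology Special Fund (SCKJ-JYRC-2023-45); Innovation Program of the Chinese Academy of Agricultural Sciences (CAAS-ASTIP-2023-AII, ZDXM23011); Central Public-interest Scientific Institution Basal Research Fund (Y2022XK24, Y2022QC17, JBYW-AII-2022-14, JBYW-AII-2023-06); Nanfan Special Project of the National Nanfan Research Institute (Sanya), Chinese Academy of Agricultural Sciences (YDLH01, YDLH07, YBXM10, ZDXM23011, YBXM2312)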

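Keywords: deep learning, cotton anthracnose, cotton bacterial angular leaf spot, cotton brown spot, cotton Fusarium wilt, image recognition technology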

Abstract:

Deep neural networks are an important approach to the intelligent identification of cotton diseases. Progress depends on scientific data that cover more diseases, soil types, and environmental conditions, and a lack of such data is currently one of the key constraints on developing models and algorithms. In this paper we present a cotton disease dataset covering four common cotton diseases: anthracnose, bacterial angular leaf spot (bacterial blight), brown spot, and Fusarium wilt. The samples were collected from cotton fields at the Potianyang High-standard Farmland Demonstration Base in Sanya, Hainan Province, China. The dataset comprises 3453 high-resolution images of healthy leaves and of diseased leaves at various growth stages. Data were acquired by random sampling in the field so that they reflect the natural complexity of real cotton plantations. After screening, every image was identified and annotated by ten experts in cotton pathology, and a further twenty annotators re-annotated randomly selected images to check annotation quality. A Vision Transformer model was used to further verify the stability of the dataset. Because the data were gathered in complex field environments, cover the common cotton diseases, and have high image resolution, the dataset is a useful resource for the research, training, and validation of intelligent cotton disease recognition models and algorithms, and for researchers, practitioners, and decision-makers who need a clearer picture of cotton diseases and their management.
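
The abstract describes a quality check in which a second group of annotators re-labels randomly selected images. It does not say which agreement statistic was used. As an illustration only, the sketch below computes Cohen's kappa, one common chance-corrected agreement measure, between an original label list and a repeat label list. The function name `cohen_kappa`, the class label strings, and the six example labels are all hypothetical and are not taken from the dataset.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of class labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)
    # Observed agreement: fraction of images that received the same label twice.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sequences use one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for six re-annotated images (illustrative values only).
expert = ["anthracnose", "healthy", "brown_spot",
          "fusarium_wilt", "healthy", "angular_leaf_spot"]
repeat = ["anthracnose", "healthy", "brown_spot",
          "fusarium_wilt", "brown_spot", "angular_leaf_spot"]

kappa = cohen_kappa(expert, repeat)
print(f"kappa = {kappa:.3f}")
```

A value near 1 indicates that the repeat annotation closely matches the expert labels; values near 0 indicate agreement no better than chance.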

Data summary:

Item: Description
Dataset name: A Training Dataset for Deep Neural Network Model Recognition of Common Cotton Diseases
Specific subject area: Agricultural Science, Computer Science
Time range: December 2021 to August 2023
Geographical scope: The dataset covers the plain planting area of the Potianyang Base in Sanya City, Hainan Province, with a central longitude and latitude of (109.165497, 18.3931609999999)
Data types and technical formats: cotton images, *.jpg; cotton disease classification standard, *.txt
Dataset structure: The dataset consists of 3453 image files and one text file. The image files are in a folder named Cotton Disease Data, and all of them are *.JPG files. The text file is in a folder named Cotton Disease Dataset, in which all files are *.TXT
Volume of data: 2.74 GB
Data accessibility: CSTR: 17058.11.sciencedb.agriculture.00029; DOI: 10.57760/sciencedb.agriculture.00029
Financial support: National Natural Science Foundation of China (31971792, 32160421); National Key R&D Program of China (2022YFF0711805); Sanya Yazhou Bay Science and Technology City Science and Technology Special Fund (SCKJ-JYRC-2023-45); Innovation Program of the Chinese Academy of Agricultural Sciences (CAAS-ASTIP-2023-AII, ZDXM23011); Central Public-interest Scientific Institution Basal Research Fund (Y2022XK24, Y2022QC17, JBYW-AII-2022-14, JBYW-AII-2023-06); Nanfan Special Project of the National Nanfan Research Institute (Sanya), Chinese Academy of Agricultural Sciences (YDLH01, YDLH07, YBXM10, ZDXM23011, YBXM2312)
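
After downloading, the stated structure (3453 image files plus one text file) can be checked with a short script. The sketch below is a minimal example, not a tool shipped with the dataset: the function names `inventory` and `check_dataset` and the demo file names are hypothetical, and it accepts several common image suffixes rather than assuming a single format. The demo builds a tiny temporary directory so that the code runs without the real data.

```python
import tempfile
from collections import Counter
from pathlib import Path

# Suffixes treated as images; compared case-insensitively.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def inventory(root):
    """Count files under root (recursively) by lower-cased extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower()] += 1
    return counts


def check_dataset(root, expected_images=3453):
    """Compare the image and text file counts with the stated structure."""
    counts = inventory(root)
    n_images = sum(n for suffix, n in counts.items() if suffix in IMAGE_SUFFIXES)
    n_text = counts.get(".txt", 0)
    return {
        "images": n_images,
        "text_files": n_text,
        "complete": n_images == expected_images and n_text >= 1,
    }


# Demo on a throwaway directory with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    image_dir = Path(tmp) / "Cotton Disease Data"
    text_dir = Path(tmp) / "Cotton Disease Dataset"
    image_dir.mkdir()
    text_dir.mkdir()
    for i in range(3):
        (image_dir / f"leaf_{i}.JPG").touch()
    (text_dir / "classes.TXT").touch()
    demo_report = check_dataset(tmp, expected_images=3)

print(demo_report)
```

For the real download, calling `check_dataset` on the extracted root with the default `expected_images=3453` reports whether the file counts match the summary above.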

Key words: deep learning, cotton anthracnose, cotton angular leaf spot, cotton brown spot, Fusarium wilt, image recognition technology