Data Paper

A Training Dataset for Deep Neural Network Model Recognition of Common Cotton Diseases

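  • 1. Agricultural Information Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
    2. Hainan National Breeding and Multiplication Institute at Sanya, Chinese Academy of Agricultural Sciences, Sanya 572024, Hainan, China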
Zhao Hongxin (赵鸿鑫), E-mail: 865090195@qq.com

Received date: 2023-08-18

Accepted date: 2023-09-05

Published online: 2024-01-05

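Funding

National Natural Science Foundation of China (31971792, 32160421); National Key R&D Program of China (2022YFF0711805); Science and Technology Special Fund of Sanya Yazhou Bay Science and Technology City (SCKJ-JYRC-2023-45); Innovation Project of the Chinese Academy of Agricultural Sciences (CAAS-ASTIP-2023-AII); Special Funds for Basic Research of Central-level Public Welfare Research Institutes (Y2022XK24, Y2022QC17, JBYW-AII-2022-14, JBYW-AII-2023-06); South Breeding Special Project of the National South Breeding Research Institute, Chinese Academy of Agricultural Sciences, Sanya (ZDXM23011, YDLH01, YDLH07, YBXM10, YBXM2312)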

A Training Dataset for Deep Neural Network Model Recognition of Common Cotton Diseases

  • 1. Agricultural Information Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
    2. Hainan National Breeding and Multiplication Institute at Sanya, Chinese Academy of Agricultural Sciences, Sanya 572024, Hainan, China

Received date: 2023-08-18

Accepted date: 2023-09-05

Published online: 2024-01-05

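Abstract

Deep neural networks are an important method for the intelligent identification of cotton diseases. Scientific data covering more diseases, soil conditions, and environmental information are both the foundation for developing such methods and one of the key constraints at present. The cotton disease data presented in this paper were collected from cotton fields in the Potianyang High-standard Farmland Demonstration Base in Sanya, Hainan Province, China. The dataset covers four common cotton diseases (anthracnose, bacterial blight, brown spot, and wilt disease) and comprises 3453 high-resolution images of healthy leaves and of diseased leaves at different growth stages. All samples were obtained by random field sampling. After screening, 10 cotton pathology experts identified and annotated the images, and a further 20 annotators randomly re-annotated the annotated images to check quality. A Vision Transformer model was introduced to further verify the stability of the dataset. Compared with similar datasets, this dataset was collected in complex field environments, covers common cotton diseases, and offers high resolution, so it can better serve the research, training, and validation of intelligent cotton disease recognition models and algorithms.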

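Data summary:

Item: Description
Dataset name: A Training Dataset for Deep Neural Network Model Recognition of Common Cotton Diseases
Subject area: Agricultural science, computer science
Time range: December 2021 to August 2023
Geographical coverage: Plain planting area of the Potianyang Base, Sanya City, Hainan Province; central longitude and latitude (109.165497, 18.393161)
Data types and technical formats: Cotton images, *.png; cotton disease classification standard, *.TXT
Dataset composition: The dataset consists of 3453 image files and one text file. The image files are in a folder named 棉花病害数据 (Cotton Disease Data) and are all *.png files. The text file is in a folder named 棉花病害数据集 (Cotton Disease Dataset) and is a *.TXT file.
Data volume: 2.74 GB
Data availability: CSTR: 17058.11.sciencedb.agriculture.00029
DOI: 10.57760/sciencedb.agriculture.00029
Funding: National Natural Science Foundation of China (31971792, 32160421); National Key R&D Program of China (2022YFF0711805); Science and Technology Special Fund of Sanya Yazhou Bay Science and Technology City (SCKJ-JYRC-2023-45); Innovation Project of the Chinese Academy of Agricultural Sciences (CAAS-ASTIP-2023-AII, ZDXM23011); Special Funds for Basic Research of Central-level Public Welfare Research Institutes (Y2022XK24, Y2022QC17, JBYW-AII-2022-14, JBYW-AII-2023-06); South Breeding Special Project of the National South Breeding Research Institute, Chinese Academy of Agricultural Sciences, Sanya (YDLH01, YDLH07, YBXM10, ZDXM23011, YBXM2312)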

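Cite this article

赵鸿鑫, 邵明月, 潘攀, 王芝奥, 牟强, 贺子康, 张建华. A Training Dataset for Deep Neural Network Model Recognition of Common Cotton Diseases[J]. Journal of Agricultural Big Data (农业大数据学报), 2023, 5(4): 47-55. DOI: 10.19788/j.issn.2096-6369.230405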

Abstract

Deep neural networks are an important approach to the intelligent identification of cotton diseases. Progress depends on scientific data that cover a broader range of diseases, soil conditions, and environmental factors, and the shortage of such data is currently a key bottleneck for models and algorithms. This paper presents a cotton disease dataset covering four common cotton diseases: anthracnose, bacterial blight, brown spot, and wilt disease. Samples were collected from cotton fields in the Potianyang High-standard Farmland Demonstration Base in Sanya, Hainan Province, China. The dataset comprises 3453 high-resolution images of healthy leaves and of diseased leaves at various growth stages. Data were acquired by random field sampling, so the images reflect the natural complexity of real cotton fields. After screening, ten experts in cotton pathology identified and annotated the images, and a further twenty annotators re-annotated randomly selected images to check annotation quality. A Vision Transformer model was used to further verify the stability of the dataset. Because it was collected in complex field environments, covers common cotton diseases, and has high image resolution, the dataset can support the research, training, and validation of intelligent cotton disease recognition models and algorithms, and is intended for researchers, practitioners, and decision-makers working on cotton disease management.
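
The abstract states that annotated images were randomly re-annotated to check quality, but it does not say how agreement was measured. The following is a minimal sketch of one plausible check, assuming percent agreement and Cohen's kappa between the first annotation round and the re-annotation round. The label strings, the function names, and the example labels are all hypothetical and are not taken from the dataset.

```python
import random
from collections import Counter

def sample_for_recheck(image_ids, fraction, seed=0):
    """Randomly pick a subset of annotated images for re-annotation."""
    rng = random.Random(seed)
    ids = list(image_ids)
    k = max(1, round(len(ids) * fraction))
    return sorted(rng.sample(ids, k))

def percent_agreement(first, second):
    """Share of images on which both annotation rounds give the same label."""
    if len(first) != len(second) or not first:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(a == b for a, b in zip(first, second)) / len(first)

def cohen_kappa(first, second):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(first)
    p_o = percent_agreement(first, second)
    c1, c2 = Counter(first), Counter(second)
    # Chance agreement from the label frequencies of each round.
    p_e = sum(c1[c] * c2[c] for c in set(c1) | set(c2)) / (n * n)
    if p_e == 1.0:
        return 1.0
    return (p_o - p_e) / (1 - p_e)

# Hypothetical labels for six re-checked images (not real dataset labels).
expert = ["healthy", "anthracnose", "brown_spot", "wilt", "bacterial_blight", "healthy"]
recheck = ["healthy", "anthracnose", "brown_spot", "wilt", "brown_spot", "healthy"]

agreement = percent_agreement(expert, recheck)
kappa = cohen_kappa(expert, recheck)
print(f"agreement={agreement:.3f}, kappa={kappa:.3f}")
```

Kappa is reported alongside raw agreement because raw agreement alone can look high when one class, such as healthy leaves, dominates the sample.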

Data summary:

Item: Description
Dataset name: A Training Dataset for Deep Neural Network Model Recognition of Common Cotton Diseases
Specific subject area: Agricultural science, computer science
Time range: December 2021 to August 2023
Geographical scope: Plain planting area of the Potianyang Base in Sanya City, Hainan Province, with a central longitude and latitude of (109.165497, 18.393161)
Data types and technical formats: Cotton images, *.jpg; cotton disease classification standard, *.txt
Dataset structure: The dataset consists of 3453 image files and one text file. The image files are in a folder named Cotton Disease Data and are all *.JPG files. The text file is in a folder named Cotton Disease Dataset and is a *.TXT file.
Volume of data: 2.74 GB
Data accessibility: CSTR: 17058.11.sciencedb.agriculture.00029
DOI: 10.57760/sciencedb.agriculture.00029
Financial support: National Natural Science Foundation of China (31971792, 32160421); National Key R&D Program of China (2022YFF0711805); Science and Technology Special Fund of Sanya Yazhou Bay Science and Technology City (SCKJ-JYRC-2023-45); Innovation Project of the Chinese Academy of Agricultural Sciences (CAAS-ASTIP-2023-AII, ZDXM23011); Special Funds for Basic Research of Central-level Public Welfare Research Institutes (Y2022XK24, Y2022QC17, JBYW-AII-2022-14, JBYW-AII-2023-06); South Breeding Special Project of the National South Breeding Research Institute, Chinese Academy of Agricultural Sciences, Sanya (YDLH01, YDLH07, YBXM10, ZDXM23011, YBXM2312)
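
After downloading the dataset, the structure described in the summary (3453 image files and one text file) could be checked with a short script. The sketch below is an illustration only: the function name `summarize_dataset`, the set of accepted image extensions, and the demo folder and file names are assumptions, and it counts files by extension rather than assuming a single image format.

```python
import tempfile
from collections import Counter
from pathlib import Path

# Assumed set of image extensions; compared in lower case.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}

def summarize_dataset(root):
    """Count image and text files under root and total their size in bytes."""
    counts = Counter()
    total_bytes = 0
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower()] += 1
            total_bytes += path.stat().st_size
    n_images = sum(n for ext, n in counts.items() if ext in IMAGE_EXTS)
    return {"images": n_images, "text": counts[".txt"], "bytes": total_bytes}

# Demo on a throwaway directory with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    images = Path(tmp) / "images"
    images.mkdir()
    for i in range(3):
        (images / f"leaf_{i}.JPG").write_bytes(b"\x00" * 10)
    (Path(tmp) / "classes.TXT").write_bytes(b"anthracnose\n")
    summary = summarize_dataset(tmp)

print(summary)
```

Run against the real download, a result other than 3453 images and 1 text file would suggest an incomplete or altered copy.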
