Journal of Agricultural Big Data, 2024, Vol. 6, Issue (3): 400-411. doi: 10.19788/j.issn.2096-6369.000023

• Special Issue: "Scientific Data Security for High-Quality Sharing" (Part II) •

Quantitative Analysis of Data-Driven Deep Learning Methods in Agriculture

LI JiaLe1,2,3, ZHANG JianHua1,2,3, WANG Jian1,2,3, ZHOU GuoMin1,2,3,*

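  1. Agricultural Information Institute of Chinese Academy of Agricultural Sciences, Beijing 100081, China
  2. National Agriculture Science Data Center, Beijing 100081, China
  3. Hainan National Breeding and Multiplication Institute at Sanya, Chinese Academy of Agricultural Sciences, Sanya 572024, Hainan, China
  • Received: 2023-12-19 Accepted: 2024-03-03 Published: 2024-09-26 Online: 2024-10-01
  • Corresponding author: ZHOU GuoMin, E-mail: zhouguomin@caas.cn
  • About the author: LI JiaLe, E-mail: 252211923@qq.com
  • Funding:
    National Key Research and Development Program of China (2022YFF0711805); National Natural Science Foundation of China (31971792); National Natural Science Foundation of China (32160421); Innovation Program of the Chinese Academy of Agricultural Sciences (CAAS-ASTIP-2023-AII); Innovation Program of the Chinese Academy of Agricultural Sciences (ZDXM23011); Nanfan Special Project of the Hainan National Breeding and Multiplication Institute at Sanya, Chinese Academy of Agricultural Sciences (YBXM2312); Nanfan Special Project of the Hainan National Breeding and Multiplication Institute at Sanya, Chinese Academy of Agricultural Sciences (YDLH01); Nanfan Special Project of the Hainan National Breeding and Multiplication Institute at Sanya, Chinese Academy of Agricultural Sciences (YDLH07); Nanfan Special Project of the Hainan National Breeding and Multiplication Institute at Sanya, Chinese Academy of Agricultural Sciences (YBXM10); Basic Scientific Research Fund for Central Public-Interest Research Institutes (JBYW-AII-2023-06); Science and Technology Special Project of Sanya Yazhou Bay Science and Technology City (SCKJ-JYRC-2023-45)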

Metrological Analysis of Data-driven Deep Learning Methods for Agriculture

LI JiaLe1,2,3, ZHANG JianHua1,2,3, WANG Jian1,2,3, ZHOU GuoMin1,2,3,*

  1. Agricultural Information Institute of Chinese Academy of Agricultural Sciences, Beijing 100081, China
  2. National Agriculture Science Data Center, Beijing 100081, China
  3. Hainan National Breeding and Multiplication Institute at Sanya, Chinese Academy of Agricultural Sciences, Sanya 572024, Hainan, China
  • Received: 2023-12-19 Accepted: 2024-03-03 Published: 2024-09-26 Online: 2024-10-01

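Abstract (translated from the Chinese):

As artificial intelligence, computer vision, and deep learning are developed and applied in agriculture, data-driven agricultural deep learning models have become a new research paradigm in agricultural science. Agricultural datasets are the foundation of deep learning model training: high-quality, large-scale, and diverse datasets effectively improve model performance and thereby support the application of deep learning in smart agriculture. To help researchers in related fields better understand how data drives deep learning and to make full use of deep learning in agriculture, this paper uses quantitative analysis to summarize the basic characteristics of agricultural datasets, including type, scale, and source. Datasets are divided into 4 categories by deep learning method, including object detection, image segmentation, and image recognition, and into 7 categories by application area, including visual navigation, feature recognition, and non-destructive testing. The results show that image data is the dominant dataset type, with image counts mostly in the range of 50 to 1,500. Because of the particular nature of agricultural data collection, most datasets are built by individual researchers and some come from public datasets; the datasets are mainly used for feature recognition. In the future, as models grow larger, the requirements on datasets will rise accordingly, so large-scale, well-balanced, accurately annotated datasets need to be built continuously. By emphasizing the driving force and importance of data for deep learning models, this paper provides a theoretical basis for data-driven deep learning applications in agriculture.

Keywords: digital agriculture, deep learning, datasets, quantitative analysis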

Abstract:

With the development and application of artificial intelligence, computer vision, and deep learning in agriculture, data-driven deep learning models have become a new research paradigm for agricultural information extraction. Agricultural datasets are the basis for deep learning model training, and high-quality, large-scale, diverse datasets can effectively improve model performance, thereby advancing the application of deep learning in smart agriculture. To help researchers in related fields better understand how data drives deep learning and make full use of deep learning in agriculture, this paper applies metrological analysis to summarize the basic characteristics of agricultural datasets, such as type, scale, and source. The datasets are divided into four categories according to deep learning method, including target detection, image segmentation, and image recognition, and into seven categories according to application area, including visual navigation, feature recognition, and non-destructive testing. The results show that image data is the dominant dataset type and that image counts are concentrated in the range of 50 to 1,500. Owing to the particular nature of agricultural data collection, most datasets are constructed by individual researchers and some come from public datasets; the datasets are mainly used for feature recognition. In the future, as models grow larger, the requirements on datasets will rise as well, so large-scale, well-balanced, accurately labeled datasets must be built continuously. By emphasizing the driving force and importance of data for deep learning models, this paper provides a theoretical basis for using data to advance deep learning applications in agriculture.

Key words: Digital agriculture, deep learning, datasets, metrological analysis
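
The kind of tally the abstract describes (grouping datasets by deep learning method and by application area, and checking how many fall inside a given image-count range) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' procedure: the record layout, the `tally` function, and every row in `RECORDS` are hypothetical placeholders rather than data from the paper. The default bounds of 50 and 1,500 images follow the range given in the Chinese abstract and can be changed through the `low` and `high` parameters.

```python
from collections import Counter

# Hypothetical survey records: (method, application, image_count).
# These rows are illustrative placeholders, NOT data from the paper.
RECORDS = [
    ("object detection", "feature recognition", 1200),
    ("image segmentation", "visual navigation", 300),
    ("image recognition", "feature recognition", 60),
    ("object detection", "non-destructive testing", 5000),
]


def tally(records, low=50, high=1500):
    """Count records per method and per application area, and compute the
    share of records whose image count lies in the closed range [low, high]."""
    by_method = Counter(method for method, _, _ in records)
    by_application = Counter(app for _, app, _ in records)
    in_range = sum(1 for _, _, n in records if low <= n <= high)
    share = in_range / len(records) if records else 0.0
    return by_method, by_application, share


by_method, by_application, share = tally(RECORDS)
print(by_method.most_common())       # methods ordered by frequency
print(by_application.most_common())  # application areas ordered by frequency
print(f"{share:.0%} of the sample datasets fall inside the size range")
```

With real survey records substituted for the placeholder rows, the two counters give the per-category frequencies and `share` gives the fraction of datasets inside the chosen size range.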