Data Resources

Construction Data Set of Knowledge Map of main Crops Approved Varieties in Guangdong Province from 2016 to 2023

  • GAO ZhuoJun,
  • ZHANG DanDan,
  • CHEN RongYu
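  • 1. Institute of Agricultural Economics and Information, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, China
    2. Agricultural Information Institute of CAAS / Key Laboratory of Knowledge Mining and Knowledge Service for Agricultural Convergence Publishing, Beijing 100081, China
    3. Haifeng County Agricultural Science Research Institute, Shanwei 516499, Guangdong, China
GAO ZhuoJun, E-mail: 1290035379@qq.com
ZHANG DanDan, E-mail: zhangdandan01@caas.cn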

Received date: 2024-08-15

Accepted date: 2024-09-29

Online published: 2025-06-23

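Funding

Construction of the Guangdong Provincial Lingnan Characteristic Agriculture Science Data Center (2021B1212100005); Research on Knowledge Fusion and Shared Services of Crop Seed Industry Data Resources (2023KMKS04)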

Construction Data Set of Knowledge Map of main Crops Approved Varieties in Guangdong Province from 2016 to 2023

  • GAO ZhuoJun ,
  • ZHANG DanDan ,
  • CHEN RongYu
  • 1. Institute of Agricultural Economics and Information, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, China
    2. Agricultural Information Institute of CAAS / Key Laboratory of Knowledge Mining and Knowledge Service for Agricultural Convergence Publishing, Beijing 100081, China
    3. Haifeng County Agricultural Science Research Institute, Shanwei 516499, Guangdong, China

Received date: 2024-08-15

Accepted date: 2024-09-29

Online published: 2025-06-23

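Abstract

This study combines data on crop varieties approved in Guangdong Province with knowledge graph techniques. The seed industry is the starting link of the agricultural industry chain and an important pillar of national food security and economic development. Approved varieties are a key innovation resource in this link: they are released for extension only after strict testing and objective evaluation, which supports the protection and use of germplasm resources and promotes high-quality development of the seed industry. As agricultural informatization advances, the volume of agricultural data has grown sharply, and modern information technologies such as big data and artificial intelligence have played a prominent role in raising agricultural production efficiency and optimizing resource allocation. Knowledge graphs, an important branch of artificial intelligence and semantic networks, are widely used across many fields, whereas knowledge graph research in agriculture has focused mainly on crop cultivation, water and fertilizer management, and pest and disease control. Considering data reliability, practicality, and continuity, this study drew on information publicly released by the Department of Agriculture and Rural Affairs of Guangdong Province and collected eight years (2016-2023) of data on approved crop varieties in Guangdong Province as the base data. The source data were stored in .doc format and contained large amounts of text and characters. To support machine recognition and subsequent knowledge graph construction, the study removed noise through data cleaning, extracted common attributes from the varieties' characteristics and yield performance, and consolidated 823 germplasm resource records for approved varieties of three crops (rice, corn, and soybean), stored as structured data in .xlsx and .json formats. To verify the validity of the data, the study used the Neo4j graph database to build a knowledge graph of approved varieties of the main crops in Guangdong Province. Research and production organizations can use this dataset to build an expert knowledge base of approved crop varieties and, through database expansion and multi-source data fusion, develop intelligent services such as question answering, management decision support, and information recommendation for specific agricultural tasks.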

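Data summary:

| Item | Description |
| --- | --- |
| Dataset name | Construction Data Set of Knowledge Map of main Crops Approved Varieties in Guangdong Province from 2016 to 2023 |
| Subject area | Other disciplines of agriculture (21099) |
| Research topic | Crops; agricultural knowledge graph; data mining |
| Time range | 2016-2023 |
| Temporal resolution | Year |
| Geographical scope | Guangdong Province |
| Data types and technical formats | .xlsx, .json |
| Dataset structure | One tabular file and three text files. The tabular file contains 823 germplasm resource records for approved varieties of three crops (rice, corn, and soybean) in Guangdong Province from 2016 to 2023; the text files contain the common high-frequency attribute data extracted for rice, corn, and soybean from their characteristics and yield performance. |
| Volume of dataset | 4.18 MB |
| Key indicators | Crop category, variety name, variety source, growth period, planting time, morphological characteristics, disease resistance, yield performance, average yield per mu, planting area, etc. |
| Data accessibility | CSTR: 17058.11.sciencedb.agriculture.00117; https://cstr.cn/17058.11.sciencedb.agriculture.00117; DOI: 10.57760/sciencedb.agriculture.00117; https://doi.org/10.57760/sciencedb.agriculture.00117 |
| Financial support | Guangdong Provincial Lingnan Characteristic Agriculture Science Data Center (2021B1212100005); Research on Knowledge Fusion and Shared Services of Crop Seed Industry Data Resources (2023KMKS04) |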

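Cite this article

GAO ZhuoJun, ZHANG DanDan, CHEN RongYu. Construction Data Set of Knowledge Map of main Crops Approved Varieties in Guangdong Province from 2016 to 2023[J]. Journal of Agricultural Big Data, 2025, 7(2): 261-268. DOI: 10.19788/j.issn.2096-6369.100042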

Abstract

This study combines data on approved crop varieties in Guangdong Province with knowledge graph technologies. The seed industry is the initial link of the agricultural industrial chain and an important pillar of national food security and economic development. As an important innovative resource in this link, approved varieties are released only after strict testing and objective evaluation, which supports the protection and utilization of germplasm resources and promotes the high-quality development of the seed industry. With the advancement of agricultural informatization, the amount of agricultural data has increased dramatically, and modern information technologies such as big data and artificial intelligence have played a prominent role in improving agricultural production efficiency and optimizing resource allocation. As an important branch of artificial intelligence and semantic networks, knowledge graphs have been widely used in various fields, while knowledge graph research in agriculture has focused mainly on crop cultivation, water and fertilizer management, and pest and disease control. Based on the reliability, practicability, and continuity of the data, this study collected eight years of data on approved crop varieties in Guangdong Province, from 2016 to 2023, from information publicly released by the Guangdong Provincial Department of Agriculture and Rural Affairs. The source data were stored in .doc format and contained a large amount of text and characters. To facilitate machine recognition and subsequent knowledge graph construction, this study removed noise through data cleaning and extracted common attributes from the characteristics and yield performance of the varieties. Finally, 823 germplasm resource records for approved varieties of three crops (rice, corn, and soybean) were sorted and merged, and stored as structured data in .xlsx and .json formats. To verify the validity of the data, a knowledge graph of approved varieties of the main crops in Guangdong Province was constructed using the Neo4j graph database. Research and production organizations can use this dataset to establish an expert knowledge base of approved crop varieties and, through database expansion and multi-source data fusion, build intelligent services such as question answering, management decision support, and information recommendation for specific agricultural tasks.
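The step from structured variety records to a graph can be illustrated with a minimal sketch. It assumes one record per approved variety and turns each attribute into a (head, relation, tail) triple plus a parameterized Cypher `MERGE` statement. The field names, relationship types, node labels, and sample values are hypothetical and are not taken from the dataset; the code only builds query strings and does not connect to Neo4j.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical record; field names and values are illustrative only.
SAMPLE_RECORD: Dict[str, str] = {
    "variety_name": "Example Variety A",
    "crop_category": "rice",
    "variety_source": "Parent X / Parent Y",
    "disease_resistance": "resistant to rice blast",
}

# Hypothetical mapping from record field to relationship type.
# Relationship types cannot be passed as Cypher parameters, so they
# come from this fixed mapping rather than from the data itself.
RELATIONS: Dict[str, str] = {
    "crop_category": "BELONGS_TO",
    "variety_source": "DERIVED_FROM",
    "disease_resistance": "HAS_RESISTANCE",
}


def record_to_triples(record: Dict[str, str]) -> List[Triple]:
    """Turn one variety record into (head, relation, tail) triples."""
    head = record["variety_name"]
    triples: List[Triple] = []
    for field, relation in RELATIONS.items():
        value = record.get(field)
        if value:  # skip missing or empty attributes
            triples.append((head, relation, value))
    return triples


def triple_to_cypher(triple: Triple) -> Tuple[str, Dict[str, str]]:
    """Build a parameterized Cypher MERGE statement for one triple."""
    head, relation, tail = triple
    query = (
        "MERGE (v:Variety {name: $head}) "
        "MERGE (a:Attribute {value: $tail}) "
        f"MERGE (v)-[:{relation}]->(a)"
    )
    return query, {"head": head, "tail": tail}


if __name__ == "__main__":
    for t in record_to_triples(SAMPLE_RECORD):
        query, params = triple_to_cypher(t)
        print(query, params)
```

Using `MERGE` rather than `CREATE` keeps repeated loads from duplicating nodes, which matters when many varieties share the same crop category or resistance value.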

Data summary:

| Items | Description |
| --- | --- |
| Dataset name | Construction Data Set of Knowledge Map of main Crops Approved Varieties in Guangdong Province from 2016 to 2023 |
| Specific subject area | Other disciplines of agriculture |
| Research topic | Crops; agricultural knowledge graph; data mining |
| Time range | 2016-2023 |
| Temporal resolution | Year |
| Geographical scope | Guangdong Province |
| Data types and technical formats | .xlsx, .json |
| Dataset structure | This dataset consists of one tabular file and three text files. The tabular file contains 823 germplasm resource records for approved varieties of three crops (rice, corn, and soybean) in Guangdong Province from 2016 to 2023. The text files contain the common high-frequency attribute data extracted for rice, corn, and soybean from their characteristics and yield performance. |
| Volume of dataset | 4.18 MB |
| Key indicators in dataset | Crop category, variety name, variety source, growth period, planting time, morphological characteristics, disease resistance, yield performance, average yield per mu, planting area, etc. |
| Data accessibility | CSTR: 17058.11.sciencedb.agriculture.00117; https://cstr.cn/17058.11.sciencedb.agriculture.00117; DOI: 10.57760/sciencedb.agriculture.00117; https://doi.org/10.57760/sciencedb.agriculture.00117 |
| Financial support | Guangdong Provincial Lingnan Characteristic Agriculture Science Data Center (2021B1212100005); Research on knowledge fusion and shared services of crop seed industry data resources (2023KMKS04) |
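As a rough illustration of working with the .json form of such records, the sketch below parses a small JSON payload, counts varieties per crop category, and converts an average yield per mu to a per-hectare figure (1 hectare = 15 mu). The keys and sample values are assumptions made for this example; the actual files may use different field names and units.

```python
import json
from collections import Counter
from typing import Any, Dict, List

MU_PER_HECTARE = 15  # 1 hectare = 15 mu

# Hypothetical JSON payload; keys and values are illustrative only.
RAW = """
[
  {"crop_category": "rice", "variety_name": "Example A", "avg_yield_kg_per_mu": 450.0},
  {"crop_category": "rice", "variety_name": "Example B", "avg_yield_kg_per_mu": 480.0},
  {"crop_category": "soybean", "variety_name": "Example C", "avg_yield_kg_per_mu": 180.0}
]
"""


def load_records(text: str) -> List[Dict[str, Any]]:
    """Parse a JSON array of variety records."""
    return json.loads(text)


def count_by_crop(records: List[Dict[str, Any]]) -> Counter:
    """Count how many varieties fall under each crop category."""
    return Counter(r["crop_category"] for r in records)


def yield_kg_per_hectare(record: Dict[str, Any]) -> float:
    """Convert average yield from kg per mu to kg per hectare."""
    return record["avg_yield_kg_per_mu"] * MU_PER_HECTARE


if __name__ == "__main__":
    records = load_records(RAW)
    print(count_by_crop(records))
    for r in records:
        print(r["variety_name"], yield_kg_per_hectare(r))
```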
