Journal of Agricultural Big Data, 2025, Vol. 7, Issue (3): 379-392. doi: 10.19788/j.issn.2096-6369.100053


Agricultural Pest and Disease Information Retrieval Dataset

WANG Zhen1,2, QIN Feng2, QIAO Xi1,2, HUANG Cong2, LIU Bo2, WAN FangHao2, WANG Chen JiaoZi2, HUANG YiQi1,*

  1. College of Mechanical Engineering, Guangxi University, Nanning 530000, China
  2. Institute of Agricultural Genomics, Chinese Academy of Agricultural Sciences, Shenzhen 518000, Guangdong, China
  • Received: 2024-12-30 Accepted: 2025-04-21 Online: 2025-09-26 Published: 2025-09-28
  • Contact: HUANG YiQi

Abstract:

With the rapid development of natural language processing and information retrieval technologies, the effective extraction and application of knowledge in the agricultural field has become increasingly important. The core of information retrieval lies in quickly and accurately locating relevant information in a knowledge base according to users' query requirements [1]. However, the lack of high-quality text datasets for the agricultural field in China has restricted the further development of agricultural pest and disease information retrieval technology. In addition, traditional search engines show low efficiency and insufficient accuracy when retrieving agricultural information: users often have to spend considerable time and effort re-screening and organizing massive, disordered data to obtain valuable agricultural knowledge. To address these problems, this paper reorganizes the text data on animals, plants, diseases, and invasive species accumulated by the laboratory over the years and combines it with data from extensive literature research. Through automated or semi-automated cleaning and denoising, the unstructured data were converted into structured data and stored in Excel format. The constructed agricultural information retrieval dataset includes three major categories: domestic agricultural pests and diseases, invasive alien species, and quarantine species. Agricultural pests and diseases include 1,254 diseases and 440 pests related to 83 crops; invasive alien species include 70 invasive alien animals and 130 invasive alien plants; quarantine species include 99 kinds of insects, 9 kinds of mollusks, 19 kinds of fungi, 25 kinds of prokaryotes, 18 kinds of nematodes, 37 kinds of viruses and viroids, and 42 kinds of weeds. In total, the dataset covers 2,143 kinds of pests and diseases.
This dataset covers a wide range of categories and can provide basic data support for the research and development of intelligent applications with friendly human-computer interaction, such as agricultural information retrieval, epidemic prevention and quarantine, and database construction in the agricultural field. It can also provide data query services for scientific research institutions and functional departments engaged in pest-related work.
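
To make the retrieval use case concrete, the following is a minimal sketch of keyword search over structured records of the kind the dataset describes. It is not the authors' retrieval method. The `Record` fields, the `search` function, and the three sample rows are assumptions made for illustration and are not drawn from the dataset files; in practice the rows would be read from the published .xlsx files, whose actual column names are not given here.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical fields; the real spreadsheet columns may differ.
    name: str
    category: str
    description: str


def search(records, query):
    """Rank records by how many query terms occur in their text.

    Matching is plain case-insensitive substring matching, which is
    deliberately crude and only meant to illustrate the idea.
    """
    terms = query.lower().split()
    scored = []
    for record in records:
        text = f"{record.name} {record.category} {record.description}".lower()
        score = sum(1 for term in terms if term in text)
        if score > 0:
            scored.append((score, record))
    # Stable sort: ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored]


# Illustrative rows only, not taken from the dataset.
records = [
    Record("rice blast", "agricultural disease",
           "fungal disease affecting rice leaves and panicles"),
    Record("fall armyworm", "invasive alien species",
           "moth whose larvae feed on maize and other crops"),
    Record("wheat stripe rust", "agricultural disease",
           "fungal disease affecting wheat leaves"),
]

results = search(records, "fungal rice")
for record in results:
    print(record.name, "|", record.category)
```

A real system would likely replace the substring score with a proper ranking function, but the structured layout (one row per species, one column per attribute) is what makes even this simple lookup possible.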

Data summary:

Item: Description
Dataset name: Agricultural Pest and Disease Information Retrieval Dataset
Specific subject area: Computer science and technology; other disciplines in agronomy
Research topic: Agricultural information retrieval; data mining; artificial intelligence
Time range: 2012-2024
Geographical scope: China
Data types and technical formats: .xlsx
Dataset structure: The agricultural information retrieval dataset includes three categories: domestic agricultural pests and diseases, invasive alien species, and quarantine species. Agricultural pests and diseases include 1,254 kinds of plant diseases and 440 kinds of insect pests related to 83 kinds of crops; invasive alien species include 70 kinds of invasive animals and 130 kinds of invasive plants; quarantine species include 99 kinds of insects, 9 kinds of mollusks, 19 kinds of fungi, 25 kinds of prokaryotes, 18 kinds of nematodes, 37 kinds of viruses and viroids, and 42 kinds of weeds. In total, the dataset covers 2,143 kinds of pests and diseases. The data for each category are saved in separate Excel files.
Volume of data: 4.96 MB
Key index in dataset: Types of pests and diseases
Data accessibility: CSTR: 17058.11.sciencedb.agriculture.00187; https://cstr.cn/17058.11.sciencedb.agriculture.00187; DOI: 10.57760/sciencedb.agriculture.00187; https://doi.org/10.57760/sciencedb.agriculture.00187
Financial support: National Key Research and Development Program of China (2021YFD1400100, 2021YFD1400102, 2021YFD1400101); Agricultural Science and Technology Innovation Program (ASTIP) (CAAS-ZDRW202505).
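
The counts given under the dataset structure can be tallied to check the reported total. The short sketch below uses only the figures stated in the summary; the dictionary layout and variable names are illustrative choices.

```python
# Category counts as reported in the data summary.
counts = {
    "agricultural pests and diseases": {"diseases": 1254, "pests": 440},
    "invasive alien species": {"animals": 70, "plants": 130},
    "quarantine species": {
        "insects": 99,
        "mollusks": 9,
        "fungi": 19,
        "prokaryotes": 25,
        "nematodes": 18,
        "viruses and viroids": 37,
        "weeds": 42,
    },
}

# Sum each category, then sum the categories.
subtotals = {category: sum(groups.values()) for category, groups in counts.items()}
total = sum(subtotals.values())

for category, subtotal in subtotals.items():
    print(f"{category}: {subtotal}")
print(f"total: {total}")  # 1694 + 200 + 249 = 2143
```

The three subtotals (1,694, 200, and 249) add up to the 2,143 kinds stated in the summary.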

Key words: agricultural data, web mining, information retrieval, datasets, agricultural disease, agricultural pest