Journal of Agricultural Big Data

Image Dataset of Wheat, Corn, and Rice Seedlings in Heilongjiang Province in 2022

QIN JiaLe, YUAN JiangHao, SONG GuoZhu, YAO HongXun, GUO LeiFeng, WANG XiaoLi

Journal of Agricultural Big Data 2024, 6 (4): 558-563. DOI: 10.19788/j.issn.2096-6369.100026

Abstract （1532）

HTML （62）

PDF（pc）（2655KB）（889）

During cultivation, most field crops are grown in open fields. The northeastern region of China experiences relatively low temperatures throughout the year. During the seedling stage, significant fluctuations in sunlight and rainfall can easily lead to weak and stunted seedlings, poorly developed root systems, and slow growth. Timely monitoring and management of crops during the seedling stage helps in understanding their growth status and environmental conditions, enabling prompt decision-making. Experimental data were collected from May 9, 2022, to June 16, 2022. RGB cameras installed at 11 meteorological stations in the experimental fields collected data seven times a day at 6:00, 8:00, 10:00, 12:00, 14:00, 16:00, and 18:00. The images were captured at a height of 2.4 meters with a 90° field of view, covering an area 4.4 meters long and 2.5 meters wide. Images were taken mainly under natural light from a vertical downward perspective. After organizing and screening, the dataset comprises approximately 2.59 GB of data, including 1.48 GB of visible light RGB data and 1.11 GB of near-infrared spectral data. This dataset enables leaf age identification through visible light RGB data and near-infrared spectral data. Extracted features (color features, image features, texture features, vegetation indices) can be fed into machine learning regression models for analysis and prediction. Moreover, this dataset is suitable for constructing convolutional neural network models for crop recognition or seedling identification, facilitating precise crop detection and further research on issues such as missed or replanted seedlings after transplanting.
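As a hedged illustration of the feature-extraction step mentioned above, the sketch below computes the Excess Green index (ExG = 2g - r - b on normalized chromatic coordinates), a common vegetation index for RGB images. The abstract does not say which indices the authors use, so the choice of ExG, the function names, and the sample pixel values are assumptions made purely for illustration.

```python
def excess_green(r, g, b):
    """Excess Green index of one RGB pixel, using chromatic coordinates.

    ExG = 2g - r - b, where r, g, b are each channel divided by R + G + B.
    A black pixel (R + G + B == 0) is mapped to 0.0 to avoid division by zero.
    """
    total = r + g + b
    if total == 0:
        return 0.0
    return (2 * g - r - b) / total


def mean_excess_green(pixels):
    """Average ExG over an iterable of (R, G, B) tuples, e.g. one image region."""
    values = [excess_green(r, g, b) for r, g, b in pixels]
    if not values:
        raise ValueError("pixels must not be empty")
    return sum(values) / len(values)


# Hypothetical pixel values, not taken from the dataset.
leaf_pixel = (60, 140, 50)    # green-dominant pixel
soil_pixel = (120, 100, 80)   # brownish pixel

print(excess_green(*leaf_pixel))
print(excess_green(*soil_pixel))
print(mean_excess_green([leaf_pixel, soil_pixel]))
```

A per-image mean such as `mean_excess_green` is one way to turn an image into a single scalar feature that a regression model could take as input, alongside color and texture features.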

Data summary：

Items	Description
Dataset name	Image Dataset of Wheat, Corn, and Rice Seedlings in Heilongjiang Province in 2022
Specific subject area	Agricultural science
Research Topic	Computer vision
Time range	May 2022-July 2022
Temporal resolution	1 day
Data types and technical formats	.jpg
Dataset structure	The dataset consists of two parts: a visible light RGB image dataset of field crops and a multispectral near-infrared image dataset of field crops. 1. The field crop RGB image data cover 38 days, with a data volume of 1.48 GB; 2. The field crop near-infrared spectral data cover 38 days, with a data volume of 1.11 GB.
Volume of dataset	2.59 GB
Key index in dataset	RGB images and near-infrared spectral images
Data accessibility	CSTR: https://cstr.cn/17058.11.sciencedb.agriculture.00092 DOI: https://doi.org/10.57760/sciencedb.agriculture.00092 NASDC Access link: https://agri.scidb.cn/, restricted access
Financial support	National Key R&D Program of China (2021ZD0110901); Science and Technology Planning Project of Inner Mongolia Autonomous Region (2021GG0341)

Survey of Differential Privacy Algorithms and Applications for High-Dimensional Data Publishing

LONG Chun, QIN ZeXiu, LI LiSha, LI Jing, YANG Fan, WEI JinXia, FU YuHao

Journal of Agricultural Big Data 2024, 6 (2): 170-184. DOI: 10.19788/j.issn.2096-6369.200001

Abstract （1518）

HTML （48）

PDF（pc）（804KB）（895）

With the further development of big data and machine learning technologies, handling high-dimensional data with complex structures, relationships, and rich semantic information containing dozens to hundreds of features has become a challenge. Safely utilizing such high-dimensional data, while ensuring the privacy of individuals, has become a significant topic today. Upon reviewing existing literature, we found numerous reviews on differential privacy technology itself, but few on the algorithms and applications of differential privacy specifically tailored for high-dimensional data. Therefore, this paper provides a review of the application of differential privacy in the field of high-dimensional data, aiming to delve into the strengths and weaknesses of different methods in protecting the privacy of high-dimensional data and to guide future research directions for differential privacy algorithms tailored for high-dimensional data publishing. Firstly, this paper introduces the principles and characteristics of differential privacy, summarizing the current research work on the technology itself. Then, it analyzes the application of differential privacy in high-dimensional data environments from the perspectives of data dimensionality reduction and data synthesis, discussing the challenges and issues faced by differential privacy and proposing preliminary solutions to better address the issues of privacy protection and data analysis in the current high-dimensional data landscape. Lastly, potential future research directions are proposed to facilitate technological exchange and further advancements in the application of differential privacy in high-dimensional data settings.
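To make the principle of differential privacy concrete, here is a minimal sketch of the standard Laplace mechanism applied to a counting query. It is not one of the high-dimensional publishing algorithms surveyed in the paper; the function names, the sensitivity of 1, and the example values are assumptions chosen for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponential variables with rate
    1/scale follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, rng):
    """Release a count under epsilon-differential privacy.

    Assumes a counting query, whose sensitivity is 1: adding or removing
    one individual changes the count by at most 1.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(42)  # fixed seed only so the demo is repeatable
for eps in (0.1, 1.0, 10.0):
    # A smaller epsilon means stronger privacy and noisier answers.
    print(eps, private_count(100, eps, rng))
```

In high-dimensional settings the noise needed per released statistic grows with the number of attributes, which is one reason the dimensionality reduction and data synthesis approaches reviewed in the paper are used instead of perturbing every marginal directly.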

Construction Process and Technological Prospects of Large Language Models in the Agricultural Vertical Domain

ZHANG YuQin, ZHU JingQuan, DONG Wei, LI FuZhong, GUO LeiFeng

Journal of Agricultural Big Data 2024, 6 (3): 412-423. DOI: 10.19788/j.issn.2096-6369.000052

Abstract （1327）

HTML （100）

PDF（pc）（1315KB）（5762）

With the proliferation of the internet, accessing agricultural knowledge and information has become more convenient. However, this information is often static and generic, failing to provide tailored solutions for specific situations. To address this issue, vertical domain models in agriculture combine agricultural data with large language models (LLMs), using natural language processing and semantic understanding technologies to provide real-time answers to agricultural questions and to play a crucial role in agricultural decision-making and extension. This paper details the construction process of LLMs in the agricultural vertical domain, including data collection and preprocessing, selection of an appropriate pre-trained base LLM, fine-tuning, Retrieval-Augmented Generation (RAG), and evaluation. The paper also discusses the application of the LangChain framework in agricultural Q&A systems. Finally, the paper summarizes challenges in building LLMs for the agricultural vertical domain, including data security, model forgetting, and model hallucination, and proposes future development directions for agricultural models, including the use of multimodal data, real-time data updates, the integration of multilingual knowledge, and the optimization of fine-tuning costs, to further promote the intelligence and modernization of agricultural production.
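The retrieval step of Retrieval-Augmented Generation can be sketched in a few lines. The example below is an assumption-laden toy: it ranks documents by bag-of-words cosine similarity rather than the dense embeddings a real system would likely use, it does not call LangChain or any LLM, and the documents are placeholder text rather than agronomic advice or content from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(q, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Assemble the augmented prompt that would be sent to an LLM."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Hypothetical knowledge-base snippets, for illustration only.
DOCS = [
    "Rice seedlings need shallow water after transplanting.",
    "Wheat Fusarium head blight spreads in warm, humid weather at flowering.",
    "Corn should be sown once the soil has warmed up in spring.",
]

print(build_prompt("When does Fusarium head blight spread in wheat?", DOCS))
```

Keeping retrieval separate from generation is what lets the knowledge base be updated without retraining the model, which relates to the real-time data update direction the paper proposes.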

Rice Yield Prediction UAV Remote Sensing Image Dataset of Heilongjiang Province in 2023

YUAN JiangHao, ZHENG ZuoJun, CHU ChangMing, YAO HongXun, LIU HaiLong, GUO LeiFeng

Journal of Agricultural Big Data 2024, 6 (4): 546-551. DOI: 10.19788/j.issn.2096-6369.100031

Abstract （1298）

HTML （84）

PDF（pc）（3288KB）（743）

Rice is one of the three major grain crops in China, and accurate, efficient, and timely prediction of rice yield is crucial for variety selection and optimization of field management. UAV remote sensing systems are widely used in crop pest and disease identification, crop growth monitoring, and crop phenotyping because they are fast, non-destructive, low-cost, and high-throughput. To explore the role of spectral data in estimating rice yield, this dataset used UAV remote sensing to collect multispectral images of the rice growth process. A total of 106 sample points of 1 m×1 m were selected for manual sampling and yield measurement, and visible light images were collected after sampling so that the spectral images could be linked to the yield data. The dataset was constructed after manual checking and organizing. The data were collected in Heilongjiang Province under cloudless conditions with sufficient light, from July to August 2023; in total, 3 days of multispectral data and 1 day of visible light data were collected over different varieties in the experimental field. The dataset is complete and provides data support for research on yield estimation.

Data summary：

Items	Description
Dataset name	Rice Yield Prediction UAV Remote Sensing Image Dataset of Heilongjiang Province in 2023
Specific subject area	Agricultural Science
Research Topic	Computer vision
Time range	July 2023-August 2023
Temporal resolution	Day
Data types and technical formats	.tif, .xlsx, .jpg
Dataset structure	The dataset consists of three parts. The first part is the multispectral image data of the entire growth period of rice, including six spectral channels: blue (450 nm), green (555 nm), red (660 nm), red edge 1 (720 nm), red edge 2 (750 nm), and near-infrared (840 nm), with a total of 14,226 images, approximately 32.6 GB. The second part is the yield data, saved in .xlsx format. The third part is visible light image data used to annotate sampling points, totaling 746 images, approximately 18.9 GB.
Volume of dataset	51.5 GB
Key index in dataset	Gradient settings, plot labeling, yield, multispectral images, RGB images
Data accessibility	CSTR：https://cstr.cn/17058.11.sciencedb.agriculture.00131 DOI：https://doi.org/10.57760/sciencedb.agriculture.00131 NASDC Access link： https://agri.scidb.cn/, restricted access
Financial support	National Science and Technology Major Project（2021ZD0110901）
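One common way to relate multispectral bands like those listed above to yield is through a vegetation index. The sketch below computes NDVI from the red (660 nm) and near-infrared (840 nm) channels. The abstract does not state which indices the authors use, so NDVI, the function names, and the reflectance values are assumptions for illustration only.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    NDVI = (NIR - Red) / (NIR + Red); returns 0.0 when both bands are 0.
    """
    denom = nir + red
    if denom == 0:
        return 0.0
    return (nir - red) / denom


def mean_plot_ndvi(nir_band, red_band):
    """Mean NDVI over a plot given two equally shaped 2-D lists of reflectance."""
    values = [
        ndvi(n, r)
        for nir_row, red_row in zip(nir_band, red_band)
        for n, r in zip(nir_row, red_row)
    ]
    if not values:
        raise ValueError("bands must not be empty")
    return sum(values) / len(values)


# Hypothetical 2x2 reflectance patches for one 1 m x 1 m sample point.
nir_patch = [[0.50, 0.48], [0.52, 0.50]]
red_patch = [[0.10, 0.12], [0.08, 0.10]]

print(mean_plot_ndvi(nir_patch, red_patch))
```

A per-plot mean of this kind could then be paired with the measured yield of the same sample point from the .xlsx file to fit a regression model.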

Severity Recognition Method of Field Wheat Fusarium Head Blight Based on AR Glasses and Improved YOLOv8m-seg

XU Wei, ZHOU JiaLiang, QIAN Xiao, FU ShouFu

Journal of Agricultural Big Data 2024, 6 (4): 497-508. DOI: 10.19788/j.issn.2096-6369.000065

Abstract （1205）

HTML （40）

PDF（pc）（6582KB）（1070）

Timely detection of the severity of Fusarium head blight in the field, followed by prevention and control measures matched to that severity, can improve the quality of wheat production. Current methods for identifying the severity of wheat Fusarium head blight mostly work on one or a few wheat ears at a time, which is too inefficient for field surveys. To address this issue, the study proposes an efficient and accurate method for identifying the severity of wheat Fusarium head blight in the field. The CBAM attention mechanism is introduced to improve the performance of the YOLOv8m-seg model. The improved YOLOv8m-seg model first segments wheat ear instances in the collected distant images, and a non-target suppression method then crops out individual wheat ears. The improved model next segments diseased and healthy spikelets in each wheat ear, and the severity of Fusarium head blight in each ear is calculated from the numbers of diseased and healthy spikelets. To verify the effectiveness of the proposed method, two datasets were constructed for testing: the dataset of wheat ears (D-WE) and the dataset of wheat spikelets (D-WS). The experimental results show that YOLOv8m-seg has better overall performance than YOLOv8n-seg, YOLOv8s-seg, YOLOv8l-seg, and YOLOv8x-seg on the two datasets. The model with CBAM outperforms the models with the SE, ECA, and CA attention mechanisms. Compared with the original model, the mean average precision of the improved YOLOv8m-seg model increased by 0.9 and 1.2 percentage points on the two datasets, respectively. The proposed severity recognition method improved severity accuracy by 38.4, 6.2, and 2.4 percentage points compared with three other recognition methods.
After the improved YOLOv8m-seg model was deployed through the TensorRT inference framework, the total algorithm time was only 1/7 of the original. Finally, this study investigated the severity of wheat Fusarium head blight at three locations using AR glasses. The results showed that the average counting accuracy of intelligent identification of wheat Fusarium head blight based on AR glasses was as high as 0.953, and the investigation time was one-third of the manual investigation time. This demonstrates the effectiveness of the proposed method and lays a good foundation for intelligent field investigation of wheat Fusarium head blight.
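The final step of the pipeline, turning spikelet counts into a per-ear severity, can be sketched as below. The abstract only says severity is calculated from the numbers of diseased and healthy spikelets; treating it as the diseased fraction is an assumption, as are the function names and the example counts.

```python
def ear_severity(diseased, healthy):
    """Severity of one wheat ear as the fraction of diseased spikelets.

    Assumed definition: diseased / (diseased + healthy).
    """
    if diseased < 0 or healthy < 0:
        raise ValueError("spikelet counts must be non-negative")
    total = diseased + healthy
    if total == 0:
        raise ValueError("ear has no segmented spikelets")
    return diseased / total


def field_mean_severity(ears):
    """Mean severity over a list of (diseased, healthy) counts, one per ear."""
    severities = [ear_severity(d, h) for d, h in ears]
    if not severities:
        raise ValueError("no ears provided")
    return sum(severities) / len(severities)


# Hypothetical spikelet counts for three ears cropped from one image.
ears = [(3, 9), (0, 12), (6, 6)]
print([ear_severity(d, h) for d, h in ears])  # [0.25, 0.0, 0.5]
print(field_mean_severity(ears))              # 0.25
```

In the described method the two counts per ear would come from the second segmentation pass, which separates diseased from healthy spikelets inside each cropped ear.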

Agri-CBI: Agricultural Big Data Security Governance Model Leveraging Cloud-Blockchain Integration

YUE RuiJun, HE Liang, TANG MinRui, YAN Wei, LIU ShengQuan, YANG WanXia, SUN WeiHong, HUANG YongFeng

Journal of Agricultural Big Data 2024, 6 (3): 333-350. DOI: 10.19788/j.issn.2096-6369.000039

Abstract （1104）

HTML （30）

PDF（pc）（5325KB）（518）

The agricultural production model in China is transitioning from traditional to smart agriculture. As the scale of data held by agricultural organizations keeps growing and "data silos" hinder data sharing, it is difficult to gather agricultural data on a large scale to guide precise agricultural decision-making. This study applies Cloud-Blockchain Integration and data security governance technologies to distributed agriculture scenarios to address these problems and explore their practical effect. For a distributed agricultural scenario, it designs an agricultural big data governance algorithm, based on IPFS, blockchain, and cloud computing, that can be deployed on smart contracts, and constructs a multi-party agricultural data aggregation model, a complete, secure, and traceable data protection model, and a typical scenario application model. Taking the agricultural production of Huaxing Farm and its affiliated agricultural organizations in Changji, Xinjiang as an example, it further builds a Cloud-Blockchain Integration agricultural big data platform. In experiments comparing the Cloud-Blockchain Integration governance model with two traditional models, the proposed model achieved a better-balanced comprehensive performance and outperformed the traditional models.

Multispectral Image Dataset of Wheat Full Growth Cycle in Beijing Province in 2024

WANG JianLi, QU MingShan, LIU ZhenYu, SHI KaiLi, ZHANG ShiRui, LI GuangWei, ZHANG ZhongLili

Journal of Agricultural Big Data 2025, 7 (1): 126-131. DOI: 10.19788/j.issn.2096-6369.100045

Abstract （1041）

HTML （56）

PDF（pc）（1296KB）（970）

Wheat is one of the major global food crops. With the development of Internet of Things (IoT) technology, multispectral dynamic acquisition captures rich spectral information to identify substances and features that are difficult to distinguish in the visible range, providing more detailed data support for water and fertilizer deficiency diagnosis, pest and disease warning, and similar tasks. Currently, most studies use a drone remote sensing platform equipped with a multispectral camera to acquire multispectral images of the wheat canopy. However, drones have high operation and maintenance costs and cannot collect continuous growth information in real time throughout the entire growth cycle of wheat. In contrast, multispectral in-situ monitoring equipment can collect real-time growth data day by day throughout the entire growth cycle of a crop in a specific region, enabling continuous monitoring of crop growth dynamics. In this study, between April 9 and June 6, 2024, images of wheat in the test field at the National Precision Agriculture Research and Demonstration Base in Xiaotangshan, Beijing, were collected at the jointing, heading, flowering, and grain-filling stages. The valid data after screening and organizing are multispectral images collected hourly from 6:00 to 18:00 every day, with a data volume of 1.42 GB. The images were captured at regular intervals by multispectral in-situ monitoring equipment deployed in the natural field environment and are stored in folders. The data were screened and organized by professional staff to ensure high quality and reliability. This dataset can be used for water and fertilizer deficit diagnosis and pest and disease monitoring of wheat based on the multispectral image data.
Extracted information such as reflectance values, vegetation indices, color characteristics, texture characteristics, and vegetation coverage can be fed into prediction models for analysis and prediction. The dataset is also suitable for building network models for wheat chlorophyll content estimation, biomass estimation, and related studies.

Data summary:

Items	Description
Dataset name	Multispectral Image Dataset of Wheat Full Growth Cycle in Beijing Province in 2024
Specific subject area	Agricultural science
Research topic	Computer vision
Time range	April 2024-June 2024
Temporal resolution	1 hour
Geographical scope	National Precision Agriculture Research and Demonstration Base in Xiaotangshan, Beijing
Data types and technical formats	.tif
Dataset structure	The dataset consists of multispectral images of wheat canopy, covering 610 time periods.
Volume of dataset	1.42 GB
Key index in dataset	Multispectral images
Data accessibility	https://cstr.cn/17058.11.sciencedb.agriculture.00121 https://doi.org/10.57760/sciencedb.agriculture.00121 NASDC Access Link: https://agri.scidb.cn/, Restricted Access
Financial support	National Key Research and Development Program of China (2022YFD1900404), Beijing Academy of Agricultural and Forestry Excellent Youth Science Fund (YXQN202304)

Agricultural Pest and Disease Information Retrieval Dataset

WANG Zhen, QIN Feng, QIAO Xi, HUANG Cong, LIU Bo, WAN FangHao, WANG Chen JiaoZi, HUANG YiQi

Journal of Agricultural Big Data 2025, 7 (3): 379-392. DOI: 10.19788/j.issn.2096-6369.100053

Abstract （989）

HTML （81）

PDF（pc）（1936KB）（836）

With the rapid development of natural language processing and information retrieval technologies, the effective extraction and application of knowledge in the agricultural field have become increasingly important. The core of information retrieval lies in quickly and accurately locating relevant information in a knowledge base according to users' query requirements [1]. However, the lack of high-quality text datasets in the agricultural field in China has restricted the further development of agricultural pest and disease information retrieval technology. In addition, traditional search engines show low efficiency and insufficient accuracy for information retrieval in the agricultural field: users often need to spend considerable time and energy re-screening and organizing massive, disordered data to obtain valuable agricultural knowledge. To address these problems, this paper reorganized the text data on animals, plants, diseases, and invasive species accumulated by the laboratory over the years, combined it with extensive literature research data, and, after automated or semi-automated data cleaning and denoising, converted the unstructured data into structured data stored in Excel format. The constructed agricultural information retrieval dataset includes three major categories: domestic agricultural pests and diseases, invasive alien species, and quarantine species. Agricultural pests and diseases include 1,254 diseases and 440 pests related to 83 crops; invasive alien species include 70 invasive alien animals and 130 invasive alien plants; quarantine species include 99 kinds of insects, 9 kinds of mollusks, 19 kinds of fungi, 25 kinds of prokaryotes, 18 kinds of nematodes, 37 kinds of viruses and viroids, and 42 kinds of weeds. In total, the dataset covers 2,143 kinds of pests and diseases.
This dataset covers a wide range of categories and can provide basic data support for the research and development of human-computer interaction-friendly intelligent applications such as agricultural information retrieval, epidemic prevention and quarantine, and database construction in the agricultural field. At the same time, it can provide relevant data query services for scientific research institutions and functional departments engaged in pest-related work.
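The per-category counts given in the abstract can be tallied with a short script. The numbers below are copied from the text; the dictionary layout and variable names are only an illustrative assumption and do not reflect how the Excel files are organized.

```python
# Counts as stated in the abstract, grouped by the three major categories.
COUNTS = {
    "domestic agricultural pests and diseases": {
        "diseases": 1254,
        "pests": 440,
    },
    "invasive alien species": {
        "animals": 70,
        "plants": 130,
    },
    "quarantine species": {
        "insects": 99,
        "mollusks": 9,
        "fungi": 19,
        "prokaryotes": 25,
        "nematodes": 18,
        "viruses and viroids": 37,
        "weeds": 42,
    },
}

# Subtotal per major category, then the overall total.
subtotals = {category: sum(groups.values()) for category, groups in COUNTS.items()}
total = sum(subtotals.values())

for category, subtotal in subtotals.items():
    print(f"{category}: {subtotal}")  # 1694, 200, and 249 respectively
print(f"total: {total}")              # total: 2143
```

The three subtotals (1,694 + 200 + 249) add up to 2,143, which matches the total given in the data summary table.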

Data summary:

Items	Description
Dataset name	Agricultural Pest and Disease Information Retrieval Dataset
Specific subject area	Computer science and technology; Other disciplines in agronomy
Research topic	Agricultural information retrieval; data mining; artificial intelligence
Time range	2012-2024
Geographical scope	China
Data types and technical formats	.xlsx
Dataset structure	The agricultural information retrieval dataset includes three categories: domestic agricultural pests and diseases, invasive alien species, and quarantine species. Agricultural pests and diseases include 1,254 kinds of plant diseases and 440 kinds of insect pests related to 83 kinds of crops; invasive alien species include 70 kinds of invasive animals and 130 kinds of invasive plants; quarantine species include 99 kinds of insects, 9 kinds of mollusks, 19 kinds of fungi, 25 kinds of prokaryotes, 18 kinds of nematodes, 37 kinds of viruses and viroids, and 42 kinds of weeds, for a total of 2,143 kinds of pests and diseases. The data of each category are saved in separate Excel files.
Volume of data	4.96 MB
Key index in dataset	Types of pests and diseases
Data accessibility	CSTR:17058.11.sciencedb.agriculture.00187; https://cstr.cn/17058.11.sciencedb.agriculture.00187 DOI:10.57760/sciencedb.agriculture.00187; https://doi.org/10.57760/sciencedb.agriculture.00187
Financial support	National key research and development plan (2021YFD1400100, 2021YFD1400102, 2021YFD1400101), The Agricultural Science and Technology Innovation Program (ASTIP)(CAAS-ZDRW202505).

A Review and Analysis of Keyword Search Technologies for Data Privacy Protection

YANG Yu, WANG Wei, CHEN ShiWu

Journal of Agricultural Big Data 2024, 6 (2): 185-204. DOI: 10.19788/j.issn.2096-6369.000012

Abstract （928）

HTML （31）

PDF（pc）（2950KB）（1386）

In the modern information society, data privacy protection has become a focal point of public attention. As internet users increasingly prioritize personal information security, research on privacy protection in the field of information retrieval has become crucial. Privacy-protecting keyword search technology aims to provide secure and private search services without revealing users' query intentions. Although existing technologies have made progress in meeting basic needs, how to reduce the risk of privacy leaks while maintaining efficiency remains a challenge. For this purpose, this paper provides a detailed review of privacy-protecting keyword search technology, systematically analyzing the principles, strengths, and weaknesses of current mainstream technologies. The study finds that although existing technologies can encrypt user queries to prevent direct leakage of sensitive information, there is still a potential risk of privacy leakage between the query pattern, access mode, and returned results. In response to this issue, the paper proposes a series of improvement directions to enhance the effectiveness of privacy protection. Furthermore, current privacy protection technologies face numerous challenges in practical applications, involving aspects such as technological enhancement and privacy compliance. By integrating and innovating cutting-edge technologies related to privacy-protecting keyword search, new ideas and solutions are expected to resolve these technical problems and promote the development of privacy protection technology to a higher level. Finally, the paper provides an outlook on the future development directions and innovative application models of privacy-protecting keyword search technology.

An Overview of Zero-Knowledge Proof Technology and Its Typical Algorithms and Tools

WAN Wei, LIU JianWei, LONG Chun, LI Jing, YANG Fan, FU YuHao, YUAN ZiMeng

Journal of Agricultural Big Data 2024, 6 (2): 205-219. DOI: 10.19788/j.issn.2096-6369.200002

Abstract （856）

HTML （39）

PDF（pc）（517KB）（3799）

As data security and privacy protection grow in importance, Zero-Knowledge Proofs (ZKPs) have become a powerful tool for protecting privacy. This article comprehensively discusses zero-knowledge proof technology and its application in modern cryptography. First, the article introduces the basic concepts of zero-knowledge proofs, as well as different types of ZKPs such as zk-SNARKs and zk-STARKs, along with their technical characteristics and application scenarios. In particular, the article conducts an in-depth study of zk-SNARKs. It also discusses other proof mechanisms such as zk-STARKs and Bulletproofs, comparing their differences in design and performance. Then, it focuses on the application of ZKPs in the blockchain environment and analyzes the related tools for writing zero-knowledge proofs. Finally, it points out some potential problems and future research directions in the field of zero-knowledge proofs.

Dataset of Organic Fertilizer Raw Materials and Commercial Organic Fertilizer Tetracyclines in Inner Mongolia in 2020

LI BaoHe, LIU TingTing, LI YanFang, REN Chao, YIN Xin, SHA Na, DONG Qi, CHEN JunJun, DI CaiXia, LI XiuPing

Journal of Agricultural Big Data 2024, 6 (4): 575-579. DOI: 10.19788/j.issn.2096-6369.100036

Abstract （808）

HTML （5）

PDF（pc）（377KB）（122）

Organic fertilizer is produced mainly from livestock and poultry manure, which is one of the important sources of antibiotic contamination in soil. Exploring the impact of antibiotic residues in organic fertilizer on the ecological environment, especially through the "organic fertilizer-soil-crop" pathway, is of great significance for the safe application of organic fertilizer. This dataset covers a total of 116 samples of sheep manure, cow manure, livestock and poultry manure raw materials, and commercial organic fertilizers made mainly from livestock and poultry manure, collected in different regions of Inner Mongolia, including 74 organic fertilizer raw materials and 42 finished products. The content of 4 tetracycline antibiotics was measured to evaluate the risk of tetracycline antibiotics across different types of livestock and poultry manure and different regions, so as to provide a basis for the safety risk monitoring and evaluation of regional organic fertilizers. The data showed that the detection rates in raw materials were 8.62% for oxytetracycline, 10.34% for tetracycline, 10.34% for chlortetracycline, and 1.72% for doxycycline, while those in commercial organic fertilizer were 2.58% for chlortetracycline and 0.86% for doxycycline; antibiotics were detected more often in raw materials than in commercial organic fertilizer.

Data summary：

Items	Description
Dataset name	Dataset of organic fertilizer raw materials and commercial organic fertilizer Tetracyclines in Inner Mongolia in 2020
Specific subject area	Agriculture
Research topic	Antibiotics in organic fertilizer
Time range	2020
Temporal resolution	Year
Geographical scope	Inner Mongolia, China
Spatial resolution	City
Data types and technical formats	.xlsx
Dataset structure	The dataset consists of one Excel file containing antibiotic characteristics in different regions of Inner Mongolia.
Volume of dataset	21 kB
Key index in dataset	Oxytetracycline;Tetracycline;Chlortetracycline;Doxycycline
Data accessibility	CSTR：https://cstr.cn/17058.11.sciencedb.agriculture.00028 DOI：https://doi.org/10.57760/sciencedb.agriculture.00028 NASDC Access Link: https://agri.scidb.cn/，Restricted Access
Financial support	Application and popularization of parameter kit detection technology for toxic and harmful substances in grain (2024TG08-3);Observation and Monitoring of Agricultural Basic Long Term Scientific and Technological Work (NAES036AE04); Inner Mongolia Agriculture and Animal Husbandry Innovation Fund (2023CYZX05).

Exploration of Big Data Security Issues in the Field of Intelligent Agriculture

WU YunKun, YANG Ying, LI Hao, XIONG Jian, CHEN XiangLing

Journal of Agricultural Big Data 2024, 6 (3): 380-391. DOI: 10.19788/j.issn.2096-6369.000029

Abstract （787）

HTML （32）

PDF（pc）（1031KB）（1577）

In the context of the rapid development of informatization, intelligent agriculture is an inevitable trend in agricultural development, and agricultural big data plays an important role in realizing it. Although agricultural big data has brought huge industrial momentum, many data security issues have arisen, and it is crucial to handle the relationship between agricultural big data technology and data security effectively. First, this paper redefines agricultural big data by analyzing it from various perspectives and elaborates, through a case study, on its role in promoting each link of the agricultural supply chain. It then analyzes in depth the distinctive attributes of agricultural big data, including its ubiquity, sociality, and intersectionality. Lastly, based on the three fundamental elements of security (confidentiality, integrity, and availability), the three key functions of security (authentication, authorization, and audit), and the particular characteristics of agricultural big data, and following the seven-stage life cycle of big data (data collection, data transmission, data storage, etc.), it constructs a comprehensive framework for managing big data security risks in intelligent agriculture scenarios. The unique features of agriculture present particular obstacles within the broader context of big data; to address them, a customized solution is devised that takes into account the specific needs of intelligent farming practices. This paper offers fresh insights and perspectives for addressing future data security issues in the field of intelligent agriculture, aiming to promote its faster and safer development.

Progress of Agricultural Big Data Research (2024)

Agricultural Information Institute of CAAS

Journal of Agricultural Big Data 2024, 6 (4): 433-468. DOI: 10.19788/j.issn.2096-6369.200003

Abstract （758）

HTML （85）

PDF（pc）（10122KB）（2043）

D-PAG: Cross-modal Wolfberry Pest Recognition Model Based on Parameter-Efficient Fine-Tuning

XING JiaLu, LIU JianPing, ZHOU GuoMin, LIU LiBo, WANG Jian

Journal of Agricultural Big Data 2024, 6 (4): 509-521. DOI: 10.19788/j.issn.2096-6369.000067

Abstract （684）

HTML （22）

PDF（pc）（3056KB）（1145）

With the development of multimodal foundation models (large models), efficiently transferring them to specific domains or tasks has become a current hot topic. This study uses the multimodal large model CLIP as the base model and employs parameter-efficient fine-tuning methods, such as Prompt and Adapter, to adapt CLIP to the task of goji berry pest identification. It introduces a cross-modal parameter-efficient fine-tuning model for goji berry pest recognition, named D-PAG. Firstly, learnable Prompts and Adapters are embedded in the input or hidden layers of the CLIP encoder to capture pest features. Then, gated units are utilized to integrate the Prompt and Adapter, further balancing the learning capacity. A GCS-Adapter is designed within the Adapter to enhance the attention mechanism for cross-modal semantic information fusion. To validate the effectiveness of the method, experiments were conducted on the goji berry pest dataset and the fine-grained dataset IP102. The experimental results indicate that with only 20% of the sample size, an accuracy of 98.8% was achieved on the goji dataset, and an accuracy of 99.5% was reached with 40% of the samples. On IP102, an accuracy of 75.6% was attained, comparable to ViT. This approach allows for efficient transfer of the foundational knowledge of multimodal large models to the specific domain of pest recognition with minimal additional parameters, providing a new technical solution for efficiently addressing agricultural image processing problems.
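The abstract does not specify the form of the gated unit that integrates the Prompt and Adapter branches. As a minimal sketch only, one common choice is a scalar sigmoid gate that blends two feature vectors; the function names, the scalar gate, and the toy vectors below are assumptions for illustration, not the D-PAG implementation.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(prompt_feat, adapter_feat, gate_logit):
    """Blend two equal-length feature vectors with a scalar gate.

    g = sigmoid(gate_logit) weights the prompt branch and (1 - g)
    weights the adapter branch. In a real model gate_logit would be
    a learnable parameter; here it is a plain number (assumption).
    """
    if len(prompt_feat) != len(adapter_feat):
        raise ValueError("feature vectors must have the same length")
    g = sigmoid(gate_logit)
    return [g * p + (1.0 - g) * a for p, a in zip(prompt_feat, adapter_feat)]


# Toy vectors, invented for illustration. A gate logit of 0 gives g = 0.5,
# so both branches contribute equally.
fused = gated_fusion([1.0, 0.0], [0.0, 1.0], gate_logit=0.0)
print(fused)
```

A large positive gate logit pushes the output toward the prompt branch and a large negative one toward the adapter branch, which is one way such a gate can balance the two.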

Dataset of Aromatic Components from the Fruits of 242 Table Grape Varieties

JI XiaoHao, WU YaJing, YU YiFei, WANG XiaoDi, LIU FengZhi, LI MingLiang, WANG He, LIU Xia, LIU Jun, WANG HaiBo

Journal of Agricultural Big Data 2025, 7 (1): 118-125. DOI: 10.19788/j.issn.2096-6369.100023

Abstract （656）

HTML （16）

PDF（pc）（1343KB）（698）

Aroma is one of the important quality traits of grapes and a focus of research on grape quality as well as an essential aspect of molecular design breeding. Grapes have a rich germplasm resource, which also exhibits abundant genetic diversity in aroma traits. In this study, solid-phase microextraction coupled with gas chromatography-mass spectrometry (SPME-GC-MS) was used to measure the aromatic components and their contents in the fruit of 242 grape varieties. The study also conducted correlation analyses between the components and sensory evaluation, inter-component correlations, and principal component analysis among the varieties. A total of 526 volatile components were detected, and 108 potential aroma components were screened, including esters, alcohols, aldehydes, ketones, terpenes, hydrocarbons, acids, and furans, covering eight types. Esters were the most numerous, followed by terpenes, while aldehydes were the most frequent, followed by hydrocarbons and alcohols. The top ten components with the highest correlation coefficients related to aroma sensory evaluation included six esters, three terpenes, and one hydrocarbon, with ethyl hexanoate having the highest correlation coefficient, followed by ethyl 2-hexenoate and linalool. Components of the same type exhibited high correlations, especially esters, terpenes, and furans, while correlations between different types were relatively low. Principal component analysis showed that most of the varieties clustered together and diverged in three principal component directions, which highly corresponded with the results of the sensory evaluation. This study provides essential data support for researching the genetic diversity of grape aroma traits and the specificity of germplasm.
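The correlation analysis between components and sensory evaluation can be illustrated with a short sketch. It assumes the Pearson coefficient (the abstract does not name the coefficient used), and the component names and numbers are invented placeholders, not values from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def rank_components(abundance, sensory_scores):
    """Sort components by their correlation with the sensory score, highest first."""
    scored = [(name, pearson(values, sensory_scores))
              for name, values in abundance.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical abundances of two components across four varieties,
# and hypothetical sensory scores for the same varieties.
abundance = {
    "component_a": [1.0, 2.0, 3.0, 4.0],
    "component_b": [4.0, 3.0, 2.0, 1.0],
}
sensory_scores = [2.0, 4.0, 6.0, 8.0]
ranking = rank_components(abundance, sensory_scores)
print(ranking)
```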

Data summary:

Items	Description
Dataset name	Dataset of Aromatic Components from the Fruits of 242 Table Grape Varieties
Specific subject area	Agronomy, biology
Research topic	Grape aroma components and content
Time range	2023-2024
Temporal resolution	year
Geographical scope	Huailai County, Zhangjiakou City, Hebei Province
Data types and technical formats	.XLS and .XLSX
Dataset structure	This dataset consists of 248 tabular data entries, primarily including GC-MS measurement results of fruit aroma from 242 grape varieties, summaries of volatile components and abundance, summaries of aroma components and abundance, sensory evaluation, correlations between aroma components and sensory evaluation, and correlations among aroma components.
Volume of dataset	39.54 MB
Data accessibility	DOI：https://doi.org/10.57760/sciencedb.agriculture.00107 CSTR： https://cstr.cn/17058.11.sciencedb.agriculture.00107
Financial support	National Key R&D Program(2023YFD1200103); National Agricultural Science and Technology Park Special Project (2021C-01); Key R&D Plan of Shandong Province (2022TZXD0010); The Agricultural Science and Technology Innovation Program of the Chinese Academy of Agricultural Sciences (CAAS-ASTIP-2021-RIP-02); Huailai Grape and Wine Industry Science & Technology Task Force

Rule-Based Framework for Scientific Data Security Governance: A New Tool for Understanding the Imbalance and Challenges of Data Protection and Utilization

WANG Jian, ZHOU GuoMin, LIAO FangYu, XU ZhePing, ZHANG JianHua, LIU TingTing

Journal of Agricultural Big Data 2024, 6 (3): 295-306. DOI: 10.19788/j.issn.2096-6369.000068

Abstract （629）

HTML （32）

PDF（pc）（1116KB）（987）

With the implementation of various data security laws and regulations centered on privacy protection, and the emergence of new governance factors such as data sovereignty, technological competition, and geopolitics, the requirements for the "protection" of scientific data have been increasingly elevated. This has objectively suppressed the "utilization" functions of data collection, processing, transmission, and analysis, leading to a significant risk of imbalance between the protection and utilization of scientific data. This imbalance is externally manifested in challenges such as the excessive burden of legal compliance and the weakening availability of public scientific data. The academic community, data managers, and policymakers urgently need effective analytical tools to understand and address these challenges in a systematic way. In response to this gap, this paper proposes a rule-based governance framework for scientific data security, aiming to provide a systematic analytical tool to address the protection-utilization imbalance and related challenges from the perspective of governance rules, including laws, ethics, and institutional policies. This framework integrates the major rule types in scientific data security governance and introduces three analytical tools: the "Island-Bridge Model," the "Law-Ethics Balance," and the "Moderate Implementation" principle, to explain the interaction mechanisms of these rules. The framework establishes the transmission paths between governance rules and the protection-utilization balance and uses these tools to explain two key challenges—excessive compliance burdens and weakened public scientific data availability—demonstrating its explanatory power and practical value. 
In the context of the long-term tightening of data security regulations, the rule-based analytical perspective and tools proposed in this paper enrich the theoretical foundation of scientific data security governance and provide practical references for addressing these challenges. The framework also offers essential theoretical support for policy communication among the academic community, data managers, and policymakers, ensuring the sustainable utilization of scientific data in the future.

Mongolia Grazing Density Dataset from 2006 to 2020

HUANG Jing, LI Ting, LI PengFei, ALTANSUKH Ochir, YANG MeiHuan, WANG Tao, LI Sha

Journal of Agricultural Big Data 2025, 7 (1): 77-84. DOI: 10.19788/j.issn.2096-6369.100037

Abstract （626）

HTML （13）

PDF（pc）（1571KB）（303）

The health of Mongolia's grassland system affects the efficiency of its livestock husbandry and ecological security at home and abroad. Measuring and controlling livestock grazing density is important for maintaining the health of Mongolia's grassland ecosystems and for the sustainable development of the livestock industry. The lack of information on spatial grazing density gradients has hindered research on grassland carrying capacity. This study is based on the 2015 Gridded Livestock of the World (GLW) dataset, population density, soil moisture, annual precipitation, surface temperature, and net primary productivity (NPP). A grazing density estimation model for Mongolia was established by running the random forest regression algorithm on the Google Earth Engine (GEE) cloud platform. The accuracy of the model was tested against provincial livestock statistics, and the model was combined with predictor data from different years to simulate the spatial distribution of grazing density in Mongolia from 2006 to 2020. To verify the dataset, three measures were used: the coefficient of determination (R²), mean absolute error (MAE), and root mean square error (RMSE). The simulation results showed that grazing density in Mongolia from 2006 to 2020 was higher in the north and lower in the south. From 2006 to 2010, grazing density in Mongolia increased significantly, and the proportion of grazing density above 5 TLU/km² rose from 0.223% to 51.390%. From 2010 to 2020, grazing density did not change significantly in most areas of Mongolia. The test results showed that the dataset simulates the spatial distribution of grazing density in Mongolia well. For 2006, 2010, 2015, and 2020, the R² values between the simulated data and provincial livestock statistics were 0.844, 0.734, 0.914, and 0.926, respectively, all of which passed the significance test; the MAE values were 5.195, 3.513, 2.336, and 3.461, and the RMSE values were 8.135, 5.257, 4.200, and 5.909. The dataset provides important information support for the sustainable development of the grassland ecosystem and the livelihood security of herders in this region.
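For readers unfamiliar with the three validation measures, the sketch below computes them for a pair of observed and predicted series. It assumes R² is defined as 1 − SS_res/SS_tot; the abstract's "fitting R²" may instead be the squared correlation of a fitted line, and the input numbers are invented for illustration.

```python
import math


def regression_metrics(observed, predicted):
    """Return (R^2, MAE, RMSE) for paired observed and predicted values.

    R^2 here is 1 - SS_res / SS_tot (an assumed definition).
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("need two non-empty sequences of equal length")
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when observations are constant")
    mae = sum(abs(o - p) for o, p in zip(observed, predicted)) / n
    rmse = math.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot
    return r2, mae, rmse


# Hypothetical livestock densities (not values from the dataset).
r2, mae, rmse = regression_metrics([2.0, 4.0, 6.0], [3.0, 4.0, 5.0])
print(r2, mae, rmse)
```

MAE and RMSE are in the same units as the data, and RMSE penalizes large errors more heavily, which is why it is never smaller than MAE.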

Data summary:

Item	Description
Dataset name	Mongolia Grazing Density Dataset from 2006 to 2020
Specific subject area	Surveying and mapping science and technology
Research topic	Estimation of grazing density dataset in Mongolia
Time range	2006, 2010, 2015, 2020
Temporal resolution	Year
Geographical scope	Mongolia
Spatial resolution	1 km
Data types and technical formats	.tif
Dataset structure	Dataset on grazing intensity in Mongolia in 2006, 2010, 2015, 2020
Volume of dataset	36.37 MB
Key index in dataset	Pastoral population density, soil moisture, annual precipitation, surface temperature, NPP
Data accessibility	https://doi.org/10.57760/sciencedb.agriculture.00047 https://cstr.cn/17058.11.sciencedb.agriculture.00047
Financial support	National Key R&D Program of China (2022YFE0119200), Mongolian Foundation for Science and Technology (grant number NSFC_2022/01, CHN2022/276)

The Dataset for Crop Phenology of Winter Wheat and Summer Maize in The Alluvium Plain of the Old Yellow River From 2008 to 2022

DING DaWei, YONG BeiBei, REN Wen, XIE Kun, ZHAO YongJian, WANG GuangShuai, CHEN JinPing, WANG MingHui

Journal of Agricultural Big Data 2024, 6 (4): 552-557. DOI: 10.19788/j.issn.2096-6369.100029

Abstract （624）

HTML （33）

PDF（pc）（823KB）（396）

The farmland ecosystem of the Huanghuai Plain is cultivated primarily with winter wheat and summer maize, which play a crucial role in ensuring national food security. Accurately identifying the key phenological periods of these primary crops is of great significance for estimating crop yield, improving agricultural production management, and preventing agricultural meteorological disasters. The dataset integrates ecological observation data on crop phenology in a continuous winter wheat-summer maize double-cropping system in the alluvial plain of the Old Yellow River over 15 years (2008-2022). It mainly includes information on observation plots, winter wheat phenological period data, and summer maize phenological period data. It will serve as a valuable resource for regional agricultural quantitative remote sensing, crop growth model simulation, agricultural climate change research, and decision-making in agricultural production and management.

Data summary:

Items	Description
Dataset name	The Dataset for Crop Phenology of Winter Wheat and Summer Maize in The Alluvium Plain of the Old Yellow River From 2008 to 2022
Specific subject area	Agricultural Science
Research topic	Crop phenology period of winter wheat and summer maize
Time range	From 2008 to 2022
Data types and technical formats	.xlsx
Dataset structure	The dataset includes the variety and crop phenology period of winter wheat and summer maize in six long-term observation plots at the Alluvium Plain of the Old Yellow River from 2008 to 2022.
Volume of dataset	21.7 kB
Key index in dataset	The crop phenological period of winter wheat and summer maize
Data accessibility	CSTR：https://cstr.cn/17058.11.sciencedb.agriculture.00034 DOI：https://doi.org/10.57760/sciencedb.agriculture.00034 NASDC Access link: https://agri.scidb.cn/, restricted access
Financial support	Central Public-Interest Scientific Institution Basal Research Fund (Y2024JC31, Y2024JC08, IFI2024-24); The Scientific and Technological Project of Henan Province(242102110222); National Agricultural Experimental Station for Agricultural Environment, Shangqiu (NAES038AE05).

Spatial Feature Fusion-Based ViT Method for Fine-Grained Classification of Wolfberry Pests

SUN LuLu, LIU JianPing, ZHOU GuoMin, WANG Jian, LIU LiBo

Journal of Agricultural Big Data 2024, 6 (4): 522-531. DOI: 10.19788/j.issn.2096-6369.000066

Abstract （600）

HTML （17）

PDF（pc）（2588KB）（552）

To address the fine-grained pest classification challenge faced in wolfberry cultivation, we propose an agricultural pest fine-grained classification model—Spatial Feature Fusion-based Data Augmented Visual Transformer (ESF-ViT). The model first utilizes the self-attention mechanism to crop images of the foreground targets to enhance image input and supplement more detailed representations. Secondly, it combines the self-attention mechanism with a Graph Convolutional Network (GCN) to extract spatial information from the pest regions, learning the spatial posture features of the pests. To validate the effectiveness of the proposed model, we conducted experimental research on the CUB-200-2011, IP102, and Ningxia wolfberry pest dataset WPIT9K. The experimental results show that the proposed method outperforms the baseline ViT model by 1.83%, 2.09%, and 2.01% respectively, and surpasses the existing state-of-the-art pest classification models. The proposed model effectively solves the fine-grained pest image classification problem in the field of agricultural pest recognition, providing a visual model for efficient pest monitoring and early warning.

Navigating the Distinctiveness of Research Data Protection: Framework and Mode

WANG Jian, ZHOU GuoMin, ZHANG JianHua, XU ZhePing, LIU TingTing

Journal of Agricultural Big Data 2024, 6 (3): 307-324. DOI: 10.19788/j.issn.2096-6369.000069

Abstract （592）

HTML （22）

PDF（pc）（1854KB）（1395）

In recent years, increasing data security regulations have posed significant compliance challenges for scientific data management. Data classification and grading for protection has become a focal point for academia, practitioners, and regulatory bodies. However, existing research mostly focuses on compliance interpretation and reactive measures, lacking a systematic theoretical analysis of scientific data protection. This gap limits the development of frameworks and models in the field. To address this, based on an extensive survey of current practices, this paper identifies six key security characteristics of scientific data: multi-regulation, strict ethical regulation, disciplinary differences, Pareto distribution of "scale-risk," public interest, and dynamic sensitivity. It proposes a classification and grading framework, along with three protection models: comprehensive, balanced, and streamlined. Additionally, the paper introduces a "compliance-cost-benefit" triangle to explain the trade-offs among these factors. The proposed framework clarifies the complexity of classifying scientific data, distinguishing between data classification and grading, and offering insights into their interaction. This theoretical model provides valuable reference for future research and practical tools for addressing challenges in scientific data security management.

A Review of the Evolution and Applications of AI Knowledge Distillation Technology

MAO KeBiao, DAI Wang, GUO ZhongHua, SUN XueHong, XIAO LiuRui

Journal of Agricultural Big Data 2025, 7 (2): 144-154. DOI: 10.19788/j.issn.2096-6369.000106

Abstract （586）

HTML （32）

PDF（pc）（1491KB）（746）

Knowledge Distillation (KD) in Artificial Intelligence (AI) achieves model lightweighting through a teacher-student framework, emerging as a key technology to address the performance-efficiency bottleneck in deep learning. This paper systematically analyzes KD’s theoretical framework from the perspective of algorithm evolution, categorizing knowledge transfer paths into four paradigms: response-based, feature-based, relation-based, and structure-based. It establishes a comparative evaluation system for dynamic and static KD methods. We deeply explore innovative mechanisms such as cross-modal feature alignment, adaptive distillation architectures, and multi-teacher collaborative validation, while analyzing fusion strategies like progressive knowledge transfer and adversarial distillation. Through empirical analysis in computer vision and natural language processing, we assess KD’s practicality in scenarios like image classification, semantic segmentation, and text generation. Notably, we highlight KD’s potential in agriculture and geosciences, enabling efficient deployment in resource-constrained settings for precision agriculture and geospatial analysis. Current models often face issues like ambiguous knowledge selection mechanisms and insufficient theoretical interpretability. Accordingly, we discuss the feasibility of automated distillation systems and multimodal knowledge fusion, offering new technical pathways for edge intelligence deployment and privacy computing, particularly suited for agricultural intelligence and geoscience research.
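Of the four paradigms listed, response-based distillation is the simplest to show in code. The sketch below follows the widely used temperature-softened formulation, in which the student is trained to match the teacher's softened output distribution via a KL divergence scaled by T². The logits and the temperature are illustrative assumptions, and the review may discuss other variants.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# Hypothetical logits for a three-class problem.
teacher = [4.0, 1.0, 0.5]
student = [2.5, 1.5, 0.5]
loss = distillation_loss(teacher, student)
print(loss)
```

In practice this term is usually combined with the ordinary cross-entropy loss on ground-truth labels; the weighting between the two is a tuning choice.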

Development and Practice of Comprehensive Financing Service Platform for Major Agricultural and Rural Projects

WANG ZhiQiang, NIU MingLei, GUO HongYu, YU HongJun, TAN YaoYao

Journal of Agricultural Big Data 2025, 7 (1): 85-89. DOI: 10.19788/j.issn.2096-6369.000058

Abstract （582）

HTML （8）

PDF（pc）（339KB）（375）

With the advancement of agricultural and rural modernization, the demand for investment is growing rapidly. However, central financial investment is relatively limited, making it urgent to guide financial and social capital investment. This paper reviews relevant research and practical experience, and discusses the practice of the Ministry of Agriculture and Rural Affairs in constructing a financing project database for agricultural and rural infrastructure construction and upgrading it to a comprehensive financing service platform. It analyzes the effectiveness and shortcomings of the financing project database, elaborates on the construction ideas, main functions, and architectural design of the upgrade to a comprehensive financing service platform, and proposes measures to deepen the development and sharing of financing and investment data resources, as well as prospects for the platform's future development. In the future, the platform is expected to further improve its functions, enhance service quality, attract more financial institutions and enterprises to join, provide stronger support for financial investment in major agricultural and rural projects, and promote the sustained development of the agricultural and rural economy.

Image-Text Multi-Modal Dataset of Corn Leaf Diseases based on Manual Annotation and Contrast Generation Model

WANG YanFang, XIAN GuoJian, ZHAO RuiXue

Journal of Agricultural Big Data 2025, 7 (3): 371-378. DOI: 10.19788/j.issn.2096-6369.100060

Abstract （579）

HTML （42）

PDF（pc）（1322KB）（496）

Accurately identifying corn leaf diseases is an important part of intelligent agricultural management. Existing maize disease datasets suffer from uneven quality, fuzzy label categories, and a lack of multimodal data, and disease description data in Chinese are especially scarce. This dataset integrates corn disease images from open-source platforms such as AI Challenger, PlantVillage, and OpenDataLab, supplemented with high-definition disease images collected in the field, to construct a Chinese multimodal dataset of 1653 images. Each image has a corresponding diagnostic text description covering key information such as disease type, disease characteristics, and severity. In addition, the cn-clip and CPT2 Chinese large models are combined to generate image descriptions, which provides a method for automatic annotation. The dataset can provide high-quality data support for developing intelligent diagnosis models for corn disease, generating Chinese image descriptions, and constructing an agricultural multimodal knowledge map.

Data summary:

Item	Description
Dataset name	Image-Text Multi-Modal Dataset of Corn Leaf Diseases based on Manual Annotation and Contrast Generation Model
Specific subject area	Agricultural Science, Computer Science
Research topic	Computer vision, Cross-modal retrieval, Image captioning
Data types and technical formats	.jpg
Dataset structure	The dataset consists of two parts: (1) the original leaf disease images, 1653 in total, covering 9 typical diseases (large leaf spot, small leaf spot, brown spot, Curvularia leaf spot, common rust, southern rust, gray spot, round spot, and dwarf mosaic); and (2) the diagnostic text descriptions corresponding to the images, 1653 in total, with an average length of about 32 characters.
Volume of dataset	3.87 GB
Key index in dataset	Image and its corresponding description text
Data accessibility	CSTR:17058.11.sciencedb.agriculture.00226; https://cstr.cn/17058.11.sciencedb.agriculture.00226 DOI:10.57760/sciencedb.agriculture.00226; https://doi.org/10.57760/sciencedb.agriculture.00226
Financial support	National Science and Technology Major Project（2021ZD0113705）.

Metrological Analysis of Data-driven Deep Learning Methods for Agriculture

LI JiaLe, ZHANG JianHua, WANG Jian, ZHOU GuoMin

Journal of Agricultural Big Data 2024, 6 (3): 400-411. DOI: 10.19788/j.issn.2096-6369.000023

Abstract （577）

HTML （20）

PDF（pc）（1590KB）（532）

With the development and application of artificial intelligence, computer vision, deep learning, and related technologies in agriculture, data-driven deep learning models have become a new research paradigm for agricultural information extraction. Agricultural datasets are the basis for training these models, and high-quality, large-scale, diverse datasets can effectively improve model performance and thus boost the application of deep learning in smart agriculture. To help researchers in related fields better understand how data drives deep learning and to make full use of deep learning in agriculture, this paper analyzes agricultural datasets through metrological analysis and summarizes their basic properties, such as type, scale, and source. The datasets are divided into four categories according to deep learning method, such as target detection, image segmentation, and image recognition, and into seven categories according to application area, such as visual navigation, feature recognition, and non-destructive testing. The results show that image data is the dominant dataset type and that image counts are concentrated in the range of 500 to 1500. Owing to the specificity of agricultural data collection, most datasets are constructed by individuals, while some come from public datasets, and the datasets are mainly used for feature recognition. In the future, as models grow larger, the requirements for datasets will also rise, and it will be necessary to keep constructing large-scale, evenly distributed, and accurately labeled datasets. By emphasizing the driving force and importance of data for deep learning models, this paper provides a theoretical basis for using data to promote agricultural applications of deep learning.

Governance of Scientific Data Sharing within International Organizations and International Science and Technology Project

LI YiZhan, DONG Lu, WANG DongYao, ZHANG Hong, WANG ZhiQiang, WEI Ren, LI ZeXia

Journal of Agricultural Big Data 2024, 6 (2): 161-169. DOI: 10.19788/j.issn.2096-6369.000031

Abstract （568）

HTML （20）

PDF（pc）（442KB）（715）

The increasing interconnectedness and reusability of scientific data have brought forth unparalleled challenges in the realm of international science and technology (S&T) cooperation. Against the backdrop of a dynamic and intricate international landscape, the rapid evolution of information technology has accentuated the salience of security concerns surrounding scientific data, thereby imposing significant challenges upon extant security frameworks. This paper, following a meticulous elucidation of the concept of scientific data safety, systematically consolidates governance policies and practices pertaining to scientific data sharing gleaned from prominent international S&T organizations and major international S&T initiatives. Four discerning recommendations are delineated: (1) Advocating the principle of "openness as the norm, with exceptions," by discerningly deploying "exceptions" and "exemption lists" to strike a harmonious balance between obligations and rights; (2) Continuously enhancing the domestic governance policy ecosystem for scientific data security, thereby fostering transparency and compliance with scientific data policies within the ambit of international S&T collaborations; (3) Adopting a comprehensive approach that duly considers factors such as scientific data resources and risk control technologies, to facilitate genuine peer-to-peer scientific data exchanges; (4) Instituting a proficient data safety team adept in international policies, regulations, and best practices, and fortified with a robust array of technical competencies.

Ontology Construction in the Field of Wheat Sharp Eyespot Control

LIU KeYi, CUI YunPeng, GU Gang, WANG Mo

Journal of Agricultural Big Data 2024, 6 (4): 485-496. DOI: 10.19788/j.issn.2096-6369.000011

Abstract （566）

HTML （19）

PDF（pc）（3190KB）（306）

Wheat Sharp Eyespot is a soil-borne fungal disease common in China's wheat-growing areas. It can occur throughout the entire reproductive period of wheat and has a great impact on the yield and quality of wheat in China. By constructing a domain ontology for Wheat Sharp Eyespot control and modeling domain knowledge, we aim to integrate and share knowledge in this field and to provide support and guidance for agricultural decision-making and disease control. An ontology construction process is proposed to meet the practical needs of Wheat Sharp Eyespot control. To address the low efficiency and limited expert knowledge of manual ontology construction, this study explores new construction methods, paying special attention to mining the core concepts of the ontology so as to reduce subjectivity and limitations in the construction process and give the ontology wider application potential. Using the literature on Wheat Sharp Eyespot control as the data source, the KeyBERT keyword extraction algorithm was used to mine the core concepts of the ontology, with BERT embeddings and cosine similarity used to find the sub-phrases in each document that were most similar to the document itself. Hierarchical relationships between ontology concepts were extracted by hierarchical clustering; topic modeling was performed using BERTopic, with Transformer and c-TF-IDF used to create dense clusters. Finally, Protégé was used to visualize and express the ontology concepts and inter-concept relationships. The results of topic modeling and hierarchical clustering were analyzed and condensed to classify the ontology of Wheat Sharp Eyespot control into eight parent concepts, including pathogenicity pattern, wheat growth period, etiology of the disease, disease area, disease extent, symptoms, and control measures.
According to the characteristics of the Wheat Sharp Eyespot control domain, 11 object attributes, 16 first-level data attributes, and 8 second-level data attributes were defined for the ontology by organizing and analyzing the associations among the parent concepts. This study proposed a method for constructing a domain ontology for Wheat Sharp Eyespot control, described the basic method of constructing an ontology by building a Wheat Sharp Eyespot corpus, gave a process framework for constructing a domain ontology, and described in detail the algorithms and construction tools used. The data source of this study was mainly scientific and technical literature, and the ontology can be extended in the future by further expanding the data source. Evaluation of the ontology currently relies mainly on domain experts, and quantitative evaluation can be added in the future. The ontology constructed in this study contains a relatively complete conceptual system of Wheat Sharp Eyespot and meets the ontology evaluation criteria and construction requirements. It can serve as a reference for constructing other domain ontologies and provide strong support for knowledge discovery and downstream applications in Wheat Sharp Eyespot prevention and control, such as intelligent Q&A and intelligent recommendation.
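The keyword-mining step, ranking candidate phrases by the cosine similarity between their embeddings and the document embedding, can be sketched as follows. This shows only the ranking idea and does not call the KeyBERT library; the three-dimensional vectors and phrase names are toy stand-ins for real BERT embeddings and are purely hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(doc_vec, candidates, top_n=2):
    """Return the top_n candidate phrases most similar to the document vector."""
    scored = [(phrase, cosine(doc_vec, vec)) for phrase, vec in candidates.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]


# Toy embeddings (assumption): in practice these would come from a BERT encoder.
doc_vec = [1.0, 1.0, 0.0]
candidates = {
    "phrase_a": [1.0, 1.0, 0.0],
    "phrase_b": [1.0, 0.0, 0.0],
    "phrase_c": [0.0, 0.0, 1.0],
}
top = rank_candidates(doc_vec, candidates)
print(top)
```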

Application Analysis of Blockchain and Confidential Computing Technology in Material Database Platform

GONG HaiYan, MA FuQiang, ZHANG DaWei, LI XiaoGang

Journal of Agricultural Big Data 2024, 6 (2): 241-252. DOI: 10.19788/j.issn.2096-6369.000026

Abstract （544）

HTML （17）

PDF（pc）（1260KB）（1743）

With the rise of data-driven material design, powered by artificial intelligence and materials science, materials science data has become a key production factor, a national strategic resource, and a focal point of international competition. However, as material data sharing increases, data security becomes increasingly important. Data leakage, misuse, and tampering threaten the competitiveness of enterprises. We first review mainstream data security protection technologies, including access control and encryption, which constitute the traditional data security protection model and ensure security during data transmission and storage. Next, the development of blockchain technology is introduced. Blockchain can achieve confidentiality, integrity, and availability during data transmission and storage, but it cannot address privacy issues during data usage, nor can it protect the confidentiality and integrity of data while in use. Then, the advantages of confidential computing are analyzed. By executing calculations in a hardware-based trusted execution environment, confidential computing minimizes the trusted computing base and follows the principle of "data usability without visibility" to protect data during usage, thereby completing end-to-end security across the data lifecycle. Finally, we combine the advantages of the two technologies to propose a trustworthy infrastructure solution for material data based on blockchain and confidential computing, which aims to achieve security throughout the data lifecycle and provide strong support for the secure application of material data.

Strongly Promote the Construction of Smart Agriculture, Accelerate the Formation of New Quality Productive Forces in Agriculture

WANG XiaoBing, LIU Yang, LIANG Dong, KANG Ting, KANG ChunPeng, YIN RuiFeng, CHEN Sha, REN YuJue, XU Yang

Journal of Agricultural Big Data 2024, 6 (4): 469-475. DOI: 10.19788/j.issn.2096-6369.000054

Abstract （542）

HTML （27）

PDF（pc）（363KB）（1315）

Agriculture is the most fundamental and representative traditional industry, and accelerating the construction of an agricultural powerhouse requires the cultivation and development of new quality productive forces in agriculture. Digital technology is the leading force of the new round of scientific and technological revolution and industrial transformation, and comprehensively promoting the integration of digital agriculture will become an important focus for accelerating the formation of new quality productive forces in agriculture. Combining domestic and foreign literature, statistical data, and survey results, this article studies the scientific connotation of new quality productive forces, the basic characteristics of new quality productive forces in agriculture, and the realistic needs, development trends, and key directions of smart agriculture construction. It proposes measures to promote the construction of smart agriculture and accelerate the formation of new quality productive forces in agriculture. China's smart agriculture construction is moving from "scenic spots" to "scenery," and has entered a new stage of synergistic and efficient integration of big data, IoT, blockchain, artificial intelligence, satellite remote sensing, the BeiDou Navigation Satellite System (BDS), and other modern information technologies. It is necessary to start from key industries and accelerate the formation of a favorable ecological environment for the development of smart agriculture.
We should regard smart agriculture as an important lever for advancing the construction of an agricultural powerhouse, address data as the biggest constraint, promote the deep integration of modern information technology with agricultural industries, cultivate a high-quality innovative workforce, fully leverage the role of enterprises as the main innovation actors, strengthen R&D on key core technologies, align with the trends of digitalization and green development, and accelerate the cultivation and development of new quality productive forces in agriculture.

Research on the Realization Path of Digital Platform Enabling Common Prosperity: A Case Study of the Agricultural-Expo Online in Zhejiang

WANG YanJiong, LIU Chang, CHEN HuiJie, TANG Yan, HUO ZengHui, CHEN FuQiao

Journal of Agricultural Big Data 2024, 6 (4): 476-484. DOI: 10.19788/j.issn.2096-6369.000049

Abstract （490）

HTML （24）

PDF（pc）（679KB）（427）

Digital platforms are an important opportunity for major players to seize the commanding heights of the new round of scientific, technological and industrial revolution. They link producers and consumers in a low-cost, highly inclusive, high-efficiency and high-value way, and play an important role in supporting rural revitalization and achieving common prosperity. Taking the Zhejiang "Agricultural-Expo Online" platform as an example, this paper analyzes the framework and functions of the digital platform, presents its initial results in enabling common prosperity, refines a theoretical framework of how digital platforms enable common prosperity, and discusses the constraints and substantive problems involved. In view of problems such as unclear corporate responsibility, unfair value distribution, unbalanced benefit distribution, unbalanced application of basic digital platforms, and obstacles to the development of public-welfare digital platforms, this paper puts forward countermeasures and suggestions such as clarifying responsibility boundaries, allocating resources fairly, optimizing platform functions, exploring sustainable profit models, and providing multi-party collaborative support. It provides a reference for how digital platforms can promote common prosperity more effectively.

Promoting the Transformation of Modern Agriculture: Reflections and Prospects for the Development of Smart Agriculture

KONG FanTao, ZHAO RenJie, ZHANG XinRui, LIU ZhenHu, CAO ShanShan

Journal of Agricultural Big Data 2025, 7 (2): 155-160. DOI: 10.19788/j.issn.2096-6369.000090

Abstract （477）

HTML （33）

PDF（pc）（397KB）（446）

With the rapid development of information technology, smart agriculture, as an important direction for modern agriculture, is gradually becoming a key force in promoting the transformation and upgrading of agriculture in China. This article reviews the current state of smart agriculture research in China and abroad, and explores its specific practice and development trends in China. It points out that smart agriculture can not only improve agricultural production efficiency and product quality, but also provide new ideas for solving many problems faced by traditional agriculture. At the technical level, the application of emerging technologies such as the Internet of Things, big data, and cloud computing has made the agricultural production process more intelligent and precise; in terms of management, emphasis is placed on optimizing resource allocation and improving service efficiency by building a comprehensive service platform. In addition, policy support is crucial for the development of smart agriculture. The government should increase investment in infrastructure construction and talent cultivation, and establish and improve the relevant legal and regulatory systems to ensure data security. At the same time, all sectors of society should be encouraged to participate actively in the construction of smart agriculture, forming a pattern of multi-party collaboration. Finally, this article proposes several directions that smart agriculture needs to focus on in the future: first, deepening technology research and development; second, strengthening cross-disciplinary cooperation; third, emphasizing the summary and promotion of practical experience.

Security Challenges and Countermeasures on Open Sharing of Scientific Data in the Context of Open Science

LIAO FangYu, LI Jing, LONG Chun, YANG Fan, YUAN ZiMeng

Journal of Agricultural Big Data 2024, 6 (2): 146-155. DOI: 10.19788/j.issn.2096-6369.000027

Abstract （465）

HTML （31）

PDF（pc）（487KB）（788）

Scientific data is a strategic and fundamental scientific and technological resource, profoundly impacting national security, economic development and technological progress. In the context of open science, scientific data, as both the outcome and an important support of the data-intensive research paradigm, faces severe challenges in security and compliance and in trusted, reliable sharing and exchange. Focusing on these challenges and aiming to promote the open sharing of scientific data, the authors propose several feasible strategies covering policy, management, technology, evaluation, and supervision. The core is to construct a dynamic, fine-grained, and domain-applicable security classification and grading system, so as to promote the secure development and utilization of scientific data and accelerate the transformation into a scientific and technological powerhouse.

Design and Practice of Marine Scientific Data Security Governance System

FU Yu, JIANG XiaoYi, WEI YangMing, TONG Xing, XU MoGen, WANG Yi

Journal of Agricultural Big Data 2024, 6 (2): 286-294. DOI: 10.19788/j.issn.2096-6369.000037

Abstract （451）

HTML （22）

PDF（pc）（3159KB）（359）

In the current era of the data economy, data security has risen to the level of national security strategy. Marine scientific data security is the most urgent core problem in marine data resource management and sharing services. Based on a full analysis of the needs and goals of marine scientific data security governance, this paper proposes a data-centered framework for a marine scientific data security governance system, together with a governance system model under this framework. Governance practice is carried out at the National Marine Science Data Center to optimize security management across all stages of the marine scientific data life cycle. On the basis of ensuring data security, the openness of marine scientific data is further improved to give full play to the value of marine data.

Vegetation Cover Dataset of Mongolia from 1990 to 2022

YANG MeiHuan, LI YaWen, WANG Tao

Journal of Agricultural Big Data 2025, 7 (1): 69-76. DOI: 10.19788/j.issn.2096-6369.100041

Abstract （413）

HTML （17）

PDF（pc）（6239KB）（338）

The Mongolian Plateau is a crucial ecological barrier for northern China. Stable and healthy ecological functions in Mongolia matter both for understanding how regional vegetation responds to global warming and for reinforcing China's northern ecological defenses. Fractional Vegetation Cover (FVC) is an indicator of the extent of vegetation cover on the Earth's surface and a crucial metric for evaluating the health of grassland ecosystems. Monitoring changes in FVC is significant for promptly detecting trends in grassland degradation and recovery. Variations in FVC directly affect soil erosion and water loss, so monitoring and controlling FVC can help slow soil erosion and maintain the stability of grassland ecosystems. This study aims to generate and validate an annual FVC dataset with a spatial resolution of 1/12° spanning 1990 to 2022, in order to reflect the distribution of vegetation cover in Mongolia comprehensively over a long time series. To ensure the accuracy and reliability of the dataset, the study integrated MOD13Q1 data for calibration and validation of the FVC calculations. By constructing this FVC dataset, the study provides a scientific basis for the conservation and management of the grassland ecosystem in Mongolia.

Data summary:

Item	Description
Dataset name	Vegetation Cover Dataset of Mongolia from 1990 to 2022
Specific subject area	Ecology
Research topic	Vegetation Monitoring and Analysis
Time range	1990-2022
Temporal resolution	1 year
Geographical scope	41°-53°N, 87°-121°E
Spatial resolution	1/12°
Data types and technical formats	.tiff
Dataset structure	The dataset is the annual 1/12° vegetation coverage of Mongolia from 1990 to 2022.
Volume of dataset	7.47 MB
Key index in dataset	NDVI, FVC
Data accessibility	https://cstr.cn/17058.11.sciencedb.agriculture.00118 https://doi.org/10.57760/sciencedb.agriculture.00118
Financial support	The National Key Research and Development Program project (2022YFE0119200); National Natural Science Foundation of China projects (41977059, 41501571).
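
The table lists NDVI and FVC as the key indices. One common way to derive FVC from NDVI is the dimidiate pixel model, sketched below. Whether this dataset uses exactly this formula is an assumption, and the endmember values `NDVI_SOIL` and `NDVI_VEG` and the sample pixel values are hypothetical.

```python
def fvc_from_ndvi(ndvi, ndvi_soil, ndvi_veg):
    """Dimidiate pixel model: FVC = (NDVI - NDVI_soil) / (NDVI_veg - NDVI_soil), clipped to [0, 1]."""
    if ndvi_veg <= ndvi_soil:
        raise ValueError("ndvi_veg must exceed ndvi_soil")
    fvc = (ndvi - ndvi_soil) / (ndvi_veg - ndvi_soil)
    return min(1.0, max(0.0, fvc))


# Hypothetical endmember values (bare soil, full vegetation), for illustration only.
NDVI_SOIL = 0.05
NDVI_VEG = 0.85

# Hypothetical NDVI values for four pixels.
pixels = [0.02, 0.25, 0.45, 0.90]
print([round(fvc_from_ndvi(p, NDVI_SOIL, NDVI_VEG), 3) for p in pixels])
```

Pixels below the soil endmember clip to 0 and pixels above the vegetation endmember clip to 1, which keeps FVC interpretable as a fraction.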

Fine Classification Dataset of Crops in the Transboundary Basin of the Heilongjiang River Between Russia and China, 2015-2023

LIU Meng, WANG JuanLe, LI Kai, JIANG JiaWei, ZOU WeiHao

Journal of Agricultural Big Data 2025, 7 (1): 22-30. DOI: 10.19788/j.issn.2096-6369.100035

Abstract （408）

HTML （29）

PDF（pc）（5058KB）（450）

The Heilongjiang transboundary basin, which spans the Russian Far East and northeastern China, is rich in natural resources and has great potential for the development and utilization of agricultural resources. With global conflicts increasing and food supply chains under strain, strengthening the monitoring, development, and utilization of agricultural resources in the Heilongjiang basin is of great significance for global food security. This dataset takes the Heilongjiang transboundary basin as the study area and applies machine learning and sample migration methods to construct a fine classification system for agricultural crops. Based on historical remote sensing imagery and the Google Earth Engine (GEE) cloud platform, with Landsat images as the data source, major crops such as wheat, corn, soybean and rice were classified for 2015, 2020 and 2023 with an overall accuracy above 84% and a Kappa coefficient above 0.81. Analysis of spatial and temporal changes reveals the pattern and changing characteristics of crops in the basin and provides decision support for the optimal allocation of arable land resources there.

Data summary:

Item	Description
Dataset name
Specific subject area	Land resources and information technology
Research topic	Fine classification of crops in the transboundary basin of the Heilongjiang River
Time range	2015, 2020, 2023
Temporal resolution	year
Geographical scope	Heilongjiang Transboundary Basin
Spatial resolution	10 m, 30 m
Data types and technical formats	.tif
Dataset structure	This dataset contains fine crop classification data for the transboundary basin of the Heilongjiang River for 2015, 2020 and 2023. Each year corresponds to 8 TIFF files, for a total of 24 records.
Volume of dataset	1.92 GB
Key index in dataset	Fine classification of crops (wheat, maize, soybean, rice) in the transboundary basin of the Heilongjiang River
Data accessibility	https://cstr.cn/17058.11.sciencedb.agriculture.00041 https://doi.org/10.57760/sciencedb.agriculture.00041
Financial support	The ANSO "Belt and Road" International Alliance of Scientific Organizations (Grant No. AN-SO-CR-KP-2022-06), the China Science and Technology Basic Resource Survey Program (Grant No. 2022FY101902), China Engineering Science and Technology Knowledge Center Construction Project (Grant No. CKCEST-2023-1-5)
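
The abstract reports classification quality as overall accuracy and a Kappa coefficient. Both can be computed from a confusion matrix, as in the sketch below. The 4-class matrix is hypothetical (it is not taken from the dataset) and the function name is illustrative.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] is the number of samples of reference class i predicted as class j.
    """
    n = sum(sum(row) for row in matrix)
    if n == 0:
        raise ValueError("empty confusion matrix")
    k = len(matrix)
    observed = sum(matrix[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    if expected == 1.0:
        return observed, 1.0  # degenerate case: all samples in a single class
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical confusion matrix for wheat, corn, soybean, rice (rows = reference).
confusion = [
    [45, 2, 2, 1],
    [3, 44, 2, 1],
    [2, 3, 43, 2],
    [1, 1, 2, 46],
]
accuracy, kappa = overall_accuracy_and_kappa(confusion)
print(f"overall accuracy = {accuracy:.3f}, kappa = {kappa:.3f}")
```

Kappa discounts the agreement expected by chance, which is why it is lower than overall accuracy for the same matrix.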

Dataset of the Resistance Classification of Corn Northern Leaf Blight and Evaluation of Maize Varieties in Shanxi Province from 2016 to 2023

YANG JunWei, WANG JianJun, WEN ShengHui, WANG FuRong, MA ZhouJie

Journal of Agricultural Big Data 2025, 7 (1): 107-111. DOI: 10.19788/j.issn.2096-6369.100009

Abstract （407）

HTML （17）

PDF（pc）（713KB）（303）

Field phenotypic resistance of 1439 maize hybrids from the Shanxi Province variety comparison trials was analyzed from 2016 to 2023 through artificial inoculation. This dataset is stored in Excel format and contains a total of 1439 rows of data, with each row representing a corn variety. The columns in the dataset include corn type, variety name, maturity, sowing time, inoculation time, resistance survey time, and the resistance grade of each variety to northern leaf blight, among others. The establishment and sharing of this dataset can provide technical support for the screening, promotion, and utilization of maize varieties resistant to northern leaf blight, and also provide reference material for breeding resistant lines.

Data summary:

Item	Description
Dataset name	Dataset of the Resistance Classification of Corn Northern Leaf Blight and Evaluation of Maize Varieties in Shanxi Province from 2016 to 2023
Specific subject area	Agricultural science
Research topic	Corn northern leaf blight
Time range	2016-2023
Temporal resolution	Year
Geographical scope	Shanxi Province
Data types and technical formats	.xls
Dataset structure	Field phenotypic resistance of 1439 maize hybrids from the Shanxi Province variety comparison trials was analyzed from 2016 to 2023 through artificial inoculation. The dataset is stored in Excel format and contains a total of 1439 rows of data, with each row representing a corn variety.
Volume of dataset	277 kB
Key index in dataset	Corn type, variety name, maturity period, sowing time, inoculation time, resistance investigation time, classification of resistance to Corn northern leaf blight in various varieties
Data accessibility	https://cstr.cn/17058.11.sciencedb.agriculture.00162 https://doi.org/10.57760/sciencedb.agriculture.00162
Financial support	National Agricultural Basic Long term Science and Technology Work Monitoring Project (NAES088PP15); Youth Project of Shanxi Basic Research Program (202203021212438); Key Agricultural Research and Development Plan of Xinzhou City (20220207); Xinzhou Basic Research Plan (20220506).

Research on the Trusted Level of Metrology Scientific Data

LIU ZiLong, GONG Hao, WANG Juan, XIONG XingChuang

Journal of Agricultural Big Data 2024, 6 (2): 253-258. DOI: 10.19788/j.issn.2096-6369.000034

Abstract （403）

HTML （10）

PDF（pc）（1020KB）（285）

Scientific data security is not only an inherent requirement for the development of scientific data management theory, but also a practical need for the development of scientific data. In the field of metrology, as digital transformation advances, reliability requirements have been put forward for metrology scientific data security, building on the general significance of data security. The development strategy CIPM2030+ formulated by the International Bureau of Weights and Measures (BIPM) proposes to achieve "remote+X" in the digital transformation of metrology, and its services still focus on metrology scientific data. Therefore, in "remote+X" measurement, the authority of measurement in the physical world is transformed into the credibility of scientific measurement data in the digital world. This article proposes methods to establish the credibility of metrological data, including the anti-repudiation and identity trustworthiness of the data generation subject and the anti-tampering and traceability of the data transmission process, and provides corresponding technical implementation suggestions. It also introduces the data security system design that the National Institute of Metrology uses to achieve the credibility of metrological data. Through these studies, a trustworthy system for metrological scientific data can be established step by step, laying a trustworthy foundation for the secure application of metrological scientific data and the digital transformation of metrology.

SSR Molecular Fingerprint Dataset for 291 Grape Varieties

WU YaJing, JI XiaoHao, YU YiFei, SHI Meng, WANG XiaoDi, WANG BaoLiang, LIU FengZhi, LI MingLiang, LI He, LIU Jun, WANG HaiBo

Journal of Agricultural Big Data 2025, 7 (1): 112-117. DOI: 10.19788/j.issn.2096-6369.100022

Abstract （395）

HTML （12）

PDF（pc）（1414KB）（242）

China occupies an important position in the global grape industry, with cultivation area and yield ranking among the top in the world. As a key branch of the fruit tree industry, the grape industry plays a pillar role in increasing farmers' income and in rural revitalization. China's grape variety collection has grown thanks to the introduction of grapes from other countries and new types developed within China, and these additions have been a key factor in growing and strengthening the grape industry. However, the increase in varieties has also brought homogenization and identification difficulties, and traditional identification based on morphological features no longer meets current needs. In this study, DNA was extracted from 291 table grape germplasm samples. Using 30 fluorescently tagged SSR molecular markers, PCR amplification and fluorescent capillary electrophoresis were conducted to establish a molecular fingerprint database for these cultivars. The database contains a total of 8730 pieces of information. Further analysis shows that the average number of genotypes for the 30 selected primer loci is 10.7, with a heterozygosity range of 0.21 to 0.62 and an average heterozygosity of 0.38. Based on the number of genotypes at the 30 primer loci, 216 varieties were inferred to be diploid and 75 polyploid. Similarity between diploid varieties was generally low, while similarity between polyploid varieties was relatively high. The results of this study not only provide an accurate basis for grape variety identification, but also provide important data for analyzing the genetic relationships of germplasm resources, which is of great significance for theoretical research and practical applications.

Data summary:

Items	Description
Dataset name	SSR Molecular Fingerprint Dataset for 291 Grape Varieties
Specific subject area	Agronomy, biology
Research topic	SSR molecular fingerprint of grape varieties
Time range	2022-2023
Temporal resolution	one year
Geographical scope	Huailai County, Zhangjiakou City, Hebei Province
Data types and technical formats	.XLSX
Dataset structure	This dataset consists of 6 tables: PCR product molecular weight information for 291 grape varieties at 30 primer sites, primer names and sequence information for the 30 primer pairs, variety ploidy inference results, the polyploid variety similarity matrix, the diploid variety similarity matrix, and genotype frequencies.
Volume of dataset	299 kB
Data accessibility	DOI: 10.57760/sciencedb.agriculture.00103 CSTR: 17058.11.sciencedb.agriculture.00103
Financial support	National Key R&D Program (2023YFD1200100); National Agricultural Science and Technology Park Special Project (2021C-01); Key R&D Plan of Shandong Province (2022TZXD0010); The Agricultural Science and Technology Innovation Program of the Chinese Academy of Agricultural Sciences (CAAS-ASTIP-2021-RIP-02); Huailai Grape and Wine Industry Technology Mission
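
The 8730 records in the fingerprint database correspond to 291 varieties multiplied by 30 SSR loci. The sketch below checks that arithmetic and shows one simple way to compute heterozygosity at a locus, taken here as the share of samples carrying more than one distinct allele. That definition is an assumption (the abstract does not state which estimator was used), and the fragment sizes are hypothetical.

```python
def observed_heterozygosity(genotypes):
    """Share of samples carrying more than one distinct allele at a single locus."""
    if not genotypes:
        raise ValueError("no genotypes supplied")
    heterozygous = sum(1 for alleles in genotypes if len(set(alleles)) > 1)
    return heterozygous / len(genotypes)


# Record count implied by the abstract: 291 varieties x 30 SSR loci.
N_VARIETIES = 291
N_LOCI = 30
print(N_VARIETIES * N_LOCI)

# Hypothetical fragment sizes (bp) for five samples at one locus, for illustration only.
locus_calls = [(180, 180), (180, 186), (174, 174), (174, 186), (186, 186)]
print(observed_heterozygosity(locus_calls))
```

Representing each genotype as a tuple of fragment sizes also accommodates polyploid samples, which may carry more than two alleles at a locus.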

Research on Countermeasures of Scientific Data Exit Management in China

WEI Xin, KONG LiHua, WANG Yang

Journal of Agricultural Big Data 2024, 6 (2): 156-160. DOI: 10.19788/j.issn.2096-6369.000036

Abstract （390）

HTML （23）

PDF（pc）（329KB）（451）

In the information age, scientific and technological innovation is increasingly dependent on scientific data. As international competition intensifies, the issues of scientific data security and exit management are becoming more and more prominent. At present, China's scientific data exit management faces problems such as imperfect policies and regulations, insufficient technical capability, and an incomplete international cooperation mechanism. There is an urgent need to improve the relevant management system and establish a coordinated and unified scientific data exit management system; to strengthen research on data security monitoring and supervision technology and promote the innovation and upgrading of key technologies for scientific data; and to actively explore bilateral and multilateral international cooperation on scientific data and lead the formulation of an international framework of scientific data rules. We will work with other countries to address the risks and challenges of scientific data export, promote the safe and orderly global circulation of scientific data, and accelerate the development of scientific and technological innovation.

Research on the Security Classification Conceptual Framework of Space Environment Scientific Data

XU Qi, HU XiaoYan, ZOU ZiMing, TONG JiZhou

Journal of Agricultural Big Data 2024, 6 (2): 259-268. DOI: 10.19788/j.issn.2096-6369.000051

Abstract （384）

HTML （7）

PDF（pc）（2440KB）（588）

To comply with the Data Security Law of the People's Republic of China and to carry out fine-grained security grading management of domain data, it is necessary to establish a multi-dimensional and comprehensive security classification framework for space environmental data resources and to form domain data security classification rules. Space environmental scientific data resources are characterized by multiple sources, multiple types, multiple spatial and temporal resolutions, and multiple modes. To meet the needs of data flow and sharing, domain data application, security management and so on, the National Space Science Data Center (NSSDC) has reviewed and analyzed, through case study and qualitative analysis, the classification methods and the features of data resources at each level in the data security classification standards of other industries. A logical line for determining the security level following damage is established by mapping it to data security classification rules, based on domain and data resource characteristics as well as post-reverse analysis. Based on these findings, a conceptual framework for data security classification is developed that can be applied to various types of space environmental scientific data. The framework proposes a methodology for identifying data features based on domain data classification, and provides an approach for assessing security impacts based on confidentiality, integrity, accessibility, and authenticity. It also presents a reference framework for data security classification rules, which serves as the foundation for implementing data security classification management in the field of space environment and supports the establishment of an important data catalog in this domain.

A Multi-Omics Dataset for Functional Gene Mining in Animals

LIU Hong, DOU JingWen, WANG Yue, LIAO Yong, LIU XiaoLei, LI XinYun, ZHAO ShuHong, FU YuHua

Journal of Agricultural Big Data 2025, 7 (1): 96-106. DOI: 10.19788/j.issn.2096-6369.100039

Abstract （384）

HTML （9）

PDF（pc）（1178KB）（946）

Single-omics data alone is insufficient to comprehensively reveal the complex molecular mechanisms by which genes regulate traits. Integrating different types and levels of biological omics data is of great significance for understanding the complex molecular networks within organisms. This dataset provides individual-level omics data (WGS, RNA-Seq, ChIP-Seq, and ATAC-Seq) and genome annotation information for 61,191 individuals from 21 animal species, with an effective data size of 2.8 TB. Additionally, this dataset includes gene and phenotype entity recognition data obtained through deep learning algorithms. Overall, this multi-omics dataset can be used for gene discovery and functional validation of agriculturally important traits, offering valuable resources for cross-species comparative studies. It also supports the construction of models for identifying key genes associated with economic traits in animals and facilitates algorithm research.

Data summary:

Item	Description
Dataset name	A Multi-Omics Dataset for Functional Gene Mining in Animals
Specific subject area	Agronomy
Research topic	Animal Multi-Omics Dataset
Time range	2000-2022
Data types and technical formats	.txt, .vcf, .ped, .map, .bed, .bim, .fam
Dataset structure	The dataset consists of five parts: Functional annotation information for 403,216 genes across 21 species. Genomic variation data for 10,835 individuals from 21 species, encompassing 877.59 million variations. Gene expression matrix data for 44,638 individuals from 21 species. Epigenetic signal matrix data for 5,718 individuals from 21 species, including 124 markers such as H3K27ac. Pre-labeled gene and phenotype data from 2,794,237 articles covering 21 species.
Volume of dataset	2.8 TB
Key index in dataset	Gene functional annotation, genomic variation information, gene expression matrices, epigenetic signal matrices, gene and phenotypic pre-labeled data
Data accessibility	https://cstr.cn/17058.11.sciencedb.agriculture.00024 https://doi.org/10.57760/sciencedb.agriculture.00024 PUBLIC, CC BY-NC 4.0
Financial support	National Natural Science Foundation of China General Program (32272841); Hubei International Science and technology cooperation project (2022EHB055)

Most Read Articles