Journal of Agricultural Big Data
Dataset of Soil Total Nitrogen Content and Soil Total Phosphorus Content of the Kherlen River Basin in 2022
Received date: 2023-06-08
Accepted date: 2023-09-08
Online published: 2023-11-14
The ecological and environmental security of the Kherlen River Basin has attracted increasing attention in China and Mongolia. Investigating the contents of soil total nitrogen (STN) and soil total phosphorus (STP) in the basin is important for accurately estimating non-point source (NPS) loads and for studying the state of resources, the environment, and sustainable development. Obtaining STN and STP contents over a wide area with traditional sampling methods is time-consuming and labor-intensive. In addition, STN and STP are spatially heterogeneous, and their relationships with auxiliary variables are heterogeneous as well. A single global model cannot fit such complex heterogeneous relationships, and local modeling methods struggle with the curse of dimensionality. This paper therefore introduces the two-point machine learning (TPML) method. TPML first establishes a global model based on the differences between paired points, and then constructs a local model based on the differences predicted by the global model. Because it works on pairs of points, it expands the sample size from n to n², which allows high-precision, large-scale prediction of STN and STP contents from a limited number of sampling points. Using 18 auxiliary variables covering topography, climate, soil properties, vegetation, and spatial location, the study produced a distribution dataset of STN and STP contents in the basin with the TPML method. Furthermore, ten-fold cross-validation confirmed that the prediction accuracy of the TPML model is more than 10% higher than that of the Ordinary Kriging (OK) model. The mean values of the mean absolute error (MAE) and the root mean squared error (RMSE) of STN content predicted by the TPML method are 0.309% and 0.456%, respectively. For STN content predicted by the random forest (RF), inverse distance weighted (IDW), and OK methods, the mean MAE is 0.329%, 0.247%, and 1.864%, and the mean RMSE is 0.468%, 0.387%, and 1.976%, respectively. The mean MAE and mean RMSE of STP content predicted by the TPML method are 0.640% and 0.861%, respectively. For STP content predicted by the RF, IDW, and OK methods, the mean MAE is 0.643%, 0.396%, and 1.357%, and the mean RMSE is 0.862%, 0.523%, and 1.651%, respectively.
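The pairing idea in the abstract, where n sampling points become n² training samples, can be illustrated with a minimal sketch. It is not the authors' implementation. It only shows one plausible way to form ordered pairs of points (including each point paired with itself, which is an assumption made here so that the count is exactly n²) and how MAE and RMSE are computed. The function names `build_pairs`, `mae`, and `rmse` and the toy values are hypothetical.

```python
from itertools import product
from math import sqrt


def build_pairs(features, targets):
    """Turn n samples into n*n ordered pairs.

    Each pair holds the difference in auxiliary variables and the
    difference in the target value between two sampling points.
    Self-pairs (i, i) are kept, so the result has exactly n*n entries
    (an assumption of this sketch, not a detail taken from the paper).
    """
    pairs = []
    for (xi, yi), (xj, yj) in product(zip(features, targets), repeat=2):
        feature_diff = [a - b for a, b in zip(xi, xj)]
        pairs.append((feature_diff, yi - yj))
    return pairs


def mae(observed, predicted):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return sqrt(squared / len(observed))


# Hypothetical toy data: 3 sampling points with 2 auxiliary variables each.
features = [[1.0, 10.0], [2.0, 12.0], [4.0, 9.0]]
targets = [0.5, 0.8, 0.3]

pairs = build_pairs(features, targets)
print(len(pairs))  # 3 points give 3 * 3 = 9 paired samples
```

In a full workflow, a regression model would be trained on the paired differences, and `mae` and `rmse` would be averaged over the ten cross-validation folds. Both steps are omitted here because the abstract does not specify them in detail.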
Data summary:
| Item | Description |
|---|---|
| Dataset name | Dataset of Soil Total Nitrogen Content and Soil Total Phosphorus Content of the Kherlen River Basin in 2022 |
| Specific subject area | Land resources and information technology |
| Research topic | Prediction of soil total nitrogen content and soil total phosphorus content |
| Time range | 2022 |
| Geographical scope | Kherlen River Basin |
| Spatial resolution | 250 m |
| Data types and technical formats | 250 m high-resolution distribution map of soil total nitrogen content; 250 m high-resolution distribution map of soil total phosphorus content |
| Dataset structure | The dataset comprises soil total nitrogen (STN) and soil total phosphorus (STP) content at 250 m resolution in the Kherlen River Basin in 2022 |
| Volume of data | 32.84 MB |
| Key index in dataset | Soil total nitrogen content, Soil total phosphorus content |
| Data accessibility | CSTR:17058.11.sciencedb.agriculture.00018; DOI:10.57760/sciencedb.agriculture.00018 |
| Financial support | Research and development on remote sensing monitoring and assessment technology of non-point source pollution in Kherlen River Basin under the National Key Research and Development Program (2021YFE0102300) |
WANG ChenYi, GAO BingBo, Sukhbaatar Chinzorig, FENG QuanLong, FENG AiPing, JIANG ChuanLiang, ZHANG ZhongHao, JI ShuRui. Dataset of Soil Total Nitrogen Content and Soil Total Phosphorus Content of the Kherlen River Basin in 2022[J]. Journal of Agricultural Big Data, 2023, 5(3): 104-111. DOI: 10.19788/j.issn.2096-6369.230314