Dataset of Soil Total Nitrogen Content and Soil Total Phosphorus Content of the Kherlen River Basin in 2022

1. College of Land Science and Technology, China Agricultural University, Beijing 100083, China
2. Institute of Geography and Geoecology, Mongolian Academy of Sciences, Ulaanbaatar 15170, Mongolia
3. Ministry of Ecology and Environment Center for Satellite Application on Ecology and Environment, Beijing 100094, China
4. College of Environmental and Geographical Sciences, Shanghai Normal University, Shanghai 200234, China

Received date: 2023-06-08

Accepted date: 2023-09-08

Online published: 2023-11-14

Abstract

The ecological and environmental security of the Kherlen River Basin has attracted growing attention in China and Mongolia. Investigating the contents of soil total nitrogen (STN) and soil total phosphorus (STP) in the basin is important for accurately estimating non-point source (NPS) loads and for studying the state of resources, the environment and sustainable development. Obtaining STN and STP contents over a large area with traditional sampling methods is time-consuming and labor-intensive. In addition, STN and STP are spatially heterogeneous, and their relationships with auxiliary variables are heterogeneous as well. A single global model cannot fit such complex heterogeneous relationships, and local modeling methods struggle with the curse of dimensionality. This paper therefore introduces the two-point machine learning (TPML) method. TPML first establishes a global model based on the differences between paired points, and then constructs a local model based on the prediction differences of the global model. Pairing expands the sample size from n to n², which allows high-precision, large-scale prediction of STN and STP contents from a limited number of sampling points. Using 18 auxiliary variables, including topography, climate, soil properties, vegetation and spatial location, the study produced a distribution dataset of STN and STP contents in the basin with the TPML method. Furthermore, ten-fold cross-validation confirmed that the prediction accuracy of the TPML model is more than 10% higher than that of the Ordinary Kriging (OK) model. The mean of the mean absolute error (MAE) and the mean of the root mean squared error (RMSE) of STN content predicted by the TPML method are 0.309% and 0.456%, respectively. The mean MAE of STN content predicted by the random forest (RF), inverse distance weighted (IDW) and OK methods is 0.329%, 0.247% and 1.864%, and the mean RMSE is 0.468%, 0.387% and 1.976%, respectively. The mean MAE and mean RMSE of STP content predicted by the TPML method are 0.640% and 0.861%, respectively. The mean MAE of STP content predicted by the RF, IDW and OK methods is 0.643%, 0.396% and 1.357%, and the mean RMSE is 0.862%, 0.523% and 1.651%, respectively.
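
The pairing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`make_pairs`, `fit_difference_model`, `predict_from_pairs`), the synthetic data, and the use of a linear least-squares fit as a stand-in for the global model are all assumptions made for illustration, and the local model stage is omitted. The sketch only shows how n samples yield n² difference pairs and how a model trained on differences can predict a value at a new location.

```python
import numpy as np

def make_pairs(X, y):
    """Build all n*n ordered pairs and return covariate and target differences."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    i, j = i.ravel(), j.ravel()
    return X[i] - X[j], y[i] - y[j]

def fit_difference_model(dX, dy):
    """Hypothetical global model: linear least squares on pairwise differences."""
    coef, *_ = np.linalg.lstsq(dX, dy, rcond=None)
    return coef

def predict_from_pairs(x0, X, y, coef):
    """Predict at x0 by pairing it with every sample and averaging the estimates."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Each sample j gives one estimate: y_j plus the predicted difference to x0.
    estimates = y + (np.asarray(x0, dtype=float) - X) @ coef
    return estimates.mean()

# Synthetic, noiseless example (assumed data, not from the dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))          # 5 samples, 3 auxiliary variables
true_w = np.array([0.5, -1.0, 2.0])
y = X @ true_w + 0.3

dX, dy = make_pairs(X, y)            # 5 samples become 25 pairs
coef = fit_difference_model(dX, dy)
x0 = np.array([0.1, 0.2, -0.3])
pred = predict_from_pairs(x0, X, y, coef)
print(dX.shape)                      # (25, 3)
```

Because the model is fitted on differences, the constant offset in the synthetic target cancels out during training and is recovered at prediction time from the observed values of the paired samples.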

Data summary:

Item: Description
Dataset name: Dataset of Soil Total Nitrogen Content and Soil Total Phosphorus Content of the Kherlen River Basin in 2022
Specific subject area: Land resources and information technology
Research topic: Prediction of soil total nitrogen content and soil total phosphorus content
Time range: 2022
Geographical scope: Kherlen River Basin
Spatial resolution: 250 m
Data types and technical formats: 250 m high-resolution distribution map of soil total nitrogen content; 250 m high-resolution distribution map of soil total phosphorus content
Dataset structure: The dataset contains soil total nitrogen (STN) and soil total phosphorus (STP) content at 250 m resolution in the Kherlen River Basin in 2022
Volume of data: 32.84 MB
Key index in dataset: Soil total nitrogen content, soil total phosphorus content
Data accessibility: CSTR:17058.11.sciencedb.agriculture.00018; DOI:10.57760/sciencedb.agriculture.00018
Financial support: Research and development on remote sensing monitoring and assessment technology of non-point source pollution in Kherlen River Basin under the National Key Research and Development Program (2021YFE0102300)

Cite this article

WANG ChenYi, GAO BingBo, Sukhbaatar Chinzorig, FENG QuanLong, FENG AiPing, JIANG ChuanLiang, ZHANG ZhongHao, JI ShuRui. Dataset of Soil Total Nitrogen Content and Soil Total Phosphorus Content of the Kherlen River Basin in 2022[J]. Journal of Agricultural Big Data, 2023, 5(3): 104-111. DOI: 10.19788/j.issn.2096-6369.230314
