Dataset of Soil Total Nitrogen Content and Soil Total Phosphorus Content of the Kherlen River Basin in 2022
Received date: 2023-06-08
Accepted date: 2023-09-08
Online published: 2023-11-14
Funding
Research and development on remote sensing monitoring and assessment technology of non-point source pollution in Kherlen River Basin, under the National Key Research and Development Program (2021YFE0102300); National Natural Science Foundation of China (42271428)
Wang Chenyi, Gao Bingbo, Sukhbaatar Chinzorig, Feng Quanlong, Feng Aiping, Jiang Chuanliang, Zhang Zhonghao, Ji Shurui. Dataset of Soil Total Nitrogen Content and Soil Total Phosphorus Content of the Kherlen River Basin in 2022[J]. Journal of Agricultural Big Data, 2023, 5(3): 104-111. DOI: 10.19788/j.issn.2096-6369.230314
The ecological and environmental security of the Kherlen River Basin is attracting growing attention in both China and Mongolia. Knowing the soil total nitrogen (STN) and soil total phosphorus (STP) contents of the basin is important for accurately estimating its non-point source (NPS) pollution load and for studying its resource and environmental status and sustainable development. Obtaining STN and STP contents over a large area by traditional sampling is time-consuming and labor-intensive. In addition, STN and STP are spatially heterogeneous, and so are their relationships with auxiliary variables. A single global model cannot fit such complex heterogeneous relationships, while local modeling methods struggle with the curse of dimensionality. This paper therefore introduces the two-point machine learning (TPML) method. TPML first builds a global model on the differences between paired points and then constructs a local model from the prediction differences of the global model. This expands the sample size from n to n², so that high-accuracy, large-area prediction of STN and STP contents becomes possible with a limited number of sampling points. Combining 18 auxiliary variables covering topography, climate, soil properties, vegetation and spatial location, the study used TPML to produce a dataset of the STN and STP content distribution in the basin. Ten-fold cross-validation confirmed that the prediction accuracy of TPML is more than 10% higher than that of Ordinary Kriging (OK). For STN content, the mean of the mean absolute error (MAE) and the mean root mean squared error (RMSE) of TPML are 0.309% and 0.456%, respectively; the mean MAE of random forest (RF), inverse distance weighting (IDW) and OK is 0.329%, 0.247% and 1.864%, and the mean RMSE is 0.468%, 0.387% and 1.976%, respectively. For STP content, the mean MAE and mean RMSE of TPML are 0.640% and 0.861%, respectively; the mean MAE of RF, IDW and OK is 0.643%, 0.396% and 1.357%, and the mean RMSE is 0.862%, 0.523% and 1.651%, respectively.
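The two-point idea in the abstract (model differences between paired points, which turns n samples into n² training rows) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the learner or the local weighting, so the through-origin least-squares slope used as the global model, the inverse-difference weights used for the local step, the single covariate, and all function names and toy values below are assumptions made only for illustration. The `mae` and `rmse` helpers follow the standard definitions of the two error measures reported above.

```python
import math

def build_pairs(xs, ys):
    """Expand n samples into n*n ordered pairs (self-pairs included, an assumption)
    of (covariate difference, target difference)."""
    return [(xi - xj, yi - yj) for xi, yi in zip(xs, ys) for xj, yj in zip(xs, ys)]

def fit_difference_slope(pairs):
    """Stand-in global model: least-squares slope through the origin, dy ~ b * dx."""
    sxx = sum(dx * dx for dx, _ in pairs)
    sxy = sum(dx * dy for dx, dy in pairs)
    return sxy / sxx if sxx else 0.0

def predict(x0, xs, ys, slope, eps=1e-6):
    """Combine the per-sample estimates y_i + slope * (x0 - x_i).
    Samples with a smaller predicted difference get a larger weight
    (an assumed weighting, chosen only for this sketch)."""
    num = den = 0.0
    for xi, yi in zip(xs, ys):
        diff = slope * (x0 - xi)
        w = 1.0 / (abs(diff) + eps)
        num += w * (yi + diff)
        den += w
    return num / den

def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

# Hypothetical toy data: one covariate and a target that is exactly linear in it.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.5, 4.5, 6.5, 8.5]  # y = 2x + 0.5

pairs = build_pairs(xs, ys)          # 4 samples become 16 difference pairs
slope = fit_difference_slope(pairs)  # global model fitted on differences
estimate = predict(2.5, xs, ys, slope)
```

In a real application the global model would take all auxiliary variables as a difference vector and would typically be a nonlinear learner; the sketch only shows why working on pairs enlarges the training set without requiring more field samples.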
Data summary:
| Item | Description |
|---|---|
| Dataset name | Dataset of Soil Total Nitrogen Content and Soil Total Phosphorus Content of the Kherlen River Basin in 2022 |
| Specific subject area | Land resources and information technology |
| Research topic | Prediction of soil total nitrogen content and soil total phosphorus content |
| Time range | 2022 |
| Geographical scope | Kherlen River Basin |
| Spatial resolution | 250 m |
| Data types and technical formats | 250 m high-resolution distribution map of soil total nitrogen content (TIF format); 250 m high-resolution distribution map of soil total phosphorus content (TIF format) |
| Dataset structure | The dataset comprises soil total nitrogen (STN) and soil total phosphorus (STP) content at 250 m resolution in the Kherlen River Basin in 2022 |
| Volume of data | 32.84 MB |
| Key indicators in dataset | Soil total nitrogen content, soil total phosphorus content |
| Data accessibility | CSTR: 17058.11.sciencedb.agriculture.00018; DOI: 10.57760/sciencedb.agriculture.00018 |
| Financial support | Research and development on remote sensing monitoring and assessment technology of non-point source pollution in Kherlen River Basin, under the National Key Research and Development Program (2021YFE0102300); National Natural Science Foundation of China (42271428) |