Journal of Agricultural Big Data (农业大数据学报) ›› 2024, Vol. 6 ›› Issue (2): 230-240. doi: 10.19788/j.issn.2096-6369.000046

• Special Issue: "Scientific Data Security for High-Quality Sharing" (Part I) •

Analysis of Autonomy in Geosciences Data Processing and Analysis Software

WANG JuanLe1,2,3,*, LI Kai1,2, DUAN BoWen1, SU Na4

  1. State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
  2. College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100049, China
  3. Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing 210023, China
  4. Institutes of Science and Development, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2024-04-08 Accepted: 2024-05-07 Published: 2024-06-26 Online: 2024-07-03
  • Corresponding author: * WANG JuanLe
  • About the author: WANG JuanLe (王卷乐), E-mail: wangjl@igsnrr.ac.cn
  • Funding:
    National Key Research and Development Program of China (2022YFF0711600); Commissioned Task of the National Science and Technology Infrastructure Platform (2023WT14)

Abstract:

The importance of scientific data is widely recognized. As scientific data continue to accumulate, the capability of the software used to process and analyze them will become both the key to, and the bottleneck for, using those data effectively. Earth science research involves multi-scale, multi-type, and multi-source data, so its demand for data processing and analysis software is especially strong. Based on the characteristics of the earth science field, this study analyzes the current state of its main data processing and analysis software, identifies the degree of software autonomy in China, and proposes corresponding development suggestions. The survey covers 16 topics, including geography, oceanography, geology, atmospheric sciences, ecology, disasters, and agriculture, and selects 177 mainstream software/tools. For each one it records the name, overview, main functions, application services/typical cases, advantages and disadvantages, and benchmark products. The analysis finds that about two-thirds of the software/tools for geoscience data processing and analysis are fully open (open source), while the remaining one-third are commercial, restricted, or of unclear openness. The main developers are the United States, China, Canada, and the United Kingdom, along with some international organizations. In terms of topic distribution, this is mainly reflected in land degradation, population and socio-economics, knowledge graphs, and remote sensing processing. Software with a higher autonomy-assurance risk is found mainly in the spatialization, atmosphere, wildfire, and permafrost topics. About one-third of the professional software/tools can be applied directly in the National Science Data Center and can be combined with cloud platforms. In light of the era of artificial intelligence and the development of "Data Element ×", suggestions are put forward from five perspectives for strengthening the development and application of China's independently developed scientific data processing software/tools.

Key words: earth science, scientific data, software/tools, autonomy, National Science Data Center