Journal of Agricultural Big Data, 2019, Vol. 1, Issue 4: 86-97. doi: 10.19788/j.issn.2096-6369.190409

• Special Issue: Scientific Data Management •

The Status and Trends of Scientific Data Sharing Systems

Yunting Li1,2, Liangming Wen1,2, Lili Zhang1, Jianhui Li1

  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2019-10-25 Online: 2019-12-26 Published: 2020-04-08
  • Corresponding author: Jianhui Li E-mail: liyunting@cnic.cn; lijh@cnic.cn
  • About the author: Yunting Li, female, master's student; research interests: cloud storage for scientific data, distributed storage, scientific data management; E-mail: liyunting@cnic.cn
  • Funding:
    Sub-project of the Strategic Priority Research Program of the Chinese Academy of Sciences (Category A) (XDA19020104)

Abstract:

Data-intensive research has become a new paradigm for scientific discovery in the era of big data, and the open sharing of scientific data is now a broad consensus in the scientific community. Through long-term practice, scientific data sharing has developed several distinct models, typified by scientific instruments, data platforms, data publishing, crowdsourced processing, and data trading. Correspondingly, a wide variety of solutions has emerged for different fields and application scenarios, including data repositories, federated service systems, data distribution systems, and on-demand computing and analysis cloud service systems. This paper analyzes and compares the services, technical characteristics, application scenarios, and representative systems of these four mainstream types of scientific data sharing systems, and on that basis proposes future development trends. It then examines in depth, as a typical case, the Big Earth Data Cloud Services Platform developed under the Chinese Academy of Sciences Strategic Priority Research Program "Big Earth Data Science Engineering". Our analysis suggests that future scientific data sharing systems will be built around the needs of full life-cycle scientific data management and will converge into integrated cloud service systems that combine data acquisition, storage, distribution and sharing, computing and analysis, and intelligent services. By making data FAIR (Findable, Accessible, Interoperable and Reusable), intelligently linked, machine-understandable, and AI-ready, such systems will promote the formation of a healthy data sharing ecosystem.

Key words: scientific data sharing system, data sharing, data fusion, intelligent processing, data ecology, scientific data management, scientific data, data system

CLC number (Chinese Library Classification):

  • TP315