Journal of Agricultural Big Data ›› 2025, Vol. 7 ›› Issue (2): 227-237. doi: 10.19788/j.issn.2096-6369.100040

• Data Resources •

Variant Site Dataset of 99 Durio zibethinus Germplasm Resources

JI XiaoHao1,2, ZHENG DaoJun3, XIE ShengHua3, SHI Meng2, ZHONG YiWang3, WANG YingYing2, WANG XiaoDi2, LIU FengZhi2, FENG XueJie3,*, WANG HaiBo1,2,*

  1. National Nanfan Research Institute (Sanya), Chinese Academy of Agricultural Sciences, Sanya 572024, Hainan, China
  2. Research Institute of Pomology, Chinese Academy of Agricultural Sciences / Key Laboratory of Horticultural Crops Germplasm Resources Utilization, Ministry of Agriculture and Rural Affairs, Xingcheng 125100, Liaoning, China
  3. Sanya Institute, Hainan Academy of Agricultural Sciences / Institute of Tropical Fruit Trees, Hainan Academy of Agricultural Sciences / Key Laboratory of Genetic Resources Evaluation and Utilization of Tropical Fruits and Vegetables (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs / Key Laboratory of Tropical Fruit Tree Biology of Hainan Province / Haikou Scientific Observation and Experimental Station for Tropical Fruit Trees, Ministry of Agriculture and Rural Affairs / Hainan Field Scientific Observation and Research Station for Tropical Fruit Trees, Haikou 571100, Hainan, China
  • Received: 2024-06-27 Accepted: 2024-08-13 Published: 2025-06-26 Online: 2025-06-23
  • Corresponding authors: FENG XueJie, E-mail: 13807680898@163.com
    WANG HaiBo, E-mail: haibo8316@163.com
  • About the author: JI XiaoHao, E-mail: jixiaohao2006@163.com
  • Funding:
    Nanfan Special Project of the Chinese Academy of Agricultural Sciences (SWAQ09); The Agricultural Science and Technology Innovation Program of the Chinese Academy of Agricultural Sciences (CAAS-ASTIP-2021-RIP-02)

Abstract:

Durian has high economic and nutritional value. China's durian supply depends heavily on imports, and the durian industry in Hainan Province is still in its infancy: planted area is small, yields are low, varieties are entirely introduced rather than independently bred, and supporting cultivation techniques are lacking. The result is a sharp mismatch between strong market demand and a weak domestic industry, so the collection, identification, and evaluation of durian germplasm resources are urgently needed. In this study, DNA was extracted from 99 durian germplasm resources, libraries were constructed, and next-generation (second-generation) whole-genome sequencing was performed. The sequencing data then underwent bioinformatic analysis, including quality control, variant site discovery and annotation, and population evolution analysis. Sequencing produced 1.62 Tb of data in total, from which 54,974,697 variant sites were identified. They comprise three variant types, SNPs, insertions (INS), and deletions (DEL), with SNPs predominating. On average, the durian genome carries one variant site per 13 bases. Most variant sites lie in intergenic regions, with fewer in gene exons and introns. The 99 resources fall into three subgroups. The distance over which the LD coefficient decays to half its maximum value is only 0.1-0.2 kb, which indicates rich genetic diversity. The genome sequencing data and variant site information for these 99 durian germplasm resources provide basic data support for research on durian genetics, breeding methods, and breeding theory, and will aid durian variety selection and breeding in Hainan and worldwide.
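The LD decay distance quoted in the abstract can be illustrated with a minimal sketch. It assumes the LD curve is available as binned mean r² values against physical distance. The function name `half_decay_distance`, the linear interpolation between bins, and the input layout are illustrative assumptions, not the authors' method or tooling.

```python
from typing import Sequence


def half_decay_distance(distances_kb: Sequence[float], r2: Sequence[float]) -> float:
    """Distance (kb) at which r^2 first falls to half of its maximum value.

    Assumes `distances_kb` is sorted in increasing order and `r2` holds the
    mean r^2 for each distance bin. The crossing point is estimated by
    linear interpolation between the two bins that bracket the half-maximum.
    """
    if len(distances_kb) != len(r2) or len(r2) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")

    r2_list = list(r2)
    peak = max(r2_list)
    if peak <= 0:
        raise ValueError("maximum r^2 must be positive")
    peak_idx = r2_list.index(peak)
    target = peak / 2.0

    # Walk outward from the peak until the curve drops to the half-maximum.
    for i in range(peak_idx + 1, len(r2_list)):
        if r2_list[i] <= target:
            d0, d1 = distances_kb[i - 1], distances_kb[i]
            y0, y1 = r2_list[i - 1], r2_list[i]
            # y0 > target >= y1 here, so the denominator is never zero.
            return d0 + (y0 - target) * (d1 - d0) / (y0 - y1)

    raise ValueError("r^2 never decays to half its maximum in the given range")
```

With a synthetic curve such as distances `[0.0, 0.1, 0.2, 0.5, 1.0]` kb and r² values `[0.60, 0.36, 0.24, 0.15, 0.10]`, the half-maximum (0.30) is crossed between the 0.1 kb and 0.2 kb bins. These numbers are made up for illustration and are not taken from the dataset.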

Data summary:

Item: Description
Name of dataset: Variant Site Dataset of 99 Durio zibethinus Germplasm Resources
Specific subject area: Agronomy, biology
Research topic: Genetic variation of durian germplasm resources
Time range: 2022-2023
Temporal resolution: 1 year
Geographical scope: Sanya City, Hainan Province, China
Data types and technical formats: .XLSX, VCF
Dataset structure: This dataset consists of one table and one VCF file, primarily including the quality control results of WGS sequencing data, alignment information, and variant site information.
Volume of dataset: 143.36 GB
Data accessibility: CSTR: 17058.11.sciencedb.agriculture.00077; https://cstr.cn/17058.11.sciencedb.agriculture.00077
  DOI: 10.57760/sciencedb.agriculture.00077; https://doi.org/10.57760/sciencedb.agriculture.00077
Financial support: Nanfan Special Project of the Chinese Academy of Agricultural Sciences (SWAQ09); The Agricultural Science and Technology Innovation Program of the Chinese Academy of Agricultural Sciences (CAAS-ASTIP-2021-RIP-02).
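Because the variant sites are distributed as a VCF file, a short sketch may help readers reproduce the SNP/INS/DEL breakdown and the variant density. It relies only on the standard VCF column order (CHROM, POS, ID, REF, ALT, ...). The classification rule (comparing REF and ALT lengths, counting each ALT allele separately) and the function names are assumptions for illustration, and the authors' own counting convention may differ.

```python
from collections import Counter
from typing import Iterable


def classify(ref: str, alt: str) -> str:
    """Label one REF/ALT pair as SNP, INS, DEL, or OTHER by allele length."""
    if alt.startswith("<") or alt in {"*", "."}:
        return "OTHER"  # symbolic, spanning-deletion, or missing allele
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(alt) > len(ref):
        return "INS"
    if len(alt) < len(ref):
        return "DEL"
    return "OTHER"  # equal-length multi-base substitution


def count_variant_types(vcf_lines: Iterable[str]) -> Counter:
    """Count variant types over VCF text lines, skipping header lines."""
    counts: Counter = Counter()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        ref, alts = fields[3], fields[4]
        # Assumption: each allele of a multi-allelic site is counted separately.
        for alt in alts.split(","):
            counts[classify(ref, alt)] += 1
    return counts


def bases_per_variant(genome_length_bp: int, n_variants: int) -> float:
    """Average number of genome bases per variant site."""
    return genome_length_bp / n_variants
```

For a real file, the lines could be streamed with `open(path)` (or `gzip.open(path, "rt")` for a compressed VCF) and passed to `count_variant_types`. As a consistency check on the abstract's figures, 54,974,697 sites at one site per 13 bases corresponds to a span of roughly 715 Mb; the genome length itself is not stated in this summary.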

Key words: durian, variant sites, SNP, population evolution