Data Resource

Variant Site Dataset of 99 Durio zibethinus Germplasm Resources

  • JI XiaoHao,
  • ZHENG DaoJun,
  • XIE ShengHua,
  • SHI Meng,
  • ZHONG YiWang,
  • WANG YingYing,
  • WANG XiaoDi,
  • LIU FengZhi,
  • FENG XueJie,
  • WANG HaiBo
  • 1. National Nanfan Research Institute (Sanya), Chinese Academy of Agricultural Sciences, Sanya 572024, Hainan, China
    2. Research Institute of Pomology, Chinese Academy of Agricultural Sciences / Key Laboratory of Horticultural Crops Germplasm Resources Utilization, Ministry of Agriculture and Rural Affairs, Xingcheng 125100, Liaoning, China
    3. Sanya Institute, Hainan Academy of Agricultural Sciences / Institute of Tropical Fruit Trees, Hainan Academy of Agricultural Sciences / Key Laboratory of Genetic Resources Evaluation and Utilization of Tropical Fruits and Vegetables (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs / Key Laboratory of Tropical Fruit Tree Biology of Hainan Province / Haikou Scientific Observation and Experimental Station for Tropical Fruit Trees, Ministry of Agriculture and Rural Affairs / Hainan Field Scientific Observation and Research Station for Tropical Fruit Trees, Haikou 571100, Hainan, China
JI XiaoHao, E-mail: jixiaohao2006@163.com
FENG XueJie, E-mail: 13807680898@163.com
WANG HaiBo, E-mail: haibo8316@163.com

Received date: 2024-06-27

Accepted date: 2024-08-13

Published online: 2025-06-23

Funding

Nanfan Special Project of the Chinese Academy of Agricultural Sciences (SWAQ09); Agricultural Science and Technology Innovation Program of the Chinese Academy of Agricultural Sciences (CAAS-ASTIP-2021-RIP-02)

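Cite this article

JI XiaoHao, ZHENG DaoJun, XIE ShengHua, SHI Meng, ZHONG YiWang, WANG YingYing, WANG XiaoDi, LIU FengZhi, FENG XueJie, WANG HaiBo. Variant Site Dataset of 99 Durio zibethinus Germplasm Resources[J]. Journal of Agricultural Big Data, 2025, 7(2): 227-237. DOI: 10.19788/j.issn.2096-6369.100040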

Abstract

Durian has high economic and nutritional value. China's durian supply depends heavily on imports, and the durian industry in Hainan Province is still in its infancy: planted area is small, yields are low, cultivars are entirely introduced rather than independently bred, and supporting cultivation techniques are lacking. The result is a sharp mismatch between strong market demand and a weak domestic industry, so the collection, identification, and evaluation of durian germplasm resources is urgently needed. In this study, DNA was extracted from 99 durian germplasm resources, libraries were constructed, and next-generation (second-generation) whole-genome sequencing was performed. The sequencing data then underwent bioinformatic analysis, including quality control, variant site discovery and annotation, and population evolution analysis. Sequencing produced 1.62 Tb of data in total, from which 54,974,697 variant sites were identified. These comprise three variant types: SNPs, insertions (INS), and deletions (DEL), with SNPs predominating. On average, the durian genome carries one variant site per 13 bases. Most variant sites lie in intergenic regions, with fewer in gene exons and introns. The 99 durian resources can be divided into three subgroups. The distance at which the LD coefficient decays to half its maximum value is only 0.1-0.2 kb, indicating rich genetic diversity. The genome sequencing data and variant site information for these 99 durian germplasm resources provide basic data support for research on durian genetics, breeding methods, and breeding theory, and should aid the selection and breeding of durian varieties in Hainan and worldwide.
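The LD decay distance quoted in the abstract is the distance at which the LD coefficient falls to half of its maximum. As a minimal sketch of that calculation, the function below takes a decay curve as (distance in kb, mean r²) pairs and linearly interpolates the half-maximum crossing. The function name and the sample curve are illustrative assumptions, not values or code from the dataset.

```python
from typing import Optional, Sequence, Tuple


def half_decay_distance(curve: Sequence[Tuple[float, float]]) -> Optional[float]:
    """Distance at which r^2 first drops to half of its maximum.

    `curve` holds (distance, mean r^2) pairs; the result is in the same
    unit as the input distances. Returns None if the curve never reaches
    the half-maximum level.
    """
    if not curve:
        return None
    pts = sorted(curve)  # order by distance
    peak = max(range(len(pts)), key=lambda i: pts[i][1])
    half = pts[peak][1] / 2.0
    # Walk outward from the peak and interpolate the first crossing.
    for (d0, r0), (d1, r1) in zip(pts[peak:], pts[peak + 1:]):
        if r0 >= half >= r1 and r0 != r1:
            return d0 + (r0 - half) * (d1 - d0) / (r0 - r1)
    return None


# Hypothetical decay curve (made-up numbers, for illustration only).
example_curve = [(0.01, 0.60), (0.05, 0.45), (0.10, 0.32), (0.20, 0.28), (0.50, 0.20)]
print(half_decay_distance(example_curve))
```

With this made-up curve the maximum r² is 0.60, so the threshold is 0.30, which is crossed between the 0.10 kb and 0.20 kb points.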

Data summary:

Items | Description
Name of dataset | Variant Site Dataset of 99 Durio zibethinus Germplasm Resources
Specific subject area | Agronomy, biology
Research topic | Genetic variation of durian germplasm resources
Time range | 2022-2023
Temporal resolution | 1 year
Geographical scope | Sanya City, Hainan Province, China
Data types and technical formats | .XLSX, VCF
Dataset structure | This dataset consists of one table and one VCF file, primarily including the quality control results of WGS sequencing data, alignment information, and variant site information.
Volume of dataset | 143.36 GB
Data accessibility | CSTR: 17058.11.sciencedb.agriculture.00077 (https://cstr.cn/17058.11.sciencedb.agriculture.00077); DOI: 10.57760/sciencedb.agriculture.00077 (https://doi.org/10.57760/sciencedb.agriculture.00077)
Financial support | Nanfan Special Project of the Chinese Academy of Agricultural Sciences (SWAQ09); Agricultural Science and Technology Innovation Program of the Chinese Academy of Agricultural Sciences (CAAS-ASTIP-2021-RIP-02)
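Since the variant sites are distributed as a VCF file, the SNP/INS/DEL breakdown described in the abstract can be sketched by comparing REF and ALT allele lengths. The snippet below is an illustration under stated assumptions: the function names and the sample records are made up, the length-based rule is one common convention rather than necessarily the authors' pipeline, and it counts ALT alleles, so a multi-allelic site contributes more than once. The last lines back-calculate the genome span implied by the reported figures; because "one variant per 13 bases" is a rounded average, the result is only a rough estimate and not a reported genome size.

```python
from collections import Counter
from typing import Iterable


def classify(ref: str, alt: str) -> str:
    """Classify one REF/ALT pair by allele length (a common convention)."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(alt) > len(ref):
        return "INS"
    if len(alt) < len(ref):
        return "DEL"
    return "OTHER"  # equal-length multi-base substitutions


def count_variant_types(vcf_lines: Iterable[str]) -> Counter:
    """Count ALT alleles per variant type, skipping header lines."""
    counts: Counter = Counter()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        ref, alts = fields[3], fields[4]  # VCF columns 4 and 5
        for alt in alts.split(","):
            if alt in (".", "*"):  # missing or spanning-deletion allele
                continue
            counts[classify(ref, alt)] += 1
    return counts


# Made-up records for illustration; not taken from the dataset.
SAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t101\t.\tA\tG\t50\tPASS\t.",
    "chr1\t250\t.\tT\tTAC\t50\tPASS\t.",
    "chr1\t300\t.\tGCA\tG\t50\tPASS\t.",
    "chr2\t42\t.\tC\tT,CA\t50\tPASS\t.",
]
counts = count_variant_types(SAMPLE_VCF)
print(dict(counts))

# Rough genome span implied by the reported totals (variants x bases per variant).
implied_span_bp = 54_974_697 * 13
print(implied_span_bp)
```

For a real file, the same function could be fed an open text handle, for example one returned by `gzip.open(path, "rt")` if the VCF is compressed.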
