A Panoramic Guide to Multi-Omics Data Resources for Soybean

Journal of Agricultural Big Data, 2026, Vol. 8, Issue (1): 98-112. doi: 10.19788/j.issn.2096-6369.000151

CAO YongRong¹,²,³,⁵,#, REN SiWei¹,²,³,⁵,#, XIE HaiXia¹,²,³,⁵,#, SHAO ZhouQin¹,²,³,⁴,⁵,#, TIAN DongMei¹,²,³,*, SONG ShuHui¹,²,³,⁴,⁵,*

¹ National Genomics Data Center, China National Center for Bioinformation, Beijing 100101, China
² Beijing Key Laboratory of Intelligent Governance and Application of Biological Big Data, China National Center for Bioinformation, Beijing 100101, China
³ Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
⁴ Sino-Danish College, University of Chinese Academy of Sciences, Beijing 100049, China
⁵ University of Chinese Academy of Sciences, Beijing 100049, China

Received: 2026-01-07 Accepted: 2026-03-18 Published: 2026-03-26 Online: 2026-04-01
* Corresponding authors: SONG ShuHui, E-mail: songshh@big.ac.cn; TIAN DongMei, E-mail: tiandm@big.ac.cn.
About the author: CAO YongRong, E-mail: caoyongrong@big.ac.cn.
# CAO YongRong, REN SiWei, XIE HaiXia, and SHAO ZhouQin contributed equally to this work.
Funding: National Key Research and Development Program of China [2025YFF1207901]; Strategic Priority Research Program (Category A) of the Chinese Academy of Sciences [XDA0460405]; Informatization Program of the Chinese Academy of Sciences (CAS-WX2024GC-0602)

Abstract

With the rapid development of sequencing technologies and high-throughput phenotyping, soybean research has entered a period of fast-accumulating multi-omics data. These data span the genome, transcriptome, epigenome, and phenome, and have given rise to a series of specialized databases such as SoyBase, SoyOD, SoyMD, and SoyOmics, which together provide rich data resources for functional gene discovery and molecular breeding. In this review, we systematically survey the currently available soybean multi-omics data resources and database platforms, summarize their data types, organizational logic, and functional positioning, analyze how the platforms complement one another, and review recent advances in integrative multi-omics applications. The review aims to give researchers a systematic, clear, and practical guide to soybean data resources, to promote the efficient integration and use of diverse omics data, and to support precision breeding and in-depth study of the genetic mechanisms underlying traits.

Key words: soybean, database, multi-omics, genome, variome, transcriptome, proteome, metabolome, epigenome, phenome

CAO YongRong, REN SiWei, XIE HaiXia, SHAO ZhouQin, TIAN DongMei, SONG ShuHui. A Panoramic Guide to Multi-Omics Data Resources for Soybean[J]. Journal of Agricultural Big Data, 2026, 8(1): 98-112.

Omics type	Main data archives	Unit of count	Volume as of 2025-12-22
Genome	SRA/GSA	Raw sequencing records/projects	SRA: 16035/60 GSA: 27014/-
Transcriptome	SRA/GSA/GEO/GEN	Raw sequencing records/projects	SRA: 17189/646 GSA: 22963/- GEO: 406/164 GEN: 499/16
Epigenome	SRA/GSA/GEO/MethBank	Raw sequencing records/projects	SRA: 404/58 GSA: 567/27 GEO: 2102/38 MethBank: 121/16
Proteome	ProteomeXchange	Mass spectrometry datasets	114
Metabolome	Metabolomics/MetaboLights	Datasets	Metabolomics: 9 MetaboLights: 14
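
The volume column for the genome, transcriptome, and epigenome rows packs several archives into one cell, each written as `ARCHIVE: records/projects`, with `-` where the table gives no value. A minimal Python sketch of how such a cell could be split for tallying is shown here; the name `parse_counts` is illustrative and is not part of any database API.

```python
import re

# One "ARCHIVE: records/projects" entry; "-" marks a value the table leaves blank.
ENTRY = re.compile(r"([A-Za-z]+):\s*(\d+|-)\s*/\s*(\d+|-)")


def parse_counts(cell):
    """Split a cell such as 'SRA: 16035/60 GSA: 27014/-' into per-archive counts.

    Returns {archive: (records, projects)}, using None for a '-' placeholder.
    """
    def to_int(token):
        return None if token == "-" else int(token)

    return {
        name: (to_int(records), to_int(projects))
        for name, records, projects in ENTRY.findall(cell)
    }


# Genome row of the table above.
genome = parse_counts("SRA: 16035/60 GSA: 27014/-")

# Sum of raw sequencing records over the listed archives.
total_records = sum(rec for rec, _ in genome.values() if rec is not None)
```

Cells that hold a single number (the proteome row) or use a different layout yield an empty dictionary, so they need separate handling.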

Variety	Origin	Role of assembly	Accession	Assembly level	Size (Mb)	Protein-coding genes	BUSCO (%)	Scaffold/Contig N50 (Mb)	Hosting platform/release date	Key reference (PMID)
Wm82	USA	NCBI official reference (Glycine_max_v4.0)	GCF_000004515.6	Chromosome	978.4	47,068	99.2	20.4/0.42	NCBI 2021/3/10	31433882
Wm82	USA	Latest SoyBase reference (Wm82.gnm6)	GCA_043381025.1	Chromosome	1,011.1	48,387	99.5	51.1/44.4	Phytozome 2024/2/16	39276372
Wm82	USA	First T2T assembly (Wm82-NJAU)	GWHCAYC00000000	Complete	1,011.8	55,498	99.5	51.2/51.2	GWH 2023/8/19	37634078
ZH13	China	T2T reference first released at NGDC (ZH13-T2T)	GWHBWDJ00000000.1	Complete	1,007.2	52,157	99.6	48.8/48.8	GWH 2023/7/29	37803825
ZH13	China	SoyBase reference (Zh13.gnm2)	GWHAAEV00000000.1	Chromosome	1,011.8	55,443	99.5	52/18	GWH 2019/9/16	31444683
Lee	USA	SoyBase reference (Lee.gnm3)	-	Chromosome	1,016.4	56,725	99.5	51.6/32.2	Figshare 2023/7/22	37749941
Fiskeby III	Sweden	SoyBase reference (FiskebyIII.gnm1.F177)	GCA_044510105.1	Chromosome	992.2	52,783	99.5	50.8/15.7	Phytozome 2020/9/15	39276372
Hwangkeum	South Korea	SoyBase reference (Hwangkeum.gnm1)	GCA_020497155.1	Chromosome	933.1	58,550	99.5	46.6/7.8	NCBI 2021/10/14	34568925
Jidou17	China	SoyBase reference (JD17.gnm1)	GCA_021733175.1	Chromosome	995.2	52,840	99.5	50.6/18	NCBI 2022/2/23	35188189
W05	China	NCBI official wild soybean reference	GCA_004193775.2	Chromosome	1,013.2	89,477	99.4	50.7/3.3	NCBI 2019/2/21	30872580
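
The Scaffold/Contig N50 column uses the standard contiguity statistic: sort the sequences from longest to shortest and report the length at which the running total first reaches half of the assembly size. A short Python sketch of that definition follows. The lengths in the example are made-up toy values, not taken from any assembly in the table, and the function name `n50` is chosen here for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Toy example (hypothetical lengths in Mb): total 150, half 75.
# 50 alone is below 75; 50 + 40 = 90 reaches it, so the N50 is 40.
toy_contigs = [50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)
```

In a telomere-to-telomere assembly every chromosome is a single contig, so the scaffold and contig values coincide, as in the two rows marked Complete.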

Population composition	Total samples	Mean sequencing depth	Sequencing platform	Accession	Year	PMID
17@ / 14#	31	~5×	Illumina GAII	SRA020131	2010	21076406
62@ / 130* / 110^	302	>10×	Illumina HiSeq 2000	SRP045129	2015	25643055
388# / 421*	809	~8.3×	Illumina HiSeq 2000/2500	PRJNA394629	2017	28838319
103@ / 1,048* / 1,747^	2,898	>13×	Illumina Platform	PRJCA002030	2020	32553274
218@ / 1,131* / 862^	2,214	~6.3×	Illumina HiSeq X	PRJNA681974	2023	35997916
199# / 51*	250	~11×	Illumina NovaSeq 6000	PRJCA002554	2021	34314874
61* / 486^	547	~18.05×	Illumina Platform	PRJNA1114896	2024	39251789

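Database	Samples	Population composition	Variant types	Core technology	Reference genome version	Number of variant sites
EVA	6,611	Cultivated	SNP, InDel, SV	GBS, WGS, array	Glycine_max_v1.0/v1.1/v2.0/v2.1	~28.70 million
GVM	8,917	Cultivated	SNP, InDel	WGS	Wm82.a2.v1, ZH13	~40.35 million SNPs, ~12.37 million InDels
SoyBase	>20,087	Cultivated, wild	SNP, InDel, SV	WGS, array (SoySNP50K)	Wm82.a1/a2/a4	42,509 SNPs (SoySNP50K)
SoyKB	>1,000	Cultivated	SNP, InDel, CNV	WGS, GBS	Wm82.a2.v1	Total not explicitly reported
SoyOmics	2,898	Wild, cultivated, landrace, improved	SNP, InDel, SV, QTN	WGS	ZH13 v2.0	~38.00 million SNPs/InDels, ~0.55 million SVs
SoyMD	24,501	Wild, cultivated, landrace	SNP, InDel, SV	WGS, array (SoySNP50K)	Multiple references	~9.45 million SNPs, ~1.00 million InDels
SoyFGB	2,214	Wild, cultivated, landrace, improved	SNP, InDel	WGS	Wm82.a2	~65.37 million SNPs, ~10.95 million InDels
SoyOD	3,904	Wild	SNP, InDel, SV	WGS	ZH13 v2	~7.19 million SNPs, ~0.75 million InDels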

Database	Sequencing technology and scale	Data source	Data access mode$	Expression matrix quantification&	Link
Soybean Expression Atlas	5,481 samples*	ENA	1, 2, 3	1, 3	https://soyatlas.venanciogroup.uenf.br/
SoyOmics	576 variety, tissue, and developmental-stage combinations*; 7 datasets#; 5 datasets^	In-house data	1	3 (only query results can be downloaded)	https://ngdc.cncb.ac.cn/soyomics
Gene Expression Nebulas	499 samples*	SRA	1, 2, 3	1, 3, 4	https://ngdc.cncb.ac.cn/gen/species/Glycine%20max
SoyMD	435 samples*	GSA/SRA	1, 2	1, 2, 3 (only query results can be downloaded)	https://yanglab.hzau.edu.cn/SoyMD/
SoyBase	440 samples*	GEO/SRA, in-house data	1, 2, 3	3	https://www.soybase.org/tools/expression/
SoyOD	1,097 samples*	GEO/SRA, in-house data	1, 2	1, 3, 4 (only query results can be downloaded)	https://bis.zju.edu.cn/soyod

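Database	Number of accessions	Phenotypes	Data source	Link
SoyOD	4,097	225 traits; ~2,500 images	China	https://bis.zju.edu.cn/soyod
SoyOmics	2,898	115 traits	China	https://ngdc.cncb.ac.cn/soyomics/phenome
SoyFGB	2,214	42	Asia, the Americas, Europe, and Africa	https://sfgb.rmbreeding.cn/search/phenotype
GWAS Atlas	-	145 traits with 8,950 corresponding association records	Accumulated international research	https://ngdc.cncb.ac.cn/gwas
SoyBase	-	QTL association records for >90 traits	Accumulated international research	https://soybase.org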
