Journal of Agricultural Big Data ›› 2024, Vol. 6 ›› Issue (3): 412-423. DOI: 10.19788/j.issn.2096-6369.000052
ZHANG YuQin1,2, ZHU JingQuan3, DONG Wei2, LI FuZhong1,*, GUO LeiFeng2,*
Received: 2024-05-23
Accepted: 2024-06-23
Online: 2024-09-26
Published: 2024-10-01
Contact: LI FuZhong, GUO LeiFeng
ZHANG YuQin, ZHU JingQuan, DONG Wei, LI FuZhong, GUO LeiFeng. Construction Process and Technological Prospects of Large Language Models in the Agricultural Vertical Domain[J]. Journal of Agricultural Big Data, 2024, 6(3): 412-423.
Table 3
Open source foundation models"
LLMs基模型 | 发布组织 | 参数大小(B指Billion) | 发布时间 | 处理语言 |
---|---|---|---|---|
Llama[ | Meta | 7B、13B、33B、65B | 2023年02月 | 英文 |
Llama2[ | Meta | 7B、13B、34B、70B | 2023年07月 | 英文 |
Bloom[ | BigScience | 560M、1.1B、1.7B、3B 、7.1B、176B | 2022年11月 | 英文 |
GLaM[ | 64B | 2022年08月 | 英文 | |
PaLM[ | 8B,62B,540B | 2022年10月 | 英文 | |
Qwen[ | 阿里云 | 1.5B、1.8B、7B、14B、72B | 2024年02月 | 中文 |
ChatGLM[ | 智谱 | 6B | 2023年01月 | 中文 |
Baichuan2[ | 百川智能 | 7B、13B | 2023年06月 | 中文 |
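For reference, the sketch below shows how one of these open-weight base checkpoints is typically loaded as the starting point for an agricultural vertical-domain model. It is a minimal sketch only, assuming the Hugging Face transformers library; the checkpoint ID (the smallest Bloom variant) and the prompt are purely illustrative.

```python
# Minimal sketch: load an open-source base model from Table 3 before any
# domain adaptation. Assumes the Hugging Face `transformers` library; the
# checkpoint ID below is chosen only for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "bigscience/bloom-560m"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Quick sanity check of the raw base model before agricultural fine-tuning.
prompt = "Wheat stripe rust is a fungal disease that"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```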
Table 4
Fine-tuning methods"
微调方法 | 原理 |
---|---|
Freeze | 微调Transformer模型的深层特征全连接层参数,可以在保证微调效率的前提下,最大限度地发挥模型的微调作用 |
Lora[ | 通过冻结模型的参数,并向模型中添加可训练的低秩分解层,仅训练新增层的参数,从而实现模型性能的提升 |
Prefix-Tuning[ | 在模型输入之前添加一系列任务特定的连续向量,称为前缀,来引导模型生成更符合特定任务要求的文本输出 |
P-tuning v1[ | 固定模型前馈层参数,仅仅更新部分embedding参数即可实现低成本微调大模型 |
P-tuning v2[ | 将可训练的连续提示独立添加到每个transformer层的输入中,删除带有LM头的verbalizers,以增强通用性 |
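To make the LoRA entry above concrete, the sketch below shows a parameter-efficient fine-tuning setup using the Hugging Face peft library. This is not the configuration of any specific agricultural model; the base checkpoint and the target_modules names are illustrative assumptions and depend on the model architecture.

```python
# Minimal LoRA sketch (Table 4): freeze the base model and train only small
# low-rank adapter matrices injected into selected projection layers.
# Assumes the Hugging Face `transformers` and `peft` libraries; the base
# checkpoint and target_modules are illustrative.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

base_id = "bigscience/bloom-560m"  # illustrative small base model
model = AutoModelForCausalLM.from_pretrained(base_id)

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                  # rank of the low-rank decomposition
    lora_alpha=32,        # scaling factor applied to the adapter output
    lora_dropout=0.05,
    target_modules=["query_key_value"],  # fused attention projection in Bloom-style models
)

model = get_peft_model(model, lora_config)
# Only the adapter parameters receive gradients; the frozen base weights stay untouched.
model.print_trainable_parameters()
```

The wrapped model would then be trained with a standard Trainer loop on domain instruction data; because only the adapter matrices are updated, memory and storage costs are a small fraction of full fine-tuning.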
[1] | BROWN T, MANN B, RYDER N, et al. Language models are few-shot learners[J]. Advances in Neural Information Processing Systems, 2020, 33: 1877-1901. |
[2] | ZHANG Zhenqian, WANG Shu, SONG Qi, et al. Application of artificial intelligence large models in smart agriculture[J]. Journal of Smart Agriculture, 2023, 3(10): 9-12, 17. |
[3] | GUO Wang, YANG Yusen, WU Huarui, et al. Large language models in the agricultural vertical domain: key technologies, application analysis and development directions[J]. Smart Agriculture, 2024, 6(2): 1-13. |
[4] | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[J]. Advances in Neural Information Processing Systems, 2017, 30. |
[5] | WANG Y, ZHANG Z, WANG R. Element-aware summarization with large language models: Expert-aligned evaluation and chain-of-thought method[J/OL]. arXiv preprint, 2023. arXiv:2305.13412. |
[6] | BRIAKOU E, CHERRY C, FOSTER G. Searching for needles in a haystack: On the role of incidental bilingualism in PaLM's translation capability[J/OL]. arXiv preprint, 2023. arXiv:2305.10266. |
[7] | LIU Z, YANG K, ZHANG T, et al. Emollms: A series of emotional large language models and annotation tools for comprehensive affective analysis[J/OL]. arXiv preprint, 2024. arXiv:2401.08508. |
[8] | WANG Ting, WANG Na, CUI Yunpeng, et al. Intelligent question answering system for fruit and vegetable agricultural technology knowledge based on large AI model technology[J]. Smart Agriculture, 2023, 5(4): 105-116. |
[9] | ZHANG X, TIAN C, YANG X, et al. Alpacare: Instruction-tuned large language models for medical application[J/OL]. arXiv preprint, 2023. arXiv:2310.14558. |
[10] | ZHANG H, CHEN J, JIANG F, et al. HuatuoGPT, towards taming language model to be a doctor[J/OL]. arXiv preprint, 2023. arXiv:2305.15075. |
[11] | HUANG Q, TAO M, AN Z, et al. Lawyer llama technical report[J/OL]. arXiv preprint, 2023. arXiv:2305.15062. |
[12] | LUO Y, ZHANG J, FAN S, et al. Biomedgpt: Open multimodal generative pre-trained transformer for biomedicine[J/OL]. arXiv preprint, 2023. arXiv:2308.09442. |
[13] | LUO Y, YANG K, HONG M, et al. Molfm: A multimodal molecular foundation model[J/OL]. arXiv preprint, 2023. arXiv:2307.09484. |
[14] | ZHAO S, ZHANG J, NIE Z. Large-scale cell representation learning via divide-and-conquer contrastive learning[J/OL]. arXiv preprint, 2023. arXiv:2306.04371. |
[15] | YANG H, LIU X Y, WANG C D. Fingpt: Open-source financial large language models[J/OL]. arXiv preprint, 2023. arXiv:2306.06031. |
[16] | LI Y, MA S, WANG X, et al. EcomGPT: Instruction-tuning large language models with chain-of-task tasks for e-commerce[C]// Proceedings of the AAAI Conference on Artificial Intelligence. 2024, 38(17): 18582-18590. |
[17] | YANG X, GAO J, XUE W, et al. Pllama: An open-source large language model for plant science[J/OL]. arXiv preprint, 2024. arXiv:2401.01600. |
[18] | ZHAO B, JIN W, SER J D, et al. ChatAgri: Exploring potentials of ChatGPT on cross-linguistic agricultural text classification[J]. Neurocomputing, 2023, 557: 126708. |
[19] | BALAGUER A, BENARA V, DE FREITAS CUNHA R L, et al. RAG vs fine-tuning: Pipelines, tradeoffs, and a case study on agriculture[J/OL]. arXiv preprint, 2024. arXiv:2401.08406. |
[20] | SILVA B, NUNES L, ESTEVÃO R, et al. GPT-4 as an agronomist assistant? Answering agriculture exams using large language models[J/OL]. arXiv preprint, 2023. arXiv:2310.06225. |
[21] | WANG Y, KORDI Y, MISHRA S, et al. Self-instruct: Aligning language models with self-generated instructions[J/OL]. arXiv preprint, 2022. arXiv:2212.10560. |
[22] | ZHANG X, YANG Q. Self-qa: Unsupervised knowledge guided language model alignment[J/OL]. arXiv preprint, 2023. arXiv:2305.11952. |
[23] | LIU X, HONG H, WANG X, et al. Selfkg: Self-supervised entity alignment in knowledge graphs[C]// Proceedings of the ACM Web Conference 2022. 2022: 860-870. |
[24] | YAN J, WANG C, CHENG W, et al. A retrospective of knowledge graphs[J]. Frontiers of Computer Science, 2018, 12: 55-74. DOI: 10.1007/s11704-016-5228-9. |
[25] | HOU Chen, NIU Peiyu. Research status and prospects of agricultural knowledge graph technology[J]. Transactions of the Chinese Society for Agricultural Machinery, 2024, 55(6): 1-17. |
[26] | TIAN Pengfei. Research and implementation of an apple pest and disease knowledge graph and a pesticide application assistant app[D]. Tarim, Xinjiang: Tarim University, 2023. |
[27] | ZHANG Yu, GUO Wenzhong, LIN Sen, et al. Construction and validation of a knowledge graph for strawberry cultivation management based on Neo4j[J]. Modern Agricultural Science and Technology, 2022(1): 223-230, 234. |
[28] | ZHANG Wenhao. Knowledge graph construction based on a soybean breeding corpus[D]. Jinan: Shandong University, 2021. |
[29] | TOUVRON H, LAVRIL T, IZACARD G, et al. Llama: Open and efficient foundation language models[J/OL]. arXiv preprint, 2023. arXiv:2302.13971. |
[30] | TOUVRON H, MARTIN L, STONE K, et al. Llama 2: Open foundation and fine-tuned chat models[J/OL]. arXiv preprint, 2023. arXiv:2307.09288. |
[31] | BigScience Workshop, LE SCAO T, FAN A, et al. Bloom: A 176B-parameter open-access multilingual language model[J/OL]. arXiv preprint, 2022. arXiv:2211.05100. |
[32] | DU N, HUANG Y, DAI A M, et al. Glam: Efficient scaling of language models with mixture-of-experts[C]// International Conference on Machine Learning. PMLR, 2022: 5547-5569. |
[33] | CHOWDHERY A, NARANG S, DEVLIN J, et al. Palm: Scaling language modeling with pathways[J]. Journal of Machine Learning Research, 2023, 24(240): 1-113. |
[34] | BAI J, BAI S, CHU Y, et al. Qwen technical report[J/OL]. arXiv preprint, 2023. arXiv:2309.16609. |
[35] | DU Z, QIAN Y, LIU X, et al. Glm: General language model pretraining with autoregressive blank infilling[J/OL]. arXiv preprint, 2021. arXiv:2103.10360. |
[36] | YANG A, XIAO B, WANG B, et al. Baichuan 2: Open large-scale language models[J/OL]. arXiv preprint, 2023. arXiv:2309.10305. |
[37] | OUYANG L, WU J, JIANG X, et al. Training language models to follow instructions with human feedback[J]. Advances in Neural Information Processing Systems, 2022, 35: 27730-27744. |
[38] | LIU H, TAM D, MUQEETH M, et al. Few-shot parameter-efficient fine-tuning is better and cheaper than in-context learning[J]. Advances in Neural Information Processing Systems, 2022, 35: 1950-1965. |
[39] | LESTER B, AL-RFOU R, CONSTANT N. The power of scale for parameter-efficient prompt tuning[J/OL]. arXiv preprint, 2021. arXiv:2104.08691. |
[40] | HU E J, SHEN Y, WALLIS P, et al. Lora: Low-rank adaptation of large language models[J/OL]. arXiv preprint, 2021. arXiv:2106.09685. |
[41] | LI X L, LIANG P. Prefix-tuning: Optimizing continuous prompts for generation[J/OL]. arXiv preprint, 2021. arXiv:2101.00190. |
[42] | LIU X, JI K, FU Y, et al. P-tuning v2: Prompt tuning can be comparable to fine-tuning universally across scales and tasks[J/OL]. arXiv preprint, 2021. arXiv:2110.07602. |
[43] | LEWIS P, PEREZ E, PIKTUS A, et al. Retrieval-augmented generation for knowledge-intensive nlp tasks[J]. Advances in Neural Information Processing Systems, 2020, 33: 9459-9474. |
[44] | GAO Y, XIONG Y, GAO X, et al. Retrieval-augmented generation for large language models: A survey[J/OL]. arXiv preprint, 2023. arXiv:2312.10997. |
[45] | CHANG Y, WANG X, WANG J, et al. A survey on evaluation of large language models[J]. ACM Transactions on Intelligent Systems and Technology, 2024, 15(3): 1-45. |
[46] | HENDRYCKS D, BURNS C, BASART S, et al. Measuring massive multitask language understanding[J/OL]. arXiv preprint, 2020. arXiv:2009.03300. |
[47] | LEWKOWYCZ A, SLONE A, ANDREASSEN A, et al. Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models[R]. Technical Report, 2022. |
[48] | BOMMASANI R, LIANG P, LEE T. Holistic evaluation of language models[J]. Annals of the New York Academy of Sciences, 2023, 1525(1): 140-146. |
[49] | CHIANG W L, ZHENG L, SHENG Y, et al. Chatbot arena: An open platform for evaluating llms by human preference[J/OL]. arXiv preprint, 2024. arXiv:2403.04132. |
[50] | ZHANG N, CHEN M, BI Z, et al. Cblue: A Chinese biomedical language understanding evaluation benchmark[J/OL]. arXiv preprint, 2021. arXiv:2106.08087. |
[51] | FEI Z, SHEN X, ZHU D, et al. Lawbench: Benchmarking legal knowledge of large language models[J/OL]. arXiv preprint, 2023. arXiv:2309.16289. |
[52] | GU Z, ZHU X, YE H, et al. Xiezhi: An ever-updating benchmark for holistic domain knowledge evaluation[C]// Proceedings of the AAAI Conference on Artificial Intelligence. 2024, 38(16): 18099-18107. |
[53] | CHIANG C H, LEE H. Can large language models be an alternative to human evaluations?[J/OL]. arXiv preprint, 2023. arXiv:2305.01937. |
[54] | WEI J, WANG X, SCHUURMANS D, et al. Chain-of-thought prompting elicits reasoning in large language models[J]. Advances in Neural Information Processing Systems, 2022, 35: 24824-24837. |
[55] | LIU Y, ITER D, XU Y, et al. G-eval: NLG evaluation using GPT-4 with better human alignment[J/OL]. arXiv preprint, 2023. arXiv:2303.16634. |
[56] | PIÑEIRO-MARTÍN A, GARCÍA-MATEO C, DOCÍO-FERNÁNDEZ L, et al. Ethical challenges in the development of virtual assistants powered by large language models[J]. Electronics, 2023, 12(14): 3170. |
[57] | SHUTSKE J M. Harnessing the power of large language models in agricultural safety & health[J]. Journal of Agricultural Safety and Health, 2023. |
[58] | CHEN X, LI L, CHANG L, et al. Challenges and Contributing Factors in the Utilization of Large Language Models (LLMs)[J/OL]. arXiv preprint, 2023. arXiv:2310.13343. |
[59] | McCLOSKEY M, COHEN N J. Catastrophic interference in connectionist networks: The sequential learning problem[M]// Psychology of Learning and Motivation. Academic Press, 1989, 24: 109-165. |
[60] | JI Z, LEE N, FRIESKE R, et al. Survey of hallucination in natural language generation[J]. ACM Computing Surveys, 2023, 55(12):1-38. |
[61] | HUANG L, YU W, MA W, et al. A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions[J/OL]. arXiv preprint, 2023. arXiv:2311.05232. |
[62] | MIN S, KRISHNA K, LYU X, et al. Factscore: Fine-grained atomic evaluation of factual precision in long form text generation[J/OL]. arXiv preprint, 2023. arXiv:2305.14251. |
[63] | ELARABY M, LU M, DUNN J, et al. Halo: Estimation and reduction of hallucinations in open-source weak large language models[J/OL]. arXiv preprint, 2023. arXiv:2308.11764. |
[64] | TAN C, CAO Q, LI Y, et al. On the promises and challenges of multimodal foundation models for geographical, environmental, agricultural, and urban planning applications[J/OL]. arXiv preprint, 2023. arXiv:2312.17016. |
[65] | YANG Z, LI L, LIN K, et al. The dawn of LMMs: Preliminary explorations with GPT-4V(ision)[J/OL]. arXiv preprint, 2023. arXiv:2309.17421. |
[66] | IBRAHIM A, SENTHILKUMAR K, SAITO K. Evaluating responses by ChatGPT to farmers' questions on irrigated rice cultivation in Nigeria[J/OL]. Research Square preprint, 2023. |
[67] | DU Baojia, ZHANG Jing, WANG Zongming, et al. Crop classification using Sentinel-2A NDVI time series and an object-oriented decision tree method[J]. Journal of Geo-Information Science, 2019, 21(5): 740-751. DOI: 10.12082/dqxxkx.2019.180412. |
[68] | CHEN Shiyang, LIU Jia. Evaluation of deep learning crop recognition algorithms based on GF-6 time-series data[J]. Transactions of the Chinese Society of Agricultural Engineering, 2021, 37(15): 161-168. |
[69] | ABRAMSON J, ADLER J, DUNGER J, et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3[J]. Nature, 2024: 1-3. |
[70] | CHEN L, ZAHARIA M, ZOU J. Frugalgpt: How to use large language models while reducing cost and improving performance[J/OL]. arXiv preprint, 2023. arXiv:2305.05176. |