Construction Process and Technological Prospects of Large Language Models in the Agricultural Vertical Domain

doi:10.19788/j.issn.2096-6369.000052

Abstract

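Abstract (translated from the Chinese):

With the spread of the internet, agricultural knowledge and information have become easier to obtain, but most of this information is static and generic and cannot offer solutions tailored to specific situations. Against this background, large language models (LLMs), as an efficient artificial intelligence tool, are gradually attracting attention and finding application in agriculture. At present, reviews of large models in agriculture describe LLM technology only briefly and do not systematically introduce how LLMs are built. This paper focuses on the construction process of LLMs for the agricultural vertical domain, which comprises data collection and preprocessing, selection of an appropriate base LLM, fine-tuning, Retrieval-Augmented Generation (RAG), and evaluation. It also describes how the LangChain framework is used to build agricultural question-answering systems. Finally, it summarizes current challenges in building agricultural vertical-domain LLMs, namely data security, model forgetting, and model hallucination, and proposes future directions for such models, including multimodal data fusion, timely data updates, multilingual knowledge representation, and fine-tuning cost optimization, in order to further raise the level of intelligence and modernization in agricultural production.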

Keywords: large language models, retrieval-augmented generation, LangChain, agricultural question-answering systems

Abstract:

With the proliferation of the internet, accessing agricultural knowledge and information has become more convenient. However, this information is often static and generic, failing to provide tailored solutions for specific situations. To address this issue, vertical domain models in agriculture combine agricultural data with large language models (LLMs), using natural language processing and semantic understanding technologies to provide real-time answers to agricultural questions and to support agricultural decision-making and extension. This paper details the construction process of LLMs in the agricultural vertical domain, including data collection and preprocessing, selection of an appropriate pre-trained base LLM, fine-tuning, Retrieval-Augmented Generation (RAG), and evaluation. The paper also discusses the application of the LangChain framework in agricultural Q&A systems. Finally, the paper summarizes challenges in building LLMs for the agricultural vertical domain, including data security, model forgetting, and model hallucination, and proposes future development directions for agricultural models, including the use of multimodal data, real-time data updates, the integration of multilingual knowledge, and the optimization of fine-tuning costs, to further promote the intelligence and modernization of agricultural production.

Key words: LLMs, RAG, LangChain, agricultural Q&A systems

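张宇芹, 朱景全, 董薇, 李富忠, 郭雷风. Construction Process and Technological Prospects of Large Language Models in the Agricultural Vertical Domain[J]. 农业大数据学报 (Journal of Agricultural Big Data), 2024, 6(3): 412-423.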

ZHANG YuQin, ZHU JingQuan, DONG Wei, LI FuZhong, GUO LeiFeng. Construction Process and Technological Prospects of Large Language Models in the Agricultural Vertical Domain[J]. Journal of Agricultural Big Data, 2024, 6(3): 412-423.

Figures and tables (10): Table 1, Figure 1, Table 2, Figure 2, Figure 3, Table 3, Table 4, Figure 4, Figure 5, Figure 6

References (70)

[1]	BROWN T, MANN B, RYDER N, et al. Language models are few-shot learners[J]. Advances in neural information processing systems, 2020, 33: 1877-1901.
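[2]	ZHANG Z Q, WANG S, SONG Q, et al. Application of artificial intelligence large models in smart agriculture[J]. Journal of Smart Agriculture, 2023, 3(10): 9-12+17. (in Chinese)
[3]	GUO W, YANG Y S, WU H R, et al. Large language models in the agricultural vertical domain: Key technologies, application analysis and development directions[J]. Smart Agriculture, 2024, 6(2): 1-13. (in Chinese)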
[4]	VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[J]. Advances in neural information processing systems, 2017, 30.
[5]	WANG Y, ZHANG Z, WANG R. Element-aware summarization with large language models: Expert-aligned evaluation and chain-of-thought method[J/OL]. arXiv preprint, 2023. arXiv:2305.13412.
[6]	BRIAKOU E, CHERRY C, FOSTER G. Searching for Needles in a Haystack: On the Role of Incidental Bilingualism in PaLM's Translation Capability[J/OL]. arXiv preprint, 2023. arXiv:2305.10266.
[7]	LIU Z, YANG K, ZHANG T, et al. Emollms: A series of emotional large language models and annotation tools for comprehensive affective analysis[J/OL]. arXiv preprint, 2024. arXiv:2401.08508.
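[8]	WANG T, WANG N, CUI Y P, et al. Intelligent question-answering system for fruit and vegetable agricultural technology knowledge based on large AI model technology[J]. Smart Agriculture, 2023, 5(4): 105-116. (in Chinese)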
[9]	ZHANG X, TIAN C, YANG X, et al. Alpacare: Instruction-tuned large language models for medical application[J/OL]. arXiv preprint, 2023. arXiv:2310.14558.
[10]	ZHANG H, CHEN J, JIANG F, et al. HuatuoGPT, towards taming language model to be a doctor[J/OL]. arXiv preprint, 2023. arXiv:2305.15075.
[11]	HUANG Q, TAO M, AN Z, et al. Lawyer llama technical report[J/OL]. arXiv preprint, 2023. arXiv:2305.15062.
[12]	LUO Y, ZHANG J, FAN S, et al. Biomedgpt: Open multimodal generative pre-trained transformer for biomedicine[J/OL]. arXiv preprint, 2023. arXiv:2308.09442.
[13]	LUO Y, YANG K, HONG M, et al. Molfm: A multimodal molecular foundation model[J/OL]. arXiv preprint, 2023. arXiv:2307.09484.
[14]	ZHAO S, ZHANG J, NIE Z. Large-scale cell representation learning via divide-and-conquer contrastive learning[J/OL]. arXiv preprint, 2023. arXiv:2306.04371.
[15]	YANG H, LIU X Y, WANG C D. Fingpt: Open-source financial large language models[J/OL]. arXiv preprint, 2023. arXiv:2306.06031.
[16]	LI Y, MA S, WANG X, et al. EcomGPT: Instruction-tuning large language models with chain-of-task tasks for e-commerce[C]// Proceedings of the AAAI Conference on Artificial Intelligence. 2024, 38(17): 18582-18590.
[17]	YANG X, GAO J, XUE W, et al. Pllama: An open-source large language model for plant science[J/OL]. arXiv preprint, 2024. arXiv:2401.01600.
[18]	ZHAO B, JIN W, SER J D, et al. ChatAgri: Exploring potentials of ChatGPT on cross-linguistic agricultural text classification[J]. Neurocomputing, 2023, 557: 126708.
[19]	BALAGUER A, BENARA V, DE FREITAS CUNHA R L, et al. RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture[J/OL]. arXiv preprint, 2024. arXiv:2401.08406.
[20]	SILVA B, NUNES L, ESTEVÃO R, et al. GPT-4 as an Agronomist Assistant? Answering Agriculture Exams Using Large Language Models[J/OL]. arXiv preprint, 2023. arXiv:2310.06225.
[21]	WANG Y, KORDI Y, MISHRA S, et al. Self-instruct: Aligning language models with self-generated instructions[J/OL]. arXiv preprint, 2022. arXiv:2212.10560.
[22]	ZHANG X, YANG Q. Self-qa: Unsupervised knowledge guided language model alignment[J/OL]. arXiv preprint, 2023. arXiv:2305.11952.
[23]	LIU X, HONG H, WANG X, et al. Selfkg: Self-supervised entity alignment in knowledge graphs[C]// Proceedings of the ACM Web Conference 2022. 2022: 860-870.
[24]	YAN J, WANG C, CHENG W, et al. A retrospective of knowledge graphs[J]. Frontiers of Computer Science, 2018, 12: 55-74. doi: 10.1007/s11704-016-5228-9
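[25]	HOU C, NIU P Y. Research status and prospects of agricultural knowledge graph technology[J]. Transactions of the Chinese Society for Agricultural Machinery, 2024, 55(6): 1-17. (in Chinese)
[26]	TIAN P F. Research and implementation of an apple disease and pest knowledge graph and a pesticide application assistant app[D]. Tarim, Xinjiang: Tarim University, 2023. (in Chinese)
[27]	ZHANG Y, GUO W Z, LIN S, et al. Construction and verification of a strawberry planting management knowledge graph based on Neo4j[J]. Modern Agricultural Science and Technology, 2022(1): 223-230+234. (in Chinese)
[28]	ZHANG W H. Knowledge graph construction based on a soybean breeding corpus[D]. Jinan: Shandong University, 2021. (in Chinese)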
[29]	TOUVRON H, LAVRIL T, IZACARD G, et al. Llama: Open and efficient foundation language models[J/OL]. arXiv preprint, 2023. arXiv:2302.13971.
[30]	TOUVRON H, MARTIN L, STONE K, et al. Llama 2: Open foundation and fine-tuned chat models[J/OL]. arXiv preprint, 2023. arXiv:2307.09288.
[31]	WORKSHOP B S, SCAO T L, FAN A, et al. Bloom: A 176b-parameter open-access multilingual language model[J/OL]. arXiv preprint, 2022. arXiv:2211.05100.
[32]	DU N, HUANG Y, DAI A M, et al. Glam: Efficient scaling of language models with mixture-of-experts[C]// International Conference on Machine Learning. PMLR, 2022: 5547-5569.
[33]	CHOWDHERY A, NARANG S, DEVLIN J, et al. Palm: Scaling language modeling with pathways[J]. Journal of Machine Learning Research, 2023, 24(240): 1-113.
[34]	BAI J, BAI S, CHU Y, et al. Qwen technical report[J/OL]. arXiv preprint, 2023. arXiv:2309.16609.
[35]	DU Z, QIAN Y, LIU X, et al. Glm: General language model pretraining with autoregressive blank infilling[J/OL]. arXiv preprint, 2021. arXiv:2103.10360.
[36]	YANG A, XIAO B, WANG B, et al. Baichuan 2: Open large-scale language models[J/OL]. arXiv preprint, 2023. arXiv:2309.10305.
[37]	OUYANG L, WU J, JIANG X, et al. Training language models to follow instructions with human feedback[J]. Advances in Neural Information Processing Systems, 2022, 35: 27730-27744.
[38]	LIU H, TAM D, MUQEETH M, et al. Few-shot parameter-efficient fine-tuning is better and cheaper than in-context learning[J]. Advances in Neural Information Processing Systems, 2022, 35: 1950-1965.
[39]	LESTER B, AL-RFOU R, CONSTANT N. The power of scale for parameter-efficient prompt tuning[J/OL]. arXiv preprint, 2021. arXiv:2104.08691.
[40]	HU E J, SHEN Y, WALLIS P, et al. Lora: Low-rank adaptation of large language models[J/OL]. arXiv preprint, 2021. arXiv:2106.09685.
[41]	LI X L, LIANG P. Prefix-tuning: Optimizing continuous prompts for generation[J/OL]. arXiv preprint, 2021. arXiv:2101.00190.
[42]	LIU X, JI K, FU Y, et al. P-tuning v2: Prompt tuning can be comparable to fine-tuning universally across scales and tasks[J/OL]. arXiv preprint, 2021. arXiv:2110.07602.
[43]	LEWIS P, PEREZ E, PIKTUS A, et al. Retrieval-augmented generation for knowledge-intensive nlp tasks[J]. Advances in Neural Information Processing Systems, 2020, 33: 9459-9474.
[44]	GAO Y, XIONG Y, GAO X, et al. Retrieval-augmented generation for large language models: A survey[J/OL]. arXiv preprint, 2023. arXiv:2312.10997.
[45]	CHANG Y, WANG X, WANG J, et al. A survey on evaluation of large language models[J]. ACM Transactions on Intelligent Systems and Technology, 2024, 15(3): 1-45.
[46]	HENDRYCKS D, BURNS C, BASART S, et al. Measuring massive multitask language understanding[J/OL]. arXiv preprint, 2020. arXiv:2009.03300.
[47]	LEWKOWYCZ A, SLONE A, ANDREASSEN A, et al. Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models[R]. Technical Report, 2022.
[48]	BOMMASANI R, LIANG P, LEE T. Holistic evaluation of language models[J]. Annals of the New York Academy of Sciences, 2023, 1525(1): 140-146.
[49]	CHIANG W L, ZHENG L, SHENG Y, et al. Chatbot arena: An open platform for evaluating llms by human preference[J/OL]. arXiv preprint, 2024. arXiv:2403.04132.
[50]	ZHANG N, CHEN M, BI Z, et al. Cblue: A chinese biomedical language understanding evaluation benchmark[J/OL]. arXiv preprint, 2021. arXiv:2106.08087.
[51]	FEI Z, SHEN X, ZHU D, et al. Lawbench: Benchmarking legal knowledge of large language models[J/OL]. arXiv preprint, 2023. arXiv:2309.16289.
[52]	GU Z, ZHU X, YE H, et al. Xiezhi: An ever-updating benchmark for holistic domain knowledge evaluation[C]// Proceedings of the AAAI Conference on Artificial Intelligence. 2024, 38(16): 18099-18107.
[53]	CHIANG C H, LEE H. Can large language models be an alternative to human evaluations?[J/OL]. arXiv preprint, 2023. arXiv:2305.01937.
[54]	WEI J, WANG X, SCHUURMANS D, et al. Chain-of-thought prompting elicits reasoning in large language models[J]. Advances in Neural Information Processing Systems, 2022, 35: 24824-24837.
[55]	LIU Y, ITER D, XU Y, et al. G-eval: Nlg evaluation using gpt-4 with better human alignment[J/OL]. arXiv preprint, 2023. arXiv:2303.16634.
[56]	PIÑEIRO-MARTÍN A, GARCÍA-MATEO C, DOCÍO-FERNÁNDEZ L, et al. Ethical challenges in the development of virtual assistants powered by large language models[J]. Electronics, 2023, 12(14): 3170.
[57]	SHUTSKE J M. Harnessing the power of large language models in agricultural safety & health[J]. Journal of Agricultural Safety and Health, 2023: 0.
[58]	CHEN X, LI L, CHANG L, et al. Challenges and Contributing Factors in the Utilization of Large Language Models (LLMs)[J/OL]. arXiv preprint, 2023. arXiv:2310.13343.
[59]	McCLOSKEY M, COHEN N J. Catastrophic interference in connectionist networks: The sequential learning problem[M]// Psychology of Learning and Motivation. Academic Press, 1989, 24: 109-165.
[60]	JI Z, LEE N, FRIESKE R, et al. Survey of hallucination in natural language generation[J]. ACM Computing Surveys, 2023, 55(12):1-38.
[61]	HUANG L, YU W, MA W, et al. A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions[J/OL]. arXiv preprint, 2023. arXiv:2311.05232.
[62]	MIN S, KRISHNA K, LYU X, et al. Factscore: Fine-grained atomic evaluation of factual precision in long form text generation[J/OL]. arXiv preprint, 2023. arXiv:2305.14251.
[63]	ELARABY M, LU M, DUNN J, et al. Halo: Estimation and reduction of hallucinations in open-source weak large language models[J/OL]. arXiv preprint, 2023. arXiv:2308.11764.
[64]	TAN C, CAO Q, LI Y, et al. On the promises and challenges of multimodal foundation models for geographical, environmental, agricultural, and urban planning applications[J/OL]. arXiv preprint, 2023. arXiv:2312.17016.
[65]	YANG Z, LI L, LIN K, et al. The dawn of LMMs: Preliminary explorations with GPT-4V(ision)[J/OL]. arXiv preprint, 2023. arXiv:2309.17421.
[66]	IBRAHIM A, SENTHILKUMAR K, SAITO K. Evaluating responses by ChatGPT to farmers’ questions on irrigated rice cultivation in Nigeria[J]. Research Square Platform LLC, 2023.
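[67]	DU B J, ZHANG J, WANG Z M, et al. Crop classification using Sentinel-2A NDVI time series and an object-oriented decision tree method[J]. Journal of Geo-information Science, 2019, 21(5): 740-751. doi: 10.12082/dqxxkx.2019.180412 (in Chinese)
[68]	CHEN S Y, LIU J. Evaluation of deep learning algorithms for crop identification based on GF-6 time series data[J]. Transactions of the Chinese Society of Agricultural Engineering, 2021, 37(15): 161-168. (in Chinese)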
[69]	ABRAMSON J, ADLER J, DUNGER J, et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3[J]. Nature, 2024: 1-3.
[70]	CHEN L, ZAHARIA M, ZOU J. Frugalgpt: How to use large language models while reducing cost and improving performance[J/OL]. arXiv preprint, 2023. arXiv:2305.05176.

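Model	Publishing organization	Base LLM	Domain
AlpaCare^[9]	University of California; Chinese Academy of Medical Sciences; Peking Union Medical College; University of Maryland	LLaMA	Medical
HuatuoGPT^[10]	Shenzhen Research Institute of Big Data; The Chinese University of Hong Kong, Shenzhen	Baichuan	Medical
Lawyer LLaMA^[11]	Peking University	LLaMA	Law
OpenBioMed^[12][13][14]	Institute for AI Industry Research, Tsinghua University; PharMolix	LLaMA2	Biology
FinGPT^[15]	Columbia University; New York University Shanghai	ChatGLM2	Finance
EcomGPT^[16]	Alibaba	BLOOMZ	E-commerce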

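Framework	Input data	Generated data
Self-instruct^[21]	Single-turn or multi-turn seed data	Single-turn or multi-turn instruction fine-tuning data
Self-QA^[22]	Unstructured data such as documents, news, and papers	Fine-tuning data constructed from the input data
Self-KG^[23]	Knowledge graphs	Fine-tuning data constructed from the input data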

Base LLM	Publishing organization	Parameter size (B = billion)	Release date	Language
Llama^[29]	Meta	7B, 13B, 33B, 65B	February 2023	English
Llama2^[30]	Meta	7B, 13B, 34B, 70B	July 2023	English
Bloom^[31]	BigScience	560M, 1.1B, 1.7B, 3B, 7.1B, 176B	November 2022	English
GLaM^[32]	Google	64B	August 2022	English
PaLM^[33]	Google	8B, 62B, 540B	October 2022	English
Qwen^[34]	Alibaba Cloud	1.5B, 1.8B, 7B, 14B, 72B	February 2024	Chinese
ChatGLM^[35]	Zhipu AI	6B	January 2023	Chinese
Baichuan2^[36]	Baichuan Intelligence	7B, 13B	June 2023	Chinese

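Fine-tuning method	Principle
Freeze	Fine-tunes only the fully connected layer parameters in the deep feature layers of the Transformer model, maximizing the effect of fine-tuning while keeping it efficient
LoRA^[40]	Freezes the model's parameters and adds trainable low-rank decomposition layers; only the parameters of the added layers are trained, which improves model performance
Prefix-Tuning^[41]	Prepends a sequence of task-specific continuous vectors, called a prefix, to the model input to steer the model toward text output that better fits the task
P-tuning v1^[39]	Freezes the model's feed-forward layer parameters and updates only a subset of the embedding parameters, enabling low-cost fine-tuning of large models
P-tuning v2^[42]	Adds trainable continuous prompts independently to the input of every Transformer layer and removes verbalizers with LM heads to improve generality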
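
The LoRA row of the table above can be illustrated with a small numerical sketch. This is not the authors' implementation. It is a minimal NumPy example that assumes a single linear layer, and the layer sizes, the rank `r`, the scaling factor `alpha`, and the function name `lora_forward` are all hypothetical choices made for illustration. It shows the frozen weight `W0`, the two trainable low-rank factors `A` and `B`, and how few parameters are trained compared with the full weight matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen for illustration only.
d_out, d_in, r, alpha = 64, 64, 4, 8

# Frozen pretrained weight: never updated during LoRA fine-tuning.
W0 = rng.normal(size=(d_out, d_in))

# Trainable low-rank factors. B starts at zero, so the adapted layer
# initially behaves exactly like the frozen layer.
A = rng.normal(scale=0.01, size=(r, d_in))
B = np.zeros((d_out, r))


def lora_forward(x, W0, A, B, alpha, r):
    """Frozen path plus scaled low-rank update: x W0^T + (alpha/r) x A^T B^T."""
    return x @ W0.T + (alpha / r) * (x @ A.T) @ B.T


x = rng.normal(size=(2, d_in))
base = x @ W0.T
out = lora_forward(x, W0, A, B, alpha, r)

# Only A and B would receive gradient updates.
trainable = A.size + B.size   # 4*64 + 64*4 = 512
full = W0.size                # 64*64 = 4096

# Stand-in for training: pretend B has been updated to non-zero values.
B_trained = rng.normal(scale=0.01, size=(d_out, r))
out_trained = lora_forward(x, W0, A, B_trained, alpha, r)

# After training, the update can be merged into one weight matrix,
# so inference has the same cost as the original layer.
W_merged = W0 + (alpha / r) * B_trained @ A
merged_out = x @ W_merged.T

print(f"trainable parameters: {trainable} of {full}")
```

In this toy layer, 512 parameters are trained instead of 4096. In practice a library such as Hugging Face PEFT would typically be used to apply the same idea to selected layers of a base LLM, rather than hand-written matrices.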