A Survey of Big Data Deep Learning Systems and a Typical Agricultural Application
Received date: 2019-04-10
Online published: 2019-08-21
Funding
National Key Research and Development Program of China, "Cloud Computing and Big Data" Key Special Project, "Data-Driven Intelligent Management Technology and Platform for Cloud Data Centers (Common Key Technologies)" (018YFB1003700)
With the rapid development of the information age, big data has become a key technology driving major changes in production and daily life, and it plays an important role in many fields, including agriculture. Analyzing and exploiting big data effectively, and extracting its full value, depends critically on the research and development of deep learning technology. Against this background, this paper surveys the main technical characteristics and development of big data deep learning systems, covering deep learning models (such as CNN and RNN models), optimization algorithms, big data learning frameworks, and hardware configurations. It also describes the technical characteristics and development history of five mainstream deep learning frameworks, including PyTorch, and compares their strengths and weaknesses. In addition, the paper presents a typical agricultural application of big data deep learning systems, the "Grape Leaf Downy Mildew Forecasting System Based on Big Data", and uses its key step, the grape leaf classification and recognition process, as an example to explain the working principle in detail, including data collection, sample feature extraction, clustering algorithms, classification algorithms, and result analysis. By applying big data and deep learning techniques, the system is markedly effective in detecting and preventing downy mildew of grape leaves. Finally, the paper discusses the main development trends of big data deep learning systems and the issues that require attention in agricultural research and applications. Today, big data deep learning systems play an increasingly important role and are widely used in agricultural data analysis, including the prediction of crop diseases and pests.
Zhang Lingxu, Han Rui, Li Wenming, Shi Yinxue, Liu Chi. A Survey of Big Data Deep Learning Systems and a Typical Agricultural Application[J]. Journal of Agricultural Big Data, 2019, 1(2): 88-104. DOI: 10.19788/j.issn.2096-6369.190208