Journal of Agricultural Big Data
Spatial Feature Fusion-Based ViT Method for Fine-Grained Classification of Wolfberry Pests
Received date: 2024-09-09
Accepted date: 2024-10-14
Online published: 2024-12-02
To address the challenge of fine-grained pest classification in wolfberry cultivation, we propose a fine-grained classification model for agricultural pests, the Spatial Feature Fusion-based Data Augmented Visual Transformer (ESF-ViT). First, the model uses the self-attention mechanism to crop the foreground target from each image, which enhances the input and supplies more detailed representations. Second, it combines the self-attention mechanism with a Graph Convolutional Network (GCN) to extract spatial information from pest regions and learn the spatial posture features of the pests. To validate the model, we conducted experiments on CUB-200-2011, IP102, and WPIT9K, a wolfberry pest dataset from Ningxia. The results show that the proposed method outperforms the baseline ViT model by 1.83%, 2.09%, and 2.01% on the three datasets, respectively, and surpasses existing state-of-the-art pest classification models. The proposed model effectively addresses fine-grained image classification in agricultural pest recognition and provides a visual model for efficient pest monitoring and early warning.
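The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the crop region is chosen or how the graph is built, so the mean-attention threshold, the symmetric normalization with self-loops, and the function names `attention_crop_box` and `gcn_layer` are all assumptions made here for illustration.

```python
import math

def attention_crop_box(attn_map, ratio=1.0):
    """Bounding box (top, left, bottom, right), in patch units, around patches
    whose attention exceeds ratio * mean. Assumed rule, not from the paper."""
    rows, cols = len(attn_map), len(attn_map[0])
    mean = sum(sum(r) for r in attn_map) / (rows * cols)
    hot = [(i, j) for i in range(rows) for j in range(cols)
           if attn_map[i][j] > ratio * mean]
    if not hot:  # uniform attention: fall back to the full image
        return (0, 0, rows, cols)
    rs = [i for i, _ in hot]
    cs = [j for _, j in hot]
    return (min(rs), min(cs), max(rs) + 1, max(cs) + 1)

def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def gcn_layer(features, adjacency, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W).
    Assumes a non-negative adjacency, e.g. attention between patches."""
    n = len(adjacency)
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    d = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    norm = [[d[i] * a_hat[i][j] * d[j] for j in range(n)] for i in range(n)]
    out = matmul(matmul(norm, features), weight)
    return [[max(v, 0.0) for v in row] for row in out]

# Hypothetical 4x4 attention map with a bright 2x2 centre.
attn = [[0.01] * 4 for _ in range(4)]
for i in (1, 2):
    for j in (1, 2):
        attn[i][j] = 0.2
box = attention_crop_box(attn)  # (1, 1, 3, 3)
```

Multiplying the box by the patch size (16 pixels for a standard ViT) would give pixel coordinates for the crop. The patch features inside the box could then be passed through `gcn_layer` as graph nodes.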
SUN LuLu, LIU JianPing, ZHOU GuoMin, WANG Jian, LIU LiBo. Spatial Feature Fusion-Based ViT Method for Fine-Grained Classification of Wolfberry Pests[J]. Journal of Agricultural Big Data, 2024, 6(4): 522-531. DOI: 10.19788/j.issn.2096-6369.000066