A Multi-scale Feature Fusion Network Focusing on Small Objects in UAV-View

Colomina I, Molina P. Unmanned aerial systems for photogrammetry and remote sensing: a review. ISPRS J Photogramm Remote Sens. 2014;92:79–97.

Zhang Z. Drone-YOLO: an efficient neural network method for target detection in drone images. Drones. 2023;7(8):526.

Girshick R, Donahue J, Darrell T, Malik J. Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 580–587. 2014.

Girshick R. Fast R-CNN. In: Proceedings of the IEEE international conference on computer vision. 2015.

Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C-Y, Berg AC. SSD: Single shot multibox detector. In: Computer vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part I 14, pp. 21–37. Springer; 2016.

Terven J, Cordova-Esparza D. A comprehensive review of YOLO: from YOLOv1 to YOLOv8 and beyond. 2023. arXiv:2304.00501.

Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L. ImageNet: a large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition, pp. 248–255. IEEE; 2009.

Lin T-Y, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL. Microsoft COCO: common objects in context. In: Computer vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V 13, pp. 740–755. Springer; 2014.

Everingham M, Van Gool L, Williams CK, Winn J, Zisserman A. The PASCAL visual object classes (VOC) challenge. Int J Comput Vis. 2010;88:303–38.

Du D, Zhu P, Wen L, Bian X, Lin H, Hu Q, Peng T, Zheng J, Wang X, Zhang Y et al. VisDrone-DET2019: the vision meets drone object detection in image challenge results. In: Proceedings of the IEEE/CVF international conference on computer vision workshops. 2019.

Du D, Qi Y, Yu H, Yang Y, Duan K, Li G, Zhang W, Huang Q, Tian Q. The unmanned aerial vehicle benchmark: object detection and tracking. In: Proceedings of the European conference on computer vision, pp. 370–386. 2018.

Deng S, Li S, Xie K, Song W, Liao X, Hao A, Qin H. A global-local self-adaptive network for drone-view object detection. IEEE Trans Image Process. 2020;30:1556–69.

Chattopadhay A, Sarkar A, Howlader P, Balasubramanian VN. Grad-CAM++: generalized gradient-based visual explanations for deep convolutional networks. In: 2018 IEEE winter conference on applications of computer vision, pp. 839–847. 2018. https://doi.org/10.1109/WACV.2018.00097

Xu C, Wang J, Yang W, Yu H, Yu L, Xia G-S. Detecting tiny objects in aerial images: a normalized Wasserstein distance and a new benchmark. ISPRS J Photogramm Remote Sens. 2022;190:79–93. https://doi.org/10.1016/j.isprsjprs.2022.06.002

He K, Zhang X, Ren S, Sun J. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans Pattern Anal Mach Intell. 2015;37(9):1904–16.

Lin T-Y, Dollár P, Girshick R, He K, Hariharan B, Belongie S. Feature pyramid networks for object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2117–2125. 2017.

Liu S, Qi L, Qin H, Shi J, Jia J. Path aggregation network for instance segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 8759–8768. 2018.

Dai Y, Gieseke F, Oehmcke S, Wu Y, Barnard K. Attentional feature fusion. In: Proceedings of the IEEE/CVF winter conference on applications of computer vision, pp. 3560–3569. 2021.

Chen G, Wang H, Chen K, Li Z, Song Z, Liu Y, Chen W, Knoll A. A survey of the four pillars for small object detection: multiscale representation, contextual information, super-resolution, and region proposal. IEEE Trans Syst Man Cybern Syst. 2020;52(2):936–53.

Liu R, Yu Z, Mo D, Cai Y. An improved faster-RCNN algorithm for object detection in remote sensing images. In: 2020 39th Chinese control conference, pp. 7188–7192. IEEE; 2020.

Wang Q, Zhang H, Hong X, Zhou Q. Small object detection based on modified FSSD and model compression. In: 2021 IEEE 6th International Conference on Signal and Image Processing (ICSIP), pp. 88–92. IEEE; 2021.

Redmon J, Farhadi A. YOLOv3: an incremental improvement. 2018. arXiv:1804.02767

Bochkovskiy A, Wang C-Y, Liao H-YM. YOLOv4: optimal speed and accuracy of object detection. 2020. arXiv:2004.10934

Jocher G. Ultralytics YOLOv5. https://doi.org/10.5281/zenodo.3908559

Jocher G, Chaurasia A, Qiu J. Ultralytics YOLOv8.

Wang C-Y, Yeh I-H, Liao H-YM. YOLOv9: learning what you want to learn using programmable gradient information. 2024. arXiv:2402.13616

Cui L, Ma R, Lv P, Jiang X, Gao Z, Zhou B, Xu M. MDSSD: multi-scale deconvolutional single shot detector for small objects. 2018. arXiv:1805.07009

Duan K, Du D, Qi H, Huang Q. Detecting small objects using a channel-aware deconvolutional network. IEEE Trans Circ Syst Video Technol. 2019;30(6):1639–52.

Hu X, Xu X, Xiao Y, Chen H, He S, Qin J, Heng P-A. SINet: a scale-insensitive convolutional neural network for fast vehicle detection. IEEE Trans Intell Transp Syst. 2018;20(3):1010–9.

Hu J, Shen L, Sun G. Squeeze-and-excitation networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 7132–7141. 2018.

Gao F, He Y, Wang J, Hussain A, Zhou H. Anchor-free convolutional network with dense attention feature aggregation for ship detection in SAR images. Remote Sens. 2020;12(16).

Zeng S, Yang W, Jiao Y, Geng L, Chen X. SCA-YOLO: a new small object detection model for UAV images. Vis Comput. 2024;40(3):1787–803.

Zhao L, Zhu M. MS-YOLOv7: YOLOv7 based on multi-scale for object detection on UAV aerial photography. Drones. 2023;7(3):188.

Peng J, Liu Y, Tang S, Hao Y, Chu L, Chen G, Wu Z, Chen Z, Yu Z, Du Y et al. PP-LiteSeg: a superior real-time semantic segmentation model. 2022. arXiv:2204.02681

Rezatofighi H, Tsoi N, Gwak J, Sadeghian A, Reid I, Savarese S. Generalized intersection over union: a metric and a loss for bounding box regression. 2019.

Zheng Z, Wang P, Liu W, Li J, Ye R, Ren D. Distance-IoU loss: faster and better learning for bounding box regression. In: Proceedings of the AAAI conference on artificial intelligence, vol. 34, pp. 12993–13000. 2020.

Zheng Z, Wang P, Ren D, Liu W, Ye R, Hu Q, Zuo W. Enhancing geometric factors in model learning and inference for object detection and instance segmentation. IEEE Trans Cybern. 2022;52(8):8574–86. https://doi.org/10.1109/TCYB.2021.3095305

Zhang Y-F, Ren W, Zhang Z, Jia Z, Wang L, Tan T. Focal and efficient IoU loss for accurate bounding box regression. 2022.

Zhang S, Li C, Jia Z, Liu L, Zhang Z, Wang L. Diag-IoU loss for object detection. IEEE Trans Circ Syst Video Technol. 2023.

Gao F, Huo Y, Wang J, Hussain A, Zhou H. Anchor-free SAR ship instance segmentation with centroid-distance based loss. IEEE J Sel Top Appl Earth Observ Remote Sens. 2021;14:11352–71. https://doi.org/10.1109/JSTARS.2021.3123784

Ren S, He K, Girshick R, Sun J. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell. 2016;39(6):1137–49.

Wang J, Xu C, Yang W, Yu L. A normalized Gaussian Wasserstein distance for tiny object detection. 2021. arXiv:2110.13389

Cai Z, Vasconcelos N. Cascade R-CNN: delving into high quality object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 6154–6162. 2018.

Li X, Wang W, Wu L, Chen S, Hu X, Li J, Tang J, Yang J. Generalized focal loss: learning qualified and distributed bounding boxes for dense object detection. Adv Neural Inf Process Syst. 2020;33:21002–12.

Sandler M, Howard A, Zhu M, Zhmoginov A, Chen L-C. MobileNetV2: inverted residuals and linear bottlenecks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4510–4520. 2018.

Ma N, Zhang X, Zheng H-T, Sun J. ShuffleNet V2: practical guidelines for efficient CNN architecture design. In: Proceedings of the European conference on computer vision, pp. 116–131. 2018.

Du B, Huang Y, Chen J, Huang D. Adaptive sparse convolutional networks with global context enhancement for faster object detection on drone images. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 13435–13444. 2023.

Lin T-Y, Goyal P, Girshick R, He K, Dollár P. Focal loss for dense object detection. In: Proceedings of the IEEE international conference on computer vision, pp. 2980–2988. 2017.

Yang C, Huang Z, Wang N. QueryDet: cascaded sparse query for accelerating high-resolution small object detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 13668–13677. 2022.

Li C, Li L, Jiang H, Weng K, Geng Y, Li L, Ke Z, Li Q, Cheng M, Nie W et al. YOLOv6: a single-stage object detection framework for industrial applications. 2022. arXiv:2209.02976

Wang C-Y, Bochkovskiy A, Liao H-YM. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 7464–7475. 2023.

Wang A, Chen H, Liu L, Chen K, Lin Z, Han J, Ding G. YOLOv10: real-time end-to-end object detection. 2024. arXiv:2405.14458

Li Y, Fan Q, Huang H, Han Z, Gu Q. A modified YOLOv8 detection network for UAV aerial image recognition. Drones. 2023;7(5):304.

Liu S, Zha J, Sun J, Li Z, Wang G. EdgeYOLO: an edge-real-time object detector. In: 2023 42nd Chinese control conference, pp. 7507–7512. IEEE; 2023.

Jiang L, Yuan B, Du J, Chen B, Xie H, Tian J, Yuan Z. MFFSODNet: multi-scale feature fusion small object detection network for UAV aerial images. IEEE Trans Instrum Meas. 2024.

Yang F, Fan H, Chu P, Blasch E, Ling H. Clustered object detection in aerial images. In: Proceedings of the IEEE/CVF international conference on computer vision, pp. 8311–8320. 2019.

Li C, Yang T, Zhu S, Chen C, Guan S. Density map guided object detection in aerial images. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops, pp. 190–191. 2020.

Duan C, Wei Z, Zhang C, Qu S, Wang H. Coarse-grained density map guided object detection in aerial images. In: Proceedings of the IEEE/CVF international conference on computer vision, pp. 2789–2798. 2021.

Liao J, Piao Y, Su J, Cai G, Huang X, Chen L, Huang Z, Wu Y. Unsupervised cluster guided object detection in aerial images. IEEE J Sel Top Appl Earth Observ Remote Sens. 2021;14:11204–16.

Xu J, Li Y-L, Wang S. AdaZoom: towards scale-aware large scene object detection. IEEE Trans Multimed. 2022;25:4598–609.

Meethal A, Granger E, Pedersoli M. Cascaded zoom-in detector for high resolution aerial images. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 2045–2054. 2023.

Huang Y, Chen J, Huang D. UFPMP-Det: toward accurate and efficient object detection on drone imagery. In: Proceedings of the AAAI conference on artificial intelligence, vol. 36, pp. 1026–1033. 2022.

Yin N, Liu C, Tian R, Qian X. SDPDet: learning scale-separated dynamic proposals for end-to-end drone-view detection. IEEE Trans Multimed. 2024.
