Cite this article:
- YUAN Chang'an, WANG Wenji, HUANG Haojie, QIN Zhengyou, ZHANG Jinyong, LIAO Huixian, QIN Xiao, LI Xiaosen, LI Yongyu, FU Yunqin, TAN Sijing, QIAN Quanmei, WU Kunsheng. An Intensive Small Object Detection Network Based on Multi-scale Feature Extraction[J]. Guangxi Sciences, 2024, 31(5): 939-953.

An Intensive Small Object Detection Network Based on Multi-scale Feature Extraction
YUAN Chang'an¹,², WANG Wenji¹, HUANG Haojie³, QIN Zhengyou¹, ZHANG Jinyong¹, LIAO Huixian⁴, QIN Xiao¹,⁵, LI Xiaosen⁶, LI Yongyu¹, FU Yunqin¹, TAN Sijing¹, QIAN Quanmei¹, WU Kunsheng⁷

(1. Guangxi Key Laboratory of Human-Computer Interaction and Intelligent Decision Making, Nanning Normal University, Nanning, Guangxi, 530100, China; 2. Guangxi Academy of Sciences, Nanning, Guangxi, 530007, China; 3. Guangxi Technical Service Company, China Communications Services Corporation Limited, Nanning, Guangxi, 530000, China; 4. College of Digital Technology, Guangdong Vocational College of Finance and Trade, Qingyuan, Guangdong, 511510, China; 5. Guangxi Regional Collaborative Innovation Center for Multi-Source Data Integration and Intelligent Processing, Guilin, Guangxi, 541004, China; 6. School of Artificial Intelligence, Guangxi Minzu University, Nanning, Guangxi, 530006, China; 7. Nanning Arboretum, Guangxi Zhuang Autonomous Region, Nanning, Guangxi, 530225, China)

DOI: 10.13656/j.cnki.gxkx.20241127.011
Received: 2024-07-22    Revised: 2024-09-24
Funding: Supported by the Guangxi Science and Technology Major Project (Guike AA22068057 and Guike AB21076021).

Abstract:
Existing anchor-free object detection algorithms struggle to extract multi-scale target features effectively in dense scenes. To address this problem, an Intensive small object detection network based on Multi-Scale feature Extraction (IMSE) is proposed. Firstly, a Multi-scale Feature Enhancement (MFE) module, comprising a Window Attention (WA) module and a Multi-scale Information Fusion (MIF) module, is proposed. It establishes global context connections to enhance the feature expression of IMSE in dense scenes, which enables more effective extraction of the multi-scale features of detection objects. Secondly, a Deformable Convolutional Feature Pyramid Network (DCFPN) structure is proposed, which introduces dilated convolution for feature enhancement, thereby improving the ability of IMSE to detect irregularly shaped and irregularly distributed objects. Finally, the fused multi-scale features are fed into the detection head for classification and bounding box regression. IMSE was validated on the public datasets MS COCO and CARPK and on the WOOD dataset, which was constructed from actual production scenarios. The experimental results show that the Average Precision (AP) of IMSE on the three datasets reached 49.4%, 75.8%, and 55.0%, respectively, which is 1.8%, 1.4%, and 2.1% higher than that of the original FCOS method, verifying the effectiveness of the proposed model.
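
The abstract does not say how DCFPN applies dilated convolution, so the following is only a minimal sketch of the general operation, not the authors' implementation. It assumes a 1-D input, stride 1, and no padding; the names `dilated_conv1d` and `receptive_field` are illustrative and do not come from the paper. It shows how a dilation rate widens the span a fixed-size kernel covers without adding weights.

```python
def receptive_field(kernel_size, dilation):
    """Number of input positions spanned by one dilated kernel application."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded), stride-1 cross-correlation with a dilated kernel.

    Kernel tap k reads signal[start + k * dilation], so neighbouring taps
    are `dilation` positions apart in the input.
    """
    if dilation < 1:
        raise ValueError("dilation must be >= 1")
    span = receptive_field(len(kernel), dilation)
    outputs = []
    for start in range(len(signal) - span + 1):
        acc = 0
        for k, weight in enumerate(kernel):
            acc += weight * signal[start + k * dilation]
        outputs.append(acc)
    return outputs


if __name__ == "__main__":
    ramp = [1, 2, 3, 4, 5, 6, 7]
    edge = [1, 0, -1]  # simple difference kernel
    print(dilated_conv1d(ramp, edge, dilation=1))
    print(dilated_conv1d(ramp, edge, dilation=2))
```

With the same three-tap kernel, dilation 2 compares samples four positions apart instead of two, which is the sense in which dilation enlarges context at constant parameter count.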
Key words: object detection; self-attention mechanism; feature pyramid network; dilated convolution; deformable convolution
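
The reported gains can be read back into implied FCOS baselines. The sketch below assumes the stated differences are absolute percentage points; the abstract writes them as percentages without saying which reading is meant, and under a relative reading the numbers would differ. The function and variable names are illustrative only.

```python
# AP (%) of IMSE and its stated gain over FCOS, as reported in the abstract.
REPORTED = {
    "MS COCO": (49.4, 1.8),
    "CARPK": (75.8, 1.4),
    "WOOD": (55.0, 2.1),
}


def implied_baseline(ap, gain_points):
    """FCOS AP implied by an absolute gain in percentage points (assumption)."""
    return round(ap - gain_points, 1)


def relative_gain(ap, gain_points):
    """The same gain expressed relative to the implied baseline, in percent."""
    return round(100.0 * gain_points / (ap - gain_points), 1)


if __name__ == "__main__":
    for name, (ap, gain) in REPORTED.items():
        print(name, implied_baseline(ap, gain), relative_gain(ap, gain))
```

Under this assumption the implied baselines are derived values, not figures reported by the authors.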