Intensive Small Target Detection Network Based on Multi-scale Feature Extraction
YUAN Changan1, WANG Wenji2, HUANG Haojie3, QIN Zhengyou2, ZHANG Jinyong2, LIAO Huixian4, QIN Xiao2, LI Xiaosen5, LI Yongyu2, FU Yunqin2, TAN Sijing2, QIAN Quanmei2, WU Kunsheng6
(1. Guangxi Academy of Sciences, Nanning, Guangxi; 2. Guangxi Key Laboratory of Human-Computer Interaction and Intelligent Decision Making, Nanning Normal University, Nanning, Guangxi; 3. Guangxi Technical Service Company, China Communications Services Corporation Limited; 4. College of Digital Technology, Guangdong Vocational College of Finance and Trade, Qingyuan, Guangdong; 5. School of Artificial Intelligence, Guangxi Minzu University, Nanning; 6. Nanning Arboretum, Guangxi Zhuang Autonomous Region, Nanning, Guangxi)
DOI:
Received: 2024-07-22  Revised: 2025-02-01
Funding: Guangxi Science and Technology Major Project (Guike AA22068057, Guike AB21076021)
Abstract:
To address the difficulty that existing anchor-free object detection algorithms have in extracting multi-scale target features in dense scenes, this study proposes an Intensive Small Target Detection Network Based on Multi-scale Feature Extraction (IMSE). First, a multi-scale feature enhancement module is proposed, comprising a window attention module and a multi-scale information fusion module. By establishing global-level contextual connections, it strengthens the feature representation of IMSE in dense scenes and thus extracts the multi-scale features of detection targets more effectively. Second, a deformable convolutional feature pyramid structure is proposed, which introduces dilated convolution for feature enhancement and thereby improves the ability of IMSE to detect irregularly shaped and irregularly distributed objects. Finally, the fused multi-scale features are fed separately into the detection heads for classification and bounding-box regression. IMSE was validated on the public datasets MS COCO and CARPK and on the WOOD dataset, which was constructed from real production scenarios. The experimental results show that the average precision (AP) of IMSE on the three datasets reached 49.4%, 75.8%, and 55.0%, which is 1.8%, 1.4%, and 2.1% higher than the original FCOS method, respectively, verifying the effectiveness of the proposed model.
Key words:  Object Detection  Self-Attention  FPN  Atrous Convolution  Deformable Convolution
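
The abstract reports the AP of IMSE and its gain over FCOS, but not the FCOS figures themselves. As a rough sketch, the implied FCOS baselines can be recovered by subtraction, assuming the reported gains are absolute differences in percentage points (the abstract does not state this explicitly). The names `imse_ap`, `gain_over_fcos`, and `implied_baseline` are illustrative and do not come from the paper.

```python
# AP values (%) reported in the abstract for IMSE, and the reported gains over FCOS.
imse_ap = {"MS COCO": 49.4, "CARPK": 75.8, "WOOD": 55.0}
gain_over_fcos = {"MS COCO": 1.8, "CARPK": 1.4, "WOOD": 2.1}


def implied_baseline(ap, gain):
    """Return the baseline AP implied by a reported AP and gain.

    ASSUMPTION: `gain` is an absolute difference in percentage points,
    not a relative improvement. Rounded to one decimal place to match
    the precision used in the abstract.
    """
    return round(ap - gain, 1)


# Implied FCOS AP per dataset under the assumption above.
fcos_ap = {
    name: implied_baseline(imse_ap[name], gain_over_fcos[name])
    for name in imse_ap
}

for name, base in fcos_ap.items():
    print(f"{name}: implied FCOS AP = {base}%, IMSE AP = {imse_ap[name]}%")
```

If the gains were instead relative improvements, the baselines would be computed as `ap / (1 + gain / 100)` and would come out slightly higher; the full paper's result tables would settle which reading is intended.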
