Cite this article
  • QIN Xiao, LI Yongyu, WU Kunsheng, YUAN Chang'an, TAN Sijing, LIU Shanrui. A Novel Chinese Traffic Police Gesture Recognition Network Based on Human Pose[J]. Guangxi Sciences, 2024, 31(5): 1011-1024.
A Novel Chinese Traffic Police Gesture Recognition Network Based on Human Pose
QIN Xiao1,2, LI Yongyu1, WU Kunsheng3, YUAN Chang'an4, TAN Sijing1, LIU Shanrui1
(1. Guangxi Key Laboratory of Human-Computer Interaction and Intelligent Decision Making, Nanning Normal University, Nanning, Guangxi, 530100, China; 2. Guangxi Regional Collaborative Innovation Center for Multi-Source Data Integration and Intelligent Processing, Guilin, Guangxi, 541004, China; 3. Nanning Arboretum, Guangxi Zhuang Autonomous Region, Nanning, Guangxi, 530225, China; 4. Guangxi Academy of Sciences, Nanning, Guangxi, 530012, China)
Abstract:
Traffic police gesture recognition is crucial for autonomous driving technology. Existing pose-based methods for traffic police gesture recognition extract skeleton features that are incomplete and insufficiently robust. Their temporal feature extraction suffers from loss of dynamic information, weak temporal dependency, and poor real-time performance, and their results are easily affected by the environmental background. To address these shortcomings, a novel pose-based traffic police gesture recognition network, Pose Long Short-Term Memory (PoseLSTM), is proposed. Specifically, the Compositional Tokens Multi-layer Perceptron Mixer (CTMM) in PoseLSTM captures the relational features between body joints and transforms the joint information through dependency modeling to form multi-part feature representations, addressing the inability of Long Short-Term Memory (LSTM)-based algorithms to extract skeleton features effectively. Moreover, the hybrid-architecture Attention LSTM in PoseLSTM integrates input and hidden-state information better than the original LSTM and outperforms it. Experimental results showed that PoseLSTM achieved an accuracy of 100.00% on the open-source Chinese traffic police gesture dataset, achieving the best performance. Furthermore, to demonstrate the generalization ability of PoseLSTM, experiments were conducted on the open sign language datasets LSA64, WLASL-100, and CSL-500, with accuracies of 100.00%, 59.69%, and 96.40%, respectively.
Key words:  traffic police gesture recognition  attention mechanism  LSTM  joint combination
DOI: 10.13656/j.cnki.gxkx.20241127.017
Received: 2024-07-22  Revised: 2024-09-25
Funding: Supported by the Science and Technology Innovation 2030 Major Project "Brain Science and Brain-Like Research" of the Ministry of Science and Technology (2021ZD0201904) and the Guangxi Science and Technology Major Project (桂科AA22068057).
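To make the data flow described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes each video frame has already been reduced to 2D joint keypoints, mixes information across joints with a single learned joint-to-joint weight matrix (a simplified stand-in for the CTMM encoder), and feeds the result through a plain LSTM cell (the paper's Attention LSTM is not reproduced here). The joint count, hidden size, number of classes, and every function name are illustrative assumptions, and the weights are random rather than trained.

```python
import math
import random

N_JOINTS = 17    # assumed keypoints per frame (not taken from the paper)
N_COORDS = 2     # (x, y) per joint
HIDDEN = 16      # assumed LSTM hidden size
N_CLASSES = 8    # assumed number of gesture classes


def rand_matrix(rng, rows, cols, scale=0.1):
    """Small random weights; a real model would learn these."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def linear(x, weight, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def mix_joints(frame, w_mix):
    """Mix each coordinate channel across all joints, with a residual connection."""
    mixed = []
    for j, joint in enumerate(frame):
        row = w_mix[j]
        mixed.append([
            joint[c] + math.tanh(sum(row[k] * frame[k][c] for k in range(len(frame))))
            for c in range(len(joint))
        ])
    return mixed


def lstm_step(x, h, c, p):
    """One step of a standard LSTM cell (no attention)."""
    z = x + h  # concatenate input and previous hidden state
    i = [sigmoid(v) for v in linear(z, p["w_i"], p["b_i"])]
    f = [sigmoid(v) for v in linear(z, p["w_f"], p["b_f"])]
    o = [sigmoid(v) for v in linear(z, p["w_o"], p["b_o"])]
    g = [math.tanh(v) for v in linear(z, p["w_g"], p["b_g"])]
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new


def init_params(rng):
    in_dim = N_JOINTS * N_COORDS + HIDDEN
    p = {"w_mix": rand_matrix(rng, N_JOINTS, N_JOINTS)}
    for gate in "ifog":
        p["w_" + gate] = rand_matrix(rng, HIDDEN, in_dim)
        p["b_" + gate] = [0.0] * HIDDEN
    p["w_out"] = rand_matrix(rng, N_CLASSES, HIDDEN)
    p["b_out"] = [0.0] * N_CLASSES
    return p


def classify(sequence, p):
    """Return class probabilities for a sequence of frames of joint coordinates."""
    h = [0.0] * HIDDEN
    c = [0.0] * HIDDEN
    for frame in sequence:
        mixed = mix_joints(frame, p["w_mix"])
        x = [v for joint in mixed for v in joint]  # flatten joints into one vector
        h, c = lstm_step(x, h, c, p)
    logits = linear(h, p["w_out"], p["b_out"])
    peak = max(logits)  # subtract the max for a numerically stable softmax
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
params = init_params(rng)
# A synthetic 5-frame clip of random keypoints stands in for pose-estimator output.
clip = [[[rng.random(), rng.random()] for _ in range(N_JOINTS)] for _ in range(5)]
probs = classify(clip, params)
print(len(probs), round(sum(probs), 6))
```

Because the model sees only joint coordinates and never raw pixels, the input carries no background imagery, which is the usual motivation for pose-based recognition when results are sensitive to the scene behind the officer.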
