Cite this article
  • PENG Baicheng, ZHANG Anqin, ZHANG Ting. A Hybrid Credit Evaluation Method Integrating GAN and Deep Neural Network[J]. Guangxi Sciences, 2023, 30(1): 121-131.

A Hybrid Credit Evaluation Method Integrating GAN and Deep Neural Network
PENG Baicheng, ZHANG Anqin, ZHANG Ting
(College of Computer Science and Technology, Shanghai University of Electric Power, Shanghai 201306, China)
DOI:10.13656/j.cnki.gxkx.20230308.014
Funding: Supported by the National Natural Science Foundation of China (42672114).
Abstract:
With the rapid growth of credit card and personal loan business in the financial industry, detecting potential defaults and bad debts from limited information has become extremely important. The main difficulties in credit scoring are class imbalance and poor classifier performance. To address them, this study first proposes Tab-GAN, a generative adversarial network for tabular data, to generate sufficient default samples from the original data. A hybrid deep learning model based on CNN-LSTM is then designed for feature extraction. It comprises two sub-models, a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network, which extract static local features and dynamic temporal features from user data, respectively. A spatio-temporal attention module weights the model's outputs by importance so that the most critical information is retained. Finally, at the classifier level, a focal loss function is introduced to improve the Light Gradient Boosting Machine (LightGBM) classifier, which outputs the probability of default. The risk prediction model is validated on two real-world datasets. The experimental results show that the generative adversarial network effectively addresses the class imbalance problem, and that the CNN-LSTM+LightGBM model outperforms other advanced credit scoring algorithms on every classification metric evaluated, demonstrating the effectiveness and portability of the model in credit scoring.
Keywords: imbalanced data|credit scoring|GAN|CNN|LSTM|focal loss
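
The abstract states that a focal loss is introduced into the LightGBM classifier but does not give its form. Gradient boosting libraries that accept a custom objective generally need the per-sample gradient and Hessian of the loss with respect to the raw score, so the following is a minimal sketch of those two quantities for the standard binary focal loss, FL = -alpha_t (1 - p_t)^gamma log(p_t). The function names, the use of the alpha-balanced variant, and the values alpha = 0.25 and gamma = 2.0 are illustrative assumptions, not settings taken from the paper.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def focal_loss(z, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for raw score z and label y in {0, 1}.

    alpha and gamma are illustrative defaults (assumption),
    not values reported by the authors.
    """
    s = 1.0 if y == 1 else -1.0            # sign that maps z to the true class
    p_t = sigmoid(s * z)                   # predicted probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def focal_grad_hess(z, y, alpha=0.25, gamma=2.0):
    """First and second derivatives of focal_loss with respect to z."""
    s = 1.0 if y == 1 else -1.0
    p_t = sigmoid(s * z)
    alpha_t = alpha if y == 1 else 1.0 - alpha
    log_pt = math.log(p_t)
    # u collects the terms left after applying dp_t/dz = s * p_t * (1 - p_t)
    u = gamma * p_t * log_pt + p_t - 1.0
    grad = s * alpha_t * (1.0 - p_t) ** gamma * u
    hess = alpha_t * p_t * (1.0 - p_t) ** gamma * (
        -gamma * u + (1.0 - p_t) * (gamma * log_pt + gamma + 1.0)
    )
    return grad, hess
```

With gamma = 0 and alpha = 0.5 these expressions reduce to half the usual log-loss derivatives, which gives a quick sanity check. For gamma > 0 the Hessian can become negative on badly misclassified samples, so a practical objective may need to floor it at a small positive value; whether the authors do so is not stated in the abstract.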
