Cite this article
  • ZHANG Ling, MA Shilun, LI Lihui, WEN Yimin. A Concept Drift Data Stream Classification Algorithm Based on Local Classification Accuracy[J]. Guangxi Sciences, 2024, 31(1): 100-109.

A Concept Drift Data Stream Classification Algorithm Based on Local Classification Accuracy
ZHANG Ling, MA Shilun, LI Lihui, WEN Yimin
(Guangxi Key Laboratory of Image and Graphic Intelligent Processing, Guilin University of Electronic Technology, Guilin, Guangxi, 541004, China)
DOI: 10.13656/j.cnki.gxkx.20240417.010
Received: 2023-03-08; Revised: 2023-03-31
Funding: Supported by the National Natural Science Foundation of China (62366011), the Guangxi Key Research and Development Program (Guike AB21220023), and the Guangxi Key Laboratory of Image and Graphic Intelligent Processing Project (GIIP2306).
Abstract:
Classifying data streams with concept drift is a challenging problem. When a new concept appears, too few training samples of that concept are available to adjust the classifier in time, which lowers classification accuracy. To address this problem, this article proposes LA-MS-CDC, a concept drift data stream classification algorithm based on local classification accuracy. First, LA-MS-CDC combines k-means clustering with a local classification accuracy algorithm to select the optimal source domain classifier from the classifier pool. Second, the optimal source domain classifier and the target domain classifier are combined in a weighted ensemble to classify each sample. Third, the loss of each classifier is computed from the true label of the classified sample, and the weights of the target domain and source domain classifiers are updated. Fourth, the classified sample is used to update the target domain classifier and the optimal source domain classifier. Finally, the classifier pool is updated. Experimental results on public datasets show that LA-MS-CDC effectively transfers source domain knowledge to the target domain and that its classification performance is significantly better than that of existing methods. The algorithm code is available at https://gitee.com/ymw12345/LAMSCDC.
Key words: concept drift; multi-source online transfer learning; local classification accuracy; ensemble learning; diversity
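
The first three steps listed in the abstract (selecting a source classifier by local accuracy, weighted ensemble prediction, and loss-based weight updating) can be illustrated with a minimal sketch. It is not the authors' implementation, which is available at the repository linked above. Everything here is an assumption made for illustration: the function names, the toy threshold classifiers, the precomputed centroids and clusters standing in for a k-means result, the 0-1 loss, and the exponential (Hedge-style) weight update with rate `eta`. Updating the classifiers themselves and the classifier pool (the last two steps) is left out.

```python
import math

def nearest_centroid(x, centroids):
    """Index of the centroid closest to x (squared Euclidean distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(range(len(centroids)), key=lambda i: sq_dist(centroids[i]))

def local_accuracy(clf, region):
    """Fraction of labelled (x, y) pairs in one cluster that clf gets right."""
    if not region:
        return 0.0
    return sum(clf(x) == y for x, y in region) / len(region)

def select_source(pool, centroids, clusters, x):
    """Step 1: best pool classifier on the cluster that x falls into."""
    region = clusters[nearest_centroid(x, centroids)]
    return max(pool, key=lambda clf: local_accuracy(clf, region))

def ensemble_predict(weights, clfs, x):
    """Step 2: weighted vote; returns the label with the largest total weight."""
    votes = {}
    for w, clf in zip(weights, clfs):
        label = clf(x)
        votes[label] = votes.get(label, 0.0) + w
    return max(votes, key=votes.get)

def update_weights(weights, preds, y, eta=0.5):
    """Step 3: multiply each weight by exp(-eta * 0-1 loss), then renormalise."""
    scaled = [w * math.exp(-eta * (p != y)) for w, p in zip(weights, preds)]
    total = sum(scaled)
    return [w / total for w in scaled]

# Toy 1-D data (hypothetical): two source classifiers and two precomputed
# clusters of labelled samples, standing in for a k-means result.
def low_threshold(x):
    return int(x[0] > 0.0)

def high_threshold(x):
    return int(x[0] > 7.0)

pool = [low_threshold, high_threshold]
centroids = [(0.0,), (10.0,)]
clusters = [
    [((-1.0,), 0), ((1.0,), 1)],
    [((6.0,), 0), ((9.0,), 1), ((11.0,), 1)],
]

x, y = (6.5,), 0                                        # incoming sample, true label
source = select_source(pool, centroids, clusters, x)    # high_threshold here
target = low_threshold                                  # stand-in target classifier
weights = [0.6, 0.4]                                    # [source, target]
y_hat = ensemble_predict(weights, [source, target], x)
weights = update_weights(weights, [source(x), target(x)], y)
```

In this toy run the source classifier labels the sample correctly and the stand-in target classifier does not, so the update shifts weight toward the source classifier while the weights still sum to 1.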
