Cite this article: |
- 张艺,滕飞,胡节.基于多任务学习的国际疾病分类自动编码模型[J].广西科学,2023,30(1):114-120.
- ZHANG Yi, TENG Fei, HU Jie. Auto-encoding Model Based on Multi-Task Learning for International Classification of Diseases (ICD)[J]. Guangxi Sciences, 2023, 30(1): 114-120.
|
|
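Abstract: |
International Classification of Diseases (ICD) coding is the task of assigning disease codes to electronic medical records, where each record receives one or more ICD codes. Most existing methods consider the relationship between symptoms and diagnoses in clinical text but give little attention to diagnosis-diagnosis and symptom-symptom relationships. To address this gap, a code co-occurrence task is constructed for diagnosis-diagnosis relationships and trained in a multi-task setting, so that predictions do not depend on the order of the labels and erroneous predictions are not propagated. For symptom-symptom relationships, contrastive learning is used to obtain meaningful representations and to learn the consistency of symptoms within the same clinical text. Combining these tasks yields an automatic ICD coding framework based on multi-task learning. Experiments on the MIMIC-Ⅲ dataset show that, compared with a strong existing model, the proposed method improves Micro-F1 by 1.0%, Micro-AUC by 0.3%, and P@5 by 0.7%. |
Key words: ICD encoding|multi-task learning|code co-occurrence|contrastive learning|natural language processing |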
DOI:10.13656/j.cnki.gxkx.20230308.013 |
|
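Funding: Supported by the Sichuan Province International Science and Technology Innovation Cooperation Project (2022YFH0020) and the Sichuan Province Key Research and Development Project (2021YFG0136). |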
|
Auto-encoding Model Based on Multi-Task Learning for International Classification of Diseases (ICD) |
ZHANG Yi, TENG Fei, HU Jie
|
(School of Computer and Artificial Intelligence, Southwest Jiaotong University, Chengdu, Sichuan, 611756, China) |
Abstract: |
The task of automatic International Classification of Diseases (ICD) coding is to assign disease codes to electronic medical records, with each record assigned one or more ICD codes. Most existing methods consider the relationship between symptoms and diagnoses in clinical texts, but neglect the relationships among diagnoses and among symptoms. To address this, a code co-occurrence task is constructed for the relationships among diagnoses; it is trained in a multi-task form so that the prediction does not depend on the order of the labels and prediction errors are not propagated. For the relationships among symptoms, contrastive learning is used to obtain meaningful representations and to learn the consistency of symptoms within the same clinical text. Combining these tasks yields a framework for automatic ICD coding based on multi-task learning. Experiments on the MIMIC-Ⅲ dataset show that, compared with a strong existing model, the proposed method improves Micro-F1 by 1.0%, Micro-AUC by 0.3%, and P@5 by 0.7%. |
Key words: ICD encoding|multi-task learning|code co-occurrence|contrastive learning|natural language processing |
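
The abstract reports results in Micro-F1 and P@5. As a rough illustration of how these two multi-label metrics are commonly computed, here is a minimal pure-Python sketch. It is not the authors' evaluation code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for this example, and `k=2` is used only because the toy label set is small.

```python
def precision_at_k(y_true, y_score, k=5):
    """Average, over records, of the fraction of the k highest-scored codes that are true codes."""
    total = 0.0
    for labels, scores in zip(y_true, y_score):
        # Indices of the k codes with the highest predicted scores.
        top_k = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]
        total += sum(labels[j] for j in top_k) / k
    return total / len(y_true)


def micro_f1(y_true, y_pred):
    """F1 computed from true/false positives and false negatives pooled over all records and codes."""
    tp = fp = fn = 0
    for labels, preds in zip(y_true, y_pred):
        for t, p in zip(labels, preds):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: 2 records, 6 candidate codes (1 = code assigned).
y_true = [[1, 0, 1, 0, 0, 1],
          [0, 1, 0, 0, 1, 0]]
y_score = [[0.9, 0.2, 0.8, 0.1, 0.7, 0.3],
           [0.1, 0.6, 0.4, 0.2, 0.3, 0.5]]

# Assumed decision threshold of 0.5 to turn scores into binary predictions.
y_pred = [[1 if s >= 0.5 else 0 for s in row] for row in y_score]

p_at_2 = precision_at_k(y_true, y_score, k=2)
f1 = micro_f1(y_true, y_pred)
print(p_at_2, f1)
```

Because the micro average pools counts across all codes, frequent codes dominate Micro-F1, whereas P@k only looks at the top of each record's ranking and needs no threshold.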