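[Abstract] Competition: Global AI Technology Innovation Competition, Track 1: Abnormality Detection in Medical Imaging Reports. Background: while working, radiologists examine medical images (such as CT and MRI scans) and write descriptions of them. These descriptions contain a large amount of medical information and are of great value to medical AI. In this task, teams must use the text of doctors' CT image descriptions to determine whether several target body regions are abnormal and what type of abnormality is present. In the preliminary round, teams only need to determine whether each region is abnormal; in the semi-final ...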

Competition: Global AI Technology Innovation Competition, Track 1: Abnormality Detection in Medical Imaging Reports







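| Column | Type | Example |
| --- | --- | --- |
| report_ID | int | 1 |
| description | string, image description | 右下肺野见小结节样影与软组织肿块影 (small nodular shadows and a soft-tissue mass shadow are seen in the right lower lung field) |
| label | Consists of two parts. The first part is a list of abnormal-region IDs separated by spaces. The second part is a list of abnormality-type IDs separated by spaces. The two parts are separated by a comma ",". If none of the defined regions is abnormal, both parts are empty and the field is ",". | 4,1 2 |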




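  • Preliminary-round training data format (columns are separated by the delimiter "|,|"):

| Column | Type | Example |
| --- | --- | --- |
| report_ID | int | 1 |
| description | desensitized image description, split character by character with spaces | 101 47 12 66 74 90 0 411 234 79 175 |
| label | a list of abnormal-region IDs separated by spaces; empty if the description has no abnormal region | 3 4 |

  • Semi-final training data format (columns are separated by the delimiter "|,|"):

| Column | Type | Example |
| --- | --- | --- |
| report_ID | int | 1 |
| description | desensitized image description, split character by character with spaces | 101 47 12 66 74 90 0 411 234 79 175 |
| label | string in two parts. The first part is a list of abnormal-region IDs separated by spaces. The second part is a list of abnormality-type IDs separated by spaces. The two parts are separated by a comma ",". If none of the defined regions is abnormal, both parts are empty and the field is ",". | 3 4,0 2 |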
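To make the two training formats concrete, here is a minimal parsing sketch. The names `parse_line` and `multi_hot` are hypothetical, and the sketch assumes that region IDs run from 0 to 16 and type IDs from 0 to 11. That range is inferred from the 17- and 29-dimensional prediction vectors, not stated in the data description.

```python
SEP = "|,|"
NUM_REGIONS = 17  # assumption: matches the 17-dim preliminary prediction
NUM_TYPES = 12    # assumption: 29 - 17 dimensions in the semi-final


def multi_hot(ids, size):
    """Turn a space-separated ID string into a 0/1 list of length `size`."""
    vec = [0] * size
    for tok in ids.split():
        vec[int(tok)] = 1
    return vec


def parse_line(line, semi_final=False):
    """Split one training line into (report_ID, token IDs, label vector)."""
    fields = [f.strip() for f in line.rstrip("\n").split(SEP)]
    report_id, description, label = fields
    tokens = [int(t) for t in description.split()]
    if semi_final:
        # "3 4,0 2" -> regions "3 4", types "0 2"; "," -> both empty
        regions, _, types = label.partition(",")
        target = multi_hot(regions, NUM_REGIONS) + multi_hot(types, NUM_TYPES)
    else:
        target = multi_hot(label, NUM_REGIONS)
    return int(report_id), tokens, target


print(parse_line("1|,|101 47 12 66|,|3 4"))
print(parse_line("1|,|101 47 12 66|,|3 4,0 2", semi_final=True))
```

An empty label (preliminary) or a lone "," (semi-final) yields an all-zero target vector.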




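| Column | Type | Example |
| --- | --- | --- |
| report_ID | int | 1 |
| description | desensitized image description, split character by character with spaces | 101 47 12 66 74 90 0 411 234 79 175 |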



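  • Preliminary-round submission format (columns are separated by the delimiter "|,|"):

| Column | Type | Example |
| --- | --- | --- |
| report_ID | int | 1 |
| Prediction | 17-dimensional vector | 0.68 0.82 0.92 0.59 0.71 0.23 0.45 0.36 0.46 0.64 0.92 0.66 0.3 0.5 0.94 0.7 0.38 |

  • Semi-final submission format (columns are separated by the delimiter "|,|"):

| Column | Type | Example |
| --- | --- | --- |
| report_ID | int | 1 |
| Prediction | 29-dimensional vector (no comma is needed in the middle) | 0.68 0.82 0.92 0.59 0.71 0.23 0.45 0.36 0.46 0.64 0.92 0.66 0.3 0.5 0.94 0.7 0.38 0.05 0.97 0.71 0.5 0.64 0.0 0.54 0.5 0.49 0.41 0.06 0.07 |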
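A submission line can be assembled with a small helper such as the one below. This is only a sketch: `format_submission_line` is a hypothetical name, and the choice of six decimal places is an assumption, since the format description only shows space-separated probabilities.

```python
def format_submission_line(report_id, probs, expected_dim=17):
    """Build one "report_ID|,|p0 p1 ..." line for the submission file."""
    if len(probs) != expected_dim:
        raise ValueError(
            "expected {} values, got {}".format(expected_dim, len(probs))
        )
    # Probabilities are joined with single spaces, with no commas in between.
    return "{}|,|{}".format(report_id, " ".join("{:.6f}".format(p) for p in probs))


# Preliminary round: 17 values; semi-final: 29 values.
print(format_submission_line(1, [0.5] * 17))
print(format_submission_line(1, [0.5] * 29, expected_dim=29))
```

Checking the vector length before writing the line catches a 17-dimensional prediction accidentally submitted to the 29-dimensional semi-final.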



where $y_{n,m}$ and $\hat{y}_{n,m}$ are the true value and the predicted value of the m-th label of the n-th sample.

The preliminary-round score is S = 1 - mlogloss. To make the score range more reasonable, the semi-final round changes this to S = 1 - 2*mlogloss.
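The score can be reproduced offline with a few lines of Python. The sketch below assumes that mlogloss is the usual column-wise mean of the binary log loss, $-\frac{1}{M}\sum_{m}\frac{1}{N}\sum_{n}\left[y_{n,m}\log\hat{y}_{n,m}+(1-y_{n,m})\log(1-\hat{y}_{n,m})\right]$, which is consistent with the `logloss` function recorded at the end of this post. The clipping constant `eps` and the function names are assumptions, not part of the official definition.

```python
import math


def mlogloss(y_true, y_pred, eps=1e-15):
    """Mean over labels of the per-label mean binary log loss."""
    n, m = len(y_true), len(y_true[0])
    total = 0.0
    for j in range(m):
        col = 0.0
        for i in range(n):
            # Clip to avoid log(0); eps is an assumed value.
            p = min(max(y_pred[i][j], eps), 1 - eps)
            col += y_true[i][j] * math.log(p) + (1 - y_true[i][j]) * math.log(1 - p)
        total += -col / n
    return total / m


def score(y_true, y_pred, semi_final=False):
    """S = 1 - mlogloss (preliminary) or S = 1 - 2*mlogloss (semi-final)."""
    factor = 2.0 if semi_final else 1.0
    return 1.0 - factor * mlogloss(y_true, y_pred)


y_true = [[1, 0], [0, 1]]
uninformed = [[0.5, 0.5], [0.5, 0.5]]
print(score(y_true, uninformed), score(y_true, uninformed, semi_final=True))
```

Predicting 0.5 everywhere gives mlogloss = ln 2, so the semi-final scaling pushes an uninformed submission below zero.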

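In the semi-final round, the score has two parts. The first part is computed as in the preliminary round: the first 17 dimensions of the prediction are scored against the ground truth to obtain $S_1$. The second part covers every test sample that actually has an abnormal region: the last 12 dimensions of its prediction are scored against the true abnormality types with the same method as the first part. If K of the N test samples actually have an abnormal region, 12K values are scored (samples with no actual abnormality do not take part in the second part), giving $S_2$. The final semi-final score is $S = 0.6S_1 + 0.4S_2$.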
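The two-part score can be sketched as follows. Several points here are assumptions: that both $S_1$ and $S_2$ use the semi-final form 1 - 2*mlogloss, that a sample "has an abnormal region" when any of its 17 true region labels is 1, and the helper names themselves.

```python
import math


def mlogloss(y_true, y_pred, eps=1e-15):
    """Mean binary log loss over all samples and labels."""
    n, m = len(y_true), len(y_true[0])
    total = 0.0
    for i in range(n):
        for j in range(m):
            p = min(max(y_pred[i][j], eps), 1 - eps)
            total -= y_true[i][j] * math.log(p) + (1 - y_true[i][j]) * math.log(1 - p)
    return total / (n * m)


def semi_final_score(y_true, y_pred, n_regions=17):
    """S = 0.6*S1 + 0.4*S2 on 29-dim rows (17 regions + 12 types)."""
    s1 = 1 - 2 * mlogloss([t[:n_regions] for t in y_true],
                          [p[:n_regions] for p in y_pred])
    # Keep only the K samples that truly have at least one abnormal region.
    kept = [(t[n_regions:], p[n_regions:])
            for t, p in zip(y_true, y_pred) if any(t[:n_regions])]
    if not kept:
        raise ValueError("no sample with an abnormal region; S2 is undefined")
    s2 = 1 - 2 * mlogloss([t for t, _ in kept], [p for _, p in kept])
    return 0.6 * s1 + 0.4 * s2


abnormal = [1] + [0] * 16 + [1] + [0] * 11   # region 0, type 0
normal = [0] * 29
preds = [[0.5] * 29, [0.5] * 17 + [0.99] * 12]
print(semi_final_score([abnormal, normal], preds))
```

In the example, the second sample has no abnormal region, so its poor type predictions (0.99 everywhere) do not affect $S_2$.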




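  • Build the vocabulary: 858 tokens in total, numbered 0-857.
  • Unify the sample length: 50 tokens is chosen as the sample length; longer samples are truncated and shorter ones are padded (with 858).
  • The first layer of textCNN embeds the raw sequence, mapping every token to a fixed dimension; a CNN then extracts features.
  • The final output uses BCEWithLogitsLoss().
  • Offline validation uses two metrics: AUC and logloss.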
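The steps above can be sketched as a minimal textCNN. This is not the model used in this post: the filter count, kernel sizes, embedding dimension, and the class name `TextCNN` are illustrative assumptions. Only the vocabulary size, the sample length of 50, the padding ID 858, the 17 outputs, and BCEWithLogitsLoss() come from the list.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB_SIZE = 858   # token IDs 0-857
PAD_ID = 858       # padding ID used in the baseline
SEQ_LEN = 50
NUM_CLASSES = 17


class TextCNN(nn.Module):
    def __init__(self, embed_dim=128, num_filters=64, kernel_sizes=(2, 3, 4)):
        super().__init__()
        # One extra row for the padding ID; its embedding stays at zero.
        self.embed = nn.Embedding(VOCAB_SIZE + 1, embed_dim, padding_idx=PAD_ID)
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes]
        )
        self.fc = nn.Linear(num_filters * len(kernel_sizes), NUM_CLASSES)

    def forward(self, x):                      # x: (N, SEQ_LEN) token IDs
        e = self.embed(x).transpose(1, 2)      # (N, embed_dim, SEQ_LEN)
        # Convolve, then max-pool over the sequence axis for each kernel size.
        pooled = [F.relu(conv(e)).max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))  # raw logits, (N, NUM_CLASSES)


model = TextCNN()
batch = torch.randint(0, VOCAB_SIZE, (2, SEQ_LEN))
batch[0, 30:] = PAD_ID                         # first sample is padded
targets = torch.zeros(2, NUM_CLASSES)
targets[0, [3, 4]] = 1.0                       # regions 3 and 4 abnormal
logits = model(batch)
loss = nn.BCEWithLogitsLoss()(logits, targets)
loss.backward()
print(logits.shape, loss.item())
```

BCEWithLogitsLoss() applies the sigmoid internally, so the model returns raw logits and `torch.sigmoid(logits)` is only needed when writing probabilities to the submission file.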


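  • Unify the sample length: 64 tokens is chosen as the sample length; longer samples are truncated and shorter ones are padded (with 0; padding with 0 gave an improvement of about 0.05).
  • The first layer of textCNN embeds the raw sequence, mapping every token to a fixed dimension; a CNN then extracts features. The embedding dimension is changed to 64.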
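The length unification in both settings comes down to one small helper. The sketch below assumes right-padding and uses a hypothetical function name, `pad_or_truncate`; the post does not say on which side the padding is added.

```python
def pad_or_truncate(tokens, length, pad_id):
    """Cut `tokens` to `length`, or right-pad it with `pad_id`."""
    return tokens[:length] + [pad_id] * max(0, length - len(tokens))


sample = [101, 47, 12, 66, 74, 90, 0, 411, 234, 79, 175]

baseline = pad_or_truncate(sample, length=50, pad_id=858)  # first setting
improved = pad_or_truncate(sample, length=64, pad_id=0)    # second setting

print(len(baseline), baseline[-1])
print(len(improved), improved[-1])
```

Note that with `pad_id=0` the padding shares an ID with token 0, which is a regular vocabulary entry (it appears in the sample description above), whereas 858 lies outside the 0-857 range.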






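2. Use the CNN attention mechanism Coordinate Attention; see this post: 注意力机制在CNN中使用总结_AI浩-CSDN博客 (a summary of using attention mechanisms in CNNs, from AI浩's CSDN blog)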




    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from collections import OrderedDict

    channelNum = 64


    class CA_Block(nn.Module):
        """Coordinate Attention over an (N, C, h, w) feature map."""

        def __init__(self, channel, h, w, reduction=16):
            super(CA_Block, self).__init__()
            self.h = h
            self.w = w
            self.avg_pool_x = nn.AdaptiveAvgPool2d((h, 1))
            self.avg_pool_y = nn.AdaptiveAvgPool2d((1, w))
            self.conv_1x1 = nn.Conv2d(in_channels=channel, out_channels=channel // reduction,
                                      kernel_size=1, stride=1, bias=False)
            self.relu = nn.ReLU()
            # Note: bn is defined here but is not applied in forward().
            self.bn = nn.BatchNorm2d(channel // reduction)
            self.F_h = nn.Conv2d(in_channels=channel // reduction, out_channels=channel,
                                 kernel_size=1, stride=1, bias=False)
            self.F_w = nn.Conv2d(in_channels=channel // reduction, out_channels=channel,
                                 kernel_size=1, stride=1, bias=False)
            self.sigmoid_h = nn.Sigmoid()
            self.sigmoid_w = nn.Sigmoid()

        def forward(self, x):
            # Pool along each spatial axis, then share one 1x1 conv over both.
            x_h = self.avg_pool_x(x).permute(0, 1, 3, 2)  # (N, C, 1, h)
            x_w = self.avg_pool_y(x)                      # (N, C, 1, w)
            x_cat_conv_relu = self.relu(self.conv_1x1(torch.cat((x_h, x_w), 3)))
            x_cat_conv_split_h, x_cat_conv_split_w = x_cat_conv_relu.split([self.h, self.w], 3)
            s_h = self.sigmoid_h(self.F_h(x_cat_conv_split_h.permute(0, 1, 3, 2)))
            s_w = self.sigmoid_w(self.F_w(x_cat_conv_split_w))
            out = x * s_h.expand_as(x) * s_w.expand_as(x)
            return out


    class Mish(torch.nn.Module):
        def __init__(self):
            super().__init__()

        def forward(self, x):
            x = x * (torch.tanh(F.softplus(x)))
            return x


    class ConvBN(nn.Sequential):
        """Conv2d + BatchNorm2d + LeakyReLU with 'same' padding for odd kernels."""

        def __init__(self, in_planes, out_planes, kernel_size, stride=1, groups=1):
            if not isinstance(kernel_size, int):
                padding = tuple((i - 1) // 2 for i in kernel_size)
            else:
                padding = (kernel_size - 1) // 2
            super(ConvBN, self).__init__(OrderedDict([
                ('conv', nn.Conv2d(in_planes, out_planes, kernel_size, stride,
                                   padding=padding, groups=groups, bias=False)),
                ('bn', nn.BatchNorm2d(out_planes)),
                # ('Mish', Mish())
                # The key is still 'Mish', but the activation is LeakyReLU.
                ('Mish', nn.LeakyReLU(negative_slope=0.3, inplace=False))
            ]))


    class ResBlock(nn.Module):
        """
        Sequential residual blocks each of which consists of
        two convolution layers.
        Args:
            ch (int): number of input and output channels.
            nblocks (int): number of residual blocks.
            shortcut (bool): if True, residual tensor addition is enabled.
        """

        def __init__(self, ch, nblocks=1, shortcut=True):
            super().__init__()
            self.shortcut = shortcut
            self.module_list = nn.ModuleList()
            for i in range(nblocks):
                resblock_one = nn.ModuleList()
                resblock_one.append(ConvBN(ch, ch, 1))
                resblock_one.append(Mish())
                resblock_one.append(ConvBN(ch, ch, 3))
                resblock_one.append(Mish())
                self.module_list.append(resblock_one)

        def forward(self, x):
            for module in self.module_list:
                h = x
                for res in module:
                    h = res(h)
                x = x + h if self.shortcut else h
            return x


    class Encoder_conv(nn.Module):
        def __init__(self, in_planes=128, blocks=2, h=32, w=64):
            super().__init__()
            self.conv2 = ConvBN(in_planes, in_planes * 2, [1, 9])
            self.conv3 = ConvBN(in_planes * 2, in_planes * 4, [9, 1])
            self.conv4 = ConvBN(in_planes * 4, in_planes, 1)
            self.resBlock = ResBlock(ch=in_planes, nblocks=blocks)
            self.conv5 = ConvBN(in_planes, in_planes * 2, [1, 7])
            self.conv6 = ConvBN(in_planes * 2, in_planes * 4, [7, 1])
            self.conv7 = ConvBN(in_planes * 4, in_planes, 1)
            self.eca = CA_Block(in_planes, h=h, w=w)
            self.relu = Mish()

        def forward(self, input):
            x2 = self.conv2(input)
            x3 = self.conv3(x2)
            x4 = self.conv4(x3)
            r1 = self.resBlock(x4)
            x5 = self.conv5(r1)
            x6 = self.conv6(x5)
            x7 = self.conv7(x6)
            x8 = self.relu(x7 + x4)
            e = self.eca(x8)
            return e


    class TransformerEncoder(torch.nn.Module):
        def __init__(self, embed_dim, num_heads, dropout, feedforward_dim):
            super().__init__()
            # batch_first=True (PyTorch >= 1.9) makes attention run over the W
            # tokens of each sample. The default layout is (seq, batch, embed),
            # which would treat the batch axis of the (N, W, D) input as the sequence.
            self.attn = torch.nn.MultiheadAttention(embed_dim, num_heads, dropout=dropout,
                                                    batch_first=True)
            self.linear_1 = torch.nn.Linear(embed_dim, feedforward_dim)
            self.linear_2 = torch.nn.Linear(feedforward_dim, embed_dim)
            self.layernorm_1 = torch.nn.LayerNorm(embed_dim)
            self.layernorm_2 = torch.nn.LayerNorm(embed_dim)

        def forward(self, x_in):
            attn_out, _ = self.attn(x_in, x_in, x_in)
            x = self.layernorm_1(x_in + attn_out)
            ff_out = self.linear_2(F.relu(self.linear_1(x)))
            x = self.layernorm_2(x + ff_out)
            return x


    class CNN_Text(nn.Module):
        def __init__(self, embed_num, static=False):
            super(CNN_Text, self).__init__()
            embed_dim = 128
            class_num = 17
            Ci = 1
            self.embed = nn.Embedding(embed_num, embed_dim)  # word embedding
            self.tram = TransformerEncoder(embed_dim, 8, 0.5, 512)
            self.encoder_2 = TransformerEncoder(embed_dim, 8, 0.5, 512)
            self.encoder_3 = TransformerEncoder(embed_dim, 8, 0.5, 512)
            # The h=64, w=128 arguments below fix the input to 64 tokens x 128 dims.
            self.encoder1 = nn.Sequential(OrderedDict([
                ("conv3_bn3", ConvBN(Ci, channelNum, 1)),
                ("encoder_conv1", Encoder_conv(channelNum, blocks=2, h=64, w=128)),
            ]))
            self.encoder2 = nn.Sequential(OrderedDict([
                ("conv3_bn_3,3", ConvBN(Ci, channelNum * 2, 3)),
                ("conv3_bn1,1", ConvBN(channelNum * 2, channelNum // 2, 1)),
                ('se', CA_Block(channelNum // 2, h=64, w=128)),
            ]))
            self.encoder3 = nn.Sequential(OrderedDict([
                ("conv3_bn3", ConvBN(Ci, channelNum * 2, [1, 3])),
                ("conv3_bn31", ConvBN(channelNum * 2, channelNum // 2, [3, 1])),
                ('se', CA_Block(channelNum // 2, h=64, w=128)),
            ]))
            self.encoder_conv = Encoder_conv(channelNum * 2)
            self.encoder_conv1 = nn.Sequential(OrderedDict([
                ("conv1x1_bn", ConvBN(channelNum * 2, 1, 1)),
            ]))
            self.con1 = ConvBN(Ci, channelNum * 2, 1, stride=2)
            self.relu = Mish()
            self.pool = nn.AvgPool2d(2)
            self.fc1 = nn.Linear(2048, class_num)
            self.dp = nn.Dropout(0.5)
            self.sg = nn.Sigmoid()
            if static:
                self.embed.weight.requires_grad = False

        def forward(self, x):
            x = self.embed(x)  # (N, W, D): batch, number of tokens, embedding dim
            x1 = self.tram(x)
            x2 = self.encoder_2(x1)
            x = self.encoder_3(x2)
            x = x.unsqueeze(1)  # (N, Ci, W, D)
            x = self.sg(x)
            x0 = self.con1(x)   # (N, 128, W/2, D/2), used as a shortcut below
            encode1 = self.encoder1(x)
            encode2 = self.encoder2(x)
            encode3 = self.encoder3(x)
            x = torch.cat((encode1, encode2, encode3), dim=1)  # 64 + 32 + 32 channels
            x = self.relu(x)
            x = self.pool(x)
            x = self.encoder_conv(x)
            x = self.relu(x + x0)
            x = self.encoder_conv1(x)          # (N, 1, 32, 64)
            x = x.contiguous().view(-1, 2048)  # 32 * 64 = 2048
            x = self.dp(x)
            logit = self.fc1(x)  # (N, C)
            return logit


    if __name__ == "__main__":
        net = CNN_Text(embed_num=1000)
        # Inputs must be padded to 64 tokens: CA_Block's h/w and view(-1, 2048)
        # are hard-coded for a 64 x 128 map, so shorter sequences fail.
        x = torch.randint(0, 858, (2, 64))
        logit = net(x)
        print(net)
        print(logit.shape)  # (2, 17)






    import pandas as pd
    import numpy as np
    from collections import Counter

    # Read the raw files. With the default "," separator, each "|,|" delimiter
    # leaves stray "|" characters on the fields; they are stripped below.
    train_df = pd.read_csv('data/track1_round1_train_20210222.csv', header=None)
    test_df = pd.read_csv('data/track1_round1_testA_20210222.csv', header=None)

    # Name the columns and drop the ID, which is not used as a feature.
    train_df.columns = ['report_ID', 'description', 'label']
    test_df.columns = ['report_ID', 'description']
    train_df.drop(['report_ID'], axis=1, inplace=True)
    test_df.drop(['report_ID'], axis=1, inplace=True)
    print("train_df:{},test_df:{}".format(train_df.shape, test_df.shape))

    # Strip the leftover "|" and the surrounding whitespace.
    new_des = [i.strip('|').strip() for i in train_df['description'].values]
    new_label = [i.strip('|').strip() for i in train_df['label'].values]
    train_df['description'] = new_des
    train_df['label'] = new_label
    new_des_test = [i.strip('|').strip() for i in test_df['description'].values]
    test_df['description'] = new_des_test

    # Collect every token and the length of every description (train, then test).
    word_all = []
    len_list = []
    for des in new_des + new_des_test:
        tmp = [int(tok) for tok in des.split(' ')]
        word_all += tmp
        len_list.append(len(tmp))

    # Label combinations, vocabulary size, and length statistics.
    print(train_df['label'].unique())
    a = Counter(word_all)
    print(len(a))
    a = sorted(a)  # sorted token IDs: 0-857
    # print(a)
    print(np.max(len_list), np.min(len_list), np.mean(len_list))




    import torch
    from torch.autograd import Variable


    def logloss(y_true, y_pred):
        # Clip y_pred between eps and 1-eps
        p = torch.clamp(y_pred, 1e-5, 1 - 1e-5)
        loss = torch.sum(y_true * torch.log(p) + (1 - y_true) * torch.log(1 - p))
        return loss / len(y_true)


    class Muti_logloss(torch.nn.Module):
        def __init__(self):
            super(Muti_logloss, self).__init__()

        def forward(self, y, y_p):
            # Per-label log-likelihood, one entry per column.
            allloss = []
            for i in range(y.shape[1]):
                loss = logloss(y[:, i], y_p[:, i])
                allloss.append(loss)
            allloss = torch.tensor(allloss, dtype=torch.float)
            alllosssum = torch.sum(allloss)
            # Average over the labels and negate to get the loss.
            lossre = alllosssum / (y.shape[1])
            lossre = -Variable(lossre, requires_grad=True)
            return lossre

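After writing this loss I went on to train with it and found that the loss never converged. I asked several experts to check the loss for me, and they found nothing wrong with it. I don't know where the problem is, so I'm recording it here for now. In the end I still used BCEWithLogitsLoss().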
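One plausible cause, judging from the code above: `torch.tensor(allloss, ...)` copies the per-label values into a new tensor that is cut off from the autograd graph, and `Variable(lossre, requires_grad=True)` then turns the result into a fresh leaf. `backward()` runs without error, but no gradient reaches the model parameters, so the loss cannot go down. The sketch below keeps the graph intact with `torch.stack`; the class name `MultiLogLoss` and the toy data are hypothetical. It also assumes `y_p` holds probabilities, so a model that returns raw logits needs `torch.sigmoid` first.

```python
import torch
import torch.nn.functional as F


def logloss(y_true, y_pred, eps=1e-5):
    """Mean binary log loss of one label column (probabilities expected)."""
    p = torch.clamp(y_pred, eps, 1 - eps)
    return -torch.mean(y_true * torch.log(p) + (1 - y_true) * torch.log(1 - p))


class MultiLogLoss(torch.nn.Module):
    def forward(self, y, y_p):
        per_label = [logloss(y[:, i], y_p[:, i]) for i in range(y.shape[1])]
        # torch.stack keeps every column loss attached to the autograd graph.
        return torch.stack(per_label).mean()


torch.manual_seed(0)
logits = torch.randn(8, 17, requires_grad=True)   # stand-in for model output
y = (torch.rand(8, 17) > 0.5).float()

loss = MultiLogLoss()(y, torch.sigmoid(logits))
loss.backward()                                   # gradients now reach `logits`

reference = F.binary_cross_entropy(torch.sigmoid(logits), y)
print(loss.item(), reference.item())
```

Because every column has the same number of rows, this value matches the built-in binary cross-entropy, which is why training with BCEWithLogitsLoss() optimizes the same quantity as the competition metric.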



