A Brief Look at Part of the Data Augmentation in YOLOv4

yd_234306724, posted on 2020/12/28 00:30:00

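Preface

Over the next few days I will walk through YOLOv4. The YOLO family has long been a popular line of object detection algorithms, and YOLOv4 is a personal favorite. Today's topic is data augmentation.

Data augmentation

In computer vision, image augmentation is a way of manually injecting prior knowledge about visual invariance (the semantics of the image stay the same). It has become one of the simplest and most direct ways to improve model performance. Augmented samples are strongly correlated with the originals: they come from geometric transforms such as cropping, flipping, rotation, scaling and warping, and from pixel-level operations such as pixel perturbation, added noise, lighting and contrast adjustment, summing or interpolating samples, and patch splitting. These simple operations improve the final performance.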

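Data augmentation steps

1. Flip the image horizontally

The x coordinates of the bounding boxes are flipped along with the image.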

    # Image size
    iw, ih = image.size
    image = image.transpose(Image.FLIP_LEFT_RIGHT)
    # print(box[:, [0, 2]], box[:, [2, 0]])
    # Mirror the x coordinates; x1 and x2 swap so that x1 < x2 still holds
    box[:, [0, 2]] = iw - box[:, [2, 0]]
    image.show()
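To see why the flip swaps the two x columns, here is a minimal sketch on plain tuples instead of the NumPy array used above. The helper name `flip_boxes_horizontally` and the sample values are assumptions made for illustration and do not come from the original code.

```python
def flip_boxes_horizontally(boxes, image_width):
    """Mirror (x1, y1, x2, y2, class_id) boxes about the vertical centre line."""
    flipped = []
    for x1, y1, x2, y2, class_id in boxes:
        # The old right edge becomes the new left edge and vice versa,
        # which keeps x1 <= x2 after mirroring.
        flipped.append((image_width - x2, y1, image_width - x1, y2, class_id))
    return flipped


boxes = [(10, 20, 50, 80, 0)]
print(flip_boxes_horizontally(boxes, image_width=100))  # [(50, 20, 90, 80, 0)]
```

Flipping twice returns the original boxes, which is a quick sanity check for this kind of transform.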

2. Scale the image

Code:

    # Scale the input image
    new_ar = w / h
    scale = rand(scale_low, scale_high)
    if new_ar < 1:
        nh = int(scale * h)
        nw = int(nh * new_ar)
        # image.show()
    else:
        nw = int(scale * w)
        nh = int(nw / new_ar)
    image = image.resize((nw, nh), Image.BICUBIC)
    image.show()
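The size computation above can be pulled out into a small pure function to check the numbers by hand. This is only a sketch: the function name `scaled_size` and its parameter names are assumptions, and the branch logic is copied from the snippet above.

```python
def scaled_size(target_w, target_h, scale):
    """Return (nw, nh) for the resized image, following the branch logic above."""
    new_ar = target_w / target_h
    if new_ar < 1:
        # Portrait target: fix the height first, derive the width from it.
        nh = int(scale * target_h)
        nw = int(nh * new_ar)
    else:
        # Square or landscape target: fix the width first.
        nw = int(scale * target_w)
        nh = int(nw / new_ar)
    return nw, nh


print(scaled_size(416, 416, 0.7))  # (291, 291)
```

Note that `new_ar` is the aspect ratio of the target canvas, not of the source image, so the source image is stretched to the canvas's proportions. In the full code below, `scale` is drawn from the interval between 0.6 and 0.8, so each image covers roughly 60% to 80% of the canvas side.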

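3. HSV color-space transform

HSV is a color model built around how people perceive color. It focuses on describing a color: which color it is, how deep it is, and how bright it is.

H (hue) is the color itself. S (saturation) is its depth; when S = 0 only grayscale remains. V (value) is the brightness of the color.
Code: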

    # Color-space (HSV) jitter
    # Separate names keep the hue/sat/val limits from being overwritten
    hue_shift = rand(-hue, hue)
    sat_scale = rand(1, sat) if rand() < .5 else 1 / rand(1, sat)
    val_scale = rand(1, val) if rand() < .5 else 1 / rand(1, val)
    x = rgb_to_hsv(np.array(image) / 255.)
    x[..., 0] += hue_shift
    # Hue is cyclic, so wrap it back into [0, 1]
    x[..., 0][x[..., 0] > 1] -= 1
    x[..., 0][x[..., 0] < 0] += 1
    x[..., 1] *= sat_scale
    x[..., 2] *= val_scale
    # Clamp every channel to [0, 1]
    x[x > 1] = 1
    x[x < 0] = 0
    image = hsv_to_rgb(x)
    image = Image.fromarray((image * 255).astype(np.uint8))
    image.show()
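The same idea can be tried on a single pixel with Python's standard `colorsys` module, which makes the effect of each factor easy to inspect. This is a sketch under stated assumptions: the function name `jitter_pixel` is hypothetical, and the hue is wrapped with a modulo instead of the two comparisons used above.

```python
import colorsys


def jitter_pixel(rgb, hue_shift, sat_scale, val_scale):
    """Apply a hue shift and saturation/value scaling to one RGB pixel in [0, 1]."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    h = (h + hue_shift) % 1.0              # hue is cyclic, wrap into [0, 1)
    s = min(max(s * sat_scale, 0.0), 1.0)  # clamp saturation to [0, 1]
    v = min(max(v * val_scale, 0.0), 1.0)  # clamp value to [0, 1]
    return colorsys.hsv_to_rgb(h, s, v)


# Pure red with saturation scaled to zero becomes a grey of the scaled value.
print(jitter_pixel((1.0, 0.0, 0.0), hue_shift=0.0, sat_scale=0.0, val_scale=0.5))
# (0.5, 0.5, 0.5)
```

This matches the statement above that S = 0 leaves only grayscale: once saturation is zero, the hue no longer affects the result.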

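4. Mosaic data augmentation

YOLOv4's Mosaic augmentation builds on the CutMix augmentation, and the two are similar in principle. CutMix combines two images by pasting a patch of one into the other (the fourth image in the original post's figure).

Mosaic uses four images instead. According to the paper, its major advantage is that it enriches the backgrounds against which objects are detected. In addition, batch normalization computes its statistics over data from four images at once.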
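As a rough illustration of how four images share one canvas, the sketch below composes four tiny grids at a cut point. The quadrant order (0 top-left, 1 bottom-left, 2 bottom-right, 3 top-right) follows the code in this article; the function name `compose_mosaic` and the integer grids are assumptions made for illustration.

```python
def compose_mosaic(canvases, cutx, cuty):
    """Build one grid from four same-sized grids, split at (cutx, cuty)."""
    h, w = len(canvases[0]), len(canvases[0][0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            if x < cutx:
                source = 0 if y < cuty else 1  # left half: top-left or bottom-left
            else:
                source = 3 if y < cuty else 2  # right half: top-right or bottom-right
            row.append(canvases[source][y][x])
        out.append(row)
    return out


# Four 4x4 "images", each filled with its own index.
canvases = [[[k] * 4 for _ in range(4)] for k in range(4)]
for row in compose_mosaic(canvases, cutx=2, cuty=1):
    print(row)
```

Each output cell is taken from exactly one of the four source canvases, so a single training sample contains content from four different images.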

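The annotation boxes must be scaled and shifted to their positions in the composite image, and any part that extends beyond the boundary must be clipped.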
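The adjustment described above can be sketched for a single box. The function name `place_box` and the sample numbers are assumptions for illustration; the variable names `iw`, `ih`, `nw`, `nh`, `dx`, `dy`, `w`, `h` mirror those used in this article's code.

```python
def place_box(box, iw, ih, nw, nh, dx, dy, w, h):
    """Map one (x1, y1, x2, y2, class_id) box from the source image onto the canvas."""
    x1, y1, x2, y2, class_id = box
    # Scale from the original size (iw, ih) to the resized size (nw, nh),
    # then shift by the paste offset (dx, dy).
    x1 = x1 * nw / iw + dx
    x2 = x2 * nw / iw + dx
    y1 = y1 * nh / ih + dy
    y2 = y2 * nh / ih + dy
    # Clip to the canvas.
    x1, y1 = max(x1, 0), max(y1, 0)
    x2, y2 = min(x2, w), min(y2, h)
    # Drop boxes that collapse to (almost) nothing after clipping.
    if x2 - x1 <= 1 or y2 - y1 <= 1:
        return None
    return (x1, y1, x2, y2, class_id)


print(place_box((100, 100, 400, 300, 2), iw=500, ih=400, nw=250, nh=250,
                dx=166, dy=0, w=416, h=416))
# (216.0, 62.5, 366.0, 187.5, 2)
```

A box that ends up entirely outside the canvas after the shift returns `None`, matching the width and height filter in the article's code.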

    # --- Inside the loop over the four images ---
    # Paste the image at the position of its quadrant
    dx = place_x[index]
    # print(dx)
    dy = place_y[index]
    # print(dy)
    new_image = Image.new('RGB', (w, h), (128, 128, 128))
    new_image.paste(image, (dx, dy))
    image_data = np.array(new_image) / 255
    # new_image.show()
    # Image.fromarray((image_data*255).astype(np.uint8)).save(str(index)+"distort.jpg")

    index = index + 1
    box_data = []
    # Recompute the boxes for the pasted image
    if len(box) > 0:
        np.random.shuffle(box)
        box[:, [0, 2]] = box[:, [0, 2]] * nw / iw + dx
        box[:, [1, 3]] = box[:, [1, 3]] * nh / ih + dy
        box[:, 0:2][box[:, 0:2] < 0] = 0
        box[:, 2][box[:, 2] > w] = w
        box[:, 3][box[:, 3] > h] = h
        box_w = box[:, 2] - box[:, 0]
        box_h = box[:, 3] - box[:, 1]
        # >>> np.logical_and([True, False], [False, False])
        # array([False, False], dtype=bool)
        box = box[np.logical_and(box_w > 1, box_h > 1)]
        box_data = np.zeros((len(box), 5))
        box_data[:len(box)] = box

    image_datas.append(image_data)
    box_datas.append(box_data)

    # Draw the boxes on this canvas (for debugging)
    img = Image.fromarray((image_data * 255).astype(np.uint8))
    for j in range(len(box_data)):
        thickness = 3
        left, top, right, bottom = box_data[j][0:4]
        draw = ImageDraw.Draw(img)
        for i in range(thickness):
            draw.rectangle([left + i, top + i, right - i, bottom - i], outline=(255, 255, 255))
    # img.show()

    # --- After the loop ---
    # Cut the four canvases at a random point and join the pieces
    # print(int(w * min_offset_x))
    # print(int(w * (1 - min_offset_x)))
    cutx = np.random.randint(int(w * min_offset_x), int(w * (1 - min_offset_x)))
    cuty = np.random.randint(int(h * min_offset_y), int(h * (1 - min_offset_y)))

    new_image = np.zeros([h, w, 3])
    new_image[:cuty, :cutx, :] = image_datas[0][:cuty, :cutx, :]
    new_image[cuty:, :cutx, :] = image_datas[1][cuty:, :cutx, :]
    new_image[cuty:, cutx:, :] = image_datas[2][cuty:, cutx:, :]
    new_image[:cuty, cutx:, :] = image_datas[3][:cuty, cutx:, :]
    img = Image.fromarray((new_image * 255).astype(np.uint8))
    img.show()
    # Process the boxes further so they fit their quadrants
    new_boxes = merge_bboxes(box_datas, cutx, cuty)


    def merge_bboxes(bboxes, cutx, cuty):
        merge_bbox = []
        for i in range(len(bboxes)):
            for box in bboxes[i]:
                tmp_box = []
                x1, y1, x2, y2 = box[0], box[1], box[2], box[3]

                # Image 0: top-left quadrant
                if i == 0:
                    if y1 > cuty or x1 > cutx:
                        continue
                    if y2 >= cuty and y1 <= cuty:
                        y2 = cuty
                        if y2 - y1 < 5:
                            continue
                    if x2 >= cutx and x1 <= cutx:
                        x2 = cutx
                        if x2 - x1 < 5:
                            continue

                # Image 1: bottom-left quadrant
                if i == 1:
                    if y2 < cuty or x1 > cutx:
                        continue
                    if y2 >= cuty and y1 <= cuty:
                        y1 = cuty
                        if y2 - y1 < 5:
                            continue
                    if x2 >= cutx and x1 <= cutx:
                        x2 = cutx
                        if x2 - x1 < 5:
                            continue

                # Image 2: bottom-right quadrant
                if i == 2:
                    if y2 < cuty or x2 < cutx:
                        continue
                    if y2 >= cuty and y1 <= cuty:
                        y1 = cuty
                        if y2 - y1 < 5:
                            continue
                    if x2 >= cutx and x1 <= cutx:
                        x1 = cutx
                        if x2 - x1 < 5:
                            continue

                # Image 3: top-right quadrant
                if i == 3:
                    if y1 > cuty or x2 < cutx:
                        continue
                    if y2 >= cuty and y1 <= cuty:
                        y2 = cuty
                        if y2 - y1 < 5:
                            continue
                    if x2 >= cutx and x1 <= cutx:
                        x1 = cutx
                        if x2 - x1 < 5:
                            continue

                tmp_box.append(x1)
                tmp_box.append(y1)
                tmp_box.append(x2)
                tmp_box.append(y2)
                tmp_box.append(box[-1])
                merge_bbox.append(tmp_box)
        return merge_bbox
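The four branches of `merge_bboxes` each amount to intersecting a box with the rectangle of its quadrant. The sketch below expresses that idea directly. It is a simplified alternative and not the article's function: the name `clip_to_quadrant` is hypothetical, and the minimum-size check here applies to every box, whereas `merge_bboxes` only applies it to an axis that was actually clamped.

```python
def clip_to_quadrant(box, quadrant, cutx, cuty, w, h, min_size=5):
    """Intersect a (x1, y1, x2, y2, class_id) box with one mosaic quadrant.

    Quadrant order: 0 top-left, 1 bottom-left, 2 bottom-right, 3 top-right.
    Returns None when the remaining box is smaller than min_size on either axis.
    """
    regions = {
        0: (0, 0, cutx, cuty),
        1: (0, cuty, cutx, h),
        2: (cutx, cuty, w, h),
        3: (cutx, 0, w, cuty),
    }
    rx1, ry1, rx2, ry2 = regions[quadrant]
    x1, y1, x2, y2, class_id = box
    # Rectangle intersection: take the inner edge on every side.
    x1, y1 = max(x1, rx1), max(y1, ry1)
    x2, y2 = min(x2, rx2), min(y2, ry2)
    if x2 - x1 < min_size or y2 - y1 < min_size:
        return None
    return (x1, y1, x2, y2, class_id)


print(clip_to_quadrant((150, 100, 260, 210, 7), 0, cutx=200, cuty=220, w=416, h=416))
# (150, 100, 200, 210, 7)
```

Writing the quadrants as a lookup table keeps the four cases in one place, which makes it easier to verify that each image index maps to the region it was pasted into.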

5. Full code

    from PIL import Image, ImageDraw
    import numpy as np
    from matplotlib.colors import rgb_to_hsv, hsv_to_rgb


    def rand(a=0, b=1):
        return np.random.rand() * (b - a) + a


    def merge_bboxes(bboxes, cutx, cuty):
        merge_bbox = []
        for i in range(len(bboxes)):
            for box in bboxes[i]:
                tmp_box = []
                x1, y1, x2, y2 = box[0], box[1], box[2], box[3]

                # Image 0: top-left quadrant
                if i == 0:
                    if y1 > cuty or x1 > cutx:
                        continue
                    if y2 >= cuty and y1 <= cuty:
                        y2 = cuty
                        if y2 - y1 < 5:
                            continue
                    if x2 >= cutx and x1 <= cutx:
                        x2 = cutx
                        if x2 - x1 < 5:
                            continue

                # Image 1: bottom-left quadrant
                if i == 1:
                    if y2 < cuty or x1 > cutx:
                        continue
                    if y2 >= cuty and y1 <= cuty:
                        y1 = cuty
                        if y2 - y1 < 5:
                            continue
                    if x2 >= cutx and x1 <= cutx:
                        x2 = cutx
                        if x2 - x1 < 5:
                            continue

                # Image 2: bottom-right quadrant
                if i == 2:
                    if y2 < cuty or x2 < cutx:
                        continue
                    if y2 >= cuty and y1 <= cuty:
                        y1 = cuty
                        if y2 - y1 < 5:
                            continue
                    if x2 >= cutx and x1 <= cutx:
                        x1 = cutx
                        if x2 - x1 < 5:
                            continue

                # Image 3: top-right quadrant
                if i == 3:
                    if y1 > cuty or x2 < cutx:
                        continue
                    if y2 >= cuty and y1 <= cuty:
                        y2 = cuty
                        if y2 - y1 < 5:
                            continue
                    if x2 >= cutx and x1 <= cutx:
                        x1 = cutx
                        if x2 - x1 < 5:
                            continue

                tmp_box.append(x1)
                tmp_box.append(y1)
                tmp_box.append(x2)
                tmp_box.append(y2)
                tmp_box.append(box[-1])
                merge_bbox.append(tmp_box)
        return merge_bbox


    def get_random_data(annotation_line, input_shape, random=True, hue=.1, sat=1.5, val=1.5, proc_img=True):
        '''random preprocessing for real-time data augmentation'''
        h, w = input_shape
        min_offset_x = 0.4
        min_offset_y = 0.4
        scale_low = 1 - min(min_offset_x, min_offset_y)
        scale_high = scale_low + 0.2

        image_datas = []
        box_datas = []
        index = 0

        place_x = [0, 0, int(w * min_offset_x), int(w * min_offset_x)]
        # Fixed: vertical offsets use h (the original used w for the third entry)
        place_y = [0, int(h * min_offset_y), int(h * min_offset_y), 0]
        for line in annotation_line:
            # Split the annotation line
            line_content = line.split()
            # Open the image
            image = Image.open(line_content[0])
            image = image.convert("RGB")
            image.show()
            # Image size
            iw, ih = image.size
            # Box positions: x1,y1,x2,y2,class
            box = np.array([np.array(list(map(int, box.split(',')))) for box in line_content[1:]])

            # image.save(str(index)+".jpg")
            # Decide whether to flip the image
            flip = rand() < .5
            # image.show()
            if flip and len(box) > 0:
                # image.show()
                image = image.transpose(Image.FLIP_LEFT_RIGHT)
                # print(box[:, [0, 2]], box[:, [2, 0]])
                box[:, [0, 2]] = iw - box[:, [2, 0]]
                # image.show()

            # Scale the input image
            new_ar = w / h
            scale = rand(scale_low, scale_high)
            if new_ar < 1:
                nh = int(scale * h)
                nw = int(nh * new_ar)
                # image.show()
            else:
                nw = int(scale * w)
                nh = int(nw / new_ar)
            image = image.resize((nw, nh), Image.BICUBIC)
            # image.show()

            # Color-space (HSV) jitter
            # Fixed: separate names so hue/sat/val are not overwritten between iterations
            hue_shift = rand(-hue, hue)
            sat_scale = rand(1, sat) if rand() < .5 else 1 / rand(1, sat)
            val_scale = rand(1, val) if rand() < .5 else 1 / rand(1, val)
            x = rgb_to_hsv(np.array(image) / 255.)
            x[..., 0] += hue_shift
            x[..., 0][x[..., 0] > 1] -= 1
            x[..., 0][x[..., 0] < 0] += 1
            x[..., 1] *= sat_scale
            x[..., 2] *= val_scale
            x[x > 1] = 1
            x[x < 0] = 0
            image = hsv_to_rgb(x)

            image = Image.fromarray((image * 255).astype(np.uint8))
            image.show()
            # Paste the image at the position of its quadrant
            dx = place_x[index]
            # print(dx)
            dy = place_y[index]
            # print(dy)
            new_image = Image.new('RGB', (w, h), (128, 128, 128))
            new_image.paste(image, (dx, dy))
            image_data = np.array(new_image) / 255
            # new_image.show()
            # Image.fromarray((image_data*255).astype(np.uint8)).save(str(index)+"distort.jpg")

            index = index + 1
            box_data = []
            # Recompute the boxes for the pasted image
            if len(box) > 0:
                np.random.shuffle(box)
                box[:, [0, 2]] = box[:, [0, 2]] * nw / iw + dx
                box[:, [1, 3]] = box[:, [1, 3]] * nh / ih + dy
                box[:, 0:2][box[:, 0:2] < 0] = 0
                box[:, 2][box[:, 2] > w] = w
                box[:, 3][box[:, 3] > h] = h
                box_w = box[:, 2] - box[:, 0]
                box_h = box[:, 3] - box[:, 1]
                # >>> np.logical_and([True, False], [False, False])
                # array([False, False], dtype=bool)
                box = box[np.logical_and(box_w > 1, box_h > 1)]
                box_data = np.zeros((len(box), 5))
                box_data[:len(box)] = box

            image_datas.append(image_data)
            box_datas.append(box_data)

            # Draw the boxes on this canvas (for debugging)
            img = Image.fromarray((image_data * 255).astype(np.uint8))
            for j in range(len(box_data)):
                thickness = 3
                left, top, right, bottom = box_data[j][0:4]
                draw = ImageDraw.Draw(img)
                for i in range(thickness):
                    draw.rectangle([left + i, top + i, right - i, bottom - i], outline=(255, 255, 255))
            # img.show()

        # Cut the four canvases at a random point and join the pieces
        # print(int(w * min_offset_x))
        # print(int(w * (1 - min_offset_x)))
        cutx = np.random.randint(int(w * min_offset_x), int(w * (1 - min_offset_x)))
        cuty = np.random.randint(int(h * min_offset_y), int(h * (1 - min_offset_y)))

        new_image = np.zeros([h, w, 3])
        new_image[:cuty, :cutx, :] = image_datas[0][:cuty, :cutx, :]
        new_image[cuty:, :cutx, :] = image_datas[1][cuty:, :cutx, :]
        new_image[cuty:, cutx:, :] = image_datas[2][cuty:, cutx:, :]
        new_image[:cuty, cutx:, :] = image_datas[3][:cuty, cutx:, :]
        img = Image.fromarray((new_image * 255).astype(np.uint8))
        img.show()
        # Process the boxes further so they fit their quadrants
        new_boxes = merge_bboxes(box_datas, cutx, cuty)

        return new_image, new_boxes


    def normal_(annotation_line, input_shape):
        '''Flip one image and its boxes horizontally (no randomness)'''
        line = annotation_line.split()
        image = Image.open(line[0])
        box = np.array([np.array(list(map(int, box.split(',')))) for box in line[1:]])

        iw, ih = image.size
        image = image.transpose(Image.FLIP_LEFT_RIGHT)
        box[:, [0, 2]] = iw - box[:, [2, 0]]

        return image, box


    if __name__ == "__main__":
        with open("2007_train.txt") as f:
            lines = f.readlines()

        # Fixed: leave room for four consecutive lines (the file needs at least four)
        a = np.random.randint(0, len(lines) - 3)
        line = lines[a:a + 4]
        image_data, box_data = get_random_data(line, [416, 416])
        img = Image.fromarray((image_data * 255).astype(np.uint8))
        for j in range(len(box_data)):
            thickness = 3
            left, top, right, bottom = box_data[j][0:4]
            draw = ImageDraw.Draw(img)
            for i in range(thickness):
                draw.rectangle([left + i, top + i, right - i, bottom - i], outline=(255, 255, 255))
        img.show()
        # img.save("box_all.jpg")

Source: blog.csdn.net. Author: 快了的程序猿小可哥. Copyright belongs to the original author; please contact the author before reposting.

Original link: blog.csdn.net/qq_35914625/article/details/108475839
