YOLOv3 Object Detection in Practice
Object detection is an important research area in computer vision, with wide applications in crowd flow detection, pedestrian tracking, autonomous driving, medical imaging and more. Unlike plain image classification, object detection aims to precisely identify the targets in an image, including both their locations and their classes, which makes it useful for higher-level vision tasks. In autonomous driving, for example, the system must recognize the vehicles, pedestrians and traffic signs in the camera images, together with their positions, so that a driving strategy can be derived from this information. In this hands-on case we focus on the YOLO algorithm; YOLO (You Only Look Once) is a one-stage object detection algorithm.
Notes:
- Framework used in this case: TensorFlow-1.13.1
- Hardware used in this case: GPU V100
- How to enter the runtime environment: follow this link to AI Gallery, then click the Run in ModelArts button to enter the ModelArts runtime; if you need a GPU, you can switch to it in the workspace panel on the right of the ModelArts JupyterLab interface
- How to run the code: click the triangular Run button in the menu bar at the top of this page, or press Ctrl+Enter, to run the code in each cell
- Detailed usage of JupyterLab: see the ModelArts JupyterLab User Guide
- If you run into problems: see the ModelArts JupyterLab FAQ
1. Download the data and code
Run the code below to download and unpack the data and code.
This case uses the COCO dataset, with 80 classes in total.
import os
from modelarts.session import Session
sess = Session()

if sess.region_name == 'cn-north-1':
    bucket_path = "modelarts-labs/notebook/DL_object_detection_yolo/yolov3.tar.gz"
elif sess.region_name == 'cn-north-4':
    bucket_path = "modelarts-labs-bj4/notebook/DL_object_detection_yolo/yolov3.tar.gz"
else:
    print("Please switch the region to cn-north-1 (Beijing 1) or cn-north-4 (Beijing 4)")

if not os.path.exists('./yolo3'):
    sess.download_data(bucket_path=bucket_path, path="./yolov3.tar.gz")

if os.path.exists('./yolov3.tar.gz'):
    # Unpack the archive
    os.system("tar -xf ./yolov3.tar.gz")
    # Remove the archive
    os.system("rm -r ./yolov3.tar.gz")
2. Prepare the data
2.1 Define the file paths
from train import get_classes, get_anchors

# Path to the data files
data_path = "./coco/coco_data"
# Path to the COCO class definition file
classes_path = './model_data/coco_classes.txt'
# Path to the COCO anchor file
anchors_path = './model_data/yolo_anchors.txt'
# Path to the COCO annotation file
annotation_path = './coco/coco_train.txt'
# Path to the pre-trained weights
weights_path = "./model_data/yolo.h5"
# Directory for saving the trained model
save_path = "./result/models/"

classes = get_classes(classes_path)
anchors = get_anchors(anchors_path)
# Number of classes and anchors
num_classes = len(classes)
num_anchors = len(anchors)
Using TensorFlow backend.
/home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:529: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:530: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:535: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
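As a quick check of what was just loaded: COCO has 80 classes, and the standard YOLOv3 anchor file holds 9 width/height pairs (3 per detection scale); the exact contents of yolo_anchors.txt are an assumption here.
# num_classes should be 80 for COCO; anchors is an (N, 2) array of (width, height) pairs.
print('num_classes:', num_classes, 'num_anchors:', num_anchors)
print(classes[:5])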
2.2 Read the annotation data
import numpy as np

# Train/validation split ratio
val_split = 0.1

with open(annotation_path) as f:
    lines = f.readlines()

np.random.seed(10101)
np.random.shuffle(lines)
np.random.seed(None)

num_val = int(len(lines)*val_split)
num_train = len(lines) - num_val
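If you want to see what the raw annotations look like, you can print one of the shuffled lines together with the split sizes. The per-line format (an image file name followed by space-separated x_min,y_min,x_max,y_max,class_id groups) is an assumption based on how get_random_data consumes these lines in keras-yolo3-style code.
# Inspect one annotation line and the resulting split sizes.
print(lines[0].strip())
print('train samples:', num_train, 'val samples:', num_val)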
2.3 Data loading function: build a data generator that reads one batch at a time into memory for training and applies data augmentation.
def data_generator(annotation_lines, batch_size, input_shape, data_path, anchors, num_classes):
    n = len(annotation_lines)
    i = 0
    while True:
        image_data = []
        box_data = []
        for b in range(batch_size):
            if i == 0:
                np.random.shuffle(annotation_lines)
            image, box = get_random_data(annotation_lines[i], input_shape, data_path, random=True)  # randomly sample one image with augmentation
            image_data.append(image)
            box_data.append(box)
            i = (i + 1) % n
        image_data = np.array(image_data)
        box_data = np.array(box_data)
        y_true = preprocess_true_boxes(box_data, input_shape, anchors, num_classes)  # preprocess the ground-truth boxes and filter out invalid ones
        yield [image_data, *y_true], np.zeros(batch_size)

def data_generator_wrapper(annotation_lines, batch_size, input_shape, data_path, anchors, num_classes):
    n = len(annotation_lines)
    if n == 0 or batch_size <= 0: return None
    return data_generator(annotation_lines, batch_size, input_shape, data_path, anchors, num_classes)
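Before training, you can optionally pull one small batch from the generator to confirm the shapes it produces. The two imports below are the same ones used in section 3; they are needed here because the generator calls get_random_data and preprocess_true_boxes.
# Optional sanity check: draw one batch of 4 samples and inspect the shapes.
from yolo3.utils import get_random_data
from yolo3.model import preprocess_true_boxes
check_gen = data_generator_wrapper(lines[:num_train], 4, (416, 416), data_path, anchors, num_classes)
batch_inputs, dummy_targets = next(check_gen)
print(batch_inputs[0].shape)                 # images, expected (4, 416, 416, 3)
print([y.shape for y in batch_inputs[1:]])   # one y_true array per detection scale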
3. Train the model
This case uses the Keras deep learning framework to build the YOLOv3 network.
You can browse the corresponding folders to read the source code.
3.1 Build the neural network
You can find the implementation details in ./yolo3/model.py.
import keras.backend as K
from yolo3.model import preprocess_true_boxes, yolo_body, yolo_loss
from keras.layers import Input, Lambda
from keras.models import Model

# Reset the session
K.clear_session()

# Input image size
input_shape = (416, 416)
image_input = Input(shape=(None, None, 3))
h, w = input_shape

# y_true placeholders for the three detection scales (downsampling strides 32, 16 and 8)
y_true = [Input(shape=(h//{0:32, 1:16, 2:8}[l], w//{0:32, 1:16, 2:8}[l], num_anchors//3, num_classes+5))
          for l in range(3)]

# Build the YOLO model body
model_body = yolo_body(image_input, num_anchors//3, num_classes)

# Load the pre-trained YOLO weights; remove this line if you want to train from scratch
model_body.load_weights(weights_path, by_name=True, skip_mismatch=True)

# Define the YOLO loss function as a Lambda layer
model_loss = Lambda(yolo_loss, output_shape=(1,), name='yolo_loss',
                    arguments={'anchors': anchors, 'num_classes': num_classes, 'ignore_thresh': 0.5})([*model_body.output, *y_true])

# Build the Model used for training
model = Model([model_body.input, *y_true], model_loss)
WARNING:tensorflow:From /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
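For a 416x416 input, the three detection scales correspond to strides 32, 16 and 8, i.e. 13x13, 26x26 and 52x52 grids. You can confirm the shapes of the y_true placeholders defined above; the last two dimensions are num_anchors//3 and num_classes+5 (3 and 85 for the usual 9 anchors and 80 COCO classes, treating the anchor count as an assumption).
# Expected shapes, e.g. (None, 13, 13, 3, 85) for the coarsest scale.
for t in y_true:
    print(K.int_shape(t))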
# Print the structure of each model layer
model.summary()
(The output of this cell is long and has been omitted.)
Define the training callbacks
from keras.callbacks import ReduceLROnPlateau, EarlyStopping

# Define the callbacks
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=3, verbose=1)  # learning-rate decay policy
early_stopping = EarlyStopping(monitor='val_loss', min_delta=0, patience=10, verbose=1)  # early-stopping policy
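If you also want to keep the best weights seen during training, a Keras ModelCheckpoint callback can be added alongside the two above. This is not part of the original cell; the file name is arbitrary, and you would append checkpoint to the callbacks list passed to fit_generator in the next step.
from keras.callbacks import ModelCheckpoint
# Optional: save weights whenever val_loss improves (file name is only an example).
checkpoint = ModelCheckpoint('trained_weights_best.h5', monitor='val_loss',
                             save_weights_only=True, save_best_only=True, verbose=1)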
3.2 Start training
from keras.optimizers import Adam
from yolo3.utils import get_random_data

# Make all layers trainable
for i in range(len(model.layers)):
    model.layers[i].trainable = True

# Use the Adam optimizer and set the learning rate
learning_rate = 1e-4
model.compile(optimizer=Adam(lr=learning_rate), loss={'yolo_loss': lambda y_true, y_pred: y_pred})

# Set the batch size and the number of epochs
batch_size = 16
max_epochs = 2
print('Train on {} samples, val on {} samples, with batch size {}.'.format(num_train, num_val, batch_size))

# Start training
model.fit_generator(data_generator_wrapper(lines[:num_train], batch_size, input_shape, data_path, anchors, num_classes),
                    steps_per_epoch=max(1, num_train//batch_size),
                    validation_data=data_generator_wrapper(lines[num_train:], batch_size, input_shape, data_path, anchors, num_classes),
                    validation_steps=max(1, num_val//batch_size),
                    epochs=max_epochs,
                    initial_epoch=0,
                    callbacks=[reduce_lr, early_stopping])
Train on 179 samples, val on 19 samples, with batch size 16.
Epoch 1/2
11/11 [==============================] - 25s 2s/step - loss: 46.6694 - val_loss: 39.1381
Epoch 2/2
11/11 [==============================] - 5s 452ms/step - loss: 45.5145 - val_loss: 43.6707
<keras.callbacks.History at 0x7fbff60659e8>
3.3 Save the model
import os

if not os.path.exists(save_path):
    os.makedirs(save_path)

# Save the trained weights
model.save_weights(os.path.join(save_path, 'trained_weights_final.h5'))
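As an optional check that the file was written correctly, the saved weights can be loaded back into model_body, which shares its layer names with the training model; this assumes the same save_path as above.
# Optional: confirm the saved weights can be reloaded (by_name matches the shared layer names).
model_body.load_weights(os.path.join(save_path, 'trained_weights_final.h5'), by_name=True)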
4. Test the model
4.1 Open a test image
from PIL import Image
import numpy as np

# Path to the test image
test_file_path = './test.jpg'

# Open the test image
image = Image.open(test_file_path)
image_ori = np.array(image)
image_ori.shape
(640, 481, 3)
4.2 Preprocess the image
from yolo3.utils import letterbox_image
new_image_size = (image.width - (image.width % 32), image.height - (image.height % 32))
boxed_image = letterbox_image(image, new_image_size)
image_data = np.array(boxed_image, dtype='float32')
image_data /= 255.
image_data = np.expand_dims(image_data, 0)
image_data.shape
(1, 640, 480, 3)
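The network downsamples by a stride of up to 32, so the image is trimmed to the nearest multiple of 32 on each side before inference; for this 481x640 test image that gives 480x640, which is why the batch above has shape (1, 640, 480, 3). A minimal check:
# Width 481 -> 480 (481 % 32 == 1); height 640 is already a multiple of 32.
print(image.width - image.width % 32, image.height - image.height % 32)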
import keras.backend as K
sess = K.get_session()
4.3 Build the model
from yolo3.model import yolo_body
from keras.layers import Input

# Path to the COCO anchor file
anchor_path = "./model_data/yolo_anchors.txt"
with open(anchor_path) as f:
    anchors = f.readline()
anchors = [float(x) for x in anchors.split(',')]
anchors = np.array(anchors).reshape(-1, 2)

yolo_model = yolo_body(Input(shape=(None,None,3)), len(anchors)//3, num_classes)
4.4 Load the model weights, or replace the path below with the model trained in the previous step
# Path to the model weights file
weights_path = "./model_data/yolo.h5"
yolo_model.load_weights(weights_path)
4.5 Define the IOU and score thresholds:
- IOU: during non-maximum suppression, boxes whose intersection-over-union with a higher-scoring box exceeds this threshold are removed as redundant
- score: only boxes whose prediction score exceeds this threshold are kept
iou = 0.45
score = 0.8
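For reference, the IoU of two boxes is the area of their intersection divided by the area of their union. Below is a minimal stand-alone version for boxes given as (top, left, bottom, right), purely for illustration; yolo_eval performs its own suppression internally.
# Reference IoU for two boxes in (top, left, bottom, right) form -- illustration only.
def box_iou_tlbr(a, b):
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)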
4.6 Build the outputs [boxes, scores, classes]
from yolo3.model import yolo_eval

input_image_shape = K.placeholder(shape=(2, ))
boxes, scores, classes = yolo_eval(
    yolo_model.output,
    anchors,
    num_classes,
    input_image_shape,
    score_threshold=score,
    iou_threshold=iou)
4.7 Run the prediction
out_boxes, out_scores, out_classes = sess.run(
    [boxes, scores, classes],
    feed_dict={
        yolo_model.input: image_data,
        input_image_shape: [image.size[1], image.size[0]],
        K.learning_phase(): 0
    })

class_coco = get_classes(classes_path)
out_coco = []
for i in out_classes:
    out_coco.append(class_coco[i])

print(out_boxes)
print(out_scores)
print(out_coco)
[[152.69937 166.2726 649.0503 459.9374 ]
[ 68.62158 21.843088 465.66208 452.6878 ]]
[0.9838943 0.999688 ]
['person', 'umbrella']
4.8 Draw the predictions on the image
from PIL import Image, ImageFont, ImageDraw

font = ImageFont.truetype(font='font/FiraMono-Medium.otf',
                          size=np.floor(3e-2 * image.size[1] + 0.5).astype('int32'))
thickness = (image.size[0] + image.size[1]) // 300

for i, c in reversed(list(enumerate(out_coco))):
    predicted_class = c
    box = out_boxes[i]
    score = out_scores[i]

    label = '{} {:.2f}'.format(predicted_class, score)
    draw = ImageDraw.Draw(image)
    label_size = draw.textsize(label, font)

    # Boxes are returned as (top, left, bottom, right); round and clip them to the image bounds
    top, left, bottom, right = box
    top = max(0, np.floor(top + 0.5).astype('int32'))
    left = max(0, np.floor(left + 0.5).astype('int32'))
    bottom = min(image.size[1], np.floor(bottom + 0.5).astype('int32'))
    right = min(image.size[0], np.floor(right + 0.5).astype('int32'))
    print(label, (left, top), (right, bottom))

    # Place the label above the box if there is room, otherwise just inside its top edge
    if top - label_size[1] >= 0:
        text_origin = np.array([left, top - label_size[1]])
    else:
        text_origin = np.array([left, top + 1])

    # Draw the bounding box with the requested line thickness, then the label background and text
    for i in range(thickness):
        draw.rectangle(
            [left + i, top + i, right - i, bottom - i],
            outline=225)
    draw.rectangle(
        [tuple(text_origin), tuple(text_origin + label_size)],
        fill=225)
    draw.text(text_origin, label, fill=(0, 0, 0), font=font)
    del draw
umbrella 1.00 (22, 69) (453, 466)
person 0.98 (166, 153) (460, 640)
image