A Hands-On Introduction to Pose Estimation with YOLOv5
Using YOLOv5 together with HRNet and SimDR to detect human keypoints in images, videos, and webcam streams. Everyone is welcome to discuss and learn together~
I. Project Preparation
First we use PyCharm to clone the original pose-estimation project straight from GitHub. If you don't know how to clone a repository in PyCharm, keep reading; if you already do, skip subsection 1 and jump straight to subsection 2.
1. Cloning a GitHub Project in PyCharm
Cloning GitHub source code from inside PyCharm requires the Git tool. How do you check whether your PyCharm has Git installed?
Step 1: open File -> Settings:
Step 2: go to Version Control -> Git and click the Test button:
If a Git version is displayed, as in my case, Git is already installed correctly:
If the following prompt appears instead, Git is not installed:
If the prompt appears, click it and PyCharm will install Git automatically. If there is no prompt, download Git yourself from the official site. The official site is painfully slow, though, so here is a domestic mirror that downloads much faster: https://registry.npmmirror.com/binary.html?path=git-for-windows/
After the download, open a new Settings window, switch to the Version Control -> Git tab, and select the path of the freshly installed Git.
Note: be sure to select git.exe; the default installation puts it under the cmd folder.
2. Concrete Steps
With Git in place, find the Git menu at the top of PyCharm and click Clone:
Paste in the URL of the project to clone below, and double-check the directory you are cloning into. This matters! Don't start tuning the model only to discover the directory is wrong and, like me, have to redo everything (painful).
The original pose-estimation project:
git clone https://github.com/leeyegy/SimDR.git
After cloning, add the yolov5 code into it:
- In the terminal:
cd SimDR
- Then, still in the terminal, run:
git clone -b v5.0 https://github.com/ultralytics/yolov5.git
Explanation: git clone -b <branch name> <repository URL>; -b is short for --branch.
3. Environment Setup
Install the dependencies listed in the requirements.txt files of both SimDR and yolov5. PyCharm usually offers a one-click install prompt; fix errors wherever they pop up, and remember to upgrade pip first, since that step fails surprisingly often. In short, install whatever is missing, as sketched below.
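For reference, the manual equivalent of that one-click install looks like this (run from the SimDR root; this assumes both repos ship a requirements.txt at their top level, which they did at the time of writing):
python -m pip install --upgrade pip
pip install -r requirements.txt
pip install -r yolov5/requirements.txt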
II. Object Detection
1. Adding the Weight File
First we need to add the weight file yolov5x.pt to the SimDR/yolov5/weights/ folder:
Here is a free download link for the weight files:
Link: 帅哥一枚到此一游 Extraction code: 0213
2. Getting Image Bounding Boxes
Create a new file YOLOv5.py under the yolov5 folder (the troubleshooting section in Part V refers to it by this name) and write the code below into it. If a pile of errors appears, don't panic; we will work through them one by one:
import argparse
import time
from pathlib import Path
import numpy as np
import cv2
import torch
import torch.backends.cudnn as cudnn
from numpy import random
import sys
import os
# If you run this script from inside the yolov5/ folder, these imports work as-is:
from models.experimental import attempt_load
from utils.datasets import LoadStreams, LoadImages, letterbox
from utils.general import check_img_size, check_requirements, check_imshow, non_max_suppression, \
    apply_classifier, scale_coords, xyxy2xywh, strip_optimizer, set_logging, increment_path
from utils.plots import plot_one_box
from utils.torch_utils import select_device, load_classifier, time_synchronized
# If you run it from the SimDR root instead, add the package prefix (see Part V, "Path Problems"):
# from yolov5.models.experimental import attempt_load
# from yolov5.utils.datasets import LoadStreams, LoadImages, letterbox
# from yolov5.utils.general import check_img_size, check_requirements, check_imshow, non_max_suppression, \
#     apply_classifier, scale_coords, xyxy2xywh, strip_optimizer, set_logging, increment_path
# from yolov5.utils.plots import plot_one_box
# from yolov5.utils.torch_utils import select_device, load_classifier, time_synchronized
class Yolov5():
def __init__(self, weights=None, opt=None, device=None):
"""
@param weights:
@param save_txt:
@param opt:
@param device:
"""
self.weights = weights
self.device = device
# save_dir = Path(increment_path(Path(opt.project) / opt.name, exist_ok=opt.exist_ok)) # increment run
# save_dir.mkdir(parents=True, exist_ok=True) # make dir
self.img_size = 640
self.model = attempt_load(weights, map_location=self.device)
self.stride = int(self.model.stride.max())
self.names = self.model.module.names if hasattr(self.model, 'module') else self.model.names
self.colors = [[random.randint(0, 255) for _ in range(3)] for _ in self.names]
self.opt = opt
def detect(self,img0):
"""
@param img0: 输入图片 shape=[h,w,3]
@return:
"""
        person_boxes = np.ones(6)  # seed row (x1, y1, x2, y2, conf, cls); stripped off before returning
img = letterbox(img0, self.img_size, stride=self.stride)[0]
# Convert
        img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR to RGB, HWC to CHW
img = np.ascontiguousarray(img)
img = torch.from_numpy(img).to(self.device)
img = img.float() # uint8 to fp16/32
img /= 255.0 # 0 - 255 to 0.0 - 1.0
if img.ndimension() == 3:
img = img.unsqueeze(0)
pred = self.model(img, augment=self.opt.augment)[0]
# Apply NMS
pred = non_max_suppression(pred, self.opt.conf_thres, self.opt.iou_thres, classes=self.opt.classes, agnostic=self.opt.agnostic_nms)
for i, det in enumerate(pred):
if len(det):
# Rescale boxes from img_size to im0 size
det[:, :4] = scale_coords(img.shape[2:], det[:, :4], img0.shape).round()
boxes = reversed(det)
                boxes = boxes.cpu().numpy()  # changed 2022-04-06: boxes on the GPU cannot be converted to numpy directly
#for i , box in enumerate(np.array(boxes)):
for i , box in enumerate(boxes):
                    if int(box[-1]) == 0 and box[-2] >= 0.7:  # class 0 = 'person' in COCO; keep only confident boxes
person_boxes=np.vstack((person_boxes , box))
# label = f'{self.names[int(box[-1])]} {box[-2]:.2f}'
# print(label)
# plot_one_box(box, img0, label=label, color=self.colors[int(box[-1])], line_thickness=3)
# cv2.imwrite('result1.jpg',img0)
# print(s)
# print(person_boxes,np.ndim(person_boxes))
        if np.ndim(person_boxes) >= 2:
person_boxes_result = person_boxes[1:]
boxes_result = person_boxes[1:,:4]
else:
person_boxes_result = []
boxes_result = []
return person_boxes_result,boxes_result
def yolov5test(opt,path = ''):
detector = Yolov5(weights='weights/yolov5x.pt',opt=opt,device=torch.device('cpu'))
img0 = cv2.imread(path)
personboxes ,boxes= detector.detect(img0)
for i,(x1,y1,x2,y2) in enumerate(boxes):
print(x1,y1,x2,y2)
print(personboxes,'\n',boxes)
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument('--conf-thres', type=float, default=0.25, help='object confidence threshold')
parser.add_argument('--iou-thres', type=float, default=0.45, help='IOU threshold for NMS')
parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --class 0, or --class 0 2 3')
parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS')
parser.add_argument('--augment', action='store_true', help='augmented inference')
parser.add_argument('--update', action='store_true', help='update all model')
parser.add_argument('--project', default='runs/detect', help='save results to project/name')
parser.add_argument('--name', default='exp', help='save results to project/name')
parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
opt = parser.parse_args()
print(opt)
# check_requirements(exclude=('pycocotools', 'thop'))
with torch.no_grad():
yolov5test(opt,'data/images/zidane.jpg')
If there are no errors, you will get coordinates like the ones below; if there are errors, don't worry either, since the common ones are listed in Part V:
3. Adding the SPPF Module
The yolov5 v5.0 code base does not include the SPPF module, yet newer yolov5x.pt checkpoints reference it (loading one then typically fails with "Can't get attribute 'SPPF'"), so we add the following code at the end of the SimDR/yolov5/models/common.py file:
import warnings
class SPPF(nn.Module):
# Spatial Pyramid Pooling - Fast (SPPF) layer for YOLOv5 by Glenn Jocher
def __init__(self, c1, c2, k=5): # equivalent to SPP(k=(5, 9, 13))
super().__init__()
c_ = c1 // 2 # hidden channels
self.cv1 = Conv(c1, c_, 1, 1)
self.cv2 = Conv(c_ * 4, c2, 1, 1)
self.m = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)
def forward(self, x):
x = self.cv1(x)
with warnings.catch_warnings():
warnings.simplefilter('ignore') # suppress torch 1.9.0 max_pool2d() warning
y1 = self.m(x)
y2 = self.m(y1)
return self.cv2(torch.cat([x, y1, y2, self.m(y2)], 1))
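A side note on the comment "equivalent to SPP(k=(5, 9, 13))": stacking 5x5 stride-1 max-pools grows the receptive field to 9x9 and then 13x13, so SPPF reuses intermediate results instead of running three large pools. A quick self-contained sketch (only needs torch) to convince yourself:
import torch
import torch.nn as nn

x = torch.rand(1, 8, 32, 32)
m5 = nn.MaxPool2d(kernel_size=5, stride=1, padding=2)
m9 = nn.MaxPool2d(kernel_size=9, stride=1, padding=4)
m13 = nn.MaxPool2d(kernel_size=13, stride=1, padding=6)

y1 = m5(x)   # 5x5 receptive field
y2 = m5(y1)  # two stacked 5x5 pools == one 9x9 pool
y3 = m5(y2)  # three stacked 5x5 pools == one 13x13 pool
print(torch.equal(y2, m9(x)), torch.equal(y3, m13(x)))  # True True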
III. Pose Estimation
1. Adding the Weights
Create a weight/hrnet folder under SimDR and put pose_hrnet_w48_384x288.pth and the two other similar files into it; these files are also in the weight download link mentioned above. The resulting path is weight\hrnet.
2. Modifying the yaml Files
The SimDR/experiments/ folder holds the configuration files for the coco and mpii datasets. This article works with the coco dataset and only demonstrates the heatmap variant; the other two are modified the same way:
In the file ./SimDR/experiments/coco/hrnet/heatmap/w48_384x288_adam_lr1e-3.yaml, change the MODEL_FILE path in the TEST section, as shown:
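After the edit, the TEST block should look roughly like this (the path is the weight/hrnet one from the previous step; all other keys stay untouched):
TEST:
  # ...other TEST keys unchanged...
  MODEL_FILE: 'weight/hrnet/pose_hrnet_w48_384x288.pth'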
3. Getting the Keypoints
Create Point_detect.py under the SimDR folder and write the code into it.
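For orientation, the interface that main.py (Part IV) expects from Point_detect.py looks roughly like the sketch below. The names are inferred from the calls in main.py; the real file also runs the Part II Yolov5 detector and crops each person box before feeding it to the HRNet/SimDR pose network:
# Interface sketch only -- not the full Point_detect.py listing.
class Points:
    def __init__(self, model_name='hrnet', opt=None, resolution=(384, 288)):
        # model_name: 'hrnet', 'simdr' or 'sa-simdr'; resolution: network input size (h, w)
        # internally builds the pose network and the Yolov5 person detector
        ...

    def predict(self, img0):
        # img0: BGR image of shape (h, w, 3)
        # returns: a list with one (17, 3) array per detected person,
        #          each row holding (x, y, confidence) for a COCO keypoint
        ...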
4. Drawing the Skeleton Keypoints
With the steps above we already have the keypoint coordinates; next we draw them on the image to present the detection results. Create a visualization.py file under the SimDR/lib/utils/ folder and write the code below into it. The skeleton-drawing code combines the simple-HRNet and OpenPose projects:
import cv2
import matplotlib.pyplot as plt
import numpy as np
import random
import math
import copy
def plot_one_box(x, img, color=None, label=None, line_thickness=3):
# Plots one bounding box on image img
tl = line_thickness or round(0.002 * (img.shape[0] + img.shape[1]) / 2) + 1 # line/font thickness
color = color or [random.randint(0, 255) for _ in range(3)]
c1, c2 = (int(x[0]), int(x[1])), (int(x[2]), int(x[3]))
cv2.rectangle(img, c1, c2, color, thickness=tl, lineType=cv2.LINE_AA)
if label:
tf = max(tl - 1, 1) # font thickness
t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0]
c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3
cv2.rectangle(img, c1, c2, color, -1, cv2.LINE_AA) # filled
cv2.putText(img, label, (c1[0], c1[1] - 2), 0, tl / 3, [225, 255, 255], thickness=tf, lineType=cv2.LINE_AA)
return img
def joints_dict():
joints = {
"coco": {
"keypoints": {
0: "nose",
1: "left_eye",
2: "right_eye",
3: "left_ear",
4: "right_ear",
5: "left_shoulder",
6: "right_shoulder",
7: "left_elbow",
8: "right_elbow",
9: "left_wrist",
10: "right_wrist",
11: "left_hip",
12: "right_hip",
13: "left_knee",
14: "right_knee",
15: "left_ankle",
16: "right_ankle"
},
"skeleton": [
# # [16, 14], [14, 12], [17, 15], [15, 13], [12, 13], [6, 12], [7, 13], [6, 7], [6, 8],
# # [7, 9], [8, 10], [9, 11], [2, 3], [1, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5, 7]
# [15, 13], [13, 11], [16, 14], [14, 12], [11, 12], [5, 11], [6, 12], [5, 6], [5, 7],
# [6, 8], [7, 9], [8, 10], [1, 2], [0, 1], [0, 2], [1, 3], [2, 4], [3, 5], [4, 6]
[15, 13], [13, 11], [16, 14], [14, 12], [11, 12], [5, 11], [6, 12], [5, 6], [5, 7],
[6, 8], [7, 9], [8, 10], [1, 2], [0, 1], [0, 2], [1, 3], [2, 4], # [3, 5], [4, 6]
[0, 5], [0, 6]
# [15, 13], [13, 11], [16, 14], [14, 12], [11, 12], [5, 11], [6, 12], [5, 6], [5, 7],
# [6, 8], [7, 9], [8, 10], [0, 3], [0, 4], [1, 3], [2, 4], # [3, 5], [4, 6]
# [0, 5], [0, 6]
]
},
"mpii": {
"keypoints": {
0: "right_ankle",
1: "right_knee",
2: "right_hip",
3: "left_hip",
4: "left_knee",
5: "left_ankle",
6: "pelvis",
7: "thorax",
8: "upper_neck",
9: "head top",
10: "right_wrist",
11: "right_elbow",
12: "right_shoulder",
13: "left_shoulder",
14: "left_elbow",
15: "left_wrist"
},
"skeleton": [
# [5, 4], [4, 3], [0, 1], [1, 2], [3, 2], [13, 3], [12, 2], [13, 12], [13, 14],
# [12, 11], [14, 15], [11, 10], # [2, 3], [1, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5, 7]
[5, 4], [4, 3], [0, 1], [1, 2], [3, 2], [3, 6], [2, 6], [6, 7], [7, 8], [8, 9],
[13, 7], [12, 7], [13, 14], [12, 11], [14, 15], [11, 10],
]
},
}
return joints
def draw_points(image, points, color_palette='tab20', palette_samples=16, confidence_threshold=0.1,color=None):
"""
Draws `points` on `image`.
Args:
image: image in opencv format
points: list of points to be drawn.
Shape: (nof_points, 3)
            Format: each point should contain (x, y, confidence)
color_palette: name of a matplotlib color palette
Default: 'tab20'
palette_samples: number of different colors sampled from the `color_palette`
Default: 16
confidence_threshold: only points with a confidence higher than this threshold will be drawn. Range: [0, 1]
Default: 0.1
Returns:
A new image with overlaid points
"""
circle_size = max(2, int(np.sqrt(np.max(np.max(points, axis=0) - np.min(points, axis=0)) // 16)))
for i, pt in enumerate(points):
if pt[2] >= confidence_threshold:
image = cv2.circle(image, (int(pt[0]), int(pt[1])), circle_size, color[i] ,-1, lineType= cv2.LINE_AA)
return image
def draw_skeleton(image, points, skeleton, color_palette='Set2', palette_samples=8, person_index=0,
confidence_threshold=0.1,sk_color=None):
"""
Draws a `skeleton` on `image`.
Args:
image: image in opencv format
points: list of points to be drawn.
Shape: (nof_points, 3)
            Format: each point should contain (x, y, confidence)
skeleton: list of joints to be drawn
Shape: (nof_joints, 2)
Format: each joint should contain (point_a, point_b) where `point_a` and `point_b` are an index in `points`
color_palette: name of a matplotlib color palette
Default: 'Set2'
palette_samples: number of different colors sampled from the `color_palette`
Default: 8
person_index: index of the person in `image`
Default: 0
confidence_threshold: only points with a confidence higher than this threshold will be drawn. Range: [0, 1]
Default: 0.1
Returns:
A new image with overlaid joints
"""
canvas = copy.deepcopy(image)
cur_canvas = canvas.copy()
for i, joint in enumerate(skeleton):
pt1, pt2 = points[joint]
if pt1[2] >= confidence_threshold and pt2[2]>= confidence_threshold :
length = ((pt1[0] - pt2[0]) ** 2 + (pt1[1] - pt2[1]) ** 2) ** 0.5
angle = math.degrees(math.atan2(pt1[1] - pt2[1],pt1[0] - pt2[0]))
polygon = cv2.ellipse2Poly((int(np.mean((pt1[0],pt2[0]))), int(np.mean((pt1[1],pt2[1])))), (int(length / 2), 2), int(angle), 0, 360, 1)
cv2.fillConvexPoly(cur_canvas, polygon, sk_color[i],lineType=cv2.LINE_AA)
# cv2.fillConvexPoly(cur_canvas, polygon, sk_color,lineType=cv2.LINE_AA)
canvas = cv2.addWeighted(canvas, 0.4, cur_canvas, 0.6, 0)
return canvas
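One caveat: main.py in Part IV imports draw_points_and_skeleton from this file, which the listing above does not define. A minimal version (a sketch using fixed BGR colors; a palette-based version like simple-HRNet's would also work) simply chains the two helpers:
def draw_points_and_skeleton(image, points, skeleton, confidence_threshold=0.1):
    # Sketch only: one fixed color per bone / per keypoint instead of a palette.
    bone_colors = [(0, 165, 255)] * len(skeleton)
    point_colors = [(0, 255, 255)] * len(points)
    image = draw_skeleton(image, points, skeleton,
                          confidence_threshold=confidence_threshold, sk_color=bone_colors)
    image = draw_points(image, points,
                        confidence_threshold=confidence_threshold, color=point_colors)
    return image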
IV. Demo Results
1. Image Demo
Create main.py under the SimDR folder and write the code below into it; you will need to change the default value of the --source argument to your own path:
import argparse
import time
import os
import cv2 as cv
import numpy as np
from pathlib import Path
from Point_detect import Points
from lib.utils.visualization import draw_points_and_skeleton, joints_dict
def image_detect(opt):
skeleton = joints_dict()['coco']['skeleton']
hrnet_model = Points(model_name='hrnet', opt=opt, resolution=(384, 288)) # resolution = (384,288) or (256,192)
# simdr_model = Points(model_name='simdr', opt=opt,resolution=(256,192)) #resolution = (256,192)
# sa_simdr_model = Points(model_name='sa-simdr', opt=opt,resolution=(384,288)) #resolution = (384,288) or (256,192)
img0 = cv.imread(opt.source)
frame = img0.copy()
# predict
pred = hrnet_model.predict(img0)
# pred = simdr_model.predict(frame)
# pred = sa_simdr_model.predict(frame)
# vis
for i, pt in enumerate(pred):
frame = draw_points_and_skeleton(frame, pt, skeleton)
# save
cv.imwrite('test_result.jpg', frame)
def video_detect(opt):
hrnet_model = Points(model_name='hrnet', opt=opt, resolution=(384, 288)) # resolution = (384,288) or (256,192)
# simdr_model = Points(model_name='simdr', opt=opt,resolution=(256,192)) #resolution = (256,192)
# sa_simdr_model = Points(model_name='sa-simdr', opt=opt,resolution=(384,288)) #resolution = (384,288) or (256,192)
skeleton = joints_dict()['coco']['skeleton']
cap = cv.VideoCapture(opt.source)
# cap = cv.VideoCapture(0)
if opt.save_video:
fourcc = cv.VideoWriter_fourcc(*'MJPG')
out = cv.VideoWriter('data/runs/{}_out.avi'.format(os.path.basename(opt.source).split('.')[0]), fourcc, 24,
(int(cap.get(3)), int(cap.get(4))))
while cap.isOpened():
ret, frame = cap.read()
if not ret:
break
pred = hrnet_model.predict(frame)
# pred = simdr_model.predict(frame)
# pred = sa_simdr_model.predict(frame)
for pt in pred:
frame = draw_points_and_skeleton(frame, pt, skeleton)
if opt.show:
cv.imshow('result', frame)
if opt.save_video:
out.write(frame)
if cv.waitKey(1) == 27:
break
    if opt.save_video:
        out.release()
cap.release()
cv.destroyAllWindows()
if __name__ == '__main__':
parser = argparse.ArgumentParser()
    parser.add_argument('--source', type=str, default=r'D:\PycharmProjects\SimDR\yolov5\data\images\1234.jpg',
                        help='source')  # file/folder, 0 for webcam
parser.add_argument('--detect_weight', type=str, default="./yolov5/weights/yolov5x.pt",
help='e.g "./yolov5/weights/yolov5x.pt"')
parser.add_argument('--save_video', action='store_true', default=False, help='save results to *.avi')
    parser.add_argument('--show', action='store_true', default=True, help='display results in a window')
parser.add_argument('--device', default='cpu', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
parser.add_argument('--conf-thres', type=float, default=0.25, help='object confidence threshold')
parser.add_argument('--iou-thres', type=float, default=0.45, help='IOU threshold for NMS')
parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --class 0, or --class 0 2 3')
parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS')
parser.add_argument('--augment', action='store_true', help='augmented inference')
opt = parser.parse_args()
image_detect(opt)
# video_detect(opt)
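Run it from the SimDR root; the paths below are just examples. Note that __main__ calls image_detect(opt), so for the video demo swap in the commented video_detect(opt) call and pass a video path:
python main.py --source yolov5\data\images\1234.jpg --device cpu
python main.py --source data\videos\demo.mp4 --save_video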
The result:
PS: I just grabbed a random picture, but at second glance this guy looks kind of handsome. Who is this? Ah, it's the CSDN blogger 是Dream呀. Word is he runs a free book giveaway every week, sending out free books for everyone; with all that said, a follow and a like aren't too much to ask, right? I've already left mine (^^)
2. Video Demo
The result:
This time 是Dream呀 truly sacrificed himself for art... no wait, for science. Surely a like and a follow really aren't too much to ask now~
V. Troubleshooting
This section records all the errors I ran into myself, along with some you are likely to meet:
1. Path Problems
The code in this article runs in PyCharm, and pulling in the yolov5 project creates some identically named folders that PyCharm can mix up, so some packages may fail to import, with errors like the following:
NameError: name 'Yolov5' is not defined
Solution:
First run the YOLOv5.py script and fix the imports according to the error messages. Take the imports at the top of YOLOv5.py as an example: some people get no errors with the plain form:
while most people need to add one more level of prefix in front:
and some, like me, have to spell out the full path everywhere. It honestly drove me mad. Stupid computer: once I've earned enough money, you're getting replaced. I spent close to two hours on this one spot just because the imports would not resolve. Whenever an error says that XXX does not exist, it is almost certainly a failed import; go to each error site, fix the path, and you're done!
Going wrong here is perfectly normal, since with this many files it's easy to slip up. If you want to be foolproof, do as I do and write out the fullest possible path, so the lookup can't miss. And if it still fails after that... start over, I suppose…
2. Warning during YOLOv5/YOLOvX training: functional.py
When running the program, you may encounter the following warning:
...pytorch\lib\site-packages\torch\functional.py:478: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\TensorShape.cpp:2895.)
...and so on, a long wall of text, but in short it is functional.py complaining. The warning doesn't affect the program, but it's annoying to look at!
Solution:
Click through the warning to jump into functional.py, find the flagged spot, and add indexing='ij', as shown below:
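In my copy of torch the change looks like this (the exact line number differs between versions):
# inside torch/functional.py, in the meshgrid helper:
# return _VF.meshgrid(tensors, **kwargs)                # before
return _VF.meshgrid(tensors, **kwargs, indexing='ij')   # after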
3. matplotlib.use('Agg') # for writing to files only
If this error appears:
Solution:
Change: matplotlib.use('Agg')
to: plt.switch_backend('agg')
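Both calls select the non-interactive Agg backend meant for writing to files; the replacement just goes through pyplot (a minimal sketch):
import matplotlib.pyplot as plt

# matplotlib.use('Agg')      # the line that raised the error
plt.switch_backend('agg')    # equivalent backend switch via pyplot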
4. AttributeError: 'Upsample' object has no attribute 'recompute_scale_factor'
When training with yolov5, you may run into this error:
AttributeError: 'Upsample' object has no attribute 'recompute_scale_factor'
That is, the 'Upsample' object has no 'recompute_scale_factor' attribute; it typically appears when a model saved under an older torch is loaded with torch 1.11 or newer, whose nn.Upsample.forward references that new attribute.
Following the error's hint, open the indicated .py file and find the forward() function:
def forward(self, input: Tensor) -> Tensor:
return F.interpolate(input, self.size, self.scale_factor, self.mode, self.align_corners,
recompute_scale_factor=self.recompute_scale_factor)
Comment it out and replace it with the code below. Why comment it out instead of deleting? So other projects can still use the original; you may want to change it back later:
def forward(self, input: Tensor) -> Tensor:
return F.interpolate(input, self.size, self.scale_factor, self.mode, self.align_corners)
5. 'gbk' codec can't decode byte
Running the program reports the following error:
UnicodeDecodeError: 'gbk' codec can't decode byte
Online references say it's a utf-8 encoding problem, but the error persisted even though utf-8 was already set.
Solution:
It is almost always a file-reading problem. Follow the traceback to the offending spot; if it's hard to find, search the failing module for code like this:
with open(file) as f:
That's the culprit, no question. Change it to:
with open(file, 'r', encoding='utf-8') as f:
and the problem is solved. Over.