ModelBox Cat and Dog Recognition [Fun with Huawei Cloud]

HouYanSong, posted on 2023/06/11 19:28:19


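[Abstract] This article covers algorithm validation and deployment. It uses ModelBox to develop an AI application that classifies cat and dog images, locates the head, and outputs a segmentation mask. The algorithm currently detects and segments only a single object; support for multiple objects will be added later.

TensorFlow Cat and Dog Recognition: Validating and Deploying an Oxford-IIIT Image Localization and Segmentation Algorithm

This article focuses on validating the algorithm and testing its deployment. The algorithm classifies cat and dog images, locates the head, and outputs a segmentation mask. It currently supports only a single object per image; multi-object support will follow. The AI application is then developed with ModelBox, and the model's final results look like this:

All resources needed for this example (code, models, test data, and so on) can be downloaded from the network-drive link.

Because I crushed my computer in my sleep one night, I took advantage of the 618 sale to buy a MateBook 16. I will deploy the model on it and see how the machine actually performs.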

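Model training

For the implementation, see the Notebook I published; click Run in ModelArts to run it in one step:

The structure of the exported ONNX model is shown below: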

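Is ModelBox inference really efficient?

To compare end-to-end performance, we run the same inference on the same model with onnxruntime and with the ModelBox Windows SDK, using a 1080p, 25 fps video as the test input.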
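A frame rate read from a single frame interval can jump around, so an end-to-end comparison is easier to interpret with an average over the whole run. The following is a minimal sketch of such a measurement, not the author's benchmark code; `FpsMeter` and its methods are hypothetical names, and the clock is injectable only so the helper can be checked without a real video.

```python
import time


class FpsMeter:
    """Tracks the average frame rate over a whole run (hypothetical helper)."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock      # callable returning seconds as a float
        self._start = None       # time of the first processed frame
        self._last = None        # time of the most recent processed frame
        self.frames = 0

    def tick(self):
        """Call once after each frame has been fully processed."""
        now = self._clock()
        if self._start is None:
            self._start = now
        self._last = now
        self.frames += 1

    def average_fps(self):
        """Frames per second between the first and the latest tick."""
        if self.frames < 2:
            return 0.0
        elapsed = self._last - self._start
        # N ticks span N - 1 frame intervals.
        return (self.frames - 1) / elapsed if elapsed > 0 else 0.0
```

In a processing loop, `meter.tick()` would be called once per written frame and `meter.average_fps()` read after the loop ends, so both pipelines are measured the same way.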

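Native onnxruntime inference:

Video detection reaches a frame rate of only 4 fps:

The native-API inference code is in the onnxruntime_infer directory of the resource package: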

"""

Read a video stream with OpenCV and run inference with native onnxruntime.
"""
import time

import cv2
import numpy as np
import onnxruntime
from PIL import Image

import drawUtils  # helper module shipped with the resource package

# cap = cv2.VideoCapture(0)
cap = cv2.VideoCapture('video/1686284611402.mp4')
# cap = cv2.VideoCapture('video/1686307053582.mp4')

if not cap.isOpened():
    print('File does not exist or codec error')
else:
    i = 0
    fps = 25
    start_time = time.time()
    font = cv2.FONT_HERSHEY_PLAIN
    index_to_clss = {0: 'cat', 1: 'dog'}
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    onnx_model = onnxruntime.InferenceSession('model/oxford.onnx')
    writer = cv2.VideoWriter('VideoOut.mp4', cv2.VideoWriter_fourcc(*'X264'), fps, (width, height))

    while cap.isOpened():
        ret, frame = cap.read()
        if ret:
            # Preprocess: resize to the 224x224 model input and scale to [0, 1].
            img = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            img = cv2.resize(img, (224, 224))
            # Note: this reverses the channel order a second time; kept as in the original.
            img = img[..., ::-1] / 255.
            img = img.astype(np.float32)
            data = np.expand_dims(img, axis=0)

            # Inference: one input, six outputs (mask, class score, four box coordinates).
            onnx_input = {onnx_model.get_inputs()[0].name: data}
            pred_mask, clss, out1, out2, out3, out4 = onnx_model.run(None, onnx_input)

            # Per-pixel class ids, resized back to the frame size.
            pred_mask = np.argmax(pred_mask, axis=-1)
            pred_mask = pred_mask.astype(np.float32)
            pred_mask = cv2.resize(pred_mask[0], (width, height))
            pred_mask = pred_mask.astype(np.uint8)

            # Build a colour overlay: class 0 fills channels 1 and 3, class 2 fills channel 2.
            mask = np.zeros(pred_mask.shape)
            mask_0 = np.where(pred_mask == 0, 255, mask)
            mask_2 = np.where(pred_mask == 2, 255, mask)
            mask_r = Image.fromarray(mask_0).convert('L')
            mask_g = Image.fromarray(mask_2).convert('L')
            mask_b = Image.fromarray(mask_0).convert('L')
            img_rgb = Image.merge('RGB', (mask_r, mask_g, mask_b))
            img_rgb = np.asarray(img_rgb)
            img_add = cv2.addWeighted(frame, 1, img_rgb, 0.5, 0)

            # Draw the box and label only when the classifier is confident.
            if clss[0] > 0.99 or clss[0] < 0.1:
                xmin, ymin, xmax, ymax = int(out1[0]*width), int(out2[0]*height), int(out3[0]*width), int(out4[0]*height)
                img_add = cv2.rectangle(img=img_add, pt1=(xmin, ymin), pt2=(xmax, ymax), color=(255, 0, 0), thickness=10)
                img_add = cv2.putText(img=img_add, text=index_to_clss.get((clss[0] > 0.5).astype('int')[0]), org=(xmin, ymin), fontFace=font, fontScale=5, color=(0, 0, 255), thickness=5, lineType=cv2.LINE_AA)

            # Compute the frame rate from the time since the previous frame.
            i += 1
            now = time.time()
            fps_text = int(1 / (now - start_time))
            start_time = now
            print('oxford post ' + str(i))

            # Draw the frame-rate label ('帧率' means 'frame rate').
            img_add = drawUtils.cv2AddChineseText(img_add, '帧率：'+str(fps_text), (20, 50), textColor=(0, 255, 0), textSize=30)

            # Show the frame
            # cv2.imshow('oxford', img_add)
            writer.write(img_add)

            # Exit condition
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
        else:
            break

    # Finalize the output file.
    writer.release()

cap.release()
cv2.destroyAllWindows()
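The overlay step in the script is easy to lose among the PIL conversions, so here is a dependency-free sketch of what it amounts to per pixel: class 0 adds half-strength colour to the first and third channels, class 2 adds it to the second, and the sum is clipped to 255. The names `OVERLAY`, `blend_pixel`, and `blend_frame` are hypothetical, and rounding may differ slightly from `cv2.addWeighted`; treat this as an illustration under those assumptions rather than a replacement for the OpenCV call.

```python
# Overlay colour per class id, following the script: class 0 fills the first
# and third channels, class 2 fills the second, any other class stays black.
OVERLAY = {0: (255, 0, 255), 2: (0, 255, 0)}


def blend_pixel(pixel, class_id, beta=0.5):
    """Return pixel + beta * overlay colour, clipped to the uint8 range."""
    colour = OVERLAY.get(class_id, (0, 0, 0))
    return tuple(min(255, round(p + beta * c)) for p, c in zip(pixel, colour))


def blend_frame(frame, mask, beta=0.5):
    """Blend a frame (rows of 3-tuples) with a same-sized grid of class ids."""
    return [
        [blend_pixel(px, cid, beta) for px, cid in zip(frame_row, mask_row)]
        for frame_row, mask_row in zip(frame, mask)
    ]
```

Both overlay colours are symmetric in their first and third channels, so the result is the same whether the frame is in BGR or RGB order.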

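ModelBox application development

Use VS Code to connect to the directory containing the ModelBox SDK, or to a remote development board, and develop the project from scratch.

Create the project

In the SDK directory, use the create.py script to create a project named oxford.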

yanso@hou MINGW64 /d/modelbox-win10-x64-1.5.3

$ python ./create.py -t server -n oxford

sdk version is modelbox-win10-x64-1.5.3

dos2unix: converting file D:\modelbox-win10-x64-1.5.3\workspace\oxford/graph\modelbox.conf to Unix format...

dos2unix: converting file D:\modelbox-win10-x64-1.5.3\workspace\oxford/graph\oxford.toml to Unix format...

dos2unix: converting file D:\modelbox-win10-x64-1.5.3\workspace\oxford/bin\mock_task.toml to Unix format...

success: create oxford in D:\modelbox-win10-x64-1.5.3\workspace

Create the inference flowunit

yanso@hou MINGW64 /d/modelbox-win10-x64-1.5.3

$ python ./create.py -t infer -n oxford_infer -p oxford

sdk version is modelbox-win10-x64-1.5.3

success: create infer oxford_infer in D:\modelbox-win10-x64-1.5.3\workspace\oxford/model/oxford_infer

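The newly created inference flowunit is in the project's model directory:

Copy the converted oxford.onnx model into the oxford_infer directory, then edit the .toml file, mainly the model path and the inputs and outputs. Our model takes one float input from the CPU and produces six float outputs, so the configuration file is edited as follows: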

[base]

name = "oxford_infer"

device = "cpu"

version = "1.0.0"

description = "your description"

entry = "./oxford.onnx" # model file path, use relative path

type = "inference"

virtual_type = "onnx" # inference engine type: win10 now only support onnx

group_type = "Inference" # flowunit group attribution, do not change

# Input ports description

[input]

[input.input1] # input port number, Format is input.input[N]

name = "Input" # input port name

type = "float" # input port data type ,e.g. float or uint8

device = "cpu" # input buffer type: cpu, win10 now copy input from cpu

# Output ports description

[output]

[output.output1] # output port number, Format is output.output[N]

name = "Output1" # output port name

type = "float" # output port data type ,e.g. float or uint8

[output.output2] # output port number, Format is output.output[N]

name = "Output2" # output port name

type = "float" # output port data type ,e.g. float or uint8

[output.output3] # output port number, Format is output.output[N]

name = "Output3" # output port name

type = "float" # output port data type ,e.g. float or uint8

[output.output4] # output port number, Format is output.output[N]

name = "Output4" # output port name

type = "float" # output port data type ,e.g. float or uint8

[output.output5] # output port number, Format is output.output[N]

name = "Output5" # output port name

type = "float" # output port data type ,e.g. float or uint8

[output.output6] # output port number, Format is output.output[N]

name = "Output6" # output port name

type = "float" # output port data type ,e.g. float or uint8
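One way to cross-check the port list against the model is to read the inputs and outputs from the ONNX file and print the matching TOML sections. The sketch below assumes only that the session object exposes `get_inputs()` and `get_outputs()` returning items with `.name` and `.type`, as `onnxruntime.InferenceSession` does; `port_type`, `render_ports`, and `render_model_ports` are hypothetical helper names, and whether ModelBox accepts every resulting type name is not verified here.

```python
def port_type(onnx_type):
    """Map an ONNX Runtime type string such as 'tensor(float)' to 'float'."""
    if onnx_type.startswith("tensor(") and onnx_type.endswith(")"):
        return onnx_type[len("tensor("):-1]
    return onnx_type


def render_ports(kind, nodes):
    """Render the [input] or [output] TOML sections for a list of ports."""
    lines = [f"[{kind}]"]
    for n, node in enumerate(nodes, start=1):
        lines.append(f"[{kind}.{kind}{n}]")
        lines.append(f'name = "{node.name}"')
        lines.append(f'type = "{port_type(node.type)}"')
        if kind == "input":
            # The configuration shown in the article copies inputs from the CPU.
            lines.append('device = "cpu"')
    return "\n".join(lines)


def render_model_ports(session):
    """Render both port sections for a session-like object."""
    return (render_ports("input", session.get_inputs()) + "\n"
            + render_ports("output", session.get_outputs()))


# Usage with the real model (requires onnxruntime and the model file):
# import onnxruntime
# print(render_model_ports(onnxruntime.InferenceSession("model/oxford.onnx")))
```

The printed text can then be compared with the hand-edited file to confirm that there is one input and six outputs, all of the expected type.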

Create the post-processing flowunit

yanso@hou MINGW64 /d/modelbox-win10-x64-1.5.3

$ python ./create.py -t python -n oxford_post -p oxford

sdk version is modelbox-win10-x64-1.5.3

success: create python oxford_post in D:\modelbox-win10-x64-1.5.3\workspace\oxford/etc/flowunit/oxford_post

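The post-processing flowunit decodes the model's inference results. The oxford_post flowunit has been generated under the project's etc/flowunit directory, containing a .toml configuration file and a .py code file:

In the post-processing flowunit's configuration file, we set parameters under config. The flowunit receives six float inference results and one uint8 original image, and outputs a mask the same size as the original image together with the image class and detection box.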
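To make the decoding concrete, here is a minimal, dependency-free sketch of how the six results could be turned into a class-id mask, a label, and a pixel box. It mirrors the thresholds used in the onnxruntime script, but it is not the flowunit's actual code: the function names are hypothetical, and it assumes the mask arrives as a flat buffer in height, width, class order with three classes, which is why mask_h and mask_w would be needed as parameters.

```python
INDEX_TO_CLASS = {0: "cat", 1: "dog"}


def argmax_mask(flat, mask_h, mask_w, num_classes=3):
    """Turn a flat H*W*C score buffer into an H x W grid of class ids."""
    if len(flat) != mask_h * mask_w * num_classes:
        raise ValueError("buffer size does not match mask_h * mask_w * num_classes")
    mask = []
    for y in range(mask_h):
        row = []
        for x in range(mask_w):
            start = (y * mask_w + x) * num_classes
            scores = flat[start:start + num_classes]
            # Index of the highest score; ties resolve to the lowest index.
            row.append(max(range(num_classes), key=scores.__getitem__))
        mask.append(row)
    return mask


def decode_detection(score, box, width, height, low=0.1, high=0.99):
    """Return (label, (xmin, ymin, xmax, ymax)) in pixels, or None if unsure.

    score is the classifier output in [0, 1]; box holds normalized
    (xmin, ymin, xmax, ymax) values that are scaled to the frame size.
    """
    if low <= score <= high:
        return None  # not confident enough to draw a box
    label = INDEX_TO_CLASS[int(score > 0.5)]
    xmin, ymin, xmax, ymax = box
    return label, (int(xmin * width), int(ymin * height),
                   int(xmax * width), int(ymax * height))
```

The resulting mask is still at the model's resolution; it would need resizing to the original frame size, which the onnxruntime script does with `cv2.resize`.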

# Basic config

[base]

name = "oxford_post" # The FlowUnit name

device = "cpu" # The flowunit runs on cpu

version = "1.0.0" # The version of the flowunit

type = "python" # Fixed value, do not change

description = "description" # The description of the flowunit

entry = "oxford_post@oxford_postFlowUnit" # Python flowunit entry function

group_type = "Generic" # flowunit group attribution, change as Input/Output/Image/Generic ...

# Flowunit Type

stream = false # Whether the flowunit is a stream flowunit

condition = false # Whether the flowunit is a condition flowunit

collapse = false # Whether the flowunit is a collapse flowunit

collapse_all = false # Whether the flowunit will collapse all the data

expand = false # Whether the flowunit is a expand flowunit

# The default Flowunit config

[config]

mask_h = 224

mask_w = 224

# Input ports description

[input]

[input.input1] # Input port number, the format is input.input[N]

name = "in_feat1" # Input port name