How to Call the C++ Version of the ByteTrack Tracking Algorithm from Python

HouYanSong, posted on 2025/11/08 17:48:23
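[Abstract] This article implements a Python plugin for the ByteTrack-TensorRT tracking algorithm and extends the original algorithm so that each tracked target also carries its class information. On a Jetson Orin Nano (8GB), real-time YOLOv5 detection plus tracking runs at more than 80 FPS, which is fast enough to track quickly moving targets.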

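This project provides a Python plugin based on ByteTrack-TensorRT and extends the original algorithm so that each tracked target also carries its class information. Built on top of the earlier YOLOv5 plugin, it reaches real-time detection and tracking at about 83 FPS on a Jetson Orin Nano.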

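  • High performance: optimized with TensorRT to make full use of hardware acceleration
  • 📦 Works out of the box: the build process is simple, so a tracking application can be deployed quickly
  • 🐍 Python friendly: Pybind11 exposes a concise Python interface
  • 📱 Tuned for edge devices: adapted specifically for Jetson edge devices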

Build plugin

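First install the required libraries, then clone the repository and build the project. Note that the project requires JetPack 5.x to run correctly: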

sudo apt update
sudo apt install ffmpeg
sudo apt install pybind11-dev
sudo apt install libeigen3-dev
git clone https://github.com/HouYanSong/bytetrack_pybind11.git
cd bytetrack_pybind11
pip install pybind11
rm -fr build
cmake -S . -B build
cmake --build build
[ 12%] Building CXX object CMakeFiles/bytetrack.dir/bytetrack/src/BYTETracker.cpp.o
[ 25%] Building CXX object CMakeFiles/bytetrack.dir/bytetrack/src/STrack.cpp.o
[ 37%] Building CXX object CMakeFiles/bytetrack.dir/bytetrack/src/kalmanFilter.cpp.o
[ 50%] Building CXX object CMakeFiles/bytetrack.dir/bytetrack/src/lapjv.cpp.o
[ 62%] Building CXX object CMakeFiles/bytetrack.dir/bytetrack/src/utils.cpp.o
[ 75%] Linking CXX shared library libbytetrack.so
[ 75%] Built target bytetrack
[ 87%] Building CXX object CMakeFiles/bytetrack_trt.dir/bytetrack_trt.cpp.o
[100%] Linking CXX shared module bytetrack_trt.cpython-38-aarch64-linux-gnu.so
[100%] Built target bytetrack_trt
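
The last link step above produces `bytetrack_trt.cpython-38-aarch64-linux-gnu.so`. The tag in that file name indicates a build for CPython 3.8 on aarch64 Linux, so the module can only be imported by a matching interpreter. The following is a minimal sketch of a pre-flight check; the helper name `matches_interpreter` is hypothetical and not part of the project.

```python
import sysconfig
from pathlib import Path


def matches_interpreter(module_path, ext_suffix=None):
    """Return True if a compiled extension's file name ends with the
    extension suffix of the interpreter that is going to import it.

    ext_suffix defaults to the running interpreter's EXT_SUFFIX, for example
    ".cpython-38-aarch64-linux-gnu.so" for CPython 3.8 on aarch64 Linux.
    """
    if ext_suffix is None:
        ext_suffix = sysconfig.get_config_var("EXT_SUFFIX")
    if not ext_suffix:
        # The interpreter does not report a suffix; nothing to compare against.
        return False
    return Path(module_path).name.endswith(ext_suffix)


built = "build/bytetrack_trt.cpython-38-aarch64-linux-gnu.so"
# Explicit suffixes are passed so the result does not depend on this machine.
print(matches_interpreter(built, ".cpython-38-aarch64-linux-gnu.so"))  # True
print(matches_interpreter(built, ".cpython-310-x86_64-linux-gnu.so"))  # False
```

Calling `matches_interpreter(built)` without a suffix compares against the interpreter that runs the check. If it returns False, rebuilding with the intended Python active is one likely remedy.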

Run demo

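We provide a simple Python example. Importing the Python shared library built from the C++ code is all it takes to call the ByteTrack tracking algorithm, which returns each target's position, track ID, and class information.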

import cv2
import time
import ctypes

# Load the shared libraries that the extension modules depend on
ctypes.CDLL("./yolov5_trt_plugin/libyolo_plugin.so", mode=ctypes.RTLD_GLOBAL)
ctypes.CDLL("./yolov5_trt_plugin/libyolo_utils.so", mode=ctypes.RTLD_GLOBAL)
ctypes.CDLL("./build/libbytetrack.so", mode=ctypes.RTLD_GLOBAL)

# Import the YOLOv5 detector and the ByteTrack tracker
from yolov5_trt_plugin import yolov5_trt
from build import bytetrack_trt

def draw_image(image, detections, tracks, fps):
    for track in tracks:
        x, y, w, h = track.tlwh
        track_id = track.track_id
        class_id = track.label
        x1, y1, x2, y2 = int(x), int(y), int(x+w), int(y+h)
        cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.putText(image, f"C:{class_id} T:{track_id}", (x1, y1 - 10), 
                   cv2.FONT_HERSHEY_PLAIN, 1.2, (0, 0, 255), 2)
    
    cv2.putText(image, f"FPS: {fps:.2f}", (10, 30), 
               cv2.FONT_HERSHEY_PLAIN, 1.5, (0, 0, 255), 2)
    
    return image

def main(input_path, output_path):
    cap = cv2.VideoCapture(input_path)
    fps_value = int(cap.get(cv2.CAP_PROP_FPS))
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    writer = cv2.VideoWriter(output_path, cv2.VideoWriter_fourcc(*'MJPG'), fps_value, (width, height))    

    detector = yolov5_trt.YOLOv5Detector("./yolov5_trt_plugin/yolov5s.engine", width, height)
    tracker = bytetrack_trt.BYTETracker(frame_rate = fps_value, track_buffer = 30)
    
    fps_list = []
    frame_count = 0
    total_time = 0.0

    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
            
        start_time = time.time()
        
        # Object detection
        detections = detector.detect(input_image=frame, 
                                    input_w=640, input_h=640, 
                                    conf_thresh=0.45, nms_thresh=0.55)
        objects = []
        for det in detections:
            x1, y1, x2, y2 = det['bbox']
            rect = bytetrack_trt.RectFloat(x1, y1, x2-x1, y2-y1)  # x, y, width, height
            obj = bytetrack_trt.Object()
            obj.rect = rect
            obj.label = det['class_id']
            obj.prob = det['confidence']
            objects.append(obj)
            
        # Object tracking
        tracks = tracker.update(objects)
        
        process_time = time.time() - start_time
        current_fps = 1.0 / process_time if process_time > 0 else 0
        
        frame_count += 1
        total_time += process_time
        fps_list.append(current_fps)

        # Draw the results on the frame
        image = draw_image(frame, detections, tracks, current_fps)
        writer.write(image)

    cap.release()
    writer.release()
    
    if frame_count > 0:
        avg_fps = frame_count / total_time if total_time > 0 else 0
        print(f"Processed {frame_count} frames")
        print(f"Average FPS: {avg_fps:.2f}")
        print(f"Min FPS: {min(fps_list):.2f}")
        print(f"Max FPS: {max(fps_list):.2f}")


if __name__ == "__main__":
    input_video = "./media/sample_720p.mp4"  
    output_video = "./result.avi"  
    main(input_video, output_video)
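
The demo above moves between two box formats: the detector reports `bbox` as corner coordinates (x1, y1, x2, y2), while the tracker takes and returns `tlwh` (top-left x, top-left y, width, height). The sketch below isolates that arithmetic in plain Python so it can be checked without the compiled modules. The helper names `xyxy_to_tlwh` and `tlwh_to_xyxy` are hypothetical and used only for illustration.

```python
def xyxy_to_tlwh(bbox):
    """Convert corner format (x1, y1, x2, y2) to (x, y, width, height)."""
    x1, y1, x2, y2 = bbox
    return float(x1), float(y1), float(x2 - x1), float(y2 - y1)


def tlwh_to_xyxy(tlwh):
    """Convert (x, y, width, height) back to integer corner coordinates.

    int() truncates toward zero, as in the demo's drawing code.
    """
    x, y, w, h = tlwh
    return int(x), int(y), int(x + w), int(y + h)


# A detection whose corners are (100, 50) and (300, 250).
tlwh = xyxy_to_tlwh((100, 50, 300, 250))
print(tlwh)                # (100.0, 50.0, 200.0, 200.0)
print(tlwh_to_xyxy(tlwh))  # (100, 50, 300, 250)
```

In the demo, the first conversion happens when each `RectFloat` is built and the second happens inside `draw_image`.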

To run the demo, execute the yolov5_bytetrack.py script in a terminal:

python yolov5_bytetrack.py
[11/07/2025-17:13:10] [I] [TRT] Loaded engine size: 8 MiB
Deserialize yoloLayer plugin: YoloLayer
[11/07/2025-17:13:12] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +536, GPU +702, now: CPU 841, GPU 3927 (MiB)
[11/07/2025-17:13:12] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +83, GPU +94, now: CPU 924, GPU 4021 (MiB)
[11/07/2025-17:13:12] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +7, now: CPU 0, GPU 7 (MiB)
[11/07/2025-17:13:12] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 924, GPU 4021 (MiB)
[11/07/2025-17:13:12] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +1, now: CPU 924, GPU 4022 (MiB)
[11/07/2025-17:13:12] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +11, now: CPU 0, GPU 18 (MiB)
Init ByteTrack!
Processed 1442 frames
Average FPS: 83.78
Min FPS: 68.31
Max FPS: 113.35
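
A note on reading these figures: in the demo, the timer starts after `cap.read()` and stops before drawing and writing, so the FPS values cover detection and tracking only, not video decoding or encoding. The average is also computed as total frames divided by total time, which is not the same as the mean of the per-frame FPS values. The sketch below reproduces that bookkeeping on made-up timings. The function name `summarize_fps` and the sample numbers are hypothetical.

```python
def summarize_fps(frame_times):
    """Summarize per-frame processing times (in seconds, all > 0).

    "average" is frames / total time, as in the demo. "mean_of_rates" is the
    arithmetic mean of per-frame FPS, shown only for comparison.
    """
    rates = [1.0 / t for t in frame_times]
    return {
        "frames": len(frame_times),
        "average": len(frame_times) / sum(frame_times),
        "min": min(rates),
        "max": max(rates),
        "mean_of_rates": sum(rates) / len(rates),
    }


# Three frames taking 10 ms, 20 ms, and 10 ms.
stats = summarize_fps([0.010, 0.020, 0.010])
print(f"Average FPS: {stats['average']:.2f}")        # Average FPS: 75.00
print(f"Min FPS: {stats['min']:.2f}")                # Min FPS: 50.00
print(f"Max FPS: {stats['max']:.2f}")                # Max FPS: 100.00
print(f"Mean of rates: {stats['mean_of_rates']:.2f}")  # Mean of rates: 83.33
```

The frames-over-time average weights slow frames by how long they actually took, so it is the better estimate of sustained throughput.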

Concluding Remarks

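This article implemented a Python plugin for the ByteTrack-TensorRT tracking algorithm and extended the original algorithm so that each tracked target also carries its class information. On a Jetson Orin Nano (8GB), real-time YOLOv5 detection plus tracking runs at more than 80 FPS, which is fast enough to track quickly moving targets.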

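[Disclaimer] This content comes from a blogger in the Huawei Cloud Developer Community and does not represent the views or positions of Huawei Cloud or the Huawei Cloud Developer Community. Reposts must credit the source (Huawei Cloud Community), the article link, the author, and other basic information; otherwise the author and the community reserve the right to pursue liability. If you find content in this community that appears to be plagiarized, you are welcome to report it by email with supporting evidence; once verified, the community will promptly remove the infringing content. Report email: cloudbbs@huaweicloud.com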