Error Code 1: Cuda Runtime (invalid resource handle)
Problem description
Loading several TensorRT models at the same time produces the following errors:
[12/06/2022-14:28:23] [TRT] [I] Loaded engine size: 5 MiB
[12/06/2022-14:28:23] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +3, now: CPU 0, GPU 3 (MiB)
[12/06/2022-14:28:23] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +5, now: CPU 0, GPU 8 (MiB)
[12/06/2022-14:28:23] [TRT] [W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[12/06/2022-14:28:25] [TRT] [W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[12/06/2022-14:28:25] [TRT] [W] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
[12/06/2022-14:28:25] [TRT] [W] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
[12/06/2022-14:28:27] [TRT] [E] 1: [reformat.cpp::genericReformat::executeCutensor::388] Error Code 1: CuTensor (Internal cuTensor permutate execute failed)
[12/06/2022-14:28:27] [TRT] [E] 1: [checkMacros.cpp::nvinfer1::catchCudaError::202] Error Code 1: Cuda Runtime (invalid resource handle)
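Cause analysis
This error typically occurs when TensorRT is used from multiple threads, or when the engine is created on the main thread and a callback thread then uses that engine for inference. The likely mechanism is that a CUDA context is current per thread: handles such as streams and events created under one context are not valid on a thread where a different context is current, so the runtime reports an invalid resource handle.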
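The per-thread behaviour described above can be illustrated with a small pure-Python toy model. It is not CUDA and calls no PyCUDA or TensorRT API. The names push, pop, current and current_in_worker are hypothetical and only mimic the idea that each thread keeps its own stack of current contexts.

```python
import threading

# Toy stand-in for a per-thread stack of current contexts (not real CUDA).
_per_thread = threading.local()


def _stack():
    # threading.local gives every thread its own, initially empty, namespace.
    if not hasattr(_per_thread, "stack"):
        _per_thread.stack = []
    return _per_thread.stack


def push(ctx):
    """Make ctx current on the calling thread only."""
    _stack().append(ctx)


def pop():
    """Remove and return the calling thread's current context."""
    return _stack().pop()


def current():
    """Return the calling thread's current context, or None if it has none."""
    stack = _stack()
    return stack[-1] if stack else None


def current_in_worker(setup=None):
    """Return what a fresh thread sees as current, optionally after setup()."""
    seen = []

    def work():
        if setup is not None:
            setup()
        seen.append(current())

    t = threading.Thread(target=work)
    t.start()
    t.join()
    return seen[0]
```

In this model, a context pushed on the main thread is invisible to a new thread until that thread pushes it itself. That is the role the push() call plays in the solution below.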
Solution
Import the PyCUDA driver module and initialize it:
import pycuda.driver as cuda0
cuda0.init()
In the class initializer, create a context and keep a reference to it:
self.cfx = cuda0.Device(0).make_context()
In the inference code, call self.cfx.push() before running inference and self.cfx.pop() once inference has finished:
self.cfx.push()
# inference code
self.context.execute_v2(list(self.binding_addrs.values()))
self.cfx.pop()
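If the inference call raises an exception, the pop() in the snippet above never runs and the context stays on that thread's stack. Below is a minimal sketch of one way to guard against that. It assumes only that the context object exposes push() and pop(), as the PyCUDA context stored in self.cfx does. The helper name activated is hypothetical and is not part of PyCUDA or TensorRT.

```python
from contextlib import contextmanager


@contextmanager
def activated(ctx):
    """Keep ctx current on the calling thread for the duration of a with-block."""
    ctx.push()
    try:
        yield ctx
    finally:
        # Runs even if the body raises, so push and pop stay balanced.
        ctx.pop()


# Usage inside the inference method (names taken from the snippet above):
#     with activated(self.cfx):
#         self.context.execute_v2(list(self.binding_addrs.values()))
```

Note that make_context() also makes the new context current on the thread that calls it. The thread that created the context will therefore usually need one matching pop() of its own when the object is torn down.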
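[Notice] This content comes from a blogger in the Huawei Cloud developer community and does not represent the views or positions of Huawei Cloud or the Huawei Cloud developer community. Reposts must state the source (Huawei Cloud community), the article link, and the author; otherwise the author and the community reserve the right to pursue liability. Suspected plagiarism in this community can be reported, with supporting evidence, to cloudbbs@huaweicloud.com. Content confirmed to be infringing will be removed immediately.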