Training a detection model on your own dataset with mmdetection
Bug0:
The size of tensor a (209) must match the size of tensor b (21824) at non-singleton dimension 0
Fix: First, I confirmed that training on the COCO dataset ran without problems.
Then I rewrote my config file based on the config file used for the COCO training run, and the problem was solved.
Bug1:
aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [3,0,0], thread:
Fix: Check the log. Bug2 or Bug3 appears first, and Bug1 only shows up after that, so fix Bug2 or Bug3 first.
Bug2:
loss_cls: nan, loss_bbox: nan, loss: nan
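Fix: https://github.com/open-mmlab/mmdetection/issues/3013
Answer from alexchungio, things to check:
in the json file generated for your own dataset, the box coordinates must lie inside the image,
lr should not be too large,
add grad_clip to the config file: optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2)),
it may also be an mmdet version issue.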
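The first item in the list above (boxes must lie inside the image) can be checked with a few lines of plain Python. This is only a minimal sketch, assuming the annotation file follows the standard COCO layout (`images` with `width`/`height`, `annotations` with `bbox` as `[x, y, width, height]`). The function names and the file path in the comment are hypothetical.

```python
import json

def find_bad_boxes(coco, eps=1e-6):
    """Return ids of annotations whose bbox is degenerate or outside its image."""
    sizes = {img["id"]: (img["width"], img["height"]) for img in coco["images"]}
    bad = []
    for ann in coco["annotations"]:
        x, y, w, h = ann["bbox"]  # COCO format: [x_min, y_min, width, height]
        img_w, img_h = sizes[ann["image_id"]]
        inside = (x >= -eps and y >= -eps
                  and x + w <= img_w + eps and y + h <= img_h + eps)
        if w <= 0 or h <= 0 or not inside:
            bad.append(ann["id"])
    return bad

def find_bad_boxes_in_file(path):
    """Same check, reading the COCO json from disk."""
    with open(path) as f:
        return find_bad_boxes(json.load(f))

# Tiny in-memory example (made-up data, for illustration only)
sample = {
    "images": [{"id": 0, "width": 100, "height": 80}],
    "annotations": [
        {"id": 7000, "image_id": 0, "bbox": [10, 10, 20, 20]},
        {"id": 7001, "image_id": 0, "bbox": [90, 10, 20, 20]},  # sticks out on the right
        {"id": 7002, "image_id": 0, "bbox": [5, 5, 0, 10]},     # zero width
    ],
}
print(find_bad_boxes(sample))  # [7001, 7002]

# On a real file (hypothetical path):
# print(find_bad_boxes_in_file("annotations/train.json"))
```

Any id it reports is worth inspecting before touching lr or grad_clip, since a single bad box can be enough to produce nan losses.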
Bug3:
loss_cls: 0.0000, loss_bbox: 0.0000, loss: 0.0000
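Fix: https://github.com/open-mmlab/mmdetection/issues/3357
(1) Answer from sunnyisabaster
image_id and box_id should not start from the same number; change them to the following:
image_id = -1
box_id = 7000
(2) Answer from aimhabo
The class_name labels used in the code that generates the COCO json file must match the classes label names used elsewhere in the code: the training config file, coco.py, and the label/class file used for testing (eval.py).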
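Both answers can be turned into a quick sanity check on the generated json. The sketch below assumes the standard COCO layout and a `classes` tuple copied by hand from your config. The helper names are hypothetical, and the id-overlap check only reports what answer (1) advises avoiding; it does not confirm that an overlap is the actual cause.

```python
def compare_class_names(coco, classes):
    """Return (names only in the json, names only in the config)."""
    json_names = {c["name"] for c in coco["categories"]}
    cfg_names = set(classes)
    return sorted(json_names - cfg_names), sorted(cfg_names - json_names)

def overlapping_ids(coco):
    """Return ids that are used both as an image id and as an annotation (box) id."""
    image_ids = {img["id"] for img in coco["images"]}
    box_ids = {ann["id"] for ann in coco["annotations"]}
    return sorted(image_ids & box_ids)

# Made-up data for illustration only
sample = {
    "images": [{"id": 0}, {"id": 1}],
    "annotations": [{"id": 1, "image_id": 0}, {"id": 7000, "image_id": 1}],
    "categories": [{"id": 1, "name": "cat"}, {"id": 2, "name": "dog"}],
}
classes = ("cat", "Dog")  # note the capital D: it does not match "dog"

print(compare_class_names(sample, classes))  # (['dog'], ['Dog'])
print(overlapping_ids(sample))               # [1]
```

The name comparison is case-sensitive on purpose, because a label that differs only in case or trailing whitespace is easy to miss by eye.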
Bug4:
mmdet - ERROR - The testing results of the whole dataset is empty.
Fix: https://github.com/open-mmlab/mmdetection/issues/4092
The cause was that I had not set the correct number of classes in the config file.
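Since num_classes can sit at different depths of the model config depending on the detector, one way to catch this is to walk the config dict and list every num_classes entry next to the number of classes in your dataset. This is a rough sketch on a plain nested dict, not an mmdetection API. The `model` dict below is a made-up, heavily trimmed example. Whether num_classes should equal len(classes) or len(classes) + 1 depends on the mmdet version (older 1.x releases counted the background as an extra class, as far as I know), so treat the comparison as an assumption.

```python
def find_num_classes(cfg, path="model"):
    """Recursively collect (path, value) for every 'num_classes' key in a nested config."""
    found = []
    if isinstance(cfg, dict):
        for key, value in cfg.items():
            child = f"{path}.{key}"
            if key == "num_classes":
                found.append((child, value))
            else:
                found.extend(find_num_classes(value, child))
    elif isinstance(cfg, (list, tuple)):
        for i, item in enumerate(cfg):
            found.extend(find_num_classes(item, f"{path}[{i}]"))
    return found

# Made-up example: a config still carrying the COCO default of 80 classes
classes = ("cat", "dog", "bird")
model = dict(
    type="FasterRCNN",
    roi_head=dict(bbox_head=dict(type="Shared2FCBBoxHead", num_classes=80)),
)

# Assumes num_classes should equal len(classes); adjust for your mmdet version
mismatches = [(p, v) for p, v in find_num_classes(model) if v != len(classes)]
print(mismatches)  # [('model.roi_head.bbox_head.num_classes', 80)]
```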
Ref:
[1] A blog post that correctly gets training on a custom dataset running:
https://bbs.huaweicloud.com/blogs/198417