ClickHouse Kafka Table Engine Troubleshooting (Part 1)
Scenario and problem: after performing a rolling restart, an MRS ClickHouse customer found a large business backlog under "Cluster Queue Size" on the Manager UI. The backend logs showed: "Too many parts (303). Parts cleaning are processing significantly slower than inserts…"
Customer table details:
(1) The table named in the error, test.dwd_c_vehicle_upload_real_detail, partitions on the vin String column by hash bucket: "PARTITION BY xxHash32(vin) % 100".
(2) The Kafka engine table configures only the mandatory parameters:
SETTINGS kafka_broker_list = 'xx.xx.xx.xx:9092,xx.xx.xx.xx:9092,xx.xx.xx.xx:9092',
kafka_topic_list = 'pro_dwd_c_vehicle_upload_real_detail',
kafka_group_name = 'clickhouse_pro_new',
kafka_format = 'JSONEachRow',
kafka_num_consumers = 1
(3) The customer's insert frequency is unknown; each insert carries roughly a few hundred rows.
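Putting the reported pieces together, the setup can be sketched roughly as follows. Only the SETTINGS and the PARTITION BY clause come from the case; the column list (beyond vin), the ORDER BY key, and the Kafka/materialized-view table names are illustrative assumptions.

```sql
-- Hypothetical reconstruction of the customer's pipeline; columns other
-- than vin are assumptions for illustration.
CREATE TABLE test.dwd_c_vehicle_upload_real_detail_kafka
(
    vin String,
    upload_time DateTime,
    payload String
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'xx.xx.xx.xx:9092,xx.xx.xx.xx:9092,xx.xx.xx.xx:9092',
         kafka_topic_list = 'pro_dwd_c_vehicle_upload_real_detail',
         kafka_group_name = 'clickhouse_pro_new',
         kafka_format = 'JSONEachRow',
         kafka_num_consumers = 1;

-- Target MergeTree table, partitioned by a hash bucket of vin as reported.
CREATE TABLE test.dwd_c_vehicle_upload_real_detail
(
    vin String,
    upload_time DateTime,
    payload String
)
ENGINE = MergeTree
PARTITION BY xxHash32(vin) % 100
ORDER BY (vin, upload_time);

-- A materialized view streams rows from the Kafka table into the target.
CREATE MATERIALIZED VIEW test.dwd_c_vehicle_upload_real_detail_mv
TO test.dwd_c_vehicle_upload_real_detail
AS SELECT * FROM test.dwd_c_vehicle_upload_real_detail_kafka;
```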
Locating the relevant source code and parameters from the error message:
(1) ClickHouse_Kernel-master/src/Storages/MergeTree/MergeTreeData.cpp
size_t parts_count_in_partition = getMaxPartsCountForPartition();
...
if (parts_count_in_partition >= settings->parts_to_throw_insert)
{
    ProfileEvents::increment(ProfileEvents::RejectedInserts);
    throw Exception(
        ErrorCodes::TOO_MANY_PARTS,
        "Too many parts ({}). Parts cleaning are processing significantly slower than inserts",
        parts_count_in_partition);
}
Per the official documentation, parts_to_throw_insert defaults to 300.
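The effective value can also be confirmed on a live cluster, since the threshold is exposed through system.merge_tree_settings:

```sql
-- Inspect the current parts thresholds (parts_to_delay_insert slows
-- inserts down before parts_to_throw_insert rejects them outright).
SELECT name, value, description
FROM system.merge_tree_settings
WHERE name IN ('parts_to_delay_insert', 'parts_to_throw_insert');
```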
(2) Among the other Kafka engine parameters, the one with the biggest impact on ingest performance here is 'kafka_max_block_size', which defaults to 65536 rows (64K).
Conclusion from the above: because the table uses a hash value as its partition key, it has a comparatively large number of partitions (up to 100). Each consumed block is split by partition on insert, and with 'kafka_max_block_size' left at its default of 65536 the flushed blocks are small and frequent, so every flush can create up to 100 new parts. New parts therefore accumulate faster than background merges can combine them, the active part count per partition quickly exceeds the parts_to_throw_insert default of 300, and the exception is thrown.
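Whether a table is approaching the threshold can be verified directly from system.parts by counting active parts per partition:

```sql
-- Partitions closest to the parts_to_throw_insert limit for this table.
SELECT partition, count() AS active_parts
FROM system.parts
WHERE database = 'test'
  AND table = 'dwd_c_vehicle_upload_real_detail'
  AND active
GROUP BY partition
ORDER BY active_parts DESC
LIMIT 10;
```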
Recommendation to the customer: tune the Kafka engine table according to the table layout, insert frequency, and rows per insert, and adjust the corresponding ClickHouse settings where needed. For example, the parts_to_throw_insert default can be raised, and 'kafka_max_block_size' can be increased from its default; the community recommends setting 'kafka_max_block_size' to 512K–1M rows for the best single-table performance.
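A hedged sketch of both knobs is below. The exact values (600 parts, 1048576 rows) are examples to be tuned against the actual ingest rate, not prescriptions from the case; and since Kafka engine settings generally cannot be changed with ALTER, the Kafka table is dropped and recreated (the column list is again an assumption).

```sql
-- Raise the per-partition active-parts threshold on the target table.
ALTER TABLE test.dwd_c_vehicle_upload_real_detail
    MODIFY SETTING parts_to_throw_insert = 600;

-- Recreate the Kafka engine table with a larger consume block so each
-- flush produces fewer, larger parts; 1048576 rows (1M) sits at the top
-- of the community-recommended 512K-1M range.
DROP TABLE IF EXISTS test.dwd_c_vehicle_upload_real_detail_kafka;
CREATE TABLE test.dwd_c_vehicle_upload_real_detail_kafka
(
    vin String,
    upload_time DateTime,
    payload String
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'xx.xx.xx.xx:9092,xx.xx.xx.xx:9092,xx.xx.xx.xx:9092',
         kafka_topic_list = 'pro_dwd_c_vehicle_upload_real_detail',
         kafka_group_name = 'clickhouse_pro_new',
         kafka_format = 'JSONEachRow',
         kafka_num_consumers = 1,
         kafka_max_block_size = 1048576;
```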
References:
https://github.com/ClickHouse/ClickHouse/issues/3174
https://github.com/ClickHouse/ClickHouse/issues/9053
https://altinity.com/blog/clickhouse-kafka-engine-faq