A Hands-On Guide to Deploying the ClickHouse Service on an MRS Cluster
[Abstract] Following the previous post on building and running a single-node ClickHouse on Huawei Cloud, this post shows how to configure and deploy the ClickHouse service on an MRS cluster, i.e. a proper ClickHouse cluster :)
1 Introduction
Following the previous post on building and running a single-node ClickHouse on Huawei Cloud, this post shows how to configure and deploy the ClickHouse service on an MRS cluster, i.e. a proper ClickHouse cluster :)
2 Preparation
Since we are deploying on an MRS cluster, you first need to purchase one. This example uses an "MRS 2.1.0" streaming cluster; note the arrows in the screenshot below:
After the MRS cluster is purchased, remember to configure "Manage Security Group Rules" and add the IP address of the build machine from the earlier post to the inbound rules, so that the ClickHouse binary can be copied from the build machine to the MRS cluster via scp. To save some cost I chose a non-HA cluster, so there is 1 master node and 3 core nodes:
I plan to deploy the ClickHouse cluster on the 3 core nodes.
3 Deploying the Cluster
3.1 Upload Files
Upload the clickhouse binary to the three core nodes above; only the clickhouse file itself is needed.
Upload config.xml, metrika.xml, and users.xml to the same three core nodes:
[root@node-str-coreCWtk0002 ~]# ls -lrt
total 3170348
-rw-r--r--. 1 root root 79 Nov 8 2019 env_file
-rwx------. 1 root root 3246411984 Oct 21 17:54 clickhouse
-rw-r--r--. 1 root root 1483 Oct 21 18:36 users.xml
-rw-r--r--. 1 root root 1603 Oct 21 18:39 metrika.xml
-rw-r--r--. 1 root root 3396 Oct 21 18:45 config.xml
[root@node-str-coreCWtk0002 ~]#
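If you want to script the upload instead of copying the files by hand, a minimal sketch run from the build machine could look like this (the file locations and the core-node IPs are just the ones used in this post; adjust them to your environment):

# Run on the build machine; pushes the binary and the three config files to each core node.
for ip in 192.168.0.17 192.168.0.57 192.168.0.220; do
    scp clickhouse config.xml metrika.xml users.xml root@${ip}:/root/
done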
3.2 Obtain ZooKeeper Information
There are two ways to obtain the ZooKeeper information:
Read the configuration file:
[root@node-master1azib ~]# cat /opt/client/ZooKeeper/zookeeper/conf/zoo.cfg
Or run the following on every node of the MRS cluster:
[root@node-str-coreCWtk0002 ~]# netstat -an | grep 2181
tcp 0 0 192.168.0.220:2181 0.0.0.0:* LISTEN
A node listening on port 2181 is running a ZooKeeper instance; collect these nodes and configure them under zookeeper-servers in metrika.xml.
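If you go with the first approach, the ensemble members can usually be read straight out of zoo.cfg; a small sketch, assuming the standard server.N entries are present in the MRS client configuration shown above:

# Each matching line looks like server.1=192.168.0.57:2888:3888 -- the host part goes into metrika.xml.
grep '^server\.' /opt/client/ZooKeeper/zookeeper/conf/zoo.cfg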
3.3 Modify the Configuration Files
The contents of the three configuration files are listed below; you only need to review and change the IP addresses in them.
config.xml:
<?xml version="1.0"?>
<yandex>
    <logger>
        <level>information</level>
        <log>/var/log/Bigdata/clickhouse/clickhouse-server.log</log>
        <errorlog>/var/log/Bigdata/clickhouse/clickhouse-server.err.log</errorlog>
        <size>1000M</size>
        <count>10</count>
    </logger>
    <http_port>8123</http_port>
    <tcp_port>9000</tcp_port>
    <interserver_http_port>9009</interserver_http_port>
    <interserver_http_host>192.168.0.17</interserver_http_host>
    <listen_host>192.168.0.17</listen_host>
    <max_connections>4096</max_connections>
    <keep_alive_timeout>600</keep_alive_timeout>
    <max_concurrent_queries>150</max_concurrent_queries>
    <uncompressed_cache_size>8589934592</uncompressed_cache_size>
    <mark_cache_size>5368709120</mark_cache_size>
    <path>/srv/BigData/streaming/data1/clickhouse/</path>
    <tmp_path>/srv/BigData/streaming/data1/clickhouse/tmp/</tmp_path>
    <users_config>/root/users.xml</users_config>
    <default_profile>default</default_profile>
    <default_database>default</default_database>
    <mlock_executable>false</mlock_executable>
    <include_from>/root/metrika.xml</include_from>
    <remote_servers incl="clickhouse_remote_servers" />
    <zookeeper incl="zookeeper-servers" optional="true" />
    <macros incl="macros" optional="true" />
    <builtin_dictionaries_reload_interval>3600</builtin_dictionaries_reload_interval>
    <max_session_timeout>3600</max_session_timeout>
    <default_session_timeout>60</default_session_timeout>
    <query_log>
        <database>system</database>
        <table>query_log</table>
        <partition_by>toYYYYMM(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_log>
    <trace_log>
        <database>system</database>
        <table>trace_log</table>
        <partition_by>toYYYYMM(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </trace_log>
    <query_thread_log>
        <database>system</database>
        <table>query_thread_log</table>
        <partition_by>toYYYYMM(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_thread_log>
    <dictionaries_config>*_dictionary.xml</dictionaries_config>
    <compression incl="clickhouse_compression"></compression>
    <distributed_ddl>
        <path>/clickhouse/task_queue/ddl</path>
    </distributed_ddl>
    <graphite_rollup_example>
        <pattern>
            <regexp>click_cost</regexp>
            <function>any</function>
            <retention>
                <age>0</age>
                <precision>3600</precision>
            </retention>
            <retention>
                <age>86400</age>
                <precision>60</precision>
            </retention>
        </pattern>
        <default>
            <function>max</function>
            <retention>
                <age>0</age>
                <precision>60</precision>
            </retention>
            <retention>
                <age>3600</age>
                <precision>300</precision>
            </retention>
            <retention>
                <age>86400</age>
                <precision>3600</precision>
            </retention>
        </default>
    </graphite_rollup_example>
    <format_schema_path>/srv/BigData/streaming/data1/clickhouse/format_schemas/</format_schema_path>
</yandex>
Only interserver_http_host and listen_host need to be changed, to the local node's IP address.
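Since the sample file uses 192.168.0.17 in both places, a quick way to adapt it on each node is a one-line substitution; a minimal sketch, assuming the first address reported by hostname -I is the service-plane IP:

# Run on each core node, in the directory holding config.xml.
LOCAL_IP=$(hostname -I | awk '{print $1}')
sed -i "s/192\.168\.0\.17/${LOCAL_IP}/g" config.xml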
metrika.xml:
<yandex>
    <clickhouse_remote_servers>
        <cluster1>
            <shard>
                <replica>
                    <host>192.168.0.17</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <replica>
                    <host>192.168.0.57</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <replica>
                    <host>192.168.0.220</host>
                    <port>9000</port>
                </replica>
            </shard>
        </cluster1>
        <cluster2>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>192.168.0.17</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>192.168.0.57</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>192.168.0.220</host>
                    <port>9000</port>
                </replica>
            </shard>
        </cluster2>
    </clickhouse_remote_servers>
    <macros>
        <cluster>cluster2</cluster>
        <shard>1</shard>
        <replica>192.168.0.17</replica>
    </macros>
    <zookeeper-servers>
        <node index="1">
            <host>192.168.0.57</host>
            <port>2181</port>
        </node>
        <node index="2">
            <host>192.168.0.134</host>
            <port>2181</port>
        </node>
        <node index="3">
            <host>192.168.0.220</host>
            <port>2181</port>
        </node>
    </zookeeper-servers>
    <allow_ip_list>
        <ip>::/0</ip>
    </allow_ip_list>
    <clickhouse_compression>
        <case>
            <min_part_size>10000000000</min_part_size>
            <min_part_size_ratio>0.01</min_part_size_ratio>
            <method>lz4</method>
        </case>
    </clickhouse_compression>
</yandex>
Two clusters are defined here: cluster1 is 3 shards with 1 replica each, and cluster2 is 1 shard with 3 replicas; adjust this as you like. In the macros section, change the replica value to the local node's IP address, and fill zookeeper-servers with the ZooKeeper addresses found earlier.
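The same substitution trick works for the macros block; a minimal sketch (the pattern only matches the single-line <replica> entry inside <macros>, since the replicas under clickhouse_remote_servers span several lines):

# Run on each core node; point the replica macro at this node's own IP.
LOCAL_IP=$(hostname -I | awk '{print $1}')
sed -i "s|<replica>192\.168\.0\.17</replica>|<replica>${LOCAL_IP}</replica>|" metrika.xml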
users.xml:
<?xml version="1.0"?>
<yandex>
    <profiles>
        <default>
            <log_queries>1</log_queries>
            <max_memory_usage>66000000000</max_memory_usage>
            <max_threads>14</max_threads>
            <background_pool_size>32</background_pool_size>
            <max_pipeline_depth>2000</max_pipeline_depth>
            <load_balancing>random</load_balancing>
        </default>
        <readonly>
            <readonly>1</readonly>
        </readonly>
        <web>
            <max_execution_time>300</max_execution_time>
            <max_columns_to_read>30</max_columns_to_read>
            <max_rows_to_read>10000000000</max_rows_to_read>
            <max_rows_to_group_by>10000000000</max_rows_to_group_by>
            <max_rows_to_sort>10000000000</max_rows_to_sort>
            <max_rows_in_distinct>10000000000</max_rows_in_distinct>
            <max_rows_to_transfer>10000000000</max_rows_to_transfer>
            <readonly>1</readonly>
        </web>
    </profiles>
    <users>
        <default>
            <password></password>
            <networks>
                <ip>::/0</ip>
            </networks>
            <profile>default</profile>
            <quota>default</quota>
        </default>
    </users>
    <quotas>
        <default>
            <interval>
                <duration>0</duration>
                <queries>0</queries>
                <errors>0</errors>
                <result_rows>0</result_rows>
                <read_rows>0</read_rows>
                <execution_time>0</execution_time>
            </interval>
        </default>
    </quotas>
</yandex>
This file does not need to be changed.
3.4 Start the Server
Run the following on each of the three core nodes:
[root@node-str-coreCWtk0002 ~]# nohup ./clickhouse server --config-file=config.xml &
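Before starting the client, it can be worth checking that each instance actually came up; a small sketch using the log path and ports from the config.xml above:

# Run on each core node after starting the server.
tail -n 20 /var/log/Bigdata/clickhouse/clickhouse-server.log
netstat -an | grep LISTEN | grep -E ':(8123|9000|9009) '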
3.5 Start the Client
[root@node-str-coreCWtk0002 ~]# ./clickhouse client --host=192.168.0.220
ClickHouse client version 20.9.4.1.
Connecting to 192.168.0.220:9000 as user default.
Connected to ClickHouse server version 20.9.4 revision 54439.
node-str-coreCWtk0002 :) show databases;
SHOW DATABASES
┌─name───────────────────────────┐
│ _temporary_and_external_tables │
│ default │
│ system │
└────────────────────────────────┘
3 rows in set. Elapsed: 0.001 sec.
node-str-coreCWtk0002 :) select * from system.clusters
SELECT *
FROM system.clusters
┌─cluster──┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name─────┬─host_address──┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─estimated_recovery_time─┐
│ cluster1 │ 1 │ 1 │ 1 │ 192.168.0.17 │ 192.168.0.17 │ 9000 │ 0 │ default │ │ 0 │ 0 │
│ cluster1 │ 2 │ 1 │ 1 │ 192.168.0.57 │ 192.168.0.57 │ 9000 │ 0 │ default │ │ 0 │ 0 │
│ cluster1 │ 3 │ 1 │ 1 │ 192.168.0.220 │ 192.168.0.220 │ 9000 │ 1 │ default │ │ 0 │ 0 │
│ cluster2 │ 1 │ 1 │ 1 │ 192.168.0.17 │ 192.168.0.17 │ 9000 │ 0 │ default │ │ 0 │ 0 │
│ cluster2 │ 1 │ 1 │ 2 │ 192.168.0.57 │ 192.168.0.57 │ 9000 │ 0 │ default │ │ 0 │ 0 │
│ cluster2 │ 1 │ 1 │ 3 │ 192.168.0.220 │ 192.168.0.220 │ 9000 │ 1 │ default │ │ 0 │ 0 │
└──────────┴───────────┴──────────────┴─────────────┴───────────────┴───────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────────────┘
6 rows in set. Elapsed: 0.002 sec.
node-str-coreCWtk0002 :)
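The same check can also be run non-interactively with the client's --query option, which is handy for quick scripting:

./clickhouse client --host=192.168.0.220 --query="select cluster, shard_num, replica_num, host_address from system.clusters"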
3.6 Test and Verify
[root@node-str-coreCWtk0002 ~]# ./clickhouse client --host=192.168.0.220 --multiline
ClickHouse client version 20.9.4.1.
Connecting to 192.168.0.220:9000 as user default.
Connected to ClickHouse server version 20.9.4 revision 54439.
node-str-coreCWtk0002 :) create table table2 on cluster cluster2 (
:-] `EventDate` DateTime,
:-] `id` UInt64
:-] ) engine = ReplicatedMergeTree('/clickhouse/{cluster}/{shard}/table2', '{replica}')
:-] partition by toYYYYMM(EventDate)
:-] order by id;
CREATE TABLE table2 ON CLUSTER cluster2
(
`EventDate` DateTime,
`id` UInt64
)
ENGINE = ReplicatedMergeTree('/clickhouse/{cluster}/{shard}/table2', '{replica}')
PARTITION BY toYYYYMM(EventDate)
ORDER BY id
┌─host──────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ 192.168.0.57 │ 9000 │ 0 │ │ 2 │ 0 │
│ 192.168.0.17 │ 9000 │ 0 │ │ 1 │ 0 │
│ 192.168.0.220 │ 9000 │ 0 │ │ 0 │ 0 │
└───────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
3 rows in set. Elapsed: 0.121 sec.
node-str-coreCWtk0002 :)
node-str-coreCWtk0002 :) create table table2_all on cluster cluster2 (
:-] `EventDate` DateTime,
:-] `id` UInt64
:-] ) engine = Distributed(cluster2, default, table2, rand());
CREATE TABLE table2_all ON CLUSTER cluster2
(
`EventDate` DateTime,
`id` UInt64
)
ENGINE = Distributed(cluster2, default, table2, rand())
┌─host──────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ 192.168.0.57 │ 9000 │ 0 │ │ 2 │ 0 │
│ 192.168.0.17 │ 9000 │ 0 │ │ 1 │ 0 │
│ 192.168.0.220 │ 9000 │ 0 │ │ 0 │ 0 │
└───────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
3 rows in set. Elapsed: 0.117 sec.
node-str-coreCWtk0002 :) show tables;
SHOW TABLES
┌─name───────┐
│ table2 │
│ table2_all │
└────────────┘
2 rows in set. Elapsed: 0.002 sec.
node-str-coreCWtk0002 :) INSERT INTO table2 VALUES ('2020-01-01 01:01:01', 1);
INSERT INTO table2 VALUES
Ok.
1 rows in set. Elapsed: 0.012 sec.
node-str-coreCWtk0002 :) INSERT INTO table2_all VALUES ('2020-02-02 02:02:02', 2);
INSERT INTO table2_all VALUES
Ok.
1 rows in set. Elapsed: 0.009 sec.
node-str-coreCWtk0002 :) select * from table2;
SELECT *
FROM table2
┌───────────EventDate─┬─id─┐
│ 2020-02-02 02:02:02 │ 2 │
└─────────────────────┴────┘
┌───────────EventDate─┬─id─┐
│ 2020-01-01 01:01:01 │ 1 │
└─────────────────────┴────┘
2 rows in set. Elapsed: 0.002 sec.
node-str-coreCWtk0002 :) select * from table2_all;
SELECT *
FROM table2_all
┌───────────EventDate─┬─id─┐
│ 2020-01-01 01:01:01 │ 1 │
└─────────────────────┴────┘
┌───────────EventDate─┬─id─┐
│ 2020-02-02 02:02:02 │ 2 │
└─────────────────────┴────┘
2 rows in set. Elapsed: 0.002 sec.
node-str-coreCWtk0002 :)
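As an extra check, the HTTP interface on port 8123 (enabled in config.xml) can be queried from any host allowed by the security group; a minimal sketch with curl:

curl 'http://192.168.0.220:8123/' --data-binary 'select count() from table2_all'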
With that, the ClickHouse service on the MRS cluster is deployed and working properly ^_^