Flume Setup
Note: prerequisites — minimal OS install, configure yum, install bash-completion, install vim, install net-tools, disable the firewall, disable SELinux, set up the hosts table, and configure passwordless SSH login.
1. Install Java
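A minimal sketch of a tarball-based JDK install, assuming the JDK 8u161 archive has already been uploaded to /hadoop/soft (the archive file name is an assumption; adjust it to the file you uploaded):
[root@win1 soft]# cd /hadoop/soft
[root@win1 soft]# tar -zxvf jdk-8u161-linux-x64.tar.gz    (extracts to jdk1.8.0_161, the path used below)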
2. Upload and extract Flume
[root@win1 soft]# tar -zxvf apache-flume-1.9.0-bin.tar.gz
3. Configure the Flume environment. There is no flume-env.sh by default, so it must be copied from the template.
[root@win1 conf]# cd /hadoop/soft/apache-flume-1.9.0-bin/conf/    (enter the conf directory)
[root@win1 conf]# cp -p flume-env.sh.template flume-env.sh    (make a copy named flume-env.sh)
4. Edit the flume-env.sh file (only the export JAVA_HOME line needs to be changed).
# ... (Apache license header omitted) ...
# If this file is placed at FLUME_CONF_DIR/flume-env.sh, it will be sourced
# during Flume startup.
# Environment variables can be set here.
export JAVA_HOME=/hadoop/soft/jdk1.8.0_161
(rest of the file omitted...)
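A quick sanity check that the JAVA_HOME set above points at a working JDK (this check is not part of the original steps):
[root@win1 conf]# /hadoop/soft/jdk1.8.0_161/bin/java -version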
5. Edit the environment variables
[root@win1 conf]# vim /root/.bashrc
export JAVA_HOME=/hadoop/soft/jdk1.8.0_161
export FLUME_HOME=/hadoop/soft/apache-flume-1.9.0-bin
export PATH=$JAVA_HOME/bin:$PATH:$FLUME_HOME/bin
export CLASSPATH=$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar
6. Load the environment variables
[root@win1 ~]# source /root/.bashrc
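To confirm the variables took effect, a minimal check (not part of the original steps):
[root@win1 ~]# echo $FLUME_HOME
/hadoop/soft/apache-flume-1.9.0-bin
[root@win1 ~]# which flume-ng
/hadoop/soft/apache-flume-1.9.0-bin/bin/flume-ng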
7. Verify that the environment is configured correctly
[root@win1 ~]# flume-ng version
Flume 1.9.0
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: d4fcab4f501d41597bc616921329a4339f73585e
Compiled by fszabo on Mon Dec 17 20:45:25 CET 2018
From source with checksum 35db629a3bda49d23e9b3690c80737f9
8. Configure the Flume-to-Kafka connection
[root@win1 ~]# cd /hadoop/soft/apache-flume-1.9.0-bin/conf/
(There is no kafka.properties by default, so copy one from the template.)
[root@win1 conf]# cp flume-conf.properties.template kafka.properties
[root@win1 conf]# vi kafka.properties    (open kafka.properties for editing)
# ... (Apache license header omitted) ...
# The configuration file needs to define the sources,
# the channels and the sinks.
# Sources, channels and sinks are defined per agent,
# in this case called 'agent'    (delete everything after this line in the original file, then enter the following:)
# Define the agent's source, channel and sink
a1.sources = r1
a1.channels = c1
a1.sinks = k1
# Configure the source: tail the log file
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /tmp/logs/kafka.log
# Configure the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Configure the sink
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
# Kafka topic to write to
a1.sinks.k1.kafka.topic = mytest
# Kafka broker addresses and ports
a1.sinks.k1.kafka.bootstrap.servers = win1:9092,win2:9092,win3:9092
# Number of events per producer batch
a1.sinks.k1.kafka.flumeBatchSize = 20
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.kafka.producer.linger.ms = 1
a1.sinks.k1.kafka.producer.compression.type = snappy
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
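Note that the memory channel above is fast but loses any buffered events if the agent process dies. If durability matters, Flume's file channel can be substituted; a minimal sketch (the checkpoint and data directory paths are assumptions, use any writable locations):
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /hadoop/flume/checkpoint
a1.channels.c1.dataDirs = /hadoop/flume/data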
9. Create the logs directory
[root@win1 ~]# mkdir -p /tmp/logs    (create the logs directory)
[root@win1 ~]# touch /tmp/logs/kafka.log    (create kafka.log)
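Optionally verify that the exec source's command behaves as expected; tail -F (unlike tail -f) keeps following the file even if it is rotated or recreated. A quick probe, not part of the original steps:
[root@win1 ~]# tail -F /tmp/logs/kafka.log &
[root@win1 ~]# echo "probe" >> /tmp/logs/kafka.log    (the line should be echoed immediately)
[root@win1 ~]# kill %1    (stop the background tail)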
10. Create the test script
[root@win1 ~]# cd /root
[root@win1 ~]# vim kafkaoutput.sh    (create and edit kafkaoutput.sh)
#!/bin/bash
for ((i=0; i<=1000; i++))
do
    echo "kafka_test-$i" >> /tmp/logs/kafka.log
done
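A purely optional variant, not in the original script: stamping each line with the current time makes it easier to correlate producer and consumer output.
#!/bin/bash
for ((i=0; i<=1000; i++))
do
    echo "$(date '+%H:%M:%S') kafka_test-$i" >> /tmp/logs/kafka.log
done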
11. Make the script executable
[root@win1 ~]# chmod 777 kafkaoutput.sh
12. Create the topic on a Kafka node
[root@win1 ~]# kafka-topics.sh --create --zookeeper win1:2181 --replication-factor 3 --partitions 1 --topic mytest
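An optional check, not in the original steps, to confirm the topic exists with the expected partition and replica counts:
[root@win1 ~]# kafka-topics.sh --describe --zookeeper win1:2181 --topic mytest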
13. Open a consumer console (terminal 1 on win1)
[root@win1 ~]# kafka-console-consumer.sh --bootstrap-server win1:9092,win2:9092,win3:9092 --from-beginning --topic mytest
Note: the topic specified in /hadoop/soft/apache-flume-1.9.0-bin/conf/kafka.properties must be the same topic the console consumer above is subscribed to:
# Kafka topic to write to
a1.sinks.k1.kafka.topic = mytest
14. Start Flume (terminal 2 on win1). Arguments after -D are JVM system properties; -Dflume.root.logger=INFO,console overrides the log4j setting at startup so that INFO-level logs print to the console.
[root@win1 ~]# flume-ng agent --conf /hadoop/soft/apache-flume-1.9.0-bin/conf/ --conf-file /hadoop/soft/apache-flume-1.9.0-bin/conf/kafka.properties --name a1 -Dflume.root.logger=INFO,console
After it starts, Flume prints its startup logs and stays running in the foreground; switch to another window and run the script from step 15.
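If you prefer not to tie up a terminal, the agent can also be started in the background; a sketch (the log file path is an assumption):
[root@win1 ~]# nohup flume-ng agent --conf $FLUME_HOME/conf/ --conf-file $FLUME_HOME/conf/kafka.properties --name a1 > /tmp/logs/flume.log 2>&1 &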
15. Run the script (terminal 3 on win1)
[root@win1 ~]# sh kafkaoutput.sh
16. Check the Kafka consumer output in terminal 1 on win1
kafka_test-992
kafka_test-993
kafka_test-994
kafka_test-995
kafka_test-996
kafka_test-997
kafka_test-998
kafka_test-999
kafka_test-1000