Big Data: Getting Started with Kafka
I. Downloading and installing ZooKeeper
- Download URL for the CDH build of ZooKeeper:
http://archive.cloudera.com/cdh5/cdh/5/zookeeper-3.4.5-cdh5.7.0.tar.gz
Extract it: tar -zxvf zookeeper-3.4.5-cdh5.7.0.tar.gz
- Configure the environment variables:
export ZK_HOME=/home/hadoop/app/zookeeper-3.4.5-cdh5.7.0
export PATH=${ZK_HOME}/bin:$PATH
- Edit the configuration file:
cd $ZK_HOME/conf
cp zoo_sample.cfg zoo.cfg
vim zoo.cfg
# Change the data directory; the default is /tmp/zookeeper, a temporary directory whose contents are lost after a system restart
dataDir=/home/hadoop/app/tmp/zookeeper
Start ZooKeeper. Running zkServer.sh with no arguments prints its usage:
zkServer.sh
JMX enabled by default
Using config: /home/hadoop/app/zookeeper-3.4.5-cdh5.7.0/bin/../conf/zoo.cfg
Usage: /home/hadoop/app/zookeeper-3.4.5-cdh5.7.0/bin/zkServer.sh {start|start-foreground|stop|restart|status|upgrade|print-cmd}
[root@hadoop000 /home/hadoop/app/zookeeper-3.4.5-cdh5.7.0/conf]#
Start the server with zkServer.sh start (zkServer.sh status also reports whether it is running). Then run jps; if a QuorumPeerMain process is listed, ZooKeeper started successfully.
II. Downloading, installing, and configuring Kafka; starting a single node with a single broker
1. Download from http://kafka.apache.org/downloads
https://archive.apache.org/dist/kafka/0.9.0.0/kafka_2.11-0.9.0.0.tgz
Extract Kafka: tar -zxvf kafka_2.11-0.9.0.0.tgz
Configure the Kafka environment variables:
vim /etc/profile
export KAFKA_HOME=/home/hadoop/app/kafka_2.11-0.9.0.0
export PATH=${KAFKA_HOME}/bin:$PATH
2. Edit the configuration file
vim $KAFKA_HOME/config/server.properties
############################# Server Basics #############################
# Unique id for this broker
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=0
############################# Socket Server Settings #############################
listeners=PLAINTEXT://:9092
# The port the socket server listens on
#port=9092
# Hostname the broker will bind to. If not set, the server will bind to all interfaces
host.name=hadoop000
############################# Log Basics #############################
# Change the log directory
# A comma seperated list of directories under which to store log files
log.dirs=/home/hadoop/app/tmp/kafka-logs
############################# Zookeeper #############################
# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=hadoop000:2181
# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000
3. Start Kafka
Running kafka-server-start.sh with no arguments prints its usage:
USAGE: /home/hadoop/app/kafka_2.11-0.9.0.0/bin/kafka-server-start.sh [-daemon] server.properties [--override property=value]*
Start command:
kafka-server-start.sh $KAFKA_HOME/config/server.properties
jps -m    # check that the Kafka process is running
III. Basic Kafka operations
1. Create a topic
kafka-topics.sh --create --zookeeper hadoop000:2181 --replication-factor 1 --partitions 1 --topic hello_topic
2. Produce messages to the topic
kafka-console-producer.sh --broker-list hadoop000:9092 --topic hello_topic
hello
hadoop
spark
kafka
flume
3. Consume messages
kafka-console-consumer.sh --zookeeper hadoop000:2181 --topic hello_topic --from-beginning
Using --from-beginning: with this flag the consumer receives all messages previously sent to the topic; without it, the consumer only receives messages sent after it starts.
4. Describe all topics
kafka-topics.sh --describe --zookeeper hadoop000:2181
Topic:hello_topic PartitionCount:1 ReplicationFactor:1 Configs:
Topic: hello_topic Partition: 0 Leader: 0 Replicas: 0 Isr: 0
5. Describe a specific topic
kafka-topics.sh --describe --zookeeper hadoop000:2181 --topic hello_topic
Topic:hello_topic PartitionCount:1 ReplicationFactor:1 Configs:
Topic: hello_topic Partition: 0 Leader: 0 Replicas: 0 Isr: 0
In this output, Leader is the broker id that serves reads and writes for the partition, Replicas lists the brokers holding a copy of it, and Isr is the subset of replicas currently in sync with the leader.
IV. Single-node, multi-broker deployment and usage
cp server.properties server-1.properties
cp server.properties server-2.properties
cp server.properties server-3.properties
In each copy, change broker.id, the listener port, and the log directory:
server-1.properties
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=1
############################# Socket Server Settings #############################
listeners=PLAINTEXT://:9093
############################# Log Basics #############################
# A comma seperated list of directories under which to store log files
log.dirs=/home/hadoop/app/tmp/kafka-logs-1
server-2.properties
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=2
############################# Socket Server Settings #############################
listeners=PLAINTEXT://:9094
############################# Log Basics #############################
# A comma seperated list of directories under which to store log files
log.dirs=/home/hadoop/app/tmp/kafka-logs-2
server-3.properties
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=3
############################# Socket Server Settings #############################
listeners=PLAINTEXT://:9095
############################# Log Basics #############################
# A comma seperated list of directories under which to store log files
log.dirs=/home/hadoop/app/tmp/kafka-logs-3
Start the three Kafka brokers:
kafka-server-start.sh -daemon $KAFKA_HOME/config/server-1.properties
kafka-server-start.sh -daemon $KAFKA_HOME/config/server-2.properties
kafka-server-start.sh -daemon $KAFKA_HOME/config/server-3.properties
Create a topic (replication factor 3):
kafka-topics.sh --create --zookeeper hadoop000:2181 --replication-factor 3 --partitions 1 --topic my_replication_topic
Send messages:
kafka-console-producer.sh --broker-list hadoop000:9093,hadoop000:9094,hadoop000:9095 --topic my_replication_topic
Receive messages:
kafka-console-consumer.sh --zookeeper hadoop000:2181 --topic my_replication_topic
This demonstrates Kafka's fault tolerance: even if one of the brokers goes down, messages can still be received.
V. Using the Kafka Java API: a producer that sends messages and a consumer that receives them
The classes below use the Scala clients bundled with Kafka 0.9 (kafka.javaapi.producer.Producer and the high-level consumer in kafka.consumer).
public class KafkaProperties {
public static String ZOOKEEPER = "192.168.42.85:2181";
public static String BROKER_LIST = "192.168.42.85:9092";
public static String TOPIC = "hello_topic";
public static String GROUP_ID = "test_group";
}
import java.util.Properties;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;
public class KafkaProductor implements Runnable {
private String topic;
private Producer<Integer, String> producer;
public KafkaProductor(String topic) {
this.topic = topic;
Properties properties = new Properties();
properties.put("metadata.broker.list", KafkaProperties.BROKER_LIST);
properties.put("serializer.class", "kafka.serializer.StringEncoder");
properties.put("request.required.acks", "1");
ProducerConfig producerConfig = new ProducerConfig(properties);
producer = new Producer<Integer, String>(producerConfig);
}
public String getTopic() {
return topic;
}
public void setTopic(String topic) {
this.topic = topic;
}
// 100 producer tasks are submitted, but the semaphore allows only 20 to send at a time
// (countDownLatch is declared here but never counted down in this example)
public static final int THREAD_COUNT = 100;
public static final int ALLOW_COUNT = 20;
public static final CountDownLatch countDownLatch = new CountDownLatch(THREAD_COUNT);
public static final Semaphore semaphore = new Semaphore(ALLOW_COUNT);
public static final ExecutorService EXECUTOR_SERVICE = Executors.newCachedThreadPool();
public static void main(String[] args) throws InterruptedException {
for (int i = 0; i < THREAD_COUNT; i++) {
EXECUTOR_SERVICE.submit(new KafkaProductor(KafkaProperties.TOPIC));
}
new Thread(new KafkaCustomer(KafkaProperties.TOPIC)).start();
}
@Override
public void run() {
try {
semaphore.acquire();
} catch (InterruptedException e) {
e.printStackTrace();
}
int no = 0;
while (no <= 10) {
String message = "msg" + no;
producer.send(new KeyedMessage<Integer, String>(topic, message));
no++;
try {
Thread.sleep(2000);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
semaphore.release();
}
}
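Kafka 0.9 also ships a newer Java producer client (org.apache.kafka.clients.producer.KafkaProducer) that has since replaced the Scala producer used above. Below is a minimal sketch of the same send loop with that client; the class name NewApiProducerSketch is illustrative, and the configuration values are standard new-producer settings rather than anything from the original walkthrough.
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
public class NewApiProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // brokers to bootstrap from; reuses the address defined in KafkaProperties
        props.put("bootstrap.servers", KafkaProperties.BROKER_LIST);
        // wait for the partition leader to acknowledge each write, as in the old example
        props.put("acks", "1");
        // the new client serializes keys and values itself
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        Producer<String, String> producer = new KafkaProducer<String, String>(props);
        for (int no = 0; no <= 10; no++) {
            // send() is asynchronous; close() below flushes anything still in flight
            producer.send(new ProducerRecord<String, String>(KafkaProperties.TOPIC, "msg" + no));
        }
        producer.close();
    }
}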
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
public class KafkaCustomer implements Runnable {
private String topic;
private ConsumerConnector consumerConnector;
public KafkaCustomer(String topic) {
this.topic = topic;
Properties properties = new Properties();
properties.put("group.id", KafkaProperties.GROUP_ID);
properties.put("zookeeper.connect", KafkaProperties.ZOOKEEPER);
ConsumerConfig consumerConfig = new ConsumerConfig(properties);
consumerConnector = Consumer.createJavaConsumerConnector(consumerConfig);
}
@Override
public void run() {
Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
topicCountMap.put(KafkaProperties.TOPIC, 1);
Map<String, List<KafkaStream<byte[], byte[]>>> streamMap = consumerConnector.createMessageStreams(topicCountMap);
KafkaStream<byte[], byte[]> messageAndMetadata = streamMap.get(KafkaProperties.TOPIC).get(0);
ConsumerIterator<byte[], byte[]> iterator = messageAndMetadata.iterator();
while (iterator.hasNext()) {
String msg = new String(iterator.next().message());
System.out.println("receive msg ~~~~:" + msg);
}
}
}
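Kafka 0.9 also introduced a new consumer client (org.apache.kafka.clients.consumer.KafkaConsumer) that connects to the brokers directly instead of going through ZooKeeper. A minimal sketch under that assumption follows; the class name NewApiConsumerSketch is illustrative, and auto.offset.reset=earliest plays roughly the role of --from-beginning for a consumer group with no committed offsets.
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
public class NewApiConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // the new consumer talks to the brokers directly, so it takes a broker list rather than ZooKeeper
        props.put("bootstrap.servers", KafkaProperties.BROKER_LIST);
        props.put("group.id", KafkaProperties.GROUP_ID);
        // start from the earliest offset when the group has no committed position
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
        consumer.subscribe(Collections.singletonList(KafkaProperties.TOPIC));
        while (true) {
            // poll() returns whatever records arrived within the timeout (in ms)
            ConsumerRecords<String, String> records = consumer.poll(1000);
            for (ConsumerRecord<String, String> record : records) {
                System.out.println("receive msg ~~~~:" + record.value());
            }
        }
    }
}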
VI. Collecting logs with Flume and sinking them to Kafka (0.9)
Note that newer Flume releases use different property names for the Kafka sink.
Write exec-memory-avro.conf:
exec-memory-avro.sources = exec-source
exec-memory-avro.sinks = avro-sink
exec-memory-avro.channels = memory-channel
# Describe/configure the source
exec-memory-avro.sources.exec-source.type = exec
exec-memory-avro.sources.exec-source.command = tail -F /home/hadoop000/hello.txt
exec-memory-avro.sources.exec-source.shell = /bin/bash -c
# Describe the sink
exec-memory-avro.sinks.avro-sink.type = avro
exec-memory-avro.sinks.avro-sink.hostname = hadoop000
exec-memory-avro.sinks.avro-sink.port = 44444
# Use a channel that buffers events in memory
exec-memory-avro.channels.memory-channel.type = memory
exec-memory-avro.channels.memory-channel.capacity = 1000
exec-memory-avro.channels.memory-channel.transactionCapacity = 100
# Bind the source and sink to the channel
exec-memory-avro.sources.exec-source.channels = memory-channel
exec-memory-avro.sinks.avro-sink.channel = memory-channel
Write avro-memory-kafka.conf:
avro-memory-kafka.sources = avro-source
avro-memory-kafka.sinks = kafka-sink
avro-memory-kafka.channels = memory-channel
# Describe/configure the source
avro-memory-kafka.sources.avro-source.type = avro
avro-memory-kafka.sources.avro-source.bind = hadoop000
avro-memory-kafka.sources.avro-source.port = 44444
# Describe the sink
avro-memory-kafka.sinks.kafka-sink.type = org.apache.flume.sink.kafka.KafkaSink
avro-memory-kafka.sinks.kafka-sink.brokerList = hadoop000:9093
avro-memory-kafka.sinks.kafka-sink.topic = hello_topic
avro-memory-kafka.sinks.kafka-sink.batchSize = 5
avro-memory-kafka.sinks.kafka-sink.requiredAcks = 1
# Use a channel that buffers events in memory
avro-memory-kafka.channels.memory-channel.type = memory
avro-memory-kafka.channels.memory-channel.capacity = 1000
avro-memory-kafka.channels.memory-channel.transactionCapacity = 100
# Bind the source and sink to the channel
avro-memory-kafka.sources.avro-source.channels = memory-channel
avro-memory-kafka.sinks.kafka-sink.channel = memory-channel
Start Kafka:
kafka-server-start.sh $KAFKA_HOME/config/server-1.properties
Start the two Flume agents:
flume-ng agent --name avro-memory-kafka --conf $FLUME_HOME/conf --conf-file $FLUME_HOME/conf/avro-memory-kafka.conf -Dflume.root.logger=INFO,console
flume-ng agent --name exec-memory-avro --conf $FLUME_HOME/conf --conf-file $FLUME_HOME/conf/exec-memory-avro.conf -Dflume.root.logger=INFO,console
echo hello world >> /home/hadoop000/hello.txt
The new line is picked up by the exec source, relayed through the Avro sink and source, and written to the hello_topic topic; you can verify this with the console consumer command from section III.
Source: blog.csdn.net, author: 血煞风雨城2018; copyright belongs to the original author, contact the author before reposting.
Original link: blog.csdn.net/qq_31905135/article/details/85260702