OpenStack Super Architecture - Part 2: Ceph
Building a Ceph cluster as backend storage
Operating system: CentOS 7
Hostname | IP |
---|---|
cq-kz-h-cephadm-0-16.xier.local | 192.168.0.16 |
cq-kz-h-cephmon01-0-17.xier.local | 192.168.0.17 |
cq-kz-h-cephmon02-0-18.xier.local | 192.168.0.18 |
cq-kz-h-cephstor03-0-19.xier.local | 192.168.0.19 |
Configure hosts resolution (all nodes)
```
cat > /etc/hosts << EOF
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
10.0.0.16   cq-kz-h-cephadm-0-16.xier.local gz-zy-h-cephadm-0-16
10.0.0.17   cq-kz-h-cephmon01-0-17.xier.local gz-zy-h-cephmon01-0-17 mon01 stor01
10.0.0.18   cq-kz-h-cephmon02-0-18.xier.local gz-zy-h-cephmon02-0-18 mon02 stor02
10.0.0.19   cq-kz-h-cephstor03-0-19.xier.local gz-zy-h-cephstor03-0-19 stor03
EOF
```
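As a quick sanity check (not part of the original steps), you can confirm on each node that the short aliases resolve:
```
# hypothetical check: every alias should resolve to the address listed in /etc/hosts
for h in mon01 mon02 stor03; do getent hosts "$h"; done
```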
配置时间同步
# admin节点 yum install -y chrony vim /etc/chrony.conf server ntp6.aliyun.com iburst allow all local stratum 10 systemctl restart chronyd clock -w # 其它节点 vim /etc/chrony.conf server cq-kz--h-cephadm-0-16.xier.local iburst systemctl restart chronyd clock -w
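To confirm the synchronization actually took effect, a minimal check (assuming chrony is configured as above) is:
```
# on the other nodes, the admin node should appear as the selected (^*) time source
chronyc sources -v
timedatectl status
```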
Prepare the Ceph repository configuration (all nodes)
```
rpm -ivh https://mirrors.aliyun.com/ceph/rpm-mimic/el7/noarch/ceph-release-1-1.el7.noarch.rpm
yum install -y epel-release
```
Create a new user on every Ceph node (all nodes)
```
useradd cephadm; echo "123" | passwd --stdin cephadm
```
Grant the new user passwordless sudo privileges (all nodes)
```
echo "cephadm ALL = (root) NOPASSWD: ALL" > /etc/sudoers.d/cephadm
chmod 0440 /etc/sudoers.d/cephadm

# test
su - cephadm
sudo -l
```
Configure passwordless SSH (cephadm node)
```
su - cephadm
ssh-keygen -t rsa
sudo ssh-copy-id -i .ssh/id_rsa.pub cephadm@localhost
for i in `seq 1 3`; do scp -rp .ssh/ cephadm@stor0$i:/home/cephadm; done
```
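Optionally, a short loop (added here, not in the original) verifies that passwordless SSH now works toward every storage node:
```
# each node should print its hostname without asking for a password
for i in `seq 1 3`; do ssh cephadm@stor0$i hostname; done
```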
Install ceph-deploy on the admin node (ceph-admin node)
The entire Ceph storage cluster deployment can be driven from the admin node with ceph-deploy, so first install ceph-deploy and the packages it depends on there:
```
yum install -y ceph-deploy python-setuptools python2-subprocess32
```
Deploy the RADOS storage cluster
Initialize the RADOS cluster
First, as the cephadm user on the admin node, create a directory for the cluster configuration files (ceph-admin node):
```
su - cephadm
mkdir ceph-cluster
cd ceph-cluster
```
Initialize the first MON node in preparation for creating the cluster (ceph-admin node)
The name must match the hostname the node is actually using. Run the following command to generate the initial configuration:
```
ceph-deploy new --cluster-network 192.168.0.0/16 --public-network 10.0.0.0/16 gz-zy-h-mon01-0-17.ang.local
```
Edit the generated ceph.conf and, in the [global] section, set the network the Ceph cluster uses for client-facing traffic, i.e. the public network (cephadm node):
```
vim /home/cephadm/ceph-cluster/ceph.conf

[global]
...
mon_initial_members = gz-zy-h-cephmon01-0-17   # if there are several: mon01,mon02,...
cluster_network = 192.168.0.0/16               # OSDs need the cluster network
public_network = 10.0.0.0/16                   # clients connect over the public network
mon_host = 10.0.0.17                           # public address only; MONs take no part in cluster-network traffic, so no cluster address is needed
...
```
Install Ceph on the cluster (ceph-admin node)
Method 1: configure stor01, stor02 and stor03 as Ceph cluster nodes in one batch with the following command:
```
su - cephadm
cd ceph-cluster
ceph-deploy install stor01 stor02 stor03
```
Method 2: manual deployment
```
# all mon and stor nodes
wget -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
yum clean all
yum makecache fast
yum install -y ceph ceph-radosgw

# cephadm node
su - cephadm
cd ceph-cluster
ceph-deploy install --no-adjust-repos mon01 mon02 stor03
```
Configure the initial MON node and collect all the keys (ceph-admin node)
```
su - cephadm
cd ceph-cluster
# no need to list hosts; they are taken from the configuration file
ceph-deploy mon create-initial
```
Copy the configuration file and admin keyring to every node in the Ceph cluster, so that running "ceph" commands does not require explicitly specifying the MON address and ceph.client.admin.keyring every time
```
# cephadm node
ceph-deploy admin mon01 mon02 stor03
# install ceph on the cephadm node too, so /etc/ceph/ exists and the configuration can be copied there
yum install -y ceph
chown -R cephadm.cephadm /etc/ceph/
```
Verify on any cluster node that the configuration files have been copied over (all nodes)
```
ls /etc/ceph/
```
On all nodes, as root, allow the cephadm user to read the /etc/ceph/ceph.client.admin.keyring file (all nodes)
```
setfacl -m u:cephadm:rw /etc/ceph/ceph.client.admin.keyring
```
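A quick check (an addition, assuming it is run as root on a cluster node) that the ACL is effective and cephadm can reach the cluster:
```
# the ACL should list user:cephadm:rw-, and ceph -s should work as cephadm
getfacl /etc/ceph/ceph.client.admin.keyring
su - cephadm -c 'ceph -s'
```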
Configure the Manager node and start the ceph-mgr process (ceph-admin node)
```
ceph-deploy mgr create mon01
```
On a Ceph cluster node, run the following commands as the cephadm user to check the cluster health (ceph-admin node)
```
yum install ceph-common -y
# push the configuration file and keyring to the admin node
ceph-deploy admin gz-zy-h-cephadm-0-16
setfacl -m u:cephadm:rw /etc/ceph/ceph.client.admin.keyring
ceph health
ceph -s
```
Add OSDs to the RADOS cluster
List and zap the disks (ceph-admin node)
The "ceph-deploy disk" command can be used to check and list all available disks on the OSD nodes:
```
ceph-deploy disk list mon01 mon02 stor03
```
Then, from the admin node, use ceph-deploy to erase all partition tables and data on the disks that will be dedicated to OSDs. The command format is "ceph-deploy disk zap {osd-server-name} {disk-name}"; note that this step destroys all data on the target devices. Below, the sdb and sdc devices on mon01, mon02 and stor03 are zapped for OSD use:
```
ceph-deploy disk zap mon01 /dev/sdb
ceph-deploy disk zap mon01 /dev/sdc
ceph-deploy disk zap mon02 /dev/sdb
ceph-deploy disk zap mon02 /dev/sdc
ceph-deploy disk zap stor03 /dev/sdb
ceph-deploy disk zap stor03 /dev/sdc
```
Add OSDs
The following commands add each disk as an OSD:
```
ceph-deploy osd create mon01 --data /dev/sdb
ceph-deploy osd create mon01 --data /dev/sdc
ceph-deploy osd create mon02 --data /dev/sdb
ceph-deploy osd create mon02 --data /dev/sdc
ceph-deploy osd create stor03 --data /dev/sdb
ceph-deploy osd create stor03 --data /dev/sdc
```
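Once the six OSDs are created, a short sketch (standard ceph commands, not part of the original text) to confirm they are all up and in:
```
# expect 6 OSDs, all "up" and "in", spread across mon01, mon02 and stor03
ceph osd tree
ceph -s
```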
List the OSDs on the given nodes:
```
ceph-deploy osd list mon01 mon02 stor03
```
Administrators can also use the ceph command to view OSD information:
```
ceph osd stat
```
More details:
```
ceph osd dump
ceph osd ls
```
Scaling out the Ceph cluster
Adding monitor nodes
At least three monitor nodes are needed:
```
ceph-deploy mon add cq-kz-h-mon01-0-17
ceph-deploy mon add cq-kz-h-mon02-0-18
ceph-deploy mon add cq-kz-h-stor03-0-19
```
Once done, the monitor and quorum status can be checked from a ceph client:
```
ceph quorum_status --format json-pretty
```
Adding Manager nodes
At least two manager nodes are needed:
```
ceph-deploy mgr create cq-kz-h-mon02-0-18
# check
ceph -s
```
Using RBD
```
ceph osd pool ls

# create a storage pool
ceph osd pool create openstack-vm 64 64
# enable the rbd application on the openstack-vm pool
ceph osd pool application enable openstack-vm rbd
# initialize the openstack-vm rbd pool
rbd pool init openstack-vm

# create an image (disk)
rbd create --pool openstack-vm --image vm01 --size 2G
# list the images in the pool
rbd ls --pool openstack-vm

# create another image (disk)
rbd create --size 2G openstack-vm/vm02
rbd ls --pool openstack-vm

# detailed listing
rbd ls --pool openstack-vm -l
# detailed listing in pretty-printed JSON
rbd ls --pool openstack-vm -l --format json --pretty-format

# image details
rbd info openstack-vm/vm01
rbd info --pool openstack-vm --image vm01
rbd info --pool openstack-vm vm02

# disable features the client kernel may not support
rbd feature disable openstack-vm/vm02 object-map fast-diff deep-flatten
```
Create a cloud-host client that uses the Ceph cluster's OSDs
```
# copy the yum repo file from the stor03 node
scp /etc/yum.repos.d/ceph.repo 172.18.199.75:/etc/yum.repos.d/

# back on the client (stor05) node
yum -y install epel-release
yum -y install ceph-common
# the rbd client relies on the kernel ceph/rbd modules
modinfo ceph

# on the ceph-admin node, create an account for the client
ceph auth get-or-create client.openstack mon 'allow r' osd 'allow * pool=openstack-vm'
ceph auth get client.openstack -o ceph.client.openstack.keyring
scp ceph.client.openstack.keyring root@10.0.0.10:/etc/ceph/
scp ceph.conf root@10.0.0.10:/etc/ceph/

# back on the client (stor05) node
ceph -s
# by default the ceph CLI looks for ceph.client.admin.keyring; only ceph.client.openstack.keyring exists here, so the user must be specified
ceph --user openstack -s
```
Map the RBD image created above to the cloud host through the kernel RBD client
```
rbd help map
rbd --user openstack map openstack-vm/vm01
# careful: do not format the device while another node is using it, or the data will be lost
mkfs.xfs /dev/rbd0
# the operating system must have the ceph module
modinfo ceph
```
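Once formatted, the mapped device behaves like any local disk. A minimal usage sketch (the mount point /mnt/vm01 is an assumption, not from the original):
```
# hypothetical mount of the freshly formatted RBD device
mkdir -p /mnt/vm01
mount /dev/rbd0 /mnt/vm01
df -h /mnt/vm01
```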
Unmap the disk image from the host (cloud host)
```
# list the currently mapped images
rbd showmapped
# from any node in the ceph cluster you can see the lock held while the image is mapped by another host
rbd -p openstack-vm ls -l
# unmap the device
rbd unmap /dev/rbd0
```
Resize an image (cephadm node)
```
rbd resize -s 5G openstack-vm/vm01
# verify the change
rbd -p openstack-vm ls -l
```
Delete an image (cephadm node)
```
# help
rbd help rm
# delete
rbd rm openstack-vm/vm01
rbd ls openstack-vm -l
```
Using CephFS
Create the storage pools (cephadm node)
```
ceph osd pool create cephfs-metadata 32 32
ceph osd pool create cephfs-data 64 64
```
Activate the cephfs storage pools (cephadm node)
```
ceph fs new cephfs cephfs-metadata cephfs-data
# check the status
ceph fs status cephfs
ceph mds stat
```
Create an authorized account for clients (cephadm node)
```
ceph auth get-or-create client.fsclient mon 'allow r' mds 'allow rw' osd 'allow rwx pool=cephfs-data' -o ceph.client.fsclient.keyring
# -o saves the client keyring to a file; from that authorization file the key alone can later be exported for clients

# get just the key value
ceph auth print-key client.fsclient

# second way to hand the key to clients
ceph auth print-key client.fsclient > fsclient.key
scp fsclient.key root@10.0.0.10:/etc/ceph/
scp fsclient.key root@10.0.0.11:/etc/ceph/

# push the local configuration file so all nodes stay consistent
ceph-deploy --overwrite-conf config push mon01 mon02 stor03

# limited by the number of ranks, extra MDS daemons stay in standby
ceph-deploy mds create mon01 stor03
# the standby MDS daemons are now visible
ceph fs status
ceph fs get cephfs
ceph fs set cephfs max_mds 2
# with two active ranks, a standby MDS takes over immediately when an active one fails, providing high availability
ceph fs status
```
Use CephFS on the client (controller node)
```
yum -y install ceph-common
# cephfs relies on the ceph kernel module
modinfo ceph
# check that the key granted above is present; without it the mount cannot authenticate
ls /etc/ceph/

# ------------------------ configure the Glance backend storage ------------------------
mkdir /image-bak/
mv /var/lib/glance/images/* /image-bak/
# the name option takes the client name without the "client." prefix
mount -t ceph mon01:6789,mon02:6789:/ /var/lib/glance/images/ -o name=fsclient,secretfile=/etc/ceph/fsclient.key
chown -R glance.glance /var/lib/glance/images/
mv /image-bak/* /var/lib/glance/images/

# check the mount
mount
# check the filesystem type
stat -f /var/lib/glance/images/

# configure a persistent mount
vim /etc/fstab
mon01:6789,mon02:6789:/ /var/lib/glance/images/ ceph name=fsclient,secretfile=/etc/ceph/fsclient.key,_netdev,noatime 0 0
# _netdev: network mount, skipped when the network is unreachable
# noatime: skip access-time updates to improve performance

# verify
df -TH
```
Workaround when the kernel does not support the ceph module (stor05 node)
```
yum install ceph-fuse ceph-common -y
# copy the generated keyring over from the ceph-admin node
scp ceph.client.fsclient.keyring root@172.18.199.75:/etc/ceph

ceph-fuse -n client.fsclient -m mon01:6789,mon02:6789 /data
# check the mount
mount

# configure a persistent mount
vim /etc/fstab
none /data fuse.ceph ceph.id=fsclient,ceph.conf=/etc/ceph/ceph.conf,_netdev,defaults 0 0
umount /data
mount -a
# verify
df -TH
```
Addressing the single-MDS problem
```
# push the local configuration file so all nodes stay consistent
ceph-deploy --overwrite-conf config push mon01 mon02 stor03
# limited by the number of ranks, extra MDS daemons stay in standby; create the MDS daemons
ceph-deploy mds create mon01 mon02 stor03
# two standby MDS daemons are now visible
ceph fs status
# create two ranks
ceph fs set cephfs max_mds 2
ceph fs get cephfs

# optional configuration
vim /etc/ceph/ceph.conf
...
[mds.mon01]
mds_standby_for_fscid = cephfs   # preferred setting
```
Monitoring Ceph with Prometheus
Download: https://prometheus.io/download/
Create a system user to run the Prometheus Server process, with /var/lib/prometheus as its home and data directory
```
useradd -r -m -d /var/lib/prometheus prometheus
```
Download and install Prometheus Server
```
wget https://github.com/prometheus/prometheus/releases/download/v2.7.2/prometheus-2.7.2.linux-amd64.tar.gz
tar xf prometheus-2.7.2.linux-amd64.tar.gz -C /usr/local/
cd /usr/local/
ln -sv prometheus-2.7.2.linux-amd64 prometheus
```
Create the unit file
Create a dedicated systemd unit file (the startup file) for Prometheus at /usr/lib/systemd/system/prometheus.service
```
vim /usr/lib/systemd/system/prometheus.service

[Unit]
Description=The Prometheus 2 monitoring system and time series database.
Documentation=https://prometheus.io
After=network.target

[Service]
EnvironmentFile=-/etc/sysconfig/prometheus
User=prometheus
ExecStart=/usr/local/prometheus/prometheus \
    --storage.tsdb.path=/var/lib/prometheus \
    --config.file=/usr/local/prometheus/prometheus.yml \
    --web.listen-address=0.0.0.0:9090 \
    --web.external-url= $PROM_EXTRA_ARGS
Restart=on-failure
StartLimitInterval=1
RestartSec=3

[Install]
WantedBy=multi-user.target
```
```
systemctl daemon-reload
systemctl restart prometheus
# visit http://$IP:9090
```
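A short verification (an addition, not part of the original) that the service is running and answering on port 9090:
```
# the unit should be active, and the built-in health endpoint should respond
systemctl status prometheus --no-pager
curl -s http://localhost:9090/-/healthy
```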
Deploy node_exporter for Ceph node monitoring
stor03 serves as the monitored node.
```
wget https://github.com/prometheus/node_exporter/releases/download/v0.17.0/node_exporter-0.17.0.linux-amd64.tar.gz
# create a system user to run the exporter, with /var/lib/prometheus as its home and data directory
useradd -r -m -d /var/lib/prometheus prometheus
tar xf node_exporter-0.17.0.linux-amd64.tar.gz -C /usr/local/
cd /usr/local/
ln -sv node_exporter-0.17.0.linux-amd64 node_exporter
```
Create the unit file
```
vim /usr/lib/systemd/system/node_exporter.service

[Unit]
Description=Prometheus exporter for machine metrics, written in Go with pluggable metric collectors.
Documentation=https://github.com/prometheus/node_exporter
After=network.target

[Service]
EnvironmentFile=-/etc/sysconfig/node_exporter
User=prometheus
ExecStart=/usr/local/node_exporter/node_exporter $NODE_EXPORTER_OPTS
Restart=on-failure
StartLimitInterval=1
RestartSec=3

[Install]
WantedBy=multi-user.target
```
```
# start it and check that port 9100 is listening
systemctl enable --now node_exporter.service
```
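To confirm the exporter is serving metrics (a check added here, not in the original):
```
# node_exporter should listen on 9100 and return plain-text metrics
ss -tnlp | grep 9100
curl -s http://localhost:9100/metrics | head
```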
Configure Prometheus monitoring
Configure node discovery on the cephadm node (cephadm node)
```
# add the node_exporter configured above on the stor03 node
vim /usr/local/prometheus/prometheus.yml
...
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: 'nodes'
    static_configs:
      - targets: ['cq-kz-h-cephstor03-0-19.xier.local:9100']
```
```
systemctl restart prometheus
# in the web UI, open Status -> Targets to see whether the node has been discovered
# if the targets are not all up, start Prometheus a different way:
./prometheus --config.file=/usr/local/prometheus/prometheus.yml &
```
Enable the Ceph Mgr prometheus module (cephadm node)
Ceph Manager ships with many built-in modules, including a prometheus module that exposes metrics directly in Prometheus format.
```
ceph mgr module enable prometheus
# the prometheus module listens on TCP port 9283 by default
./prometheus --config.file=/usr/local/prometheus/prometheus.yml &
```
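A quick sanity check (assuming the active mgr runs on mon01) that the module is exporting metrics:
```
# the prometheus endpoint should be listed and should answer on port 9283
ceph mgr services
curl -s http://mon01:9283/metrics | head
```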
Edit the Prometheus configuration file and add the Ceph-related job
```
vim /usr/local/prometheus/prometheus.yml

# my global config
global:
  scrape_interval: 15s     # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["10.0.0.16:9090"]

  - job_name: 'nodes'
    static_configs:
      - targets: ['cq-kz-h-cephstor03-0-19.xier.local:9100']

  - job_name: 'ceph'
    static_configs:
      - targets: ["cq-kz-h-cephmon01-0-17.xier.local:9283"]
```
```
systemctl restart prometheus
# on the web UI home page, check whether metrics starting with "ceph_" are available
```
If no ceph metrics show up, it is a startup issue; launch Prometheus a different way:
```
systemctl stop prometheus
cd /usr/local/prometheus
./prometheus --config.file=/usr/local/prometheus/prometheus.yml &
```
Deploy the Grafana dashboards (stor03 node)
https://grafana.com/grafana/download
Download and start it; it listens on port 3000
```
wget https://dl.grafana.com/enterprise/release/grafana-enterprise-8.3.3-1.x86_64.rpm
sudo yum install grafana-enterprise-8.3.3-1.x86_64.rpm
systemctl start grafana-server
# visit http://$IP:3000
```
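To point Grafana at the Prometheus server, a data source can be provisioned from a file. The sketch below is an assumption (Prometheus at 10.0.0.16:9090 and the default RPM provisioning path), not a step from the original:
```
# hypothetical provisioning of a Prometheus data source for Grafana
cat > /etc/grafana/provisioning/datasources/prometheus.yaml << 'EOF'
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://10.0.0.16:9090
    isDefault: true
EOF
systemctl restart grafana-server
```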
Download a Prometheus/Ceph dashboard template