Kubernetes 集群分布式存储插件 Rook Ceph部署
前言
我们经常会说:容器和 Pod 是短暂的。其含义是它们的生命周期可能很短,会被频繁地销毁和创建。容器销毁时,保存在容器内部文件系统中的数据都会被清除。 为了持久化保存容器的数据,可以使用存储插件在容器里挂载一个基于网络或者其他机制的远程数据卷,使得在容器里创建的文件,实际上是保存在远程存储服务器上,或者以分布式的方式保存在多个节点上,而与当前宿主机没有绑定关系。这样,无论在哪个节点上启动新的容器,都可以请求挂载指定的持久化存储卷。
由于 Kubernetes 本身的松耦合设计,绝大多数存储项目,比如 Ceph、GlusterFS、NFS 等,都可以为 Kubernetes 提供持久化存储能力。在这次的部署实践中,选择一个很重要生产级的存储插件项目:Rook。
Rook 介绍
简介
Rook 项目是一个基于 Ceph 的 Kubernetes 存储插件(后期也在加入对更多存储的支持)。不过,不同于对 Ceph 的简单封装,Rook 在自己的实现中加入了水平扩展、迁移、灾难备份、监控等大量的企业级功能,使得这个项目变成了一个高度可扩展的分布式存储解决方案,提供对象、文件和块存储。
Rook 目前支持 Ceph、NFS、Minio Object Store、Edegefs、Cassandra、CockroachDB 存储的搭建。
Rook 机制:
- Rook 提供了卷插件,来扩展了 K8S 的存储系统,使用 Kubelet 代理程序 Pod 可以挂载 Rook 管理的块设备和文件系统。
- Rook Operator 负责启动并监控整个底层存储系统,例如 Ceph Pod、Ceph OSD 等,同时它还管理 CRD、对象存储、文件系统。
- Rook Agent 代理部署在 K8S 每个节点上以 Pod 容器运行,每个代理 Pod 都配置一个 Flexvolume 驱动,该驱动主要用来跟 K8S 的卷控制框架集成起来,每个节点上的相关的操作,例如添加存储设备、挂载、格式化、删除存储等操作,都有该代理来完成。
更多参考如下官网:
2、Rook 架构
Rook 部署
前期规划
准备工作
为了配置 Ceph 存储集群,至少需要以下本地存储选项之一:
- 原始设备(无分区或格式化的文件系统)
- 原始分区(无格式文件系统)
- 可通过 block 模式从存储类别获得 PV
可以使用以下命令确认分区或设备是格式化的文件系统:
$ lsblk -f
NAME FSTYPE LABEL UUID MOUNTPOINT
vda
├─vda1 xfs e16ad84e-8cef-4cb1-b19b-9105c57f97b1 /boot
├─vda2 LVM2_member Vg3nyB-iW9Q-4xp0-LEIO-gzHc-2eax-D1razB
│ └─centos-root xfs 0bb4bfa4-b315-43ca-a789-2b43e726c10c /
├─vda3 LVM2_member VZMibm-DJ8e-apig-YhR3-a1dF-wHYQ-8pjKan
│ └─centos-root xfs 0bb4bfa4-b315-43ca-a789-2b43e726c10c /
└─vda4
如果该 FSTYPE 字段不为空,则在相应设备的顶部有一个文件系统。在这种情况下,可以将 vda4 用于 Ceph。
获取 YAML
git clone --single-branch --branch master https://github.com/rook/rook.git
部署 Rook Operator
本实验使用k8s-node1
、k8s-node2
、k8s-node3 三
个节点,因此需要如下修改:
kubectl label nodes {k8s-node1,k8s-node2,k8s-node3} ceph-osd=enabled
kubectl label nodes {k8s-node1,k8s-node2,k8s-node3} ceph-mon=enabled
kubectl label nodes k8s-node1 ceph-mgr=enabled
注意:当前版 本 rook 中 mgr 只能支持一个节点运行。
执行脚本:
cd rook/cluster/examples/kubernetes/ceph
kubectl create -f common.yaml
kubectl create -f operator.yaml
注意:如上创建了相应的基础服务(如 serviceaccounts),同时 rook-ceph-operator 会在每个节点创建 rook-ceph-agent 和 rook-discover。
部署 cluster
配置cluster.yaml
vi cluster.yaml
修改完如下:
#################################################################################################################
# Define the settings for the rook-ceph cluster with common settings for a production cluster.
# All nodes with available raw devices will be used for the Ceph cluster. At least three nodes are required
# in this example. See the documentation for more details on storage settings available.
# For example, to create the cluster:
# kubectl create -f common.yaml
# kubectl create -f operator.yaml
# kubectl create -f cluster.yaml
#################################################################################################################
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
name: rook-ceph
namespace: rook-ceph
spec:
cephVersion:
# The container image used to launch the Ceph daemon pods (mon, mgr, osd, mds, rgw).
# v13 is mimic, v14 is nautilus, and v15 is octopus.
# RECOMMENDATION: In production, use a specific version tag instead of the general v14 flag, which pulls the latest release and could result in different
# versions running within the cluster. See tags available at https://hub.docker.com/r/ceph/ceph/tags/.
# If you want to be more precise, you can always use a timestamp tag such ceph/ceph:v14.2.5-20190917
# This tag might not contain a new Ceph version, just security fixes from the underlying operating system, which will reduce vulnerabilities
image: ceph/ceph:v15.2.3
# Whether to allow unsupported versions of Ceph. Currently mimic and nautilus are supported, with the recommendation to upgrade to nautilus.
# Octopus is the version allowed when this is set to true.
# Do not set to true in production.
allowUnsupported: false
# The path on the host where configuration files will be persisted. Must be specified.
# Important: if you reinstall the cluster, make sure you delete this directory from each host or else the mons will fail to start on the new cluster.
# In Minikube, the '/data' directory is configured to persist across reboots. Use "/data/rook" in Minikube environment.
dataDirHostPath: /var/lib/rook
# Whether or not upgrade should continue even if a check fails
# This means Ceph's status could be degraded and we don't recommend upgrading but you might decide otherwise
# Use at your OWN risk
# To understand Rook's upgrade process of Ceph, read https://rook.io/docs/rook/master/ceph-upgrade.html#ceph-version-upgrades
skipUpgradeChecks: false
# Whether or not continue if PGs are not clean during an upgrade
continueUpgradeAfterChecksEvenIfNotHealthy: false
# set the amount of mons to be started
mon:
count: 3
allowMultiplePerNode: false
mgr:
modules:
# Several modules should not need to be included in this list. The "dashboard" and "monitoring" modules
# are already enabled by other settings in the cluster CR and the "rook" module is always enabled.
- name: pg_autoscaler
enabled: true
# enable the ceph dashboard for viewing cluster status
dashboard:
enabled: true
# serve the dashboard under a subpath (useful when you are accessing the dashboard via a reverse proxy)
# urlPrefix: /ceph-dashboard
# serve the dashboard at the given port.
# port: 8443
# serve the dashboard using SSL
ssl: true
# enable prometheus alerting for cluster
monitoring:
# requires Prometheus to be pre-installed
enabled: false
# namespace to deploy prometheusRule in. If empty, namespace of the cluster will be used.
# Recommended:
# If you have a single rook-ceph cluster, set the rulesNamespace to the same namespace as the cluster or keep it empty.
# If you have multiple rook-ceph clusters in the same k8s cluster, choose the same namespace (ideally, namespace with prometheus
# deployed) to set rulesNamespace for all the clusters. Otherwise, you will get duplicate alerts with multiple alert definitions.
rulesNamespace: rook-ceph
network:
# enable host networking
#provider: host
# EXPERIMENTAL: enable the Multus network provider
#provider: multus
#selectors:
# The selector keys are required to be `public` and `cluster`.
# Based on the configuration, the operator will do the following:
# 1. if only the `public` selector key is specified both public_network and cluster_network Ceph settings will listen on that interface
# 2. if both `public` and `cluster` selector keys are specified the first one will point to 'public_network' flag and the second one to 'cluster_network'
#
# In order to work, each selector value must match a NetworkAttachmentDefinition object in Multus
#
#public: public-conf --> NetworkAttachmentDefinition object name in Multus
#cluster: cluster-conf --> NetworkAttachmentDefinition object name in Multus
# enable the crash collector for ceph daemon crash collection
crashCollector:
disable: false
cleanupPolicy:
# cleanup should only be added to the cluster when the cluster is about to be deleted.
# After any field of the cleanup policy is set, Rook will stop configuring the cluster as if the cluster is about
# to be destroyed in order to prevent these settings from being deployed unintentionally.
# To signify that automatic deletion is desired, use the value "yes-really-destroy-data". Only this and an empty
# string are valid values for this field.
confirmation: ""
# To control where various services will be scheduled by kubernetes, use the placement configuration sections below.
# The example under 'all' would have all services scheduled on kubernetes nodes labeled with 'role=storage-node' and
# tolerate taints with a key of 'storage-node'.
placement:
mon:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: ceph-mon
operator: In
values:
- enabled
podAffinity:
podAntiAffinity:
topologySpreadConstraints:
tolerations:
- key: ceph-mon
operator: Exists
osd:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: ceph-osd
operator: In
values:
- enabled
podAffinity:
podAntiAffinity:
topologySpreadConstraints:
tolerations:
- key: ceph-osd
operator: Exists
mgr:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: ceph-mgr
operator: In
values:
- enabled
podAffinity:
podAntiAffinity:
topologySpreadConstraints:
tolerations:
- key: ceph-mgr
operator: Exists
# cleanup:
annotations:
# all:
# mon:
# osd:
# cleanup:
# If no mgr annotations are set, prometheus scrape annotations will be set by default.
# mgr:
resources:
# The requests and limits set here, allow the mgr pod to use half of one CPU core and 1 gigabyte of memory
# mgr:
# limits:
# cpu: "500m"
# memory: "1024Mi"
# requests:
# cpu: "500m"
# memory: "1024Mi"
# The above example requests/limits can also be added to the mon and osd components
# mon:
# osd:
# prepareosd:
# crashcollector:
# cleanup:
# The option to automatically remove OSDs that are out and are safe to destroy.
removeOSDsIfOutAndSafeToRemove: false
# priorityClassNames:
# all: rook-ceph-default-priority-class
# mon: rook-ceph-mon-priority-class
# osd: rook-ceph-osd-priority-class
# mgr: rook-ceph-mgr-priority-class
storage: # cluster level storage configuration and selection
useAllNodes: false #关闭使用所有Node
useAllDevices: false #关闭使用所有设备
deviceFilter: vda4
config:
# metadataDevice: "md0" # specify a non-rotational storage so ceph-volume will use it as block db device of bluestore.
# databaseSizeMB: "1024" # uncomment if the disks are smaller than 100 GB
# journalSizeMB: "1024" # uncomment if the disks are 20 GB or smaller
# osdsPerDevice: "1" # this value can be overridden at the node or device level
# encryptedDevice: "true" # the default value for this option is "false"
# Individual nodes and their config can be specified as well, but 'useAllNodes' above must be set to false. Then, only the named
# nodes below will be used as storage resources. Each node's 'name' field should match their 'kubernetes.io/hostname' label.
nodes:
- name: "k8s-node1" #指定存储节点主机
devices:
- name: "vda4" #指定磁盘为sdb
config:
storeType: bluestore
- name: "k8s-node2"
devices:
- name: "vda4"
config:
storeType: bluestore
- name: "k8s-node3"
devices:
- name: "vda4"
config:
storeType: bluestore
# The section for configuring management of daemon disruptions during upgrade or fencing.
disruptionManagement:
# If true, the operator will create and manage PodDisruptionBudgets for OSD, Mon, RGW, and MDS daemons. OSD PDBs are managed dynamically
# via the strategy outlined in the [design](https://github.com/rook/rook/blob/master/design/ceph/ceph-managed-disruptionbudgets.md). The operator will
# block eviction of OSDs by default and unblock them safely when drains are detected.
managePodBudgets: false
# A duration in minutes that determines how long an entire failureDomain like `region/zone/host` will be held in `noout` (in addition to the
# default DOWN/OUT interval) when it is draining. This is only relevant when `managePodBudgets` is `true`. The default value is `30` minutes.
osdMaintenanceTimeout: 30
# If true, the operator will create and manage MachineDisruptionBudgets to ensure OSDs are only fenced when the cluster is healthy.
# Only available on OpenShift.
manageMachineDisruptionBudgets: false
# Namespace in which to watch for the MachineDisruptionBudgets.
machineDisruptionBudgetNamespace: openshift-machine-api
更多 cluster 的 CRD 配置参考:
- https://github.com/rook/rook/blob/master/Documentation/ceph-cluster-crd.md
- https://blog.gmem.cc/rook-based-k8s-storage-solution
执行cluster.yaml
kubectl create -f cluster.yaml
# 查看部署 log
$ kubectl logs -f -n rook-ceph rook-ceph-operator-567d7945d6-t9rd4
# 等待一定时间,部分中间态容器可能会波动
[7d@k8s-master ceph]$ kubectl get pods -n rook-ceph -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
csi-cephfsplugin-dr4dq 3/3 Running 0 24h 172.16.106.239 k8s-node2 <none> <none>
csi-cephfsplugin-provisioner-6bcb7cdd75-dtzmn 5/5
.......
注意:若部署失败
master 节点执行
kubectl delete -f ./
所有 node 节点执行如下清理操作:
rm -rf /var/lib/rook
/dev/mapper/ceph-*
dmsetup ls
dmsetup remove_all
dd if=/dev/zero of=/dev/vda4 bs=512k count=1
wipefs -af /dev/vda4
部署 Toolbox
Toolbox 是一个 Rook 的工具集容器,该容器中的命令可以用来调试、测试 Rook,对 Ceph 临时测试的操作一般在这个容器内执行。
# 启动 rook-ceph-tools pod:
$ kubectl create -f toolbox.yaml
deployment.apps/rook-ceph-tools created
# 等待 rook-ceph-tools 载其容器并进入 running 状态:
$ kubectl -n rook-ceph get pod -l "app=rook-ceph-tools"
NAME READY STATUS RESTARTS AGE
rook-ceph-tools-6d659f5579-knt6x 1/1 Running 0 7s
测试 Rook
$ kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl kubectl exec [POD] -- [COMMAND] instead.
# 查看 Ceph 状态
[root@rook-ceph-tools-6d659f5579-knt6x /]# ceph status
cluster:
id: 550e2978-26a6-4f3b-b101-10369ab63cf4
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 2m)
mgr: a(active, since 85s)
osd: 3 osds: 3 up (since 114s), 3 in (since 114s)
data:
pools: 1 pools, 1 pgs
objects: 0 objects, 0 B
usage: 3.0 GiB used, 147 GiB / 150 GiB avail
pgs: 1 active+clean
[root@rook-ceph-tools-6d659f5579-knt6x /]# ceph osd status
ID HOST USED AVAIL WR OPS WR DATA RD OPS RD DATA STATE
0 k8s-node2 1026M 48.9G 0 0 0 0 exists,up
1 k8s-node1 1026M 48.9G 0 0 0 0 exists,up
2 k8s-node3 1026M 48.9G 0 0 0 0 exists,up
[root@rook-ceph-tools-6d659f5579-knt6x /]# ceph df
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 150 GiB 147 GiB 6.2 MiB 3.0 GiB 2.00
TOTAL 150 GiB 147 GiB 6.2 MiB 3.0 GiB 2.00
--- POOLS ---
POOL ID STORED OBJECTS USED %USED MAX AVAIL
device_health_metrics 1 0 B 0 0 B 0 46 GiB
[root@rook-ceph-tools-6d659f5579-knt6x /]# rados df
POOL_NAME USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS RD WR_OPS WR USED COMPR UNDER COMPR
device_health_metrics 0 B 0 0 0 0 0 0 0 0 B 0 0 B 0 B 0 B
total_objects 0
total_used 3.0 GiB
total_avail 147 GiB
total_space 150 GiB
# 查看 Ceph 所有 keyring
[root@rook-ceph-tools-6d659f5579-knt6x /]# ceph auth ls
installed auth entries:
osd.0
key: AQAjofFe1j9pGhAABnjTXAYZeZdwo2FGHIFv+g==
caps: [mgr] allow profile osd
caps: [mon] allow profile osd
caps: [osd] allow *
osd.1
key: AQAjofFeY0LaHhAAMVLxrH1lqXqyYsZE9yJ5dg==
caps: [mgr] allow profile osd
caps: [mon] allow profile osd
caps: [osd] allow *
osd.2
key: AQAkofFeBjtoDBAAVYW7FursqpbttekW54u2rA==
caps: [mgr] allow profile osd
caps: [mon] allow profile osd
caps: [osd] allow *
client.admin
key: AQDhoPFeqLE4ORAAvBeGwV7p1YY25owP8nS02Q==
caps: [mds] allow *
caps: [mgr] allow *
caps: [mon] allow *
caps: [osd] allow *
client.bootstrap-mds
key: AQABofFeQclcCxAAEtA9Y4+yF3I6H9RM0/f1DQ==
caps: [mon] allow profile bootstrap-mds
client.bootstrap-mgr
key: AQABofFeaeBcCxAA9SEnt+RV7neC4uy/xQb5qg==
caps: [mon] allow profile bootstrap-mgr
client.bootstrap-osd
key: AQABofFe0/NcCxAAqCKwJpzPlav8MuajRk8xmw==
caps: [mon] allow profile bootstrap-osd
client.bootstrap-rbd
key: AQABofFesAZdCxAAZyWJg+Pa3F0g5Toy4LamPw==
caps: [mon] allow profile bootstrap-rbd
client.bootstrap-rbd-mirror
key: AQABofFejRpdCxAA/9NbTQDJILdSoYJZdol7bQ==
caps: [mon] allow profile bootstrap-rbd-mirror
client.bootstrap-rgw
key: AQABofFeAi5dCxAAKu67ZyM8PRRPcluTXR3YRw==
caps: [mon] allow profile bootstrap-rgw
client.crash
key: AQAcofFeLDr3KBAAw9UowFd26JiQSGjCFyhx8w==
caps: [mgr] allow profile crash
caps: [mon] allow profile crash
client.csi-cephfs-node
key: AQAcofFeFRh4DBAA7Z8kgcHGM92vHj6cvGbXXg==
caps: [mds] allow rw
caps: [mgr] allow rw
caps: [mon] allow r
caps: [osd] allow rw tag cephfs *=*
client.csi-cephfs-provisioner
key: AQAbofFemLuJMRAA4WlGWBjONb1av48rox1q6g==
caps: [mgr] allow rw
caps: [mon] allow r
caps: [osd] allow rw tag cephfs metadata=*
client.csi-rbd-node
key: AQAbofFepu7rFhAA+vdit2ipDgVFc/yKUpHHug==
caps: [mon] profile rbd
caps: [osd] profile rbd
client.csi-rbd-provisioner
key: AQAaofFe3Yw9OxAAiJzZ6HQne/e9Zob5G311OA==
caps: [mgr] allow rw
caps: [mon] profile rbd
caps: [osd] profile rbd
mgr.a
key: AQAdofFeh4VZHhAA8VL9gH5jgOxzjTDtEaFWBQ==
caps: [mds] allow *
caps: [mon] allow profile mgr
caps: [osd] allow *
[root@rook-ceph-tools-6d659f5579-knt6x /]# ceph version
ceph version 15.2.3 (d289bbdec69ed7c1f516e0a093594580a76b78d0) octopus (stable)
[root@rook-ceph-tools-6d659f5579-knt6x /]# exit
exit
这样,一个基于 Rook 持久化存储集群就以容器的方式运行起来了,而接下来在 Kubernetes 项目上创建的所有 Pod 就能够通过 Persistent Volume(PV)和 Persistent Volume Claim(PVC)的方式,在容器里挂载由 Ceph 提供的数据卷了。而 Rook 项目,则会负责这些数据卷的生命周期管理、灾难备份等运维工作。
设置 dashboard
dashboard 是非常有用的工具,可让你大致了解 Ceph 集群的状态,包括总体运行状况,单仲裁状态,mgr,osd 和其他 Ceph 守护程序的状态,查看池和 PG 状态,显示日志用于守护程序等等。Rook 使启用仪表板变得简单。
部署 Node SVC
修改dashboard-external-https.yaml
$ vi dashboard-external-https.yaml
apiVersion: v1
kind: Service
metadata:
name: rook-ceph-mgr-dashboard-external-https
namespace: rook-ceph
labels:
app: rook-ceph-mgr
rook_cluster: rook-ceph
spec:
ports:
- name: dashboard
port: 8443
protocol: TCP
targetPort: 8443
selector:
app: rook-ceph-mgr
rook_cluster: rook-ceph
sessionAffinity: None
type: NodePort
创建 Node SVC
$ kubectl create -f dashboard-external-https.yaml
service/rook-ceph-mgr-dashboard-external-https created
$ kubectl get svc -n rook-ceph
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
csi-cephfsplugin-metrics ClusterIP 10.102.32.77 <none> 8080/TCP,8081/TCP 17m
csi-rbdplugin-metrics ClusterIP 10.101.121.5 <none> 8080/TCP,8081/TCP 17m
rook-ceph-mgr ClusterIP 10.99.155.138 <none> 9283/TCP 16m
rook-ceph-mgr-dashboard ClusterIP 10.97.61.135 <none> 8443/TCP 16m
rook-ceph-mgr-dashboard-external-https NodePort 10.108.210.25 <none> 8443:32364/TCP 6s
rook-ceph-mon-a ClusterIP 10.102.116.81 <none> 6789/TCP,3300/TCP 16m
rook-ceph-mon-b ClusterIP 10.101.141.241 <none> 6789/TCP,3300/TCP 16m
rook-ceph-mon-c ClusterIP 10.101.157.247 <none> 6789/TCP,3300/TCP 16m
Rook operator 将启用 ceph-mgr dashboard 模块。将创建一个服务对象以在 Kubernetes 集群中公开该端口。Rook 将启用端口 8443 进行 https 访问。
确认验证
登录 dashboard 需要安全访问。Rook 在运行 Rook Ceph 集群的名称空间中创建一个默认用户,admin 并生成一个称为的秘密rook-ceph-dashboard-admin-password
。
要检索生成的密码,可以运行以下命令:
kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo
Ceph 块存储应用
创建StorageClass
在提供(Provisioning)块存储之前,需要先创建StorageClass 和存储池。K8S 需要这两类资源,才能和Rook 交互,进而分配持久卷(PV)。
cd rook/cluster/examples/kubernetes/ceph/csi/rbd
kubectl create -f csi/rbd/storageclass.yaml
解读:如下配置文件中会创建一个名为 replicapool 的存储池,和rook-ceph-block
的 storageClass。
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
name: replicapool
namespace: rook-ceph
spec:
failureDomain: host
replicated:
size: 3
# Disallow setting pool with replica 1, this could lead to data loss without recovery.
# Make sure you're *ABSOLUTELY CERTAIN* that is what you want
requireSafeReplicaSize: true
# gives a hint (%) to Ceph in terms of expected consumption of the total cluster capacity of a given pool
# for more info: https://docs.ceph.com/docs/master/rados/operations/placement-groups/#specifying-expected-pool-size
#targetSizeRatio: .5
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
# clusterID is the namespace where the rook cluster is running
# If you change this namespace, also change the namespace below where the secret namespaces are defined
clusterID: rook-ceph
# If you want to use erasure coded pool with RBD, you need to create
# two pools. one erasure coded and one replicated.
# You need to specify the replicated pool here in the `pool` parameter, it is
# used for the metadata of the images.
# The erasure coded pool must be set as the `dataPool` parameter below.
#dataPool: ec-data-pool
pool: replicapool
# RBD image format. Defaults to "2".
imageFormat: "2"
# RBD image features. Available for imageFormat: "2". CSI RBD currently supports only `layering` feature.
imageFeatures: layering
# The secrets contain Ceph admin credentials. These are generated automatically by the operator
# in the same namespace as the cluster.
csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
# Specify the filesystem type of the volume. If not specified, csi-provisioner
# will set default as `ext4`.
csi.storage.k8s.io/fstype: ext4
# uncomment the following to use rbd-nbd as mounter on supported nodes
# **IMPORTANT**: If you are using rbd-nbd as the mounter, during upgrade you will be hit a ceph-csi
# issue that causes the mount to be disconnected. You will need to follow special upgrade steps
# to restart your application pods. Therefore, this option is not recommended.
#mounter: rbd-nbd
allowVolumeExpansion: true
reclaimPolicy: Delete
$ kubectl get storageclasses.storage.k8s.io
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
rook-ceph-block rook-ceph.rbd.csi.ceph.com Delete Immediate true 44m
创建PVC
$ kubectl create -f pvc.yaml
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
rbd-pvc Bound pvc-b2b7ce1d-7cad-4b7b-afac-dcf6cd597e88 1Gi RWO rook-ceph-block 45m
$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-b2b7ce1d-7cad-4b7b-afac-dcf6cd597e88 1Gi RWO Delete Bound default/rbd-pvc rook-ceph-block 45m
解读:如上创建相应的PVC,storageClassName:为基于 rook Ceph 集群的 rook-ceph-block。
pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: rbd-pvc
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
storageClassName: rook-ceph-block
消费块设备
$ kubectl create -f rookpod01.yaml
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
rookpod01 0/1 Completed 0 4m1s
解读:创建如上 Pod,并挂载之前所创建的PVC,等待执行完毕
rookpod01.yaml
:
apiVersion: v1
kind: Pod
metadata:
name: rookpod01
spec:
restartPolicy: OnFailure
containers:
- name: test-container
image: busybox
volumeMounts:
- name: block-pvc
mountPath: /var/test
command: ['sh', '-c', 'echo "Hello World" > /var/test/data; exit 0']
volumes:
- name: block-pvc
persistentVolumeClaim:
claimName: rbd-pvc
readOnly: false
测试持久性
# 删除rookpod01
$ kubectl delete -f rookpod01.yaml
pod "rookpod01" deleted
# 创建rookpod02
$ kubectl create -f rookpod02.yaml
pod/rookpod02 created
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
rookpod02 0/1 Completed 0 59s
$ kubectl logs rookpod02
Hello World
解读:创建 rookpod02,并使用所创建的 PVC,测试持久性。
rookpod02.yaml
:
apiVersion: v1
kind: Pod
metadata:
name: rookpod02
spec:
restartPolicy: OnFailure
containers:
- name: test-container
image: busybox
volumeMounts:
- name: block-pvc
mountPath: /var/test
command: ['sh', '-c', 'cat /var/test/data; exit 0']
volumes:
- name: block-pvc
persistentVolumeClaim:
claimName: rbd-pvc
readOnly: false
遇到问题
dashboard 点击概述500 Internal Server Error
解决办法:
创建内置管理员角色的新副本,从该角色中删除 iscsi 权限,然后将此新角色分配给管理员。大概是在上游解决此问题后,可以删除新角色并将管理员角色重新分配给 admin 用户。
ceph dashboard ac-role-create admin-no-iscsi
for scope in dashboard-settings log rgw prometheus grafana nfs-ganesha manager hosts rbd-image config-opt rbd-mirroring cephfs user osd pool monitor; do
ceph dashboard ac-role-add-scope-perms admin-no-iscsi ${scope} create delete read update;
done
ceph dashboard ac-user-set-roles admin admin-no-iscsi
[root@rook-ceph-tools-6d659f5579-knt6x /]# ceph dashboard ac-role-create admin-no-iscsi
{"name": "admin-no-iscsi", "description": null, "scopes_permissions": {}}
[root@rook-ceph-tools-6d659f5579-knt6x /]# for scope in dashboard-settings log rgw prometheus grafana nfs-ganesha manager hosts rbd-image config-opt rbd-mirroring cephfs user osd pool monitor; do
> ceph dashboard ac-role-add-scope-perms admin-no-iscsi ${scope} create delete read update;
[root@rook-ceph-tools-6d659f5579-knt6x /]# ceph dashboard ac-user-set-roles admin admin-no-iscsi
源码地址:
- 点赞
- 收藏
- 关注作者
评论(0)