Getting Started with Grafana: Designing a kube-controller-manager Monitoring Dashboard
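I. Background
The open-source Prometheus-operator cannot monitor the master components of a CCE cluster out of the box. In public-cloud environments the master components are managed by the provider, so collecting their metrics requires some adaptation, and the stock Grafana dashboards need heavy modification before they work.
Given this, I decided to build a kube-controller-manager dashboard from scratch for readers to use, and to share some basic Grafana operations along the way.
The finished dashboard looks like this:
Note: this article uses Grafana v9.5.3.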
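II. Key kube-controller-manager metrics
Before designing the dashboard, make sure the Prometheus data source is already collecting the controller-manager metrics.
kube-controller-manager runs the background control loops that handle routine tasks in the cluster. Its key metrics are:
- workqueue_adds_total: total number of items added to each controller's work queue. Counter.
- workqueue_depth: current depth of a controller's queue, that is, the number of items waiting in it. Gauge.
- workqueue_queue_duration_seconds_bucket: how long items wait in the queue. Histogram.
- rest_client_requests_total: number of HTTP requests, broken down by status code, method, and host.
- rest_client_request_duration_seconds_bucket: HTTP request latency, broken down by verb and URL.
- process_resident_memory_bytes: resident memory size of the process, in bytes. Gauge.
- process_cpu_seconds_total: total user and system CPU time consumed by the process, in seconds.
- go_goroutines: number of goroutines.
The dashboard is built from these metrics.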
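Before opening Grafana, it can help to confirm that the metrics listed above are actually exposed. The following is a minimal sketch, assuming you have saved the component's /metrics output (Prometheus text format) as a string; the SAMPLE text is made up for illustration, and the helper names are hypothetical.

```python
# Metric names taken from the list above.
REQUIRED_METRICS = [
    "workqueue_adds_total",
    "workqueue_depth",
    "workqueue_queue_duration_seconds_bucket",
    "rest_client_requests_total",
    "rest_client_request_duration_seconds_bucket",
    "process_resident_memory_bytes",
    "process_cpu_seconds_total",
    "go_goroutines",
]


def metric_names(exposition_text):
    """Return the set of metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The metric name ends at the first "{" (labels) or space (value).
        end = len(line)
        for sep in ("{", " "):
            pos = line.find(sep)
            if pos != -1:
                end = min(end, pos)
        names.add(line[:end])
    return names


def missing_metrics(exposition_text, required=REQUIRED_METRICS):
    """Return the required metric names that do not appear in the text."""
    present = metric_names(exposition_text)
    return [name for name in required if name not in present]


# Made-up sample, only to show the input shape.
SAMPLE = """# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 1234
workqueue_depth{name="deployment"} 0
workqueue_adds_total{name="deployment"} 42
"""

print(missing_metrics(SAMPLE))
```

An empty list means every metric the dashboard relies on is present in the scraped output.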
III. Hands-on
A complete dashboard is made up of multiple panels, each presenting a different metric, so we build it one panel at a time.
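3.1 Design Panel01 to count running kube-controller-manager instances
- For a single statistic, use the Stat visualization.
- Next, configure the data source. Choose whatever matches your environment, usually a Prometheus data source. If you have extended Prometheus with Thanos, you can select thanos-querier instead.
- Write the PromQL for what the panel should display:
sum(up{job="kube-controller-proxy"})
This counts the kube-controller-manager instances that are in the UP state.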
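The same expression can be checked outside Grafana through the Prometheus HTTP API (`/api/v1/query`). The sketch below only builds the request URL and reads the value out of a response; the base URL and SAMPLE_RESPONSE are assumptions for illustration, and no network call is made.

```python
from urllib.parse import urlencode

# Expression used by Panel01.
EXPR = 'sum(up{job="kube-controller-proxy"})'


def build_query_url(base_url, expr):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return base_url.rstrip("/") + "/api/v1/query?" + urlencode({"query": expr})


def first_value(response):
    """Return the value of the first series in an instant-vector response, or None."""
    if response.get("status") != "success":
        return None
    result = response.get("data", {}).get("result", [])
    if not result:
        return None
    # Each sample is [timestamp, "value-as-string"].
    return float(result[0]["value"][1])


# Hypothetical response for a cluster with three healthy instances.
SAMPLE_RESPONSE = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [{"metric": {}, "value": [1700000000.0, "3"]}],
    },
}

print(build_query_url("http://prometheus:9090", EXPR))
print(first_value(SAMPLE_RESPONSE))
```

If `first_value` returns None, the Stat panel would show its "no value" text, which usually means the job label does not match your scrape configuration.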
- Polish the panel: add a description and adjust the style, colors, and layout.
When you are happy with the result, save and exit.
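3.2 Design the dashboard variables
Before designing Panel02, define the dashboard variables that later panels will need, for example the data source and the instance. A highly available k8s cluster runs several kube-controller-manager instances, so a variable lets users choose which instance to inspect.
- Add the variables.
- First variable, the data source: a variable of type Data source named dataSource, restricted to the prometheus type.
- Second variable, the instance IP: a variable of type Query named instance, using the query query_result(up{job="kube-controller-proxy"}) and the regex .*instance="(.*?)".* to extract the instance label, with Multi-value and Include All enabled.
- When the variables are configured, click Save.
Note: if some panels were designed before the variables were created, edit those panels and replace the hard-coded values with the variables. Otherwise the panels will not work in other environments because the values will not match. For example, Panel01 needs its data source changed to the variable.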
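To see what the instance variable's regex does, here is a small sketch that applies the same pattern to strings shaped like query_result output. The sample lines and IP addresses are made up, and Grafana evaluates the regex in JavaScript rather than Python, so treat this only as an approximation of the behavior.

```python
import re

# Regex taken from the instance variable definition in this dashboard.
INSTANCE_REGEX = re.compile(r'.*instance="(.*?)".*')


def extract_instances(query_result_lines):
    """Return the sorted, de-duplicated instance values captured by the regex."""
    values = set()
    for line in query_result_lines:
        match = INSTANCE_REGEX.match(line)
        if match:
            values.add(match.group(1))
    return sorted(values)


# Hypothetical lines in the shape query_result() returns for `up`.
SAMPLE_LINES = [
    'up{instance="192.168.0.11:10257", job="kube-controller-proxy"} 1 1700000000000',
    'up{instance="192.168.0.12:10257", job="kube-controller-proxy"} 1 1700000000000',
    'up{instance="192.168.0.11:10257", job="kube-controller-proxy"} 1 1700000015000',
]

print(extract_instances(SAMPLE_LINES))
```

Only the text inside the capture group becomes a selectable value, which is why the variable drop-down lists bare instance addresses instead of full series names.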
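3.3 Design Panel02 to show the Workqueue add rate
- Select Time series as the visualization, then write the PromQL for what the panel should display:
sum(rate(workqueue_adds_total{job="kube-controller-proxy", instance=~"$instance"}[$__rate_interval])) by ( instance, name)
Here $__rate_interval tells Grafana to pick the rate() window automatically from the displayed time range and the scrape interval, so the graph behaves well at any zoom level.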
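According to the Grafana documentation, $__rate_interval is computed as max($__interval + scrape interval, 4 * scrape interval). A minimal sketch of that formula follows; the 15-second default mirrors the Prometheus data source's default scrape interval setting, and the function name is hypothetical.

```python
def rate_interval(interval_s, scrape_interval_s=15.0):
    """Approximate Grafana's $__rate_interval, in seconds.

    interval_s is Grafana's $__interval (the step chosen for the current
    time range and panel width); scrape_interval_s is the scrape interval
    configured on the data source.
    """
    return max(interval_s + scrape_interval_s, 4 * scrape_interval_s)


# Zoomed in: the window never drops below four scrape intervals.
print(rate_interval(5))
# Zoomed out: the window grows with the step.
print(rate_interval(120))
```

The lower bound of four scrape intervals is what keeps rate() from returning empty results when the dashboard is zoomed in tightly.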
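- Adjust the labels returned by the Panel02 query.
The series names returned by the PromQL query are too long, so we trim them.
In the Options section under the query, configure Legend so that only the label values are kept.
The result looks like this:
Notice that the instance in the legend always carries a port number, which is not very tidy. A Transform rule can remove it, for example Rename by regex with the pattern (.*):10257(.*) and the replacement $1$2: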
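The effect of that rename rule can be tried out in isolation. This sketch applies the same pattern, `(.*):10257(.*)`, with Python's `re` module; Grafana writes the replacement as `$1$2`, whereas Python uses `\1\2`. The sample series name is made up.

```python
import re

# Same pattern as the dashboard's renameByRegex transformation.
PORT_PATTERN = re.compile(r"(.*):10257(.*)")


def strip_port(series_name):
    """Drop the ':10257' port from a series name, leaving the rest intact."""
    return PORT_PATTERN.sub(r"\1\2", series_name)


print(strip_port('{instance="192.168.0.11:10257", name="deployment"}'))
print(strip_port("192.168.0.11:10257"))
```

Names that do not contain the port are returned unchanged, so the rule is safe to leave on panels whose legends omit the instance.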
- Adjust the layout of Panel02.
Add a unit to the Y axis:
The overall result for Panel02:
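3.4 Design Panel03 to show the Workqueue depth
- Select Time series as the visualization, then write the PromQL for what the panel should display.
sum(rate(workqueue_depth{job="kube-controller-proxy", instance=~"$instance"}[$__rate_interval])) by ( name)
This shows how the number of items in each controller's queue changes over time. Note that workqueue_depth is a Gauge, while rate() is intended for counters. To plot the depth itself, drop the rate() wrapper; to plot its change, deriv() or delta() is the better fit.
- Adjust the layout and style of Panel03.
Switch the Legend from list to table, and add the Last and Max values for comparison.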
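3.5 Design Panel04 to show Workqueue latency
- First, a quick look at histogram metrics in Prometheus. Histograms are typically used to track values such as request latency or response size. Each bucket's counter is cumulative: a bucket with a larger bound also counts everything in the buckets below it. Buckets are identified by the special le label, which stands for "less than or equal to". kube-controller-manager exposes workqueue_queue_duration_seconds_bucket, which records how long items stay in the queue. The function most commonly used with histograms is histogram_quantile, which estimates a quantile from the bucket counts over a time window.
- Select Time series as the visualization, then write the PromQL for what the panel should display:
histogram_quantile(0.99, sum(rate(workqueue_queue_duration_seconds_bucket{job="kube-controller-proxy", instance=~"$instance"}[$__rate_interval])) by ( name, le))
This query returns the 99th percentile of the time items spent in the work queue over the recent window.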
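To make the cumulative buckets and the quantile estimate concrete, here is a simplified sketch of the interpolation that histogram_quantile performs. It assumes the first bucket starts at 0 and ignores several edge cases that Prometheus handles, and the bucket counts are invented for illustration.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (le, count) buckets.

    buckets must contain at least one finite bound plus the +Inf bucket.
    Simplified sketch: linear interpolation inside the matching bucket.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    buckets = sorted(buckets)
    if len(buckets) < 2 or buckets[-1][0] != math.inf:
        raise ValueError("need a finite bucket and a +Inf bucket")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # First bucket whose cumulative count reaches the rank.
    idx = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    if idx == len(buckets) - 1:
        # The quantile falls in the +Inf bucket: report the highest finite bound.
        return buckets[-2][0]
    upper, count = buckets[idx]
    lower, below = (0.0, 0.0) if idx == 0 else buckets[idx - 1]
    in_bucket = count - below
    if in_bucket == 0:
        return lower
    return lower + (upper - lower) * (rank - below) / in_bucket


# Invented cumulative counts: 10 items waited <= 0.1s, 60 waited <= 0.5s, ...
BUCKETS = [(0.1, 10), (0.5, 60), (1.0, 90), (math.inf, 100)]

print(histogram_quantile(0.5, BUCKETS))
print(histogram_quantile(0.99, BUCKETS))
```

Because the result is interpolated within a bucket, its accuracy depends on how finely the bucket bounds are spaced around the quantile of interest.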
- With the basic panel in place, adjust the style and layout of Panel04.
3.6 Design Panel05 to show Kube API request QPS
- Select Time series as the visualization. Write PromQL that counts the HTTP requests sent to kube-apiserver, broken down by method and response code. The panel uses four queries, one per status class:
sum(rate(rest_client_requests_total{job="kube-controller-proxy", instance=~"$instance",code=~"2.."}[$__rate_interval])) by (method,code)
sum(rate(rest_client_requests_total{job="kube-controller-proxy", instance=~"$instance",code=~"3.."}[$__rate_interval])) by (method,code)
sum(rate(rest_client_requests_total{job="kube-controller-proxy", instance=~"$instance",code=~"4.."}[$__rate_interval])) by (method,code)
sum(rate(rest_client_requests_total{job="kube-controller-proxy", instance=~"$instance",code=~"5.."}[$__rate_interval])) by (method,code)
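The matcher `code=~"2.."` works because PromQL regex matchers are fully anchored, so the pattern must match the whole label value. The sketch below imitates one of the four queries on made-up per-series rates: it filters by status class with `re.fullmatch` and then sums by (method, code). The sample tuples and function name are hypothetical.

```python
import re
from collections import defaultdict


def sum_by_method_code(samples, code_pattern):
    """Keep samples whose code fully matches code_pattern, then sum by (method, code).

    samples: iterable of (method, code, instance, per_second_rate).
    """
    totals = defaultdict(float)
    for method, code, _instance, value in samples:
        # fullmatch mirrors PromQL's anchored regex matching.
        if re.fullmatch(code_pattern, code):
            totals[(method, code)] += value
    return dict(totals)


# Invented per-series request rates from two instances.
SAMPLES = [
    ("GET", "200", "192.168.0.11:10257", 1.0),
    ("GET", "200", "192.168.0.12:10257", 2.0),
    ("PUT", "200", "192.168.0.11:10257", 0.5),
    ("GET", "404", "192.168.0.11:10257", 0.25),
]

print(sum_by_method_code(SAMPLES, "2.."))
print(sum_by_method_code(SAMPLES, "4.."))
```

Splitting the classes into separate queries, as the panel does, makes it easy to hide or restyle error classes independently in Grafana.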
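3.7 Design Panel06 to show Kube API request latency
- Select Time series as the visualization.
Write the PromQL:
histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{job="kube-controller-proxy", instance=~"$instance"}[$__rate_interval])) by (verb, le))
This query returns the 99th percentile of Kube API request latency over the recent window.
- Then adjust the panel style.
This mainly means adding a panel description and showing the name and value in the Legend.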
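3.8 Design Panel07 to show kube-controller-manager memory usage
- Select the Time series panel as the visualization.
Memory usage can be read from a metric that kube-controller-manager exposes:
process_resident_memory_bytes{job="kube-controller-proxy",instance=~"$instance"}
Because the instance value still carries the port number, we can again strip it with a Transform:
- Adjust the style of Panel07.
The main adjustments are the Y-axis unit and how the Legend is displayed.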
3.9 Design Panel08 to show kube-controller-manager CPU usage
- Select the Time series panel as the visualization.
The following expression
rate(process_cpu_seconds_total{job="kube-controller-proxy",instance=~"$instance"}[$__rate_interval])
represents the CPU usage of kube-controller-manager.
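Since process_cpu_seconds_total counts CPU seconds, its per-second rate reads as "CPU cores in use". Here is a minimal sketch of that arithmetic on two samples. It is a simplification: Prometheus' rate() uses every sample in the window and extrapolates to the window edges, and the numbers below are invented.

```python
def cpu_cores_used(t0, v0, t1, v1):
    """Average CPU cores used between two counter samples.

    t0, t1 are sample timestamps in seconds; v0, v1 are the values of
    process_cpu_seconds_total at those times.
    """
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    # A drop means the process restarted and the counter began again from 0.
    increase = v1 if v1 < v0 else v1 - v0
    return increase / (t1 - t0)


# 15 CPU-seconds consumed over 60 wall-clock seconds.
print(cpu_cores_used(0, 100.0, 60, 115.0))
```

A value of 0.25 therefore means the process used a quarter of one core on average over the window.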
The result of this query carries many labels, so filter them down:
Then trim the port from the instance label:
Adjust the style of Panel08:
3.10 Design Panel09 to show the kube-controller-manager goroutine count
- Select Time series as the visualization and enter the PromQL:
go_goroutines{job="kube-controller-proxy",instance=~"$instance"}
- Adjust the style and layout.
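IV. The finished dashboard
Once all the panels above are designed, they are saved together in a single dashboard.
If the time range picker does not appear in the upper-right corner, go to the dashboard settings and turn off the Hide time picker option.
Importing the dashboard into a highly available cluster: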
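V. Further reading
5.1 How to collect master component metrics in a CCE cluster
To collect metrics from the master node components with a self-hosted Prometheus, follow this guide:
https://support.huaweicloud.com/usermanual-cce/cce_10_0559.html
5.2 Persisting the dashboard
If Grafana runs in Kubernetes, the dashboard can be mounted into the Grafana container as a volume. The benefit is that the dashboard survives restarts of the Grafana container.
- Create a ConfigMap that stores the dashboard definition, and apply it with the command below. The contents of dashboard-kubecontroller.yaml follow the command.
kubectl apply -f dashboard-kubecontroller.yaml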
kind: ConfigMap apiVersion: v1 metadata: name: grafana-dashboard-controllermanager namespace: monitoring data: grafana-controllermanager.json: |- { "annotations": { "list": [ { "builtIn": 1, "datasource": { "type": "grafana", "uid": "-- Grafana --" }, "enable": true, "hide": true, "iconColor": "rgba(0, 211, 255, 1)", "name": "Annotations & Alerts", "type": "dashboard" } ] }, "description": "master组件kube-controller-manager指标大盘", "editable": true, "fiscalYearStartMonth": 0, "graphTooltip": 0, "id": 28, "links": [], "liveNow": false, "panels": [ { "datasource": { "type": "prometheus", "uid": "${dataSource}" }, "description": "统计kube-controller-manager正常运行的数量。可帮助判断该组件的运行状态", "fieldConfig": { "defaults": { "color": { "mode": "continuous-BlYlRd" }, "displayName": "kube-controller-manager运行:", "mappings": [ { "options": { "null": { "color": "red", "index": 0, "text": "N/A" } }, "type": "value" } ], "noValue": "指标出现异常,请排查", "thresholds": { "mode": "absolute", "steps": [ { "color": "green", "value": null }, { "color": "red", "value": 80 } ] }, "unit": "none" }, "overrides": [] }, "gridPos": { "h": 6, "w": 6, "x": 0, "y": 0 }, "id": 123124, "options": { "colorMode": "background", "graphMode": "none", "justifyMode": "auto", "orientation": "auto", "reduceOptions": { "calcs": [ "min" ], "fields": "", "values": false }, "text": {}, "textMode": "value_and_name" }, "pluginVersion": "9.5.3", "targets": [ { "datasource": { "type": "prometheus", "uid": "${dataSource}" }, "editorMode": "code", "expr": "sum (up{job=\"kube-controller-proxy\"})", "legendFormat": "__auto", "range": true, "refId": "A" } ], "title": "运行中", "type": "stat" }, { "datasource": { "type": "prometheus", "uid": "${dataSource}" }, "description": "各种类型controller每秒接受处理的任务数量。", "fieldConfig": { "defaults": { "color": { "mode": "palette-classic", "seriesBy": "last" }, "custom": { "axisCenteredZero": false, "axisColorMode": "text", "axisGridShow": true, "axisLabel": "", "axisPlacement": "auto", "barAlignment": 0, 
"drawStyle": "line", "fillOpacity": 20, "gradientMode": "opacity", "hideFrom": { "legend": false, "tooltip": false, "viz": false }, "lineInterpolation": "linear", "lineStyle": { "fill": "solid" }, "lineWidth": 3, "pointSize": 2, "scaleDistribution": { "type": "linear" }, "showPoints": "always", "spanNulls": false, "stacking": { "group": "A", "mode": "none" }, "thresholdsStyle": { "mode": "off" } }, "decimals": 2, "mappings": [ { "options": { "null": { "index": 0, "text": "N/A" } }, "type": "value" } ], "min": 0, "thresholds": { "mode": "absolute", "steps": [ { "color": "green", "value": null }, { "color": "dark-red", "value": 80 } ] }, "unit": "ops" }, "overrides": [] }, "gridPos": { "h": 6, "w": 17, "x": 6, "y": 0 }, "id": 123125, "options": { "legend": { "calcs": [ "lastNotNull", "max" ], "displayMode": "table", "placement": "right", "showLegend": true, "sortBy": "Last *", "sortDesc": true }, "timezone": [ "browser" ], "tooltip": { "mode": "single", "sort": "none" } }, "targets": [ { "datasource": { "type": "prometheus", "uid": "${dataSource}" }, "editorMode": "code", "expr": "sum(rate(workqueue_adds_total{job=\"kube-controller-proxy\", instance=~\"$instance\"}[$__rate_interval])) by ( instance, name)", "legendFormat": "__auto", "range": true, "refId": "A" } ], "title": "Workqueue入队速率", "transformations": [ { "id": "renameByRegex", "options": { "regex": "(.*):10257(.*)", "renamePattern": "$1$2" } } ], "type": "timeseries" }, { "datasource": { "type": "prometheus", "uid": "${dataSource}" }, "description": "一段时间内第99个百分数的工作队列处理时延。", "fieldConfig": { "defaults": { "color": { "mode": "palette-classic" }, "custom": { "axisCenteredZero": false, "axisColorMode": "text", "axisLabel": "", "axisPlacement": "auto", "barAlignment": 0, "drawStyle": "line", "fillOpacity": 13, "gradientMode": "opacity", "hideFrom": { "legend": false, "tooltip": false, "viz": false }, "lineInterpolation": "linear", "lineStyle": { "fill": "solid" }, "lineWidth": 2, "pointSize": 1, 
"scaleDistribution": { "type": "linear" }, "showPoints": "auto", "spanNulls": false, "stacking": { "group": "A", "mode": "none" }, "thresholdsStyle": { "mode": "off" } }, "mappings": [ { "options": { "NaN": { "index": 1 }, "null": { "index": 0, "text": "N/A" } }, "type": "value" } ], "thresholds": { "mode": "absolute", "steps": [ { "color": "green", "value": null }, { "color": "red", "value": 80 } ] }, "unit": "s" }, "overrides": [] }, "gridPos": { "h": 6, "w": 23, "x": 0, "y": 6 }, "id": 123127, "options": { "legend": { "calcs": [ "lastNotNull" ], "displayMode": "table", "placement": "right", "showLegend": true, "sortBy": "Last *", "sortDesc": false }, "timezone": [ "browser" ], "tooltip": { "mode": "single", "sort": "none" } }, "targets": [ { "datasource": { "type": "prometheus", "uid": "${dataSource}" }, "editorMode": "code", "expr": "histogram_quantile(0.99, sum(rate(workqueue_queue_duration_seconds_bucket{job=\"kube-controller-proxy\", instance=~\"$instance\"}[$__rate_interval])) by ( name, le))", "legendFormat": "{{name}}", "range": true, "refId": "A" } ], "title": "Workqueue处理时延", "type": "timeseries" }, { "datasource": { "type": "prometheus", "uid": "${dataSource}" }, "description": "从方法和状态码维度统计对kube-apiserver发起的rest请求。", "fieldConfig": { "defaults": { "color": { "mode": "palette-classic" }, "custom": { "axisCenteredZero": false, "axisColorMode": "text", "axisLabel": "", "axisPlacement": "auto", "barAlignment": 0, "drawStyle": "line", "fillOpacity": 3, "gradientMode": "none", "hideFrom": { "legend": false, "tooltip": false, "viz": false }, "lineInterpolation": "linear", "lineStyle": { "fill": "solid" }, "lineWidth": 2, "pointSize": 1, "scaleDistribution": { "type": "linear" }, "showPoints": "auto", "spanNulls": false, "stacking": { "group": "A", "mode": "none" }, "thresholdsStyle": { "mode": "off" } }, "mappings": [], "thresholds": { "mode": "absolute", "steps": [ { "color": "green", "value": null }, { "color": "red", "value": 80 } ] }, "unit": "ops" }, 
"overrides": [] }, "gridPos": { "h": 7, "w": 11, "x": 0, "y": 12 }, "id": 123128, "options": { "legend": { "calcs": [ "lastNotNull" ], "displayMode": "table", "placement": "right", "showLegend": true, "sortBy": "Last *", "sortDesc": true }, "tooltip": { "mode": "single", "sort": "none" } }, "targets": [ { "datasource": { "type": "prometheus", "uid": "${dataSource}" }, "editorMode": "code", "expr": "sum(rate(rest_client_requests_total{job=\"kube-controller-proxy\", instance=~\"$instance\",code=~\"2..\"}[$__rate_interval])) by (method,code)", "interval": "", "legendFormat": "{{method}} {{code}}", "range": true, "refId": "A" }, { "datasource": { "type": "prometheus", "uid": "${dataSource}" }, "editorMode": "code", "expr": "sum(rate(rest_client_requests_total{job=\"kube-controller-proxy\", instance=~\"$instance\",code=~\"3..\"}[$__rate_interval])) by (method,code)", "hide": false, "legendFormat": "{{method}} {{code}}", "range": true, "refId": "B" }, { "datasource": { "type": "prometheus", "uid": "${dataSource}" }, "editorMode": "code", "expr": "sum(rate(rest_client_requests_total{job=\"kube-controller-proxy\", instance=~\"$instance\",code=~\"4..\"}[$__rate_interval])) by (method,code)", "hide": false, "legendFormat": "{{method}} {{code}}", "range": true, "refId": "C" }, { "datasource": { "type": "prometheus", "uid": "${dataSource}" }, "editorMode": "code", "expr": "sum(rate(rest_client_requests_total{job=\"kube-controller-proxy\", instance=~\"$instance\",code=~\"5..\"}[$__rate_interval])) by (method,code)", "hide": false, "legendFormat": "{{method}} {{code}}", "range": true, "refId": "D" } ], "title": "Kube API 请求速率", "type": "timeseries" }, { "datasource": { "type": "prometheus", "uid": "${dataSource}" }, "description": "对kube-apiserver发起的HTTP请求时延,表示过去一段时间内第99个百分数HTTP请求时延。", "fieldConfig": { "defaults": { "color": { "mode": "palette-classic" }, "custom": { "axisCenteredZero": false, "axisColorMode": "text", "axisLabel": "", "axisPlacement": "auto", "barAlignment": 0, 
"drawStyle": "line", "fillOpacity": 33, "gradientMode": "opacity", "hideFrom": { "legend": false, "tooltip": false, "viz": false }, "lineInterpolation": "linear", "lineStyle": { "fill": "solid" }, "lineWidth": 2, "pointSize": 1, "scaleDistribution": { "type": "linear" }, "showPoints": "auto", "spanNulls": true, "stacking": { "group": "A", "mode": "none" }, "thresholdsStyle": { "mode": "off" } }, "mappings": [], "thresholds": { "mode": "absolute", "steps": [ { "color": "green", "value": null }, { "color": "red", "value": 80 } ] }, "unit": "s" }, "overrides": [] }, "gridPos": { "h": 7, "w": 12, "x": 11, "y": 12 }, "id": 123129, "options": { "legend": { "calcs": [ "lastNotNull" ], "displayMode": "table", "placement": "right", "showLegend": true, "sortBy": "Last *", "sortDesc": false }, "tooltip": { "mode": "single", "sort": "none" } }, "targets": [ { "datasource": { "type": "prometheus", "uid": "${dataSource}" }, "editorMode": "code", "expr": "histogram_quantile(0.99, sum(rate(rest_client_request_duration_seconds_bucket{job=\"kube-controller-proxy\", instance=~\"$instance\"}[$__rate_interval])) by (verb, le))", "legendFormat": "{{verb}}", "range": true, "refId": "A" } ], "title": "Kube API请求时延", "type": "timeseries" }, { "datasource": { "type": "prometheus", "uid": "${dataSource}" }, "description": "kube-controller-manager进程的内存使用情况", "fieldConfig": { "defaults": { "color": { "mode": "palette-classic" }, "custom": { "axisCenteredZero": false, "axisColorMode": "text", "axisLabel": "", "axisPlacement": "auto", "barAlignment": 0, "drawStyle": "line", "fillOpacity": 9, "gradientMode": "opacity", "hideFrom": { "legend": false, "tooltip": false, "viz": false }, "lineInterpolation": "linear", "lineStyle": { "fill": "solid" }, "lineWidth": 3, "pointSize": 2, "scaleDistribution": { "type": "linear" }, "showPoints": "auto", "spanNulls": false, "stacking": { "group": "A", "mode": "none" }, "thresholdsStyle": { "mode": "off" } }, "mappings": [], "thresholds": { "mode": "absolute", 
"steps": [ { "color": "green", "value": null }, { "color": "red", "value": 80 } ] }, "unit": "bytes" }, "overrides": [] }, "gridPos": { "h": 7, "w": 8, "x": 0, "y": 19 }, "id": 123130, "options": { "legend": { "calcs": [ "lastNotNull" ], "displayMode": "table", "placement": "bottom", "showLegend": true }, "tooltip": { "mode": "single", "sort": "none" } }, "targets": [ { "datasource": { "type": "prometheus", "uid": "${dataSource}" }, "editorMode": "code", "exemplar": false, "expr": "process_resident_memory_bytes{job=\"kube-controller-proxy\",instance=~\"$instance\"}", "instant": false, "legendFormat": "{{instance}}", "range": true, "refId": "A" } ], "title": "内存使用量", "transformations": [ { "id": "renameByRegex", "options": { "regex": "(.*):10257", "renamePattern": "$1" } } ], "type": "timeseries" }, { "datasource": { "type": "prometheus", "uid": "${dataSource}" }, "description": "kube-controller-manager的CPU使用情况", "fieldConfig": { "defaults": { "color": { "mode": "palette-classic" }, "custom": { "axisCenteredZero": false, "axisColorMode": "text", "axisLabel": "", "axisPlacement": "auto", "barAlignment": 0, "drawStyle": "line", "fillOpacity": 9, "gradientMode": "opacity", "hideFrom": { "legend": false, "tooltip": false, "viz": false }, "lineInterpolation": "linear", "lineWidth": 2, "pointSize": 1, "scaleDistribution": { "type": "linear" }, "showPoints": "auto", "spanNulls": false, "stacking": { "group": "A", "mode": "none" }, "thresholdsStyle": { "mode": "off" } }, "mappings": [], "thresholds": { "mode": "absolute", "steps": [ { "color": "green", "value": null }, { "color": "red", "value": 80 } ] }, "unit": "none" }, "overrides": [] }, "gridPos": { "h": 7, "w": 7, "x": 8, "y": 19 }, "id": 123131, "options": { "legend": { "calcs": [ "lastNotNull" ], "displayMode": "table", "placement": "bottom", "showLegend": true }, "tooltip": { "mode": "single", "sort": "none" } }, "targets": [ { "datasource": { "type": "prometheus", "uid": "${dataSource}" }, "editorMode": "code", 
"exemplar": false, "expr": "rate(process_cpu_seconds_total{job=\"kube-controller-proxy\",instance=~\"$instance\"}[$__rate_interval])", "format": "time_series", "instant": false, "legendFormat": "{{instance}}", "range": true, "refId": "A" } ], "title": "CPU使用量", "transformations": [ { "id": "renameByRegex", "options": { "regex": "(.*):10257", "renamePattern": "$1" } } ], "type": "timeseries" }, { "datasource": { "type": "prometheus", "uid": "${dataSource}" }, "description": "当前进程中存在的Go协程数量", "fieldConfig": { "defaults": { "color": { "mode": "palette-classic" }, "custom": { "axisCenteredZero": false, "axisColorMode": "text", "axisLabel": "", "axisPlacement": "auto", "barAlignment": 0, "drawStyle": "line", "fillOpacity": 13, "gradientMode": "opacity", "hideFrom": { "legend": false, "tooltip": false, "viz": false }, "lineInterpolation": "linear", "lineWidth": 2, "pointSize": 1, "scaleDistribution": { "type": "linear" }, "showPoints": "auto", "spanNulls": false, "stacking": { "group": "A", "mode": "none" }, "thresholdsStyle": { "mode": "off" } }, "mappings": [], "thresholds": { "mode": "absolute", "steps": [ { "color": "green", "value": null }, { "color": "red", "value": 80 } ] }, "unit": "none" }, "overrides": [] }, "gridPos": { "h": 7, "w": 8, "x": 15, "y": 19 }, "id": 123132, "options": { "legend": { "calcs": [ "lastNotNull" ], "displayMode": "table", "placement": "bottom", "showLegend": true }, "tooltip": { "mode": "single", "sort": "none" } }, "targets": [ { "datasource": { "type": "prometheus", "uid": "${dataSource}" }, "editorMode": "code", "expr": "go_goroutines{job=\"kube-controller-proxy\",instance=~\"$instance\"}", "legendFormat": "{{instance}}", "range": true, "refId": "A" } ], "title": "Go协程数量", "transformations": [ { "id": "renameByRegex", "options": { "regex": "(.*):10257", "renamePattern": "$1" } } ], "type": "timeseries" }, { "datasource": { "type": "prometheus", "uid": "${dataSource}" }, "description": "队列深度,展示各个controller的处理任务的变化情况。", "fieldConfig": { 
"defaults": { "color": { "mode": "palette-classic" }, "custom": { "axisCenteredZero": true, "axisColorMode": "text", "axisLabel": "", "axisPlacement": "auto", "axisSoftMax": 2, "axisSoftMin": -1, "barAlignment": 0, "drawStyle": "line", "fillOpacity": 3, "gradientMode": "none", "hideFrom": { "legend": false, "tooltip": false, "viz": false }, "lineInterpolation": "linear", "lineWidth": 1, "pointSize": 5, "scaleDistribution": { "type": "linear" }, "showPoints": "auto", "spanNulls": false, "stacking": { "group": "A", "mode": "none" }, "thresholdsStyle": { "mode": "off" } }, "mappings": [ { "options": { "null": { "index": 0, "text": "N/A" } }, "type": "value" } ], "min": -1, "thresholds": { "mode": "absolute", "steps": [ { "color": "green", "value": null }, { "color": "red", "value": 80 } ] } }, "overrides": [] }, "gridPos": { "h": 7, "w": 23, "x": 0, "y": 26 }, "id": 123126, "options": { "legend": { "calcs": [ "lastNotNull", "max" ], "displayMode": "table", "placement": "right", "showLegend": true, "sortBy": "Max", "sortDesc": true }, "timezone": [ "browser" ], "tooltip": { "mode": "single", "sort": "none" } }, "targets": [ { "datasource": { "type": "prometheus", "uid": "${dataSource}" }, "editorMode": "code", "exemplar": false, "expr": "sum(rate(workqueue_depth{ job=\"kube-controller-proxy\", instance=~\"$instance\"}[$__rate_interval])) by (name)", "instant": false, "interval": "", "legendFormat": "{{name}}", "range": true, "refId": "A" } ], "title": "Workqueue深度", "type": "timeseries" }, { "datasource": { "type": "datasource", "uid": "grafana" }, "gridPos": { "h": 3, "w": 24, "x": 0, "y": 33 }, "id": 1, "targets": [ { "datasource": { "type": "datasource", "uid": "grafana" }, "refId": "A" } ], "type": "welcome" }, { "datasource": { "type": "datasource", "uid": "grafana" }, "gridPos": { "h": 9, "w": 24, "x": 0, "y": 36 }, "id": 123123, "targets": [ { "datasource": { "type": "datasource", "uid": "grafana" }, "refId": "A" } ], "type": "gettingstarted" }, { "datasource": 
{ "type": "datasource", "uid": "grafana" }, "gridPos": { "h": 15, "w": 12, "x": 0, "y": 45 }, "id": 3, "links": [], "options": { "folderId": 0, "maxItems": 30, "query": "", "showHeadings": true, "showRecentlyViewed": true, "showSearch": false, "showStarred": true, "tags": [] }, "pluginVersion": "9.5.3", "tags": [], "targets": [ { "datasource": { "type": "datasource", "uid": "grafana" }, "refId": "A" } ], "title": "Dashboards", "type": "dashlist" }, { "datasource": { "type": "datasource", "uid": "grafana" }, "gridPos": { "h": 15, "w": 12, "x": 12, "y": 45 }, "id": 4, "links": [], "options": { "feedUrl": "https://grafana.com/blog/news.xml", "showImage": true }, "targets": [ { "datasource": { "type": "datasource", "uid": "grafana" }, "refId": "A" } ], "title": "Latest from the blog", "type": "news" } ], "refresh": "", "schemaVersion": 38, "style": "dark", "tags": [], "templating": { "list": [ { "current": { "selected": false, "text": "prometheus", "value": "prometheus" }, "description": "选择Grafana需要对接使用的数据源。", "hide": 0, "includeAll": false, "label": "数据源", "multi": true, "name": "dataSource", "options": [], "query": "prometheus", "queryValue": "", "refresh": 1, "regex": "", "skipUrlSync": false, "type": "datasource" }, { "current": { "selected": true, "text": [ "All" ], "value": [ "$__all" ] }, "datasource": { "type": "prometheus", "uid": "${dataSource}" }, "definition": "query_result(up{job=\"kube-controller-proxy\"})", "description": "kube-controller-manager所在master节点", "hide": 0, "includeAll": true, "label": "节点实例", "multi": true, "name": "instance", "options": [], "query": { "query": "query_result(up{job=\"kube-controller-proxy\"})", "refId": "PrometheusVariableQueryEditor-VariableQuery" }, "refresh": 2, "regex": ".*instance=\"(.*?)\".*", "skipUrlSync": false, "sort": 1, "type": "query" } ] }, "time": { "from": "now-10m", "to": "now" }, "timepicker": { "hidden": false, "refresh_intervals": [ "5s", "10s", "30s", "1m", "5m", "15m", "30m", "1h", "2h", "1d" ], 
"time_options": [ "5m", "15m", "1h", "6h", "12h", "24h", "2d", "7d", "30d" ], "type": "timepicker" }, "timezone": "browser", "title": "kube-controller-manager", "uid": "acbefb2d-c82a-47e7-9154-cf1a240f2967", "version": 1, "weekStart": "" }
- Edit the Grafana workload YAML and add the volumes and volumeMounts entries.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  namespace: monitoring
spec:
  xxx: xxx
  template:
    spec:
      containers:
      - env: []
        image: grafana/grafana:9.5.3
        volumeMounts:
        - xxx: xxx
        ## Add the container mount point
        - mountPath: /grafana-dashboard-definitions/0/controllermanager
          name: grafana-dashboard-controllermanager
          readOnly: false
      volumes:
      - xxx: xxx
      - configMap:  # Mount the dashboard definition as a volume
          name: grafana-dashboard-controllermanager
        name: grafana-dashboard-controllermanager
- Log in to Grafana and verify.
With this configuration the dashboard is stored persistently. It will not disappear, whether or not the Pod is restarted.
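The ConfigMap from the first step of 5.2 does not have to be written by hand. As a rough sketch, the snippet below wraps an exported dashboard (already loaded as a Python dict) into a ConfigMap manifest using the same name, namespace, and key as in this article. It emits JSON, which `kubectl apply -f` also accepts; the tiny dashboard dict is a placeholder and the function names are hypothetical.

```python
import json


def dashboard_configmap(dashboard,
                        name="grafana-dashboard-controllermanager",
                        namespace="monitoring",
                        key="grafana-controllermanager.json"):
    """Wrap a Grafana dashboard dict in a Kubernetes ConfigMap manifest."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "namespace": namespace},
        # ConfigMap values must be strings, so the dashboard is serialized here.
        "data": {key: json.dumps(dashboard, ensure_ascii=False, indent=2)},
    }


def render_manifest(dashboard):
    """Return the ConfigMap manifest as a JSON document."""
    return json.dumps(dashboard_configmap(dashboard), ensure_ascii=False, indent=2)


# Placeholder standing in for the JSON exported from Grafana.
EXAMPLE_DASHBOARD = {"title": "kube-controller-manager", "panels": []}

print(render_manifest(EXAMPLE_DASHBOARD))
```

Generating the manifest this way avoids indentation mistakes when embedding a large dashboard JSON inside YAML.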