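Zabbix auto-discovery: real-time monitoring of Docker containers and the services inside them (a production example)
This article goes from initial setup to monitoring Docker container status, in three parts.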
1. Check your environment and service paths
(1) JDK: jdk1.8
(2) Zabbix version: zabbix3.4.5
(3) Zabbix script directory: /data/zabbix/scripts/
(4) .conf file directory: /data/zabbix/etc/zabbix_agentd.conf.d/
2. Configure the script, keys, and template
First, configure zabbix_agentd: vim /data/zabbix/etc/zabbix_agentd.conf.d/docker.conf
UserParameter=docker.discovery,/data/zabbix/scripts/docker.py
UserParameter=docker.[*],/data/zabbix/scripts/docker.py $1 $2
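To make the second UserParameter line concrete, here is a minimal sketch of how an item key such as docker.[web,ping] ends up as a command line. It is a simplified model, not Zabbix's actual parser: the function names parse_key and expand_user_parameter are hypothetical, quoted parameters are ignored, and missing parameters are assumed to expand to an empty string.

```python
import re


def parse_key(key):
    """Split 'docker.[web,ping]' into ('docker.', ['web', 'ping'])."""
    name, _, rest = key.partition("[")
    params = [p.strip() for p in rest.rstrip("]").split(",")] if rest else []
    return name, params


def expand_user_parameter(command, params):
    """Replace $1..$9 in a UserParameter command with the item key parameters."""
    def substitute(match):
        index = int(match.group(1)) - 1
        # Assumption for this sketch: a missing parameter becomes an empty string.
        return params[index] if index < len(params) else ""
    return re.sub(r"\$([1-9])", substitute, command)


name, params = parse_key("docker.[web,ping]")
print(expand_user_parameter("/data/zabbix/scripts/docker.py $1 $2", params))
```

With the key docker.[web,ping], the agent therefore runs the script with the container name as the first argument and the action as the second.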
Below is the docker.py script. It uses a low-level discovery rule to find the containers, then fetches status information for the specified container:
#!/usr/bin/python
import sys
import os
import json


def discover():
    # Build the discovery payload: one {#CONTAINERNAME} entry per container.
    d = {}
    d['data'] = []
    with os.popen("docker ps -a --format {{.Names}}") as pipe:
        for line in pipe:
            info = {}
            info['{#CONTAINERNAME}'] = line.replace("\n", "")
            d['data'].append(info)
    print(json.dumps(d))


def status(name, action):
    if action == "ping":
        # Print 1 if the container is running, 0 otherwise.
        cmd = 'docker inspect --format="{{.State.Running}}" %s' % name
        result = os.popen(cmd).read().replace("\n", "")
        if result == "true":
            print(1)
        else:
            print(0)
    else:
        # Any other action is used as a `docker stats` format field, e.g. CPUPerc.
        cmd = 'docker stats %s --no-stream --format "{{.%s}}"' % (name, action)
        result = os.popen(cmd).read().replace("\n", "")
        if "%" in result:
            print(float(result.replace("%", "")))
        else:
            print(result)


if __name__ == '__main__':
    # With two arguments, report one container's status; otherwise run discovery.
    try:
        name, action = sys.argv[1], sys.argv[2]
        status(name, action)
    except IndexError:
        discover()
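The script above builds shell command strings from the container name. As an alternative sketch (not the original author's code), the same logic can be written for Python 3 with subprocess and argument lists, so the name is never interpreted by a shell. The helper names here (inspect_cmd, stats_cmd, to_ping, to_value, run) are hypothetical, and returning an empty string when the docker command fails is an assumption of this sketch.

```python
import subprocess


def inspect_cmd(name):
    """Argument list for checking whether a container is running."""
    return ["docker", "inspect", "--format", "{{.State.Running}}", name]


def stats_cmd(name, field):
    """Argument list for reading one `docker stats` field, e.g. CPUPerc."""
    return ["docker", "stats", name, "--no-stream", "--format", "{{.%s}}" % field]


def to_ping(raw):
    """Map the `docker inspect` output to 1 (running) or 0 (anything else)."""
    return 1 if raw.strip() == "true" else 0


def to_value(raw):
    """Turn '0.15%' into 0.15; return any other text unchanged (stripped)."""
    text = raw.strip()
    if text.endswith("%"):
        return float(text[:-1])
    return text


def run(argv):
    """Run a command without a shell; return stdout, or '' if it fails."""
    try:
        return subprocess.check_output(
            argv, universal_newlines=True, stderr=subprocess.DEVNULL
        )
    except (subprocess.CalledProcessError, OSError):
        return ""


def status(name, action):
    """Return the value Zabbix should receive for one container and action."""
    if action == "ping":
        return to_ping(run(inspect_cmd(name)))
    return to_value(run(stats_cmd(name, action)))
```

Keeping the command builders and the output parsers as separate pure functions also makes them easy to check without a running Docker daemon.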
A word on the pitfalls of discovery rules, which took me a long time to track down. First, the script must return JSON. Second, the key in info['{#CONTAINERNAME}'] must be written in exactly this macro form: {#CONTAINERNAME}
The output looks like this, and it must have exactly this nesting:
{"data": [{"{#CONTAINERNAME}": "node-3"}, {"{#CONTAINERNAME}": "node-2"}, {"{#CONTAINERNAME}": "node-1"}, {"{#CONTAINERNAME}": "web"}, {"{#CONTAINERNAME}": "cadvisor"}, {"{#CONTAINERNAME}": "updatol"}, {"{#CONTAINERNAME}": "research"}, {"{#CONTAINERNAME}": "services"}, {"{#CONTAINERNAME}": "data"}, {"{#CONTAINERNAME}": "rabbitmq"}, {"{#CONTAINERNAME}": "redis"}, {"{#CONTAINERNAME}": "mysql"}, {"{#CONTAINERNAME}": "ssdb"}]}
The other function is simple: it just calls docker commands to fetch the data.
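Because both pitfalls concern the shape of the discovery output, a small checker can catch them before the agent is involved. The sketch below is hypothetical (check_discovery is not part of Zabbix) and assumes macro names use only uppercase letters, digits, underscores, and dots inside {#...}.

```python
import json
import re

# Assumed macro name form: {#NAME} with A-Z, 0-9, "_" and "." only.
MACRO = re.compile(r"^\{#[A-Z0-9_.]+\}$")


def check_discovery(text):
    """Return a list of problems found in low-level discovery JSON text."""
    try:
        doc = json.loads(text)
    except ValueError as exc:
        return ["not valid JSON: %s" % exc]
    rows = doc.get("data") if isinstance(doc, dict) else None
    if not isinstance(rows, list):
        return ['top level must be an object with a "data" list']
    problems = []
    for i, row in enumerate(rows):
        if not isinstance(row, dict):
            problems.append("row %d is not an object" % i)
            continue
        for key in row:
            if not MACRO.match(key):
                problems.append("row %d: bad macro name %r" % (i, key))
    return problems


sample = '{"data": [{"{#CONTAINERNAME}": "web"}, {"{#CONTAINERNAME}": "redis"}]}'
print(check_discovery(sample))
```

An empty list means the structure matches the nesting shown above. Piping the output of docker.py (run with no arguments) into such a check is one way to test the script on its own.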
Then import the template.
The template is as follows:
<?xml version="1.0" encoding="UTF-8"?> <zabbix_export> <version>3.2</version> <date>2018-06-04T04:12:36Z</date> <groups> <group> <name>Templates</name> </group> </groups> <templates> <template> <template>docker-status</template> <name>docker-status</name> <description/> <groups> <group> <name>Templates</name> </group> </groups> <applications> <application> <name>docker_test</name> </application> </applications> <items/> <discovery_rules> <discovery_rule> <name>docker.discovery</name> <type>0</type> <snmp_community/> <snmp_oid/> <key>docker.discovery</key> <delay>60</delay> <status>0</status> <allowed_hosts/> <snmpv3_contextname/> <snmpv3_securityname/> <snmpv3_securitylevel>0</snmpv3_securitylevel> <snmpv3_authprotocol>0</snmpv3_authprotocol> <snmpv3_authpassphrase/> <snmpv3_privprotocol>0</snmpv3_privprotocol> <snmpv3_privpassphrase/> <delay_flex/> <params/> <ipmi_sensor/> <authtype>0</authtype> <username/> <password/> <publickey/> <privatekey/> <port/> <filter> <evaltype>0</evaltype> <formula/> <conditions> <condition> <macro>{#CONTAINERNAME}</macro> <value>@ CONTAINER NAME</value> <operator>8</operator> <formulaid>A</formulaid> </condition> </conditions> </filter> <lifetime>30</lifetime> <description/> <item_prototypes> <item_prototype> <name>Container {#CONTAINERNAME} Diskio usage:</name> <type>0</type> <snmp_community/> <multiplier>0</multiplier> <snmp_oid/> <key>docker.[{#CONTAINERNAME} ,BlockIO]</key> <delay>60</delay> <history>90</history> <trends>0</trends> <status>0</status> <value_type>1</value_type> <allowed_hosts/> <units/> <delta>0</delta> <snmpv3_contextname/> <snmpv3_securityname/> <snmpv3_securitylevel>0</snmpv3_securitylevel> <snmpv3_authprotocol>0</snmpv3_authprotocol> <snmpv3_authpassphrase/> <snmpv3_privprotocol>0</snmpv3_privprotocol> <snmpv3_privpassphrase/> <formula>1</formula> <delay_flex/> <params/> <ipmi_sensor/> <data_type>0</data_type> <authtype>0</authtype> <username/> <password/> <publickey/> <privatekey/> <port/> <description/> 
<inventory_link>0</inventory_link> <applications> <application> <name>docker_test</name> </application> </applications> <valuemap/> <logtimefmt/> <application_prototypes/> </item_prototype> <item_prototype> <name>Container{#CONTAINERNAME} CPU usage:</name> <type>0</type> <snmp_community/> <multiplier>0</multiplier> <snmp_oid/> <key>docker.[{#CONTAINERNAME},CPUPerc]</key> <delay>60</delay> <history>90</history> <trends>365</trends> <status>0</status> <value_type>0</value_type> <allowed_hosts/> <units>%</units> <delta>0</delta> <snmpv3_contextname/> <snmpv3_securityname/> <snmpv3_securitylevel>0</snmpv3_securitylevel> <snmpv3_authprotocol>0</snmpv3_authprotocol> <snmpv3_authpassphrase/> <snmpv3_privprotocol>0</snmpv3_privprotocol> <snmpv3_privpassphrase/> <formula>1</formula> <delay_flex/> <params/> <ipmi_sensor/> <data_type>0</data_type> <authtype>0</authtype> <username/> <password/> <publickey/> <privatekey/> <port/> <description/> <inventory_link>0</inventory_link> <applications> <application> <name>docker_test</name> </application> </applications> <valuemap/> <logtimefmt/> <application_prototypes/> </item_prototype> <item_prototype> <name>Container {#CONTAINERNAME} mem usage:</name> <type>0</type> <snmp_community/> <multiplier>0</multiplier> <snmp_oid/> <key>docker.[{#CONTAINERNAME},MemPerc]</key> <delay>60</delay> <history>90</history> <trends>365</trends> <status>0</status> <value_type>0</value_type> <allowed_hosts/> <units>%</units> <delta>0</delta> <snmpv3_contextname/> <snmpv3_securityname/> <snmpv3_securitylevel>0</snmpv3_securitylevel> <snmpv3_authprotocol>0</snmpv3_authprotocol> <snmpv3_authpassphrase/> <snmpv3_privprotocol>0</snmpv3_privprotocol> <snmpv3_privpassphrase/> <formula>1</formula> <delay_flex/> <params/> <ipmi_sensor/> <data_type>0</data_type> <authtype>0</authtype> <username/> <password/> <publickey/> <privatekey/> <port/> <description/> <inventory_link>0</inventory_link> <applications> <application> <name>docker_test</name> </application> 
</applications> <valuemap/> <logtimefmt/> <application_prototypes/> </item_prototype> <item_prototype> <name>Container {#CONTAINERNAME} NETio usage:</name> <type>0</type> <snmp_community/> <multiplier>0</multiplier> <snmp_oid/> <key>docker.[{#CONTAINERNAME},NetIO]</key> <delay>60</delay> <history>90</history> <trends>0</trends> <status>0</status> <value_type>1</value_type> <allowed_hosts/> <units/> <delta>0</delta> <snmpv3_contextname/> <snmpv3_securityname/> <snmpv3_securitylevel>0</snmpv3_securitylevel> <snmpv3_authprotocol>0</snmpv3_authprotocol> <snmpv3_authpassphrase/> <snmpv3_privprotocol>0</snmpv3_privprotocol> <snmpv3_privpassphrase/> <formula>1</formula> <delay_flex/> <params/> <ipmi_sensor/> <data_type>0</data_type> <authtype>0</authtype> <username/> <password/> <publickey/> <privatekey/> <port/> <description/> <inventory_link>0</inventory_link> <applications> <application> <name>docker_test</name> </application> </applications> <valuemap/> <logtimefmt/> <application_prototypes/> </item_prototype> <item_prototype> <name>Container{#CONTAINERNAME} is_run :</name> <type>0</type> <snmp_community/> <multiplier>0</multiplier> <snmp_oid/> <key>docker.[{#CONTAINERNAME} ,ping]</key> <delay>30</delay> <history>90</history> <trends>365</trends> <status>0</status> <value_type>3</value_type> <allowed_hosts/> <units/> <delta>0</delta> <snmpv3_contextname/> <snmpv3_securityname/> <snmpv3_securitylevel>0</snmpv3_securitylevel> <snmpv3_authprotocol>0</snmpv3_authprotocol> <snmpv3_authpassphrase/> <snmpv3_privprotocol>0</snmpv3_privprotocol> <snmpv3_privpassphrase/> <formula>1</formula> <delay_flex/> <params/> <ipmi_sensor/> <data_type>0</data_type> <authtype>0</authtype> <username/> <password/> <publickey/> <privatekey/> <port/> <description/> <inventory_link>0</inventory_link> <applications> <application> <name>docker_test</name> </application> </applications> <valuemap/> <logtimefmt/> <application_prototypes/> </item_prototype> </item_prototypes> <trigger_prototypes> 
<trigger_prototype> <expression>{docker-status:docker.[{#CONTAINERNAME} ,ping].last()}=0</expression> <recovery_mode>0</recovery_mode> <recoveryexpression/> <name>docker{#CONTAINERNAME}_down</name> <correlation_mode>0</correlation_mode> <correlation_tag/> <url/> <status>0</status> <priority>5</priority> <description/> <type>0</type> <manual_close>0</manual_close> <dependencies/> <tags/> </trigger_prototype> </trigger_prototypes> <graph_prototypes/> <host_prototypes/> </discovery_rule> </discovery_rules> <httptests/> <macros/> <templates/> <screens/> </template> </templates> </zabbix_export>
Template download link: https://pan.baidu.com/share/init?surl=18Z9QIkSuLQ3sSPqSlbY2A Password: 3544
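3. Web UI steps
After importing the template:
Some people may run into this problem after importing the template:
Global regular expression " CONTAINER NAME" does not exist.
If you see this error, it is not a big problem.
The template's discovery rule has a filter condition that references this global regular expression. Go to the filter and remove that condition.
If you want to understand what it means, see the regular expressions section of the official Zabbix documentation. Roughly:
Go to Administration → General and open Regular expressions.
Learn how they work first, then add the one you need.
Next, let's look at what the template monitors.
There is one trigger.
Once everything is in order, check Latest data.
We can see that the status of each container has been added.
That's it. You can now report back to your boss.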
This article is reposted from the 拎壶冲冲冲 blog on 51CTO. To republish it, please contact the original author.