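Blackbox Exporter project: blackbox_exporter
Blackbox Exporter (black-box monitoring)
blackbox_exporter allows black-box probing of endpoints over HTTP, HTTPS, DNS, TCP, and ICMP. Recent versions of the Prometheus Stack install Blackbox Exporter by default. The following command shows its Pod: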
# kubectl get po -n monitoring -l app.kubernetes.io/name=blackbox-exporter
A Service is created as well. Blackbox Exporter can be reached through this Service, with the probe parameters passed in the request:
# kubectl get svc -n monitoring -l app.kubernetes.io/name=blackbox-exporter
- Port 19115 serves HTTP; port 9115 serves HTTPS
Example: the following command checks the status of the domain qdmwms.china-snow.net:
# curl -s "http://10.108.228.200:19115/probe?target=qdmwms.china-snow.net&module=http_2xx" | tail -5
- probe: the Blackbox Exporter endpoint that runs a probe and returns its metrics, similar to /metrics;
- target: the target to probe;
- module: the module used to run the probe.
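The three parts of the probe request above can be assembled by a small helper. The following is a minimal sketch, not part of Blackbox Exporter itself: the function names `probe_url` and `probe_success` are hypothetical, and the Service address is the example one used above. `probe_success` is the metric Blackbox Exporter reports as 1 when a probe passes and 0 when it fails.

```shell
#!/bin/sh
# probe_url BASE TARGET [MODULE]
# Print the /probe URL for a target; MODULE defaults to http_2xx.
# The target is inserted verbatim, so a target containing '&' or '?'
# would need URL-encoding first (not handled in this sketch).
probe_url() {
  base=$1
  target=$2
  module=${3:-http_2xx}
  printf '%s/probe?target=%s&module=%s\n' "$base" "$target" "$module"
}

# probe_success BASE TARGET [MODULE]
# Run the probe and print only the value of the probe_success metric.
# Requires network access to the exporter.
probe_success() {
  curl -s "$(probe_url "$@")" | awk '$1 == "probe_success" { print $2 }'
}
```

For example, `probe_url http://10.108.228.200:19115 qdmwms.china-snow.net` prints the same URL that the curl command above requests.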
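Blackbox Exporter configuration file
The Blackbox Exporter configuration file is documented in CONFIGURATION.md
Prometheus static configuration
Prometheus supports defining scrape targets through static configuration; the keyword is static_configs. The following example uses static configuration in Kubernetes to define scrape targets for Blackbox Exporter.
Create an empty file, then create a Secret from it. This Secret then serves as the additional static scrape configuration for Prometheus.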
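For orientation, a module definition in that configuration file looks roughly like the following. This is a minimal sketch based on fields documented in CONFIGURATION.md; the temporary file path and the `timeout` value are illustrative assumptions, and the configuration actually deployed in the cluster may differ.

```shell
#!/bin/sh
# Write an illustrative Blackbox Exporter configuration to a temporary file.
# The path is hypothetical; a real deployment mounts its own file.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
modules:
  http_2xx:            # module name, selected with ?module=http_2xx
    prober: http       # one of http, tcp, dns, icmp, grpc
    timeout: 5s        # assumed value, tune per target
    http:
      method: GET
      valid_status_codes: []   # empty list means any 2xx status
EOF
echo "wrote $cfg"
```

The module name is what the `module` request parameter refers to, so a probe with `module=http_2xx` uses the settings under that key.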
# touch prometheus-additional.yaml    # create an empty file
# kubectl create secret generic additional-configs --from-file=prometheus-additional.yaml -n monitoring    # create the Secret
secret/additional-configs created
Edit the Prometheus configuration:
# kubectl edit prometheus -n monitoring k8s
Add the following at the same level as the image field, then save and exit. The change takes effect without restarting the Prometheus Pod.
...
  image: quay.io/prometheus/prometheus:v2.29.1
  additionalScrapeConfigs:
    key: prometheus-additional.yaml
    name: additional-configs
    optional: true
...
Add the static configuration to prometheus-additional.yaml. The black-box monitoring configuration is used here as the example:
- job_name: 'blackbox'
  metrics_path: /probe
  params:
    module: [http_2xx]  # Look for a HTTP 200 response.
  static_configs:
    - targets:
      - http://qdmwms.china-snow.net    # Target to probe with http.
      - https://sal.china-snow.net   # Target to probe with https.
      - http://nexus.china-snow.net # Target to probe with http on port 8080.
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: blackbox-exporter:19115  # The blackbox exporter's real hostname:port.
Use the following command to hot-update the Secret:
# kubectl create secret generic additional-configs --from-file=prometheus-additional.yaml --dry-run=client -oyaml | kubectl replace -f - -n monitoring
secret/additional-configs replaced
The following commands inspect the Secret that holds the static configuration:
# kubectl get secrets -n monitoring additional-configs -oyaml    # view the Secret
apiVersion: v1
data:
  prometheus-additional.yaml: LSBqb2JfbmFtZTogJ2JsYWNrYm94JwogIG1ldHJpY3NfcGF0aDogL3Byb2JlCiAgcGFyYW1zOgogICAgbW9kdWxlOiBbaHR0cF8yeHhdICAjIExvb2sgZm9yIGEgSFRUUCAyMDAgcmVzcG9uc2UuCiAgc3RhdGljX2NvbmZpZ3M6CiAgICAtIHRhcmdldHM6CiAgICAgIC0gaHR0cDovL3FkbXdtcy5jaGluYS1zbm93Lm5ldCAgICAjIFRhcmdldCB0byBwcm9iZSB3aXRoIGh0dHAuCiAgICAgIC0gaHR0cHM6Ly9zYWwuY2hpbmEtc25vdy5uZXQgICAjIFRhcmdldCB0byBwcm9iZSB3aXRoIGh0dHBzLgogICAgICAtIGh0dHA6Ly9uZXh1cy5jaGluYS1zbm93Lm5ldCAjIFRhcmdldCB0byBwcm9iZSB3aXRoIGh0dHAgb24gcG9ydCA4MDgwLgogIHJlbGFiZWxfY29uZmlnczoKICAgIC0gc291cmNlX2xhYmVsczogW19fYWRkcmVzc19fXQogICAgICB0YXJnZXRfbGFiZWw6IF9fcGFyYW1fdGFyZ2V0CiAgICAtIHNvdXJjZV9sYWJlbHM6IFtfX3BhcmFtX3RhcmdldF0KICAgICAgdGFyZ2V0X2xhYmVsOiBpbnN0YW5jZQogICAgLSB0YXJnZXRfbGFiZWw6IF9fYWRkcmVzc19fCiAgICAgIHJlcGxhY2VtZW50OiBibGFja2JveC1leHBvcnRlcjoxOTExNSAgIyBUaGUgYmxhY2tib3ggZXhwb3J0ZXIncyByZWFsIGhvc3RuYW1lOnBvcnQuCg==
kind: Secret
metadata:
  creationTimestamp: "2021-10-11T02:45:48Z"
  name: additional-configs
  namespace: monitoring
  resourceVersion: "3710432"
  uid: 5296ca4b-f66f-4bc4-a148-2724190c9cc2
type: Opaque
# Decode the Secret value with base64 to view the static configuration
# echo """LSBqb2JfbmFtZTogJ2JsYWNrYm94JwogIG1ldHJpY3NfcGF0aDogL3Byb2JlCiAgcGFyYW1zOgogICAgbW9kdWxlOiBbaHR0cF8yeHhdICAjIExvb2sgZm9yIGEgSFRUUCAyMDAgcmVzcG9uc2UuCiAgc3RhdGljX2NvbmZpZ3M6CiAgICAtIHRhcmdldHM6CiAgICAgIC0gaHR0cDovL3FkbXdtcy5jaGluYS1zbm93Lm5ldCAgICAjIFRhcmdldCB0byBwcm9iZSB3aXRoIGh0dHAuCiAgICAgIC0gaHR0cHM6Ly9zYWwuY2hpbmEtc25vdy5uZXQgICAjIFRhcmdldCB0byBwcm9iZSB3aXRoIGh0dHBzLgogICAgICAtIGh0dHA6Ly9uZXh1cy5jaGluYS1zbm93Lm5ldCAjIFRhcmdldCB0byBwcm9iZSB3aXRoIGh0dHAgb24gcG9ydCA4MDgwLgogIHJlbGFiZWxfY29uZmlnczoKICAgIC0gc291cmNlX2xhYmVsczogW19fYWRkcmVzc19fXQogICAgICB0YXJnZXRfbGFiZWw6IF9fcGFyYW1fdGFyZ2V0CiAgICAtIHNvdXJjZV9sYWJlbHM6IFtfX3BhcmFtX3RhcmdldF0KICAgICAgdGFyZ2V0X2xhYmVsOiBpbnN0YW5jZQogICAgLSB0YXJnZXRfbGFiZWw6IF9fYWRkcmVzc19fCiAgICAgIHJlcGxhY2VtZW50OiBibGFja2JveC1leHBvcnRlcjoxOTExNSAgIyBUaGUgYmxhY2tib3ggZXhwb3J0ZXIncyByZWFsIGhvc3RuYW1lOnBvcnQuCg==""" | base64 -d
- job_name: 'blackbox'
  metrics_path: /probe
  params:
    module: [http_2xx]  # Look for a HTTP 200 response.
  static_configs:
    - targets:
      - http://qdmwms.china-snow.net    # Target to probe with http.
      - https://sal.china-snow.net   # Target to probe with https.
      - http://nexus.china-snow.net # Target to probe with http on port 8080.
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: blackbox-exporter:19115  # The blackbox exporter's real hostname:port.
After the update, wait about a minute and the configuration appears in the Prometheus Web UI.
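To make the effect of the relabel_configs in the decoded configuration concrete, the sketch below mimics the three rules in plain shell for a single target. It only illustrates the rewriting and is not how Prometheus implements relabeling; the function name `relabel_target` is hypothetical.

```shell
#!/bin/sh
# relabel_target TARGET
# Mimic the three relabel rules of the 'blackbox' job for one static target.
relabel_target() {
  address=$1                        # __address__ starts as the listed target
  param_target=$address             # rule 1: copy __address__ to __param_target
  instance=$param_target            # rule 2: copy __param_target to instance
  address=blackbox-exporter:19115   # rule 3: point __address__ at the exporter
  printf 'instance=%s\n' "$instance"
  # Prometheus URL-encodes the parameter on the wire; it is shown
  # unencoded here for readability.
  printf 'scrape=http://%s/probe?module=http_2xx&target=%s\n' "$address" "$param_target"
}

relabel_target https://sal.china-snow.net
```

The result is that every scrape goes to the exporter, while the probed URL travels as the `target` parameter and remains visible as the `instance` label.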
Once the targets show as UP, import the black-box monitoring dashboard template (https://grafana.com/grafana/dashboards/13659).
Create alerting rules to monitor SSL certificates
# The heredoc delimiter is unquoted, so $ and \n are escaped below to reach kubectl intact.
cat << EOF | kubectl create -f -
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 2.26.0
    prometheus: k8s
    role: alert-rules
  name: prometheus-k8s-other-rules
  namespace: monitoring
spec:
  groups:
  - name: SSL certificate expiring soon
    rules:
    - alert: SslCertificateWillExpireSoon
      expr: probe_ssl_earliest_cert_expiry - time() < 86400 * 30
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "SSL certificate will expire soon (instance {{ \$labels.instance }})"
        description: "SSL certificate expires in {{ \$value | humanizeDuration }}\\n VALUE = {{ \$value }}\\n LABELS: {{ \$labels }}"
  - name: SSL certificate expired
    rules:
    - alert: SslCertificateExpired
      expr: probe_ssl_earliest_cert_expiry - time() <= 0
      for: 5m
      labels:
        severity: error
      annotations:
        summary: "SSL certificate expired (instance {{ \$labels.instance }})"
        description: "SSL certificate has expired\\n VALUE = {{ \$value }}\\n LABELS: {{ \$labels }}"
EOF
Log in to the Prometheus web UI and check the Alerts page to confirm that the alerting rules have been loaded.
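The two alert expressions above compare the seconds remaining until expiry against fixed thresholds. The sketch below reproduces that arithmetic in shell for a given expiry timestamp, which can help when sanity-checking a value seen in Prometheus. The function name `cert_state` is hypothetical; the thresholds are those of the rules above.

```shell
#!/bin/sh
# cert_state EXPIRY_EPOCH [NOW_EPOCH]
# EXPIRY_EPOCH plays the role of probe_ssl_earliest_cert_expiry and
# NOW_EPOCH the role of time() (defaults to the current time).
# An expired certificate satisfies both alert expressions; this sketch
# reports only the more severe state.
cert_state() {
  expiry=$1
  now=${2:-$(date +%s)}
  remaining=$((expiry - now))                        # seconds until expiry
  if [ "$remaining" -le 0 ]; then
    echo "expired"                                   # SslCertificateExpired
  elif [ "$remaining" -lt $((86400 * 30)) ]; then
    echo "expiring in $((remaining / 86400)) days"   # SslCertificateWillExpireSoon
  else
    echo "ok"
  fi
}
```

Calling `cert_state` with the raw metric value as the only argument classifies it against the current time.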