Learning k8s (29): Deploying Prometheus with Helm

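Prometheus Architecture

Prometheus is an excellent monitoring tool, or more precisely, a complete monitoring solution: it covers data collection, storage, processing, visualization, and alerting.

  • Prometheus Server: scrapes and stores the data, and provides a flexible query language (PromQL) for users.
  • Exporter: collects performance data from the monitored targets (hosts, containers) and exposes it over an HTTP endpoint for Prometheus Server to fetch.
  • Grafana: the visualization component. It integrates seamlessly with Prometheus and presents the data well.
  • Alertmanager: users define alerting rules based on the monitoring data, and those rules trigger alerts. Once Alertmanager receives an alert, it sends notifications through predefined channels such as Email, PagerDuty, and Webhook.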
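To make the pull model above more concrete, here is a small sketch of what Prometheus Server sees when it scrapes an exporter. The sample text imitates the plain-text metrics format; the values and the helper names `sample_metrics` and `metric_values` are invented for illustration. Against a live node-exporter, the same kind of input would come from something like `curl http://<node>:9100/metrics` (9100 being node-exporter's default port).

```shell
# Hypothetical sample of an exporter's /metrics response (values invented).
sample_metrics() {
  cat <<'EOF'
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
# HELP node_memory_MemFree_bytes Memory information field MemFree_bytes.
# TYPE node_memory_MemFree_bytes gauge
node_memory_MemFree_bytes 1.048576e+09
EOF
}

# Print "name value" for every sample whose metric name starts with the prefix in $1.
# Lines beginning with '#' are HELP/TYPE comments and are skipped.
metric_values() {
  awk -v prefix="$1" '!/^#/ && index($1, prefix) == 1 { print $1, $2 }'
}

sample_metrics | metric_values node_load
```

The filter is deliberately naive: it assumes samples without spaces inside label values, which is enough to show the shape of the data.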

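Prometheus Operator Architecture

The goal of Prometheus Operator is to make deploying and maintaining Prometheus on k8s as simple as possible.

  • Operator: this is Prometheus Operator itself, which runs in k8s as a Deployment. It deploys and manages Prometheus Server, and dynamically updates Prometheus Server's monitoring targets based on ServiceMonitor objects.
  • Prometheus Server: Prometheus Server is deployed into the cluster as a k8s application. To manage Prometheus better inside k8s, the CoreOS developers defined a k8s custom resource type named Prometheus. You can think of a Prometheus resource as a special kind of Deployment whose sole purpose is deploying Prometheus Server.
  • Service: this is the ordinary Service resource in the cluster, and it is what Prometheus monitors; Prometheus calls it a Target. Every monitored object has a corresponding Service. For example, to monitor the Kubernetes scheduler, there is a Service corresponding to the scheduler. It is created by Prometheus Operator.
  • ServiceMonitor: the Operator can dynamically update Prometheus's list of Targets, and a ServiceMonitor is the abstraction of a Target. For example, to monitor the K8s scheduler, a user can create a ServiceMonitor object that maps to the scheduler's Service. The Operator then discovers the new ServiceMonitor and adds the scheduler's Target to Prometheus's monitoring list. (ServiceMonitor is also a Kubernetes custom resource type developed specifically for Prometheus Operator.)
  • Alertmanager: the third Kubernetes custom resource type developed for the Operator. Its purpose is deploying the Alertmanager component.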
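As a rough illustration of the Service-to-ServiceMonitor mapping described above, the sketch below writes a minimal ServiceMonitor manifest to a file. Everything named `example-app`, the `k8s-app` label, the `default` namespace, and the port name `web` are placeholders rather than anything taken from kube-prometheus; only the `monitoring.coreos.com/v1` API group and the field names come from the Operator's custom resource definition.

```shell
# Write a hypothetical ServiceMonitor manifest; all names and labels are placeholders.
cat > example-servicemonitor.yaml <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: monitoring
  labels:
    k8s-app: example-app
spec:
  selector:
    matchLabels:
      k8s-app: example-app   # must match the labels on the target Service
  namespaceSelector:
    matchNames:
    - default                # namespace where the target Service lives
  endpoints:
  - port: web                # name of the Service port that serves /metrics
    interval: 30s
EOF

# Once the Operator's custom resource types exist in the cluster, it could be applied with:
#   kubectl apply -f example-servicemonitor.yaml
```

Whether Prometheus actually picks the object up also depends on the ServiceMonitor selector configured on the Prometheus resource, so treat the labels here as something to adapt.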

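Getting resource metrics

Heapster was removed in v1.12. To get resource-usage commands such as kubectl top working again, you can install metrics-server, or install Prometheus with Helm, which also provides this service.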

Related links

Prometheus GitHub repository: https://github.com/coreos/kube-prometheus

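Component overview

  • 1 MetricServer: an aggregator of resource usage data for the Kubernetes cluster. It collects data for in-cluster consumers such as kubectl, HPA, and the scheduler.
  • 2 PrometheusOperator: a system monitoring and alerting toolkit, used to store monitoring data.
  • 3 NodeExporter: exposes the key metrics and status data of each node.
  • 4 KubeStateMetrics: collects data about the resource objects in the Kubernetes cluster, against which alerting rules can be defined.
  • 5 Prometheus: uses a pull model to collect data from the apiserver, scheduler, controller-manager, and kubelet, transferred over HTTP.
  • 6 Grafana: a platform for visualizing statistics and monitoring data.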

Build log

Download Prometheus from GitHub

mkdir prometheus
cd prometheus
git clone https://github.com/coreos/kube-prometheus.git
cd kube-prometheus/manifests/

Edit grafana-service.yaml so that Grafana is exposed through a NodePort:

vim grafana-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: monitoring
spec:
  type: NodePort      # added
  ports:
  - name: http
    port: 3000
    targetPort: http
    nodePort: 30100   # added
  selector:
    app: grafana

Edit prometheus-service.yaml and change it to NodePort:

vim prometheus-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    prometheus: k8s
  name: prometheus-k8s
  namespace: monitoring
spec:
  type: NodePort
  ports:
  - name: web
    port: 9090
    targetPort: web
    nodePort: 30200
  selector:
    app: prometheus
    prometheus: k8s

Edit alertmanager-service.yaml and change it to NodePort:

vim alertmanager-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    alertmanager: main
  name: alertmanager-main
  namespace: monitoring
spec:
  type: NodePort
  ports:
  - name: web
    port: 9093
    targetPort: web
    nodePort: 30300
  selector:
    alertmanager: main
    app: alertmanager
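
If the manifests have already been applied unchanged, the same three NodePort changes could probably be made with kubectl patch instead of editing files. This is only a sketch: the helper names `nodeport_patch` and `expose_nodeport` are made up, and it assumes that a strategic-merge patch matches entries in a Service's ports list by their port number, which is worth verifying on your cluster version. Re-applying the original manifests later would undo the patch.

```shell
# Build a strategic-merge patch that switches a Service to NodePort.
# $1 = service port, $2 = desired nodePort
nodeport_patch() {
  printf '{"spec":{"type":"NodePort","ports":[{"port":%s,"nodePort":%s}]}}\n' "$1" "$2"
}

# Apply the patch to a Service in the monitoring namespace.
# $1 = service name, $2 = service port, $3 = desired nodePort
expose_nodeport() {
  kubectl -n monitoring patch service "$1" -p "$(nodeport_patch "$2" "$3")"
}

# Intended usage against a running cluster (not executed here):
#   expose_nodeport grafana 3000 30100
#   expose_nodeport prometheus-k8s 9090 30200
#   expose_nodeport alertmanager-main 9093 30300
```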

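Install

kubectl apply -f ../manifests/setup/
kubectl apply -f ../manifests/
Run these commands more than once if needed; the first pass typically fails while the custom resource types created from setup/ are still being registered. Always pass the directory itself rather than a /* glob. This tripped me up for a long time, and I assumed the package I had downloaded was broken.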
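
Rather than re-running the commands by hand, one option is to wait until the ServiceMonitor resource type is known to the API server before applying the rest. The sketch below uses a small generic `retry` helper (a made-up name, not part of kubectl); the attempt count and delay are arbitrary.

```shell
# Retry a command until it succeeds or the attempt limit is reached.
# $1 = max attempts, $2 = seconds between attempts, rest = command to run
retry() {
  max="$1"; delay="$2"; shift 2
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Intended usage from the manifests/ directory (not executed here):
#   kubectl apply -f ../manifests/setup/
#   retry 30 2 kubectl get servicemonitors --all-namespaces
#   kubectl apply -f ../manifests/
```

The `kubectl get servicemonitors` call fails while the resource type is unknown and succeeds, even with no objects, once it has been registered, which is what makes it usable as a readiness probe here.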

Check the result

[root@k8s-master manifests]# kubectl get pods -n monitoring
NAME                                   READY   STATUS              RESTARTS   AGE
alertmanager-main-0                    2/2     Running             0          15m
alertmanager-main-1                    0/2     Terminating         0          15m
alertmanager-main-2                    2/2     Running             0          15m
grafana-697c9fc764-l5s48               1/1     Running             0          15m
kube-state-metrics-6df7855645-nqz2v    3/3     Running             0          15m
node-exporter-c6vzw                    2/2     Running             0          15m
node-exporter-dj5bx                    0/2     ContainerCreating   0          15m
node-exporter-gr62p                    2/2     Running             0          15m
prometheus-adapter-5948989dcf-hjd4l    1/1     Running             0          15m
prometheus-k8s-0                       3/3     Running             1          15m
prometheus-k8s-1                       0/3     Terminating         0          15m
prometheus-operator-85dc59d49b-qp5dq   1/1     Running             0          15m


[root@k8s-master manifests]# kubectl top node
NAME         CPU(cores)   CPU%        MEMORY(bytes)   MEMORY%
k8s-master   212m         5%          1286Mi          33%
k8s-node02   139m         3%          1357Mi          35%
k8s-node01   <unknown>    <unknown>   <unknown>       <unknown>
[root@k8s-master manifests]# kubectl top pod
NAME                           CPU(cores)   MEMORY(bytes)
hello-world-658bf986c8-d8v76   0m           1Mi
hello-world-658bf986c8-sx4dv   0m           1Mi
myapp-65b44bddd5-z8vnd         0m           1Mi
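
With this many pods it is easy to miss the ones that are not healthy yet. As a small sketch, the filter below prints only the pods whose READY counts differ or whose STATUS is not Running. The function name `not_ready_pods` is made up, and the sample rows are copied from the pod listing in this post.

```shell
# Print the name and status of every pod that is not fully ready.
# Reads `kubectl get pods` output on stdin: NAME READY STATUS RESTARTS AGE
not_ready_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")              # READY looks like "2/2"
    if (ready[1] != ready[2] || $3 != "Running") print $1, $3
  }'
}

# Sample rows from this post; on a live cluster use:
#   kubectl get pods -n monitoring | not_ready_pods
not_ready_pods <<'EOF'
NAME READY STATUS RESTARTS AGE
alertmanager-main-0 2/2 Running 0 15m
alertmanager-main-1 0/2 Terminating 0 15m
node-exporter-dj5bx 0/2 ContainerCreating 0 15m
prometheus-k8s-0 3/3 Running 1 15m
EOF
```

For the sample input this prints the alertmanager-main-1 and node-exporter-dj5bx rows, the two pods that are still settling.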

Accessing Prometheus

Open it in a browser

The NodePort for Prometheus is 30200, so browse to http://masterIP:30200

Check the status

Example

The Prometheus web UI supports basic queries. For example, the following query shows the CPU usage of each pod in the k8s cluster:

sum by (pod_name)( rate(container_cpu_usage_seconds_total{image!="",pod_name!=""}[1m]))

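The query above returns data, which shows that Prometheus is collecting metrics normally (this particular metric comes from the kubelet's built-in cAdvisor rather than from node-exporter). Next we can open Grafana to get a friendlier web UI for the data.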
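
The same query can also be run without the browser, through Prometheus's HTTP API. The sketch below wraps curl in a helper called `prom_query` (a made-up name); `masterIP` is the same placeholder used earlier in this post. One caveat to keep in mind: on newer Kubernetes versions these cAdvisor labels are named `pod` and `container` rather than `pod_name` and `container_name`, so if the query returns nothing, try the newer label names.

```shell
# Run an instant PromQL query through Prometheus's HTTP API.
# $1 = base URL (e.g. http://masterIP:30200), $2 = PromQL expression
prom_query() {
  # -G sends the URL-encoded data as a query string on a GET request
  curl -sS -G "$1/api/v1/query" --data-urlencode "query=$2"
}

# Intended usage against the NodePort from this post (not executed here):
#   prom_query http://masterIP:30200 \
#     'sum by (pod_name)(rate(container_cpu_usage_seconds_total{image!="",pod_name!=""}[1m]))'
```

The response is JSON, so piping it into a JSON formatter makes it easier to read.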

Accessing Grafana

Open it in a browser

http://192.168.128.140:30100

Default username and password: admin/admin

Change the default password

Add a data source

One is already configured by default; click into it and test it

Import a dashboard template

View the data

Go to Home and click the top-left corner

View node data

You can see the resource usage of each node

Installing metrics-server on its own

Install either metrics-server or Prometheus, not both. metrics-server has no UI.
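
A likely reason for the "one or the other" rule is that kubectl top reads from the metrics.k8s.io API, and only one backend can be registered for it at a time. In the kube-prometheus setup that role appears to be filled by prometheus-adapter, which shows up in the pod list earlier in this post. As a sketch, the helper below (with the made-up name `metrics_api_backend`) prints which Service currently backs that API.

```shell
# Print the Service that currently backs the resource metrics API (metrics.k8s.io).
metrics_api_backend() {
  kubectl get apiservice v1beta1.metrics.k8s.io \
    -o jsonpath='{.spec.service.namespace}/{.spec.service.name}{"\n"}'
}

# Intended usage against a running cluster (not executed here):
#   metrics_api_backend
```

If the command reports that the APIService is not found, neither backend is installed and kubectl top will not work.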