Kubernetes study notes (Prometheus)

  1. Installing Prometheus
    1.1. RBAC authorization
    1.2. Prometheus storage
    1.3. Configuration file
    1.4. Deploying Prometheus
    1.5. Exposing the Prometheus service
    1.6. Access test
  2. Deploying Grafana
    2.1. Defining storage
    2.2. Deploying Grafana
    2.3. Exposing the service
  3. Deploying node-exporter
  4. Usage

Installing Prometheus

RBAC authorization

First, grant Prometheus the permissions it needs through RBAC:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: kube-monitor
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources: ["nodes", "nodes/proxy", "services", "endpoints", "pods"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["extensions", "networking.k8s.io"]
  resources: ["ingresses"]
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus ## bind the read-only ClusterRole defined above rather than cluster-admin
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: kube-monitor

kubectl apply -f prometheus-rbac.yaml -n kube-monitor
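
Every kubectl command in these notes targets the kube-monitor namespace, so that namespace has to exist before the first apply. The notes do not show how it was created. A minimal sketch, assuming it has not been created yet, is a Namespace manifest like the one below (the file name namespace.yaml is only an illustration):

```yaml
# Assumption: the kube-monitor namespace does not exist yet.
apiVersion: v1
kind: Namespace
metadata:
  name: kube-monitor
```

It would be applied with kubectl apply -f namespace.yaml before prometheus-rbac.yaml.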

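Prometheus storage

Define the PersistentVolume and PersistentVolumeClaim that back Prometheus storage (the NFS directory must be readable and writable):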

apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus
  labels:
    k8s-app: prometheus
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: course-nfs-storage ## StorageClass name; the PVC must use the same name
  mountOptions:
  - hard
  nfs:
    server: 172.16.8.40 ## NFS server address
    path: /data/k8s ## NFS data directory
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: prometheus
  labels:
    k8s-app: prometheus
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: course-nfs-storage ## StorageClass name
  resources:
    requests:
      storage: 10Gi
  selector:
    matchLabels:
      k8s-app: prometheus

kubectl apply -f prometheus-sc.yaml -n kube-monitor

Configuration file

The Prometheus configuration file is managed in a ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      evaluation_interval: 15s
      external_labels:
        cluster: "kubernetes"
    scrape_configs:
    ###################### Node Exporter ######################
    - job_name: 'node-exporter'
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: replace
        source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:9100'
        target_label: __address__

kubectl apply -f prometheus-config.yaml -n kube-monitor
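
With the node role, Prometheus discovers one target per node at the kubelet address (port 10250). The relabel rule in the job rewrites that address to port 9100, where node-exporter listens. The fragment below is a sketch of the same job with one optional extra rule, a labelmap that copies the Kubernetes node labels onto the scraped series. The extra rule is an addition for illustration and is not part of the configuration in these notes:

```yaml
# Fragment of scrape_configs in prometheus.yml
- job_name: 'node-exporter'
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  # Rewrite the discovered kubelet address (port 10250) to the node-exporter port.
  - action: replace
    source_labels: [__address__]
    regex: '(.*):10250'
    replacement: '${1}:9100'
    target_label: __address__
  # Optional addition: expose node labels such as kubernetes.io/hostname
  # as labels on every series scraped from that node.
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
```

Copying node labels this way makes it possible to filter or group node metrics by those labels in queries.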

Deploying Prometheus

apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  labels:
    k8s-app: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: prometheus
  template:
    metadata:
      labels:
        k8s-app: prometheus
    spec:
      serviceAccountName: prometheus
      containers:
      - name: prometheus
        image: prom/prometheus:v2.26.0
        ports:
        - name: http
          containerPort: 9090
        securityContext:
          runAsUser: 65534
          privileged: true
        command:
        - "/bin/prometheus"
        args:
        - "--config.file=/etc/prometheus/prometheus.yml"
        - "--web.enable-lifecycle"
        - "--storage.tsdb.path=/prometheus"
        - "--storage.tsdb.retention.time=10d"
        - "--web.console.libraries=/etc/prometheus/console_libraries"
        - "--web.console.templates=/etc/prometheus/consoles"
        resources:
          limits:
            cpu: 2000m
            memory: 1024Mi
          requests:
            cpu: 1000m
            memory: 512Mi
        readinessProbe:
          httpGet:
            path: /-/ready
            port: 9090
          initialDelaySeconds: 5
          timeoutSeconds: 10
        livenessProbe:
          httpGet:
            path: /-/healthy
            port: 9090
          initialDelaySeconds: 30
          timeoutSeconds: 30
        volumeMounts:
        - name: data
          mountPath: /prometheus
          subPath: prometheus
        - name: config
          mountPath: /etc/prometheus
      - name: configmap-reload ## sidecar that calls the reload endpoint when the ConfigMap changes (needs --web.enable-lifecycle)
        image: jimmidyson/configmap-reload:v0.5.0
        args:
        - "--volume-dir=/etc/config"
        - "--webhook-url=http://localhost:9090/-/reload"
        resources:
          limits:
            cpu: 10m
            memory: 10Mi
          requests:
            cpu: 10m
            memory: 10Mi
        volumeMounts:
        - name: config
          mountPath: /etc/config
          readOnly: true
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: prometheus
      - name: config
        configMap:
          name: prometheus-config

kubectl apply -f prometheus-deploy.yaml -n kube-monitor

Note: these notes access Prometheus through an ingress, but other methods also work.

Method 2 (port forwarding):

kubectl port-forward -n kube-monitor `kubectl get pod -l k8s-app=prometheus -n kube-monitor -o go-template --template '{{range .items}}{{.metadata.name}}{{end}}'` 9090:9090

Prometheus is then reachable at 127.0.0.1:9090 on the local machine.
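
Besides the ingress and port forwarding, a NodePort Service is another common way to reach Prometheus from outside the cluster. The sketch below is not part of the original setup: the Service name prometheus-nodeport and the port 30090 are assumptions chosen for illustration.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: prometheus-nodeport   # hypothetical name
  labels:
    k8s-app: prometheus
spec:
  type: NodePort
  ports:
  - name: web
    protocol: TCP
    port: 9090
    targetPort: 9090
    nodePort: 30090           # assumed value; must lie in the cluster's NodePort range (30000-32767 by default)
  selector:
    k8s-app: prometheus       # same pod label as the Deployment above
```

With such a Service applied in kube-monitor, Prometheus would be reachable on port 30090 of any node's IP address.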

Exposing the Prometheus service

apiVersion: v1
kind: Service
metadata:
  name: prometheus
  labels:
    k8s-app: prometheus
spec:
  ports:
  - name: web
    protocol: TCP
    port: 9090
  selector:
    k8s-app: prometheus

kubectl apply -f prometheus-service.yaml -n kube-monitor

Then route external traffic to the Service with a Traefik IngressRoute:

apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: prometheus-route
spec:
  entryPoints:
  - web
  routes:
  - match: Host(`prometheus.k8s.local`)
    kind: Rule
    services:
    - name: prometheus
      port: 9090

kubectl apply -f prometheus-ingress.yaml -n kube-monitor

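Access test

Add an entry for prometheus.k8s.local to the local hosts file, pointing at the address where the ingress is reachable, then open that hostname in a browser.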

Deploying Grafana

Grafana is a fairly easy-to-use data visualization tool, and it is also simple to deploy.

Defining storage

apiVersion: v1
kind: PersistentVolume
metadata:
  name: grafana
  labels:
    k8s-app: grafana
spec:
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: grafana
  mountOptions:
  - hard
  - nfsvers=4.1
  nfs:
    path: /data/k8s ## NFS server directory
    server: 172.16.8.40 ## NFS server address
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: grafana
  labels:
    k8s-app: grafana
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: grafana
  resources:
    requests:
      storage: 5Gi
  selector:
    matchLabels:
      k8s-app: grafana

kubectl apply -f grafana-storage.yaml -n kube-monitor

Deploying Grafana

apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  labels:
    k8s-app: grafana
spec:
  selector:
    matchLabels:
      k8s-app: grafana
  template:
    metadata:
      labels:
        k8s-app: grafana
    spec:
      initContainers: ## init container that fixes the owner and group of the mounted storage directory
      - name: init-file
        image: busybox:1.28
        imagePullPolicy: IfNotPresent
        securityContext:
          runAsUser: 0
        command: ['chown', '-R', "472:0", "/var/lib/grafana"]
        volumeMounts:
        - name: data
          mountPath: /var/lib/grafana
          subPath: grafana
      containers:
      - name: grafana ## Grafana container
        image: grafana/grafana:7.4.3
        ports:
        - name: http
          containerPort: 3000
        env: ## environment variables that set Grafana's default admin user name and password
        - name: GF_SECURITY_ADMIN_USER
          value: "admin"
        - name: GF_SECURITY_ADMIN_PASSWORD
          value: "admin"
        readinessProbe: ## readiness probe
          failureThreshold: 10
          httpGet:
            path: /api/health
            port: 3000
            scheme: HTTP
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 30
        livenessProbe: ## liveness probe
          failureThreshold: 10
          httpGet:
            path: /api/health
            port: 3000
            scheme: HTTP
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        volumeMounts: ## container volume mounts
        - name: data
          mountPath: /var/lib/grafana
          subPath: grafana
      volumes: ## shared storage volumes
      - name: data
        persistentVolumeClaim:
          claimName: grafana ## the PVC to use
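
The Deployment above puts the admin user name and password directly into the manifest as plain environment values. One possible variation, sketched here under the assumption that a Secret is acceptable in this cluster, keeps the credentials in a Secret and references them from the container. The Secret name grafana-admin, its key names, and the placeholder password are hypothetical.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: grafana-admin          # hypothetical name
type: Opaque
stringData:
  admin-user: admin
  admin-password: change-me    # placeholder value, replace before use
```

The env section of the grafana container would then read the values from that Secret instead of literals:

```yaml
# Fragment of the grafana container spec (replaces the literal env values)
env:
- name: GF_SECURITY_ADMIN_USER
  valueFrom:
    secretKeyRef:
      name: grafana-admin
      key: admin-user
- name: GF_SECURITY_ADMIN_PASSWORD
  valueFrom:
    secretKeyRef:
      name: grafana-admin
      key: admin-password
```

The Secret must live in the same namespace as the Deployment (kube-monitor here) for the reference to resolve.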

Exposing the service

apiVersion: v1
kind: Service
metadata:
  name: grafana
  labels:
    k8s-app: grafana
spec:
  ports:
  - name: grafana
    protocol: TCP
    port: 3000
  selector:
    k8s-app: grafana

kubectl apply -f grafana-service.yaml -n kube-monitor

Then route external traffic to the Service with a Traefik IngressRoute:

apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: grafana-route
spec:
  entryPoints:
  - web
  routes:
  - match: Host(`grafana.k8s.local`)
    kind: Rule
    services:
    - name: grafana
      port: 3000

kubectl apply -f grafana-ingress.yaml -n kube-monitor

Deploying node-exporter

node-exporter collects node-level metrics. It runs as a DaemonSet so that every node gets one instance.

apiVersion: v1
kind: Service
metadata:
  name: node-exporter
  labels:
    k8s-app: node-exporter
spec:
  type: ClusterIP
  ports:
  - name: http
    port: 9100
    targetPort: 9100
  selector:
    k8s-app: node-exporter
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  labels:
    k8s-app: node-exporter
spec:
  selector:
    matchLabels:
      k8s-app: node-exporter
  template:
    metadata:
      labels:
        k8s-app: node-exporter
    spec:
      containers:
      - name: node-exporter
        image: prom/node-exporter:v1.1.2
        ports:
        - name: metrics
          containerPort: 9100
        args:
        - "--path.procfs=/host/proc"
        - "--path.sysfs=/host/sys"
        - "--path.rootfs=/host"
        volumeMounts:
        - name: dev
          mountPath: /host/dev
        - name: proc
          mountPath: /host/proc
        - name: sys
          mountPath: /host/sys
        - name: rootfs
          mountPath: /host
      volumes:
      - name: dev
        hostPath:
          path: /dev
      - name: proc
        hostPath:
          path: /proc
      - name: sys
        hostPath:
          path: /sys
      - name: rootfs
        hostPath:
          path: /
      hostPID: true
      hostNetwork: true
      tolerations:
      - operator: "Exists"

kubectl apply -f node-exporter.yaml -n kube-monitor
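
The node-exporter scrape job in the Prometheus ConfigMap discovers targets through the node role, so it does not actually use the node-exporter Service defined above. As an alternative sketch, the Service's endpoints could be discovered instead. The job name node-exporter-endpoints and the target label node are assumptions for illustration, and this job is not part of the original configuration.

```yaml
# Fragment of scrape_configs in prometheus.yml
- job_name: 'node-exporter-endpoints'   # hypothetical job name
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names: ['kube-monitor']
  relabel_configs:
  # Keep only the endpoints that belong to the node-exporter Service.
  - action: keep
    source_labels: [__meta_kubernetes_service_name]
    regex: node-exporter
  # Record which node each target runs on.
  - action: replace
    source_labels: [__meta_kubernetes_pod_node_name]
    target_label: node
```

With this approach no port rewriting is needed, because the endpoints already carry port 9100. The ClusterRole from the RBAC section grants the services, endpoints, and pods access that endpoint discovery relies on.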

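Usage

In Grafana, add a data source and enter the Prometheus address, which here is http://prometheus:9090 (the Service name resolves because Grafana and Prometheus run in the same namespace). After saving the data source, dashboards that query it show the collected metrics.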
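
The data source can also be declared in a file instead of being added by hand. Grafana reads data source provisioning files at startup from its provisioning/datasources directory (/etc/grafana/provisioning/datasources in the official image). A minimal sketch of such a file is shown below. The Deployment in these notes does not mount one, so using it would additionally require mounting the file into the container, for example from a ConfigMap.

```yaml
# Grafana data source provisioning file (sketch)
apiVersion: 1
datasources:
- name: Prometheus
  type: prometheus
  access: proxy                 # Grafana's backend performs the queries
  url: http://prometheus:9090   # the Prometheus Service in the same namespace
  isDefault: true
```

Provisioning the data source this way keeps it in place even if the Grafana storage directory is recreated.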