Installing Prometheus
RBAC Authorization
First, grant Prometheus the permissions it needs via RBAC:
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: kube-monitor
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources: ["nodes", "nodes/proxy", "services", "endpoints", "pods"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["extensions"]
  resources: ["ingresses"]
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: kube-monitor
```
```shell
kubectl apply -f prometheus-rbac.yaml -n kube-monitor
```
 
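To confirm the binding took effect, you can impersonate the ServiceAccount with `kubectl auth can-i` (a quick spot check, assuming the manifests above were applied to a running cluster):

```shell
# Impersonate the prometheus ServiceAccount and probe a few of the granted verbs;
# both commands should print "yes"
kubectl auth can-i list pods \
  --as=system:serviceaccount:kube-monitor:prometheus
kubectl auth can-i watch nodes \
  --as=system:serviceaccount:kube-monitor:prometheus
```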
Prometheus Storage
Define an NFS-backed PersistentVolume and a matching PersistentVolumeClaim (read/write access is required):
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus
  labels:
    k8s-app: prometheus
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: course-nfs-storage
  mountOptions:
    - hard
  nfs:
    server: 172.16.8.40
    path: /data/k8s
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: prometheus
  labels:
    k8s-app: prometheus
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: course-nfs-storage
  resources:
    requests:
      storage: 10Gi
  selector:
    matchLabels:
      k8s-app: prometheus
```
```shell
kubectl apply -f prometheus-sc.yaml -n kube-monitor
```
 
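Before moving on, it is worth checking that the claim actually bound to the volume:

```shell
# The PVC should report STATUS "Bound" once it has matched the PV
kubectl get pv,pvc -n kube-monitor -l k8s-app=prometheus
```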
Configuration
We manage the Prometheus configuration file in a ConfigMap:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
data:
  prometheus.yml: |
    global:
      scrape_interval:     15s
      evaluation_interval: 15s
      external_labels:
        cluster: "kubernetes"
    scrape_configs:
    ###################### Node Exporter ######################
    - job_name: 'node-exporter'
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: replace
        source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:9100'
        target_label: __address__
```
```shell
kubectl apply -f prometheus-config.yaml -n kube-monitor
```
 
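The relabel rule above rewrites each discovered node address from the kubelet port (10250) to the node-exporter port (9100). The same capture-and-replace can be sanity-checked locally with sed, where `\1` plays the role of Prometheus's `${1}`:

```shell
# Simulate the relabeling: '(.*):10250' captures the host part,
# then the port is swapped for 9100
echo "192.168.1.10:10250" | sed -E 's/(.*):10250/\1:9100/'
# -> 192.168.1.10:9100
```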
Deploying Prometheus
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  labels:
    k8s-app: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: prometheus
  template:
    metadata:
      labels:
        k8s-app: prometheus
    spec:
      serviceAccountName: prometheus
      containers:
      - name: prometheus
        image: prom/prometheus:v2.26.0
        ports:
        - name: http
          containerPort: 9090
        securityContext:
          runAsUser: 65534
          privileged: true
        command:
        - "/bin/prometheus"
        args:
        - "--config.file=/etc/prometheus/prometheus.yml"
        - "--web.enable-lifecycle"
        - "--storage.tsdb.path=/prometheus"
        - "--storage.tsdb.retention.time=10d"
        - "--web.console.libraries=/etc/prometheus/console_libraries"
        - "--web.console.templates=/etc/prometheus/consoles"
        resources:
          limits:
            cpu: 2000m
            memory: 1024Mi
          requests:
            cpu: 1000m
            memory: 512Mi
        readinessProbe:
          httpGet:
            path: /-/ready
            port: 9090
          initialDelaySeconds: 5
          timeoutSeconds: 10
        livenessProbe:
          httpGet:
            path: /-/healthy
            port: 9090
          initialDelaySeconds: 30
          timeoutSeconds: 30
        volumeMounts:
        - name: data
          mountPath: /prometheus
          subPath: prometheus
        - name: config
          mountPath: /etc/prometheus
      - name: configmap-reload
        image: jimmidyson/configmap-reload:v0.5.0
        args:
        - "--volume-dir=/etc/config"
        - "--webhook-url=http://localhost:9090/-/reload"
        resources:
          limits:
            cpu: 10m
            memory: 10Mi
          requests:
            cpu: 10m
            memory: 10Mi
        volumeMounts:
        - name: config
          mountPath: /etc/config
          readOnly: true
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: prometheus
      - name: config
        configMap:
          name: prometheus-config
```
Note: here we access Prometheus through an Ingress, but other methods work as well. For a quick look without an Ingress, port-forward to the pod:

```shell
kubectl port-forward `kubectl get pod -l k8s-app=prometheus -n kube-monitor -o go-template --template '{{range .items}}{{.metadata.name}}{{end}}'` 9090:9090
```

Prometheus is then reachable at 127.0.0.1:9090.
```shell
kubectl apply -f prometheus-deploy.yaml -n kube-monitor
```
 
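Because the container is started with `--web.enable-lifecycle`, the configmap-reload sidecar can hot-reload Prometheus whenever the ConfigMap changes, and you can trigger a reload by hand as well. A quick check of the rollout plus a manual reload (assuming a port-forward to the pod is running on 9090):

```shell
# Wait until the Deployment reports all replicas available
kubectl rollout status deployment/prometheus -n kube-monitor

# Manually trigger a configuration reload via the lifecycle endpoint
curl -X POST http://127.0.0.1:9090/-/reload
```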
Exposing the Prometheus Service
```yaml
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  labels:
    k8s-app: prometheus
spec:
  ports:
  - name: web
    protocol: TCP
    port: 9090
  selector:
    k8s-app: prometheus
```
 
```shell
kubectl apply -f prometheus-service.yaml -n kube-monitor
```
 
```yaml
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: prometheus-route
spec:
  entryPoints:
  - web
  routes:
  - match: Host(`prometheus.k8s.local`)
    kind: Rule
    services:
      - name: prometheus
        port: 9090
```
 
```shell
kubectl apply -f prometheus-ingress.yaml -n kube-monitor
```
 
Access Test
Point prometheus.k8s.local at the Ingress entrypoint in your local hosts file, then open that address in a browser.
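If you don't want to touch the hosts file yet, the route can be verified with curl by setting the Host header directly (`<ingress-ip>` is a placeholder for your Traefik entrypoint address):

```shell
# A 200/302 response from Prometheus confirms the IngressRoute matches the Host rule
curl -is -H 'Host: prometheus.k8s.local' http://<ingress-ip>/graph | head -n 1
```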
Deploying Grafana
Grafana is a user-friendly data visualization tool, and deploying it is fairly straightforward.
Defining Storage
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: grafana
  labels:
    k8s-app: grafana
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: grafana
  mountOptions:
    - hard
    - nfsvers=4.1
  nfs:
    path: /data/k8s
    server: 172.16.8.40
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: grafana
  labels:
    k8s-app: grafana
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: grafana
  resources:
    requests:
      storage: 5Gi
  selector:
    matchLabels:
      k8s-app: grafana
```
 
```shell
kubectl apply -f grafana-storage.yaml -n kube-monitor
```
 
Deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  labels:
    k8s-app: grafana
spec:
  selector:
    matchLabels:
      k8s-app: grafana
  template:
    metadata:
      labels:
        k8s-app: grafana
    spec:
      initContainers:
      # Fix ownership of the data directory so Grafana (uid 472) can write to it
      - name: init-file
        image: busybox:1.28
        imagePullPolicy: IfNotPresent
        securityContext:
          runAsUser: 0
        command: ['chown', '-R', "472:0", "/var/lib/grafana"]
        volumeMounts:
        - name: data
          mountPath: /var/lib/grafana
          subPath: grafana
      containers:
      - name: grafana
        image: grafana/grafana:7.4.3
        ports:
        - name: http
          containerPort: 3000
        env:
        - name: GF_SECURITY_ADMIN_USER
          value: "admin"
        - name: GF_SECURITY_ADMIN_PASSWORD
          value: "admin"
        readinessProbe:
          failureThreshold: 10
          httpGet:
            path: /api/health
            port: 3000
            scheme: HTTP
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 30
        livenessProbe:
          failureThreshold: 10
          httpGet:
            path: /api/health
            port: 3000
            scheme: HTTP
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        volumeMounts:
        - name: data
          mountPath: /var/lib/grafana
          subPath: grafana
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: grafana
```
 
```shell
kubectl apply -f grafana-deploy.yaml -n kube-monitor
```
 
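The probes above use Grafana's `/api/health` endpoint; the same endpoint works for a manual check through a temporary port-forward:

```shell
kubectl port-forward deployment/grafana 3000:3000 -n kube-monitor &

# The health endpoint returns a small JSON document including "database": "ok"
curl -s http://127.0.0.1:3000/api/health
```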
Exposing the Service
```yaml
apiVersion: v1
kind: Service
metadata:
  name: grafana
  labels:
    k8s-app: grafana
spec:
  ports:
  - name: grafana
    protocol: TCP
    port: 3000
  selector:
    k8s-app: grafana
```
 
```shell
kubectl apply -f grafana-service.yaml -n kube-monitor
```
 
```yaml
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: grafana-route
spec:
  entryPoints:
  - web
  routes:
  - match: Host(`grafana.k8s.local`)
    kind: Rule
    services:
      - name: grafana
        port: 3000
```
 
```shell
kubectl apply -f grafana-ingress.yaml -n kube-monitor
```
 
Deploying node-exporter
node-exporter collects host-level metrics (CPU, memory, disk, network) from each node.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: node-exporter
  labels:
    k8s-app: node-exporter
spec:
  type: ClusterIP
  ports:
  - name: http
    port: 9100
    targetPort: 9100
  selector:
    k8s-app: node-exporter
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  labels:
    k8s-app: node-exporter
spec:
  selector:
    matchLabels:
      k8s-app: node-exporter
  template:
    metadata:
      labels:
        k8s-app: node-exporter
    spec:
      containers:
      - name: node-exporter
        image: prom/node-exporter:v1.1.2
        ports:
        - name: metrics
          containerPort: 9100
        args:
        - "--path.procfs=/host/proc"
        - "--path.sysfs=/host/sys"
        - "--path.rootfs=/host"
        volumeMounts:
        - name: dev
          mountPath: /host/dev
        - name: proc
          mountPath: /host/proc
        - name: sys
          mountPath: /host/sys
        - name: rootfs
          mountPath: /host
      volumes:
        - name: dev
          hostPath:
            path: /dev
        - name: proc
          hostPath:
            path: /proc
        - name: sys
          hostPath:
            path: /sys
        - name: rootfs
          hostPath:
            path: /
      hostPID: true
      hostNetwork: true
      tolerations:
      - operator: "Exists"
```
 
```shell
kubectl apply -f node-exporter.yaml -n kube-monitor
```
 
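Since the DaemonSet runs with `hostNetwork: true`, every node exposes the metrics endpoint on port 9100, which matches the relabeled address in the Prometheus scrape config. A quick spot check against any node (`<node-ip>` is a placeholder):

```shell
# node_* series in the output confirm the exporter is reading host metrics
curl -s http://<node-ip>:9100/metrics | grep '^node_load'
```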
Usage
In Grafana, add a Prometheus data source and fill in the Prometheus address; from inside the cluster this is http://prometheus:9090 (the Service name, resolved by cluster DNS). Save the data source and refresh, and the data will appear.
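Instead of clicking through the UI, the data source can also be created through Grafana's HTTP API (a sketch, assuming the default admin/admin credentials set in the Deployment and Grafana reachable via the Ingress host):

```shell
# Create a Prometheus data source using the cluster-internal Service address
curl -s -u admin:admin -X POST http://grafana.k8s.local/api/datasources \
  -H 'Content-Type: application/json' \
  -d '{
        "name": "Prometheus",
        "type": "prometheus",
        "url": "http://prometheus:9090",
        "access": "proxy",
        "isDefault": true
      }'
```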