Prometheus, Grafana

Prometheus

Create a namespace to run Prometheus in

kubectl create ns monitoring

Add the prometheus-community chart repository

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
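After adding a repository, the local chart index may be stale. As a minimal sketch, the helper below (the name refresh_chart_index is hypothetical) refreshes the index and lists the chart so you can see which version will be installed:

```shell
# Hypothetical helper: refresh the local Helm index and show the available chart version.
refresh_chart_index() {
  # Pull the latest index for every configured repository.
  helm repo update &&
  # List the chart this guide installs, with its chart and app versions.
  helm search repo prometheus-community/prometheus
}
```

Call it as refresh_chart_index before running the install step.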

Install Prometheus

helm upgrade -i prometheus prometheus-community/prometheus \
    --namespace monitoring \
    --set alertmanager.persistentVolume.storageClass="gp2",server.persistentVolume.storageClass="gp2"

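After the install, the prometheus-alertmanager and prometheus-server Pods stay in the Pending state. Running kubectl describe on them shows why they failed.

The cause is that the PVCs were not provisioned, which can happen when the cluster has no permission to manage EBS volumes.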
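The check described above can be sketched as follows. The helper name show_pending_reason is hypothetical, and the default namespace is the monitoring namespace created earlier:

```shell
# Hypothetical helper: print the Pending Pods and the PVC events of one namespace.
show_pending_reason() {
  ns="${1:-monitoring}"
  # Pods stuck in Pending are usually waiting on an unbound PVC.
  kubectl get pods -n "$ns" --field-selector=status.phase=Pending
  # The STATUS column shows Pending or Bound for each claim.
  kubectl get pvc -n "$ns"
  # The Events section at the end of each claim carries the provisioning error.
  kubectl describe pvc -n "$ns"
}
```

Running show_pending_reason monitoring should surface the provisioning error in the Events section of the affected claims.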

Create the IAM role for the CSI driver

Create the access permissions required to use AWS EBS

eksctl create iamserviceaccount \
  --name ebs-csi-controller-sa \
  --namespace kube-system \
  --cluster <CLUSTER-NAME> \
  --attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy \
  --approve \
  --role-only \
  --role-name AmazonEKS_EBS_CSI_DriverRole
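eksctl create iamserviceaccount relies on IAM Roles for Service Accounts, which requires an IAM OIDC provider to be associated with the cluster. If the command above fails with an OIDC-related error, the prerequisite step may look like the sketch below. The helper name associate_oidc is hypothetical:

```shell
# Hypothetical helper: associate an IAM OIDC provider with the cluster (needed once per cluster).
associate_oidc() {
  cluster="$1"
  # Registers the cluster's OIDC issuer as an IAM identity provider.
  eksctl utils associate-iam-oidc-provider --cluster "$cluster" --approve
}
```

Call it as associate_oidc <CLUSTER-NAME>, then re-run the iamserviceaccount command.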

Install the EBS CSI driver add-on with the IAM role attached

eksctl create addon --name aws-ebs-csi-driver --cluster <CLUSTER-NAME> --service-account-role-arn arn:aws:iam::<AWS-Account-ID>:role/AmazonEKS_EBS_CSI_DriverRole --force
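To avoid typing the account ID by hand, the role ARN can be derived from the active AWS credentials. This is a minimal sketch that assumes the AWS CLI is configured for the same account as the cluster. The helper name ebs_csi_role_arn is hypothetical:

```shell
# Hypothetical helper: build the role ARN from the caller's AWS account ID.
ebs_csi_role_arn() {
  role_name="${1:-AmazonEKS_EBS_CSI_DriverRole}"
  # Ask STS which account the current credentials belong to.
  account_id="$(aws sts get-caller-identity --query Account --output text)"
  printf 'arn:aws:iam::%s:role/%s\n' "$account_id" "$role_name"
}
```

The result can then be passed to the add-on command, for example --service-account-role-arn "$(ebs_csi_role_arn)".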

Verify the EBS CSI add-on

eksctl get addon --name aws-ebs-csi-driver --cluster <CLUSTER-NAME>
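Besides the add-on status, it can help to confirm from inside the cluster that the driver's controller is running and that the gp2 StorageClass used in this guide exists. This sketch assumes the add-on deploys a controller named ebs-csi-controller in kube-system, matching the ebs-csi-controller-sa service account above. The helper name check_ebs_csi is hypothetical:

```shell
# Hypothetical helper: confirm the CSI controller and the gp2 StorageClass are present.
check_ebs_csi() {
  # The controller Deployment should report all replicas as READY.
  kubectl get deployment ebs-csi-controller -n kube-system &&
  # The StorageClass referenced by the Helm values and the Grafana PVC.
  kubectl get storageclass gp2
}
```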

Once the permissions are granted, the server PVC becomes Bound, but the alertmanager PVC remains Pending.

kubectl describe reports that no storage class is set on the claim.

Open the problematic PVC with kubectl edit.

kubectl edit pvc storage-prometheus-alertmanager-0 -n monitoring

If storageClassName is missing from the spec, add it (gp2 in this setup).

The PVC then changes to the Bound state.
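To confirm the fix without opening an editor, the claim's storage class can be read directly. The second helper lists the storage-related keys of the chart. In recent chart versions Alertmanager appears to be configured through a subchart whose key is alertmanager.persistence.storageClass rather than alertmanager.persistentVolume.storageClass, which would explain the empty class. Treat that as an assumption to verify against your chart version. Both helper names are hypothetical:

```shell
# Hypothetical helper: print the storageClassName of one PVC (empty output means it is unset).
pvc_storage_class() {
  kubectl get pvc "$1" -n "${2:-monitoring}" -o jsonpath='{.spec.storageClassName}'
}

# Hypothetical helper: list the storage-related keys the chart's default values expose.
chart_storage_keys() {
  helm show values prometheus-community/prometheus |
    grep -n -E 'persistence|persistentVolume|storageClass'
}
```

For example, pvc_storage_class storage-prometheus-alertmanager-0 should print gp2 once the claim has been edited.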

Grafana

Installing Grafana

Deploy Grafana on Kubernetes | Grafana documentation

Guide for deploying Grafana on Kubernetes

https://grafana.com/docs/grafana/latest/setup-grafana/installation/kubernetes/

See the link above for how to install Grafana.

Download the YAML file from the official Grafana site and modify it as follows.

# grafana.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-pvc
  namespace: monitoring # namespace added
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: gp2 # storageClassName set explicitly
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: monitoring # namespace added
spec:
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      securityContext:
        fsGroup: 472
        supplementalGroups:
          - 0
      containers:
        - name: grafana
          image: grafana/grafana:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3000
              name: http-grafana
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /robots.txt
              port: 3000
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 30
            successThreshold: 1
            timeoutSeconds: 2
          livenessProbe:
            failureThreshold: 3
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            tcpSocket:
              port: 3000
            timeoutSeconds: 1
          resources:
            requests:
              cpu: 250m
              memory: 750Mi
          volumeMounts:
            - mountPath: /var/lib/grafana
              name: grafana-pv
      volumes:
        - name: grafana-pv
          persistentVolumeClaim:
            claimName: grafana-pvc
---
# type and port changed so that Grafana is reached through a subdomain
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: monitoring # namespace added
spec:
  ports:
    - port: 80 
      protocol: TCP
      targetPort: 3000
  selector:
    app: grafana
  sessionAffinity: None
  type: ClusterIP # LoadBalancer -> ClusterIP
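Applying the manifest is not shown in the walkthrough. A minimal sketch, assuming the file is saved as grafana.yaml in the current directory, is below. The helper name deploy_grafana is hypothetical:

```shell
# Hypothetical helper: apply the manifest and wait for Grafana to become ready.
deploy_grafana() {
  manifest="${1:-grafana.yaml}"
  # Create or update the PVC, Deployment, and Service.
  kubectl apply -f "$manifest" &&
  # Block until the Deployment's Pod passes its readiness probe (or time out).
  kubectl rollout status deployment/grafana -n monitoring --timeout=180s &&
  # The claim should show STATUS Bound once the EBS volume is provisioned.
  kubectl get pvc grafana-pvc -n monitoring
}
```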

Next, write an ALB Ingress for accessing Prometheus and Grafana.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: monitoring-ingress
  namespace: monitoring
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS":443}]'
    alb.ingress.kubernetes.io/certificate-arn: <certificate-arn>
    alb.ingress.kubernetes.io/ssl-redirect: '443'
spec:
  ingressClassName: alb
  rules:
  - host: prometheus.enjoytrip.shop
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: prometheus-server
            port:
              number: 80
  - host: grafana.enjoytrip.shop
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: grafana
            port:
              number: 80

Creating the Ingress provisions a load balancer.

Add that load balancer's DNS name as a record for the domain registered in Route53.
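The DNS name to register can be read from the Ingress status once the load balancer has been provisioned. A minimal sketch, using the Ingress name and namespace from the manifest above, is below. The helper name ingress_hostname is hypothetical:

```shell
# Hypothetical helper: print the load balancer hostname assigned to an Ingress.
ingress_hostname() {
  name="${1:-monitoring-ingress}"
  ns="${2:-monitoring}"
  # The hostname field stays empty until the controller has created the load balancer.
  kubectl get ingress "$name" -n "$ns" \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
}
```

The printed hostname is the value to use as the record target for prometheus.enjoytrip.shop and grafana.enjoytrip.shop.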

Verify access

Initial credentials are ID: admin, PW: admin

Grafana configuration

1. Create a data source

Under Connection, point the data source at the Prometheus Service.
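Because Grafana runs in the same cluster, the data source URL can use the Service's in-cluster DNS name rather than the public domain. The sketch below builds that URL from the Service name and port used in the Ingress above, assuming the default cluster domain cluster.local. The helper name service_url is hypothetical:

```shell
# Hypothetical helper: build the in-cluster URL of a Service.
service_url() {
  svc="$1"
  ns="$2"
  port="${3:-80}"
  # Format: http://<service>.<namespace>.svc.cluster.local:<port>
  printf 'http://%s.%s.svc.cluster.local:%s\n' "$svc" "$ns" "$port"
}

service_url prometheus-server monitoring 80
# prints http://prometheus-server.monitoring.svc.cluster.local:80
```

Paste the printed URL into the data source's URL field.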

2. Dashboard setup

Grafana's official dashboards are listed on the site below.

Grafana dashboards | Grafana Labs

Browse a library of official and community-built dashboards.

https://grafana.com/grafana/dashboards/