[Kubeflow] Installing Kubeflow

Honestree 2022. 10. 20. 17:27
  1. NFS setup
  2. Configuring PV, PVC, and StorageClass
  3. Installing Kubeflow

1. NFS setup

  • The worker node server is used as the NFS server

1.1 Configuring the NFS server

  • Connect to the worker node and run the following commands
$ sudo -i 
$ apt-get update 
$ apt install nfs-common nfs-kernel-server portmap 
$ mkdir /home/share/nfs -p 
$ chmod 777 /home/share/nfs 
$ vi /etc/exports
# Add the following line
/home/share/nfs *(rw,no_root_squash,sync,insecure,no_subtree_check)
#
$ service nfs-server restart
$ systemctl status nfs-server.service
$ showmount -e 127.0.0.1 
$ mount -t nfs 192.168.72.102:/home/share/nfs /mnt
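
Editing /etc/exports by hand works, but repeating the setup can leave duplicate entries. As a minimal sketch, the export line can be appended only when it is missing. The helper name add_export is hypothetical and not part of the original steps; the file path is a parameter so the function can be tried on a scratch file first.

```shell
#!/usr/bin/env bash
# Append an export entry to an exports file only if that exact line is absent.
# add_export is a hypothetical helper name used for illustration.
add_export() {
  local exports_file="$1"
  local entry="$2"
  touch "$exports_file"
  # -x matches the whole line; -F treats the entry as a fixed string,
  # which matters because it contains "*" and parentheses.
  if ! grep -qxF -- "$entry" "$exports_file"; then
    printf '%s\n' "$entry" >> "$exports_file"
  fi
}

# Example (run as root), followed by a reload of the export table:
#   add_export /etc/exports '/home/share/nfs *(rw,no_root_squash,sync,insecure,no_subtree_check)'
#   exportfs -ra
```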

 

  • Connect to the master node and run the following commands
$ sudo -i 
$ apt-get update
$ apt install nfs-common nfs-kernel-server portmap
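
Once the client packages are installed, it can help to confirm from the master node that the export on the worker is actually reachable over the network. The following is a sketch under the assumption that showmount (shipped with nfs-common) is available; export_visible is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Return 0 if the given NFS server lists the given path in its export table.
# export_visible is a hypothetical helper; it relies on showmount from nfs-common.
export_visible() {
  local server="$1"
  local path="$2"
  # showmount -e prints a header line followed by "<path> <allowed clients>" rows.
  showmount -e "$server" | awk -v p="$path" '$1 == p { found = 1 } END { exit !found }'
}

# Example, using the server address and path from the steps above:
#   export_visible 192.168.72.102 /home/share/nfs && echo "export is visible"
```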

2. Configuring PV, PVC, and StorageClass

  • Connect to the master node and create a YAML file
$ vim test.yaml

kind: StorageClass
apiVersion: storage.k8s.io/v1
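metadata:
  name: my-storageclass
# kubernetes.io/no-provisioner does not provision volumes dynamically,
# so the parameters below are not used by it.
provisioner: kubernetes.io/no-provisioner
parameters:
  server: 192.168.72.102
  path: /home/share/nfs
  readOnly: "false"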

---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nfs-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pc
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 192.168.72.102
    path: /home/share/nfs

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dpl-nginx
spec:
  selector:
    matchLabels:
      app: dpl-nginx
  replicas: 2
  template:
    metadata:
      labels:
        app: dpl-nginx
    spec:
      containers:
      - name: master
        image: nginx
        ports:
        - containerPort: 80
        volumeMounts:
        - mountPath: /mnt
          name: pvc-volume
      volumes:
      - name: pvc-volume
        persistentVolumeClaim:
          claimName: nfs-pvc

$ kubectl apply -f test.yaml
$ kubectl patch storageclass my-storageclass -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
$ sudo vim /etc/kubernetes/manifests/kube-apiserver.yaml

# Add the following flags (if --enable-admission-plugins is already present, edit it in place)
    - --enable-admission-plugins=NodeRestriction,PodNodeSelector,DefaultStorageClass
    - --service-account-signing-key-file=/etc/kubernetes/pki/sa.key
    - --service-account-issuer=kubernetes.default.svc
#
$ kubectl taint nodes --all node-role.kubernetes.io/master-
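
To confirm that the steps in this section took effect, two quick checks can be scripted: whether the claim nfs-pvc has bound to a volume, and whether the API server manifest really lists DefaultStorageClass. This is a sketch only; pvc_phase and admission_plugin_enabled are hypothetical helper names, and the claim is assumed to live in the current namespace.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking the results of the steps above.

# Print the phase of a PersistentVolumeClaim (for example Pending or Bound).
pvc_phase() {
  local name="$1"
  kubectl get pvc "$name" -o jsonpath='{.status.phase}'
}

# Return 0 if the kube-apiserver manifest enables the given admission plugin.
admission_plugin_enabled() {
  local manifest="$1"
  local plugin="$2"
  # Take the value after --enable-admission-plugins=, strip trailing whitespace,
  # split on commas, and look for an exact match.
  grep -- '--enable-admission-plugins=' "$manifest" \
    | sed -e 's/.*--enable-admission-plugins=//' -e 's/[[:space:]]*$//' \
    | tr ',' '\n' \
    | grep -qx -- "$plugin"
}

# Example:
#   pvc_phase nfs-pvc
#   admission_plugin_enabled /etc/kubernetes/manifests/kube-apiserver.yaml DefaultStorageClass
```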

3. Installing Kubeflow

  • Connect to the master node and run the following commands
$ git clone https://github.com/kubeflow/manifests.git
$ cd manifests
$ git checkout tags/v1.3.1
$ wget https://github.com/kubernetes-sigs/kustomize/releases/download/v3.2.0/kustomize_3.2.0_linux_amd64 -O kustomize
$ sudo mv ./kustomize /usr/local/bin/kustomize
$ sudo chmod 755 /usr/local/bin/kustomize
$ while ! kustomize build example | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 10; done
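
The loop above retries forever, which can hide a failure that will never resolve on its own. As a sketch of a bounded alternative, the same pipeline can be wrapped in a retry helper that gives up after a fixed number of attempts. The names retry and apply_kubeflow, and the attempt count in the example, are assumptions for illustration and not part of the Kubeflow manifests repository.

```shell
#!/usr/bin/env bash
# Run a command until it succeeds, giving up after max_attempts tries.
# retry is a hypothetical helper name.
retry() {
  local max_attempts="$1"
  local delay_seconds="$2"
  shift 2
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Giving up after $attempt attempts" >&2
      return 1
    fi
    echo "Retrying to apply resources (attempt $attempt of $max_attempts failed)" >&2
    attempt=$((attempt + 1))
    sleep "$delay_seconds"
  done
}

# The pipeline from the original loop, wrapped so retry can call it as one command.
apply_kubeflow() {
  kustomize build example | kubectl apply -f -
}

# Example: at most 30 attempts, 10 seconds apart.
#   retry 30 10 apply_kubeflow
```
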
  • After the installation completes, check that the pods are running
$ kubectl get pod -A | grep -E 'NAME|^auth|^cert-manager|^istio-system|^knative-|^kubeflow'
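
Scanning a long pod list by eye is error-prone, so the check can be reduced to a single number. The sketch below counts pods whose STATUS is neither Running nor Completed; count_unhealthy_pods is a hypothetical helper name, and it assumes the default column layout of kubectl get pod -A --no-headers.

```shell
#!/usr/bin/env bash
# Read `kubectl get pod -A --no-headers` output on stdin and print how many pods
# are in a state other than Running or Completed.
# Columns are NAMESPACE NAME READY STATUS RESTARTS AGE, so STATUS is field 4.
# Note: a pod can be Running without being ready; this only inspects STATUS.
count_unhealthy_pods() {
  awk '$4 != "Running" && $4 != "Completed" { n++ } END { print n + 0 }'
}

# Example:
#   kubectl get pod -A --no-headers | count_unhealthy_pods
```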

Trouble

  • Possibly because of insufficient specs, the VM goes down while the Kubeflow-related containers are being created
    • Current specs:
      • 4 CPUs, 12 GB memory, 100 GB VDI disk
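
Before retrying the install on a resized VM, the resources the guest actually sees can be checked from inside it. The following is a rough sketch for Linux; has_resources is a hypothetical helper, and the thresholds are parameters because the original post does not state what Kubeflow requires.

```shell
#!/usr/bin/env bash
# Return 0 if this machine has at least min_cpus CPUs and min_mem_gib GiB of RAM.
# has_resources and both thresholds are hypothetical; pick values that match
# the requirements of the Kubeflow release being installed.
has_resources() {
  local min_cpus="$1"
  local min_mem_gib="$2"
  local cpus mem_kib
  cpus=$(nproc)
  # MemTotal in /proc/meminfo is reported in KiB.
  mem_kib=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
  echo "CPUs: $cpus, memory: $((mem_kib / 1024 / 1024)) GiB"
  [ "$cpus" -ge "$min_cpus" ] && [ "$mem_kib" -ge $((min_mem_gib * 1024 * 1024)) ]
}

# Example with placeholder thresholds:
#   has_resources 4 12 || echo "this VM is below the chosen thresholds"
```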