[Kubeflow] Installing Kubeflow
Honestree
2022. 10. 20. 17:27
- NFS setup
- PV, PVC, and StorageClass configuration
- Kubeflow installation
1. NFS setup
- Use the worker node server as the NFS server
1.1 NFS server configuration
- Connect to the worker node and run the following commands
$ sudo -i
$ apt-get update
$ apt install nfs-common nfs-kernel-server portmap
$ mkdir /home/share/nfs -p
$ chmod 777 /home/share/nfs
$ vi /etc/exports
# Add the following line
/home/share/nfs *(rw,no_root_squash,sync,insecure,no_subtree_check)
#
$ service nfs-server restart
$ systemctl status nfs-server.service
$ showmount -e 127.0.0.1
$ mount -t nfs 192.168.72.102:/home/share/nfs /mnt
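The manual edit of /etc/exports above can also be scripted. The following is a minimal sketch, assuming the same export directory and options as in this post; add_export is a hypothetical helper name, not part of the NFS tooling. It appends the entry only when the directory is not already exported, so re-running the setup does not duplicate the line.

```shell
# Append an export entry to an exports file only if the directory is not
# already listed. add_export is a hypothetical helper for this sketch.
add_export() {
  exports_file=$1   # e.g. /etc/exports
  export_dir=$2     # e.g. /home/share/nfs
  export_opts=$3    # e.g. '*(rw,no_root_squash,sync,insecure,no_subtree_check)'

  # An existing entry has the exported directory as its first field.
  if awk -v d="$export_dir" '$1 == d { found = 1 } END { exit !found }' \
       "$exports_file" 2>/dev/null; then
    echo "already exported: $export_dir"
    return 0
  fi

  printf '%s %s\n' "$export_dir" "$export_opts" >> "$exports_file"
  echo "added: $export_dir"
}
```

Run as root, it could be used like this, followed by exportfs -ra to reload the export table:
$ add_export /etc/exports /home/share/nfs '*(rw,no_root_squash,sync,insecure,no_subtree_check)'
$ exportfs -ra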
- Connect to the master node and run the following commands
$ sudo -i
$ apt-get update
$ apt install nfs-common nfs-kernel-server portmap
2. PV, PVC, and StorageClass configuration
- Connect to the master node and create the YAML file
$ vim test.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: my-storageclass
provisioner: kubernetes.io/no-provisioner
parameters:
  server: 192.168.72.102
  path: /home/share/nfs
  readOnly: "false"
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nfs-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pc
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 192.168.72.102
    path: /home/share/nfs
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dpl-nginx
spec:
  selector:
    matchLabels:
      app: dpl-nginx
  replicas: 2
  template:
    metadata:
      labels:
        app: dpl-nginx
    spec:
      containers:
        - name: master
          image: nginx
          ports:
            - containerPort: 80
          volumeMounts:
            - mountPath: /mnt
              name: pvc-volume
      volumes:
        - name: pvc-volume
          persistentVolumeClaim:
            claimName: nfs-pvc
$ kubectl apply -f test.yaml
$ kubectl patch storageclass my-storageclass -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
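After applying the manifest, it can help to confirm that the claim actually bound to the NFS volume before moving on. The following is a rough sketch, assuming the PVC name nfs-pvc from test.yaml in the default namespace; wait_for_bound and the KUBECTL override variable are hypothetical names introduced here, and KUBECTL falls back to the real kubectl binary.

```shell
# Poll a PersistentVolumeClaim until its phase is "Bound".
# wait_for_bound and KUBECTL are hypothetical names for this sketch.
wait_for_bound() {
  pvc_name=$1
  max_tries=${2:-30}   # number of polls before giving up
  delay=${3:-2}        # seconds between polls
  try=1
  while [ "$try" -le "$max_tries" ]; do
    # jsonpath prints only the phase, e.g. Pending or Bound.
    phase=$(${KUBECTL:-kubectl} get pvc "$pvc_name" \
              -o jsonpath='{.status.phase}' 2>/dev/null)
    if [ "$phase" = "Bound" ]; then
      echo "$pvc_name is Bound"
      return 0
    fi
    try=$((try + 1))
    sleep "$delay"
  done
  echo "$pvc_name is not Bound (last phase: ${phase:-unknown})" >&2
  return 1
}
```

For example:
$ wait_for_bound nfs-pvc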
$ sudo vim /etc/kubernetes/manifests/kube-apiserver.yaml
# Add the following flags
- --enable-admission-plugins=NodeRestriction,PodNodeSelector,DefaultStorageClass
- --service-account-signing-key-file=/etc/kubernetes/pki/sa.key
- --service-account-issuer=kubernetes.default.svc
#
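Because the kube-apiserver manifest is edited by hand, a quick check that all three flags made it into the file may save a debugging round. This is only a sketch: check_apiserver_flags is a hypothetical helper name, and the flag values are copied from the lines above.

```shell
# Report which of the expected kube-apiserver flags are missing from a
# static Pod manifest. check_apiserver_flags is a hypothetical helper.
check_apiserver_flags() {
  manifest=$1
  missing=0
  for flag in \
    '--enable-admission-plugins=NodeRestriction,PodNodeSelector,DefaultStorageClass' \
    '--service-account-signing-key-file=/etc/kubernetes/pki/sa.key' \
    '--service-account-issuer=kubernetes.default.svc'
  do
    # -F matches a fixed string; -- stops grep from parsing the flag as an option.
    if grep -Fq -- "$flag" "$manifest"; then
      echo "ok      $flag"
    else
      echo "missing $flag"
      missing=1
    fi
  done
  # Exit status 0 only when every flag was found.
  return "$missing"
}
```

For example:
$ sudo sh -c '. ./check.sh; check_apiserver_flags /etc/kubernetes/manifests/kube-apiserver.yaml'
Here check.sh is an assumed file name holding the function above.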
$ kubectl taint nodes --all node-role.kubernetes.io/master-
3. Kubeflow installation
- Connect to the master node and run the following commands
$ git clone https://github.com/kubeflow/manifests.git
$ cd manifests
$ git checkout tags/v1.3.1
$ wget https://github.com/kubernetes-sigs/kustomize/releases/download/v3.2.0/kustomize_3.2.0_linux_amd64 -O kustomize
$ sudo mv ./kustomize /usr/local/bin/kustomize
$ sudo chmod 755 /usr/local/bin/kustomize
$ while ! kustomize build example | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 10; done
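The loop above retries forever, which can hide a permanent failure. A bounded variant might look like the following sketch; retry and apply_kubeflow are hypothetical helper names, and the attempt count and delay are arbitrary choices rather than values recommended by Kubeflow.

```shell
# Run a command until it succeeds, giving up after a fixed number of attempts.
# Usage: retry MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# retry is a hypothetical helper for this sketch.
retry() {
  max_tries=$1
  delay=$2
  shift 2
  try=1
  until "$@"; do
    if [ "$try" -ge "$max_tries" ]; then
      echo "Giving up after $try attempts" >&2
      return 1
    fi
    echo "Retrying to apply resources ($try/$max_tries)"
    try=$((try + 1))
    sleep "$delay"
  done
}

# Wrap the pipeline in a function so that retry can call it as one command.
# As in the original loop, success is decided by the exit status of kubectl apply.
apply_kubeflow() {
  kustomize build example | kubectl apply -f -
}
```

For example, to try up to 30 times with a 10 second pause:
$ retry 30 10 apply_kubeflow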
- After the installation completes, check that the pods are running
$ kubectl get pod -A | egrep 'NAME|^auth|^cert-manager|^istio-system|^knative-|^kubeflow'
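With this many pods, scanning the list by eye is error-prone. As a small sketch, the output can be filtered down to the pods that are not yet healthy; not_ready_pods is a hypothetical helper name, and it assumes the default column layout of kubectl get pod -A, where the fourth column is STATUS.

```shell
# Read `kubectl get pod -A` output on stdin and print every pod whose STATUS
# is neither Running nor Completed, as "namespace/name status".
# not_ready_pods is a hypothetical helper for this sketch.
not_ready_pods() {
  # NR > 1 skips the header row; $4 is STATUS when -A adds the NAMESPACE column.
  awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { print $1 "/" $2 " " $4 }'
}
```

For example:
$ kubectl get pod -A | not_ready_pods
An empty result means every listed pod is Running or Completed.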
Trouble
- The VM goes down while the Kubeflow-related containers are being created, possibly because the specs are insufficient
- Current specs:
- 4 CPU, 12 GB memory, 100 GB VDI