How Resource Limits Affect Pod Scheduling
Container resource limits (the maximum a container may use):
- resources.limits.cpu
- resources.limits.memory
Container resource requests (the minimum a container needs), which the scheduler uses as the basis for placing the Pod:
- resources.requests.cpu
- resources.requests.memory
The CPU unit is m (milli-cores); you can write either 100m or the equivalent 0.1, but percentages are not allowed. A sketch of a Pod spec using both fields follows below.
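A minimal sketch of a Pod spec with both fields (the resource-demo name and the memory values are illustrative, not from the text above; memory is typically given in Mi or Gi):
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo        # hypothetical name, for illustration only
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:              # scheduler places the Pod only on a node with this much free capacity
        cpu: 100m            # 100m == 0.1 CPU core
        memory: 128Mi
      limits:                # hard ceiling enforced at runtime
        cpu: 500m
        memory: 256Mi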
NodeSelector & NodeAffinity
nodeSelector: schedules a Pod onto Nodes whose labels match; if no Node carries a matching label, scheduling fails and the Pod stays Pending.
Purpose:
- Constrain a Pod to run on particular nodes, by exact matching of node labels
Use cases:
- Dedicated nodes: group Nodes by business line
- Special hardware: some Nodes have SSDs or GPUs
Example: ensure a Pod is scheduled onto a node with an SSD
// Step 1: add a label to the node
Syntax: kubectl label nodes <node-name> <label-key>=<label-value>
Example: kubectl label nodes node1.example.com disktype=ssd
Verify: kubectl get nodes --show-labels
// Step 2: add a nodeSelector field to the Pod spec, as sketched below
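A minimal sketch of Step 2, reusing the disktype=ssd label from Step 1 (the ssd-pod name and the nginx image are illustrative):
apiVersion: v1
kind: Pod
metadata:
  name: ssd-pod              # hypothetical name, for illustration only
spec:
  containers:
  - name: nginx
    image: nginx
  nodeSelector:
    disktype: ssd            # must exactly match the label added in Step 1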
// Finally, verify:
kubectl get pods -o wide
A successful scheduling example
[root@master ~]# kubectl get nodes node1 --show-labels
NAME STATUS ROLES AGE VERSION LABELS
node1 Ready <none> 5d23h v1.20.0 app=nginx
[root@master ~]# vim test.yml
[root@master ~]# cat test.yml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: default
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  nodeSelector:
    app: nginx
[root@master ~]# kubectl apply -f test.yml
pod/nginx created
[root@master ~]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx 1/1 Running 0 4s 10.244.1.48 node1 <none> <none>
Two more examples, using nodeAffinity
Example 1
node1 gets two labels (app=nginx, gpu=nvdia)
node2 gets one label (app=nginx)
- required: the condition must be satisfied
- preferred: best effort; satisfied if possible, but not guaranteed
// give node1 two labels (app=nginx gpu=nvdia)
[root@master ~]# kubectl label nodes node1 app=nginx gpu=nvdia
node/node1 labeled
// give node2 one label (app=nginx)
[root@master ~]# kubectl label nodes node2 app=nginx
node/node2 labeled
// view the labels
[root@master ~]# kubectl get nodes node1 node2 --show-labels
NAME STATUS ROLES AGE VERSION LABELS
node1 Ready <none> 5d23h v1.20.0 app=nginx,gpu=nvdia
node2 Ready <none> 5d23h v1.20.0 app=nginx
[root@master ~]# cat test.yml
apiVersion: v1
kind: Pod
metadata:
  name: test
  namespace: default
spec:
  containers:
  - name: test1
    image: busybox
    imagePullPolicy: IfNotPresent
    command: ["/bin/sh","-c","sleep 45"]
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: app
            operator: In
            values:
            - nginx
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 3
        preference:
          matchExpressions:
          - key: gpu
            operator: In
            values:
            - nvdia
[root@master ~]# kubectl apply -f test.yml
pod/test created
[root@master ~]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
test 1/1 Running 0 18s 10.244.1.82 node1 <none> <none>
Both nodes satisfy the required term (app=nginx), but only node1 also matches the preferred term (gpu=nvdia), so the scheduler places the Pod on node1.
Example 2:
node1 gets one label (app=nginx)
node2 gets one label (app=nginx)
// label node1 (app=nginx)
[root@master ~]# kubectl label nodes node1 app=nginx
node/node1 labeled
// label node2 (app=nginx)
[root@master ~]# kubectl label nodes node2 app=nginx
node/node2 labeled
[root@master ~]# kubectl get nodes node1 node2 --show-labels
NAME STATUS ROLES AGE VERSION LABELS
node1 Ready <none> 5d5h v1.20.0 app=nginx,beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node1,kubernetes.io/os=linux
node2 Ready <none> 5d5h v1.20.0 app=nginx,beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node2,kubernetes.io/os=linux
[root@master ~]# cat test.yml
apiVersion: v1
kind: Pod
metadata:
  name: test
  namespace: default
spec:
  containers:
  - name: test
    image: busybox
    imagePullPolicy: IfNotPresent
    command: ["/bin/sh","-c","sleep 45"]
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: app
            operator: In
            values:
            - nginx
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 3
        preference:
          matchExpressions:
          - key: gpu
            operator: In
            values:
            - nvdia
[root@master ~]# kubectl delete -f test.yml
pod "test" deleted
[root@master ~]# kubectl apply -f test.yml
pod/test created
[root@master ~]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
test 1/1 Running 0 5s 10.244.1.82 node1 <none> <none>
Here neither node matches the preferred term (no node carries gpu=nvdia), so the preference is simply ignored; the scheduler chooses freely among the nodes that satisfy the required term, and it happened to pick node1.
Taints & Tolerations
Taints: prevent Pods from being scheduled onto particular Nodes
Tolerations: allow Pods to be scheduled onto Nodes that carry matching Taints
Use cases:
- Dedicated nodes: group Nodes by business line; by default nothing is scheduled onto them, and only Pods with a matching toleration may be placed there
- Special hardware: some Nodes have SSDs or GPUs; by default nothing is scheduled onto them, and only Pods with a matching toleration may be placed there
- Taint-based eviction
Adding a taint to a node
Syntax: kubectl taint node [node] key=value:[effect]
Example: kubectl taint node node1 gpu=yes:NoSchedule
Verify: kubectl describe node node1 | grep Taint
Remove a taint: kubectl taint node [node] key:[effect]-
// view the taints
[root@master ~]# kubectl describe node node1 node2 master | grep -i taint
Taints: <none>
Taints: <none>
Taints: node-role.kubernetes.io/master:NoSchedule
The [effect] field can take the following values:
NoSchedule: Pods will never be scheduled onto the Node
PreferNoSchedule: the scheduler tries not to place Pods here; a toleration is not strictly required
NoExecute: new Pods are not scheduled here, and Pods already running on the Node are evicted
Add a tolerations field to the Pod spec:
apiVersion: v1
kind: Pod
metadata:
  name: pod-taints
spec:
  containers:
  - name: pod-taints
    image: busybox:latest
  tolerations:
  - key: "gpu"
    operator: "Equal"
    value: "yes"
    effect: "NoSchedule"
Three examples
Example 1 (NoSchedule): the Pod will not be scheduled onto the tainted node
// taint node1
[root@master ~]# kubectl taint node node1 node1:NoSchedule
node/node1 tainted
[root@master ~]# kubectl describe node node1 | grep -i taint
Taints: node1:NoSchedule
[root@master ~]# cat test.yml
apiVersion: v1
kind: Pod
metadata:
  name: test
  namespace: default
spec:
  containers:
  - name: b1
    image: busybox
    imagePullPolicy: IfNotPresent
    command: ["/bin/sh","-c","sleep 45"]
[root@master ~]# kubectl apply -f test.yml
pod/test created
[root@master ~]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
test 1/1 Running 0 11s 10.244.2.52 node2 <none> <none>
// remove the taint
[root@master ~]# kubectl taint node node1 node1:NoSchedule-
node/node1 untainted
[root@master ~]# kubectl describe node node1 | grep -i taint
Taints: <none>
master carries a taint by default, which is why the Pods we create keep landing on node1 or node2 and never run on master. A toleration like the sketch below would lift that restriction.
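A minimal sketch of such a toleration, matching the node-role.kubernetes.io/master:NoSchedule taint shown in the earlier kubectl describe output (the on-master Pod name is illustrative):
apiVersion: v1
kind: Pod
metadata:
  name: on-master            # hypothetical name, for illustration only
spec:
  containers:
  - name: nginx
    image: nginx
  tolerations:
  - key: "node-role.kubernetes.io/master"   # matches the taint listed on master above
    operator: "Exists"                      # tolerate the taint regardless of its value
    effect: "NoSchedule"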
Example 2 (PreferNoSchedule): the scheduler tries to avoid the node, but may still place Pods there
[root@master ~]# kubectl taint node node1 node1:PreferNoSchedule
node/node1 tainted
[root@master ~]# kubectl describe node node1 | grep -i taint
Taints: node1:PreferNoSchedule
[root@master ~]# vi test.yml
[root@master ~]# kubectl delete -f test.yml
pod "test" deleted
[root@master ~]# cat test.yml
apiVersion: v1
kind: Pod
metadata:
  name: test
  namespace: default
spec:
  containers:
  - name: b1
    image: busybox
    imagePullPolicy: IfNotPresent
    command: ["/bin/sh","-c","sleep 45"]
[root@master ~]# kubectl apply -f test.yml
pod/test created
[root@master ~]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx 1/1 Running 0 11m 10.244.2.51 node2 <none> <none>
test 1/1 Running 0 14s 10.244.2.53 node2 <none> <none>
[root@master ~]# kubectl taint node node1 node1:PreferNoSchedule-
node/node1 untainted
[root@master ~]# kubectl describe node node1 | grep -i taint
Taints: <none>
Example 3 (NoExecute)
- Eviction
- Not only are new Pods not scheduled here; Pods already on the Node are evicted
[root@master ~]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
test-afsad2kv6-coiwe 1/1 Running 1 80s 10.244.1.83 node2 <none> <none>
web1-df82nmoaf-dfdgc 1/1 Running 2 45h 10.244.1.84 node2 <none> <none>
web2-dasd23ifs-mxua2 1/1 Running 2 45h 10.244.1.85 node2 <none> <none>
// after adding the taint to node2
[root@master ~]# kubectl taint node node2 node2:NoExecute
node/node2 tainted
[root@master ~]# kubectl describe node node2 | grep -i taint
Taints: node2:NoExecute
// the Pods that were on node2 have been evicted and recreated on node1
[root@master ~]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
test-afsad2kv6-coiwe 1/1 Running 1 53s 10.244.1.82 node1 <none> <none>
web1-df82nmoaf-dfdgc 1/1 Running 0 53s 10.244.1.84 node1 <none> <none>
web2-dasd23ifs-mxua2 1/1 Running 0 53s 10.244.1.85 node1 <none> <none>
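Related to the taint-based eviction use case above: a NoExecute toleration may carry tolerationSeconds, which delays eviction instead of preventing it. A minimal sketch, assuming the node2:NoExecute taint from this example (the slow-evict name and the 60-second window are illustrative):
apiVersion: v1
kind: Pod
metadata:
  name: slow-evict           # hypothetical name, for illustration only
spec:
  containers:
  - name: busybox
    image: busybox
    command: ["/bin/sh","-c","sleep 3600"]
  tolerations:
  - key: "node2"             # matches the key of the node2:NoExecute taint above
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 60    # evicted 60 seconds after the taint appears, instead of immediately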