References:
K8S - Assign Pods to Nodes using Node Affinity
K8S - Assigning Pods to Nodes
1. Node affinity
A node affinity policy means that a pod is scheduled onto nodes that meet certain conditions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: name
                operator: In
                values:
                - nginx1
                - nginx2
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            preference:
              matchExpressions:
              - key: app
                operator: In
                values:
                - nginx
      containers:
      - name: nginx-server
        image: nginx:latest
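In the example above, the pod must be scheduled onto a node that satisfies the conditions in nodeSelectorTerms, and the scheduler prefers nodes that also satisfy the conditions in preference.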
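- nodeAffinity: the affinity policy between the pod and nodes.
- nodeSelectorTerms: the pod can only be scheduled onto nodes that carry these labels. Several keys and values can be given: the values of one expression are ORed, separate nodeSelectorTerms entries are ORed, and the matchExpressions inside a single term are ANDed. This gives an OR behaviour that a plain nodeSelector cannot express.
- preference: the pod is preferably scheduled onto nodes that satisfy these conditions.
- Note that the fields come in two fixed pairs:
  requiredDuringSchedulingIgnoredDuringExecution -> nodeSelectorTerms
  preferredDuringSchedulingIgnoredDuringExecution -> preference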
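The OR/AND behaviour of nodeSelectorTerms can be sketched with a small fragment of a pod spec. The label keys and values below (disktype, env, name) are hypothetical and only serve as an illustration:

```yaml
# Fragment of a pod spec; all label keys and values are hypothetical.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      # Term 1: disktype is ssd AND env is prod or staging
      - matchExpressions:
        - key: disktype
          operator: In
          values:
          - ssd
        - key: env
          operator: In
          values:
          - prod
          - staging
      # Term 2, ORed with term 1: name is nginx1
      - matchExpressions:
        - key: name
          operator: In
          values:
          - nginx1
```

A node qualifies if it matches either term; within term 1 both expressions have to match.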
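Policies
- requiredDuringSchedulingIgnoredDuringExecution: hard policy. The condition must be met at scheduling time and is ignored once the pod is running, so a later label change on the node does not evict the pod.
- preferredDuringSchedulingIgnoredDuringExecution: soft policy. The scheduler tries to meet the condition at scheduling time but still places the pod if no node matches; it is likewise ignored during execution.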
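A soft policy may list several preferences, each with a weight between 1 and 100. For every node that passes the hard requirements, the scheduler adds up the weights of the preferences that node satisfies and favours nodes with a higher total. A minimal sketch, assuming hypothetical disktype and app node labels:

```yaml
# Fragment of a pod spec; label keys and values are hypothetical.
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    # Strong preference for nodes labelled disktype=ssd
    - weight: 80
      preference:
        matchExpressions:
        - key: disktype
          operator: In
          values:
          - ssd
    # Weaker preference for nodes labelled app=nginx
    - weight: 20
      preference:
        matchExpressions:
        - key: app
          operator: In
          values:
          - nginx
```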
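Operators
The match logic used in the example is "the label value is in a list" (In). The available operators are:
- In: the label value is in the given list
- NotIn: the label value is not in the given list
- Exists: the label exists
- DoesNotExist: the label does not exist
- Gt: the label value is greater than the given value (both are parsed as integers, not compared as strings)
- Lt: the label value is less than the given value (both are parsed as integers, not compared as strings)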
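As an illustration of the operators other than In, here is a sketch of a single term that combines Exists, NotIn and Gt. The label keys gpu, env and cpu-cores are hypothetical:

```yaml
# Fragment of a pod spec; label keys and values are hypothetical.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        # Exists and DoesNotExist take no values
        - key: gpu
          operator: Exists
        - key: env
          operator: NotIn
          values:
          - test
        # Gt and Lt take exactly one value: a string that holds an integer
        - key: cpu-cores
          operator: Gt
          values:
          - "8"
```

Because all three expressions sit in the same term, a node has to satisfy all of them.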
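2. Pod affinity and anti-affinity
Pod affinity and anti-affinity mean that a pod is scheduled onto, or kept away from, nodes that already run pods carrying certain labels.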
apiVersion: v1
kind: Pod
metadata:
  name: with-pod-affinity
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: security
            operator: In
            values:
            - S1
        topologyKey: topology.kubernetes.io/zone
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: security
              operator: In
              values:
              - S2
          topologyKey: topology.kubernetes.io/zone
  containers:
  - name: with-pod-affinity
    image: k8s.gcr.io/pause:2.0
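- podAffinity: pod affinity. The current pod is placed on a node where pods matching labelSelector already run (more precisely, anywhere in the same topology domain as those pods).
- podAntiAffinity: pod anti-affinity. The current pod is placed on a node where no pods matching labelSelector run (again, judged per topology domain).
- topologyKey: the key of a node label. Nodes that share the same value for this label form one topology domain (for example a zone), and the rule is evaluated per domain. A common choice is kubernetes.io/hostname, which makes every node its own domain.
- The labelSelector in pod affinity supports only: In, NotIn, Exists, DoesNotExist
- The policies are the same as for node affinity: requiredDuringSchedulingIgnoredDuringExecution, preferredDuringSchedulingIgnoredDuringExecution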
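With a hard anti-affinity rule on kubernetes.io/hostname, replicas beyond the number of eligible nodes cannot be scheduled and stay Pending. Where spreading is desirable but not mandatory, the soft variant can be used instead. A minimal sketch, assuming the pods carry a hypothetical label app: nginx:

```yaml
# Fragment of a pod template spec; the app: nginx label is an assumption.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - nginx
        # Each node is its own topology domain
        topologyKey: kubernetes.io/hostname
```

The scheduler then avoids co-locating replicas where it can, but still places extra replicas on an already occupied node if nothing else is available.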
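In practice, this also makes it possible to:
- Place the pods of one deployment on different nodes to make the service highly available (podAntiAffinity)
- Place related services on the same node, such as a web server and its storage (podAffinity)
- Combine the two: use podAntiAffinity to spread the pod replicas of one deployment across different nodes for high availability, and use podAffinity to co-locate related services on the same node to reduce network traffic.
For a concrete example, see: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#more-practical-use-cases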
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-
spec:
  selector:
    matchLabels:
      app: store
  replicas: 3
  template:
    metadata:
      labels:
        app: store
    spec:
      affinity:
        # Use pod anti-affinity (podAntiAffinity) so that
        # the replicas of this deployment are spread across different nodes
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              # Pods with the same label app: store repel each other
              matchExpressions:
              - key: app
                operator: In
                values:
                - store
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: redis-server
        image: redis:3.2-alpine
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server
spec:
  selector:
    matchLabels:
      app: web-store
  replicas: 3
  template:
    metadata:
      labels:
        app: web-store
    spec:
      affinity:
        # Use pod anti-affinity (podAntiAffinity) so that
        # the replicas of this deployment are spread across different nodes
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              # Pods with the same label app: web-store repel each other
              matchExpressions:
              - key: app
                operator: In
                values:
                - web-store
            topologyKey: "kubernetes.io/hostname"
        # Use pod affinity (podAffinity): each replica of this deployment
        # must land on a node that already runs a pod labelled app: store.
        # In effect, web-store spreads itself across nodes (podAntiAffinity),
        # and every node it lands on must also run redis-cache (podAffinity)
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - store
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: web-app
        image: nginx:1.16-alpine