Running in Production
Number of Replicas
Always run at least two replicas (three or more are recommended) of your application to survive cluster updates and autoscaling without downtime.
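As a minimal sketch, the replica count is set in the Deployment spec. The name myapp and the application label are placeholders chosen for illustration, not taken from this guide:

```yaml
# Hypothetical Deployment; "myapp" and the label values are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3  # at least two; three or more are recommended
  selector:
    matchLabels:
      application: myapp
  template:
    metadata:
      labels:
        application: myapp
    spec:
      containers:
        - name: mycontainer
          image: myimage
```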
Readiness Probes
Web applications should always configure a readinessProbe to make sure that the container only gets traffic after a successful startup:
containers:
  - name: mycontainer
    image: myimage
    readinessProbe:
      httpGet:
        # Path to probe; should be cheap, but representative of typical behavior
        path: /.well-known/health
        port: 8080
      timeoutSeconds: 1
See Configuring Liveness and Readiness Probes for details.
Resource Requests
Always configure resource requests for both CPU and memory. The Kubernetes scheduler and cluster autoscaler need this information in order to make the right decisions. Example:
containers:
  - name: mycontainer
    image: myimage
    resources:
      requests:
        cpu: 100m      # 100 millicores
        memory: 200Mi  # 200 MiB
Resource Limits
Configure a resource limit for memory if possible. When the container's memory usage reaches the limit, the container is terminated (OOMKilled).
Set the JVM heap memory dynamically by using the java-dynamic-memory-opts script from Zalando's OpenJDK base image and setting MEM_TOTAL_KB to limits.memory:
containers:
  - name: mycontainer
    image: myjvmdockerimage
    env:
      # set the maximum available memory as JVM would assume host/node capacity otherwise
      # this is evaluated by java-dynamic-memory-opts in the Zalando OpenJDK base image
      # see https://github.com/zalando/docker-openjdk
      - name: MEM_TOTAL_KB
        valueFrom:
          resourceFieldRef:
            resource: limits.memory
            divisor: 1Ki
    resources:
      requests:
        cpu: 100m
        memory: 2Gi
      limits:
        memory: 2Gi
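
To make the divisor concrete: with limits.memory set to 2Gi and a divisor of 1Ki, the value exposed to the container is 2 × 1024 × 1024 = 2097152. The fragment below is only an illustration of the resulting environment variable as the container would see it. It is not something you write in the manifest:

```yaml
# Illustrative only: the effective value computed from the manifest above.
# limits.memory = 2Gi = 2 * 1024 * 1024 KiB; divided by 1Ki gives 2097152.
env:
  - name: MEM_TOTAL_KB
    value: "2097152"
```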