[Django]-Kubernetes liveness probe fails when the logs show a 200 response


Python containers usually use more memory than most others. When you run Django behind gunicorn, the container uses around 150M of memory by default (for me); this depends on the amount of source code (for us the source code was around 50M) and on the pip packages installed in the container for your app. It's good practice to give Django with gunicorn about 20% more memory than you expect it to need. You should also increase timeoutSeconds to 20 or 30, depending on the amount of traffic you are handling in one container. Also, have either a readinessProbe or a livenessProbe on the container, not both; running both probes creates too much noisy traffic to the container. Finally, use an HPA to scale your app: scale at 60% CPU and 60% memory to absorb bursts of traffic.
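To make the numbers above concrete, here is a rough sketch in Python that computes a memory figure with 20% headroom and builds the probe and HPA settings as plain dicts (the same shape you would write in YAML). The function names, the `/healthz` path, port 8000, and the `periodSeconds` / `failureThreshold` values are assumptions for illustration, not values from the question.

```python
import json


def memory_limit_mib(observed_mib: int, headroom_pct: int = 20) -> int:
    """Observed usage plus a percentage of headroom, rounded up to whole MiB."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-observed_mib * (100 + headroom_pct) // 100)


def liveness_probe(path: str = "/healthz", port: int = 8000,
                   timeout_seconds: int = 20) -> dict:
    """Probe spec with a relaxed timeout. Path and port are hypothetical."""
    return {
        "httpGet": {"path": path, "port": port},
        "timeoutSeconds": timeout_seconds,  # 20 or 30, depending on traffic
        "periodSeconds": 30,                # assumed value
        "failureThreshold": 3,              # assumed value
    }


def hpa_metrics(cpu_pct: int = 60, memory_pct: int = 60) -> list:
    """autoscaling/v2 style resource metrics that scale at the given utilization."""
    def resource(name: str, pct: int) -> dict:
        return {
            "type": "Resource",
            "resource": {
                "name": name,
                "target": {"type": "Utilization", "averageUtilization": pct},
            },
        }
    return [resource("cpu", cpu_pct), resource("memory", memory_pct)]


def summary(observed_mib: int = 150) -> str:
    """Render all three pieces as JSON for easy copying into a manifest."""
    return json.dumps(
        {
            "memoryLimit": f"{memory_limit_mib(observed_mib)}Mi",
            "livenessProbe": liveness_probe(),
            "hpaMetrics": hpa_metrics(),
        },
        indent=2,
    )
```

Calling `print(summary())` dumps the three fragments as JSON. Treat the output as a starting point and adjust it against what your container actually uses under load.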

As you are using threads, be careful about the number of active connections to the DB. You are also exporting Django health metrics to Prometheus, which is in addition to the app's own functionality, so remember to add extra resources for that too. Prometheus scraping can also create a lot of overhead on the container, so choose the number of Prometheus instances scraping the same container and the scrape_interval wisely. You don't want your container to end up serving only internal traffic.
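A quick back-of-the-envelope check for both concerns, as a minimal sketch. It assumes a single database alias, no connection pooler in front of the DB, and that every gunicorn thread ends up holding its own connection (Django connections are thread-local), so the first number is an upper bound. The function names and the example figures are made up for illustration.

```python
def peak_db_connections(replicas: int, workers: int, threads: int) -> int:
    """Upper bound on open DB connections: one per thread, per worker, per pod."""
    return replicas * workers * threads


def internal_requests_per_minute(scrapers: int, scrape_interval_s: int,
                                 probes: int, probe_period_s: int) -> float:
    """Requests per minute hitting one container from scrapes and probes alone."""
    scrape_rate = scrapers * 60 / scrape_interval_s
    probe_rate = probes * 60 / probe_period_s
    return scrape_rate + probe_rate


# Example: 4 pods, each running 3 gunicorn workers with 4 threads.
connections = peak_db_connections(replicas=4, workers=3, threads=4)

# Example: 2 Prometheus instances scraping every 15s, 1 probe every 10s.
overhead = internal_requests_per_minute(
    scrapers=2, scrape_interval_s=15, probes=1, probe_period_s=10
)
```

Compare `connections` against your database's connection limit, and `overhead` against the real traffic a single container serves. If the HPA scales the replicas up, the connection bound grows with it.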

For a more relevant reference on this problem, see: https://github.com/kubernetes/kubernetes/issues/89898
