Reduce no. of DB connection with Django
Behind the Story
Recently in my office we optimised our resources, our infra team removed unnecessary services, databases etc based on resource consumption.
Let’s say we had 10 services using PostgreSQL, normally for that we had 10 separate PostgreSQL database instances(each instance can have max 500 concurrent db connection). Our infra team reduced from 10 database to 1 database(can have max 300 concurrent db connection).
Now all of the 10 services are using 1 db instance. Performance wise they were working fine, and initially we saw around 220 connection are being used up. We did not bother about it. :(
Then suddenly we were getting FATAL too many connections error from multiple services at a same time, which we have never encountered before. On that day 4–5 services was needed more resources, they took more than 80 extra connection since we had auto scaling enabled.
For that all the other services stuck at the same time, because total db connection exceeded for all other the services.
Actual Problem and Solution:
When we were debugging the system we saw one Django service (
DS)
was holding more than 150 connection at the same time and not releasing those connections at all! That why rest of the services can not consume for connection even if they needed more!
Let’s assume we had 10 replica for DS, each instance running on gunicorn
server with the following settingsgunicorn --workers=4 --threads=4 main:app
For this setting each thread in each worker is opening 1 db connection. Means 16 connection per instance max!
But it according to normal django
setting, it should not hold those 16 connections all the time.
Then we found the following configuration in the django
setting in our DS codebase,
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.postgresql',
'CONN_MAX_AGE': 60,
'NAME': 'postgres',
'USER': 'postgres',
'PASSWORD': 'postgres',
'HOST': '127.0.0.1',
'PORT': 5432
}
}
According to the doc CONN_MAX_AGE
was holding db connections for 60 sec in each thread, even if those are not being used at that time. After 60 sec they renewing the connection but not releasing ever.
Solution:
So we changed the settings like the following and our concurrent connection number changes drastically.'CONN_MAX_AGE': 0
Let’s say overall consumption reduced from 300 to 100 .
But this solution has a side effect. It increases the response time by 10–40ms(avg.), since db connection are being opened whenever they needed and being closed after the request is done.
db
Discussion worth looking into:
a comment in
gunicorn
github issue .server-fault discussion