#StackBounty: #django #django-models Django and a Database with write-Instance + Multiple Read Replicas — running Celery jobs

Bounty: 500

I have a Django app running in production. Its database has a main write instance and a few read replicas. I use `DATABASE_ROUTERS` to route between the write instance and the read replicas depending on whether I need to read or write.
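The question doesn't show the router itself, but for context, a typical primary/replica router is a plain class whose `db_for_read`/`db_for_write` methods return database aliases. A minimal sketch (the class name and the `replica1`/`replica2` aliases are illustrative, not from the question):

```python
import random


class PrimaryReplicaRouter:
    """Illustrative router: writes go to the primary ('default'),
    reads go to a randomly chosen replica."""

    read_replicas = ["replica1", "replica2"]

    def db_for_read(self, model, **hints):
        return random.choice(self.read_replicas)

    def db_for_write(self, model, **hints):
        return "default"

    def allow_relation(self, obj1, obj2, **hints):
        # The primary and its replicas hold the same data set.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Run migrations only against the primary.
        return db == "default"
```

With this in place, `DATABASE_ROUTERS = ["path.to.PrimaryReplicaRouter"]` in settings makes every ORM read hit a replica unless a query explicitly overrides the database.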

I encountered a situation where I have to do some async processing on an object in response to a user request. The order of actions is:

  1. The user submits a request via HTTPS/REST.
  2. The view creates an object and saves it to the DB.
  3. The view triggers a Celery job, passing it the object's ID, to process the object outside the request-response cycle.
  4. The view sends an OK response to the request.

Now, the Celery job may kick in after 10 ms or after 10 minutes, depending on the queue. When it finally runs, the job first tries to load the object by the given ID. Initially I had issues doing `my_obj = MyModel.objects.get(pk=given_id)`, because the read replica would be used at that point: if the queue is empty and the job runs immediately after being triggered, the object may not have propagated to the read replicas yet.

I resolved that issue by replacing `my_obj = MyModel.objects.get(pk=given_id)` with `my_obj = MyModel.objects.using('default').get(pk=given_id)` — this ensures the object is read from the write instance, where it is always available.

However, now I have another issue I did not anticipate.

Calling `my_obj.certain_many_to_many_objects.all()` triggers another database query, because the ORM is lazy — and that query IS routed to a read replica. I was hoping it would stick to the database I selected with `using()`, but that's not the case. Is there a way to force all related-object queries to use the same write instance?
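One hedged approach (not from the question — a sketch of a common pattern) is to let the router honor a thread-local "pin to primary" flag, exposed as a context manager. Every query issued inside the block, including lazy related-manager queries, then resolves to the write instance. The router method names follow Django's router protocol; everything else here is illustrative:

```python
import threading
from contextlib import contextmanager

_local = threading.local()


@contextmanager
def use_primary():
    """Force db_for_read() to return the primary within this block."""
    _local.force_primary = True
    try:
        yield
    finally:
        _local.force_primary = False


class PinnablePrimaryReplicaRouter:
    """Like a normal primary/replica router, but reads can be pinned
    to the primary via the use_primary() context manager."""

    def db_for_read(self, model, **hints):
        if getattr(_local, "force_primary", False):
            return "default"
        return "replica1"  # replica selection elided for brevity

    def db_for_write(self, model, **hints):
        return "default"
```

Inside the Celery task you would then wrap the whole body in `with use_primary():`, so both the initial `get(pk=given_id)` and any later `my_obj.certain_many_to_many_objects.all()` hit the write instance. Alternatively, for a single query, the related manager also accepts `.using('default')` (e.g. `my_obj.certain_many_to_many_objects.using('default')`), though that has to be repeated at every call site.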


