sandeepk

ORM

The Debug Diary – Chapter I

Lately, I was debugging an issue with the importer tasks of our codebase and came across a code block which looks fine but makes an extra database query in the loop. When you have a look at the Django ORM query

jato_vehicles = JatoVehicle.objects.filter(
    year__in=available_years,<more_filters>
).only("manufacturer_code", "uid", "year", "model", "trim")

for entry in jato_vehicles.iterator():
    if entry.manufacturer_code:
        <logic>
    ymt_key = (entry.year, entry.model, entry.trim_processed)
...

you will notice we are using only, which only loads the set of fields mentioned and deferred other fields, but in the loop, we are using the field trim_processed which is a deferred field and will result in an extra database call.

Now, as we have identified the performance issue, the best way to handle the cases like this is to use values or values_list. The use of only should be discouraged in the cases like these.

Update code will look like this

jato_vehicles = JatoVehicle.objects.filter(
    year__in=available_years,<more-filters>).values_list(
    "manufacturer_code",
    "uid",
    "year",
    "model",
    "trim_processed",
    named=True,
)

for entry in jato_vehicles.iterator():
    if entry.manufacturer_code:
        <logic>
    ymt_key = (entry.year, entry.model, entry.trim_processed)
...

By doing this, we are safe from accessing the fields which are not mentioned in the values_list. If anyone tries to do so, an exception will be raised.

** By using named=True we get the result as a named tuple which makes it easy to access the values :)

Cheers!

#Django #ORM #Debug

select_for_update is the answer if you want to acquire a lock on the row. The lock is only released after the transaction is completed. This is similar to the Select for update statement in the SQL query.

>>> Dealership.objects.select_for_update().get(pk='iamid')
>>> # Here lock is only required on Dealership object
>>> Dealership.objects.select_related('oem').select_for_update(of=('self',))

select_for_update have these four arguments with these default value – nowait=False – skiplocked=False – of=() – nokey=False

Let's see what these all arguments mean

nowait

Think of the scenario where the lock is already acquired by another query, in this case, you want your query to wait or raise an error, This behavior can be controlled by nowait, If nowait=True we will raise the DatabaseError otherwise it will wait for the lock to be released.

skip_locked

As somewhat name implies, it helps to decide whether to consider a locked row in the evaluated query. If the skip_locked=true locked rows will not be considered.

nowait and skip_locked are mutually exclusive using both together will raise ValueError

of

In select_for_update when the query is evaluated, the lock is also acquired on the select related rows as in the query. If one doesn't wish the same, they can use of where they can specify fields to acquire a lock on

>>> Dealership.objects.select_related('oem').select_for_update(of=('self',))
# Just be sure we don't have any nullable relation with OEM

no_key

This helps you to create a weak lock. This means the other query can create new rows which refer to the locked rows (any reference relationship).

Few more important points to keep in mind select_for_update doesn't allow nullable relations, you have to explicitly exclude these nullable conditions. In auto-commit mode, select_for_update fails with error TransactionManagementError you have to add code in a transaction explicitly. I have struggled around these points :).

Here is all about select_for_update which you require to know to use in your code and to do changes to your database.

Cheers!

#Python #Django #ORM #Database