[Answered ]-Django filter JSONField when key is a number

1👍

Django automatically converting number-like values to numbers and array indexes:

Let’s compare raw SQLs

print(Foo.objects.filter(node_status__has_key='dog').query)
print(Foo.objects.filter(node_status__has_key='2').query)
SELECT `django4_foo`.`id`, `django4_foo`.`node_status` FROM `django4_foo` WHERE JSON_CONTAINS_PATH(`django4_foo`.`node_status`, 'one', $."dog")
SELECT `django4_foo`.`id`, `django4_foo`.`node_status` FROM `django4_foo` WHERE JSON_CONTAINS_PATH(`django4_foo`.`node_status`, 'one', $[2])

As you can see 2 is converted to $[2] but dog is converted to $."dog"

One possible solution is to get raw SQL, reformat it manually and init new query set using raw SQL:

raw_sql = str(Foo.objects.filter(node_status__has_key='2').query).replace(f'$[2]', f'\'$.2\'')

print(raw_sql)

rqs = Foo.objects.raw(raw_sql)

for o in rqs:
  print(o.node_status)
SELECT `django4_foo`.`id`, `django4_foo`.`node_status` FROM `django4_foo` WHERE JSON_CONTAINS_PATH(`django4_foo`.`node_status`, 'one', '$.2')
{'cat': '1', '2': True, 'dog': True}

However this hack don’t work with node_status__2=True filter.

Let’s check this query for dog key:

print(Foo.objects.filter(node_status__dog=True).query)

Output is:

SELECT `django4_foo`.`id`, `django4_foo`.`node_status` FROM `django4_foo` WHERE JSON_UNQUOTE(JSON_EXTRACT(`django4_foo`.`node_status`, $."dog")) = JSON_EXTRACT(true, '$')

But actually working raw SQL is:

SELECT `django4_foo`.`id`, `django4_foo`.`node_status` FROM `django4_foo` WHERE JSON_UNQUOTE(JSON_EXTRACT(`django4_foo`.`node_status`, "$.dog")) = 'True'

So here is many problems with filterraw conversions. (Django community also don’t recommend to use str(query) for raw SQL query generations)

Django is positioned like universal screwdriver for all types of screws. Internal conflicts and overcomplicated "universal and all in one" logic is a common problem for Django.

Yes it is possible to override query class and change key transformation logic but it is hard to read and understand, pretty complicated and another problems may occur with customized logic (because Django don’t expect that logic was changed).

So easier ways will be:

  • just use Foo.objects.raw(sql_query) where sql_query is predefined manually string
  • use cursor.execute(sql_query, \[params\]) with sql_query predefined same as ①
  • get Foo.objects.all() and use python logic to filter it ([e for e in Foo.objects.all() if '2' in e.node_status])

Other possible ways are:

※ My choice 😉

👤rzlvmp

Leave a comment