[Django]-Django Fulltext search on JSON field

8👍

As @e4c5 reported there’s Cast function since Django 1.10 (the same version you used).

So if you would to search a JSON field as text you have to cast it as text:

from django.contrib.postgres.search import SearchVector
from django.db.models import TextField
from django.db.models.functions import Cast

Product.objects.annotate(
    search=SearchVector(Cast('attributes', TextField())),
).filter(search=keyword)

You can also use only specific part of your JSON field in your SearchVector:

from django.contrib.postgres.search import SearchVector
from django.contrib.postgres.fields.jsonb import KeyTextTransform

Product.objects.annotate(
    search=SearchVector(KeyTextTransform('key1', 'attributes')),
).filter(search=keyword)

PostgreSQL 10 added Full Text Search support for JSON and JSONB.

1👍

Why you shouldn’t do this

The very reason that JSONB data type and JSON/JSONB functions were introduced was to avoid this kind of search! The Django ORM provides access to most of this functionality. The Func expression can be used to for functionality that cannot quite be reached with the double underscore notation etc.

If on the other hand you have a large text field in your JSONB column that really does need to be full text searched. This indicates that the database design isn’t optimal. That field should be pulled out of the JSON and it should be a field on it’s own right.

If you still want to do this

Django 1.10 just added a Cast function.

class Cast(expression, output_field)
Forces the result type of
expression to be the one from output_field.

If you are on an older version of Django you can use the RawSQL function. You will find an example on the same page linked above. Please note that the use of the RawSQL function is not the same as executing a raw query.

Update Jan 2018

There is a certain belief that using RawSQL will make your code non portable and you should never use it. This question deals with the django.contrib.postgres package, your code most definitely isn’t portable. So if you want to avoid raw sql, it should be for a reason other than portability.

👤e4c5

Leave a comment