[Fixed]-Django: iregex is case sensitive

1👍

  • "LATIN SMALL LETTER C" is not considered to be the same as "CYRILLIC SMALL LETTER ES".
  • Ditto for "CYRILLIC SMALL LETTER I" and "CYRILLIC SMALL LETTER SHORT I".
  • MySQL's REGEXP works with bytes, not characters. Hence only unaccented English letters work in REGEXP; no Cyrillic letter can (reliably) work.
  • MariaDB 10.0.5's REGEXP should do a better job. Ref: https://mariadb.com/kb/en/mariadb/pcre/
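
To make the bytes-versus-characters point concrete, here is a minimal sketch in Python. It uses Python's own `re` module as a stand-in, not MySQL's regex engine, so treat it as an analogy rather than a reproduction of REGEXP's behavior. The variable names are only for illustration.

```python
import re
import unicodedata

# Look-alike letters are distinct code points with distinct Unicode names.
latin_c = "c"            # U+0063
cyrillic_es = "\u0441"   # U+0441, renders like Latin "c"
cyrillic_i = "\u0438"    # U+0438
cyrillic_short_i = "\u0439"  # U+0439
names = [
    unicodedata.name(ch)
    for ch in (latin_c, cyrillic_es, cyrillic_i, cyrillic_short_i)
]
print(names)

upper_es = "\u0421"  # CYRILLIC CAPITAL LETTER ES

# Character-level matching: Unicode-aware case folding pairs U+0421 with U+0441.
char_match = re.search(upper_es, cyrillic_es, re.IGNORECASE)

# Byte-level matching: each letter is two UTF-8 bytes, and case folding on a
# bytes pattern only covers ASCII letters, so the two spellings never match.
upper_bytes = upper_es.encode("utf-8")
lower_bytes = cyrillic_es.encode("utf-8")
byte_match = re.search(upper_bytes, lower_bytes, re.IGNORECASE)

print(upper_bytes, lower_bytes)
print(bool(char_match), bool(byte_match))
```

The same letter in upper and lower case shares no bytes in a way an ASCII-only case fold could bridge, which is why a byte-oriented engine cannot reliably match Cyrillic case-insensitively.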
👤Rick James

0👍

I suggest switching to PostgreSQL, which handles non-Latin characters pretty well.

I just tried to reproduce your issue on my Django 1.10 and Postgres 9.6 setup.

from django.contrib.auth.models import User
users = User.objects.filter(username__iregex='Сосницки(и|й)')
users
<QuerySet [<User: Сосницкий>, <User: сосницкий>, <User: сосницкии>, <User: СоСницкии>]>

Seems to be working.
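
If you want to sanity-check a pattern like this without a database, a rough stand-in is Python's `re` with `re.IGNORECASE`. This is only a sketch: on PostgreSQL, Django's `iregex` is executed by the database (via the `~*` operator), and its regex engine is not identical to Python's. The `usernames` list below is hypothetical sample data, with one extra non-matching name added for contrast.

```python
import re

# Hypothetical sample data; "Иванов" is added only as a non-matching control.
usernames = ["Сосницкий", "сосницкий", "сосницкии", "СоСницкии", "Иванов"]

# Case-insensitive, character-aware match on the surname with either ending.
pattern = re.compile("Сосницки(и|й)", re.IGNORECASE)

matched = [name for name in usernames if pattern.search(name)]
print(matched)
```

All four spellings of the surname match regardless of case, and the control name is filtered out, which is the behavior you would expect from a character-aware case-insensitive regex.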

👤Nik
