6👍
Unidecode will help you solve a certain form of this problem. It transliterates non-ASCII characters to ASCII, for example:
>>> from unidecode import unidecode
>>> unidecode(u"İstanbul")
'Istanbul'
You can achieve a similar effect by decomposing the Unicode characters and removing the combining diacritics. The problem with this technique is that certain characters are not decomposable. So, while “ö” will decompose to an “o” and an umlaut, “Ł” (L-stroke) will stay the same. Unidecode successfully translates “Ł” to “L”.
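As a rough illustration of the decomposition approach and its limit, here is a minimal sketch using only the standard library's `unicodedata` module. The helper name `strip_diacritics` is made up for this example.

```python
import unicodedata

def strip_diacritics(text):
    """Decompose each character (NFKD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# "ö" decomposes into "o" plus a combining diaeresis, so the mark can be removed.
print(strip_diacritics("ö"))
# "Ł" has no decomposition, so it passes through unchanged.
print(strip_diacritics("Ł"))
```

This is why a transliteration table such as Unidecode's covers more cases than decomposition alone.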
But Unidecode doesn’t solve all your problems: a city can be known by different names, or its name can be written in different ways. For example, in the US we call the capital of China “Beijing”, but we used to call it “Peking” (and it’s still called “Peking” in Swedish), and translating its name with Unidecode gives us something else:
>>> unidecode(u"\u5317\u4EB0")
'Bei Jing '
The best solution is to keep a language-specific list of names for each city and match against that list, rather than relying only on the city’s actual name.
1👍
I don’t think Django has anything ready-made for this.
I would create a separate column in the database, called something like NameCombinations, holding all the possible spellings concatenated together (e.g. Istanbulİstanbul), and would query it like this:
cities = City.objects.filter(NameCombinations__icontains=query)
0👍
It’s hard to give definitive advice without more information about what behaviour you want.
However, one obvious step is to define a canonical form for each name (lower case, no accents, etc.), and store the canonical form of the name in a second column of the database, in addition to the correct name. Then map the search string to the canonical form as well. Thus, “istanbul” could be the canonical form of “İstanbul”.
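A minimal sketch of such a mapping, assuming "canonical" means accents stripped and case folded. The function name `canonical` and the exact rules are illustrative choices, not something the approach prescribes.

```python
import unicodedata

def canonical(name):
    """Hypothetical canonical form: combining accents removed, case-folded, trimmed."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold().strip()

# Computed once and stored in the second column, next to the correct name.
stored = canonical("İstanbul")

# The search string goes through the same mapping before the comparison.
print(stored == canonical("ISTANBUL"))
```

Characters that do not decompose (such as “Ł”) pass through this sketch unchanged, so a transliteration step could be swapped in if those matter for your data.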
Another obvious step is to move the city name into a separate table from the rest of the information about the city. This lets each city have several names, i.e. synonyms. Then, for each city, define as many synonyms as needed to catch the different spellings your users favour. For instance, you could enter “Istanbul” and “イスタンブル” as synonyms of “İstanbul”.
You can of course use both of these approaches together.
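The separate names table could look roughly like the sketch below. It uses the standard library's `sqlite3` with an in-memory database instead of Django models so that it runs on its own; the table names `city` and `city_name` and the helper `find_city` are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE city (id INTEGER PRIMARY KEY, display_name TEXT NOT NULL);
    CREATE TABLE city_name (
        city_id INTEGER NOT NULL REFERENCES city(id),
        name TEXT NOT NULL
    );
""")

# One city row, several rows of names (synonyms) pointing at it.
conn.execute("INSERT INTO city (id, display_name) VALUES (1, 'İstanbul')")
conn.executemany(
    "INSERT INTO city_name (city_id, name) VALUES (?, ?)",
    [(1, "İstanbul"), (1, "Istanbul"), (1, "イスタンブル")],
)

def find_city(query):
    """Return the display name of the city that lists `query` among its names."""
    row = conn.execute(
        "SELECT city.display_name FROM city "
        "JOIN city_name ON city_name.city_id = city.id "
        "WHERE city_name.name = ?",
        (query,),
    ).fetchone()
    return row[0] if row else None

print(find_city("イスタンブル"))
print(find_city("Istanbul"))
```

To combine the two approaches, the `name` column would hold canonical forms and the query string would be canonicalised the same way before the lookup.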
-1👍
Once you set an appropriate collation in the database, the comparison will behave exactly as you want.