10π
This is more of a starting point than a full solution, but I hope it helps and that other users
can improve on this idea and reach a better solution.
When using Haystack to index a multilingual site (using django-transmeta or django-multilingual), you face two problems:
- how to index the content for all the languages
- how to query the correct index depending on the selected language
1) Index the content for all the languages
Create a separate field for each language in every SearchIndex model, using a common prefix
and the language code:
    text_en = indexes.CharField(model_attr='body_en', document=True)
    text_pt = indexes.CharField(model_attr='body_pt')
If you want to index several model fields in one index field, you can use a template. Only one of the index fields can have document=True.
If you need a pre-rendered field (see http://haystacksearch.org/docs/searchindex_api.html) for
faster display, you should create one for each language (e.g. rendered_en, rendered_pt).
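The naming convention above can be generated instead of typed out by hand. This is only a minimal sketch and not part of Haystack: language_field_map is a hypothetical helper, and the language list is assumed to come from something like settings.LANGUAGES. It only builds the mapping from index field name to model attribute; in a real SearchIndex each entry would still have to be declared as an indexes.CharField.

```python
def language_field_map(languages, index_prefix="text", model_prefix="body"):
    """Map each per-language index field name to its model attribute.

    Hypothetical helper: `languages` is a list of language codes such as
    ["en", "pt"], matching the suffixes used in the field names above.
    """
    return {
        "%s_%s" % (index_prefix, code): "%s_%s" % (model_prefix, code)
        for code in languages
    }


# Show which model attribute would feed each index field.
fields = language_field_map(["en", "pt"])
for index_field, model_attr in sorted(fields.items()):
    print(index_field, "<-", model_attr)
```

The same helper could be called with index_prefix="rendered" to list the pre-rendered fields.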
2) Querying the correct index
The default Haystack auto_query method is programmed to receive a "q" query parameter on the request
and search the "content" index field (the one marked as document=True) in all the index models.
Only one of the index fields can have document=True, and I believe we can only have one SearchIndex
for each Django model.
The simplest solution, using the common search form, is to create a multilingual SearchQuerySet
that filters not on content but on text_<language code> (text being the prefix used in
the SearchIndex model above):
    from django.conf import settings
    from django.utils.translation import get_language
    from haystack.query import SearchQuerySet, DEFAULT_OPERATOR

    class MlSearchQuerySet(SearchQuerySet):
        def filter(self, **kwargs):
            """Narrows the search based on certain attributes and the default operator."""
            if 'content' in kwargs:
                # Redirect the generic 'content' lookup to the field for the active language.
                kwd = kwargs.pop('content')
                kwdkey = "text_%s" % str(get_language())
                kwargs[kwdkey] = kwd
            if getattr(settings, 'HAYSTACK_DEFAULT_OPERATOR', DEFAULT_OPERATOR) == 'OR':
                return self.filter_or(**kwargs)
            else:
                return self.filter_and(**kwargs)
and point your search URL to a view that uses this query set:
    from haystack.forms import ModelSearchForm
    from haystack.views import SearchView

    urlpatterns += patterns('haystack.views',
        url(r'^search/$', SearchView(
            searchqueryset=MlSearchQuerySet(),
            form_class=ModelSearchForm
        ), name='haystack_search_ml'),
    )
Now your search should be aware of the selected language.
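One thing to watch for: get_language() can return a regional code such as "pt-br", which would produce the key text_pt-br rather than text_pt. Below is a minimal sketch of normalising the code before building the key. The helper index_field_for and its fallback to a default language are assumptions made for illustration; neither is part of Haystack or Django.

```python
def index_field_for(language_code, indexed_languages, default="en", prefix="text"):
    """Return the index field name for the active language.

    `indexed_languages` lists the codes that have a text_<code> field.
    Unknown or missing languages fall back to `default` (an assumption).
    """
    # "pt-br" or "pt_BR" -> "pt"; None or "" -> "" (handled by the fallback)
    base = (language_code or "").replace("_", "-").split("-")[0].lower()
    if base not in indexed_languages:
        base = default
    return "%s_%s" % (prefix, base)


print(index_field_for("pt-br", ["en", "pt"]))  # text_pt
print(index_field_for("fr", ["en", "pt"]))     # text_en
```

Inside MlSearchQuerySet.filter, the result of such a helper would take the place of the "text_%s" expression.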
1π
I wrote a detailed explanation of how to do it here: http://anthony-tresontani.github.com/Django/2012/09/20/multilingual-search/
It involves writing a custom Solr engine (backend + query) and setting up multiple cores, one per language.
0π
There are a few commercial products, for example multilingual indexers for Solr or Lucene, that can determine the language automatically.
I don't like commercial products, but the idea is nice and simple: crawl the website, determine the language (from a meta tag, for example) and index it.
So choose a search engine and try to extend it to handle multilingual sites.
Good question though, let us know how you solved this.
0π
Here is a solution.
Use Sphinx. Create an index for each locale, e.g. Articles-en_us, Articles-es_mx, etc.
When you pass the search query to the Sphinx search API, append the locale code to the index name.
Here is a reference on how to set up Sphinx with Django.
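As a rough illustration of appending the locale code to the index name, here is a minimal sketch. The helper sphinx_index_name is hypothetical, and the naming simply mirrors the Articles-en_us example above. Whether the search daemon accepts that exact name depends on how the indexes are defined in its configuration. The call to the search client itself is left out, since it varies between client libraries.

```python
def sphinx_index_name(base, locale):
    """Build a per-locale index name such as "Articles-en_us".

    Hypothetical helper: the locale is lowercased and any dash is turned
    into an underscore so that "en-US" and "en_us" give the same name.
    """
    normalized = locale.replace("-", "_").lower()
    return "%s-%s" % (base, normalized)


for locale in ("en_US", "es-mx"):
    print(sphinx_index_name("Articles", locale))
```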
0π
Avoid Sphinx if you can, since you're going to want fewer dependencies. I use Django to achieve multilingual support with the parameter hl=languageCode, e.g. hl=el for Greek, or any of the 39 or so languages that Django on App Engine supports. The GAE engineers update the backend regardless of my own updates, and the .po files from the project's gettext catalogs are my language pack.
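Reading the hl parameter could look roughly like the sketch below. It uses only the Python standard library. The helper language_from_url, the set of supported codes and the fallback to a default language are all assumptions made for illustration, not something taken from Django or App Engine.

```python
from urllib.parse import parse_qs, urlparse


def language_from_url(url, supported, default="en"):
    """Pick the language from the `hl` query parameter of a URL.

    Falls back to `default` when `hl` is absent or not in `supported`.
    """
    values = parse_qs(urlparse(url).query).get("hl", [])
    code = values[0].lower() if values else default
    return code if code in supported else default


print(language_from_url("http://example.com/search?q=news&hl=el", {"en", "el"}))  # el
```

In a Django view, the chosen code would typically be handed to the translation machinery before rendering, so that the gettext catalogs for that language are used.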