0👍
I think you get it right. There’s a patch on inkscape xapian_backend fork, inspired from xapian omega project.
I’ve done something like you’ve done on my project, with some trick in order to use the search index template:
# regex to efficiently truncate with re.sub
_max_length = 240
_regex = re.compile(r"([^\s]{{{}}})([^\s]+)".format(_max_length))
def prepare_text(self, object):
# this is for using the template mechanics
field = self.fields["text"]
text = self.prepared_data[field.index_fieldname]
encoding = "utf8"
encoded = text.encode(encoding)
prepared = re.sub(_regex, r"\1", encoded, re.UNICODE)
if len(prepared) != len(encoded):
return prepared.decode(encoding, 'ignore')
return text
Source:stackexchange.com