8👍
✅
>>> s = "This is a very long string with many many many many and many more sentences and there is not one character that i can use to split by, just by number of words"
>>> l = s.split()
>>> n = 5
>>> [' '.join(l[x:x+n]) for x in range(0, len(l), n)]
['This is a very long',
'string with many many many',
'many and many more sentences',
'and there is not one',
'character that i can use',
'to split by, just by',
'number of words']
👤DrTyrsa
1👍
Here is an idea:
def split_chunks(s, chunksize):
    pos = 0
    while pos != -1:
        # Find the last space within the next chunksize characters.
        new_pos = s.rfind(" ", pos, pos + chunksize)
        if new_pos == pos:
            new_pos += chunksize  # force split in word
        yield s[pos:new_pos]
        pos = new_pos
This tries to split the string into chunks of at most chunksize characters. It tries to split at spaces, but if it can't, it splits in the middle of a word:
>>> foo = "asdf qwerty sderf sdefw regf"
>>> list(split_chunks(foo, 6))
['asdf', ' qwert', 'y', ' sderf', ' sdefw', ' regf', '']
I guess it requires some tweaking though (for instance how to handle splits that occur inside words), but it should give you a starting point.
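One possible tweak, offered only as a sketch: the standard library's textwrap.wrap already breaks at whitespace and, by default, splits any word longer than the requested width. The wrapper name wrap_chunks below is made up for illustration. Unlike split_chunks, this drops the whitespace at chunk boundaries instead of keeping it at the start of the next chunk.

```python
import textwrap


def wrap_chunks(s, chunksize):
    """Split s into chunks of at most chunksize characters, preferring spaces.

    wrap_chunks is a hypothetical helper name; the real work is done by
    textwrap.wrap from the standard library.
    """
    # textwrap.wrap breaks at whitespace and, with the default
    # break_long_words=True, splits words longer than the width.
    return textwrap.wrap(s, width=chunksize)


foo = "asdf qwerty sderf sdefw regf"
chunks = wrap_chunks(foo, 6)

# A single word longer than the chunk size is cut into pieces.
long_word_chunks = wrap_chunks("abcdefghij", 4)

print(chunks)
print(long_word_chunks)
```

Whether dropping the boundary spaces is acceptable depends on whether the chunks need to be concatenated back into the exact original string.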
To split by number of words, do this:
def split_n_chunks(s, words_per_chunk):
    s_list = s.split()
    pos = 0
    while pos < len(s_list):
        yield s_list[pos:pos+words_per_chunk]
        pos += words_per_chunk
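Note that split_n_chunks yields lists of words rather than strings. A minimal usage sketch, assuming you want each chunk back as a single string: the for/range loop is just an equivalent way of writing the generator, and the sample text is made up for illustration.

```python
def split_n_chunks(s, words_per_chunk):
    """Yield successive lists of at most words_per_chunk words from s."""
    words = s.split()
    for pos in range(0, len(words), words_per_chunk):
        yield words[pos:pos + words_per_chunk]


# Hypothetical sample input, not from the question.
text = "one two three four five six seven"

# Join each list of words back into a space-separated string.
joined = [" ".join(chunk) for chunk in split_n_chunks(text, 3)]
print(joined)
```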
👤Björn Pollex
Source:stackexchange.com