[Django]-Sort list of real numbers mixed with letters

4👍

Let the DBMS do the sorting, that’s what it is very good at. You can hardly rival the performance in your application.

If all you got is fractional numbers with A or B appended, you can simply:

SELECT *
FROM  (
   SELECT unnest(
    ARRAY['1', '1.1', '1.2', '2', '2.1A', '2.1B', '2.2A', '101.1', '101.2']) AS s
   ) x
ORDER  BY rtrim(s, 'AB')::numeric, s;

Orders exactly as requested, and fast, too. The subselect with ARRAY and unnest() is just for building a quick testcase. The ORDER BY clause is what matters –rtrim() in the manual.

If there are other characters involved, you might want to update your question to complete the picture.

2👍

x = ['1', '1.1', '1.2', '2', '2.1A', '2.1B', '2.2A', '101.1', '101.2']

#sort by the real number portion

import string

letters = tuple(string.ascii_letters)

def change(x):
    if x.endswith(letters):
        return float(x[:len(x) -1])
    else:
        return float(x)

my_list = sorted(x, key = lambda k: change(k))

Result:

>>> my_list
['1', '1.1', '1.2', '2', '2.1A', '2.1B', '2.2A', '101.1', '101.2']

0👍

I prematurely generalized to arbitrary amounts of letters on the end:

from itertools import takewhile

def sort_key(value):
    cut_point = len(value) - len(list(takewhile(str.isalpha, reversed(value))))
    return (float(value[:cut_point]), value[cut_point:])

sorted((
    l.number
    for l in Locker.objects.extra(select={'asnumber': 'CAST(number as BYTEA)'})
), key = sort_key)

0👍

Split the strings into tuples – a real number (convert it to float or decimal) and an often empty string of characters. If you sort the tuples, and use python’s builtin sort (timesort), it should be really fast.

Be careful if scientific notation is allowed in your reals, eg 1e10.

If there’s any chance at all that there’ll be additional complexity in the comparisons later, use a class instead of a tuple. But the tuples will likely be faster. Then define one or more comparison functions (depending on if you’re in python 2.x or 3.x).

Tuples compare element 0, then element 1, etc.

Your class alternative would need to have a cmp method or the 3.x equivalent.

0👍

Storing the string as a string and then parsing it to sort it seems like the wrong approach. If what you really have there is

  • major number
  • minor number
  • optional revision

Then I would strongly suggest storing it as two integers and a text field. Sorting on major_number, minor_number, revision would work exactly as expected. You could either define the asnumber as a view at the database level or as a class based on the three base numbers with an associated __cmp__().

👤Andrew

Leave a comment