[Answer]-How Do I Get Unique IDs Efficiently in Python?


To requote my comment:

uuid.hex returns a hexadecimal encoding so you only have [0-9a-f] in there. Hence the requirement of removing the specified characters does not even come up.

However, using six hexadecimal digits gives you only 16^6 = 16,777,216 possible PINs from the start. So with 100 million users, you will run out of PINs (and end up in an endless loop).
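To make both points concrete, here is a small sketch. It assumes the PIN is taken as the first six characters of uuid4().hex, which is my guess at the setup being discussed, and uses the 100 million user target as a plain number.

```python
import uuid

# Assumption: the PIN is the first six hex digits of a random UUID.
candidate = uuid.uuid4().hex[:6]

# UUID.hex only ever contains lowercase hexadecimal digits.
hex_digits = set('0123456789abcdef')
assert set(candidate) <= hex_digits

pin_space = 16 ** 6    # number of distinct six-digit hex strings
users = 100_000_000    # target user count

print(pin_space)          # 16777216
print(pin_space < users)  # True: fewer PINs than users
```

Once every one of those 16,777,216 values is taken, a "generate until unused" loop can never terminate.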

In general, I’d simply suggest choosing a large enough PIN space and, if the application design allows it, dropping the requirement that PINs be unique.

If you want 100 million+ users, you need enough room in your PIN space so that random number generation will not fail too often. This is rather vague, so let’s come up with some numbers:

When you have a PIN space of size n and u existing users, a randomly generated PIN is unused with probability (n - u) / n. If you retry in a loop, the probability of needing exactly t tries is ((u / n) ** (t - 1)) * ((n - u) / n), i.e. the probability of selecting an existing PIN t - 1 times and then finally getting an unused one.
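The formula can be written down directly as a sketch. The function names are made up for illustration. The expected number of tries, n / (n - u), is not stated above; it is the standard mean of a geometric distribution with success probability (n - u) / n.

```python
def prob_of_exact_tries(n, u, t):
    """Probability that the first unused PIN shows up on exactly the t-th try.

    n: size of the PIN space, u: number of PINs already taken, t: try number (>= 1).
    """
    if not 0 <= u < n:
        raise ValueError("need 0 <= u < n")
    if t < 1:
        raise ValueError("t must be at least 1")
    p_taken = u / n
    # t - 1 draws hit a taken PIN, then one draw hits an unused PIN.
    return p_taken ** (t - 1) * (1 - p_taken)


def expected_tries(n, u):
    """Average number of draws until an unused PIN is found (geometric mean)."""
    if not 0 <= u < n:
        raise ValueError("need 0 <= u < n")
    return n / (n - u)


print(prob_of_exact_tries(100, 50, 3))  # 0.125
print(expected_tries(100, 50))          # 2.0
```

With half of the space taken, you need two draws on average; the loop only becomes a real problem as u gets close to n.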

Now, with 6-character PINs and the mentioned characters dropped, your alphabet is probably something like:

alphabet = 'abcdefghijkmnpqrtuvwxyz2346789'

This gives you 30 characters, and with 6-character PINs you have 30 ** 6 = 729,000,000 possible PINs. The first attempt to generate a unique PIN will thus fail in roughly 1 out of 7 cases once the user base reaches 100 million.
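The numbers can be checked quickly. This is just the arithmetic from above as a sketch; the variable names are mine.

```python
alphabet = 'abcdefghijkmnpqrtuvwxyz2346789'
pin_length = 6
users = 100_000_000

# Size of the PIN space: one of len(alphabet) characters per position.
pin_space = len(alphabet) ** pin_length

# Chance that a single random PIN collides with one already taken.
collision_probability = users / pin_space

print(len(alphabet))                    # 30
print(pin_space)                        # 729000000
print(round(collision_probability, 3))  # 0.137
```

A collision probability of about 0.137 is a little under 1/7.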

The bottleneck, however, will be checking whether the newly generated PIN already exists. Searching for a number in a table of 100 million numbers multiple times is simply never a fast operation.

As for generating a random string over an alphabet: use random.choice():

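import random
alphabet = 'abcdefghijkmnpqrtuvwxyz2346789'  # the alphabet from above
pin_length = 6
# Pick pin_length characters independently at random (range, not xrange, on Python 3).
random_pin = ''.join(random.choice(alphabet) for _ in range(pin_length))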
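Putting generation and the uniqueness check together could look roughly like this. It is only a sketch: the in-memory set stands in for whatever store actually holds the existing PINs, and generate_unique_pin and max_tries are names I made up. The retry cap is there so that a nearly full PIN space raises an error instead of looping forever.

```python
import random

ALPHABET = 'abcdefghijkmnpqrtuvwxyz2346789'
PIN_LENGTH = 6


def generate_pin():
    """Return one random PIN; it may collide with an existing one."""
    return ''.join(random.choice(ALPHABET) for _ in range(PIN_LENGTH))


def generate_unique_pin(existing_pins, max_tries=100):
    """Draw PINs until one is not in existing_pins, then record and return it.

    existing_pins is a plain set here (assumption); in a real system this
    would be a lookup against the user table.
    """
    for _ in range(max_tries):
        pin = generate_pin()
        if pin not in existing_pins:
            existing_pins.add(pin)
            return pin
    raise RuntimeError("no unused PIN found; the PIN space may be too full")


existing = set()
pins = [generate_unique_pin(existing) for _ in range(1000)]
print(len(set(pins)))  # 1000
```

With a real database, the membership test is the expensive part, which is why a larger PIN space (fewer retries) pays off.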
👤dhke
