[Django]-How do I quickly get a weighted random instance of a Django model based on a weight field on that model?


You can combine sharding with any of the approaches you list. Choose a number of shards, preferably with (number of rows / number of shards) significantly greater than log(number of rows), so that with high probability no shard is empty. Assign each row a uniform random shard ID, and make the shard ID the first entry of the primary key so that the table is sorted by shard. To sample, choose a uniform random shard and then take a weighted sample within that shard. This is inaccurate to the extent that the shards' total weights are unbalanced, but if the shards are large enough, the law of large numbers will even the totals out. (If the shards are too large, though, that starts to defeat the point of sharding.)
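
The two steps can be sketched in plain Python, with an in-memory dict standing in for the sharded table. This is only an illustration under assumptions: the names `assign_shards` and `sample_row` are hypothetical, retrying when an empty shard is drawn is my own choice, and the within-shard step here uses `random.choices`, where a real setup would run whichever weighted-sampling query you prefer, restricted to the chosen shard.

```python
import random
from collections import defaultdict


def assign_shards(weights, num_shards, rng):
    """Give each row a uniform random shard ID.

    Returns {shard_id: [(row_id, weight), ...]}; row IDs are list indices.
    """
    shards = defaultdict(list)
    for row_id, weight in enumerate(weights):
        shards[rng.randrange(num_shards)].append((row_id, weight))
    return shards


def sample_row(shards, num_shards, rng):
    """Pick a uniform random shard, then a weighted random row inside it."""
    while True:
        rows = shards.get(rng.randrange(num_shards))
        if rows:  # assumption: simply redraw if the shard is empty
            break
    row_ids = [row_id for row_id, _ in rows]
    row_weights = [weight for _, weight in rows]
    return rng.choices(row_ids, weights=row_weights, k=1)[0]


# Usage: 1000 rows spread over 10 shards, about 100 rows per shard.
rng = random.Random(0)
weights = [1.0, 2.0, 3.0, 4.0] * 250
shards = assign_shards(weights, num_shards=10, rng=rng)
row_id = sample_row(shards, num_shards=10, rng=rng)
```

Each row ends up with probability (1 / number of shards) × (its weight / its shard's total weight), which matches the exact weighted distribution only when all shard totals are equal. That is the imbalance error described above.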
