traffic_replay: Change user distribution to use Pareto Distribution
author Tim Beale <timbeale@catalyst.net.nz>
Mon, 15 Oct 2018 21:57:29 +0000 (10:57 +1300)
committer Tim Beale <timbeale@samba.org>
Sun, 4 Nov 2018 22:55:16 +0000 (23:55 +0100)
The probability we were previously assigning to users (1/x, decreasing
as the user-ID increases) roughly approximates the Pareto Distribution
(with shape=1.0). Switch to random.paretovariate() so that the code uses
a documented algorithm (i.e. there is an explanation on Wikipedia). This
also allows us to vary the distribution by changing the shape parameter.

Signed-off-by: Tim Beale <timbeale@catalyst.net.nz>
Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
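
The claim that the old 1/x weights roughly approximate a Pareto distribution with shape=1.0 can be checked informally. The sketch below is not part of the commit; the helper names (old_weights, pareto_weights, normalise, top_share) are hypothetical and exist only for this comparison. It relies on the fact that random.paretovariate(1.0) returns values >= 1.0 with a heavy tail, so once sorted, a few users carry most of the total weight, much like 1/x does.

```python
import random


def old_weights(n):
    # Deterministic weights from the previous code: decreasing with user ID.
    return [1 / (x + 0.001) for x in range(1, n + 1)]


def pareto_weights(n, shape=1.0, rng=random):
    # One independent Pareto sample per user; every sample is >= 1.0.
    return [rng.paretovariate(shape) for _ in range(n)]


def normalise(weights):
    # Scale the weights so that they sum to 1.0.
    total = sum(weights)
    return [w / total for w in weights]


def top_share(weights, fraction=0.1):
    # Share of the total weight held by the heaviest `fraction` of users.
    ordered = sorted(weights, reverse=True)
    k = max(1, int(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)


if __name__ == "__main__":
    n = 1000
    print("old 1/x top-10% share:", top_share(old_weights(n)))
    print("pareto  top-10% share:", top_share(pareto_weights(n)))
```

The Pareto figure varies from run to run because the samples are random; a larger shape value should flatten the distribution, which is the tuning knob the commit message refers to.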
python/samba/emulate/traffic.py

index 8eb10ee881999c543010870c3dd836e7e702e374..16672286cd89dde5ab2cbc781236af659820a86a 100644
@@ -1842,11 +1842,12 @@ class GroupAssignments(object):
     def generate_user_distribution(self, n):
         """Probability distribution of a user belonging to a group.
         """
-        # Assign a weighted probability to each user. Probability decreases
-        # as the user-ID increases
+        # Assign a weighted probability to each user. Use the Pareto
+        # Distribution so that some users are in a lot of groups, and the
+        # bulk of users are in only a few groups
         weights = []
         for x in range(1, n + 1):
-            p = 1 / (x + 0.001)
+            p = random.paretovariate(1.0)
             weights.append(p)
 
         # convert the weights to a cumulative distribution between 0.0 and 1.0
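
The last context line of the hunk mentions converting the weights to a cumulative distribution between 0.0 and 1.0. The following is a minimal sketch of that general technique, not the code in traffic.py; the function names cumulative_distribution and pick_index are assumptions made for illustration. It shows how per-user weights (such as the Pareto samples above) can be turned into a cumulative list and then sampled with a binary search.

```python
import bisect
import itertools
import random


def cumulative_distribution(weights):
    # Running totals, scaled by the final total so the last entry is 1.0.
    running = list(itertools.accumulate(weights))
    total = running[-1]
    return [c / total for c in running]


def pick_index(dist, rng=random):
    # rng.random() is in [0.0, 1.0), so bisect_right always lands on a
    # valid index; index i is chosen with probability dist[i] - dist[i-1].
    return bisect.bisect_right(dist, rng.random())


if __name__ == "__main__":
    weights = [random.paretovariate(1.0) for _ in range(10)]
    dist = cumulative_distribution(weights)
    picks = [pick_index(dist) for _ in range(1000)]
    # Users with larger weights should be picked more often.
    for user, weight in enumerate(weights):
        print(user, round(weight, 2), picks.count(user))
```

Dividing by the final running total, rather than by a separately computed sum, guarantees the last entry is exactly 1.0, so no bounds check is needed after the search.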