quantify_randomness
Module Contents
Classes
DummyShardedDriver | Return integer elements in shards
Functions
kendalltau_metric | Compute the Kendall tau randomness metric
quantify_randomness | Quantify the randomness of sampling from a driver with the given shuffle parameters.
class quantify_randomness.DummyShardedDriver(num_shard: int, shard_size: int)
Bases: squirrel.driver.MapDriver
Return integer elements in shards.
Init dummy sharded driver.
name = dummy_sharded_driver
get_iter(flatten: bool = True, **kwargs) → squirrel.iterstream.base.Composable
Get iterator.
keys() → Iterable
Get key iterator.
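The class above is described only as returning integer elements in shards. As a rough illustration of that idea, here is a minimal sketch that does not use squirrel at all. The name `ToyShardedSource`, its `get` method, and the layout of consecutive integers per shard are assumptions made for this example, not the documented implementation.

```python
from typing import Iterable, Iterator, List, Union


class ToyShardedSource:
    """Hypothetical stand-in for a sharded integer source (not the squirrel class)."""

    def __init__(self, num_shard: int, shard_size: int) -> None:
        self.num_shard = num_shard
        self.shard_size = shard_size

    def keys(self) -> Iterable[int]:
        # One key per shard.
        return range(self.num_shard)

    def get(self, key: int) -> List[int]:
        # Assumed layout: shard k holds shard_size consecutive integers.
        start = key * self.shard_size
        return list(range(start, start + self.shard_size))

    def get_iter(self, flatten: bool = True) -> Iterator[Union[int, List[int]]]:
        for key in self.keys():
            shard = self.get(key)
            if flatten:
                yield from shard  # emit individual items
            else:
                yield shard  # emit whole shards


source = ToyShardedSource(num_shard=2, shard_size=3)
flat = list(source.get_iter())
shards = list(source.get_iter(flatten=False))
```

With `flatten=True` the sketch yields single integers; with `flatten=False` it yields one list per shard.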
quantify_randomness.kendalltau_metric(result1: numpy.array, result2: numpy.array) → float
Compute the Kendall tau randomness metric.
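For readers unfamiliar with the coefficient, the following pure-Python sketch computes Kendall's tau-a for two equal-length sequences by counting concordant and discordant pairs. It is an illustration only: the function name `kendall_tau` is hypothetical, and how `kendalltau_metric` turns the coefficient into its final return value is not shown in this reference.

```python
from itertools import combinations
from typing import Sequence


def kendall_tau(a: Sequence[float], b: Sequence[float]) -> float:
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    score = 0
    for i, j in combinations(range(len(a)), 2):
        product = (a[i] - a[j]) * (b[i] - b[j])
        if product > 0:
            score += 1  # pair ordered the same way in both sequences
        elif product < 0:
            score -= 1  # pair ordered in opposite ways
    n_pairs = len(a) * (len(a) - 1) // 2
    return score / n_pairs


identical = kendall_tau([0, 1, 2, 3], [0, 1, 2, 3])
reversed_order = kendall_tau([0, 1, 2, 3], [3, 2, 1, 0])
```

Identical orderings give a coefficient of 1 and fully reversed orderings give -1; orderings with no consistent relationship tend toward 0.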
quantify_randomness.quantify_randomness(num_shard: int, shard_size: int, buffer_size: int, initial: int, n_samples: int = 250, metric: Callable = kendalltau_metric, seed1: squirrel.constants.SeedType = None, seed2: squirrel.constants.SeedType = None) → float
Quantify the randomness of sampling from a driver with the given shuffle parameters. This function assumes that all keys are always fully shuffled, so the parameters of interest are those of the item shuffle buffer.
- Parameters
num_shard (int) – number of shards
shard_size (int) – size of each shard; all shards are assumed to be of equal size
buffer_size (int) – buffer size of the item shuffle buffer
initial (int) – initial size of the item shuffle buffer
n_samples (int) – number of sampled trajectories; controls the accuracy of the estimate
metric (Callable) – function that measures the distance between sampled trajectories (default: kendalltau_metric)
seed1 (SeedType) – seed for the first trajectory
seed2 (SeedType) – seed for the second trajectory
- Returns
randomness measure computed from the Kendall tau coefficient. Values lie between 0 and 1, where 1 means completely deterministic and 0 means random.
- Return type
float
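To make the roles of `num_shard`, `shard_size`, `buffer_size`, and `initial` concrete, here is a self-contained toy simulation that does not call squirrel. Everything in it is an assumption for illustration: the shuffle-buffer semantics (warm up to `initial` items, grow toward `buffer_size`, emit a random buffered item), the helper names, and the choice to average the absolute Kendall tau over pairs of trajectories. It is a sketch of the general idea, not the module's implementation.

```python
import random
from itertools import combinations
from typing import Iterable, Iterator, List, Sequence

_END = object()  # sentinel marking an exhausted iterator


def _pop_random(buffer: List[int], rng: random.Random) -> int:
    # Swap a random element to the end, then pop it in O(1).
    idx = rng.randrange(len(buffer))
    buffer[idx], buffer[-1] = buffer[-1], buffer[idx]
    return buffer.pop()


def shuffle_buffer(items: Iterable[int], buffer_size: int, initial: int,
                   rng: random.Random) -> Iterator[int]:
    """Assumed semantics: emit random buffered items once `initial` are held."""
    stream = iter(items)
    buffer: List[int] = []
    for item in stream:
        buffer.append(item)
        if len(buffer) < buffer_size:
            # Grow toward buffer_size by pulling one extra item per step.
            extra = next(stream, _END)
            if extra is not _END:
                buffer.append(extra)
        if len(buffer) >= initial:
            yield _pop_random(buffer, rng)
    while buffer:  # drain what is left
        yield _pop_random(buffer, rng)


def sample_trajectory(num_shard: int, shard_size: int, buffer_size: int,
                      initial: int, rng: random.Random) -> List[int]:
    keys = list(range(num_shard))
    rng.shuffle(keys)  # shard keys are always fully shuffled
    items = (k * shard_size + i for k in keys for i in range(shard_size))
    return list(shuffle_buffer(items, buffer_size, initial, rng))


def kendall_tau(a: Sequence[int], b: Sequence[int]) -> float:
    score = 0
    for i, j in combinations(range(len(a)), 2):
        product = (a[i] - a[j]) * (b[i] - b[j])
        score += (product > 0) - (product < 0)
    return score / (len(a) * (len(a) - 1) // 2)


def toy_randomness(num_shard: int, shard_size: int, buffer_size: int,
                   initial: int, n_samples: int = 20, seed: int = 0) -> float:
    """Average |tau| over pairs of trajectories (assumed mapping to [0, 1])."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        first = sample_trajectory(num_shard, shard_size, buffer_size, initial, rng)
        second = sample_trajectory(num_shard, shard_size, buffer_size, initial, rng)
        total += abs(kendall_tau(first, second))
    return total / n_samples


no_shuffle = toy_randomness(num_shard=1, shard_size=16, buffer_size=1, initial=1)
big_buffer = toy_randomness(num_shard=4, shard_size=8, buffer_size=32, initial=32)
```

In this toy setup a single shard with a buffer of size 1 leaves the order untouched, so every pair of trajectories is identical and the score is 1. A buffer that holds all items would be expected to score closer to 0.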