test_torch_distibuted_loading

Module Contents

Functions

create_data(test_folder: str) → int

Helper function to create test data.

test_JSONSource(test_data: Generator) → None

Quick sanity test that the data can be loaded and a fixed number of samples taken.

test_data() → Iterator[Tuple[str, int]]

Fixture for this module's test data.

test_multi_worker(test_data: Generator, take_samples: int) → None

Test that the data can be loaded across multiple PyTorch DataLoader workers in a single-node setup.

test_multi_worker_multi_rank(test_data: Generator, take_samples: int) → None

Test that the data loads correctly in a multi-node, multi-worker setting.

Attributes

MAX_SAMPLES_PER_SHARD

MIN_SAMPLES_PER_SHARD

N_SHARDS

test_torch_distibuted_loading.MAX_SAMPLES_PER_SHARD = 100
test_torch_distibuted_loading.MIN_SAMPLES_PER_SHARD = 50
test_torch_distibuted_loading.N_SHARDS = 20
test_torch_distibuted_loading.create_data(test_folder: str) → int

Helper function to create test data.
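Given the module constants above (`N_SHARDS`, `MIN_SAMPLES_PER_SHARD`, `MAX_SAMPLES_PER_SHARD`), the helper presumably writes a folder of shard files with a random number of samples each and returns the total sample count. A minimal sketch, assuming JSON-lines shards; the file naming and record schema here are hypothetical:

```python
import json
import os
import random
import tempfile

# Module constants documented above.
MAX_SAMPLES_PER_SHARD = 100
MIN_SAMPLES_PER_SHARD = 50
N_SHARDS = 20

def create_data(test_folder: str) -> int:
    """Write N_SHARDS JSON-lines shard files, each with a random
    number of samples in [MIN, MAX], and return the total number
    of samples written. (Hypothetical reconstruction.)"""
    total = 0
    for shard in range(N_SHARDS):
        n = random.randint(MIN_SAMPLES_PER_SHARD, MAX_SAMPLES_PER_SHARD)
        path = os.path.join(test_folder, f"shard_{shard:05d}.jsonl")
        with open(path, "w") as f:
            for i in range(n):
                f.write(json.dumps({"shard": shard, "idx": i}) + "\n")
        total += n
    return total

with tempfile.TemporaryDirectory() as d:
    total = create_data(d)
    assert N_SHARDS * MIN_SAMPLES_PER_SHARD <= total <= N_SHARDS * MAX_SAMPLES_PER_SHARD
    assert len(os.listdir(d)) == N_SHARDS
```

Returning the total count lets the tests below verify that no sample is dropped or duplicated when loading is split across workers.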

test_torch_distibuted_loading.test_JSONSource(test_data: Generator) → None

Quick sanity test that the data can be loaded and a fixed number of samples taken.
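Taking a fixed number of samples from an iterable source is typically done with `itertools.islice`. A sketch with a stand-in JSON-lines reader, since the real `JSONSource` API is not shown in these docs:

```python
import io
import itertools
import json

# Stand-in for a JSON-lines source; the real JSONSource API is not
# documented here, so this iterates a file-like object instead.
def iter_json_lines(fp):
    for line in fp:
        yield json.loads(line)

raw = "\n".join(json.dumps({"idx": i}) for i in range(1000))
take = 10  # fixed number of samples, as in the sanity test
samples = list(itertools.islice(iter_json_lines(io.StringIO(raw)), take))
assert len(samples) == take
print(samples[0])  # → {'idx': 0}
```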

test_torch_distibuted_loading.test_data() → Iterator[Tuple[str, int]]

Fixture for this module's test data.
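The `Iterator[Tuple[str, int]]` signature suggests the fixture yields a `(folder, total_samples)` pair and cleans up afterwards. A sketch of that shape; the real fixture is decorated with `@pytest.fixture` (omitted here so the sketch runs standalone), and the shard contents below are placeholders:

```python
import os
import tempfile
from typing import Iterator, Tuple

def test_data() -> Iterator[Tuple[str, int]]:
    """Create temporary test data, yield (folder, total sample count),
    then clean up when the generator is closed."""
    with tempfile.TemporaryDirectory() as folder:
        n_samples = 0
        # Write a couple of tiny shard files (stand-in for create_data).
        for shard in range(2):
            path = os.path.join(folder, f"shard_{shard}.jsonl")
            with open(path, "w") as f:
                for i in range(5):
                    f.write('{"idx": %d}\n' % i)
                    n_samples += 1
        yield folder, n_samples
    # TemporaryDirectory removes the folder when the with-block exits.

gen = test_data()
folder, n = next(gen)
assert os.path.isdir(folder) and n == 10
gen.close()  # triggers cleanup
assert not os.path.isdir(folder)
```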

test_torch_distibuted_loading.test_multi_worker(test_data: Generator, take_samples: int) → None

Test that the data can be loaded across multiple PyTorch DataLoader workers in a single-node setup.
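A single-node multi-worker test typically checks that each DataLoader worker reads a disjoint subset of shards and that the union covers every sample. The usual partitioning rule (inside an `IterableDataset.__iter__`, via `torch.utils.data.get_worker_info()`) assigns shard `s` to worker `s % num_workers`. A torch-free sketch of that rule, which is an assumption since the real partitioning code is not shown:

```python
N_SHARDS = 20  # module constant from the docs

def shards_for_worker(worker_id: int, num_workers: int) -> list:
    # Round-robin shard assignment across DataLoader workers.
    return [s for s in range(N_SHARDS) if s % num_workers == worker_id]

num_workers = 4
parts = [shards_for_worker(w, num_workers) for w in range(num_workers)]

# Disjoint and complete coverage: no shard is dropped or duplicated.
all_shards = sorted(s for p in parts for s in p)
assert all_shards == list(range(N_SHARDS))
print(parts[0])  # → [0, 4, 8, 12, 16]
```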

test_torch_distibuted_loading.test_multi_worker_multi_rank(test_data: Generator, take_samples: int) → None

Test that the data loads correctly in a multi-node, multi-worker setting.
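In the multi-node case the partition must account for both the distributed rank and the DataLoader worker id: every `(rank, worker)` pair becomes one global consumer, and shards are strided across all consumers. A torch-free sketch of that scheme, again an assumption about how the module shards its data:

```python
N_SHARDS = 20  # module constant from the docs

def shards_for(rank: int, world_size: int,
               worker_id: int, num_workers: int) -> list:
    # Each (rank, worker) pair gets a unique global consumer index,
    # so shards partition cleanly across the whole job.
    consumer = rank * num_workers + worker_id
    stride = world_size * num_workers
    return [s for s in range(N_SHARDS) if s % stride == consumer]

world_size, num_workers = 2, 2
parts = [
    shards_for(r, world_size, w, num_workers)
    for r in range(world_size)
    for w in range(num_workers)
]

# Union over all ranks and workers covers every shard exactly once.
covered = sorted(s for p in parts for s in p)
assert covered == list(range(N_SHARDS))
```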