squirrel.driver.jsonl

Module Contents

Classes

JsonlDriver

A StoreDriver that by default uses SquirrelStore with jsonl serialization. Please see the parent class for additional configuration.

class squirrel.driver.jsonl.JsonlDriver(url: str, deser_hook: Optional[Callable] = None, storage_options: dict[str, t.Any] | None = None, **kwargs)

Bases: squirrel.driver.store.StoreDriver

A StoreDriver that by default uses SquirrelStore with jsonl serialization. Please see the parent class for additional configuration.

Initializes JsonlDriver with default serializer.

Parameters
  • url (str) – Path to the root directory. If this path does not exist, it will be created.

  • deser_hook (Callable) – Callable that is passed as object_hook to JsonDecoder during json deserialization. Defaults to None.

  • storage_options (Dict) – A dictionary containing storage_options to be passed to fsspec. Example of storage_options if you want to enable fsspec caching: storage_options={"protocol": "simplecache", "target_protocol": "gs", "cache_storage": "path/to/cache"}

  • **kwargs – Keyword arguments passed to the super class initializer.
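The deser_hook parameter is described above as an object_hook for JSON decoding. The following is a minimal sketch of how such a hook behaves, using only Python's standard json module rather than the driver itself. It assumes the driver forwards the hook unchanged; the function name to_tuple_hook and the sample record are hypothetical and exist only for illustration.

```python
import json


def to_tuple_hook(obj: dict) -> dict:
    """Hypothetical hook: turn any "shape" list into a tuple."""
    # The json module calls object_hook once for every decoded JSON object,
    # including nested ones, and uses the return value in place of the dict.
    if "shape" in obj:
        obj["shape"] = tuple(obj["shape"])
    return obj


# One line of a JSON Lines file (illustrative data, not from a real store).
line = '{"id": 1, "shape": [28, 28], "meta": {"shape": [1, 2]}}'

# The same callable could be passed as deser_hook when constructing the driver.
sample = json.loads(line, object_hook=to_tuple_hook)
print(sample)
```

Because the hook runs on nested objects too, it should tolerate dictionaries that lack the keys it looks for.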

name = jsonl

get_iter(get_kwargs: Optional[Dict] = None, **kwargs) → squirrel.iterstream.Composable

Returns an iterable of samples as specified by fetcher_func.

Parameters
  • get_kwargs (Dict) – Keyword arguments that will be passed as get_kwargs to MapDriver.get_iter(). get_kwargs will always have compression="gzip". Defaults to None.

  • **kwargs – Other keyword arguments that will be passed to MapDriver.get_iter().

Returns

(Composable) Iterable over the items in the store.
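To make the combination of jsonl serialization and compression="gzip" concrete, here is a rough standard-library sketch of writing and iterating over a gzip-compressed JSON Lines file. It is not the driver's implementation and does not use SquirrelStore; the helper names write_jsonl_gz and iter_jsonl_gz and the file name shard_0.gz are assumptions made for this example.

```python
import gzip
import json
import tempfile
from pathlib import Path
from typing import Iterable, Iterator


def write_jsonl_gz(path: Path, samples: Iterable[dict]) -> None:
    """Write one JSON document per line into a gzip-compressed text file."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        for sample in samples:
            f.write(json.dumps(sample) + "\n")


def iter_jsonl_gz(path: Path) -> Iterator[dict]:
    """Lazily yield one decoded sample per non-empty line."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)


with tempfile.TemporaryDirectory() as tmp:
    shard = Path(tmp) / "shard_0.gz"  # hypothetical file name
    write_jsonl_gz(shard, [{"id": 0}, {"id": 1}])
    loaded = list(iter_jsonl_gz(shard))

print(loaded)
```

The generator reads one line at a time, so samples are decoded lazily instead of loading the whole file into memory.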