squirrel.zarr.store

At a low level, squirrel offers store options that enhance or modify zarr stores, such as

  • reducing the number of HTTP calls zarr has to make to fetch items from cloud datasets

  • fetching items from datasets in an asynchronous manner

  • caching storage path keys and metadata

to allow fast cluster / full iteration through the dataset.
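The ideas listed above (fewer round trips, asynchronous fetching, cached keys and metadata) can be illustrated with a minimal, standard-library-only sketch. This is not squirrel's implementation: CountingBackend, CachedAsyncReader and demo are hypothetical names introduced for illustration, and the "bucket" is just an in-memory dict with a simulated delay.

```python
import asyncio


class CountingBackend:
    """Hypothetical stand-in for a remote bucket; counts simulated round trips."""

    def __init__(self, data):
        self._data = dict(data)
        self.calls = 0

    async def fetch(self, key):
        self.calls += 1
        await asyncio.sleep(0.01)  # pretend network latency
        return self._data[key]


class CachedAsyncReader:
    """Fetches keys concurrently and remembers what it has already fetched."""

    def __init__(self, backend):
        self._backend = backend
        self._cache = {}

    async def get(self, key):
        # Only go to the backend when the key is not cached yet.
        if key not in self._cache:
            self._cache[key] = await self._backend.fetch(key)
        return self._cache[key]

    async def get_many(self, keys):
        # Issue all requests at once instead of one after another.
        return await asyncio.gather(*(self.get(k) for k in keys))


def demo():
    backend = CountingBackend({".zarray": b"{}", "0.0": b"a", "0.1": b"b"})
    reader = CachedAsyncReader(backend)

    async def run():
        first = await reader.get_many([".zarray", "0.0", "0.1"])
        second = await reader.get_many([".zarray", "0.0"])  # served from cache
        return first, second

    first, second = asyncio.run(run())
    return first, second, backend.calls
```

In this sketch the second batch of reads never reaches the backend, which is the effect that caching keys and metadata is meant to have during repeated iteration.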

Currently, we offer the following store:

  • SquirrelFSStore is based on zarr.storage.FSStore. In SquirrelFSStore, we use the squirrel.fsspec module to control HTTP connections to cloud buckets.

Module Contents

Classes

SquirrelFSStore

Alters the default fs argument to CustomGCSFileSystem if the url location points to a Google Cloud bucket.

Functions

suggest_compression(→ numcodecs.abc.Codec)

Suggested compression method from squirrel.

Attributes

logger

squirrel.zarr.store.logger
class squirrel.zarr.store.SquirrelFSStore(url: squirrel.constants.URL, key_separator: str = '.', mode: str = 'w', exceptions: Sequence[Type[Exception]] = (KeyError, PermissionError, IOError), check_exists: bool = False, **storage_options)

Bases: zarr.storage.FSStore

Alters the default fs argument to CustomGCSFileSystem if the url location points to a Google Cloud bucket. This allows us to use the custom gcsfs to open FSStore inside Zarr. The main purpose is to add the Google 400 error to the retriable errors when making HTTPS requests to Google Cloud Storage buckets, so that a 400 response does not break our cloud build, tests, and other time-consuming operations. For the main issue, see https://github.com/dask/gcsfs/issues/290.

Initialize SquirrelFSStore.

Parameters
  • url (URL) – Path to the store.

  • key_separator (str, optional) – Separator placed between the dimensions of a chunk. Defaults to “.”.

  • mode (str, optional) – File IO mode to use. Defaults to “w”.

  • exceptions (Sequence[Type[Exception]], optional) – When accessing data, any of these exceptions will be treated as a missing key. Defaults to (KeyError, PermissionError, IOError).

  • check_exists (bool, optional) – Whether to check that url corresponds to a directory. Defaults to False.

  • **storage_options – Storage options to be passed to fsspec.

Raises

FSPathExistNotDir – If check_exists == True and url does not correspond to a directory.
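To make the key_separator and exceptions parameters concrete, here is a minimal sketch written with the standard library only. It does not reproduce the store's code: chunk_key, MissingKeyStore and flaky_backend are hypothetical names, and the translation of the listed exceptions into KeyError is an assumption about what "treated as a missing key" means in practice.

```python
from typing import Callable, Sequence, Tuple, Type


def chunk_key(index: Sequence[int], key_separator: str = ".") -> str:
    """Join the per-dimension chunk indices with the separator."""
    return key_separator.join(str(i) for i in index)


class MissingKeyStore:
    """Hypothetical read-only store that maps selected errors to KeyError."""

    def __init__(
        self,
        backend: Callable[[str], bytes],
        exceptions: Sequence[Type[Exception]] = (KeyError, PermissionError, IOError),
    ):
        self._backend = backend
        self._exceptions: Tuple[Type[Exception], ...] = tuple(exceptions)

    def __getitem__(self, key: str) -> bytes:
        try:
            return self._backend(key)
        except self._exceptions as err:
            # Any listed exception is reported as "this key does not exist".
            raise KeyError(key) from err


def flaky_backend(key: str) -> bytes:
    """Hypothetical backend with one readable key and two failure modes."""
    if key == "0.0":
        return b"chunk"
    if key == "locked":
        raise PermissionError(key)
    raise FileNotFoundError(key)  # subclass of OSError, which IOError aliases
```

With these defaults, a permission error and a missing file both surface to the caller as KeyError, so code iterating over chunks can handle absent data in one place.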

getsize(path: squirrel.constants.URL = None) → int

Get size of a subdir inside the store.
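As an illustration of what a subdirectory size means for a directory-backed store, the sketch below sums file sizes under a path on the local filesystem. It assumes the size is reported in bytes and counts every file recursively; neither detail is stated above, and subdir_size and demo are hypothetical helpers, not part of squirrel.

```python
import os
import tempfile


def subdir_size(root: str, path: str = "") -> int:
    """Sum the sizes in bytes of all files under root/path (recursive)."""
    base = os.path.join(root, path) if path else root
    total = 0
    for dirpath, _dirnames, filenames in os.walk(base):
        for name in filenames:
            total += os.path.getsize(os.path.join(dirpath, name))
    return total


def demo():
    with tempfile.TemporaryDirectory() as root:
        array_dir = os.path.join(root, "group", "array")
        os.makedirs(array_dir)
        with open(os.path.join(array_dir, "0.0"), "wb") as f:
            f.write(b"x" * 10)  # one 10-byte chunk
        with open(os.path.join(root, "group", ".zgroup"), "wb") as f:
            f.write(b"{}")  # 2 bytes of metadata
        return subdir_size(root), subdir_size(root, os.path.join("group", "array"))
```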

listdir(path: squirrel.constants.URL = None) → List[str]

List all dirs under the path, except for squirrel keys and zarr keys.
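The filtering described above can be sketched as follows. The zarr entries are the standard zarr v2 metadata keys; the squirrel key name is a hypothetical placeholder, since the actual squirrel keys are not listed here, and visible_entries is an illustrative helper rather than the method's real code.

```python
# Standard zarr v2 metadata keys.
ZARR_KEYS = frozenset({".zarray", ".zgroup", ".zattrs", ".zmetadata"})

# Hypothetical placeholder: the real squirrel key names are not given in this page.
SQUIRREL_KEYS = frozenset({".squirrel_routing"})


def visible_entries(entries, hidden=ZARR_KEYS | SQUIRREL_KEYS):
    """Return the entries that are not bookkeeping keys, sorted by name."""
    return sorted(e for e in entries if e not in hidden)
```

For example, calling visible_entries on a listing that mixes chunk names with ".zarray" and ".zattrs" would return only the chunk names.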

squirrel.zarr.store.suggest_compression() → numcodecs.abc.Codec

Suggested compression method from squirrel.