squirrel.driver.csv_driver

Module Contents

Classes

CsvDriver

Drives the access to a data source.

class squirrel.driver.csv_driver.CsvDriver(path: str, **kwargs)

Bases: squirrel.driver.file_driver.FileDriver, squirrel.driver.driver.DataFrameDriver

Drives the access to a data source.

Initializes CsvDriver.

Parameters
  • path (str) – Path to a .csv file.

  • **kwargs – Keyword arguments passed to the super class initializer.

name = 'csv'

get_df(self, **kwargs) → dask.dataframe.DataFrame

Returns the data in the .csv file as a Dask DataFrame.

Parameters

**kwargs – Keyword arguments passed to dask.dataframe.read_csv() to read the .csv file.

Returns

(dask.dataframe.DataFrame) Dask DataFrame constructed from the .csv file.
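A minimal usage sketch for get_df(), assuming squirrel and dask are installed. The helper names (load_csv_as_dask_df, write_sample_csv, main) and the sample data are hypothetical and exist only for illustration; only CsvDriver(path) and get_df(**kwargs) come from the reference above.

```python
import tempfile
from pathlib import Path


def load_csv_as_dask_df(path: str, **read_csv_kwargs):
    """Return the .csv file at `path` as a Dask DataFrame via CsvDriver.get_df()."""
    # Imported lazily so this sketch can be loaded without squirrel installed.
    from squirrel.driver.csv_driver import CsvDriver

    driver = CsvDriver(path)
    # Keyword arguments are forwarded to dask.dataframe.read_csv().
    return driver.get_df(**read_csv_kwargs)


def write_sample_csv(directory: str) -> str:
    """Write a tiny, made-up CSV file (hypothetical data) and return its path."""
    path = Path(directory) / "sample.csv"
    path.write_text("id,score\n1,0.5\n2,0.75\n3,1.0\n")
    return str(path)


def main() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        csv_path = write_sample_csv(tmp)
        # `dtype` is a dask.dataframe.read_csv() argument passed through get_df().
        df = load_csv_as_dask_df(csv_path, dtype={"id": "int64"})
        # Dask DataFrames are lazy; compute() materializes a pandas DataFrame.
        print(df.compute())
```

Calling main() writes the sample file, reads it through the driver, and prints the materialized frame. Because the returned object is a Dask DataFrame, nothing is loaded until compute() (or a similar Dask operation) is invoked.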

get_iter(self, itertuples_kwargs: Optional[Dict] = None, read_csv_kwargs: Optional[Dict] = None) → squirrel.iterstream.Composable

Returns an iterator over rows.

Note that the .csv file is first read into a Dask DataFrame, and df.itertuples() is then called on that DataFrame.

Parameters
  • itertuples_kwargs – Keyword arguments to be passed to dask.dataframe.DataFrame.itertuples().

  • read_csv_kwargs – Keyword arguments to be passed to dask.dataframe.read_csv().

Returns

(squirrel.iterstream.Composable) Iterable over the rows of the data frame as namedtuples.
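A minimal usage sketch for get_iter(), assuming squirrel and dask are installed. The helper names (iter_csv_rows, row_to_dict, main) and the sample data are hypothetical; the two keyword dictionaries follow the signature documented above, and the index and name options are standard itertuples() arguments.

```python
import tempfile
from pathlib import Path


def iter_csv_rows(path, itertuples_kwargs=None, read_csv_kwargs=None):
    """Return an iterable over the rows of the .csv file at `path`."""
    # Imported lazily so this sketch can be loaded without squirrel installed.
    from squirrel.driver.csv_driver import CsvDriver

    driver = CsvDriver(path)
    return driver.get_iter(
        itertuples_kwargs=itertuples_kwargs,  # forwarded to DataFrame.itertuples()
        read_csv_kwargs=read_csv_kwargs,      # forwarded to dask.dataframe.read_csv()
    )


def row_to_dict(row):
    """Convert a namedtuple row into a plain dict keyed by column name."""
    return dict(row._asdict())


def main() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        csv_path = Path(tmp) / "sample.csv"
        csv_path.write_text("id,score\n1,0.5\n2,0.75\n")  # hypothetical data
        rows = iter_csv_rows(
            str(csv_path),
            # Drop the index field and name the namedtuple class "Row".
            itertuples_kwargs={"index": False, "name": "Row"},
            read_csv_kwargs={"dtype": {"id": "int64", "score": "float64"}},
        )
        for row in rows:
            print(row_to_dict(row))
```

Since each row is a namedtuple, fields can also be read by attribute (for example row.id) as long as the column names are valid Python identifiers.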