dataprofiler.data_readers.base_data module

Contains abstract class for data loading and saving.

class dataprofiler.data_readers.base_data.BaseData(input_file_path: str | None, data: Any, options: Dict)

Bases: object

Abstract class for data loading and saving.

Initialize Base class for loading a dataset.

Options can be specified and may be specific to the subclasses.

Parameters:
  • input_file_path (str | None) – path to the file being loaded, or None

  • data (multiple types) – data being loaded into the class instead of an input file

  • options (dict) – options pertaining to the data type

Returns:

None

data_type: str

info: str | None = None

property data

Return data.

property data_format: str | None

Return data format.

property is_structured: bool

Determine compatibility with StructuredProfiler.

property file_encoding: str | None

Return file encoding.

get_batch_generator(batch_size: int) → Generator[DataFrame | List, None, None]

Get batch generator.
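To illustrate the return type, here is a minimal sketch of a generator with the same shape as the signature above. `batch_generator` is a hypothetical stand-alone function, not the library's implementation. It assumes list data and sequential batching; the actual method may order rows differently or yield DataFrame slices.

```python
from typing import Any, Generator, List


def batch_generator(
    data: List[Any], batch_size: int
) -> Generator[List[Any], None, None]:
    """Yield consecutive slices of `data`, each with at most `batch_size` items.

    Hypothetical helper for illustration only; sequential order is an assumption.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]


rows = ["a", "b", "c", "d", "e"]
batches = list(batch_generator(rows, 2))
print(batches)  # → [['a', 'b'], ['c', 'd'], ['e']]
```

The last batch may be shorter than `batch_size` when the dataset length is not an exact multiple of it.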

classmethod is_match(input_file_path: str, options: Dict | None) → bool

Return True if the input file matches this data class, False otherwise.
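As a rough sketch of how a subclass might provide this check, the example below matches on the file suffix. `SuffixMatchedData`, its `suffix` attribute, the `"suffix"` option key, and the `None` default for `options` are all assumptions made for illustration. None of them is part of dataprofiler, and real readers may inspect the file contents instead.

```python
from typing import Dict, Optional


class SuffixMatchedData:
    """Hypothetical stand-in for a reader class; not part of dataprofiler."""

    # Assumed for illustration: the suffix this reader claims to handle.
    suffix: str = ".txt"

    @classmethod
    def is_match(cls, input_file_path: str, options: Optional[Dict] = None) -> bool:
        """Return True if the path ends with the suffix this class handles."""
        options = options or {}
        # The "suffix" option key is hypothetical; it overrides the class default.
        suffix = options.get("suffix", cls.suffix)
        return input_file_path.lower().endswith(suffix)


print(SuffixMatchedData.is_match("notes.TXT"))                       # → True
print(SuffixMatchedData.is_match("table.csv"))                       # → False
print(SuffixMatchedData.is_match("table.csv", {"suffix": ".csv"}))   # → True
```

Because `is_match` is a classmethod, it can be called without constructing a reader, which lets a caller test candidate classes before loading any data.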

reload(input_file_path: str | None, data: Any, options: Dict | None) → None

Reload the data class with a new dataset.

This erases all existing data/options and replaces them with the input data/options.

Parameters:
  • input_file_path (str | None) – path to the file being loaded, or None

  • data (multiple types) – data being loaded into the class instead of an input file

  • options (dict) – options pertaining to the data type

Returns:

None
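The replace-everything behavior described above can be sketched with a small stand-in class. `ReloadableData` is hypothetical and only mirrors the documented constructor and `reload` signatures; how the real class resets its internal state is not shown in this reference, so the re-initialization approach here is an assumption.

```python
from typing import Any, Dict, Optional


class ReloadableData:
    """Hypothetical stand-in mirroring the documented signatures."""

    def __init__(
        self, input_file_path: Optional[str], data: Any, options: Optional[Dict]
    ) -> None:
        self.input_file_path = input_file_path
        self.data = data
        # Copy the options so later changes by the caller do not leak in.
        self.options = dict(options or {})

    def reload(
        self, input_file_path: Optional[str], data: Any, options: Optional[Dict]
    ) -> None:
        """Discard the current state and re-initialize from the new inputs."""
        self.__init__(input_file_path, data, options)

    @property
    def length(self) -> int:
        """Return the number of items in the loaded data."""
        return len(self.data)


reader = ReloadableData(None, [1, 2, 3], {"delimiter": ","})
reader.reload(None, ["x", "y"], None)
print(reader.data, reader.options, reader.length)  # → ['x', 'y'] {} 2
```

Note that the old options do not carry over: after the reload, `options` reflects only what was passed in the second call.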

property length: int

Return the length of the loaded dataset.

Returns:

length of the dataset