Base Column Profilers

Contains parent column profiler class.

class dataprofiler.profilers.base_column_profilers.BaseColumnProfiler(name: str | None, options: BaseOption | None = None)

Bases: Generic[dataprofiler.profilers.base_column_profilers.BaseColumnProfilerT]

Abstract class for profiling a column of data.

Initialize base class properties for the subclass.

Parameters

name (String) – Name of the dataset

col_type = None
diff(other_profile: dataprofiler.profilers.base_column_profilers.BaseColumnProfilerT, options: Optional[dict] = None) dict

Find the differences for columns.

Parameters

other_profile (BaseColumnProfiler) – profile to find the difference with

Returns

the stat differences

Return type

dict
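As a rough illustration of what a dictionary of "stat differences" could look like, the sketch below compares two profile dictionaries key by key. It does not use dataprofiler; `diff_stats` and the "unchanged" marker are hypothetical choices made for this example, not the library's documented behavior.

```python
def diff_stats(profile_a, profile_b):
    """Hypothetical helper: compare the statistics shared by two profiles."""
    differences = {}
    for key in profile_a.keys() & profile_b.keys():
        a, b = profile_a[key], profile_b[key]
        if a == b:
            # Assumed convention: equal values are flagged, not repeated.
            differences[key] = "unchanged"
        elif isinstance(a, (int, float)) and isinstance(b, (int, float)):
            # Numeric statistics are reported as a signed difference.
            differences[key] = a - b
        else:
            # Anything else is reported as the pair of differing values.
            differences[key] = [a, b]
    return differences


stat_diff = diff_stats({"min": 1, "max": 10}, {"min": 1, "max": 7})
```

Here `stat_diff` maps `min` to the "unchanged" marker and `max` to the numeric difference between the two profiles.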

abstract update(df_series: pandas.core.frame.DataFrame) dataprofiler.profilers.base_column_profilers.BaseColumnProfiler

Update the profile.

Parameters

df_series (pandas DataFrame) – Data to profile.
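The abstract update and profile members suggest a pattern in which each subclass folds new batches of data into running statistics and returns itself. A minimal sketch of that pattern follows, assuming a stand-in base class rather than dataprofiler itself: `SketchColumnProfiler` and `ValueCountProfiler` are hypothetical names, and plain Python lists stand in for pandas data to keep the example dependency-free.

```python
from abc import ABC, abstractmethod
from collections import Counter


class SketchColumnProfiler(ABC):
    """Hypothetical stand-in for the abstract interface described above."""

    def __init__(self, name=None):
        self.name = name
        self.sample_size = 0

    @abstractmethod
    def update(self, values):
        """Fold a new batch of values into the profile and return self."""

    @property
    @abstractmethod
    def profile(self):
        """Return the profile of the column as a dict."""


class ValueCountProfiler(SketchColumnProfiler):
    """Hypothetical subclass that tracks how often each value occurs."""

    def __init__(self, name=None):
        super().__init__(name)
        self._counts = Counter()

    def update(self, values):
        values = list(values)
        self._counts.update(values)
        self.sample_size += len(values)
        return self  # returning self allows chained updates

    @property
    def profile(self):
        return {"sample_size": self.sample_size, "counts": dict(self._counts)}


# Two batches are profiled incrementally; the profile reflects both.
profiler = ValueCountProfiler("color").update(["red", "blue"]).update(["red"])
result = profiler.profile
```

Because the stand-in base class declares its members abstract, instantiating it directly raises a `TypeError`; only a subclass that implements both members can be created.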

abstract property profile: dict

Return the profile of the column.

abstract report(remove_disabled_flag: bool = False) dict

Return report.

Parameters

remove_disabled_flag (boolean) – flag to determine whether disabled options should be excluded from the report.
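One way to picture the effect of remove_disabled_flag is a report that drops the entries belonging to disabled options. The sketch below is only an assumption about that behavior: `build_report` and the `enabled` mapping are hypothetical and are not part of dataprofiler.

```python
def build_report(profile, enabled, remove_disabled_flag=False):
    """Hypothetical helper: optionally drop statistics whose option is off."""
    if not remove_disabled_flag:
        # Default: report every entry, whether or not its option is enabled.
        return dict(profile)
    # Entries with no recorded option are treated as enabled.
    return {key: value for key, value in profile.items() if enabled.get(key, True)}


profile = {"min": 1, "max": 10, "histogram": None}
enabled = {"min": True, "max": True, "histogram": False}

full_report = build_report(profile, enabled)
trimmed_report = build_report(profile, enabled, remove_disabled_flag=True)
```

With the flag left at its default, `full_report` keeps all three keys; with the flag set, `trimmed_report` omits `histogram`.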

classmethod load_from_dict(data: dict[str, Any], config: dict | None = None) BaseColumnProfilerT

Parse attributes from a JSON dictionary into self.

Parameters
  • data (dict[string, Any]) – dictionary with attributes and values.

  • config (Dict | None) – config for loading column profiler params from dictionary

Returns

Profiler with attributes populated.

Return type

BaseColumnProfiler
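The general shape of a load_from_dict classmethod can be sketched as an alternate constructor that copies known attributes from a decoded JSON dictionary onto a new instance. The class below is a hypothetical stand-in, not the dataprofiler implementation; in particular, ignoring `config` and skipping unknown keys are assumptions made for this example.

```python
class SketchProfiler:
    """Hypothetical stand-in with a few of the attributes listed on this page."""

    def __init__(self, name=None):
        self.name = name
        self.sample_size = 0
        self.metadata = {}
        self.times = {}

    @classmethod
    def load_from_dict(cls, data, config=None):
        # config is accepted to mirror the documented signature but is
        # unused in this sketch.
        profiler = cls(data.get("name"))
        for attr, value in data.items():
            # Assumption: only attributes the instance already defines are set.
            if hasattr(profiler, attr):
                setattr(profiler, attr, value)
        return profiler


serialized = {"name": "age", "sample_size": 42, "times": {"update": 0.01}, "unknown": 1}
loaded = SketchProfiler.load_from_dict(serialized)
```

Using a classmethod means a subclass that calls `load_from_dict` gets back an instance of that subclass, since `cls` refers to whichever class the method was called on.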

class dataprofiler.profilers.base_column_profilers.BaseColumnPrimitiveTypeProfiler(name: str | None)

Bases: dataprofiler.profilers.base_column_profilers.BaseColumnProfiler[dataprofiler.profilers.base_column_profilers.BaseColumnPrimitiveTypeProfilerT]

Abstract class for profiling a primitive data type for a column of data.

Initialize base class properties for the subclass.

Parameters

name (String) – Name of the data

sample_size: int
col_type = None
diff(other_profile: dataprofiler.profilers.base_column_profilers.BaseColumnProfilerT, options: Optional[dict] = None) dict

Find the differences for columns.

Parameters

other_profile (BaseColumnProfiler) – profile to find the difference with

Returns

the stat differences

Return type

dict

classmethod load_from_dict(data: dict[str, Any], config: dict | None = None) BaseColumnProfilerT

Parse attributes from a JSON dictionary into self.

Parameters
  • data (dict[string, Any]) – dictionary with attributes and values.

  • config (Dict | None) – config for loading column profiler params from dictionary

Returns

Profiler with attributes populated.

Return type

BaseColumnProfiler

abstract property profile: dict

Return the profile of the column.

abstract report(remove_disabled_flag: bool = False) dict

Return report.

Parameters

remove_disabled_flag (boolean) – flag to determine whether disabled options should be excluded from the report.

abstract update(df_series: pandas.core.frame.DataFrame) dataprofiler.profilers.base_column_profilers.BaseColumnProfiler

Update the profile.

Parameters

df_series (pandas DataFrame) – Data to profile.

name: str | None
metadata: dict
times: dict
thread_safe: bool