Base Column Profilers

Contains the parent column profiler classes.

class dataprofiler.profilers.base_column_profilers.BaseColumnProfiler(name: Optional[str])

Bases: object

Abstract class for profiling a column of data.

Initialize base class properties for the subclass.

Parameters

name (Optional[str]) – Name of the dataset

col_type = None
diff(other_profile: dataprofiler.profilers.base_column_profilers.BaseColumnProfiler, options: Optional[Dict] = None) → Dict

Find the differences for columns.

Parameters

other_profile (BaseColumnProfiler) – profile to find the difference with

Returns

The differences between the statistics of the two profiles.

Return type

dict
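To give a rough idea of what a dictionary of stat differences could look like, the sketch below compares two flat dictionaries of statistics. The helper diff_stats and the keys used in the example are hypothetical and not part of dataprofiler; the dictionary returned by diff in the library may be structured differently.

```python
from typing import Any, Dict


def diff_stats(this: Dict[str, Any], other: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: compare two flat dictionaries of statistics."""
    differences: Dict[str, Any] = {}
    for key in sorted(set(this) | set(other)):
        a, b = this.get(key), other.get(key)
        if a == b:
            # Identical values are marked rather than dropped.
            differences[key] = "unchanged"
        elif isinstance(a, (int, float)) and isinstance(b, (int, float)):
            # Numeric statistics are reported as a signed difference.
            differences[key] = a - b
        else:
            # Anything else is reported as the pair of values.
            differences[key] = [a, b]
    return differences


result = diff_stats(
    {"min": 1, "max": 10, "type": "int"},
    {"min": 3, "max": 10, "type": "float"},
)
print(result)  # {'max': 'unchanged', 'min': -2, 'type': ['int', 'float']}
```

Reporting every key, including unchanged ones, is only one possible design; it keeps the output shape predictable for both profiles.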

abstract update(df_series: pandas.core.frame.DataFrame) → dataprofiler.profilers.base_column_profilers.BaseColumnProfiler

Update the profile.

Parameters

df_series (pandas.DataFrame) – Data to profile.

abstract property profile: Dict

Return the profile of the column.

abstract report(remove_disabled_flag: bool = False) → Dict

Return report.

Parameters

remove_disabled_flag (bool) – Whether disabled options should be excluded from the report.
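The abstract members documented above (update, profile and report) have to be supplied by a concrete subclass. The following is a minimal sketch of that pattern using only the standard library. SketchBaseProfiler is a stand-in that mirrors the documented interface rather than the library class, TextLengthProfiler and its track_min option are hypothetical, and a plain list of strings replaces the pandas data that the real update method receives.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class SketchBaseProfiler(ABC):
    """Stand-in mirroring the documented interface; not the library class."""

    col_type: Optional[str] = None

    def __init__(self, name: Optional[str]) -> None:
        self.name = name
        self.sample_size = 0

    @abstractmethod
    def update(self, values: List[str]) -> "SketchBaseProfiler":
        """Fold a new batch of values into the profile."""

    @property
    @abstractmethod
    def profile(self) -> Dict:
        """Return the statistics collected so far."""

    @abstractmethod
    def report(self, remove_disabled_flag: bool = False) -> Dict:
        """Return the profile, optionally without disabled statistics."""


class TextLengthProfiler(SketchBaseProfiler):
    """Hypothetical subclass tracking string lengths in a column."""

    col_type = "text_length"

    def __init__(self, name: Optional[str], track_min: bool = True) -> None:
        super().__init__(name)
        self.track_min = track_min  # hypothetical option that can be disabled
        self.max_length: Optional[int] = None
        self.min_length: Optional[int] = None

    def update(self, values: List[str]) -> "TextLengthProfiler":
        for value in values:
            length = len(value)
            if self.max_length is None or length > self.max_length:
                self.max_length = length
            if self.track_min and (
                self.min_length is None or length < self.min_length
            ):
                self.min_length = length
        self.sample_size += len(values)
        return self  # returning self allows chained updates

    @property
    def profile(self) -> Dict:
        return {"max_length": self.max_length, "min_length": self.min_length}

    def report(self, remove_disabled_flag: bool = False) -> Dict:
        result = dict(self.profile)
        if remove_disabled_flag and not self.track_min:
            # Drop the statistic whose option was disabled.
            result.pop("min_length")
        return result


profiler = TextLengthProfiler("city", track_min=False)
profiler.update(["Oslo", "Lima"]).update(["Copenhagen"])
print(profiler.report(remove_disabled_flag=True))  # {'max_length': 10}
```

In this sketch, update returns the profiler itself, matching the return annotation in the documented signature, so successive batches can be chained.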

class dataprofiler.profilers.base_column_profilers.BaseColumnPrimitiveTypeProfiler(name: Optional[str])

Bases: dataprofiler.profilers.base_column_profilers.BaseColumnProfiler

Abstract class for profiling the primitive data type of a column of data.

Initialize base class properties for the subclass.

Parameters

name (Optional[str]) – Name of the data

col_type = None
diff(other_profile: dataprofiler.profilers.base_column_profilers.BaseColumnProfiler, options: Optional[Dict] = None) → Dict

Find the differences for columns.

Parameters

other_profile (BaseColumnProfiler) – profile to find the difference with

Returns

The differences between the statistics of the two profiles.

Return type

dict

abstract property profile: Dict

Return the profile of the column.

abstract report(remove_disabled_flag: bool = False) → Dict

Return report.

Parameters

remove_disabled_flag (bool) – Whether disabled options should be excluded from the report.

abstract update(df_series: pandas.core.frame.DataFrame) → dataprofiler.profilers.base_column_profilers.BaseColumnProfiler

Update the profile.

Parameters

df_series (pandas.DataFrame) – Data to profile.

name: Optional[str]
sample_size: int
metadata: Dict
times: Dict
thread_safe: bool
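The attributes listed above carry only names and types. As a loose illustration of how such state might be held and updated, the sketch below mirrors them in a small dataclass. ProfilerState and record_batch are hypothetical, and the meanings assumed here (sample_size as a running row count, times as accumulated seconds per label) are guesses for illustration and are not confirmed by this reference.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ProfilerState:
    """Hypothetical container mirroring the attribute names and types above."""

    name: Optional[str] = None
    sample_size: int = 0
    metadata: Dict = field(default_factory=dict)
    times: Dict = field(default_factory=dict)
    thread_safe: bool = True


def record_batch(state: ProfilerState, batch_size: int, label: str) -> None:
    """Assumed bookkeeping: count rows seen and accumulate elapsed seconds."""
    start = time.perf_counter()
    state.sample_size += batch_size
    elapsed = time.perf_counter() - start
    state.times[label] = state.times.get(label, 0.0) + elapsed


state = ProfilerState(name="age")
record_batch(state, 5, "update")
record_batch(state, 3, "update")
print(state.sample_size)  # 8
```

Using default_factory for the two dictionaries gives each instance its own metadata and times rather than a shared mutable default.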