Column Profile Compilers
- class dataprofiler.profilers.column_profile_compilers.BaseCompiler(df_series=None, options=None, pool=None)
Bases: object
- abstract report(remove_disabled_flag=False)
Abstract method for returning report.
- Parameters
remove_disabled_flag (boolean) – flag that determines whether disabled options are excluded from the report.
- property profile
Property for profile. Returns the profile of the column.
- diff(other, options=None)
Finds the difference between two compilers and returns the report.
- Parameters
other (BaseCompiler) – the profile compiler to compare against this one.
- Returns
difference of the profiles
- Return type
dict
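As a rough illustration of what a dictionary of differences can look like, the sketch below compares two flat report dicts key by key. The helper name `diff_reports` and the "unchanged" marker are hypothetical and not part of dataprofiler; the structure of the library's actual diff output may differ.

```python
def diff_reports(this, other):
    """Hypothetical helper: compare two flat report dicts key by key."""
    result = {}
    for key in sorted(set(this) | set(other)):
        if key not in other:
            # Key exists only in the first report.
            result[key] = [this[key], None]
        elif key not in this:
            # Key exists only in the second report.
            result[key] = [None, other[key]]
        elif this[key] == other[key]:
            result[key] = "unchanged"
        elif isinstance(this[key], (int, float)) and isinstance(other[key], (int, float)):
            # Numeric values: report the signed difference (this - other).
            result[key] = this[key] - other[key]
        else:
            # Anything else: report both values side by side.
            result[key] = [this[key], other[key]]
    return result


report_a = {"min": 1, "max": 10, "data_type": "int"}
report_b = {"min": 3, "max": 10, "data_type": "float"}
changes = diff_reports(report_a, report_b)
```

In this sketch, numeric values are subtracted and everything else is reported as a pair, so a reader can tell at a glance which side each value came from.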
- update_profile(df_series, pool=None)
Updates the profiles with data from the given column.
- Parameters
df_series (pandas.core.series.Series) – a given column; its values are assumed to be strings (str)
pool (multiprocessing.Pool) – pool to use for multiprocessing
- Returns
Self
- Return type
BaseCompiler
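The interface described above (an abstract report, a profile property, and an update_profile that returns the compiler itself) can be sketched with the standard library alone. The classes `SketchCompiler` and `CountCompiler` are hypothetical stand-ins, not dataprofiler code, and a plain list of strings takes the place of a pandas Series.

```python
from abc import ABC, abstractmethod


class SketchCompiler(ABC):
    """Hypothetical stand-in for the compiler interface; not dataprofiler code."""

    def __init__(self, values=None):
        self._count = 0
        if values is not None:
            self.update_profile(values)

    @property
    def profile(self):
        # The profile is exposed read-only, mirroring the documented property.
        return {"count": self._count}

    @abstractmethod
    def report(self, remove_disabled_flag=False):
        """Subclasses decide how the profile is presented."""
        raise NotImplementedError

    def update_profile(self, values, pool=None):
        # pool is accepted only to mirror the documented signature; unused here.
        self._count += len(values)
        return self  # returning self allows chained updates


class CountCompiler(SketchCompiler):
    def report(self, remove_disabled_flag=False):
        return dict(self.profile)


compiler = CountCompiler(["1", "2"])
compiler.update_profile(["3"]).update_profile(["4", "5"])
final_report = compiler.report()
```

Returning the instance from update_profile is what makes the chained calls possible; the abstract report keeps the base class from being instantiated directly.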
- class dataprofiler.profilers.column_profile_compilers.ColumnPrimitiveTypeProfileCompiler(df_series=None, options=None, pool=None)
Bases: dataprofiler.profilers.column_profile_compilers.BaseCompiler
- report(remove_disabled_flag=False)
Method for returning report.
- Parameters
remove_disabled_flag (boolean) – flag that determines whether disabled options are excluded from the report.
- property profile
Property for profile. Returns the profile of the column.
- property selected_data_type
Finds the selected data type in a primitive compiler.
- Returns
name of the selected data type
- Return type
str
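One plausible way to pick a data type from several candidates is to take the first one, in priority order, that every value matches. The sketch below is an assumption for illustration only: the candidate order, the matcher functions, and the all-values-must-match rule are hypothetical and may differ from what the library does.

```python
def _is_int(value):
    """Return True if the string parses as an integer."""
    try:
        int(value)
        return True
    except ValueError:
        return False


def _is_float(value):
    """Return True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


# Hypothetical priority order: the most specific type is tried first.
CANDIDATES = [("int", _is_int), ("float", _is_float)]


def select_data_type(values):
    """Return the name of the first candidate type that all values match."""
    for name, matches in CANDIDATES:
        if all(matches(v) for v in values):
            return name
    # Fall back to the most general type when nothing more specific fits.
    return "string"


int_column = select_data_type(["1", "2", "3"])
float_column = select_data_type(["1", "2.5"])
text_column = select_data_type(["1", "abc"])
```

The result is a plain string naming the type, which matches the documented return type of str.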
- diff(other, options=None)
Finds the difference between two compilers and returns the report.
- Parameters
other (ColumnPrimitiveTypeProfileCompiler) – the profile compiler to compare against this one.
- Returns
difference of the profiles
- Return type
dict
- update_profile(df_series, pool=None)
Updates the profiles with data from the given column.
- Parameters
df_series (pandas.core.series.Series) – a given column; its values are assumed to be strings (str)
pool (multiprocessing.Pool) – pool to use for multiprocessing
- Returns
Self
- Return type
BaseCompiler
- class dataprofiler.profilers.column_profile_compilers.ColumnStatsProfileCompiler(df_series=None, options=None, pool=None)
Bases: dataprofiler.profilers.column_profile_compilers.BaseCompiler
- report(remove_disabled_flag=False)
Method for returning report.
- Parameters
remove_disabled_flag (boolean) – flag that determines whether disabled options are excluded from the report.
- diff(other, options=None)
Finds the difference between two compilers and returns the report.
- Parameters
other (ColumnStatsProfileCompiler) – the profile compiler to compare against this one.
- Returns
difference of the profiles
- Return type
dict
- property profile
Property for profile. Returns the profile of the column.
- update_profile(df_series, pool=None)
Updates the profiles with data from the given column.
- Parameters
df_series (pandas.core.series.Series) – a given column; its values are assumed to be strings (str)
pool (multiprocessing.Pool) – pool to use for multiprocessing
- Returns
Self
- Return type
BaseCompiler
- class dataprofiler.profilers.column_profile_compilers.ColumnDataLabelerCompiler(df_series=None, options=None, pool=None)
Bases: dataprofiler.profilers.column_profile_compilers.BaseCompiler
- report(remove_disabled_flag=False)
Method for returning report.
- Parameters
remove_disabled_flag (boolean) – flag that determines whether disabled options are excluded from the report.
- diff(other, options=None)
Finds the difference between two compilers and returns the report.
- Parameters
other (ColumnDataLabelerCompiler) – the profile compiler to compare against this one.
options (dict) – options that change the results of the difference
- Returns
difference of the profiles
- Return type
dict
- property profile
Property for profile. Returns the profile of the column.
- update_profile(df_series, pool=None)
Updates the profiles with data from the given column.
- Parameters
df_series (pandas.core.series.Series) – a given column; its values are assumed to be strings (str)
pool (multiprocessing.Pool) – pool to use for multiprocessing
- Returns
Self
- Return type
BaseCompiler
- class dataprofiler.profilers.column_profile_compilers.UnstructuredCompiler(df_series=None, options=None, pool=None)
Bases: dataprofiler.profilers.column_profile_compilers.BaseCompiler
- report(remove_disabled_flag=False)
Reports on the profile attribute of the class, popping any value from self.profile whose key is not in self.__calculations.
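The filtering step described for this report can be sketched as follows. The function `filter_report` and the sample keys are hypothetical; unlike the description above, this sketch works on a copy rather than popping from the original profile, which is a deliberate simplification.

```python
def filter_report(profile, calculations):
    """Hypothetical helper: keep only the keys listed in calculations."""
    report = dict(profile)  # copy so the caller's dict is not mutated
    for key in list(report):  # list() lets us pop while iterating
        if key not in calculations:
            report.pop(key)
    return report


profile = {"vocab": ["a", "b"], "word_count": {"a": 2}, "times": {"vocab": 0.1}}
enabled = {"vocab", "word_count"}
report = filter_report(profile, enabled)
```

Working on a copy keeps the stored profile intact, so the same profile can be reported again with a different set of enabled calculations.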
- diff(other, options=None)
Finds the difference between two compilers and returns the report.
- Parameters
other (UnstructuredCompiler) – the profile compiler to compare against this one.
options (dict) – options that affect the results of the diff
- Returns
difference of the profiles
- Return type
dict
- property profile
Property for profile. Returns the profile of the column.
- update_profile(df_series, pool=None)
Updates the profiles with data from the given column.
- Parameters
df_series (pandas.core.series.Series) – a given column; its values are assumed to be strings (str)
pool (multiprocessing.Pool) – pool to use for multiprocessing
- Returns
Self
- Return type
BaseCompiler