Column Profile Compilers
-
class dataprofiler.profilers.column_profile_compilers.BaseCompiler(df_series=None, options=None, pool=None)
Bases: object
-
abstract property profile
-
diff(other, options=None)
Finds the difference between two compilers and returns the report.
- Parameters
other (BaseCompiler) – the profile compiler to compare against this one.
- Returns
difference of the profiles
- Return type
dict
-
update_profile(df_series, pool=None)
Updates the profiles with the data in the given column.
- Parameters
df_series (pandas.core.series.Series) – a given column; its values are assumed to be strings
pool (multiprocessing.Pool) – pool to utilize for multiprocessing
- Returns
Self
- Return type
-
abstract property
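The interface above (an abstract profile property plus an update_profile method that returns the compiler itself) can be illustrated with a minimal, library-independent sketch. ToyCompiler and ToyCountCompiler are hypothetical stand-ins, not dataprofiler classes, and a plain list of strings replaces the pandas Series so that the example runs with the standard library alone.

```python
from abc import ABC, abstractmethod
from collections import Counter


class ToyCompiler(ABC):
    """Hypothetical stand-in mirroring the BaseCompiler interface (not library code)."""

    def __init__(self, values=None):
        self._counts = Counter()
        if values is not None:
            self.update_profile(values)

    @property
    @abstractmethod
    def profile(self):
        """Each subclass decides what its compiled profile contains."""

    def update_profile(self, values):
        # Fold a new batch of string values into the running state.
        self._counts.update(values)
        return self  # returning self allows chained updates


class ToyCountCompiler(ToyCompiler):
    @property
    def profile(self):
        return {
            "sample_size": sum(self._counts.values()),
            "unique_count": len(self._counts),
        }


compiler = ToyCountCompiler(["1", "2", "2"]).update_profile(["3"]).update_profile(["1"])
print(compiler.profile)
```

Because update_profile returns the same object, batches can be chained as shown, and the profile is always computed from the accumulated state rather than from the latest batch only.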
-
class dataprofiler.profilers.column_profile_compilers.ColumnPrimitiveTypeProfileCompiler(df_series=None, options=None, pool=None)
Bases: dataprofiler.profilers.column_profile_compilers.BaseCompiler
-
property profile
-
property selected_data_type
Finds the selected data_type in a primitive compiler.
- Returns
name of the selected data type
- Return type
str
-
diff(other, options=None)
Finds the difference between two compilers and returns the report.
- Parameters
other (ColumnPrimitiveTypeProfileCompiler) – the profile compiler to compare against this one.
- Returns
difference of the profiles
- Return type
dict
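As a rough illustration of what a dict-valued difference report can look like, the sketch below compares two flat profile dictionaries key by key. The helper diff_profiles, the profile keys, and the "unchanged" marker are all assumptions made for this example; the actual report layout is defined by the library and is not reproduced here.

```python
def diff_profiles(profile_a, profile_b):
    """Return a per-key report of how two flat profile dicts differ."""
    report = {}
    for key in sorted(set(profile_a) | set(profile_b)):
        a, b = profile_a.get(key), profile_b.get(key)
        if a == b:
            report[key] = "unchanged"  # assumed marker for equal values
        elif isinstance(a, (int, float)) and isinstance(b, (int, float)):
            report[key] = a - b  # numeric values: signed difference
        else:
            report[key] = [a, b]  # anything else: show both sides
    return report


# Hypothetical profiles of two primitive-type compilers.
this_profile = {"data_type": "int", "min": 1, "max": 9}
other_profile = {"data_type": "float", "min": 1, "max": 12.5}

report = diff_profiles(this_profile, other_profile)
print(report)
```

The difference is directional: swapping the two arguments flips the sign of numeric entries and the order of paired values.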
-
update_profile(df_series, pool=None)
Updates the profiles with the data in the given column.
- Parameters
df_series (pandas.core.series.Series) – a given column; its values are assumed to be strings
pool (multiprocessing.Pool) – pool to utilize for multiprocessing
- Returns
Self
- Return type
-
property
-
class dataprofiler.profilers.column_profile_compilers.ColumnStatsProfileCompiler(df_series=None, options=None, pool=None)
Bases: dataprofiler.profilers.column_profile_compilers.BaseCompiler
-
property profile
-
diff(other, options=None)
Finds the difference between two compilers and returns the report.
- Parameters
other (BaseCompiler) – the profile compiler to compare against this one.
- Returns
difference of the profiles
- Return type
dict
-
update_profile(df_series, pool=None)
Updates the profiles with the data in the given column.
- Parameters
df_series (pandas.core.series.Series) – a given column; its values are assumed to be strings
pool (multiprocessing.Pool) – pool to utilize for multiprocessing
- Returns
Self
- Return type
-
property
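The optional pool argument suggests that independent per-column computations can be handed to worker processes. The sketch below shows that general pattern only and is not the library's scheduling logic: update_all, count_digits, and longest are hypothetical helpers, and multiprocessing.pool.ThreadPool (which exposes the same apply_async interface as multiprocessing.Pool) is substituted so the example runs without a main-module guard.

```python
from multiprocessing.pool import ThreadPool


def count_digits(values):
    # Number of values made up entirely of digits.
    return sum(v.isdigit() for v in values)


def longest(values):
    # Length of the longest string, or 0 for an empty column.
    return max((len(v) for v in values), default=0)


def update_all(values, pool=None):
    """Run each statistic serially, or through the pool when one is given."""
    tasks = {"digit_count": count_digits, "max_length": longest}
    if pool is None:
        return {name: fn(values) for name, fn in tasks.items()}
    # Submit every task first, then collect, so the tasks can overlap.
    pending = {name: pool.apply_async(fn, (values,)) for name, fn in tasks.items()}
    return {name: result.get() for name, result in pending.items()}


column = ["12", "abc", "7", "hello"]
with ThreadPool(2) as pool:
    parallel = update_all(column, pool=pool)
serial = update_all(column)
print(parallel == serial)
```

Keeping a serial fallback for pool=None mirrors the default in the signatures above, so callers only pay for parallelism when they supply a pool.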
-
class dataprofiler.profilers.column_profile_compilers.ColumnDataLabelerCompiler(df_series=None, options=None, pool=None)
Bases: dataprofiler.profilers.column_profile_compilers.BaseCompiler
-
property profile
-
diff(other, options=None)
Finds the difference between two compilers and returns the report.
- Parameters
other (BaseCompiler) – the profile compiler to compare against this one.
- Returns
difference of the profiles
- Return type
dict
-
update_profile(df_series, pool=None)
Updates the profiles with the data in the given column.
- Parameters
df_series (pandas.core.series.Series) – a given column; its values are assumed to be strings
pool (multiprocessing.Pool) – pool to utilize for multiprocessing
- Returns
Self
- Return type
-
property
-
class dataprofiler.profilers.column_profile_compilers.UnstructuredCompiler(df_series=None, options=None, pool=None)
Bases: dataprofiler.profilers.column_profile_compilers.BaseCompiler
-
property profile
-
diff(other, options=None)
Finds the difference between two compilers and returns the report.
- Parameters
other (BaseCompiler) – the profile compiler to compare against this one.
- Returns
difference of the profiles
- Return type
dict
-
update_profile(df_series, pool=None)
Updates the profiles with the data in the given column.
- Parameters
df_series (pandas.core.series.Series) – a given column; its values are assumed to be strings
pool (multiprocessing.Pool) – pool to utilize for multiprocessing
- Returns
Self
- Return type
-
property