Data Labeler Column Profile
-
class dataprofiler.profilers.data_labeler_column_profile.DataLabelerColumn(name, options=None)
Bases: dataprofiler.profilers.base_column_profilers.BaseColumnProfiler
Initialization of Data Label profiling for structured datasets.
- Parameters
data_labeler_dirpath (String) – Directory path to the data labeler
options (DataLabelerOptions) – Options for the data labeler column
-
col_type = 'data_labeler'
-
static assert_equal_conditions(data_labeler, data_labeler2)
Ensures that the two data labelers have the same values. Raises an error otherwise.
- Parameters
data_labeler (DataLabelerColumn) – first data_labeler
data_labeler2 (DataLabelerColumn) – second data_labeler
- Returns
None
-
property data_label
Returns the data label(s) that best fit the data seen so far, based on the DataLabeler used. A label is included only if its probability is within the minimum probability differential of the top predicted label. If no label's probability exceeds the minimum top-label value, the property reports that it could not determine the data label.
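The selection rule described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: the function name `pick_data_label`, both threshold values, the `|` separator for multiple labels, the `None` result for empty input, and the exact fallback string are all hypothetical choices made for this example.

```python
def pick_data_label(avg_predictions, min_top_label_prob=0.35,
                    min_prob_differential=0.2):
    """Return the best-fitting label(s) for a column.

    `avg_predictions` maps label name -> average predicted probability.
    Both thresholds are illustrative values chosen for this sketch.
    """
    if not avg_predictions:
        return None  # assumption: nothing has been profiled yet

    # Rank labels from most to least probable.
    ranked = sorted(avg_predictions.items(), key=lambda kv: kv[1],
                    reverse=True)
    top_label, top_prob = ranked[0]

    # The top label must clear the minimum top-label value.
    if top_prob <= min_top_label_prob:
        return "could not determine"

    # Keep every label whose probability is close enough to the top one.
    chosen = [top_label]
    for label, prob in ranked[1:]:
        if top_prob - prob < min_prob_differential:
            chosen.append(label)
    return "|".join(chosen)


single = pick_data_label({"ADDRESS": 0.6, "PERSON": 0.3, "UNKNOWN": 0.1})
tied = pick_data_label({"ADDRESS": 0.5, "PERSON": 0.4, "UNKNOWN": 0.1})
unclear = pick_data_label({"ADDRESS": 0.34, "PERSON": 0.33, "UNKNOWN": 0.33})
```

In this sketch, `single` contains only the top label, `tied` contains the two labels that fall within the differential, and `unclear` is the fallback string because no label clears the minimum top-label value.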
-
property avg_predictions
Averages all sample predictions for each data label.
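As a rough illustration of this averaging, the sketch below sums the per-label confidences and divides by the number of samples. The function name `average_predictions` and the input layout (one dict of label to confidence per sample) are assumptions made for this example, not the library's internal representation.

```python
from collections import defaultdict


def average_predictions(sample_predictions):
    """Average per-label confidences over all samples.

    `sample_predictions` is a hypothetical input: one dict per sample,
    mapping label name -> predicted confidence.
    """
    if not sample_predictions:
        return {}

    # Accumulate the confidence assigned to each label across samples.
    sums = defaultdict(float)
    for prediction in sample_predictions:
        for label, confidence in prediction.items():
            sums[label] += confidence

    # Divide each sum by the number of samples seen.
    n_samples = len(sample_predictions)
    return {label: total / n_samples for label, total in sums.items()}


example = [
    {"ADDRESS": 0.5, "UNKNOWN": 0.5},
    {"ADDRESS": 1.0, "UNKNOWN": 0.0},
]
averages = average_predictions(example)
```

Keeping running sums and a sample count, rather than every individual prediction, is one way such an average could be maintained across repeated updates.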
-
property label_representation
Representation of the labels found within the dataset, based on ranked voting. When top_k=1, this is simply the distribution of data labels found within the dataset.
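For the top_k=1 case, the distribution can be sketched as the fraction of samples whose highest-scoring label is each label. The function name `label_distribution` and the per-sample dict layout are assumptions for this example. How votes are weighted when top_k is greater than 1 is not described here, so the sketch does not attempt it.

```python
from collections import Counter


def label_distribution(sample_predictions):
    """Fraction of samples whose top-ranked label is each label (top_k=1).

    `sample_predictions` is a hypothetical input: one dict per sample,
    mapping label name -> predicted confidence. Labels that never rank
    first do not appear in the result.
    """
    if not sample_predictions:
        return {}

    # Each sample casts one vote for its highest-confidence label.
    votes = Counter(max(p, key=p.get) for p in sample_predictions)
    total = len(sample_predictions)
    return {label: count / total for label, count in votes.items()}


example = [
    {"ADDRESS": 0.9, "UNKNOWN": 0.1},
    {"ADDRESS": 0.6, "UNKNOWN": 0.4},
    {"ADDRESS": 0.2, "UNKNOWN": 0.8},
    {"ADDRESS": 0.7, "UNKNOWN": 0.3},
]
distribution = label_distribution(example)
```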
-
property profile
Returns the profile of the column.
-
update(df_series)
Updates the column profile.
- Parameters
df_series (pandas.core.series.Series) – pandas Series of column values used to update the profile
- Returns
None