Profiler Options¶
Specify the options when running the data profiler.
- class dataprofiler.profilers.profiler_options.BaseOption¶
Bases:
object
For configuring options.
- property properties: Dict[str, dataprofiler.profilers.profiler_options.BooleanOption]¶
Return a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
- set(options: Dict[str, bool]) None ¶
Set all the options.
Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
- validate(raise_error: bool = True) Optional[List[str]] ¶
Validate the options do not conflict and cause errors.
Raises error/warning if so.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
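The three members above can be pictured with a small stand-in class. This is only a sketch of the documented behavior, not the library's implementation: `SketchOption` and its error messages are hypothetical, and only the method names, signatures, and the copy/raise/return semantics come from the descriptions above.

```python
from typing import Dict, List, Optional


class SketchOption:
    """Hypothetical stand-in; not the library's BaseOption."""

    def __init__(self, is_enabled: bool = True) -> None:
        self.is_enabled = is_enabled

    @property
    def properties(self) -> Dict[str, object]:
        # A shallow copy: editing the returned dict leaves the option untouched.
        return dict(vars(self))

    def set(self, options: Dict[str, bool]) -> None:
        # Accepts all of, or a subset of, the known option names.
        if not isinstance(options, dict):
            raise ValueError("options must be a dict")
        for name, value in options.items():
            if name not in vars(self):
                raise AttributeError(f"unknown option: {name}")
            setattr(self, name, value)

    def validate(self, raise_error: bool = True) -> Optional[List[str]]:
        errors: List[str] = []
        if not isinstance(self.is_enabled, bool):
            errors.append("is_enabled must be a bool")
        if raise_error:
            if errors:
                raise ValueError("\n".join(errors))
            return None
        # With raise_error=False the problems are returned instead of raised.
        return errors


option = SketchOption()
option.set({"is_enabled": False})
snapshot = option.properties
snapshot["is_enabled"] = True           # only the copy changes
option.is_enabled = "yes"               # deliberately invalid value
problems = option.validate(raise_error=False)
```

Returning the error list instead of raising lets a caller collect every problem in one pass, which is presumably why `raise_error` exists.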
- class dataprofiler.profilers.profiler_options.BooleanOption(is_enabled: bool = True)¶
Bases:
dataprofiler.profilers.profiler_options.BaseOption
For setting Boolean options.
Initialize Boolean option.
- Variables
is_enabled (bool) – boolean option to enable/disable the option.
- property properties: Dict[str, dataprofiler.profilers.profiler_options.BooleanOption]¶
Return a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
- set(options: Dict[str, bool]) None ¶
Set all the options.
Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
- validate(raise_error: bool = True) Optional[List[str]] ¶
Validate the options do not conflict and cause errors.
Raises error/warning if so.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
- class dataprofiler.profilers.profiler_options.HistogramOption(is_enabled: bool = True, bin_count_or_method: Union[str, int, List[str]] = 'auto')¶
Bases:
dataprofiler.profilers.profiler_options.BooleanOption
For setting histogram options.
Initialize Options for histograms.
- Variables
is_enabled (bool) – boolean option to enable/disable the option.
bin_count_or_method (Union[str, int, list(str)]) – bin count or the method with which to calculate histograms
- property properties: Dict[str, dataprofiler.profilers.profiler_options.BooleanOption]¶
Return a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
- set(options: Dict[str, bool]) None ¶
Set all the options.
Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
- validate(raise_error: bool = True) Optional[List[str]] ¶
Validate the options do not conflict and cause errors.
Raises error/warning if so.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
- class dataprofiler.profilers.profiler_options.ModeOption(is_enabled: bool = True, max_k_modes: int = 5)¶
Bases:
dataprofiler.profilers.profiler_options.BooleanOption
For setting mode estimation options.
Initialize Options for mode estimation.
- Variables
is_enabled (bool) – boolean option to enable/disable the option.
top_k_modes (int) – the max number of modes to return, if applicable
- property properties: Dict[str, dataprofiler.profilers.profiler_options.BooleanOption]¶
Return a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
- set(options: Dict[str, bool]) None ¶
Set all the options.
Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
- validate(raise_error: bool = True) Optional[List[str]] ¶
Validate the options do not conflict and cause errors.
Raises error/warning if so.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
- class dataprofiler.profilers.profiler_options.BaseInspectorOptions(is_enabled: bool = True)¶
Bases:
dataprofiler.profilers.profiler_options.BooleanOption
For setting Base options.
Initialize Base options for all the columns.
- Variables
is_enabled (bool) – boolean option to enable/disable the column.
- is_prop_enabled(prop: str) bool ¶
Check whether a property is enabled and return the result as a boolean.
- Parameters
prop (String) – The option to check if it is enabled
- Returns
Whether or not the property is enabled
- Return type
Boolean
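A lookup of this kind might be sketched as follows. The helper name `sketch_is_prop_enabled` is hypothetical, as is the assumption that a property is either a plain bool or an object exposing `is_enabled`; the library's own logic may differ.

```python
from types import SimpleNamespace


def sketch_is_prop_enabled(properties: dict, prop: str) -> bool:
    """Hypothetical helper: report whether one named property is enabled."""
    if prop not in properties:
        raise AttributeError(f"unknown property: {prop}")
    option = properties[prop]
    if isinstance(option, bool):
        return option
    # Assumption: non-boolean options expose an is_enabled attribute
    # and count as enabled when they do not.
    return bool(getattr(option, "is_enabled", True))


props = {
    "is_enabled": True,                        # plain boolean flag
    "min": SimpleNamespace(is_enabled=False),  # BooleanOption-like object
}
```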
- property properties: Dict[str, dataprofiler.profilers.profiler_options.BooleanOption]¶
Return a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
- set(options: Dict[str, bool]) None ¶
Set all the options.
Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
- validate(raise_error: bool = True) Optional[List[str]] ¶
Validate the options do not conflict and cause errors.
Raises error/warning if so.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
- class dataprofiler.profilers.profiler_options.NumericalOptions¶
Bases:
dataprofiler.profilers.profiler_options.BaseInspectorOptions
For configuring options for Numerical Stats Mixin.
Initialize Options for the Numerical Stats Mixin.
- Variables
is_enabled (bool) – boolean option to enable/disable the column.
min (BooleanOption) – boolean option to enable/disable min
max (BooleanOption) – boolean option to enable/disable max
mode (ModeOption) – option to enable/disable mode and set return count
median (BooleanOption) – option to enable/disable median
sum (BooleanOption) – boolean option to enable/disable sum
variance (BooleanOption) – boolean option to enable/disable variance
skewness (BooleanOption) – boolean option to enable/disable skewness
kurtosis (BooleanOption) – boolean option to enable/disable kurtosis
histogram_and_quantiles (BooleanOption) – boolean option to enable/disable histogram_and_quantiles
bias_correction (BooleanOption) – boolean option to enable/disable existence of bias
num_zeros (BooleanOption) – boolean option to enable/disable num_zeros
num_negatives (BooleanOption) – boolean option to enable/disable num_negatives
is_numeric_stats_enabled (bool) – boolean to enable/disable all numeric stats
- property is_numeric_stats_enabled: bool¶
Return the state of numeric stats being enabled / disabled.
If any numeric stats property is enabled it will return True, otherwise it will return False.
- Returns
true if any numeric stats property is enabled, otherwise false
- Return type
bool
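The "any property enabled" rule can be written in one line. The sketch below uses a hypothetical `SketchFlag` class in place of `BooleanOption` and an assumed subset of the statistics listed above; it illustrates the rule and is not the library's code.

```python
class SketchFlag:
    """Hypothetical stand-in for a BooleanOption-like object."""

    def __init__(self, is_enabled: bool = True) -> None:
        self.is_enabled = is_enabled


# Assumed subset of the numeric statistics from the Variables list.
NUMERIC_STATS = ("min", "max", "sum", "variance")


def sketch_any_numeric_stat_enabled(options: dict) -> bool:
    # True as soon as one statistic is enabled, False only if all are off.
    return any(options[name].is_enabled for name in NUMERIC_STATS)


stats = {name: SketchFlag() for name in NUMERIC_STATS}
all_on = sketch_any_numeric_stat_enabled(stats)

for name in ("min", "max", "sum"):
    stats[name].is_enabled = False
one_on = sketch_any_numeric_stat_enabled(stats)

stats["variance"].is_enabled = False
none_on = sketch_any_numeric_stat_enabled(stats)
```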
- property properties: Dict[str, dataprofiler.profilers.profiler_options.BooleanOption]¶
Include is_enabled.
is_enabled: Turns on or off the column.
- is_prop_enabled(prop: str) bool ¶
Check whether a property is enabled and return the result as a boolean.
- Parameters
prop (String) – The option to check if it is enabled
- Returns
Whether or not the property is enabled
- Return type
Boolean
- set(options: Dict[str, bool]) None ¶
Set all the options.
Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
- validate(raise_error: bool = True) Optional[List[str]] ¶
Validate the options do not conflict and cause errors.
Raises error/warning if so.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
- class dataprofiler.profilers.profiler_options.IntOptions¶
Bases:
dataprofiler.profilers.profiler_options.NumericalOptions
For configuring options for Int Column.
Initialize Options for the Int Column.
- Variables
is_enabled (bool) – boolean option to enable/disable the column.
min (BooleanOption) – boolean option to enable/disable min
max (BooleanOption) – boolean option to enable/disable max
mode (ModeOption) – option to enable/disable mode and set return count
median (BooleanOption) – option to enable/disable median
sum (BooleanOption) – boolean option to enable/disable sum
variance (BooleanOption) – boolean option to enable/disable variance
skewness (BooleanOption) – boolean option to enable/disable skewness
kurtosis (BooleanOption) – boolean option to enable/disable kurtosis
histogram_and_quantiles (BooleanOption) – boolean option to enable/disable histogram_and_quantiles
bias_correction (BooleanOption) – boolean option to enable/disable existence of bias
num_zeros (BooleanOption) – boolean option to enable/disable num_zeros
num_negatives (BooleanOption) – boolean option to enable/disable num_negatives
is_numeric_stats_enabled (bool) – boolean to enable/disable all numeric stats
- property is_numeric_stats_enabled: bool¶
Return the state of numeric stats being enabled / disabled.
If any numeric stats property is enabled it will return True, otherwise it will return False.
- Returns
true if any numeric stats property is enabled, otherwise false
- Return type
bool
- is_prop_enabled(prop: str) bool ¶
Check whether a property is enabled and return the result as a boolean.
- Parameters
prop (String) – The option to check if it is enabled
- Returns
Whether or not the property is enabled
- Return type
Boolean
- property properties: Dict[str, dataprofiler.profilers.profiler_options.BooleanOption]¶
Include is_enabled.
is_enabled: Turns on or off the column.
- set(options: Dict[str, bool]) None ¶
Set all the options.
Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
- validate(raise_error: bool = True) Optional[List[str]] ¶
Validate the options do not conflict and cause errors.
Raises error/warning if so.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
- class dataprofiler.profilers.profiler_options.PrecisionOptions(is_enabled: bool = True, sample_ratio: Optional[float] = None)¶
Bases:
dataprofiler.profilers.profiler_options.BooleanOption
For configuring options for precision.
Initialize Options for precision.
- Variables
is_enabled (bool) – boolean option to enable/disable the column.
sample_ratio (float) – float option to determine ratio of valid float samples in determining precision. This ratio will override any defaults.
- property properties: Dict[str, dataprofiler.profilers.profiler_options.BooleanOption]¶
Return a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
- set(options: Dict[str, bool]) None ¶
Set all the options.
Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
- validate(raise_error: bool = True) Optional[List[str]] ¶
Validate the options do not conflict and cause errors.
Raises error/warning if so.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
- class dataprofiler.profilers.profiler_options.FloatOptions¶
Bases:
dataprofiler.profilers.profiler_options.NumericalOptions
For configuring options for Float Column.
Initialize Options for the Float Column.
- Variables
is_enabled (bool) – boolean option to enable/disable the column.
min (BooleanOption) – boolean option to enable/disable min
max (BooleanOption) – boolean option to enable/disable max
mode (ModeOption) – option to enable/disable mode and set return count
median (BooleanOption) – option to enable/disable median
sum (BooleanOption) – boolean option to enable/disable sum
variance (BooleanOption) – boolean option to enable/disable variance
skewness (BooleanOption) – boolean option to enable/disable skewness
kurtosis (BooleanOption) – boolean option to enable/disable kurtosis
histogram_and_quantiles (BooleanOption) – boolean option to enable/disable histogram_and_quantiles
bias_correction (BooleanOption) – boolean option to enable/disable existence of bias
num_zeros (BooleanOption) – boolean option to enable/disable num_zeros
num_negatives (BooleanOption) – boolean option to enable/disable num_negatives
is_numeric_stats_enabled (bool) – boolean to enable/disable all numeric stats
- property is_numeric_stats_enabled: bool¶
Return the state of numeric stats being enabled / disabled.
If any numeric stats property is enabled it will return True, otherwise it will return False.
- Returns
true if any numeric stats property is enabled, otherwise false
- Return type
bool
- is_prop_enabled(prop: str) bool ¶
Check whether a property is enabled and return the result as a boolean.
- Parameters
prop (String) – The option to check if it is enabled
- Returns
Whether or not the property is enabled
- Return type
Boolean
- property properties: Dict[str, dataprofiler.profilers.profiler_options.BooleanOption]¶
Include is_enabled.
is_enabled: Turns on or off the column.
- set(options: Dict[str, bool]) None ¶
Set all the options.
Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
- validate(raise_error: bool = True) Optional[List[str]] ¶
Validate the options do not conflict and cause errors.
Raises error/warning if so.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
- class dataprofiler.profilers.profiler_options.TextOptions¶
Bases:
dataprofiler.profilers.profiler_options.NumericalOptions
For configuring options for Text Column.
Initialize Options for the Text Column.
- Variables
is_enabled (bool) – boolean option to enable/disable the column.
vocab (BooleanOption) – boolean option to enable/disable vocab
min (BooleanOption) – boolean option to enable/disable min
max (BooleanOption) – boolean option to enable/disable max
mode (ModeOption) – option to enable/disable mode and set return count
median (BooleanOption) – option to enable/disable median
sum (BooleanOption) – boolean option to enable/disable sum
variance (BooleanOption) – boolean option to enable/disable variance
skewness (BooleanOption) – boolean option to enable/disable skewness
kurtosis (BooleanOption) – boolean option to enable/disable kurtosis
bias_correction (BooleanOption) – boolean option to enable/disable existence of bias
histogram_and_quantiles (BooleanOption) – boolean option to enable/disable histogram_and_quantiles
num_zeros (BooleanOption) – boolean option to enable/disable num_zeros
num_negatives (BooleanOption) – boolean option to enable/disable num_negatives
is_numeric_stats_enabled (bool) – boolean to enable/disable all numeric stats
- property is_numeric_stats_enabled: bool¶
Return the state of numeric stats being enabled / disabled.
If any numeric stats property is enabled it will return True, otherwise it will return False. Although this getter may seem redundant, it must be defined for the setter of the same name, is_numeric_stats_enabled, to work properly.
- Returns
true if any numeric stats property is enabled, otherwise false
- Return type
bool
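The getter/setter dependency described above is a general Python rule: a setter is attached through the property object, so that property must already exist. A minimal sketch, using a hypothetical `SketchNumericOptions` class with plain bools in place of `BooleanOption` objects:

```python
class SketchNumericOptions:
    """Hypothetical stand-in; plain bools replace BooleanOption objects."""

    _STATS = ("min", "max", "sum")  # assumed subset of the numeric stats

    def __init__(self) -> None:
        for name in self._STATS:
            setattr(self, name, True)

    @property
    def is_numeric_stats_enabled(self) -> bool:
        # Getter: True if at least one statistic is switched on.
        return any(getattr(self, name) for name in self._STATS)

    # The setter decorator hangs off the property object defined above,
    # which is why the getter has to be declared first.
    @is_numeric_stats_enabled.setter
    def is_numeric_stats_enabled(self, value: bool) -> None:
        # Setter: switch every statistic on or off at once.
        for name in self._STATS:
            setattr(self, name, value)


opts = SketchNumericOptions()
opts.is_numeric_stats_enabled = False   # turns off min, max and sum together
```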
- is_prop_enabled(prop: str) bool ¶
Check whether a property is enabled and return the result as a boolean.
- Parameters
prop (String) – The option to check if it is enabled
- Returns
Whether or not the property is enabled
- Return type
Boolean
- property properties: Dict[str, dataprofiler.profilers.profiler_options.BooleanOption]¶
Include is_enabled.
is_enabled: Turns on or off the column.
- set(options: Dict[str, bool]) None ¶
Set all the options.
Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
- validate(raise_error: bool = True) Optional[List[str]] ¶
Validate the options do not conflict and cause errors.
Raises error/warning if so.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
- class dataprofiler.profilers.profiler_options.DateTimeOptions¶
Bases:
dataprofiler.profilers.profiler_options.BaseInspectorOptions
For configuring options for Datetime Column.
Initialize Options for the Datetime Column.
- Variables
is_enabled (bool) – boolean option to enable/disable the column.
- is_prop_enabled(prop: str) bool ¶
Check whether a property is enabled and return the result as a boolean.
- Parameters
prop (String) – The option to check if it is enabled
- Returns
Whether or not the property is enabled
- Return type
Boolean
- property properties: Dict[str, dataprofiler.profilers.profiler_options.BooleanOption]¶
Return a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
- set(options: Dict[str, bool]) None ¶
Set all the options.
Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
- validate(raise_error: bool = True) Optional[List[str]] ¶
Validate the options do not conflict and cause errors.
Raises error/warning if so.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
- class dataprofiler.profilers.profiler_options.OrderOptions¶
Bases:
dataprofiler.profilers.profiler_options.BaseInspectorOptions
For configuring options for Order Column.
Initialize options for the Order Column.
- Variables
is_enabled (bool) – boolean option to enable/disable the column.
- is_prop_enabled(prop: str) bool ¶
Check whether a property is enabled and return the result as a boolean.
- Parameters
prop (String) – The option to check if it is enabled
- Returns
Whether or not the property is enabled
- Return type
Boolean
- property properties: Dict[str, dataprofiler.profilers.profiler_options.BooleanOption]¶
Return a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
- set(options: Dict[str, bool]) None ¶
Set all the options.
Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
- validate(raise_error: bool = True) Optional[List[str]] ¶
Validate the options do not conflict and cause errors.
Raises error/warning if so.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
- class dataprofiler.profilers.profiler_options.CategoricalOptions(is_enabled: bool = True, top_k_categories: Optional[int] = None)¶
Bases:
dataprofiler.profilers.profiler_options.BaseInspectorOptions
For configuring options for Categorical Column.
Initialize options for the Categorical Column.
- Variables
is_enabled (bool) – boolean option to enable/disable the column.
top_k_categories (Union[None, int]) – number of categories to be displayed when called
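To make the effect of a top-k limit concrete, the sketch below counts categories with the standard library. It assumes that `None` means "no limit", which the text above does not state, and `sketch_top_categories` is a hypothetical helper rather than anything the profiler exposes.

```python
from collections import Counter
from typing import Iterable, List, Optional, Tuple


def sketch_top_categories(
    values: Iterable[str], top_k_categories: Optional[int] = None
) -> List[Tuple[str, int]]:
    # Counter.most_common(None) returns every category, most frequent first.
    return Counter(values).most_common(top_k_categories)


column = ["a", "b", "a", "c", "a", "b"]
top_two = sketch_top_categories(column, top_k_categories=2)
everything = sketch_top_categories(column)
```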
- is_prop_enabled(prop: str) bool ¶
Check whether a property is enabled and return the result as a boolean.
- Parameters
prop (String) – The option to check if it is enabled
- Returns
Whether or not the property is enabled
- Return type
Boolean
- property properties: Dict[str, dataprofiler.profilers.profiler_options.BooleanOption]¶
Return a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
- set(options: Dict[str, bool]) None ¶
Set all the options.
Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
- validate(raise_error: bool = True) Optional[List[str]] ¶
Validate the options do not conflict and cause errors.
Raises error/warning if so.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
- class dataprofiler.profilers.profiler_options.CorrelationOptions(is_enabled: bool = False, columns: Optional[List[str]] = None)¶
Bases:
dataprofiler.profilers.profiler_options.BaseInspectorOptions
For configuring options for Correlation between Columns.
Initialize options for the Correlation between Columns.
- Variables
is_enabled (bool) – boolean option to enable/disable.
columns (list(str)) – columns considered when calculating correlation
- is_prop_enabled(prop: str) bool ¶
Check whether a property is enabled and return the result as a boolean.
- Parameters
prop (String) – The option to check if it is enabled
- Returns
Whether or not the property is enabled
- Return type
Boolean
- property properties: Dict[str, dataprofiler.profilers.profiler_options.BooleanOption]¶
Return a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
- set(options: Dict[str, bool]) None ¶
Set all the options.
Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
- validate(raise_error: bool = True) Optional[List[str]] ¶
Validate the options do not conflict and cause errors.
Raises error/warning if so.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
- class dataprofiler.profilers.profiler_options.DataLabelerOptions¶
Bases:
dataprofiler.profilers.profiler_options.BaseInspectorOptions
For configuring options for Data Labeler Column.
Initialize options for the Data Labeler Column.
- Variables
is_enabled (bool) – boolean option to enable/disable the column.
data_labeler_dirpath (str) – String to load data labeler
max_sample_size (int) – Int to decide sample size
data_labeler_object (BaseDataLabeler) – DataLabeler object used in profiler
- property properties: Dict¶
Return a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
- is_prop_enabled(prop: str) bool ¶
Check whether a property is enabled and return the result as a boolean.
- Parameters
prop (String) – The option to check if it is enabled
- Returns
Whether or not the property is enabled
- Return type
Boolean
- set(options: Dict[str, bool]) None ¶
Set all the options.
Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
- validate(raise_error: bool = True) Optional[List[str]] ¶
Validate the options do not conflict and cause errors.
Raises error/warning if so.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
- class dataprofiler.profilers.profiler_options.TextProfilerOptions(is_enabled: bool = True, is_case_sensitive: bool = True, stop_words: Optional[Set[str]] = None, top_k_chars: Optional[int] = None, top_k_words: Optional[int] = None)¶
Bases:
dataprofiler.profilers.profiler_options.BaseInspectorOptions
For configuring options for text profiler.
Construct the TextProfilerOption object with default values.
- Variables
is_enabled (bool) – boolean option to enable/disable the option.
is_case_sensitive (bool) – option set for case sensitivity.
stop_words (Union[None, list(str)]) – option set for stop words.
top_k_chars (Union[None, int]) – option set for number of top common characters.
top_k_words (Union[None, int]) – option set for number of top common words.
words (BooleanOption) – option set for word update.
vocab (BooleanOption) – option set for vocab update.
- is_prop_enabled(prop: str) bool ¶
Check whether a property is enabled and return the result as a boolean.
- Parameters
prop (String) – The option to check if it is enabled
- Returns
Whether or not the property is enabled
- Return type
Boolean
- property properties: Dict[str, dataprofiler.profilers.profiler_options.BooleanOption]¶
Return a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
- set(options: Dict[str, bool]) None ¶
Set all the options.
Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
- validate(raise_error: bool = True) Optional[List[str]] ¶
Validate the options do not conflict and cause errors.
Raises error/warning if so.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
- class dataprofiler.profilers.profiler_options.StructuredOptions(null_values: Optional[Dict] = None)¶
Bases:
dataprofiler.profilers.profiler_options.BaseOption
For configuring options for structured profiler.
Construct the StructuredOptions object with default values.
- Parameters
null_values – null values we input.
- Variables
int (IntOptions) – option set for int profiling.
float (FloatOptions) – option set for float profiling.
datetime (DateTimeOptions) – option set for datetime profiling.
text (TextOptions) – option set for text profiling.
order (OrderOptions) – option set for order profiling.
category (CategoricalOptions) – option set for category profiling.
data_labeler (DataLabelerOptions) – option set for data_labeler profiling.
correlation (CorrelationOptions) – option set for correlation profiling.
chi2_homogeneity (BooleanOption) – option set for chi2_homogeneity matrix
null_replication_metrics (BooleanOption) – option set for metrics calculation for replicating NaN values
null_values (Union[None, dict]) – option set for defined null values
- property enabled_profiles: List[str]¶
Return a list of the enabled profilers for columns.
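Conceptually, this property filters the per-column option sets by their `is_enabled` flag. The sketch below shows that idea with hypothetical stand-in objects; which option names the library actually considers is not stated here, so the three names used are only examples from the Variables list.

```python
from types import SimpleNamespace


def sketch_enabled_profiles(column_options: dict) -> list:
    # Keep the names whose option object is switched on, in dict order.
    return [name for name, option in column_options.items() if option.is_enabled]


column_options = {
    "int": SimpleNamespace(is_enabled=True),
    "float": SimpleNamespace(is_enabled=False),
    "text": SimpleNamespace(is_enabled=True),
}
enabled = sketch_enabled_profiles(column_options)
```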
- property properties: Dict[str, dataprofiler.profilers.profiler_options.BooleanOption]¶
Return a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
- set(options: Dict[str, bool]) None ¶
Set all the options.
Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
- validate(raise_error: bool = True) Optional[List[str]] ¶
Validate the options do not conflict and cause errors.
Raises error/warning if so.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
- class dataprofiler.profilers.profiler_options.UnstructuredOptions¶
Bases:
dataprofiler.profilers.profiler_options.BaseOption
For configuring options for unstructured profiler.
Construct the UnstructuredOptions object with default values.
- Variables
text (TextProfilerOptions) – option set for text profiling.
data_labeler (DataLabelerOptions) – option set for data_labeler profiling.
- property enabled_profiles: List[str]¶
Return a list of the enabled profilers.
- property properties: Dict[str, dataprofiler.profilers.profiler_options.BooleanOption]¶
Return a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
- set(options: Dict[str, bool]) None ¶
Set all the options.
Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
- validate(raise_error: bool = True) Optional[List[str]] ¶
Validate the options do not conflict and cause errors.
Raises error/warning if so.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
- class dataprofiler.profilers.profiler_options.ProfilerOptions(presets: Optional[str] = None)¶
Bases:
dataprofiler.profilers.profiler_options.BaseOption
For configuring options for profiler.
Initialize the ProfilerOptions object.
- Variables
structured_options (StructuredOptions) – option set for structured dataset profiling.
unstructured_options (UnstructuredOptions) – option set for unstructured dataset profiling.
- property properties: Dict[str, dataprofiler.profilers.profiler_options.BooleanOption]¶
Return a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
- validate(raise_error: bool = True) Optional[List[str]] ¶
Validate the options do not conflict and cause errors.
Raises error/warning if so.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
- set(options: Dict[str, bool]) None ¶
Overwrite BaseOption.set.
We do this because the type (unstructured/structured) may need to be specified if the same options exist within both self.structured_options and self.unstructured_options
- Parameters
options (dict) – Dictionary of options to set
- Returns
None
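The ambiguity mentioned above arises because, per the Variables lists of StructuredOptions and UnstructuredOptions, both contain `text` and `data_labeler`. The sketch below shows one way such a clash could be detected. It is not the library's implementation: the dotted key format, the helper `sketch_qualify`, and the choice to auto-prefix unambiguous keys are all assumptions made for illustration.

```python
from typing import Dict, Set

# Option names taken from the Variables lists of StructuredOptions and
# UnstructuredOptions; "text" and "data_labeler" appear in both.
SECTION_OPTIONS: Dict[str, Set[str]] = {
    "structured_options": {
        "int", "float", "datetime", "text", "order",
        "category", "data_labeler", "correlation",
    },
    "unstructured_options": {"text", "data_labeler"},
}


def sketch_qualify(key: str) -> str:
    """Hypothetical helper: return a key that names its section explicitly."""
    head = key.split(".", 1)[0]
    if head in SECTION_OPTIONS:
        return key  # already prefixed with its section
    owners = sorted(s for s, names in SECTION_OPTIONS.items() if head in names)
    if not owners:
        raise AttributeError(f"unknown option: {head}")
    if len(owners) > 1:
        # Same option name in both sections: the caller must pick one.
        raise ValueError(f"'{head}' is ambiguous; prefix the key with one of {owners}")
    return f"{owners[0]}.{key}"


unambiguous = sketch_qualify("int.is_enabled")
explicit = sketch_qualify("unstructured_options.text.is_enabled")
```

Under these assumptions, a key such as `text.is_enabled` raises `ValueError` until the caller prefixes it with the section it should apply to.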