dataprofiler.profilers.profiler_options module¶
Specify the options when running the data profiler.
-
class
dataprofiler.profilers.profiler_options.
BaseOption
¶ Bases:
object
-
property
properties
¶ Returns a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
-
set
(options)¶ Sets the option values. Pass a dict containing all or a subset of the appropriate options. Raises an error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
-
validate
(raise_error=True)¶ Validates that the options do not conflict or cause errors. Raises an error or warning if they do.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
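The three members above (`properties`, `set`, `validate`) form a small contract that every option class on this page inherits. The following is a minimal sketch of that contract, not the library's implementation: `SketchOption`, its `threshold` attribute, and the exact exception types are hypothetical stand-ins chosen for illustration.

```python
import copy


class SketchOption:
    """Hypothetical stand-in for an option object; not the library's class."""

    def __init__(self, threshold=10):
        self.threshold = threshold

    @property
    def properties(self):
        # Return a copy so callers cannot mutate the option through the dict.
        return copy.deepcopy(self.__dict__)

    def set(self, options):
        # Accept all of, or a subset of, the known option names.
        if not isinstance(options, dict):
            raise ValueError("options must be a dict")
        for name, value in options.items():
            if name not in self.__dict__:
                raise AttributeError(f"unknown option: {name}")
            setattr(self, name, value)

    def validate(self, raise_error=True):
        # Collect problems, then either raise them or hand them back.
        errors = []
        if not isinstance(self.threshold, int):
            errors.append("threshold must be an int")
        if raise_error and errors:
            raise ValueError("\n".join(errors))
        return errors


opt = SketchOption()
opt.set({"threshold": "high"})            # a subset of options is enough
errors = opt.validate(raise_error=False)  # returned as a list, not raised
snapshot = opt.properties
snapshot["threshold"] = 0                 # edits the copy only
```

With `raise_error=False` the caller can gather every problem in one pass and report them together, which is the behavior the `raise_error` parameter describes.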
-
class
dataprofiler.profilers.profiler_options.
BooleanOption
(is_enabled=True)¶ Bases:
dataprofiler.profilers.profiler_options.BaseOption
Boolean option
- Variables
is_enabled (bool) – boolean option to enable/disable the option.
-
property
properties
¶ Returns a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
-
set
(options)¶ Sets the option values. Pass a dict containing all or a subset of the appropriate options. Raises an error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
-
validate
(raise_error=True)¶ Validates that the options do not conflict or cause errors. Raises an error or warning if they do.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
-
class
dataprofiler.profilers.profiler_options.
BaseColumnOptions
¶ Bases:
dataprofiler.profilers.profiler_options.BooleanOption
Base options for all the columns.
- Variables
is_enabled (bool) – boolean option to enable/disable the column.
-
is_prop_enabled
(prop)¶ Checks whether a property is enabled.
- Parameters
prop (str) – The name of the option to check.
- Returns
Whether or not the property is enabled
- Return type
bool
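As an illustration of the lookup `is_prop_enabled` describes, here is a minimal sketch. It assumes a property is either a plain `bool` or an object carrying an `is_enabled` flag, and that unknown names raise an error; `ToggleSketch` and `ColumnSketch` are hypothetical stand-ins, not the library's classes.

```python
class ToggleSketch:
    """Hypothetical stand-in for a BooleanOption-like object."""

    def __init__(self, is_enabled=True):
        self.is_enabled = is_enabled


class ColumnSketch:
    """Hypothetical stand-in for a column options object."""

    def __init__(self):
        self.is_enabled = True                        # plain bool
        self.vocab = ToggleSketch(is_enabled=False)   # nested toggle

    def is_prop_enabled(self, prop):
        # Assumption: unknown names are rejected, not treated as disabled.
        if prop not in vars(self):
            raise AttributeError(f"no such property: {prop}")
        option = getattr(self, prop)
        if isinstance(option, bool):
            return option
        # Assumption: anything else exposes an is_enabled flag.
        return bool(option.is_enabled)


col = ColumnSketch()
column_on = col.is_prop_enabled("is_enabled")
vocab_on = col.is_prop_enabled("vocab")
```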
-
property
properties
¶ Returns a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
-
set
(options)¶ Sets the option values. Pass a dict containing all or a subset of the appropriate options. Raises an error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
-
validate
(raise_error=True)¶ Validates that the options do not conflict or cause errors. Raises an error or warning if they do.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
-
class
dataprofiler.profilers.profiler_options.
NumericalOptions
¶ Bases:
dataprofiler.profilers.profiler_options.BaseColumnOptions
Options for the Numerical Stats Mixin
- Variables
is_enabled (bool) – boolean option to enable/disable the column.
min (BooleanOption) – boolean option to enable/disable min
max (BooleanOption) – boolean option to enable/disable max
sum (BooleanOption) – boolean option to enable/disable sum
variance (BooleanOption) – boolean option to enable/disable variance
histogram_and_quantiles (BooleanOption) – boolean option to enable/disable histogram_and_quantiles
is_numeric_stats_enabled (bool) – boolean to enable/disable all numeric stats
-
property
is_numeric_stats_enabled
¶ Returns whether numeric stats are enabled: True if any numeric stats property is enabled, otherwise False.
- Returns
true if any numeric stats property is enabled, otherwise false
- Return type
bool
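The "any property enabled" rule can be sketched as follows. This is a hedged illustration, not the library's code: `ToggleSketch` and `NumericSketch` are hypothetical, the stat names are taken from the Variables list above, and the setter that switches every stat at once is an assumption based on the description "enable/disable all numeric stats".

```python
class ToggleSketch:
    """Hypothetical stand-in for a BooleanOption-like object."""

    def __init__(self, is_enabled=True):
        self.is_enabled = is_enabled


class NumericSketch:
    """Hypothetical stand-in for a numerical options object."""

    STAT_NAMES = ("min", "max", "sum", "variance", "histogram_and_quantiles")

    def __init__(self):
        for name in self.STAT_NAMES:
            setattr(self, name, ToggleSketch())

    @property
    def is_numeric_stats_enabled(self):
        # True as soon as a single numeric stat is enabled.
        return any(getattr(self, name).is_enabled for name in self.STAT_NAMES)

    @is_numeric_stats_enabled.setter
    def is_numeric_stats_enabled(self, value):
        # Assumption: assigning the flag toggles every numeric stat together.
        for name in self.STAT_NAMES:
            getattr(self, name).is_enabled = value


opts = NumericSketch()
opts.is_numeric_stats_enabled = False
all_off = opts.is_numeric_stats_enabled
opts.variance.is_enabled = True
one_on = opts.is_numeric_stats_enabled
```

Note the asymmetry in this sketch: reading the flag reports whether any stat is on, while writing it changes all of them.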
-
property
properties
¶ Includes at least: is_enabled: Turns on or off the column.
-
is_prop_enabled
(prop)¶ Checks whether a property is enabled.
- Parameters
prop (str) – The name of the option to check.
- Returns
Whether or not the property is enabled
- Return type
bool
-
set
(options)¶ Sets the option values. Pass a dict containing all or a subset of the appropriate options. Raises an error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
-
validate
(raise_error=True)¶ Validates that the options do not conflict or cause errors. Raises an error or warning if they do.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
-
class
dataprofiler.profilers.profiler_options.
IntOptions
¶ Bases:
dataprofiler.profilers.profiler_options.NumericalOptions
Options for the Int Column
- Variables
is_enabled (bool) – boolean option to enable/disable the column.
min (BooleanOption) – boolean option to enable/disable min
max (BooleanOption) – boolean option to enable/disable max
sum (BooleanOption) – boolean option to enable/disable sum
variance (BooleanOption) – boolean option to enable/disable variance
histogram_and_quantiles (BooleanOption) – boolean option to enable/disable histogram_and_quantiles
is_numeric_stats_enabled (bool) – boolean to enable/disable all numeric stats
-
property
is_numeric_stats_enabled
¶ Returns whether numeric stats are enabled: True if any numeric stats property is enabled, otherwise False.
- Returns
true if any numeric stats property is enabled, otherwise false
- Return type
bool
-
is_prop_enabled
(prop)¶ Checks whether a property is enabled.
- Parameters
prop (str) – The name of the option to check.
- Returns
Whether or not the property is enabled
- Return type
bool
-
property
properties
¶ Includes at least: is_enabled: Turns on or off the column.
-
set
(options)¶ Sets the option values. Pass a dict containing all or a subset of the appropriate options. Raises an error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
-
validate
(raise_error=True)¶ Validates that the options do not conflict or cause errors. Raises an error or warning if they do.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
-
class
dataprofiler.profilers.profiler_options.
FloatOptions
¶ Bases:
dataprofiler.profilers.profiler_options.NumericalOptions
Options for the Float Column.
- Variables
is_enabled (bool) – boolean option to enable/disable the column.
precision (BooleanOption) – boolean option to enable/disable precision
min (BooleanOption) – boolean option to enable/disable min
max (BooleanOption) – boolean option to enable/disable max
sum (BooleanOption) – boolean option to enable/disable sum
variance (BooleanOption) – boolean option to enable/disable variance
histogram_and_quantiles (BooleanOption) – boolean option to enable/disable histogram_and_quantiles
is_numeric_stats_enabled (bool) – boolean to enable/disable all numeric stats
-
property
is_numeric_stats_enabled
¶ Returns whether numeric stats are enabled: True if any numeric stats property is enabled, otherwise False.
- Returns
true if any numeric stats property is enabled, otherwise false
- Return type
bool
-
is_prop_enabled
(prop)¶ Checks whether a property is enabled.
- Parameters
prop (str) – The name of the option to check.
- Returns
Whether or not the property is enabled
- Return type
bool
-
property
properties
¶ Includes at least: is_enabled: Turns on or off the column.
-
set
(options)¶ Sets the option values. Pass a dict containing all or a subset of the appropriate options. Raises an error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
-
validate
(raise_error=True)¶ Validates that the options do not conflict or cause errors. Raises an error or warning if they do.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
-
class
dataprofiler.profilers.profiler_options.
TextOptions
¶ Bases:
dataprofiler.profilers.profiler_options.NumericalOptions
Options for the Text Column.
- Variables
is_enabled (bool) – boolean option to enable/disable the column.
vocab (BooleanOption) – boolean option to enable/disable vocab
min (BooleanOption) – boolean option to enable/disable min
max (BooleanOption) – boolean option to enable/disable max
sum (BooleanOption) – boolean option to enable/disable sum
variance (BooleanOption) – boolean option to enable/disable variance
histogram_and_quantiles (BooleanOption) – boolean option to enable/disable histogram_and_quantiles
is_numeric_stats_enabled (bool) – boolean to enable/disable all numeric stats
-
property
is_numeric_stats_enabled
¶ Returns whether numeric stats are enabled: True if any numeric stats property is enabled, otherwise False.
- Returns
true if any numeric stats property is enabled, otherwise false
- Return type
bool
-
is_prop_enabled
(prop)¶ Checks whether a property is enabled.
- Parameters
prop (str) – The name of the option to check.
- Returns
Whether or not the property is enabled
- Return type
bool
-
property
properties
¶ Includes at least: is_enabled: Turns on or off the column.
-
set
(options)¶ Sets the option values. Pass a dict containing all or a subset of the appropriate options. Raises an error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
-
validate
(raise_error=True)¶ Validates that the options do not conflict or cause errors. Raises an error or warning if they do.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
-
class
dataprofiler.profilers.profiler_options.
DateTimeOptions
¶ Bases:
dataprofiler.profilers.profiler_options.BaseColumnOptions
Options for the Datetime Column
- Variables
is_enabled (bool) – boolean option to enable/disable the column.
-
is_prop_enabled
(prop)¶ Checks whether a property is enabled.
- Parameters
prop (str) – The name of the option to check.
- Returns
Whether or not the property is enabled
- Return type
bool
-
property
properties
¶ Returns a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
-
set
(options)¶ Sets the option values. Pass a dict containing all or a subset of the appropriate options. Raises an error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
-
validate
(raise_error=True)¶ Validates that the options do not conflict or cause errors. Raises an error or warning if they do.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
-
class
dataprofiler.profilers.profiler_options.
OrderOptions
¶ Bases:
dataprofiler.profilers.profiler_options.BaseColumnOptions
Options for the Order Column
- Variables
is_enabled (bool) – boolean option to enable/disable the column.
-
is_prop_enabled
(prop)¶ Checks whether a property is enabled.
- Parameters
prop (str) – The name of the option to check.
- Returns
Whether or not the property is enabled
- Return type
bool
-
property
properties
¶ Returns a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
-
set
(options)¶ Sets the option values. Pass a dict containing all or a subset of the appropriate options. Raises an error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
-
validate
(raise_error=True)¶ Validates that the options do not conflict or cause errors. Raises an error or warning if they do.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
-
class
dataprofiler.profilers.profiler_options.
CategoricalOptions
¶ Bases:
dataprofiler.profilers.profiler_options.BaseColumnOptions
Options for the Categorical Column
- Variables
is_enabled (bool) – boolean option to enable/disable the column.
-
is_prop_enabled
(prop)¶ Checks whether a property is enabled.
- Parameters
prop (str) – The name of the option to check.
- Returns
Whether or not the property is enabled
- Return type
bool
-
property
properties
¶ Returns a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
-
set
(options)¶ Sets the option values. Pass a dict containing all or a subset of the appropriate options. Raises an error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
-
validate
(raise_error=True)¶ Validates that the options do not conflict or cause errors. Raises an error or warning if they do.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
-
class
dataprofiler.profilers.profiler_options.
DataLabelerOptions
¶ Bases:
dataprofiler.profilers.profiler_options.BaseColumnOptions
Options for the Data Labeler Column.
- Variables
is_enabled (bool) – boolean option to enable/disable the column.
data_labeler_dirpath (str) – String to load data labeler from
max_sample_size (int) – Int to decide sample size
-
is_prop_enabled
(prop)¶ Checks whether a property is enabled.
- Parameters
prop (str) – The name of the option to check.
- Returns
Whether or not the property is enabled
- Return type
bool
-
property
properties
¶ Returns a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
-
set
(options)¶ Sets the option values. Pass a dict containing all or a subset of the appropriate options. Raises an error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
-
validate
(raise_error=True)¶ Validates that the options do not conflict or cause errors. Raises an error or warning if they do.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
-
class
dataprofiler.profilers.profiler_options.
StructuredOptions
¶ Bases:
dataprofiler.profilers.profiler_options.BaseOption
Constructs the StructuredOptions object with default values.
- Variables
int (IntOptions) – option set for int profiling.
float (FloatOptions) – option set for float profiling.
datetime (DateTimeOptions) – option set for datetime profiling.
text (TextOptions) – option set for text profiling.
order (OrderOptions) – option set for order profiling.
category (CategoricalOptions) – option set for category profiling.
data_labeler (DataLabelerOptions) – option set for data_labeler profiling.
-
property
enabled_columns
¶ Returns a list of the enabled profiler columns.
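One plausible reading of `enabled_columns` is a filter over the column option sets by their `is_enabled` flag. The sketch below assumes exactly that, and assumes the result is a list of column names; `ColumnToggleSketch` and `StructuredSketch` are hypothetical stand-ins, with attribute names copied from the Variables list above.

```python
class ColumnToggleSketch:
    """Hypothetical stand-in for a per-column options object."""

    def __init__(self, is_enabled=True):
        self.is_enabled = is_enabled


class StructuredSketch:
    """Hypothetical stand-in for a structured options object."""

    COLUMN_NAMES = ("int", "float", "datetime", "text",
                    "order", "category", "data_labeler")

    def __init__(self):
        for name in self.COLUMN_NAMES:
            setattr(self, name, ColumnToggleSketch())

    @property
    def enabled_columns(self):
        # Keep only the names whose option set is switched on.
        return [name for name in self.COLUMN_NAMES
                if getattr(self, name).is_enabled]


opts = StructuredSketch()
opts.text.is_enabled = False
opts.data_labeler.is_enabled = False
enabled = opts.enabled_columns
```

Here `enabled` holds the five remaining names in declaration order; whether the library returns names or option objects is not stated in this reference.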
-
property
properties
¶ Returns a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
-
set
(options)¶ Sets the option values. Pass a dict containing all or a subset of the appropriate options. Raises an error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
-
validate
(raise_error=True)¶ Validates that the options do not conflict or cause errors. Raises an error or warning if they do.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)
-
class
dataprofiler.profilers.profiler_options.
ProfilerOptions
¶ Bases:
dataprofiler.profilers.profiler_options.BaseOption
Initializes the ProfilerOptions object.
- Variables
structured_options (StructuredOptions) – option set for structured dataset profiling.
-
property
properties
¶ Returns a copy of the option properties.
- Returns
dictionary of the option’s properties attr: value
- Return type
dict
-
set
(options)¶ Sets the option values. Pass a dict containing all or a subset of the appropriate options. Raises an error if the formatting is improper.
- Parameters
options (dict) – dict containing the options you want to set.
- Returns
None
-
validate
(raise_error=True)¶ Validates that the options do not conflict or cause errors. Raises an error or warning if they do.
- Parameters
raise_error (bool) – Flag that raises errors if true. Returns errors if false.
- Returns
list of errors (if raise_error is false)
- Return type
list(str)