Profiler Options

Specify the options when running the data profiler.

class dataprofiler.profilers.profiler_options.BaseOption

Bases: object

For configuring options.

property properties: dict[str, BooleanOption]

Return a copy of the option properties.

Returns

dictionary of the option’s properties attr: value

Return type

dict

set(options: dict[str, bool]) None

Set all the options.

Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.

Parameters

options (dict) – dict containing the options you want to set.

Returns

None

validate(raise_error: bool = True) list[str] | None

Validate the options do not conflict and cause errors.

Raises error/warning if so.

Parameters

raise_error (bool) – Flag that raises errors if true. Returns errors if false.

Returns

list of errors (if raise_error is false)

Return type

list(str)
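
The contract described above (properties returns a copy, set accepts all or a subset of the options, validate either raises or returns a list) can be sketched with a small stand-in class. ToyOption below is hypothetical and is not the library's implementation; the error types and messages are assumptions made for illustration.

```python
from __future__ import annotations

import copy


class ToyOption:
    """Hypothetical stand-in that mirrors the documented BaseOption contract."""

    def __init__(self, is_enabled: bool = True) -> None:
        self.is_enabled = is_enabled

    @property
    def properties(self) -> dict:
        # Return a copy so callers cannot change the option through the dict.
        return copy.deepcopy(self.__dict__)

    def set(self, options: dict) -> None:
        # Accept all of, or a subset of, the known option names.
        for name, value in options.items():
            if name not in self.__dict__:
                raise AttributeError(f"unknown option: {name}")
            setattr(self, name, value)

    def validate(self, raise_error: bool = True) -> list[str] | None:
        errors = []
        if not isinstance(self.is_enabled, bool):
            errors.append("is_enabled must be a bool.")
        if raise_error and errors:
            # Assumed error type; the reference only says an error is raised.
            raise ValueError("\n".join(errors))
        return errors if errors else None


option = ToyOption()
option.set({"is_enabled": "yes"})          # wrong type on purpose
problems = option.validate(raise_error=False)
```

With raise_error=False the caller receives the list of messages and can decide how to report them; with the default, the same messages surface as an exception.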

class dataprofiler.profilers.profiler_options.BooleanOption(is_enabled: bool = True)

Bases: dataprofiler.profilers.profiler_options.BaseOption

For setting Boolean options.

Initialize Boolean option.

Variables

is_enabled (bool) – boolean option to enable/disable the option.

property properties: dict[str, BooleanOption]

Return a copy of the option properties.

Returns

dictionary of the option’s properties attr: value

Return type

dict

set(options: dict[str, bool]) None

Set all the options.

Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.

Parameters

options (dict) – dict containing the options you want to set.

Returns

None

validate(raise_error: bool = True) list[str] | None

Validate the options do not conflict and cause errors.

Raises error/warning if so.

Parameters

raise_error (bool) – Flag that raises errors if true. Returns errors if false.

Returns

list of errors (if raise_error is false)

Return type

list(str)

class dataprofiler.profilers.profiler_options.HistogramOption(is_enabled: bool = True, bin_count_or_method: str | int | list[str] = 'auto')

Bases: dataprofiler.profilers.profiler_options.BooleanOption

For setting histogram options.

Initialize Options for histograms.

Variables
  • is_enabled (bool) – boolean option to enable/disable the option.

  • bin_count_or_method (Union[str, int, list(str)]) – bin count or the method with which to calculate histograms
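
Because bin_count_or_method accepts three shapes (a method name, a bin count, or a list of method names), a caller may want to check a value before passing it in. The helper below is a hypothetical sketch, not part of the library. The set of method names is an assumption taken from the estimators that numpy.histogram_bin_edges accepts; this reference does not state which names the profiler allows.

```python
from __future__ import annotations

# Assumption: these are the bin estimator names accepted by
# numpy.histogram_bin_edges; whether the profiler accepts each one is not
# stated in this reference.
KNOWN_METHODS = {"auto", "fd", "doane", "scott", "stone", "rice", "sturges", "sqrt"}


def check_bin_count_or_method(value: str | int | list[str]) -> list[str]:
    """Hypothetical helper: return the problems found (empty list if none)."""
    if isinstance(value, bool):  # bool is a subclass of int, so reject it first
        return ["expected a str, int, or list of str, got bool"]
    if isinstance(value, int):
        return [] if value >= 1 else ["bin count must be at least 1"]
    methods = [value] if isinstance(value, str) else value
    if not isinstance(methods, list) or not methods:
        return ["expected a str, int, or non-empty list of str"]
    return [
        f"unknown method: {m!r}"
        for m in methods
        if not isinstance(m, str) or m not in KNOWN_METHODS
    ]
```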

property properties: dict[str, BooleanOption]

Return a copy of the option properties.

Returns

dictionary of the option’s properties attr: value

Return type

dict

set(options: dict[str, bool]) None

Set all the options.

Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.

Parameters

options (dict) – dict containing the options you want to set.

Returns

None

validate(raise_error: bool = True) list[str] | None

Validate the options do not conflict and cause errors.

Raises error/warning if so.

Parameters

raise_error (bool) – Flag that raises errors if true. Returns errors if false.

Returns

list of errors (if raise_error is false)

Return type

list(str)

class dataprofiler.profilers.profiler_options.ModeOption(is_enabled: bool = True, max_k_modes: int = 5)

Bases: dataprofiler.profilers.profiler_options.BooleanOption

For setting mode estimation options.

Initialize Options for mode estimation.

Variables
  • is_enabled (bool) – boolean option to enable/disable the option.

  • top_k_modes (int) – the max number of modes to return, if applicable
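
To make the meaning of top_k_modes concrete, the sketch below finds the most frequent values by exact counting and caps the result at k entries. The function name top_modes is hypothetical, and exact counting is a simplification: this reference describes the option as mode estimation and does not specify how the profiler computes modes.

```python
from collections import Counter


def top_modes(values, top_k_modes=5):
    """Hypothetical helper: return at most top_k_modes of the most frequent values."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    # Every value tied for the highest frequency is a mode.
    modes = sorted(v for v, c in counts.items() if c == highest)
    return modes[:top_k_modes]
```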

property properties: dict[str, BooleanOption]

Return a copy of the option properties.

Returns

dictionary of the option’s properties attr: value

Return type

dict

set(options: dict[str, bool]) None

Set all the options.

Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.

Parameters

options (dict) – dict containing the options you want to set.

Returns

None

validate(raise_error: bool = True) list[str] | None

Validate the options do not conflict and cause errors.

Raises error/warning if so.

Parameters

raise_error (bool) – Flag that raises errors if true. Returns errors if false.

Returns

list of errors (if raise_error is false)

Return type

list(str)

class dataprofiler.profilers.profiler_options.BaseInspectorOptions(is_enabled: bool = True)

Bases: dataprofiler.profilers.profiler_options.BooleanOption

For setting Base options.

Initialize Base options for all the columns.

Variables

is_enabled (bool) – boolean option to enable/disable the column.

is_prop_enabled(prop: str) bool

Check whether a property is enabled and return a boolean.

Parameters

prop (String) – The option to check if it is enabled

Returns

Whether or not the property is enabled

Return type

Boolean

property properties: dict[str, BooleanOption]

Return a copy of the option properties.

Returns

dictionary of the option’s properties attr: value

Return type

dict

set(options: dict[str, bool]) None

Set all the options.

Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.

Parameters

options (dict) – dict containing the options you want to set.

Returns

None

validate(raise_error: bool = True) list[str] | None

Validate the options do not conflict and cause errors.

Raises error/warning if so.

Parameters

raise_error (bool) – Flag that raises errors if true. Returns errors if false.

Returns

list of errors (if raise_error is false)

Return type

list(str)

class dataprofiler.profilers.profiler_options.NumericalOptions

Bases: dataprofiler.profilers.profiler_options.BaseInspectorOptions

For configuring options for Numerical Stats Mixin.

Initialize Options for the Numerical Stats Mixin.

Variables
  • is_enabled (bool) – boolean option to enable/disable the column.

  • min (BooleanOption) – boolean option to enable/disable min

  • max (BooleanOption) – boolean option to enable/disable max

  • mode (ModeOption) – option to enable/disable mode and set return count

  • median (BooleanOption) – option to enable/disable median

  • sum (BooleanOption) – boolean option to enable/disable sum

  • variance (BooleanOption) – boolean option to enable/disable variance

  • skewness (BooleanOption) – boolean option to enable/disable skewness

  • kurtosis (BooleanOption) – boolean option to enable/disable kurtosis

  • histogram_and_quantiles (BooleanOption) – boolean option to enable/disable histogram_and_quantiles

  • bias_correction (BooleanOption) – boolean option to enable/disable existence of bias

  • num_zeros (BooleanOption) – boolean option to enable/disable num_zeros

  • num_negatives (BooleanOption) – boolean option to enable/disable num_negatives

  • is_numeric_stats_enabled (bool) – boolean to enable/disable all numeric stats

property is_numeric_stats_enabled: bool

Return the state of numeric stats being enabled / disabled.

If any numeric stats property is enabled it will return True, otherwise it will return False.

Returns

true if any numeric stats property is enabled, otherwise false

Return type

bool
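
The any-of behavior of is_numeric_stats_enabled can be sketched as a property over a group of flags. ToyNumericalOptions is a hypothetical stand-in: plain booleans replace the BooleanOption objects, only four statistics are modeled, and the setter that switches them all together is an assumption based on the description of is_numeric_stats_enabled as enabling or disabling all numeric stats.

```python
class ToyNumericalOptions:
    """Hypothetical stand-in: plain bools replace the BooleanOption objects."""

    _NUMERIC_STATS = ("min", "max", "sum", "variance")

    def __init__(self) -> None:
        for name in self._NUMERIC_STATS:
            setattr(self, name, True)

    @property
    def is_numeric_stats_enabled(self) -> bool:
        # True as soon as any single statistic is enabled.
        return any(getattr(self, name) for name in self._NUMERIC_STATS)

    @is_numeric_stats_enabled.setter
    def is_numeric_stats_enabled(self, value: bool) -> None:
        # Assumed setter behavior: switch every statistic on or off together.
        for name in self._NUMERIC_STATS:
            setattr(self, name, value)
```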

property properties: dict[str, BooleanOption]

Include is_enabled.

is_enabled: Turns on or off the column.

is_prop_enabled(prop: str) bool

Check whether a property is enabled and return a boolean.

Parameters

prop (String) – The option to check if it is enabled

Returns

Whether or not the property is enabled

Return type

Boolean

set(options: dict[str, bool]) None

Set all the options.

Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.

Parameters

options (dict) – dict containing the options you want to set.

Returns

None

validate(raise_error: bool = True) list[str] | None

Validate the options do not conflict and cause errors.

Raises error/warning if so.

Parameters

raise_error (bool) – Flag that raises errors if true. Returns errors if false.

Returns

list of errors (if raise_error is false)

Return type

list(str)

class dataprofiler.profilers.profiler_options.IntOptions

Bases: dataprofiler.profilers.profiler_options.NumericalOptions

For configuring options for Int Column.

Initialize Options for the Int Column.

Variables
  • is_enabled (bool) – boolean option to enable/disable the column.

  • min (BooleanOption) – boolean option to enable/disable min

  • max (BooleanOption) – boolean option to enable/disable max

  • mode (ModeOption) – option to enable/disable mode and set return count

  • median (BooleanOption) – option to enable/disable median

  • sum (BooleanOption) – boolean option to enable/disable sum

  • variance (BooleanOption) – boolean option to enable/disable variance

  • skewness (BooleanOption) – boolean option to enable/disable skewness

  • kurtosis (BooleanOption) – boolean option to enable/disable kurtosis

  • histogram_and_quantiles (BooleanOption) – boolean option to enable/disable histogram_and_quantiles

  • bias_correction (BooleanOption) – boolean option to enable/disable existence of bias

  • num_zeros (BooleanOption) – boolean option to enable/disable num_zeros

  • num_negatives (BooleanOption) – boolean option to enable/disable num_negatives

  • is_numeric_stats_enabled (bool) – boolean to enable/disable all numeric stats

property is_numeric_stats_enabled: bool

Return the state of numeric stats being enabled / disabled.

If any numeric stats property is enabled it will return True, otherwise it will return False.

Returns

true if any numeric stats property is enabled, otherwise false

Return type

bool

is_prop_enabled(prop: str) bool

Check whether a property is enabled and return a boolean.

Parameters

prop (String) – The option to check if it is enabled

Returns

Whether or not the property is enabled

Return type

Boolean

property properties: dict[str, BooleanOption]

Include is_enabled.

is_enabled: Turns on or off the column.

set(options: dict[str, bool]) None

Set all the options.

Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.

Parameters

options (dict) – dict containing the options you want to set.

Returns

None

validate(raise_error: bool = True) list[str] | None

Validate the options do not conflict and cause errors.

Raises error/warning if so.

Parameters

raise_error (bool) – Flag that raises errors if true. Returns errors if false.

Returns

list of errors (if raise_error is false)

Return type

list(str)

class dataprofiler.profilers.profiler_options.PrecisionOptions(is_enabled: bool = True, sample_ratio: Optional[float] = None)

Bases: dataprofiler.profilers.profiler_options.BooleanOption

For configuring options for precision.

Initialize Options for precision.

Variables
  • is_enabled (bool) – boolean option to enable/disable the column.

  • sample_ratio (float) – float option setting the ratio of valid float samples used to determine precision. This ratio will override any defaults.

property properties: dict[str, BooleanOption]

Return a copy of the option properties.

Returns

dictionary of the option’s properties attr: value

Return type

dict

set(options: dict[str, bool]) None

Set all the options.

Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.

Parameters

options (dict) – dict containing the options you want to set.

Returns

None

validate(raise_error: bool = True) list[str] | None

Validate the options do not conflict and cause errors.

Raises error/warning if so.

Parameters

raise_error (bool) – Flag that raises errors if true. Returns errors if false.

Returns

list of errors (if raise_error is false)

Return type

list(str)

class dataprofiler.profilers.profiler_options.FloatOptions

Bases: dataprofiler.profilers.profiler_options.NumericalOptions

For configuring options for Float Column.

Initialize Options for the Float Column.

Variables
  • is_enabled (bool) – boolean option to enable/disable the column.

  • min (BooleanOption) – boolean option to enable/disable min

  • max (BooleanOption) – boolean option to enable/disable max

  • mode (ModeOption) – option to enable/disable mode and set return count

  • median (BooleanOption) – option to enable/disable median

  • sum (BooleanOption) – boolean option to enable/disable sum

  • variance (BooleanOption) – boolean option to enable/disable variance

  • skewness (BooleanOption) – boolean option to enable/disable skewness

  • kurtosis (BooleanOption) – boolean option to enable/disable kurtosis

  • histogram_and_quantiles (BooleanOption) – boolean option to enable/disable histogram_and_quantiles

  • bias_correction (BooleanOption) – boolean option to enable/disable existence of bias

  • num_zeros (BooleanOption) – boolean option to enable/disable num_zeros

  • num_negatives (BooleanOption) – boolean option to enable/disable num_negatives

  • is_numeric_stats_enabled (bool) – boolean to enable/disable all numeric stats

property is_numeric_stats_enabled: bool

Return the state of numeric stats being enabled / disabled.

If any numeric stats property is enabled it will return True, otherwise it will return False.

Returns

true if any numeric stats property is enabled, otherwise false

Return type

bool

is_prop_enabled(prop: str) bool

Check whether a property is enabled and return a boolean.

Parameters

prop (String) – The option to check if it is enabled

Returns

Whether or not the property is enabled

Return type

Boolean

property properties: dict[str, BooleanOption]

Include is_enabled.

is_enabled: Turns on or off the column.

set(options: dict[str, bool]) None

Set all the options.

Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.

Parameters

options (dict) – dict containing the options you want to set.

Returns

None

validate(raise_error: bool = True) list[str] | None

Validate the options do not conflict and cause errors.

Raises error/warning if so.

Parameters

raise_error (bool) – Flag that raises errors if true. Returns errors if false.

Returns

list of errors (if raise_error is false)

Return type

list(str)

class dataprofiler.profilers.profiler_options.TextOptions

Bases: dataprofiler.profilers.profiler_options.NumericalOptions

For configuring options for Text Column.

Initialize Options for the Text Column.

Variables
  • is_enabled (bool) – boolean option to enable/disable the column.

  • vocab (BooleanOption) – boolean option to enable/disable vocab

  • min (BooleanOption) – boolean option to enable/disable min

  • max (BooleanOption) – boolean option to enable/disable max

  • mode (ModeOption) – option to enable/disable mode and set return count

  • median (BooleanOption) – option to enable/disable median

  • sum (BooleanOption) – boolean option to enable/disable sum

  • variance (BooleanOption) – boolean option to enable/disable variance

  • skewness (BooleanOption) – boolean option to enable/disable skewness

  • kurtosis (BooleanOption) – boolean option to enable/disable kurtosis

  • bias_correction (BooleanOption) – boolean option to enable/disable existence of bias

  • histogram_and_quantiles (BooleanOption) – boolean option to enable/disable histogram_and_quantiles

  • num_zeros (BooleanOption) – boolean option to enable/disable num_zeros

  • num_negatives (BooleanOption) – boolean option to enable/disable num_negatives

  • is_numeric_stats_enabled (bool) – boolean to enable/disable all numeric stats

property is_numeric_stats_enabled: bool

Return the state of numeric stats being enabled / disabled.

If any numeric stats property is enabled it will return True, otherwise it will return False. Although it may seem redundant, this getter is required so that the setter of the same name, is_numeric_stats_enabled, works properly.

Returns

true if any numeric stats property is enabled, otherwise false

Return type

bool

is_prop_enabled(prop: str) bool

Check whether a property is enabled and return a boolean.

Parameters

prop (String) – The option to check if it is enabled

Returns

Whether or not the property is enabled

Return type

Boolean

property properties: dict[str, BooleanOption]

Include is_enabled.

is_enabled: Turns on or off the column.

set(options: dict[str, bool]) None

Set all the options.

Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.

Parameters

options (dict) – dict containing the options you want to set.

Returns

None

validate(raise_error: bool = True) list[str] | None

Validate the options do not conflict and cause errors.

Raises error/warning if so.

Parameters

raise_error (bool) – Flag that raises errors if true. Returns errors if false.

Returns

list of errors (if raise_error is false)

Return type

list(str)

class dataprofiler.profilers.profiler_options.DateTimeOptions

Bases: dataprofiler.profilers.profiler_options.BaseInspectorOptions

For configuring options for Datetime Column.

Initialize Options for the Datetime Column.

Variables

is_enabled (bool) – boolean option to enable/disable the column.

is_prop_enabled(prop: str) bool

Check whether a property is enabled and return a boolean.

Parameters

prop (String) – The option to check if it is enabled

Returns

Whether or not the property is enabled

Return type

Boolean

property properties: dict[str, BooleanOption]

Return a copy of the option properties.

Returns

dictionary of the option’s properties attr: value

Return type

dict

set(options: dict[str, bool]) None

Set all the options.

Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.

Parameters

options (dict) – dict containing the options you want to set.

Returns

None

validate(raise_error: bool = True) list[str] | None

Validate the options do not conflict and cause errors.

Raises error/warning if so.

Parameters

raise_error (bool) – Flag that raises errors if true. Returns errors if false.

Returns

list of errors (if raise_error is false)

Return type

list(str)

class dataprofiler.profilers.profiler_options.OrderOptions

Bases: dataprofiler.profilers.profiler_options.BaseInspectorOptions

For configuring options for Order Column.

Initialize options for the Order Column.

Variables

is_enabled (bool) – boolean option to enable/disable the column.

is_prop_enabled(prop: str) bool

Check whether a property is enabled and return a boolean.

Parameters

prop (String) – The option to check if it is enabled

Returns

Whether or not the property is enabled

Return type

Boolean

property properties: dict[str, BooleanOption]

Return a copy of the option properties.

Returns

dictionary of the option’s properties attr: value

Return type

dict

set(options: dict[str, bool]) None

Set all the options.

Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.

Parameters

options (dict) – dict containing the options you want to set.

Returns

None

validate(raise_error: bool = True) list[str] | None

Validate the options do not conflict and cause errors.

Raises error/warning if so.

Parameters

raise_error (bool) – Flag that raises errors if true. Returns errors if false.

Returns

list of errors (if raise_error is false)

Return type

list(str)

class dataprofiler.profilers.profiler_options.CategoricalOptions(is_enabled: bool = True, top_k_categories: Optional[int] = None)

Bases: dataprofiler.profilers.profiler_options.BaseInspectorOptions

For configuring options for Categorical Column.

Initialize options for the Categorical Column.

Variables
  • is_enabled (bool) – boolean option to enable/disable the column.

  • top_k_categories (Union[None, int]) – number of categories to be displayed when called
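
The effect of top_k_categories, including the default of None, can be illustrated with collections.Counter from the standard library. The helper name top_categories is hypothetical, and treating None as "show every category" is an assumption; this reference only says the option sets the number of categories displayed.

```python
from collections import Counter
from typing import Optional


def top_categories(values, top_k_categories: Optional[int] = None):
    """Hypothetical helper: return (category, count) pairs, most frequent first."""
    # Counter.most_common(None) returns every entry, so None means "no cap".
    return Counter(values).most_common(top_k_categories)
```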

is_prop_enabled(prop: str) bool

Check whether a property is enabled and return a boolean.

Parameters

prop (String) – The option to check if it is enabled

Returns

Whether or not the property is enabled

Return type

Boolean

property properties: dict[str, BooleanOption]

Return a copy of the option properties.

Returns

dictionary of the option’s properties attr: value

Return type

dict

set(options: dict[str, bool]) None

Set all the options.

Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.

Parameters

options (dict) – dict containing the options you want to set.

Returns

None

validate(raise_error: bool = True) list[str] | None

Validate the options do not conflict and cause errors.

Raises error/warning if so.

Parameters

raise_error (bool) – Flag that raises errors if true. Returns errors if false.

Returns

list of errors (if raise_error is false)

Return type

list(str)

class dataprofiler.profilers.profiler_options.CorrelationOptions(is_enabled: bool = False, columns: list[str] = None)

Bases: dataprofiler.profilers.profiler_options.BaseInspectorOptions

For configuring options for Correlation between Columns.

Initialize options for the Correlation between Columns.

Variables
  • is_enabled (bool) – boolean option to enable/disable.

  • columns (list(str)) – Columns considered to calculate correlation

is_prop_enabled(prop: str) bool

Check whether a property is enabled and return a boolean.

Parameters

prop (String) – The option to check if it is enabled

Returns

Whether or not the property is enabled

Return type

Boolean

property properties: dict[str, BooleanOption]

Return a copy of the option properties.

Returns

dictionary of the option’s properties attr: value

Return type

dict

set(options: dict[str, bool]) None

Set all the options.

Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.

Parameters

options (dict) – dict containing the options you want to set.

Returns

None

validate(raise_error: bool = True) list[str] | None

Validate the options do not conflict and cause errors.

Raises error/warning if so.

Parameters

raise_error (bool) – Flag that raises errors if true. Returns errors if false.

Returns

list of errors (if raise_error is false)

Return type

list(str)

class dataprofiler.profilers.profiler_options.DataLabelerOptions

Bases: dataprofiler.profilers.profiler_options.BaseInspectorOptions

For configuring options for Data Labeler Column.

Initialize options for the Data Labeler Column.

Variables
  • is_enabled (bool) – boolean option to enable/disable the column.

  • data_labeler_dirpath (str) – String path from which to load the data labeler

  • max_sample_size (int) – Int to decide sample size

  • data_labeler_object (BaseDataLabeler) – DataLabeler object used in profiler

property properties: dict

Return a copy of the option properties.

Returns

dictionary of the option’s properties attr: value

Return type

dict

is_prop_enabled(prop: str) bool

Check whether a property is enabled and return a boolean.

Parameters

prop (String) – The option to check if it is enabled

Returns

Whether or not the property is enabled

Return type

Boolean

set(options: dict[str, bool]) None

Set all the options.

Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.

Parameters

options (dict) – dict containing the options you want to set.

Returns

None

validate(raise_error: bool = True) list[str] | None

Validate the options do not conflict and cause errors.

Raises error/warning if so.

Parameters

raise_error (bool) – Flag that raises errors if true. Returns errors if false.

Returns

list of errors (if raise_error is false)

Return type

list(str)

class dataprofiler.profilers.profiler_options.TextProfilerOptions(is_enabled: bool = True, is_case_sensitive: bool = True, stop_words: set[str] = None, top_k_chars: int = None, top_k_words: int = None)

Bases: dataprofiler.profilers.profiler_options.BaseInspectorOptions

For configuring options for text profiler.

Construct the TextProfilerOptions object with default values.

Variables
  • is_enabled (bool) – boolean option to enable/disable the option.

  • is_case_sensitive (bool) – option set for case sensitivity.

  • stop_words (Union[None, list(str)]) – option set for stop words.

  • top_k_chars (Union[None, int]) – option set for number of top common characters.

  • top_k_words (Union[None, int]) – option set for number of top common words.

  • words (BooleanOption) – option set for word update.

  • vocab (BooleanOption) – option set for vocab update.
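
How is_case_sensitive, stop_words, and top_k_words interact can be sketched with a small word counter. The function count_words is hypothetical and is not the profiler's algorithm; the tokenization (runs of word characters) and the choice to lowercase stop words only when case is ignored are assumptions made for illustration.

```python
import re
from collections import Counter


def count_words(text, is_case_sensitive=True, stop_words=None, top_k_words=None):
    """Hypothetical helper: return (word, count) pairs, most frequent first."""
    stop = set(stop_words or ())
    if not is_case_sensitive:
        # Fold both the text and the stop words so they compare equal.
        text = text.lower()
        stop = {word.lower() for word in stop}
    words = re.findall(r"\w+", text)
    kept = [word for word in words if word not in stop]
    # most_common(None) returns every word, so None means "no cap".
    return Counter(kept).most_common(top_k_words)
```

For the text "The cat and the hat" with stop word "and", ignoring case merges "The" and "the" into one entry counted twice, while the case-sensitive default keeps them apart.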

is_prop_enabled(prop: str) bool

Check whether a property is enabled and return a boolean.

Parameters

prop (String) – The option to check if it is enabled

Returns

Whether or not the property is enabled

Return type

Boolean

property properties: dict[str, BooleanOption]

Return a copy of the option properties.

Returns

dictionary of the option’s properties attr: value

Return type

dict

set(options: dict[str, bool]) None

Set all the options.

Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.

Parameters

options (dict) – dict containing the options you want to set.

Returns

None

validate(raise_error: bool = True) list[str] | None

Validate the options do not conflict and cause errors.

Raises error/warning if so.

Parameters

raise_error (bool) – Flag that raises errors if true. Returns errors if false.

Returns

list of errors (if raise_error is false)

Return type

list(str)

class dataprofiler.profilers.profiler_options.StructuredOptions(null_values: dict[str, re.RegexFlag | int] = None, column_null_values: dict[int, dict[str, re.RegexFlag | int]] = None)

Bases: dataprofiler.profilers.profiler_options.BaseOption

For configuring options for structured profiler.

Construct the StructuredOptions object with default values.

Parameters
  • null_values – null values we input.

  • column_null_values – column level null values we input.

Variables
  • int (IntOptions) – option set for int profiling.

  • float (FloatOptions) – option set for float profiling.

  • datetime (DateTimeOptions) – option set for datetime profiling.

  • text (TextOptions) – option set for text profiling.

  • order (OrderOptions) – option set for order profiling.

  • category (CategoricalOptions) – option set for category profiling.

  • data_labeler (DataLabelerOptions) – option set for data_labeler profiling.

  • correlation (CorrelationOptions) – option set for correlation profiling.

  • chi2_homogeneity (BooleanOption) – option set for chi2_homogeneity matrix

  • null_replication_metrics (BooleanOption) – option set for metrics calculation for replicating NaN values

  • null_values (Union[None, dict]) – option set for defined null values
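
The null_values signature, dict[str, re.RegexFlag | int], suggests a mapping from a pattern to the regex flags used when matching it. The sketch below shows one way such a mapping could be applied. Both the sample mapping and the is_null helper are hypothetical, and matching with re.fullmatch is an assumption; this reference does not describe how the profiler applies the patterns.

```python
import re

# Hypothetical mapping in the documented shape: pattern -> regex flags
# (0 means no flags).
null_values = {
    "": 0,
    "nan": re.IGNORECASE,
    "none": re.IGNORECASE,
    "--*": 0,
}


def is_null(value: str, null_values: dict) -> bool:
    """Hypothetical helper: True if the whole value matches any null pattern."""
    return any(
        re.fullmatch(pattern, value, flags) is not None
        for pattern, flags in null_values.items()
    )
```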

property enabled_profiles: list[str]

Return a list of the enabled profilers for columns.

property properties: dict[str, BooleanOption]

Return a copy of the option properties.

Returns

dictionary of the option’s properties attr: value

Return type

dict

set(options: dict[str, bool]) None

Set all the options.

Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.

Parameters

options (dict) – dict containing the options you want to set.

Returns

None

validate(raise_error: bool = True) list[str] | None

Validate the options do not conflict and cause errors.

Raises error/warning if so.

Parameters

raise_error (bool) – Flag that raises errors if true. Returns errors if false.

Returns

list of errors (if raise_error is false)

Return type

list(str)

class dataprofiler.profilers.profiler_options.UnstructuredOptions

Bases: dataprofiler.profilers.profiler_options.BaseOption

For configuring options for unstructured profiler.

Construct the UnstructuredOptions object with default values.

Variables
property enabled_profiles: list[str]

Return a list of the enabled profilers.

property properties: dict[str, BooleanOption]

Return a copy of the option properties.

Returns

dictionary of the option’s properties attr: value

Return type

dict

set(options: dict[str, bool]) None

Set all the options.

Send in a dict that contains all of or a subset of the appropriate options. Set the values of the options. Will raise error if the formatting is improper.

Parameters

options (dict) – dict containing the options you want to set.

Returns

None

validate(raise_error: bool = True) list[str] | None

Validate the options do not conflict and cause errors.

Raises error/warning if so.

Parameters

raise_error (bool) – Flag that raises errors if true. Returns errors if false.

Returns

list of errors (if raise_error is false)

Return type

list(str)

class dataprofiler.profilers.profiler_options.ProfilerOptions(presets: Optional[str] = None)

Bases: dataprofiler.profilers.profiler_options.BaseOption

For configuring options for profiler.

Initialize the ProfilerOptions object.

Variables
  • structured_options (StructuredOptions) – option set for structured dataset profiling.

  • unstructured_options (UnstructuredOptions) – option set for unstructured dataset profiling.

property properties: dict[str, BooleanOption]

Return a copy of the option properties.

Returns

dictionary of the option’s properties attr: value

Return type

dict

validate(raise_error: bool = True) list[str] | None

Validate the options do not conflict and cause errors.

Raises error/warning if so.

Parameters

raise_error (bool) – Flag that raises errors if true. Returns errors if false.

Returns

list of errors (if raise_error is false)

Return type

list(str)

set(options: dict[str, bool]) None

Override BaseOption.set.

We do this because the type (unstructured/structured) may need to be specified if the same options exist within both self.structured_options and self.unstructured_options.

Parameters

options (dict) – Dictionary of options to set

Returns

None
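
The reason the option type may need to be specified can be seen in a toy model where both groups expose an option with the same name. Everything below is hypothetical: SimpleNamespace objects stand in for the option classes, set_by_path is not the library's method, the dotted-path key format is an assumption, and giving both groups a data_labeler entry is purely for illustration.

```python
from types import SimpleNamespace


def set_by_path(root, options: dict) -> None:
    """Hypothetical helper: set nested attributes addressed by dotted paths."""
    for path, value in options.items():
        *parents, leaf = path.split(".")
        target = root
        for name in parents:
            target = getattr(target, name)  # raises AttributeError if missing
        if not hasattr(target, leaf):
            raise AttributeError(f"unknown option: {path}")
        setattr(target, leaf, value)


# Both groups carry an option with the same name, so the prefix disambiguates.
toy = SimpleNamespace(
    structured_options=SimpleNamespace(
        data_labeler=SimpleNamespace(is_enabled=True)
    ),
    unstructured_options=SimpleNamespace(
        data_labeler=SimpleNamespace(is_enabled=True)
    ),
)
set_by_path(toy, {"structured_options.data_labeler.is_enabled": False})
```

Only the structured group changes here; without the prefix, the key would not say which of the two data_labeler entries is meant.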