Base Validators

Build a model for a dataset by identifying each column's type along with its respective parameters.

dataprofiler.validators.base_validators.is_in_range(x: float | int, config: dict) → bool

Check whether x is within the range given by the config.

Parameters

x (int/float) – number
config (dict) – configuration

Returns

bool
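
As a rough illustration of the check described above, the sketch below assumes the config carries 'start' and 'end' bounds, as in the example config shown for Validator.validate, and that both bounds are inclusive. The name in_range_sketch is hypothetical and is not part of dataprofiler; the library's actual handling of bounds and type coercion may differ.

```python
def in_range_sketch(x, config):
    """Return True if x lies between config['start'] and config['end'].

    Hypothetical stand-in for illustration; inclusive bounds are an assumption.
    """
    return config["start"] <= x <= config["end"]


range_config = {"start": 1, "end": 2}
print(in_range_sketch(1.5, range_config))  # inside the assumed bounds
print(in_range_sketch(3, range_config))    # outside the assumed bounds
```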

dataprofiler.validators.base_validators.is_in_list(x: str, config: dict) → bool

Check whether x is in the list given by the config.

Parameters

x (string) – item
config (dict) – configuration

Returns

bool
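
The membership check can be pictured with the minimal sketch below. It assumes the allowed values arrive as a plain list, like the [1,2,3] entry in the example config for Validator.validate. The name in_list_sketch is hypothetical and is not part of dataprofiler, and any type conversion the library performs before comparing is not modelled here.

```python
def in_list_sketch(x, allowed):
    """Return True if x is one of the allowed values.

    Hypothetical stand-in for illustration; not the library's implementation.
    """
    return x in allowed


allowed_values = [1, 2, 3]
print(in_list_sketch(2, allowed_values))  # 2 is one of the allowed values
print(in_list_sketch(5, allowed_values))  # 5 is not
```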

class dataprofiler.validators.base_validators.Validator

Bases: object

For validating a data set.

Initialize Validator object.

validate(data: pd.DataFrame | dd.DataFrame, config: dict) → None

Validate a data set.

Validating a partial data set is not supported.

The configuration is supplied on each run rather than when the class is instantiated, so the same Validator can be run multiple times with different configurations without being reinstantiated.

Parameters

data (DataFrame Dask/Pandas) – The data to be processed by the validator. Processing occurs in a column-wise fashion.
config (dict) – Configuration for how the validator should run across the given data. The validator only runs over columns specified in the configuration.

Example

This is an example of the config:

config = {
    <column_name>: {
        'range': {
            'start': 1,
            'end': 2
        },
        'list': [1, 2, 3]
    }
}

get() → dict

Get the results of the validation run.
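
To make the run-then-fetch pattern concrete, here is a simplified, self-contained sketch. ValidatorSketch is a hypothetical stand-in, not the dataprofiler class: it uses a plain dict of lists in place of a Pandas or Dask DataFrame, and the result layout (row positions that fail each check) is an assumption chosen for illustration. It does show the two behaviours described above: only columns named in the config are checked, and the same instance can be rerun with a different config.

```python
class ValidatorSketch:
    """Hypothetical, simplified stand-in for the documented Validator."""

    def __init__(self):
        self._results = {}

    def validate(self, data, config):
        # data: column name -> list of values (stand-in for a DataFrame).
        # config: column name -> {'range': {'start', 'end'}, 'list': [...]}.
        results = {}
        for column, checks in config.items():  # only configured columns
            values = data[column]
            column_result = {}
            if "range" in checks:
                low = checks["range"]["start"]
                high = checks["range"]["end"]
                # Row positions falling outside the (assumed inclusive) range.
                column_result["range"] = [
                    i for i, v in enumerate(values) if not low <= v <= high
                ]
            if "list" in checks:
                allowed = checks["list"]
                # Row positions whose value is not in the allowed list.
                column_result["list"] = [
                    i for i, v in enumerate(values) if v not in allowed
                ]
            results[column] = column_result
        self._results = results

    def get(self):
        # Results of the most recent validation run.
        return self._results


data = {"a": [1, 2, 5], "b": [10, 20, 30]}
validator = ValidatorSketch()

# First run: check column "a" only.
validator.validate(data, {"a": {"range": {"start": 1, "end": 2}, "list": [1, 2, 3]}})
first = validator.get()

# Second run on the same instance with a different config.
validator.validate(data, {"b": {"list": [10, 20]}})
second = validator.get()
```

Passing the config to validate rather than to the constructor is what lets the second run reuse the instance; get always reflects the latest run.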