Base Validators

Build a model for a dataset by identifying the type of each column along with its respective parameters.

dataprofiler.validators.base_validators.is_in_range(x, config)

Checks whether x is within the range given in the config.

Parameters
  • x (int/float) – number

  • config (dict) – configuration

Returns

bool
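
The range check can be illustrated with a minimal sketch. This is not the library's implementation: the name `is_in_range_sketch` is hypothetical, the `'start'` and `'end'` keys are taken from the config example shown for `validate`, and treating both bounds as inclusive is an assumption.

```python
def is_in_range_sketch(x, config):
    """Hypothetical stand-in for a range check.

    Assumes config holds 'start' and 'end' keys and that both
    bounds are inclusive; the real function may behave differently.
    """
    return float(config["start"]) <= float(x) <= float(config["end"])


range_config = {"start": 1, "end": 2}
print(is_in_range_sketch(1.5, range_config))  # inside the bounds
print(is_in_range_sketch(3, range_config))    # outside the bounds
```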

dataprofiler.validators.base_validators.is_in_list(x, config)

Checks whether x is in the config list.

Parameters
  • x (string) – item

  • config (dict) – configuration

Returns

bool
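
A membership check of this kind might look like the sketch below. The name `is_in_list_sketch` is hypothetical, and the sketch assumes the configuration is a plain list of allowed values, as in the `list` entry of the config example shown for `validate`; the real function may coerce or compare values differently.

```python
def is_in_list_sketch(x, allowed):
    """Hypothetical stand-in for a list-membership check.

    Assumes `allowed` is a plain list of permitted values and that
    items are compared with ordinary equality.
    """
    return x in allowed


allowed_values = [1, 2, 3]
print(is_in_list_sketch(2, allowed_values))  # 2 is one of the allowed values
print(is_in_list_sketch(7, allowed_values))  # 7 is not
```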

class dataprofiler.validators.base_validators.Validator

Bases: object

validate(data, config)

Validate a data set. There is no option for validating a partial data set.

The configuration is set when validate is run, not when the class is instantiated, so you can run the validator multiple times with different configurations without reinstantiating the class.

Parameters
  • data (DataFrame Dask/Pandas) – The data to be processed by the validator. Processing occurs in a column-wise fashion.

  • config (dict) – configuration for how the validator should run across the given data. The validator runs only over the columns specified in the configuration.

Example

This is an example of the config:

config = {
    <column_name>: {
        'range': {
            'start': 1,
            'end': 2
        },
        'list': [1, 2, 3]
    }
}

get()

Get the results of the validation run.
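
The validate-then-get workflow, with the configuration supplied per run, can be sketched as follows. `SketchValidator` is a hypothetical stand-in written only to illustrate the documented call pattern; it is not the library's class, and the shape of the report it returns (indices of failing rows per check) is an assumption. The sketch accepts any object that supports `data[column]`, so a dict of lists is used here in place of a DataFrame.

```python
class SketchValidator:
    """Hypothetical stand-in mirroring the documented validate()/get() surface."""

    def __init__(self):
        # No configuration is stored at instantiation; it arrives with each run.
        self._report = None

    def validate(self, data, config):
        """Check only the columns named in config, one column at a time."""
        report = {}
        for column, checks in config.items():
            values = list(data[column])
            column_report = {}
            if "range" in checks:
                bounds = checks["range"]
                # Record row indices whose value falls outside the bounds.
                column_report["range"] = [
                    i for i, v in enumerate(values)
                    if not bounds["start"] <= v <= bounds["end"]
                ]
            if "list" in checks:
                # Record row indices whose value is not an allowed item.
                column_report["list"] = [
                    i for i, v in enumerate(values)
                    if v not in checks["list"]
                ]
            report[column] = column_report
        self._report = report

    def get(self):
        """Return the report from the most recent validate() call."""
        if self._report is None:
            raise RuntimeError("validate() must be called before get()")
        return self._report


data = {"a": [1, 2, 5], "b": [9, 9, 9]}
validator = SketchValidator()

# First run: check column "a" against a range and a list.
validator.validate(data, {"a": {"range": {"start": 1, "end": 2},
                                "list": [1, 2, 3]}})
first_report = validator.get()

# Second run on the same instance with a different configuration.
validator.validate(data, {"b": {"list": [9]}})
second_report = validator.get()

print(first_report)
print(second_report)
```

Because the configuration is an argument to `validate` and not to the constructor, the second run reuses the same instance and simply replaces the earlier report.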