Numerical Column Stats

Build a model for a dataset by identifying the type of each column along with its respective parameters.

class dataprofiler.profilers.numerical_column_stats.abstractstaticmethod(function)

Bases: staticmethod

class dataprofiler.profilers.numerical_column_stats.NumericStatsMixin(options=None)

Bases: object

Abstract numerical column profile subclass of BaseColumnProfiler. Represents a column in the dataset which is a numerical column. Has subclasses itself.

Initialization of column base properties and itself.

Parameters

options (NumericalOptions) – Options for the numerical stats.

type = None
profile()

Property for profile. Returns the profile of the column.

diff(other_profile, options=None)

Finds the differences for several numerical stats.

Parameters

other_profile (NumericStatsMixin Profile) – profile to find the difference with

Returns

the numerical stats differences

Return type

dict

property mean
property mode

Finds an estimate for the mode(s) of the data.

Returns

the mode(s) of the data

Return type

list(float)

property variance
property stddev
property skewness
property kurtosis
abstract update(df_series)

Abstract method for updating the numerical profile properties with an uncleaned dataset.

Parameters

df_series (pandas.core.series.Series) – df series with nulls removed

Returns

None

static is_float(x)

For “0.80” this function returns True
For “1.00” this function returns True
For “1” this function returns True

Parameters

x (str) – string to test

Returns

True if the string can be parsed as a float, False otherwise

Return type

bool
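The behaviour listed for is_float can be approximated with a plain parse attempt. The following is a minimal sketch only, not the library's implementation: is_float_sketch is a hypothetical stand-in name, and the "abc" input is an extra illustrative case that the documentation above does not cover.

```python
def is_float_sketch(x: str) -> bool:
    """Hypothetical stand-in for is_float: True if `x` parses as a float."""
    try:
        float(x)  # raises ValueError for strings that are not numeric
    except (TypeError, ValueError):
        return False
    return True


# The three documented cases, plus one non-numeric string (an assumption).
for text in ("0.80", "1.00", "1", "abc"):
    print(text, is_float_sketch(text))
```

Because every integer string is also a valid float string, "1" is accepted here, which matches the third documented case.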

static is_int(x)

For “0.80” this function returns False
For “1.00” this function returns True
For “1” this function returns True

Parameters

x (str) – string to test

Returns

True if the string represents an integer value, False otherwise

Return type

bool
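The documented cases for is_int show that a string such as “1.00” counts as an integer even though it contains a decimal point. One way to reproduce that behaviour, shown here as a sketch under the assumption that the check is value-based rather than format-based, is to parse the string as a float and test whether the value is whole. is_int_sketch is a hypothetical name, not part of the library.

```python
def is_int_sketch(x: str) -> bool:
    """Hypothetical stand-in for is_int: True if `x` holds a whole number."""
    try:
        # float.is_integer() is True for 1.0 and False for 0.8
        return float(x).is_integer()
    except (TypeError, ValueError):
        return False


# The three documented cases.
for text in ("0.80", "1.00", "1"):
    print(text, is_int_sketch(text))
```

A format-based check such as str.isdigit() would reject “1.00”, which is why this sketch compares values instead.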

static np_type_to_type(val)

Converts numpy variables to base python type variables

Parameters

val (numpy type or base type) – value to check & change

Returns

base python type

Return type

int or float
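A conversion of this kind can be sketched with NumPy's abstract scalar classes. The function name np_type_to_type_sketch is hypothetical, and the choice to pass non-NumPy values through unchanged is an assumption based on the "numpy type or base type" parameter description, not a confirmed detail of the library.

```python
import numpy as np


def np_type_to_type_sketch(val):
    """Hypothetical stand-in: convert NumPy scalars to int or float."""
    if isinstance(val, np.integer):   # np.int32, np.int64, ...
        return int(val)
    if isinstance(val, np.floating):  # np.float32, np.float64, ...
        return float(val)
    return val                        # assumed: base types pass through


for value in (np.int64(3), np.float32(1.5), 7):
    converted = np_type_to_type_sketch(value)
    print(type(value).__name__, "->", type(converted).__name__)
```

Converting to base types is useful when the values are later serialized, since NumPy scalars are not always accepted where a plain int or float is expected.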