Float Column Profile

class dataprofiler.profilers.float_column_profile.FloatColumn(name, options=None)

Bases: dataprofiler.profilers.numerical_column_stats.NumericStatsMixin, dataprofiler.profilers.base_column_profilers.BaseColumnPrimitiveTypeProfiler

Float column profile with numerical stats mixed in. Represents a column in the dataset which is a float column.

Initialization of column base properties and itself.

Parameters

name (String) – Name of the data

options (FloatOptions) – Options for the float column

type = 'float'
diff(other_profile, options=None)

Finds the differences for FloatColumns.

Parameters

other_profile (FloatColumn) – profile to find the difference with

Returns

the FloatColumn differences

Return type

dict

property profile

Returns the profile of the column.

property precision

Property reporting statistics on the significant figures of each element in the data.

Returns

Precision statistics

Return type

dict
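To make "statistics on the significant figures" concrete, here is a minimal sketch that does not use the library. The helper names, the returned keys, and the counting rule (leading zeros ignored, trailing zeros kept as written, plain decimal strings only) are assumptions for illustration and may differ from what FloatColumn actually reports.

```python
from statistics import mean


def significant_figures(text):
    """Count significant figures in a plain decimal string (hypothetical helper)."""
    # Drop whitespace, the sign, and the decimal point so only digits remain.
    digits = text.strip().lstrip("+-").replace(".", "")
    # Assumption: leading zeros never count; trailing zeros count as written.
    return len(digits.lstrip("0"))


def precision_summary(values):
    """Summarize significant-figure counts; the key names are illustrative only."""
    counts = [significant_figures(v) for v in values]
    return {"min": min(counts), "max": max(counts), "mean": mean(counts)}


summary = precision_summary(["0.80", "1.00", "12.5", "3"])
print(summary)
```

Under these assumptions "0.80" has two significant figures and "1.00" has three, so the spread between min and max hints at how consistently the column was recorded.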

property data_type_ratio

Calculates the ratio of samples which match this data type.

Returns

ratio of data type

Return type

float
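The idea behind a data type ratio can be sketched in a few lines of plain Python. The function name float_match_ratio and the choice to return 0.0 for an empty input are assumptions made for this illustration, not the library's behavior.

```python
def float_match_ratio(samples):
    """Fraction of samples that parse as floats (hypothetical helper)."""
    if not samples:
        # Assumption: an empty sample set is reported as 0.0 in this sketch.
        return 0.0
    matches = 0
    for sample in samples:
        try:
            float(sample)
            matches += 1
        except ValueError:
            # Not parseable as a float, so it does not count as a match.
            pass
    return matches / len(samples)


ratio = float_match_ratio(["0.80", "1.00", "1", "abc"])
print(ratio)
```

Three of the four sample strings parse as floats, so the ratio here is 0.75.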

col_type = None
static is_float(x)

Tests whether a string can be read as a float.

For “0.80” this function returns True
For “1.00” this function returns True
For “1” this function returns True

Parameters

x (str) – string to test

Returns

whether x is a float or not

Return type

bool
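A check with the same documented results can be sketched by attempting a float conversion. The name looks_like_float is hypothetical, and this is an assumption about the approach rather than the library's own code.

```python
def looks_like_float(x):
    """Return True if x can be converted with float() (hypothetical helper)."""
    try:
        float(x)
    except (TypeError, ValueError):
        # Non-numeric strings and unsupported types are not floats.
        return False
    return True


results = {s: looks_like_float(s) for s in ["0.80", "1.00", "1", "abc"]}
print(results)
```

Note that an integer-looking string such as “1” also passes, which matches the documented behavior of is_float.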

static is_int(x)

Tests whether a string represents a whole number.

For “0.80” this function returns False
For “1.00” this function returns True
For “1” this function returns True

Parameters

x (str) – string to test

Returns

whether x is an integer or not

Return type

bool
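The documented results, where “1.00” counts as an integer but “0.80” does not, can be reproduced by parsing the string as a float and checking for a fractional part. The name looks_like_int is hypothetical, and this sketch is an assumption about the approach, not the library's implementation.

```python
def looks_like_int(x):
    """Return True if x parses to a value with no fractional part (hypothetical helper)."""
    try:
        # float.is_integer() is True for 1.0 and False for 0.8, inf, and nan.
        return float(x).is_integer()
    except (TypeError, ValueError):
        return False


results = {s: looks_like_int(s) for s in ["0.80", "1.00", "1", "abc"]}
print(results)
```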

property kurtosis
property mean
property median

Estimates the median of the data.

Returns

the median

Return type

float

property median_abs_deviation

Get median absolute deviation estimated from the histogram of the data:

1. Subtract bin edges from the median value
2. Fold the histogram to positive and negative parts around zero
3. Impose the two bin edges from the two histograms
4. Calculate the counts for the two histograms with the imposed bin edges
5. Superimpose the counts from the two histograms
6. Interpolate the median absolute deviation from the superimposed counts

Returns

median absolute deviation
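For reference, the quantity being estimated can be computed exactly when the raw values are available. The sketch below is not the histogram procedure described above; it is the textbook definition (the median of absolute deviations from the median), and exact_mad is a hypothetical name.

```python
from statistics import median


def exact_mad(values):
    """Exact median absolute deviation from raw values (hypothetical helper)."""
    center = median(values)
    # Median of the absolute distances from the median.
    return median(abs(v - center) for v in values)


mad = exact_mad([1.0, 2.0, 3.0, 4.0, 100.0])
print(mad)
```

The histogram-based property approximates this value without keeping every data point, so small differences from the exact result are to be expected.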

property mode

Finds an estimate for the mode(s) of the data.

Returns

the mode(s) of the data

Return type

list(float)
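One simple way to estimate modes from binned data is to take the centers of the fullest bins. The sketch below assumes that approach; the function name, its inputs, and the tie handling are hypothetical and may not match how FloatColumn derives its estimate.

```python
def histogram_modes(bin_edges, bin_counts):
    """Centers of the bins with the highest count (hypothetical helper)."""
    top = max(bin_counts)
    # Every bin tied for the highest count contributes one mode estimate.
    return [
        (bin_edges[i] + bin_edges[i + 1]) / 2
        for i, count in enumerate(bin_counts)
        if count == top
    ]


modes = histogram_modes([0.0, 1.0, 2.0, 3.0, 4.0], [2, 5, 1, 5])
print(modes)
```

Returning a list rather than a single value leaves room for ties, consistent with the list(float) return type documented above.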

static np_type_to_type(val)

Converts numpy variables to base python type variables

Parameters

val (numpy type or base type) – value to check & change

Returns

base python type

Return type

int or float

property skewness
property stddev
update(df_series)

Updates the column profile.

Parameters

df_series (pandas.core.series.Series) – pandas Series of column data used to update the profile

Returns

None

property variance