dataprofiler.profilers package
Subpackages
Submodules
- dataprofiler.profilers.base_column_profilers module
BaseColumnProfiler
BaseColumnPrimitiveTypeProfiler
BaseColumnPrimitiveTypeProfiler.sample_size
BaseColumnPrimitiveTypeProfiler.col_type
BaseColumnPrimitiveTypeProfiler.diff()
BaseColumnPrimitiveTypeProfiler.load_from_dict()
BaseColumnPrimitiveTypeProfiler.profile
BaseColumnPrimitiveTypeProfiler.report()
BaseColumnPrimitiveTypeProfiler.update()
BaseColumnPrimitiveTypeProfiler.name
BaseColumnPrimitiveTypeProfiler.metadata
BaseColumnPrimitiveTypeProfiler.times
BaseColumnPrimitiveTypeProfiler.thread_safe
- dataprofiler.profilers.categorical_column_profile module
CategoricalColumn
CategoricalColumn.type
CategoricalColumn.gini_impurity
CategoricalColumn.unalikeability
CategoricalColumn.diff()
CategoricalColumn.report()
CategoricalColumn.load_from_dict()
CategoricalColumn.profile
CategoricalColumn.categories
CategoricalColumn.categorical_counts
CategoricalColumn.unique_ratio
CategoricalColumn.unique_count
CategoricalColumn.is_match
CategoricalColumn.col_type
CategoricalColumn.name
CategoricalColumn.sample_size
CategoricalColumn.metadata
CategoricalColumn.times
CategoricalColumn.thread_safe
CategoricalColumn.update()
- dataprofiler.profilers.column_profile_compilers module
- dataprofiler.profilers.data_labeler_column_profile module
DataLabelerColumn
DataLabelerColumn.type
DataLabelerColumn.thread_safe
DataLabelerColumn.assert_equal_conditions()
DataLabelerColumn.reverse_label_mapping
DataLabelerColumn.possible_data_labels
DataLabelerColumn.rank_distribution
DataLabelerColumn.sum_predictions
DataLabelerColumn.data_label
DataLabelerColumn.avg_predictions
DataLabelerColumn.label_representation
DataLabelerColumn.profile
DataLabelerColumn.load_from_dict()
DataLabelerColumn.report()
DataLabelerColumn.col_type
DataLabelerColumn.diff()
DataLabelerColumn.name
DataLabelerColumn.sample_size
DataLabelerColumn.metadata
DataLabelerColumn.times
DataLabelerColumn.update()
- dataprofiler.profilers.datetime_column_profile module
DateTimeColumn
DateTimeColumn.type
DateTimeColumn.report()
DateTimeColumn.load_from_dict()
DateTimeColumn.profile
DateTimeColumn.data_type_ratio
DateTimeColumn.diff()
DateTimeColumn.update()
DateTimeColumn.col_type
DateTimeColumn.match_count
DateTimeColumn.sample_size
DateTimeColumn.name
DateTimeColumn.metadata
DateTimeColumn.times
DateTimeColumn.thread_safe
- dataprofiler.profilers.float_column_profile module
FloatColumn
FloatColumn.type
FloatColumn.diff()
FloatColumn.report()
FloatColumn.load_from_dict()
FloatColumn.profile
FloatColumn.precision
FloatColumn.data_type_ratio
FloatColumn.col_type
FloatColumn.is_float()
FloatColumn.is_int()
FloatColumn.kurtosis
FloatColumn.mean
FloatColumn.median
FloatColumn.median_abs_deviation
FloatColumn.mode
FloatColumn.np_type_to_type()
FloatColumn.skewness
FloatColumn.stddev
FloatColumn.update()
FloatColumn.variance
FloatColumn.match_count
FloatColumn.sample_size
FloatColumn.name
FloatColumn.metadata
FloatColumn.times
FloatColumn.thread_safe
- dataprofiler.profilers.graph_profiler module
- dataprofiler.profilers.histogram_utils module
- dataprofiler.profilers.int_column_profile module
IntColumn
IntColumn.type
IntColumn.report()
IntColumn.load_from_dict()
IntColumn.profile
IntColumn.data_type_ratio
IntColumn.update()
IntColumn.col_type
IntColumn.diff()
IntColumn.is_float()
IntColumn.is_int()
IntColumn.kurtosis
IntColumn.mean
IntColumn.median
IntColumn.median_abs_deviation
IntColumn.mode
IntColumn.np_type_to_type()
IntColumn.skewness
IntColumn.stddev
IntColumn.variance
IntColumn.match_count
IntColumn.sample_size
IntColumn.name
IntColumn.metadata
IntColumn.times
IntColumn.thread_safe
- dataprofiler.profilers.json_decoder module
- dataprofiler.profilers.json_encoder module
- dataprofiler.profilers.numerical_column_stats module
abstractstaticmethod
NumericStatsMixin
NumericStatsMixin.type
NumericStatsMixin.profile()
NumericStatsMixin.report()
NumericStatsMixin.diff()
NumericStatsMixin.mean
NumericStatsMixin.mode
NumericStatsMixin.median
NumericStatsMixin.variance
NumericStatsMixin.stddev
NumericStatsMixin.skewness
NumericStatsMixin.kurtosis
NumericStatsMixin.median_abs_deviation
NumericStatsMixin.col_type
NumericStatsMixin.load_from_dict()
NumericStatsMixin.name
NumericStatsMixin.sample_size
NumericStatsMixin.metadata
NumericStatsMixin.times
NumericStatsMixin.thread_safe
NumericStatsMixin.update()
NumericStatsMixin.is_float()
NumericStatsMixin.is_int()
NumericStatsMixin.np_type_to_type()
- dataprofiler.profilers.order_column_profile module
- dataprofiler.profilers.profile_builder module
- dataprofiler.profilers.profiler_options module
BaseOption
BooleanOption
HistogramAndQuantilesOption
ModeOption
BaseInspectorOptions
NumericalOptions
IntOptions
PrecisionOptions
FloatOptions
TextOptions
DateTimeOptions
OrderOptions
CategoricalOptions
CorrelationOptions
HyperLogLogOptions
UniqueCountOptions
RowStatisticsOptions
DataLabelerOptions
TextProfilerOptions
StructuredOptions
UnstructuredOptions
ProfilerOptions
- dataprofiler.profilers.profiler_utils module
recursive_dict_update()
KeyDict
shuffle_in_chunks()
warn_on_profile()
partition()
auto_multiprocess_toggle()
suggest_pool_size()
generate_pool()
overlap()
add_nested_dictionaries()
biased_skew()
biased_kurt()
Subtractable
find_diff_of_numbers()
find_diff_of_strings_and_bools()
find_diff_of_lists_and_sets()
find_diff_of_dates()
find_diff_of_dicts()
find_diff_of_matrices()
find_diff_of_dicts_with_diff_keys()
get_memory_size()
method_timeit()
perform_chi_squared_test_for_homogeneity()
chunk()
merge()
merge_profile_list()
reload_labeler_from_options_or_get_new()
- dataprofiler.profilers.text_column_profile module
TextColumn
TextColumn.type
TextColumn.report()
TextColumn.profile
TextColumn.diff()
TextColumn.data_type_ratio
TextColumn.update()
TextColumn.load_from_dict()
TextColumn.col_type
TextColumn.is_float()
TextColumn.is_int()
TextColumn.kurtosis
TextColumn.mean
TextColumn.median
TextColumn.median_abs_deviation
TextColumn.mode
TextColumn.np_type_to_type()
TextColumn.skewness
TextColumn.stddev
TextColumn.variance
TextColumn.min
TextColumn.max
TextColumn.sum
TextColumn.max_histogram_bin
TextColumn.min_histogram_bin
TextColumn.histogram_bin_method_names
TextColumn.histogram_selection
TextColumn.user_set_histogram_bin
TextColumn.bias_correction
TextColumn.num_zeros
TextColumn.num_negatives
TextColumn.histogram_methods
TextColumn.quantiles
TextColumn.match_count
TextColumn.name
TextColumn.sample_size
TextColumn.metadata
TextColumn.times
TextColumn.thread_safe
- dataprofiler.profilers.unstructured_labeler_profile module
- dataprofiler.profilers.unstructured_text_profile module
Module contents
Package for providing statistics and predictions for a given dataset.
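As a rough illustration of the kind of column statistic indexed above, the sketch below computes two measures whose names match CategoricalColumn.gini_impurity and CategoricalColumn.unalikeability. It uses only the Python standard library and the textbook definitions of both measures. It is not the package's implementation, and the function names and the sample column are assumptions made for this example; the package's own handling of edge cases may differ.

```python
from collections import Counter


def gini_impurity(values):
    """Textbook Gini impurity: 1 - sum(p_i ** 2) over category proportions.

    Hypothetical helper for illustration; not the dataprofiler API.
    """
    counts = Counter(values)
    n = sum(counts.values())
    if n == 0:
        return None  # undefined for an empty column
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def unalikeability(values):
    """Fraction of pairs of distinct observations whose categories differ.

    Hypothetical helper for illustration; not the dataprofiler API.
    """
    counts = Counter(values)
    n = sum(counts.values())
    if n < 2:
        return None  # no pairs to compare
    # Ordered pairs (i, j), i != j, that share the same category.
    same_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_pairs / (n * (n - 1))


# Assumed sample data: one categorical column with three categories.
column = ["a", "a", "b", "c"]
print(gini_impurity(column))   # 0.625
print(unalikeability(column))  # 5/6, about 0.833
```

Both measures are 0 when every value in the column is identical and grow as the values spread across more categories, which is why they are useful summaries of a categorical column.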