Unstructured Text Profile

Module for profiling unstructured text data.

class dataprofiler.profilers.unstructured_text_profile.TextProfiler(name: Optional[str], options: Optional[dataprofiler.profilers.profiler_options.TextProfilerOptions] = None)

Bases: object

Profiles text data.

Initialize TextProfiler object.

Parameters
  • name (str) – Name of the data

  • options (TextProfilerOptions) – Options for the text profiler

type = 'text'
diff(other_profile: dataprofiler.profilers.unstructured_text_profile.TextProfiler, options: Optional[Dict] = None) → Dict

Find the differences for two unstructured text profiles.

Parameters
  • other_profile (TextProfiler) – profile to find the difference with

  • options (dict) – options for diff output

Returns

the difference between profiles

Return type

dict
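To make the idea of a profile difference concrete, here is a minimal sketch using only the standard library. It does not call DataProfiler and does not reproduce the actual structure of the dict returned by diff; the helper name diff_word_counts and the choice to compare per-word counts are assumptions made for illustration.

```python
from collections import Counter


def diff_word_counts(left, right):
    """Hypothetical helper: return {word: left_count - right_count}
    for every word whose count differs between the two Counters."""
    words = set(left) | set(right)
    # Counter returns 0 for missing keys, so absent words compare cleanly.
    return {w: left[w] - right[w] for w in sorted(words) if left[w] != right[w]}


# Two toy "profiles" built from whitespace-split text.
a = Counter("the cat sat on the mat".split())
b = Counter("the dog sat".split())

delta = diff_word_counts(a, b)
```

Words with identical counts in both inputs ("sat" here) are omitted, so the result only lists what changed.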

report(remove_disabled_flag: bool = False) → Dict

Report the profile attribute of the class. When remove_disabled_flag is True, values may be popped from self.profile before it is returned.
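The effect of a flag like remove_disabled_flag can be sketched as follows. This is a standalone illustration, not DataProfiler's code: the function name report_sketch, the disabled_keys argument, and the sample keys are all hypothetical, and which values the real method removes is not stated in this reference.

```python
def report_sketch(profile, disabled_keys, remove_disabled_flag=False):
    """Hypothetical stand-in: return a copy of `profile`, dropping the
    entries named in `disabled_keys` only when the flag is set."""
    result = dict(profile)  # shallow copy so the caller's dict is untouched
    if remove_disabled_flag:
        for key in disabled_keys:
            result.pop(key, None)  # ignore keys that are not present
    return result


# Sample data with made-up keys, purely for illustration.
sample = {"words": ["cat", "mat"], "word_count": {"cat": 1, "mat": 1}}

full = report_sketch(sample, disabled_keys=["words"])
trimmed = report_sketch(sample, disabled_keys=["words"], remove_disabled_flag=True)
```

The sketch copies the dict before popping; whether the real method mutates self.profile in place should be checked against the library source.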

property profile: Dict

Return the profile of the column.

Returns

profile of the column

Return type

dict

update(data: pandas.core.series.Series) → dataprofiler.profilers.unstructured_text_profile.TextProfiler

Update the column profile.

Parameters

data (pandas.core.series.Series) – Series containing the text data to profile

Returns

updated TextProfiler

Return type

TextProfiler
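Because update returns a TextProfiler, successive batches can be fed in one after another. The pattern can be sketched with a toy stand-in class; ToyTextProfile is hypothetical, takes plain lists of strings instead of a pandas Series, and assumes the returned object is the same instance that was updated, which this reference does not confirm.

```python
from collections import Counter


class ToyTextProfile:
    """Hypothetical stand-in that mimics the update/profile interface."""

    def __init__(self, name):
        self.name = name
        self._word_count = Counter()

    def update(self, data):
        """Fold a batch of strings into the running counts and return self."""
        for text in data:
            self._word_count.update(text.split())
        return self

    @property
    def profile(self):
        """Return the accumulated counts as a plain dict."""
        return {"word_count": dict(self._word_count)}


# Two batches applied in sequence; the second call builds on the first.
toy = ToyTextProfile("notes").update(["hello world"]).update(["hello again"])
counts = toy.profile["word_count"]
```

Returning the profiler from update is what allows the chained calls; the accumulated state reflects every batch seen so far.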