dataprofiler.profilers.unstructured_text_profile module

For profiling unstructured text data.

class dataprofiler.profilers.unstructured_text_profile.TextProfiler(name: str | None, options: TextProfilerOptions = None)

Bases: object

Profiles text data.

Initialize TextProfiler object.

Parameters:
  • name (str | None) – Name of the data being profiled

  • options (TextProfilerOptions) – Options for the Text Profiler

type = 'text'
diff(other_profile: TextProfiler, options: dict | None = None) dict

Find the differences between two unstructured text profiles.

Parameters:
  • other_profile (TextProfiler) – profile to find the difference with

  • options (dict) – options for diff output

Returns:

the difference between profiles

Return type:

dict
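A minimal sketch of comparing two profiles with diff(), assuming dataprofiler and pandas are installed. The helper name diff_text_profiles and the profile names "a" and "b" are hypothetical, and the exact keys of the returned dict are not specified here, so the sketch only relies on the documented dict return type.

```python
def diff_text_profiles(texts_a, texts_b):
    """Profile two collections of strings and return their diff (hypothetical helper)."""
    # Imported inside the function because dataprofiler is a heavy dependency;
    # this assumes both it and pandas are installed.
    import pandas as pd
    from dataprofiler.profilers.unstructured_text_profile import TextProfiler

    # update() returns the updated profiler, so construction and update chain.
    profile_a = TextProfiler("a").update(pd.Series(list(texts_a)))
    profile_b = TextProfiler("b").update(pd.Series(list(texts_b)))

    # diff() is documented to return a dict describing the differences.
    return profile_a.diff(profile_b)


if __name__ == "__main__":
    differences = diff_text_profiles(
        ["the cat sat on the mat"],
        ["the dog sat on the log"],
    )
    for key, value in differences.items():
        print(key, value)
```

Both arguments to diff() must be TextProfiler instances, per the signature above.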

report(remove_disabled_flag: bool = False) dict

Return the profile as a dict. If remove_disabled_flag is True, entries belonging to disabled options may be removed from the reported profile.

property profile: dict

Return the profile of the text data.

Returns:

profile of the text data

Return type:

dict

update(data: Series) TextProfiler

Update the profile with new text data.

Parameters:

data (pandas.core.series.Series) – series of text samples to profile

Returns:

updated TextProfiler

Return type:

TextProfiler
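The update-then-report flow can be sketched as follows, assuming dataprofiler and pandas are installed. The helper name profile_texts, the default name "sample_text", and the sample strings are hypothetical; the keys of the reported dict may differ between library versions, so none are assumed.

```python
def profile_texts(texts, name="sample_text"):
    """Build a TextProfiler from an iterable of strings (hypothetical helper)."""
    # Imported inside the function because dataprofiler is a heavy dependency;
    # this assumes both it and pandas are installed.
    import pandas as pd
    from dataprofiler.profilers.unstructured_text_profile import TextProfiler

    # options is left at its default (None), so default settings apply.
    profiler = TextProfiler(name)

    # update() takes a pandas Series and returns the updated profiler.
    return profiler.update(pd.Series(list(texts)))


if __name__ == "__main__":
    profiler = profile_texts(["The quick brown fox", "jumps over the lazy dog"])

    # report() returns the profile as a dict; print only its keys here.
    print(sorted(profiler.report().keys()))
```

Because update() returns the profiler itself, it can be called repeatedly to fold additional batches of text into the same profile.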