Char Load Tf Model

Contains the class for training a data labeler model.

class dataprofiler.labelers.char_load_tf_model.CharLoadTFModel(model_path: str, label_mapping: Dict[str, int], parameters: Optional[Dict] = None)

Bases: dataprofiler.labelers.base_model.BaseTrainableModel

Class for training a data labeler model.

Initialize Loadable TF Model.

Parameters
  • model_path (str) – path to model to load

  • label_mapping (dict) – maps labels to their encoded integers

  • parameters (dict) –

    Contains all the appropriate parameters for the model. Must contain num_labels. Other possible parameters are:

    max_length, max_char_encoding_id, dim_embed, size_fc, dropout, size_conv, num_fil, optimizer, default_label

Returns

None
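As a rough illustration of the two dictionary arguments described above, the sketch below builds a label_mapping and a parameters dict in plain Python. The label names, the model path in the comment, and the parameter values are all hypothetical. Deriving num_labels as the largest encoded value plus one is an assumption, not documented behavior.

```python
from typing import Any, Dict

# Hypothetical label mapping: labels -> encoded integers.
label_mapping: Dict[str, int] = {"PAD": 0, "UNKNOWN": 1, "ADDRESS": 2, "PHONE": 3}

# Hypothetical parameters. Only num_labels is described as required;
# the other keys are taken from the list of optional parameters.
parameters: Dict[str, Any] = {
    "num_labels": max(label_mapping.values()) + 1,  # assumption: max index + 1
    "max_length": 3400,          # illustrative value only
    "default_label": "UNKNOWN",  # illustrative value only
}

# These objects would then be passed to the constructor, for example:
# CharLoadTFModel("path/to/model", label_mapping, parameters)
print(parameters["num_labels"])
```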

requires_zero_mapping: bool = False

set_label_mapping(label_mapping: Union[List[str], Dict[str, int]]) → None

Set the labels for the model.

Parameters

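label_mapping (Union[list, dict]) – label mapping of the model, given either as a list of labels or as a dict mapping labels to their encoded integers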

Returns

None
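Because the signature accepts either a list of labels or a dict, it can help to see how a list might be normalized into a mapping. The helper below is a hypothetical stand-in written for illustration, not the library's code. Numbering list entries from 1 is an assumption, motivated only by requires_zero_mapping being False for this class.

```python
from typing import Dict, List, Union


def to_label_mapping(
    labels: Union[List[str], Dict[str, int]], start_index: int = 1
) -> Dict[str, int]:
    """Hypothetical helper: normalize a list or dict of labels into a dict."""
    if isinstance(labels, dict):
        # A dict is already a mapping; copy it so the caller's dict is untouched.
        return dict(labels)
    # Assumption: list positions become consecutive codes from start_index.
    return {label: start_index + i for i, label in enumerate(labels)}


mapping_from_list = to_label_mapping(["UNKNOWN", "ADDRESS", "PHONE"])
mapping_from_dict = to_label_mapping({"UNKNOWN": 1, "ADDRESS": 2})
```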

save_to_disk(dirpath: str) → None

Save whole model to disk with weights.

Parameters

dirpath (str) – directory path where you want to save the model to

Returns

None

classmethod load_from_disk(dirpath: str) → dataprofiler.labelers.char_load_tf_model.CharLoadTFModel

Load whole model from disk with weights.

Parameters

dirpath (str) – directory path where you want to load the model from

Returns

loaded CharLoadTFModel

Return type

CharLoadTFModel

reset_weights() → None

Reset the weights of the model.

Returns

None

fit(train_data: Union[pandas.core.frame.DataFrame, pandas.core.series.Series, numpy.ndarray], val_data: Optional[Union[pandas.core.frame.DataFrame, pandas.core.series.Series, numpy.ndarray]] = None, batch_size: Optional[int] = None, epochs: Optional[int] = None, label_mapping: Optional[Dict[str, int]] = None, reset_weights: bool = False, verbose: bool = True) → Tuple[Dict, Optional[float], Dict]

Train the current model with the training data and validation data.

Parameters
  • train_data (Union[list, np.ndarray]) – Training data used to train model

  • val_data (Union[list, np.ndarray]) – Validation data used to validate the training

  • batch_size (int) – Used to determine number of samples in each batch

  • label_mapping (Union[dict, None]) – maps labels to their encoded integers

  • reset_weights (bool) – Flag to determine whether to reset the weights or not

  • verbose (bool) – Flag to determine whether to print status or not

Returns

history, f1, f1_report

Return type

Tuple[dict, Optional[float], dict]

predict(data: Union[pandas.core.frame.DataFrame, pandas.core.series.Series, numpy.ndarray], batch_size: int = 32, show_confidences: bool = False, verbose: bool = True) → Dict

Run model and get predictions.

Parameters
  • data (Union[list, numpy.ndarray]) – text input

  • batch_size (int) – number of samples in the batch of data

  • show_confidences (bool) – whether to include prediction confidences in the output

  • verbose (bool) – Flag to determine whether to print status or not

Returns

char level predictions and confidences

Return type

dict
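Character-level predictions are typically integer label codes, one per character, so a caller usually decodes them back to label names. The sketch below shows that decoding step on made-up data. The "pred" key, the nested-list layout, and the label names are assumptions for illustration and are not stated in this reference.

```python
from typing import Dict, List

# Hypothetical reverse mapping: encoded integers -> labels.
reverse_label_mapping: Dict[int, str] = {1: "UNKNOWN", 2: "ADDRESS", 3: "PHONE"}

# Hypothetical prediction output for a single 5-character sample.
# Assumption: predictions are stored under a "pred" key, one code per character.
predictions = {"pred": [[1, 1, 3, 3, 3]]}


def decode(pred_rows: List[List[int]], reverse: Dict[int, str]) -> List[List[str]]:
    """Translate per-character integer codes into label names."""
    return [[reverse[code] for code in row] for row in pred_rows]


decoded = decode(predictions["pred"], reverse_label_mapping)
```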

details() → None

Print the relevant details of the model.

Details include summary, parameters, label mapping.

add_label(label: str, same_as: Optional[str] = None) → None

Add a label to the data labeler.

Parameters
  • label (str) – new label being added to the data labeler

  • same_as (str) – existing label whose encoding index the new label should share, so that multiple labels map to a single encoding index

Returns

None
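The effect of same_as can be sketched on a plain dict. The function below is an illustrative stand-in, not the library's implementation. In particular, giving a new label the next unused index when same_as is omitted, and raising ValueError for an unknown same_as, are assumptions.

```python
from typing import Dict, Optional


def add_label(
    mapping: Dict[str, int], label: str, same_as: Optional[str] = None
) -> None:
    """Illustrative stand-in: add a label to a mapping in place."""
    if label in mapping:
        return  # label already present, nothing to do
    if same_as is not None:
        if same_as not in mapping:
            raise ValueError(f"same_as label {same_as!r} is not in the mapping")
        # Share the existing label's encoding index.
        mapping[label] = mapping[same_as]
    else:
        # Assumption: a new label takes the next unused index.
        mapping[label] = max(mapping.values()) + 1


mapping = {"PAD": 0, "UNKNOWN": 1, "ADDRESS": 2}
add_label(mapping, "PHONE")                      # gets a new index
add_label(mapping, "STREET", same_as="ADDRESS")  # shares ADDRESS's index
```

After these calls, two labels point at the same encoding index, which is the multi-label to single-index case the same_as parameter describes.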

classmethod get_class(class_name: str) → Optional[Type[dataprofiler.labelers.base_model.BaseModel]]

Get the subclass matching class_name, if one exists.

get_parameters(param_list: Optional[List[str]] = None) → Dict

Return a dict of parameters from the model given a list.

Parameters

param_list (List[str]) – list of parameters to retrieve from the model.

Returns

dict of parameters
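Selecting parameters by name amounts to filtering a dict by a list of keys. The sketch below shows that idea with a hypothetical helper and made-up values. Returning every parameter when no list is given, and silently skipping unknown names, are choices made for this sketch and may differ from the library's behavior.

```python
from typing import Any, Dict, List, Optional


def select_parameters(
    params: Dict[str, Any], param_list: Optional[List[str]] = None
) -> Dict[str, Any]:
    """Hypothetical helper: return the requested subset of a parameter dict."""
    if param_list is None:
        # Assumption: no list means "return everything".
        return dict(params)
    # Unknown names are skipped here; a real implementation might raise instead.
    return {name: params[name] for name in param_list if name in params}


params = {"max_length": 3400, "default_label": "UNKNOWN", "num_labels": 4}
subset = select_parameters(params, ["max_length", "num_labels"])
```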

classmethod help() → None

Help describe alterable parameters.

Returns

None

property label_mapping: Dict[str, int]

Return mapping of labels to their encoded values.

property labels: List[str]

Retrieve the labels.

Returns

list of labels

property num_labels: int

Return the number of labels, derived from the maximum value in the label mapping.

property reverse_label_mapping: Dict[int, str]

Return the reverse of the label mapping, from encoded integers to labels.

Useful when labels need to be looked up by their encoded indices.
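A reverse mapping of this kind can be sketched with a dict comprehension. The labels below are hypothetical and the code is an illustration, not the library's implementation. When several labels share one index, a plain inversion keeps only the last label seen for that index.

```python
from typing import Dict

# Hypothetical mapping in which ADDRESS and STREET share index 2.
label_mapping: Dict[str, int] = {"PAD": 0, "UNKNOWN": 1, "ADDRESS": 2, "STREET": 2}

# Invert labels -> indices into indices -> labels.
# Later entries overwrite earlier ones that have the same index.
reverse_label_mapping: Dict[int, str] = {
    index: label for label, index in label_mapping.items()
}
```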

set_params(**kwargs: Any) → None

Set the parameters given as kwargs, if they exist.