Labeler Utils

dataprofiler.labelers.labeler_utils.f1_report_dict_to_str(f1_report, label_names)

Returns the report string from the f1_report dict.

Example Output:

                  precision    recall  f1-score   support

         class 0       0.00      0.00      0.00         1
         class 1       1.00      0.67      0.80         3

       micro avg       0.67      0.50      0.57         4
       macro avg       0.50      0.33      0.40         4
    weighted avg       0.75      0.50      0.60         4

Note: the output format is largely adapted from the classification_report function in sklearn.

  • f1_report (dict) – f1 report dictionary from sklearn

  • label_names (list(str)) – names of labels included in the report

Returns: string representing the f1_report printout

Return type: str
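To illustrate the kind of conversion described above, here is a minimal sketch that turns an sklearn-style report dictionary into an aligned text table. The function name format_f1_report, the column width, and the sample dictionary are assumptions made for illustration; this is not DataProfiler's actual implementation, and the real function's spacing may differ.

```python
def format_f1_report(report, label_names, digits=2):
    """Hypothetical formatter for an sklearn-style report dict (illustration only)."""
    headers = ["precision", "recall", "f1-score", "support"]
    # Only include the averaged rows that are actually present in the dict.
    avg_rows = [k for k in ("micro avg", "macro avg", "weighted avg") if k in report]
    # Width of the first column: the longest row name.
    width = max(len(name) for name in list(label_names) + avg_rows)

    def row(name):
        entry = report[name]
        metrics = "".join(f"{entry[k]:>10.{digits}f}" for k in headers[:3])
        return f"{name:>{width}}" + metrics + f"{int(entry['support']):>10}"

    lines = [" " * width + "".join(f"{h:>10}" for h in headers), ""]
    lines += [row(name) for name in label_names]
    lines.append("")
    lines += [row(name) for name in avg_rows]
    return "\n".join(lines)


# Sample dict mirroring the example output in this section (values assumed).
report = {
    "class 0": {"precision": 0.0, "recall": 0.0, "f1-score": 0.0, "support": 1},
    "class 1": {"precision": 1.0, "recall": 2 / 3, "f1-score": 0.8, "support": 3},
    "micro avg": {"precision": 2 / 3, "recall": 0.5, "f1-score": 4 / 7, "support": 4},
    "macro avg": {"precision": 0.5, "recall": 1 / 3, "f1-score": 0.4, "support": 4},
    "weighted avg": {"precision": 0.75, "recall": 0.5, "f1-score": 0.6, "support": 4},
}

text = format_f1_report(report, ["class 0", "class 1"])
print(text)
```

Passing label_names explicitly, rather than iterating over the dictionary keys, keeps the per-class rows in a caller-chosen order and separates them from the averaged rows.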

dataprofiler.labelers.labeler_utils.evaluate_accuracy(predicted_entities_in_index, true_entities_in_index, num_labels, entity_rev_dict, verbose=True, omitted_labels=('PAD', 'UNKNOWN'), confusion_matrix_file=None)

Evaluates accuracy by comparing the predicted labels with the true labels.

  • predicted_entities_in_index (list(array(int))) – predicted encoded labels for input sentences

  • true_entities_in_index (list(array(int))) – true encoded labels for input sentences

  • num_labels (int) – number of labels

  • entity_rev_dict (dict([index, entity])) – dictionary to convert indices to entities

  • verbose (boolean) – print additional information for debugging

  • omitted_labels (list(str)) – text labels to omit from the accuracy evaluation

  • confusion_matrix_file (str) – file name (including directory) for the confusion matrix

Returns: f1-score

Return type: float
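The evaluation described above can be sketched in plain Python: build a confusion matrix from the encoded labels, skip the omitted labels, and average the per-label F1 scores. This is a sketch under stated assumptions, not the library's code. The name evaluate_sketch is hypothetical, plain lists stand in for the arrays, macro averaging is an assumption (the averaging used by evaluate_accuracy is not stated here), and the entity names in the sample mapping are invented for the example.

```python
def evaluate_sketch(predicted, true, num_labels, entity_rev_dict,
                    omitted_labels=("PAD", "UNKNOWN")):
    """Hypothetical macro-F1 evaluation over non-omitted labels (illustration only)."""
    # conf_mat[t][p] counts tokens whose true label is t and predicted label is p.
    conf_mat = [[0] * num_labels for _ in range(num_labels)]
    for true_seq, pred_seq in zip(true, predicted):
        for t, p in zip(true_seq, pred_seq):
            conf_mat[t][p] += 1

    f1_scores = []
    for label, name in sorted(entity_rev_dict.items()):
        if name in omitted_labels:
            continue  # omitted labels do not contribute to the score
        tp = conf_mat[label][label]
        fp = sum(conf_mat[t][label] for t in range(num_labels)) - tp
        fn = sum(conf_mat[label]) - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)

    macro_f1 = sum(f1_scores) / len(f1_scores) if f1_scores else 0.0
    return macro_f1, conf_mat


# Assumed sample data: two sentences, labels encoded as integer indices.
entity_rev_dict = {0: "PAD", 1: "UNKNOWN", 2: "ADDRESS", 3: "NAME"}
true_labels = [[2, 2, 3, 0], [3, 1, 2]]
pred_labels = [[2, 3, 3, 0], [3, 1, 2]]

score, conf_mat = evaluate_sketch(pred_labels, true_labels, 4, entity_rev_dict)
print(round(score, 2))
```

In this sketch the omitted labels still appear in the confusion matrix, so a token mislabeled as PAD counts as a miss for its true label; only the per-label scores for PAD and UNKNOWN are left out of the average.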