Classification Report Utils

dataprofiler.labelers.classification_report_utils.convert_confusion_matrix_to_MCM(conf_matrix)

Converts a confusion matrix into the MCM format used by sklearn for precision/recall/fscore/support computation. The format is as specified by sklearn: in the multilabel confusion matrix \(MCM\), the count of true negatives is \(MCM_{:,0,0}\), false negatives is \(MCM_{:,1,0}\), true positives is \(MCM_{:,1,1}\) and false positives is \(MCM_{:,0,1}\). Note: this utilizes code/methodology from sklearn.

Parameters

conf_matrix (Union[list, np.ndarray]) – Confusion matrix: a square matrix describing the true positives, true negatives, false positives and false negatives of the classification.

Returns

Confusion matrix in MCM format, readable by the sklearn-style classification report functions.

Return type

np.ndarray
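
For illustration, the sketch below re-derives this layout with numpy, assuming the sklearn convention that rows of the confusion matrix are true labels and columns are predictions; the helper name is hypothetical and not part of the library:

import numpy as np

def to_mcm_sketch(conf_matrix):
    # Hypothetical helper illustrating the MCM layout; not the library's implementation.
    conf = np.asarray(conf_matrix)
    total = conf.sum()
    tp = np.diag(conf)               # correct predictions per class
    fn = conf.sum(axis=1) - tp       # true class k predicted as another class
    fp = conf.sum(axis=0) - tp       # other classes predicted as class k
    tn = total - tp - fn - fp        # everything else
    # Arrange as shape (n_labels, 2, 2) with [[TN, FP], [FN, TP]] per class.
    return np.array([[tn, fp], [fn, tp]]).transpose(2, 0, 1)

print(to_mcm_sketch([[2, 1],
                     [0, 3]]))
# Per the layout above: class 0 -> [[3, 0], [1, 2]], class 1 -> [[2, 1], [0, 3]]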

dataprofiler.labelers.classification_report_utils.precision_recall_fscore_support(MCM, beta=1.0, labels=None, pos_label=1, average=None, warn_for=('precision', 'recall', 'f-score'), sample_weight=None)

Copy of the precision_recall_fscore_support function from sklearn.metrics, updated to receive the MCM directly instead of recalculating it each time the function is called.

Parameters
  • MCM (array, shape (n_outputs, 2, 2)) – Multilabel confusion matrix, as used by the sklearn metrics module: a 2x2 confusion matrix corresponding to each output in the input. In the multilabel confusion matrix \(MCM\), the count of true negatives is \(MCM_{:,0,0}\), false negatives is \(MCM_{:,1,0}\), true positives is \(MCM_{:,1,1}\) and false positives is \(MCM_{:,0,1}\).

  • beta (float, 1.0 by default) – The strength of recall versus precision in the F-score.

  • labels (list, optional) – The set of labels to include when average != 'binary', and their order if average is None. Labels present in the data can be excluded, for example to calculate a multiclass average ignoring a majority negative class, while labels not present in the data will result in 0 components in a macro average. For multilabel targets, labels are column indices. By default, all labels in y_true and y_pred are used in sorted order.

  • pos_label (str or int, 1 by default) – The class to report if average='binary' and the data is binary. If the data are multiclass or multilabel, this will be ignored; setting labels=[pos_label] and average != 'binary' will report scores for that label only.

  • average (string, [None (default), 'binary', 'micro', 'macro', 'weighted']) –

    If None, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data:

    'binary':

    Only report results for the class specified by pos_label. This is applicable only if targets (y_{true,pred}) are binary.

    'micro':

    Calculate metrics globally by counting the total true positives, false negatives and false positives.

    'macro':

    Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.

    'weighted':

    Calculate metrics for each label, and find their average weighted by support (the number of true instances for each label). This alters ‘macro’ to account for label imbalance; it can result in an F-score that is not between precision and recall.

  • warn_for (tuple or set, for internal use) – This determines which warnings will be made in the case that this function is being used to return only one of its metrics.

  • sample_weight (array-like of shape = [n_samples], optional) – Sample weights.

Returns

  • precision (float (if average is not None) or array of float, shape = [n_unique_labels])

  • recall (float (if average is not None) or array of float, shape = [n_unique_labels])

  • fbeta_score (float (if average is not None) or array of float, shape = [n_unique_labels])

  • support (int (if average is not None) or array of int, shape = [n_unique_labels]) – The number of occurrences of each label in y_true.

References

1. Wikipedia entry for Precision and recall.

2. Wikipedia entry for the F1-score.

3. Shantanu Godbole, Sunita Sarawagi. Discriminative Methods for Multi-labeled Classification. Advances in Knowledge Discovery and Data Mining (2004), pp. 22-30.

Notes

When true positive + false positive == 0, precision is undefined; when true positive + false negative == 0, recall is undefined. In such cases, the metric will be set to 0, as will the f-score, and an UndefinedMetricWarning will be raised.
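
As a brief usage sketch (assuming the dataprofiler package is importable and that conf_matrix follows the sklearn convention of rows as true labels and columns as predictions), the values noted in the comments simply follow from the definitions above for this toy matrix:

import numpy as np
from dataprofiler.labelers import classification_report_utils as cru

conf_matrix = np.array([[2, 1],
                        [0, 3]])
mcm = cru.convert_confusion_matrix_to_MCM(conf_matrix)

# Per-class scores: arrays of shape [n_unique_labels].
precision, recall, fbeta, support = cru.precision_recall_fscore_support(mcm, average=None)
# For this matrix, precision ~ [1.0, 0.75], recall ~ [0.667, 1.0], support = [3, 3].

# Averaged scores: macro averaging takes the unweighted mean of the per-class values
# (macro precision ~ 0.875), while weighted averaging weights each class by its support.
p_macro, r_macro, f_macro, _ = cru.precision_recall_fscore_support(mcm, average='macro')
p_w, r_w, f_w, _ = cru.precision_recall_fscore_support(mcm, average='weighted')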

dataprofiler.labelers.classification_report_utils.classification_report(conf_matrix, labels=None, target_names=None, sample_weight=None, digits=2, output_dict=False)

Copy of the classification_report function from sklearn.metrics, updated to receive the conf_matrix directly instead of recalculating it each time the function is called.

Build a text report showing the main classification metrics.

Read more in the scikit-learn User Guide.

Parameters
  • conf_matrix (array, shape = [n_labels, n_labels]) – Confusion matrix: a square matrix describing the true positives, true negatives, false positives and false negatives of the classification.

  • labels (array, shape = [n_labels]) – Optional list of label indices to include in the report.

  • target_names (list of strings) – Optional display names matching the labels (same order).

  • sample_weight (array-like of shape = [n_samples], optional) – Sample weights.

  • digits (int) – Number of digits for formatting output floating point values. When output_dict is True, this will be ignored and the returned values will not be rounded.

  • output_dict (bool (default = False)) – If True, return output as a dict.

Returns

report – Text summary of the precision, recall, F1 score for each class. Dictionary returned if output_dict is True. Dictionary has the following structure:

{'label 1': {'precision':0.5,
             'recall':1.0,
             'f1-score':0.67,
             'support':1},
 'label 2': { ... },
  ...
}

The reported averages include macro average (the unweighted mean per label), weighted average (the support-weighted mean per label), sample average (only for multilabel classification) and micro average (computed over the total true positives, false negatives and false positives). The micro average is only shown for multi-label input or multi-class input with a subset of classes, because it corresponds to accuracy otherwise. See precision_recall_fscore_support for more details on averages.

Note that in binary classification, recall of the positive class is also known as “sensitivity”; recall of the negative class is “specificity”.

Return type

string / dict

See also

precision_recall_fscore_support, confusion_matrix, multilabel_confusion_matrix
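
A short usage sketch (assuming dataprofiler is installed; the label indices and display names below are arbitrary illustrations):

import numpy as np
from dataprofiler.labelers.classification_report_utils import classification_report

conf_matrix = np.array([[2, 1],
                        [0, 3]])

# Plain-text report, with values rounded to `digits` decimal places.
print(classification_report(conf_matrix, labels=[0, 1],
                            target_names=['label 1', 'label 2'], digits=2))

# Dictionary output with unrounded values, e.g. report['label 1']['precision'].
report = classification_report(conf_matrix, labels=[0, 1],
                               target_names=['label 1', 'label 2'],
                               output_dict=True)
print(report['label 1']['f1-score'])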