Classification Report Utils

Contains utility functions for computing classification metrics and reports.

dataprofiler.labelers.classification_report_utils.convert_confusion_matrix_to_MCM(conf_matrix: Union[List, numpy.ndarray]) → numpy.ndarray

Convert a confusion matrix into the MCM format.

Format for precision/recall/fscore/support computation by sklearn.

The format is as specified by sklearn: in the multilabel confusion matrix \(MCM\), the count of true negatives is \(MCM_{:,0,0}\), false negatives is \(MCM_{:,1,0}\), true positives is \(MCM_{:,1,1}\), and false positives is \(MCM_{:,0,1}\). Note: this utilizes code and conventions from sklearn.

Parameters

conf_matrix (Union[list, np.ndarray]) – confusion matrix: a square matrix describing the true positives, true negatives, false positives, and false negatives of the classification

Returns

MCM format for readability by sklearn confusion reports.

Return type

np.ndarray
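
The following is a minimal numpy sketch of the conversion this function performs, included only to illustrate the MCM layout; the 3x3 confusion matrix values are illustrative, and the library's convert_confusion_matrix_to_MCM should produce the same array for the same input.

import numpy as np

conf_matrix = np.array([[5, 1, 0],
                        [2, 3, 1],
                        [0, 0, 4]])

total = conf_matrix.sum()
tp = np.diag(conf_matrix)              # true positives per class
fp = conf_matrix.sum(axis=0) - tp      # false positives per class
fn = conf_matrix.sum(axis=1) - tp      # false negatives per class
tn = total - tp - fp - fn              # true negatives per class

# MCM[:, 0, 0] = TN, MCM[:, 0, 1] = FP, MCM[:, 1, 0] = FN, MCM[:, 1, 1] = TP
MCM = np.stack([np.stack([tn, fp], axis=1),
                np.stack([fn, tp], axis=1)], axis=1)
print(MCM.shape)  # (3, 2, 2): one 2x2 matrix per class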

dataprofiler.labelers.classification_report_utils.precision_recall_fscore_support(MCM: numpy.ndarray, beta: float = 1.0, labels: Optional[numpy.ndarray] = None, pos_label: Union[str, int] = 1, average: Optional[str] = None, warn_for: Union[Tuple[str, ...], Set[str]] = ('precision', 'recall', 'f-score'), sample_weight: Optional[numpy.ndarray] = None) → Tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray, Optional[numpy.ndarray]]

Perform the same functionality as sklearn's precision_recall_fscore_support function.

Copy of the precision_recall_fscore_support function from sklearn.metrics, updated to receive a precomputed MCM instead of recalculating it on every call.

Parameters
  • MCM (array, shape (n_outputs, 2, 2)) – Multilabel confusion matrix as referenced by the sklearn metrics module: a 2x2 confusion matrix corresponding to each output in the input. In the multilabel confusion matrix \(MCM\), the count of true negatives is \(MCM_{:,0,0}\), false negatives is \(MCM_{:,1,0}\), true positives is \(MCM_{:,1,1}\), and false positives is \(MCM_{:,0,1}\).

  • beta (float, 1.0 by default) – The strength of recall versus precision in the F-score.

  • labels (list, optional) – The set of labels to include when average != 'binary', and their order if average is None. Labels present in the data can be excluded, for example to calculate a multiclass average ignoring a majority negative class, while labels not present in the data will result in 0 components in a macro average. For multilabel targets, labels are column indices. By default, all labels in y_true and y_pred are used in sorted order.

  • pos_label (str or int, 1 by default) – The class to report if average='binary' and the data is binary. If the data are multiclass or multilabel, this will be ignored; setting labels=[pos_label] and average != 'binary' will report scores for that label only.

  • average (string, [None (default), 'binary', 'micro', 'macro', 'weighted']) –

    If None, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data:

    'binary':

    Only report results for the class specified by pos_label. This is applicable only if targets (y_{true,pred}) are binary.

    'micro':

    Calculate metrics globally by counting the total true positives, false negatives and false positives.

    'macro':

    Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.

    'weighted':

    Calculate metrics for each label, and find their average weighted by support (the number of true instances for each label). This alters ‘macro’ to account for label imbalance; it can result in an F-score that is not between precision and recall.

  • warn_for (tuple or set, for internal use) – This determines which warnings will be made in the case that this function is being used to return only one of its metrics.

  • sample_weight (array-like of shape = [n_samples], optional) – Sample weights.

Returns

  • precision (float (if average is not None) or array of float, shape = [n_unique_labels])

  • recall (float (if average is not None) or array of float, shape = [n_unique_labels])

  • fbeta_score (float (if average is not None) or array of float, shape = [n_unique_labels])

  • support (int (if average is not None) or array of int, shape = [n_unique_labels]) – The number of occurrences of each label in y_true.

References

1. Wikipedia entry for Precision and recall.

2. Wikipedia entry for the F1-score.

3. Godbole, S. and Sarawagi, S. "Discriminative Methods for Multi-labeled Classification." Advances in Knowledge Discovery and Data Mining (2004), pp. 22-30.

Notes

When true positive + false positive == 0, precision is undefined; when true positive + false negative == 0, recall is undefined. In such cases the metric is set to 0, as is the F-score, and UndefinedMetricWarning is raised.
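
A hedged usage sketch follows, showing how a square confusion matrix can be converted to an MCM and passed in; the confusion matrix values are illustrative only.

import numpy as np
from dataprofiler.labelers.classification_report_utils import (
    convert_confusion_matrix_to_MCM,
    precision_recall_fscore_support,
)

conf_matrix = np.array([[5, 1, 0],
                        [2, 3, 1],
                        [0, 0, 4]])
MCM = convert_confusion_matrix_to_MCM(conf_matrix)

# Per-class arrays of precision, recall, F-score, and support (average=None).
precision, recall, fscore, support = precision_recall_fscore_support(MCM, average=None)

# A single macro-averaged value per metric; support is None in this case.
macro_p, macro_r, macro_f, _ = precision_recall_fscore_support(MCM, average='macro')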

dataprofiler.labelers.classification_report_utils.classification_report(conf_matrix: numpy.ndarray, labels: Optional[Union[List, numpy.ndarray]] = None, target_names: Optional[List[str]] = None, sample_weight: Optional[numpy.ndarray] = None, digits: int = 2, output_dict: bool = False) → Union[str, Dict]

Build a text report showing the main classification metrics.

Copy of the classification_report function from sklearn.metrics, updated to receive a precomputed conf_matrix instead of recalculating it on every call.

Read more in the User Guide.

Parameters
  • conf_matrix (array, shape = [n_labels, n_labels]) – confusion matrix: a square matrix describing the true positives, true negatives, false positives, and false negatives of the classification.

  • labels (array, shape = [n_labels]) – Optional list of label indices to include in the report.

  • target_names (list of strings) – Optional display names matching the labels (same order).

  • sample_weight (array-like of shape = [n_samples], optional) – Sample weights.

  • digits (int) – Number of digits for formatting output floating point values. When output_dict is True, this will be ignored and the returned values will not be rounded.

  • output_dict (bool (default = False)) – If True, return output as a dict.

Returns

report – Text summary of the precision, recall, and F1 score for each class. A dictionary is returned instead if output_dict is True; it has the following structure:

{'label 1': {'precision':0.5,
             'recall':1.0,
             'f1-score':0.67,
             'support':1},
 'label 2': { ... },
  ...
}

The reported averages include the macro average (the unweighted mean per label), the weighted average (the support-weighted mean per label), the samples average (only for multilabel classification), and the micro average (computed from the total true positives, false negatives and false positives). The micro average is only shown for multi-label input, or multi-class input with a subset of classes, because it is equivalent to accuracy otherwise. See precision_recall_fscore_support for more details on averages.

Note that in binary classification, recall of the positive class is also known as “sensitivity”; recall of the negative class is “specificity”.

Return type

string / dict

See also

precision_recall_fscore_support, confusion_matrix, multilabel_confusion_matrix
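
A hedged usage sketch; the confusion matrix, label indices, and target names below are illustrative only.

import numpy as np
from dataprofiler.labelers.classification_report_utils import classification_report

conf_matrix = np.array([[5, 1, 0],
                        [2, 3, 1],
                        [0, 0, 4]])

# Plain-text report with three decimal places.
print(classification_report(conf_matrix,
                            labels=[0, 1, 2],
                            target_names=['PERSON', 'DATE', 'OTHER'],
                            digits=3))

# Nested dict for programmatic use; values are not rounded.
report = classification_report(conf_matrix,
                               labels=[0, 1, 2],
                               target_names=['PERSON', 'DATE', 'OTHER'],
                               output_dict=True)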