dataprofiler.labelers.classification_report_utils module

Contains functions for classification.

dataprofiler.labelers.classification_report_utils.convert_confusion_matrix_to_MCM(conf_matrix: list | np.ndarray) → np.ndarray

Convert a confusion matrix into the MCM format.

This is the format sklearn uses to compute precision, recall, F-score, and support.

The format is as specified by sklearn: in the multilabel confusion matrix \(MCM\), the count of true negatives is \(MCM_{:,0,0}\), false negatives is \(MCM_{:,1,0}\), true positives is \(MCM_{:,1,1}\) and false positives is \(MCM_{:,0,1}\). Note: this borrows code and ideas from sklearn.
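The layout described above can be illustrated with a minimal, dependency-free sketch. It is not the library's implementation: the helper name `confusion_matrix_to_mcm` is hypothetical, plain nested lists stand in for NumPy arrays, and the confusion matrix is assumed to follow the sklearn convention of true labels on rows and predicted labels on columns.

```python
def confusion_matrix_to_mcm(conf_matrix):
    """Turn an n x n confusion matrix into n one-vs-rest 2x2 matrices.

    Assumes rows are true labels and columns are predicted labels.
    Each 2x2 block is laid out as [[TN, FP], [FN, TP]].
    """
    n = len(conf_matrix)
    total = sum(sum(row) for row in conf_matrix)
    mcm = []
    for k in range(n):
        tp = conf_matrix[k][k]
        fn = sum(conf_matrix[k]) - tp                 # true k, predicted as another label
        fp = sum(row[k] for row in conf_matrix) - tp  # predicted k, actually another label
        tn = total - tp - fn - fp                     # everything not involving label k
        mcm.append([[tn, fp], [fn, tp]])
    return mcm


conf = [[5, 1, 0],
        [2, 3, 1],
        [0, 2, 4]]
mcm = confusion_matrix_to_mcm(conf)
# mcm[k][1][1] plays the role of MCM[k, 1, 1] (true positives for label k).
```

With this layout, `mcm[k][0][0]`, `mcm[k][1][0]`, `mcm[k][1][1]` and `mcm[k][0][1]` hold the true negatives, false negatives, true positives and false positives for label `k`.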


Parameters:

conf_matrix (Union[list, np.ndarray]) – confusion matrix, which is a square matrix describing false positives and false negatives, true positives and true negatives for classification


Returns:

MCM format for readability by sklearn confusion reports.

Return type:

np.ndarray


dataprofiler.labelers.classification_report_utils.precision_recall_fscore_support(MCM: np.ndarray, beta: float = 1.0, labels: np.ndarray | None = None, pos_label: str | int = 1, average: str | None = None, warn_for: tuple[str, ...] | set[str] = ('precision', 'recall', 'f-score'), sample_weight: np.ndarray | None = None) → tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray | None]

Perform the same functionality as sklearn's precision_recall_fscore_support function.

Copy of the precision_recall_fscore_support function from sklearn.metrics, modified to receive a precomputed MCM instead of calculating it each time the function is called.

Parameters:

  • MCM (array, shape (n_outputs, 2, 2)) – Multi-classification confusion matrix as referenced by the sklearn metrics module. A 2x2 confusion matrix corresponding to each output in the input. In the multilabel confusion matrix \(MCM\), the count of true negatives is \(MCM_{:,0,0}\), false negatives is \(MCM_{:,1,0}\), true positives is \(MCM_{:,1,1}\) and false positives is \(MCM_{:,0,1}\).

  • beta (float, 1.0 by default) – The strength of recall versus precision in the F-score.

  • labels (list, optional) – The set of labels to include when average != 'binary', and their order if average is None. Labels present in the data can be excluded, for example to calculate a multiclass average ignoring a majority negative class, while labels not present in the data will result in 0 components in a macro average. For multilabel targets, labels are column indices. By default, all labels in y_true and y_pred are used in sorted order.

  • pos_label (str or int, 1 by default) – The class to report if average='binary' and the data is binary. If the data are multiclass or multilabel, this will be ignored; setting labels=[pos_label] and average != 'binary' will report scores for that label only.

  • average (string, [None (default), 'binary', 'micro', 'macro', 'weighted']) –

    If None, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data:

    'binary':
    Only report results for the class specified by pos_label. This is applicable only if targets (y_{true,pred}) are binary.

    'micro':
    Calculate metrics globally by counting the total true positives, false negatives and false positives.

    'macro':
    Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.

    'weighted':
    Calculate metrics for each label, and find their average weighted by support (the number of true instances for each label). This alters 'macro' to account for label imbalance; it can result in an F-score that is not between precision and recall.

  • warn_for (tuple or set, for internal use) – This determines which warnings will be made in the case that this function is being used to return only one of its metrics.

  • sample_weight (array-like of shape = [n_samples], optional) – Sample weights.
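The averaging modes listed under `average` can be compared with a small sketch. It is an illustration only, not the library's code: the helper `averaged_precision` is hypothetical, it works on nested lists laid out as `[[TN, FP], [FN, TP]]` per label, and it assumes every label was predicted at least once so that no division by zero occurs.

```python
def averaged_precision(mcm, average):
    """Precision under 'micro', 'macro' or 'weighted' averaging.

    Assumes each label has TP + FP > 0 (no undefined precision).
    """
    tp = [m[1][1] for m in mcm]
    fp = [m[0][1] for m in mcm]
    fn = [m[1][0] for m in mcm]
    if average == "micro":
        # Pool the counts over all labels, then compute one ratio.
        return sum(tp) / (sum(tp) + sum(fp))
    per_label = [t / (t + f) for t, f in zip(tp, fp)]
    if average == "macro":
        # Unweighted mean of the per-label scores.
        return sum(per_label) / len(per_label)
    if average == "weighted":
        # Mean weighted by support (number of true instances per label).
        support = [t + f for t, f in zip(tp, fn)]
        return sum(p * s for p, s in zip(per_label, support)) / sum(support)
    raise ValueError(f"unsupported average: {average!r}")


# Imbalanced example: supports are 10, 3 and 2.
example_mcm = [[[4, 1], [2, 8]],
               [[10, 2], [1, 2]],
               [[12, 1], [1, 1]]]
micro = averaged_precision(example_mcm, "micro")
macro = averaged_precision(example_mcm, "macro")
weighted = averaged_precision(example_mcm, "weighted")
```

In this example the macro average is pulled down by the two small labels, while the weighted and micro averages stay closer to the score of the majority label.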


Returns:

  • precision (float (if average is not None) or array of float, shape = [n_unique_labels])

  • recall (float (if average is not None) or array of float, shape = [n_unique_labels])

  • fbeta_score (float (if average is not None) or array of float, shape = [n_unique_labels])

  • support (None (if average is not None) or array of int, shape = [n_unique_labels]) – The number of occurrences of each label in y_true.



When true positive + false positive == 0, precision is undefined; when true positive + false negative == 0, recall is undefined. In such cases, the metric will be set to 0, as will the F-score, and an UndefinedMetricWarning will be raised.
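The fallback described above, together with the role of `beta`, can be sketched as follows. This is an assumption-laden illustration rather than the library's code: `safe_divide` and `scores_for_label` are hypothetical helpers, and the local `UndefinedMetricWarning` class is only a stand-in for the warning of the same name in `sklearn.exceptions`. The F-score uses the standard F-beta definition, (1 + beta²) · P · R / (beta² · P + R).

```python
import warnings


class UndefinedMetricWarning(UserWarning):
    """Local stand-in for sklearn.exceptions.UndefinedMetricWarning."""


def safe_divide(num, den, metric):
    """Return num / den, or 0.0 with a warning when the metric is undefined."""
    if den == 0:
        warnings.warn(f"{metric} is undefined; setting it to 0.0",
                      UndefinedMetricWarning)
        return 0.0
    return num / den


def scores_for_label(m, beta=1.0):
    """Precision, recall and F-beta for one [[TN, FP], [FN, TP]] block."""
    (tn, fp), (fn, tp) = m
    precision = safe_divide(tp, tp + fp, "precision")
    recall = safe_divide(tp, tp + fn, "recall")
    b2 = beta ** 2
    denom = b2 * precision + recall
    # The F-score also falls back to 0.0 when both inputs are 0.
    f_score = (1 + b2) * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Label is never predicted: TP + FP == 0, so precision is undefined.
    undefined = scores_for_label([[7, 0], [3, 0]])

well_defined = scores_for_label([[10, 2], [1, 5]])
```

Values of `beta` above 1 weight recall more heavily than precision, and values below 1 do the opposite.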

dataprofiler.labelers.classification_report_utils.classification_report(conf_matrix: np.ndarray, labels: list | np.ndarray | None = None, target_names: list[str] | None = None, sample_weight: np.ndarray | None = None, digits: int = 2, output_dict: bool = False) → str | dict

Build a text report showing the main classification metrics.

Copy of the classification_report function from sklearn.metrics, modified to receive a precomputed conf_matrix instead of calculating it each time the function is called.

Read more in the scikit-learn User Guide.

Parameters:

  • conf_matrix (array, shape = [n_labels, n_labels]) – confusion matrix, which is a square matrix describing false positives and false negatives, true positives and true negatives for classification.

  • labels (array, shape = [n_labels]) – Optional list of label indices to include in the report.

  • target_names (list of strings) – Optional display names matching the labels (same order).

  • sample_weight (array-like of shape = [n_samples], optional) – Sample weights.

  • digits (int) – Number of digits for formatting output floating point values. When output_dict is True, this will be ignored and the returned values will not be rounded.

  • output_dict (bool (default = False)) – If True, return the output as a dict.


Returns:

report – Text summary of the precision, recall, and F1 score for each class. A dictionary is returned if output_dict is True. The dictionary has the following structure:

{'label 1': {'precision': 0.5, ...},
 'label 2': { ... },
 ...
}

The reported averages include macro average (averaging the unweighted mean per label), weighted average (averaging the support-weighted mean per label), and sample average (only for multilabel classification). Micro average (averaging the total true positives, false negatives and false positives) is only shown for multi-label or multi-class with a subset of classes, because it corresponds to accuracy otherwise. See also precision_recall_fscore_support for more details on averages.
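To make the dictionary form of the report more concrete, here is a rough sketch that builds a comparable structure from a confusion matrix. It does not call the library: `report_dict` is a hypothetical helper, the matrix is assumed to have true labels on rows and predicted labels on columns, and the key names `'recall'`, `'f1-score'`, `'support'`, `'macro avg'` and `'weighted avg'` are assumptions based on sklearn's report layout.

```python
def report_dict(conf_matrix, target_names):
    """Build a per-label report plus macro and weighted averages."""
    report = {}
    for k, name in enumerate(target_names):
        tp = conf_matrix[k][k]
        predicted = sum(row[k] for row in conf_matrix)  # TP + FP
        support = sum(conf_matrix[k])                   # TP + FN
        precision = tp / predicted if predicted else 0.0
        recall = tp / support if support else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        report[name] = {"precision": precision, "recall": recall,
                        "f1-score": f1, "support": support}

    metrics = ("precision", "recall", "f1-score")
    total = sum(report[n]["support"] for n in target_names)
    macro = {m: sum(report[n][m] for n in target_names) / len(target_names)
             for m in metrics}
    weighted = {m: sum(report[n][m] * report[n]["support"]
                       for n in target_names) / total
                for m in metrics}
    report["macro avg"] = {**macro, "support": total}
    report["weighted avg"] = {**weighted, "support": total}
    return report


conf = [[8, 1, 1],
        [1, 2, 0],
        [0, 1, 1]]
report = report_dict(conf, ["a", "b", "c"])
```

The unrounded values mirror the note on `digits`: when a dictionary is requested, no rounding is applied.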

Note that in binary classification, recall of the positive class is also known as “sensitivity”; recall of the negative class is “specificity”.
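As a quick illustration of the terminology, the two recalls can be read directly off a binary confusion matrix. The counts below are made up for the example, and the matrix is assumed to have true labels on rows and predicted labels on columns, with the negative class first.

```python
# Rows: true class (negative, positive); columns: predicted class.
conf = [[50, 10],   # true negatives, false positives
        [5, 35]]    # false negatives, true positives

tn, fp = conf[0]
fn, tp = conf[1]

sensitivity = tp / (tp + fn)  # recall of the positive class
specificity = tn / (tn + fp)  # recall of the negative class
```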

Return type:

string / dict

See also

precision_recall_fscore_support, confusion_matrix, multilabel_confusion_matrix