
class aif360.metrics.ClassificationMetric(dataset, classified_dataset, unprivileged_groups=None, privileged_groups=None)

Class for computing metrics based on two BinaryLabelDatasets.

The first dataset is the original one and the second is the output of the classification transformer (or similar).

  • dataset (BinaryLabelDataset) – Dataset containing ground-truth labels.
  • classified_dataset (BinaryLabelDataset) – Dataset containing predictions.
  • privileged_groups (list(dict)) – Privileged groups. Format is a list of dicts where the keys are protected_attribute_names and the values are values in protected_attributes. Each dict element describes a single group. See examples for more details.
  • unprivileged_groups (list(dict)) – Unprivileged groups in the same format as privileged_groups.

Raises: TypeError – dataset and classified_dataset must be BinaryLabelDataset types.
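
The group format described above can be illustrated with a small sketch. It assumes a hypothetical dataset whose protected attributes are named `sex` and `race`. The helper `matches_any_group` is not part of aif360; it only mirrors the convention that each dict describes one group and the list is a collection of such groups. That all key/value pairs inside a dict must match, and that the dicts are alternatives, is an assumption of this sketch.

```python
# Hypothetical rows: each instance is a dict of protected attribute values.
instances = [
    {"sex": 1, "race": 1},
    {"sex": 1, "race": 0},
    {"sex": 0, "race": 1},
    {"sex": 0, "race": 0},
]

# Each dict describes a single group; the list collects one or more groups.
privileged_groups = [{"sex": 1, "race": 1}]
unprivileged_groups = [{"sex": 0}, {"race": 0}]


def matches_any_group(instance, groups):
    """Return True if the instance belongs to at least one group.

    Assumption: within a dict every key/value pair must match (AND),
    and the dicts in the list are alternatives (OR).
    """
    return any(
        all(instance[name] == value for name, value in group.items())
        for group in groups
    )


privileged_mask = [matches_any_group(x, privileged_groups) for x in instances]
unprivileged_mask = [matches_any_group(x, unprivileged_groups) for x in instances]
```

Here only the first instance is privileged, and the remaining three fall into the unprivileged groups.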


accuracy \(ACC = (TP + TN)/(P + N)\).
average_abs_odds_difference Average of the absolute differences in FPR and TPR between the unprivileged and privileged groups.
average_odds_difference Average of the differences in FPR and TPR between the unprivileged and privileged groups.
base_rate Compute the base rate, \(Pr(Y = 1) = P/(P+N)\), optionally conditioned on protected attributes.
between_all_groups_coefficient_of_variation The between-group coefficient of variation is the square root of two times the between_all_groups_generalized_entropy_index() with \(\alpha = 2\).
between_all_groups_generalized_entropy_index Between-group generalized entropy index that uses all combinations of groups based on self.dataset.protected_attributes.
between_all_groups_theil_index The between-group Theil index is the between_all_groups_generalized_entropy_index() with \(\alpha = 1\).
between_group_coefficient_of_variation The between-group coefficient of variation is the square root of two times the between_group_generalized_entropy_index() with \(\alpha = 2\).
between_group_generalized_entropy_index Between-group generalized entropy index that uses self.privileged_groups and self.unprivileged_groups as the only two groups.
between_group_theil_index The between-group Theil index is the between_group_generalized_entropy_index() with \(\alpha = 1\).
binary_confusion_matrix Compute the number of true/false positives/negatives, optionally conditioned on protected attributes.
coefficient_of_variation The coefficient of variation is the square root of two times the generalized_entropy_index() with \(\alpha = 2\).
consistency Individual fairness metric from [1] that measures how similar the labels are for similar instances.
difference Compute difference of the metric for unprivileged and privileged groups.
differential_fairness_bias_amplification Bias amplification is the difference in smoothed EDF between the classifier and the original dataset.
equal_opportunity_difference Alias of true_positive_rate_difference().
error_rate \(ERR = (FP + FN)/(P + N)\)
error_rate_difference Difference in error rates for unprivileged and privileged groups, \(ERR_{D = \text{unprivileged}} - ERR_{D = \text{privileged}}\).
error_rate_ratio Ratio of error rates for unprivileged and privileged groups, \(\frac{ERR_{D = \text{unprivileged}}}{ERR_{D = \text{privileged}}}\).
false_discovery_rate \(FDR = FP/(TP + FP)\)
false_discovery_rate_difference \(FDR_{D = \text{unprivileged}} - FDR_{D = \text{privileged}}\)
false_discovery_rate_ratio \(\frac{FDR_{D = \text{unprivileged}}}{FDR_{D = \text{privileged}}}\)
false_negative_rate \(FNR = FN/P\)
false_negative_rate_difference \(FNR_{D = \text{unprivileged}} - FNR_{D = \text{privileged}}\)
false_negative_rate_ratio \(\frac{FNR_{D = \text{unprivileged}}}{FNR_{D = \text{privileged}}}\)
false_omission_rate \(FOR = FN/(TN + FN)\)
false_omission_rate_difference \(FOR_{D = \text{unprivileged}} - FOR_{D = \text{privileged}}\)
false_omission_rate_ratio \(\frac{FOR_{D = \text{unprivileged}}}{FOR_{D = \text{privileged}}}\)
false_positive_rate \(FPR = FP/N\)
false_positive_rate_difference \(FPR_{D = \text{unprivileged}} - FPR_{D = \text{privileged}}\)
false_positive_rate_ratio \(\frac{FPR_{D = \text{unprivileged}}}{FPR_{D = \text{privileged}}}\)
generalized_binary_confusion_matrix Compute the number of generalized true/false positives/negatives, optionally conditioned on protected attributes.
generalized_entropy_index Generalized entropy index is proposed as a unified individual and group fairness measure in [3].
generalized_false_negative_rate \(GFNR = GFN/P\)
generalized_false_positive_rate \(GFPR = GFP/N\)
generalized_true_negative_rate \(GTNR = GTN/N\)
generalized_true_positive_rate Return the ratio of generalized true positives to positive examples in the dataset, \(GTPR = GTP/P\), optionally conditioned on protected attributes.
mean_difference Alias of statistical_parity_difference().
negative_predictive_value \(NPV = TN/(TN + FN)\)
num_false_negatives \(FN = \sum_{i=1}^n \mathbb{1}[y_i = \text{favorable}]\mathbb{1}[\hat{y}_i = \text{unfavorable}]\)
num_false_positives \(FP = \sum_{i=1}^n \mathbb{1}[y_i = \text{unfavorable}]\mathbb{1}[\hat{y}_i = \text{favorable}]\)
num_generalized_false_negatives Return the generalized number of false negatives, \(GFN\), the weighted sum of 1 - predicted scores where true labels are ‘favorable’, optionally conditioned on protected attributes.
num_generalized_false_positives Return the generalized number of false positives, \(GFP\), the weighted sum of predicted scores where true labels are ‘unfavorable’, optionally conditioned on protected attributes.
num_generalized_true_negatives Return the generalized number of true negatives, \(GTN\), the weighted sum of 1 - predicted scores where true labels are ‘unfavorable’, optionally conditioned on protected attributes.
num_generalized_true_positives Return the generalized number of true positives, \(GTP\), the weighted sum of predicted scores where true labels are ‘favorable’, optionally conditioned on protected attributes.
num_instances Compute the number of instances, \(n\), in the dataset conditioned on protected attributes if necessary.
num_negatives Compute the number of negatives, \(N = \sum_{i=1}^n \mathbb{1}[y_i = 0]\), optionally conditioned on protected attributes.
num_positives Compute the number of positives, \(P = \sum_{i=1}^n \mathbb{1}[y_i = 1]\), optionally conditioned on protected attributes.
num_pred_negatives \(\sum_{i=1}^n \mathbb{1}[\hat{y}_i = \text{unfavorable}]\)
num_pred_positives \(\sum_{i=1}^n \mathbb{1}[\hat{y}_i = \text{favorable}]\)
num_true_negatives \(TN = \sum_{i=1}^n \mathbb{1}[y_i = \text{unfavorable}]\mathbb{1}[\hat{y}_i = \text{unfavorable}]\)
num_true_positives Return the number of instances in the dataset where both the predicted and true labels are ‘favorable’, \(TP = \sum_{i=1}^n \mathbb{1}[y_i = \text{favorable}]\mathbb{1}[\hat{y}_i = \text{favorable}]\), optionally conditioned on protected attributes.
performance_measures Compute various performance measures on the dataset, optionally conditioned on protected attributes.
positive_predictive_value \(PPV = TP/(TP + FP)\)
power Alias of num_true_positives().
precision Alias of positive_predictive_value().
ratio Compute ratio of the metric for unprivileged and privileged groups.
recall Alias of true_positive_rate().
rich_subgroup Audit the dataset with respect to rich subgroups defined by linear thresholds of sensitive attributes.
selection_rate \(Pr(\hat{Y} = \text{favorable})\)
sensitivity Alias of true_positive_rate().
smoothed_empirical_differential_fairness Smoothed empirical differential fairness (EDF) from [foulds18].
specificity Alias of true_negative_rate().
theil_index The Theil index is the generalized_entropy_index() with \(\alpha = 1\).
true_negative_rate \(TNR = TN/N\)
true_positive_rate Return the ratio of true positives to positive examples in the dataset, \(TPR = TP/P\), optionally conditioned on protected attributes.
true_positive_rate_difference \(TPR_{D = \text{unprivileged}} - TPR_{D = \text{privileged}}\)
__init__(dataset, classified_dataset, unprivileged_groups=None, privileged_groups=None)
  • dataset (BinaryLabelDataset) – Dataset containing ground-truth labels.
  • classified_dataset (BinaryLabelDataset) – Dataset containing predictions.
  • privileged_groups (list(dict)) – Privileged groups. Format is a list of dicts where the keys are protected_attribute_names and the values are values in protected_attributes. Each dict element describes a single group. See examples for more details.
  • unprivileged_groups (list(dict)) – Unprivileged groups in the same format as privileged_groups.

Raises: TypeError – dataset and classified_dataset must be BinaryLabelDataset types.


\(ACC = (TP + TN)/(P + N)\).

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

Average of absolute difference in FPR and TPR for unprivileged and privileged groups:

\[\tfrac{1}{2}\left[|FPR_{D = \text{unprivileged}} - FPR_{D = \text{privileged}}| + |TPR_{D = \text{unprivileged}} - TPR_{D = \text{privileged}}|\right]\]

A value of 0 indicates equality of odds.


Average of difference in FPR and TPR for unprivileged and privileged groups:

\[\tfrac{1}{2}\left[(FPR_{D = \text{unprivileged}} - FPR_{D = \text{privileged}}) + (TPR_{D = \text{unprivileged}} - TPR_{D = \text{privileged}})\right]\]

A value of 0 indicates equality of odds.
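
As an illustration of the two averages above, the following sketch computes both from plain Python lists. The labels, predictions, and the helper `fpr_tpr` are made up for this example and are not part of aif360; labels use 1 for favorable and 0 for unfavorable.

```python
def fpr_tpr(y_true, y_pred):
    """Return (FPR, TPR) for binary labels where 1 is the favorable label."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    positives = sum(1 for y in y_true if y == 1)
    negatives = sum(1 for y in y_true if y == 0)
    return fp / negatives, tp / positives


# Hypothetical unprivileged group: FPR = 0.5, TPR = 0.5
fpr_u, tpr_u = fpr_tpr([1, 1, 0, 0], [1, 0, 1, 0])
# Hypothetical privileged group: FPR = 0.0, TPR = 1.0
fpr_p, tpr_p = fpr_tpr([1, 1, 0, 0], [1, 1, 0, 0])

# Signed version: the FPR gap (+0.5) and the TPR gap (-0.5) cancel.
average_odds_difference = 0.5 * ((fpr_u - fpr_p) + (tpr_u - tpr_p))

# Absolute version: both gaps contribute.
average_abs_odds_difference = 0.5 * (abs(fpr_u - fpr_p) + abs(tpr_u - tpr_p))
```

With these inputs the signed average is 0.0 while the absolute average is 0.5. Equal odds implies a signed average of 0, but the signed average can also be 0 when the FPR and TPR gaps have opposite signs, so the absolute version is the stricter check.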


The between-group coefficient of variation is the square root of two times the between_all_groups_generalized_entropy_index() with \(\alpha = 2\).


Between-group generalized entropy index that uses all combinations of groups based on self.dataset.protected_attributes. See _between_group_generalized_entropy_index().

Parameters: alpha (int) – See generalized_entropy_index().

The between-group Theil index is the between_all_groups_generalized_entropy_index() with \(\alpha = 1\).


The between-group coefficient of variation is the square root of two times the between_group_generalized_entropy_index() with \(\alpha = 2\).


Between-group generalized entropy index that uses self.privileged_groups and self.unprivileged_groups as the only two groups. See _between_group_generalized_entropy_index().

Parameters: alpha (int) – See generalized_entropy_index().
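
A rough sketch of the between-group idea, following the decomposition of inequality indices used in the literature: every individual benefit is replaced by the mean benefit of its group, and the generalized entropy index is computed on the result. This is an assumption about the general technique, not a transcription of the library's private helper; the function names and data are hypothetical, and only the case \(\alpha \ne 0, 1\) is handled.

```python
def generalized_entropy(benefits, alpha=2):
    """Generalized entropy index of a list of benefits, for alpha not in {0, 1}."""
    n = len(benefits)
    mu = sum(benefits) / n
    return sum((b / mu) ** alpha - 1 for b in benefits) / (n * alpha * (alpha - 1))


def between_group_generalized_entropy(groups, alpha=2):
    """Replace each benefit by its group mean, then compute the index."""
    smoothed = []
    for benefits in groups:
        group_mean = sum(benefits) / len(benefits)
        smoothed.extend([group_mean] * len(benefits))
    return generalized_entropy(smoothed, alpha)


# Hypothetical benefits b_i = y_hat_i - y_i + 1 for two groups.
privileged = [1, 2, 1, 2]      # group mean 1.5
unprivileged = [1, 0, 1, 0]    # group mean 0.5

between = between_group_generalized_entropy([privileged, unprivileged])
total = generalized_entropy(privileged + unprivileged)
```

In this toy case the between-group value (0.125) accounts for half of the total index (0.25); the remainder comes from variation inside each group.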

The between-group Theil index is the between_group_generalized_entropy_index() with \(\alpha = 1\).


Compute the number of true/false positives/negatives, optionally conditioned on protected attributes.

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Returns: dict – Number of true positives, false positives, true negatives, false negatives (optionally conditioned).
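
The counting described above can be sketched in plain Python. This is an illustrative stand-in, not the library code: it ignores any instance weights, the dictionary keys are chosen for this example, and the boolean `mask` is a hypothetical substitute for conditioning on a privileged or unprivileged group.

```python
def binary_confusion_matrix(y_true, y_pred, mask=None, favorable=1):
    """Count TP, FP, TN, FN, optionally restricted to instances where mask is True."""
    if mask is None:
        mask = [True] * len(y_true)
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for y, p, keep in zip(y_true, y_pred, mask):
        if not keep:
            continue
        if y == favorable:
            counts["TP" if p == favorable else "FN"] += 1
        else:
            counts["FP" if p == favorable else "TN"] += 1
    return counts


# Hypothetical ground truth, predictions, and group membership.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0]
privileged_mask = [True, True, False, True, False, False]

overall = binary_confusion_matrix(y_true, y_pred)
conditioned = binary_confusion_matrix(y_true, y_pred, mask=privileged_mask)
```

Leaving `mask` unset corresponds to computing over the entire dataset, while passing a mask restricts the counts to one group.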

The coefficient of variation is the square root of two times the generalized_entropy_index() with \(\alpha = 2\).


Bias amplification is the difference in smoothed empirical differential fairness (EDF) between the classifier and the original dataset. Positive values mean the bias increased due to the classifier.

Parameters: concentration (float, optional) – Concentration parameter for Dirichlet smoothing. Must be non-negative.

Disparate impact, the ratio of selection rates for the unprivileged and privileged groups:

\[\frac{Pr(\hat{Y} = 1 | D = \text{unprivileged})} {Pr(\hat{Y} = 1 | D = \text{privileged})}\]

Alias of true_positive_rate_difference().


\(ERR = (FP + FN)/(P + N)\)

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

Difference in error rates for unprivileged and privileged groups, \(ERR_{D = \text{unprivileged}} - ERR_{D = \text{privileged}}\).


Ratio of error rates for unprivileged and privileged groups, \(\frac{ERR_{D = \text{unprivileged}}}{ERR_{D = \text{privileged}}}\).


\(FDR = FP/(TP + FP)\)

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

\(FDR_{D = \text{unprivileged}} - FDR_{D = \text{privileged}}\)


\(\frac{FDR_{D = \text{unprivileged}}}{FDR_{D = \text{privileged}}}\)


\(FNR = FN/P\)

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

\(FNR_{D = \text{unprivileged}} - FNR_{D = \text{privileged}}\)


\(\frac{FNR_{D = \text{unprivileged}}}{FNR_{D = \text{privileged}}}\)


\(FOR = FN/(TN + FN)\)

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

\(FOR_{D = \text{unprivileged}} - FOR_{D = \text{privileged}}\)


\(\frac{FOR_{D = \text{unprivileged}}}{FOR_{D = \text{privileged}}}\)


\(FPR = FP/N\)

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

\(FPR_{D = \text{unprivileged}} - FPR_{D = \text{privileged}}\)


\(\frac{FPR_{D = \text{unprivileged}}}{FPR_{D = \text{privileged}}}\)


Compute the number of generalized true/false positives/negatives, optionally conditioned on protected attributes. Generalized counts are based on scores and not on the hard predictions.

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Returns: dict – Number of generalized true positives, generalized false positives, generalized true negatives, generalized false negatives (optionally conditioned).
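
To make the score-based counts concrete, here is a minimal sketch that accumulates weighted scores instead of hard predictions. The function, its dictionary keys, and the sample data are hypothetical; scores are assumed to lie in [0, 1] and to express the predicted probability of the favorable label.

```python
def generalized_binary_confusion_matrix(y_true, scores, weights=None, favorable=1):
    """Accumulate score-based (generalized) confusion counts."""
    if weights is None:
        weights = [1.0] * len(y_true)
    counts = {"GTP": 0.0, "GFP": 0.0, "GTN": 0.0, "GFN": 0.0}
    for y, s, w in zip(y_true, scores, weights):
        if y == favorable:
            counts["GTP"] += w * s          # score mass on the correct side
            counts["GFN"] += w * (1 - s)    # score mass missed
        else:
            counts["GFP"] += w * s
            counts["GTN"] += w * (1 - s)
    return counts


# Hypothetical labels and predicted scores.
y_true = [1, 1, 0, 0]
scores = [0.75, 0.25, 0.5, 0.0]

g = generalized_binary_confusion_matrix(y_true, scores)
num_positives = sum(1 for y in y_true if y == 1)
num_negatives = sum(1 for y in y_true if y == 0)
gtpr = g["GTP"] / num_positives   # GTPR = GTP / P
gfpr = g["GFP"] / num_negatives   # GFPR = GFP / N
```

Because each instance contributes its score and one minus its score, GTP + GFN equals P and GFP + GTN equals N when all weights are 1.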

Generalized entropy index is proposed as a unified individual and group fairness measure in [3]. With \(b_i = \hat{y}_i - y_i + 1\):

\[\begin{split}\mathcal{E}(\alpha) = \begin{cases} \frac{1}{n \alpha (\alpha-1)}\sum_{i=1}^n\left[\left(\frac{b_i}{\mu}\right)^\alpha - 1\right],& \alpha \ne 0, 1,\\ \frac{1}{n}\sum_{i=1}^n\frac{b_{i}}{\mu}\ln\frac{b_{i}}{\mu},& \alpha=1,\\ -\frac{1}{n}\sum_{i=1}^n\ln\frac{b_{i}}{\mu},& \alpha=0. \end{cases}\end{split}\]
Parameters: alpha (int) – Parameter that regulates the weight given to distances between values at different parts of the distribution.
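
The three cases of the formula can be written out directly. The sketch below is an unweighted, pure-Python reading of the definition with \(b_i = \hat{y}_i - y_i + 1\); the function name and data are illustrative. For \(\alpha = 1\) it assumes the usual convention that a term with \(b_i = 0\) contributes 0, and for \(\alpha = 0\) it requires every \(b_i\) to be positive.

```python
import math


def generalized_entropy_index(y_true, y_pred, alpha=2):
    """Generalized entropy index of the benefits b_i = y_pred_i - y_true_i + 1."""
    benefits = [p - y + 1 for y, p in zip(y_true, y_pred)]
    n = len(benefits)
    mu = sum(benefits) / n
    if alpha == 1:
        # Assumption: 0 * ln(0) is treated as 0, so zero benefits are skipped.
        return sum((b / mu) * math.log(b / mu) for b in benefits if b > 0) / n
    if alpha == 0:
        # Only defined when every benefit is strictly positive.
        return -sum(math.log(b / mu) for b in benefits) / n
    return sum((b / mu) ** alpha - 1 for b in benefits) / (n * alpha * (alpha - 1))


# Hypothetical labels: one false positive (b = 2) and one false negative (b = 0).
y_true = [1, 0, 1, 0]
y_pred = [1, 1, 0, 0]

ge2 = generalized_entropy_index(y_true, y_pred, alpha=2)     # benefits [1, 2, 0, 1]
theil = generalized_entropy_index(y_true, y_pred, alpha=1)   # Theil index
ge0 = generalized_entropy_index([0, 0], [0, 1], alpha=0)     # benefits [1, 2]
perfect = generalized_entropy_index(y_true, y_true, alpha=2) # all benefits equal 1
```

When predictions match the labels exactly, every benefit equals 1 and the index is 0; unequal benefits push it above 0.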


[3] T. Speicher, H. Heidari, N. Grgic-Hlaca, K. P. Gummadi, A. Singla, A. Weller, and M. B. Zafar, “A Unified Approach to Quantifying Algorithmic Unfairness: Measuring Individual and Group Unfairness via Inequality Indices,” ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2018.

\(GFNR = GFN/P\)

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

\(GFPR = GFP/N\)

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

\(GTNR = GTN/N\)

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

Return the ratio of generalized true positives to positive examples in the dataset, \(GTPR = GTP/P\), optionally conditioned on protected attributes.

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

\(NPV = TN/(TN + FN)\)

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

\(FN = \sum_{i=1}^n \mathbb{1}[y_i = \text{favorable}]\mathbb{1}[\hat{y}_i = \text{unfavorable}]\)

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

\(FP = \sum_{i=1}^n \mathbb{1}[y_i = \text{unfavorable}]\mathbb{1}[\hat{y}_i = \text{favorable}]\)

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

Return the generalized number of false negatives, \(GFN\), the weighted sum of 1 - predicted scores where true labels are ‘favorable’, optionally conditioned on protected attributes.

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

Return the generalized number of false positives, \(GFP\), the weighted sum of predicted scores where true labels are ‘unfavorable’, optionally conditioned on protected attributes.

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

Return the generalized number of true negatives, \(GTN\), the weighted sum of 1 - predicted scores where true labels are ‘unfavorable’, optionally conditioned on protected attributes.

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

Return the generalized number of true positives, \(GTP\), the weighted sum of predicted scores where true labels are ‘favorable’, optionally conditioned on protected attributes.

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

\(\sum_{i=1}^n \mathbb{1}[\hat{y}_i = \text{unfavorable}]\)

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

\(\sum_{i=1}^n \mathbb{1}[\hat{y}_i = \text{favorable}]\)

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

\(TN = \sum_{i=1}^n \mathbb{1}[y_i = \text{unfavorable}]\mathbb{1}[\hat{y}_i = \text{unfavorable}]\)

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

Return the number of instances in the dataset where both the predicted and true labels are ‘favorable’, \(TP = \sum_{i=1}^n \mathbb{1}[y_i = \text{favorable}]\mathbb{1}[\hat{y}_i = \text{favorable}]\), optionally conditioned on protected attributes.

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

Compute various performance measures on the dataset, optionally conditioned on protected attributes.

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Returns: dict – True positive rate, true negative rate, false positive rate, false negative rate, positive predictive value, negative predictive value, false discovery rate, false omission rate, and accuracy (optionally conditioned).
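
The measures listed above all follow from the four confusion counts, using the formulas given elsewhere on this page. The sketch below is a hypothetical helper with illustrative dictionary keys, not the library implementation, and it does not guard against zero denominators.

```python
def performance_measures(tp, fp, tn, fn):
    """Derive rate-style measures from confusion counts."""
    p = tp + fn   # number of actual positives
    n = fp + tn   # number of actual negatives
    return {
        "TPR": tp / p,
        "TNR": tn / n,
        "FPR": fp / n,
        "FNR": fn / p,
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
        "FDR": fp / (tp + fp),
        "FOR": fn / (tn + fn),
        "ACC": (tp + tn) / (p + n),
    }


# Hypothetical counts: P = 5 and N = 5.
measures = performance_measures(tp=3, fp=1, tn=4, fn=2)
```

Several of these come in complementary pairs: TPR + FNR, TNR + FPR, PPV + FDR, and NPV + FOR each sum to 1.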

\(PPV = TP/(TP + FP)\)

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

Alias of num_true_positives().


Alias of positive_predictive_value().


Alias of true_positive_rate().


\(Pr(\hat{Y} = \text{favorable})\)

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

Alias of true_positive_rate().


Alias of true_negative_rate().

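Statistical parity difference, the difference in selection rates for the unprivileged and privileged groups:

\[Pr(\hat{Y} = 1 | D = \text{unprivileged}) - Pr(\hat{Y} = 1 | D = \text{privileged})\]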
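
A short sketch of the difference above, alongside the corresponding ratio of the same two selection rates. The helper `selection_rate` and the prediction lists are made up for illustration; 1 denotes the favorable prediction.

```python
def selection_rate(y_pred, favorable=1):
    """Fraction of instances predicted as favorable."""
    return sum(1 for p in y_pred if p == favorable) / len(y_pred)


# Hypothetical predictions per group.
pred_unprivileged = [1, 0, 0, 0]   # selection rate 0.25
pred_privileged = [1, 1, 0, 0]     # selection rate 0.5

rate_u = selection_rate(pred_unprivileged)
rate_p = selection_rate(pred_privileged)

selection_rate_difference = rate_u - rate_p   # unprivileged minus privileged
selection_rate_ratio = rate_u / rate_p        # unprivileged over privileged
```

A difference of 0 and a ratio of 1 both correspond to equal selection rates; here the unprivileged group is selected half as often, giving a difference of -0.25 and a ratio of 0.5.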

The Theil index is the generalized_entropy_index() with \(\alpha = 1\).


\(TNR = TN/N\)

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

Return the ratio of true positives to positive examples in the dataset, \(TPR = TP/P\), optionally conditioned on protected attributes.

Parameters: privileged (bool, optional) – Boolean prescribing whether to condition this metric on the privileged_groups, if True, or the unprivileged_groups, if False. Defaults to None, meaning this metric is computed over the entire dataset.
Raises: AttributeError – privileged_groups or unprivileged_groups must be provided at initialization to condition on them.

\(TPR_{D = \text{unprivileged}} - TPR_{D = \text{privileged}}\)