aif360.sklearn.metrics.mdss_bias_scan

aif360.sklearn.metrics.mdss_bias_scan(y_true, probas_pred, X=None, *, pos_label=1, scoring='Bernoulli', privileged=True, n_iter=10, penalty=1e-17, **kwargs)[source]

DEPRECATED: Switch to the new interface, aif360.sklearn.detectors.mdss_detector.bias_scan, by version 0.5.0.

Scan to find the highest scoring subset of records.

Bias scan is a technique to identify bias in predictive models using subset scanning [1].

Parameters:
  • y_true (array-like) – Ground truth (correct) target values.
  • probas_pred (array-like) – Probability estimates of the positive class.
  • X (dataframe, optional) – The dataset (containing the features) that was used to predict probas_pred. If not specified, the subset is returned as indices.
  • pos_label (scalar) – Label of the positive class.
  • scoring (str or class) – One of ‘Bernoulli’ or ‘BerkJones’ or subclass of aif360.metrics.mdss.ScoringFunctions.ScoringFunction.
  • privileged (bool) – Direction to scan. True (privileged) scans in the negative direction, for subsets whose observed outcomes are worse than predicted; False (unprivileged) scans in the positive direction, for subsets whose observed outcomes are better than predicted.
  • n_iter (scalar) – Number of iterations (random restarts).
  • penalty (scalar) – Penalty coefficient. Should be positive. A higher penalty yields a less complex highest-scoring subset, i.e. one defined by fewer features and feature values.
  • **kwargs – Additional kwargs to be passed to scoring (not including direction).
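To make the scan direction selected by privileged concrete, the sketch below compares a group's observed rate of pos_label with its mean predicted probability. It is plain Python and does not call aif360; the helper name gap_direction is hypothetical, and it assumes that pos_label is the favorable outcome, so that "worse than predicted" means an observed rate below the predicted one.

```python
def gap_direction(y_true, probas_pred, pos_label=1):
    """Classify the gap between observed and predicted outcomes.

    Hypothetical helper for illustration only; not part of aif360.
    Assumes pos_label is the favorable outcome.
    """
    # Observed fraction of records carrying the positive label.
    observed = sum(1 for y in y_true if y == pos_label) / len(y_true)
    # Mean predicted probability of the positive class.
    expected = sum(probas_pred) / len(probas_pred)
    if observed < expected:
        return "negative"  # observed worse than predicted
    if observed > expected:
        return "positive"  # observed better than predicted
    return "none"


# Toy group: one positive outcome in four, but high predicted probabilities.
y_true = [0, 0, 1, 0]
probas_pred = [0.6, 0.7, 0.8, 0.5]
direction = gap_direction(y_true, probas_pred)
print(direction)
```

A group like this one, where predictions are more optimistic than the observed outcomes, is the kind of subset a scan with privileged=True looks for.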
Returns:

tuple – Highest scoring subset and its bias score

  • subset (dict) – Mapping of features to values defining the highest scoring subset.
  • score (float) – Bias score for that group.
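One way to work with the returned subset is to select the records it describes. The sketch below is plain Python and does not call aif360; it assumes that the mapping associates each feature name with a list of allowed values, and the helper name rows_in_subset and the toy records are hypothetical.

```python
def rows_in_subset(records, subset):
    """Return indices of records matching every feature in subset.

    Hypothetical helper for illustration only; not part of aif360.
    Assumes subset maps feature name -> list of allowed values.
    """
    return [
        i
        for i, row in enumerate(records)
        if all(row[feature] in values for feature, values in subset.items())
    ]


# Toy data standing in for the rows of X.
records = [
    {"sex": "Male", "age": "young"},
    {"sex": "Female", "age": "young"},
    {"sex": "Male", "age": "old"},
    {"sex": "Male", "age": "young"},
]
# Example of what a returned subset might look like under the assumption above.
subset = {"sex": ["Male"], "age": ["young"]}

matching = rows_in_subset(records, subset)
print(matching)
```

A record belongs to the group only if it matches on every listed feature, so adding features to the mapping can only narrow the selection.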

References

[1] Zhang, Z. and Neill, D. B., “Identifying significant predictive bias in classifiers,” arXiv preprint, 2016.