aif360.sklearn.preprocessing.LearnedFairRepresentations
class aif360.sklearn.preprocessing.LearnedFairRepresentations(prot_attr=None, n_prototypes=5, reconstruct_weight=0.01, target_weight=1.0, fairness_weight=50.0, tol=0.0001, max_iter=200, verbose=0, random_state=None)
Learned Fair Representations.
Learned fair representations is a pre-processing technique that finds a latent representation which encodes the data well but obfuscates information about protected attributes [1]. It can also be used as an in-processing method by using the learned target coefficients to make predictions directly.
References
[1] R. Zemel, Y. Wu, K. Swersky, T. Pitassi, and C. Dwork, “Learning Fair Representations,” International Conference on Machine Learning, 2013.
Based on code from https://github.com/zjelveh/learning-fair-representations
Variables: - prot_attr_ (str or list(str)) – Protected attribute(s) used to learn the fair representation.
- groups_ (array, shape (n_groups,)) – A list of group labels known to the transformer.
- classes_ (array, shape (n_classes,)) – A list of class labels known to the transformer.
- priv_group_ (scalar) – The label of the privileged group.
- coef_ (array, shape (n_prototypes, 1) or (n_prototypes, n_classes)) – Coefficients of the intermediate representation for classification.
- prototypes_ (array, shape (n_prototypes, n_features)) – The prototype set used to form a probabilistic mapping to the intermediate representation. These act as clusters and are in the same space as the samples.
- n_iter_ (int) – Actual number of iterations.
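The "probabilistic mapping" from samples to prototypes can be illustrated with a small NumPy sketch. It assumes the formulation in [1], where each sample is softly assigned to prototypes by a softmax over negative squared distances; the function name `prototype_memberships` and the toy data are hypothetical, and the library's own implementation may differ in distance measure and details.

```python
import numpy as np

def prototype_memberships(X, prototypes):
    """Soft-assign each sample to each prototype.

    Returns M with shape (n_samples, n_prototypes); each row sums to 1.
    """
    # Squared Euclidean distance between every sample and every prototype.
    diff = X[:, None, :] - prototypes[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)
    # Softmax over negative distances: closer prototypes get more mass.
    logits = -sq_dist
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))           # 6 toy samples, 3 features
prototypes = rng.normal(size=(5, 3))  # 5 prototypes in the same feature space
M = prototype_memberships(X, prototypes)
X_hat = M @ prototypes                # reconstruction of X from the prototypes
```

Because the prototypes live in the same space as the samples, multiplying the membership matrix by the prototype set yields a reconstruction with the original feature dimensions.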
Parameters: - prot_attr (single label or list-like, optional) – Protected attribute(s) to use when learning the fair representation. If more than one attribute is given, all combinations of values (intersections) are considered. Default is None, meaning all protected attributes from the dataset are used.
- n_prototypes (int, optional) – Size of the set of “prototypes,” Z.
- reconstruct_weight (float, optional) – Weight coefficient on the L_x loss term, A_x.
- target_weight (float, optional) – Weight coefficient on the L_y loss term, A_y.
- fairness_weight (float, optional) – Weight coefficient on the L_z loss term, A_z.
- tol (float, optional) – Tolerance for stopping criteria.
- max_iter (int, optional) – Maximum number of iterations taken for the solver to converge.
- verbose (int, optional) – Verbosity. 0 = silent, 1 = final loss only, 2 = print loss every 50 iterations.
- random_state (int or numpy.random.RandomState, optional) – Seed of the pseudo-random number generator for shuffling data and seeding weights.
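How the three weights trade off against each other can be sketched as follows. This is a minimal illustration assuming the objective from [1], A_x * L_x + A_y * L_y + A_z * L_z, with a binary target and a boolean privileged-group mask; the function `lfr_objective`, the use of means rather than sums, and the toy data are assumptions and not the library's code.

```python
import numpy as np

def lfr_objective(X, y, is_priv, M, prototypes, w,
                  reconstruct_weight=0.01, target_weight=1.0,
                  fairness_weight=50.0):
    """Weighted sum A_x * L_x + A_y * L_y + A_z * L_z (illustrative only)."""
    # L_x: how well the prototype mixture reconstructs the inputs.
    X_hat = M @ prototypes
    L_x = np.mean((X - X_hat) ** 2)
    # L_y: binary cross-entropy of predictions made from the memberships.
    y_hat = np.clip(M @ w, 1e-12, 1 - 1e-12)
    L_y = -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
    # L_z: gap between the two groups' average prototype memberships.
    L_z = np.abs(M[is_priv].mean(axis=0) - M[~is_priv].mean(axis=0)).sum()
    return (reconstruct_weight * L_x
            + target_weight * L_y
            + fairness_weight * L_z)

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
prototypes = rng.normal(size=(5, 3))
y = np.array([1.0, 0.0, 1.0, 0.0])
is_priv = np.array([True, True, False, False])
w = np.full(5, 0.5)  # per-prototype target coefficients in [0, 1]

# Both groups use the prototypes identically, so L_z is zero.
M_fair = np.full((4, 5), 0.2)
# Each group maps to its own prototype, so L_z is large.
M_unfair = np.zeros((4, 5))
M_unfair[:2, 0] = 1.0
M_unfair[2:, 4] = 1.0

loss_fair = lfr_objective(X, y, is_priv, M_fair, prototypes, w)
loss_unfair = lfr_objective(X, y, is_priv, M_unfair, prototypes, w)
```

With the default fairness_weight of 50.0, a mapping that separates the groups by prototype is penalized far more heavily than any difference in reconstruction error, which is what pushes the learned representation to hide group membership.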
Methods
fit – Compute the transformation parameters that lead to fair representations.
fit_transform – Fit to data, then transform it.
get_params – Get parameters for this estimator.
predict – Transform the targets using the learned model parameters.
predict_proba – Transform the targets using the learned model parameters.
score – Return the mean accuracy on the given test data and labels.
set_params – Set the parameters of this estimator.
transform – Transform the dataset using the learned model parameters.
-
__init__(prot_attr=None, n_prototypes=5, reconstruct_weight=0.01, target_weight=1.0, fairness_weight=50.0, tol=0.0001, max_iter=200, verbose=0, random_state=None)
Parameters: - prot_attr (single label or list-like, optional) – Protected attribute(s) to use when learning the fair representation. If more than one attribute is given, all combinations of values (intersections) are considered. Default is None, meaning all protected attributes from the dataset are used.
- n_prototypes (int, optional) – Size of the set of “prototypes,” Z.
- reconstruct_weight (float, optional) – Weight coefficient on the L_x loss term, A_x.
- target_weight (float, optional) – Weight coefficient on the L_y loss term, A_y.
- fairness_weight (float, optional) – Weight coefficient on the L_z loss term, A_z.
- tol (float, optional) – Tolerance for stopping criteria.
- max_iter (int, optional) – Maximum number of iterations taken for the solver to converge.
- verbose (int, optional) – Verbosity. 0 = silent, 1 = final loss only, 2 = print loss every 50 iterations.
- random_state (int or numpy.random.RandomState, optional) – Seed of the pseudo-random number generator for shuffling data and seeding weights.
-
fit(X, y, priv_group=1, sample_weight=None)
Compute the transformation parameters that lead to fair representations.
Parameters: - X (pandas.DataFrame) – Training samples.
- y (array-like) – Training labels.
- priv_group (scalar, optional) – The label of the privileged group.
- sample_weight (array-like, optional) – Sample weights.
Returns: self
-
predict(X)
Transform the targets using the learned model parameters.
Parameters: X (pandas.DataFrame) – Input samples.
Returns: numpy.ndarray – Transformed targets.
-
predict_proba(X)
Transform the targets using the learned model parameters.
Parameters: X (pandas.DataFrame) – Input samples.
Returns: numpy.ndarray – Transformed targets: the probability of each class for each sample, with classes ordered as in self.classes_.
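The class ordering matters when turning probabilities into labels. The sketch below assumes the usual scikit-learn convention that column j of the probability array corresponds to entry j of the fitted class list; the class names and probabilities are made up, and this page does not confirm that predict is implemented this way.

```python
import numpy as np

# Hypothetical fitted class labels, in the order the estimator stores them.
classes = np.array(["denied", "approved"])

# Hypothetical predict_proba output: one row per sample, one column per class.
proba = np.array([
    [0.8, 0.2],
    [0.3, 0.7],
    [0.5, 0.5],
])

# Pick the most probable column for each row, then map it back to a label.
# On an exact tie, argmax returns the first (lowest-index) class.
labels = classes[proba.argmax(axis=1)]
```

Indexing the class array with the argmax keeps the result in the original label type rather than as integer column positions.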
-
transform(X)
Transform the dataset using the learned model parameters.
Parameters: X (pandas.DataFrame) – Input samples.
Returns: pandas.DataFrame – Transformed samples.