FairMOE: counterfactually-fair mixture of experts with levels of interpretability

Published in Machine Learning, 2024

Recommended citation: Germino, Joe, Nuno Moniz, and Nitesh V. Chawla. "FairMOE: counterfactually-fair mixture of experts with levels of interpretability." Machine Learning (2024): 1-21. https://doi.org/10.1007/978-3-031-45275-8_23

Abstract: With the rise of artificial intelligence in our everyday lives, the need for human interpretation of machine learning models’ predictions emerges as a critical issue. Generally, interpretability is viewed as a binary notion with a performance trade-off. Either a model is fully interpretable but lacks the ability to capture more complex patterns in the data, or it is a black box. In this paper, we argue that this view is severely limiting and that interpretability should instead be viewed as a continuous, domain-informed concept. We leverage the well-known Mixture of Experts architecture with user-defined limits on non-interpretability. We extend this idea with a counterfactual fairness module to ensure the selection of consistently fair experts: FairMOE. We perform an extensive experimental evaluation with fairness-related data sets and compare our proposal against state-of-the-art methods. Our results demonstrate that FairMOE is competitive with the leading fairness-aware algorithms in both fairness and predictive measures while providing more consistent performance, competitive scalability, and, most importantly, greater interpretability.
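To make the mechanism described in the abstract concrete, below is a minimal, illustrative sketch of a gate that routes each instance to an expert under two constraints: a user-defined cap on how many predictions may come from non-interpretable experts, and a per-instance counterfactual fairness check. Everything here is an assumption for illustration, not the paper's implementation: the `Expert` class, the `counterfactually_fair` check (flipping a binary protected attribute), the `non_interpretable_budget` parameter, and the accuracy-ranked selection heuristic are all hypothetical.

```python
# Toy FairMOE-style gating sketch. Assumptions (not from the paper):
# binary protected attribute, a fractional budget on black-box
# predictions, and experts ranked by a held-out validation score.
from dataclasses import dataclass
from typing import Callable, Sequence

import numpy as np


@dataclass
class Expert:
    name: str
    interpretable: bool                   # e.g. linear model or shallow tree
    predict: Callable[[np.ndarray], int]  # hypothetical fitted predictor
    val_accuracy: float                   # hypothetical validation score


def counterfactually_fair(expert: Expert, x: np.ndarray, protected_idx: int) -> bool:
    """Flip the binary protected attribute and require the prediction to be
    unchanged -- a simple stand-in for a counterfactual fairness module."""
    x_cf = x.copy()
    x_cf[protected_idx] = 1 - x_cf[protected_idx]
    return expert.predict(x) == expert.predict(x_cf)


def fair_moe_predict(
    experts: Sequence[Expert],
    X: np.ndarray,
    protected_idx: int,
    non_interpretable_budget: float = 0.2,  # user-defined cap (assumed form)
) -> list[int]:
    """Route each instance to the strongest fair expert, spending at most a
    `non_interpretable_budget` fraction of predictions on black boxes."""
    max_black_box = int(non_interpretable_budget * len(X))
    used_black_box = 0
    preds = []
    # Prefer stronger experts first (a simple assumed gating heuristic).
    ranked = sorted(experts, key=lambda e: e.val_accuracy, reverse=True)
    for x in X:
        chosen = None
        for e in ranked:
            if not e.interpretable and used_black_box >= max_black_box:
                continue  # interpretability budget exhausted
            if counterfactually_fair(e, x, protected_idx):
                chosen = e
                break
        if chosen is None:
            # Fall back to the best interpretable expert (assumes one exists).
            chosen = next(e for e in ranked if e.interpretable)
        if not chosen.interpretable:
            used_black_box += 1
        preds.append(chosen.predict(x))
    return preds


# Toy usage with two hypothetical experts on random binary data:
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(10, 4)).astype(float)
experts = [
    Expert("glass_box", True, lambda x: int(x[0] > 0.5), 0.70),
    Expert("black_box", False, lambda x: int(x.sum() > 2), 0.85),
]
print(fair_moe_predict(experts, X, protected_idx=3))
```

The point the sketch tries to capture is the abstract's framing: non-interpretability is spent as a budget rather than treated as a binary property of the whole model, and fairness screening happens per expert and per instance.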

Download paper here