FairMOE: counterfactually-fair mixture of experts with levels of interpretability
Published in Machine Learning, 2024
Recommended citation: Germino, Joe, Nuno Moniz, and Nitesh V. Chawla. "FairMOE: counterfactually-fair mixture of experts with levels of interpretability." Machine Learning (2024): 1-21.
Abstract: With the rise of artificial intelligence in our everyday lives, the need for human interpretation of machine learning models’ predictions emerges as a critical issue. Interpretability is generally viewed as a binary notion with a performance trade-off: either a model is fully interpretable but unable to capture more complex patterns in the data, or it is a black box. In this paper, we argue that this view is severely limiting and that interpretability should instead be viewed as a continuous, domain-informed concept. We leverage the well-known Mixture of Experts architecture with user-defined limits on non-interpretability, and we extend this idea with a counterfactual fairness module to ensure the selection of consistently fair experts: FairMOE. We perform an extensive experimental evaluation on fairness-related data sets and compare our proposal against state-of-the-art methods. Our results demonstrate that FairMOE is competitive with the leading fairness-aware algorithms in both fairness and predictive measures while providing more consistent performance, competitive scalability, and, most importantly, greater interpretability.
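To make the idea concrete, below is a minimal, illustrative sketch of the setup the abstract describes, not the authors' implementation: a pool of interpretable and black-box experts, a per-instance routing rule constrained by a user-defined non-interpretability budget, and a counterfactual-fairness check that drops experts whose prediction flips when the protected attribute is flipped. All names here (`fairmoe_predict`, `noninterpretability_budget`, `counterfactually_fair`) and the routing heuristic are assumptions for illustration only.

```python
# Illustrative sketch of a budgeted, counterfactually-fair mixture of experts.
# NOT the paper's implementation; expert choices and routing rule are assumed.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic tabular data: column 0 plays the role of a binary protected attribute.
X = rng.normal(size=(2000, 6))
X[:, 0] = rng.integers(0, 2, size=2000)
y = (X[:, 1] + 0.5 * X[:, 2] + 0.3 * X[:, 0] > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Experts grouped by whether a human can directly inspect their decision logic.
interpretable_experts = [LogisticRegression(max_iter=1000),
                         DecisionTreeClassifier(max_depth=3, random_state=0)]
black_box_experts = [GradientBoostingClassifier(random_state=0)]
for m in interpretable_experts + black_box_experts:
    m.fit(X_tr, y_tr)

def counterfactually_fair(model, x, protected_col=0):
    """Keep an expert for this instance only if flipping the protected
    attribute leaves its prediction unchanged (hypothetical check)."""
    x_cf = x.copy()
    x_cf[protected_col] = 1 - x_cf[protected_col]
    return model.predict([x])[0] == model.predict([x_cf])[0]

def fairmoe_predict(X_eval, noninterpretability_budget=0.2):
    """Route each instance to the most confident expert passing the fairness
    check, using black-box experts on at most a user-defined share of instances
    and falling back to interpretable experts when no expert passes."""
    black_box_allowed = int(noninterpretability_budget * len(X_eval))
    black_box_used = 0
    preds = []
    for x in X_eval:
        fair_interp = [m for m in interpretable_experts if counterfactually_fair(m, x)]
        fair_bb = [m for m in black_box_experts if counterfactually_fair(m, x)]
        # Prefer fair interpretable experts; never leave an instance unserved.
        candidates = fair_interp or interpretable_experts
        if fair_bb and black_box_used < black_box_allowed:
            candidates = candidates + fair_bb
        best = max(candidates, key=lambda m: m.predict_proba([x]).max())
        if any(best is bb for bb in black_box_experts):
            black_box_used += 1
        preds.append(best.predict([x])[0])
    return np.array(preds)

y_hat = fairmoe_predict(X_te, noninterpretability_budget=0.2)
print("test accuracy:", (y_hat == y_te).mean())
```

Raising `noninterpretability_budget` toward 1.0 lets more instances be served by black-box experts, while setting it to 0 yields a fully interpretable model; the fairness check is applied regardless of which group an expert belongs to.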