
Machine learning is an integral part of high-stakes decision-making in a broad swath of human-computer interactions. You apply for a job. You submit a loan application. Algorithms determine who advances and who is declined.
Computer scientists from the University of California San Diego and the University of Wisconsin – Madison are challenging the common practice of using a single machine learning (ML) model to make such critical decisions. They asked how people feel when “equally good” ML models reach different conclusions.
Associate Professor Loris D’Antoni of the Jacobs School of Engineering Department of Computer Science and Engineering led the research, which was presented at the 2025 Conference on Human Factors in Computing Systems (CHI). The paper, “Perceptions of the Fairness Impacts of Multiplicity in Machine Learning,” outlines work that D’Antoni began with fellow researchers during his tenure at the University of Wisconsin and continues today at UC San Diego.
D’Antoni and his team built on existing evidence that distinct models, like their human counterparts, can reach different outcomes. In other words, one good model might reject an application while another approves it. Naturally, this raises questions about how objective decisions can be reached.
“ML researchers posit that current practices pose a fairness risk. Our research dug deeper into this problem. We asked lay stakeholders, or regular people, how they think decisions should be made when multiple highly accurate models give different predictions for a given input,” said D’Antoni.
The study uncovered a few significant findings. First, the stakeholders balked at the standard practice of relying on a single model, especially when multiple models disagreed. Second, participants rejected the notion that decisions should be randomized in such instances.
“We find these results interesting because these preferences contrast with standard practice in ML development and philosophy research on fair practices,” said first author and PhD student Anna Meyer, who was advised by D’Antoni at the University of Wisconsin and will start as an assistant professor at Carleton College in the fall.
The team hopes these insights will guide future model development and policy. Key recommendations include expanding searches over a range of models and implementing human decision-making to adjudicate disagreements – especially in high-stakes settings.
Other members of the research team include Aws Albarghouthi, an associate professor in computer science at the University of Wisconsin, and Yea-Seul Kim from Apple. The work was conducted with the support of NSF grants.
UC San Diego’s Steven Dow, a professor in the Department of Cognitive Science and in the Design Lab and an affiliate professor in computer science, represented the university at CHI 2025 with two additional papers: DesignWeaver: Dimensional Scaffolding for Text-to-Image Product Design and Productive vs. Reflective: How Different Ways of Integrating AI into Design Workflows Affect Cognition and Motivation.
Haijun Xia, an assistant professor in the Department of Cognitive Science and in the Design Lab who is also affiliated with computer science, had four papers accepted at CHI 2025: Generative and Malleable User Interfaces with Generative and Evolving Task-Driven Data Model, Malleable Overview-Detail Interfaces, Compositional Structures in Substrates for Human-AI Co-Creation Environment: A Design Approach and Case Study, and The Shapes of Abstraction in Data Structure Diagrams, which received an Honorable Mention.
Organized by the Association for Computing Machinery, CHI is the premier international conference on human-computer interaction.
By Kimberley Clementi