In modern medical settings, dermatologists may rely on AI models to evaluate skin lesions for cancer risk. But if those models are biased against certain skin tones, they will not assess risk accurately for all patients. Bias in AI, often rooted in a model's training data and architecture, remains a significant challenge, and it is especially consequential in medicine, where a model that performs poorly for some groups becomes a patient-safety issue.
A recent study by researchers from MIT, Worcester Polytechnic Institute, and Google introduces a new debiasing method called “Weighted Rotational DebiasING” (WRING). The technique, presented at the 2026 International Conference on Learning Representations, targets vision-language models (VLMs) such as OpenCLIP, an open-source implementation of OpenAI’s CLIP. VLMs process multiple data types, such as images and text, in a single shared representation. Previous debiasing methods, like projection debiasing, suffer from the “Whac-A-Mole dilemma,” a term coined in AI research in 2023: suppressing one bias can cause another to pop up elsewhere.
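For orientation, a CLIP-style VLM embeds images and text into one vector space and compares them by cosine similarity, which is what makes post-hoc debiasing of that space possible. Below is a minimal sketch using the open_clip library; the image file and label prompts are hypothetical stand-ins, not from the study.

```python
import torch
import open_clip
from PIL import Image

# Load an OpenCLIP model (names follow the open_clip library's conventions).
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")

image = preprocess(Image.open("lesion.jpg")).unsqueeze(0)  # hypothetical input image
text = tokenizer(["a benign skin lesion", "a malignant skin lesion"])

with torch.no_grad():
    img_emb = model.encode_image(image)
    txt_emb = model.encode_text(text)
    # Both modalities land in the same embedding space; normalized dot
    # products (cosine similarities) score image-text matches.
    img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
    txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)
    probs = (100.0 * img_emb @ txt_emb.T).softmax(dim=-1)
```

Because the debiasing methods discussed below operate on these embedding vectors after the fact, neither requires touching the model's weights.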
Projection debiasing removes bias by projecting unwanted information out of the model’s representation space, but in doing so it often distorts other relationships the model has learned. Walter Gerych, the study’s lead author and a computer science professor at Worcester Polytechnic Institute, notes that projecting out one concept can unintentionally change how the model relates many others. He collaborated with MIT students Cassandra Parent and Quinn Perian, Google’s Rafiya Javed, and MIT professors Justin Solomon and Marzyeh Ghassemi.
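Concretely, projection debiasing subtracts each embedding's component along an estimated bias axis. The sketch below is a generic illustration of that idea rather than the paper's exact procedure; the function name and the suggested way of estimating the axis are assumptions.

```python
import numpy as np

def projection_debias(embeddings: np.ndarray, bias_direction: np.ndarray) -> np.ndarray:
    """Zero out each row's component along `bias_direction`.

    After this, no embedding carries signal along the bias axis, but
    every concept that happened to correlate with that axis is
    distorted too, which is the side effect Gerych describes.
    """
    v = bias_direction / np.linalg.norm(bias_direction)  # unit bias axis
    return embeddings - np.outer(embeddings @ v, v)      # subtract projections

# Hypothetical usage: estimate a bias axis as the difference of two prompt
# embeddings, e.g. embed("a photo of a man") - embed("a photo of a woman"),
# then pass a batch of embeddings (one per row) through projection_debias.
```

Collapsing an entire direction is the root of the Whac-A-Mole dilemma: the information is removed for every concept at once, not just for the biased association.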
While projection debiasing can stop a model from acting on a specific bias, it may inadvertently create new ones. Ghassemi illustrates the problem with an example: removing racial bias from a model could unintentionally increase its gender bias. WRING instead rotates certain coordinates in the model’s high-dimensional representation space, altering the angles associated with a bias without disturbing the model’s other learned relationships. Like projection debiasing, WRING is a post-processing technique, so it can be applied to pre-trained models without retraining them.
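The article does not detail WRING's algorithm, but the geometric intuition is sketchable: a rotation confined to the plane spanned by two directions changes angles involving those directions while acting as the identity on everything orthogonal to the plane, and, being an orthogonal map, it preserves all norms and pairwise similarities among the rotated vectors. The construction and the choice of angle below are a hypothetical illustration of that property, not the authors' method.

```python
import numpy as np

def plane_rotation(v: np.ndarray, w: np.ndarray, theta: float) -> np.ndarray:
    """Orthogonal d x d matrix rotating by `theta` in the plane span{v, w}
    and acting as the identity on the orthogonal complement."""
    v = v / np.linalg.norm(v)
    w = w - (w @ v) * v                  # Gram-Schmidt: orthogonalize w against v
    w = w / np.linalg.norm(w)
    c, s = np.cos(theta), np.sin(theta)
    return (np.eye(v.size)
            + (c - 1.0) * (np.outer(v, v) + np.outer(w, w))
            + s * (np.outer(w, v) - np.outer(v, w)))

rng = np.random.default_rng(0)
d = 512
bias_axis = rng.normal(size=d)   # hypothetical bias direction (e.g. a gender axis)
query = rng.normal(size=d)       # hypothetical concept direction to de-bias
R = plane_rotation(bias_axis, query, theta=0.3)   # angle choice is illustrative

x, y = rng.normal(size=d), rng.normal(size=d)     # stand-ins for other embeddings
# Rotations are isometries: norms and pairwise similarities survive intact,
# the property that lets a rotational method avoid projection's collateral damage.
assert np.isclose(np.linalg.norm(R @ x), np.linalg.norm(x))
assert np.isclose((R @ x) @ (R @ y), x @ y)
```

Unlike the projection above, nothing here loses rank: the map is invertible, so no concept is irrecoverably erased.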
Gerych emphasizes WRING’s efficiency: because it requires no retraining, it sidesteps the most resource-intensive part of fixing a model. In experiments, WRING reduced targeted biases without introducing new ones, though it currently applies only to Contrastive Language-Image Pre-training (CLIP) models. “Extending this for ChatGPT-style, generative language models is the reasonable next step for us,” Gerych says. The work was supported by several awards and foundations, including the National Science Foundation and an MIT-Google Computing Innovation Award.
Original Source: news.mit.edu
