
The promise of artificial intelligence (AI) to revolutionize healthcare, from diagnostic support to treatment recommendations, is undeniable. However, a recent MIT study highlights a critical yet often overlooked vulnerability in these sophisticated systems: common human typing errors and linguistic nuances. For healthcare professionals eager to leverage AI’s potential, understanding these limitations is paramount, especially as they intersect with existing health disparities, particularly for Black patients.
Presented at an Association for Computing Machinery conference, the MIT research reveals that seemingly innocuous errors—such as typos, extra white spaces, missing gender references, or the use of slang—can significantly compromise an AI’s ability to accurately analyze patient records. The consequences are far from trivial: these human mistakes can skew AI’s recommendations, increasing the likelihood of an AI suggesting self-management over a necessary clinical appointment or lab test.
Lead researcher Abinitha Gourabathina, a graduate student in the MIT Department of Electrical Engineering and Computer Science, emphasized the disconnect between AI training and real-world application. “These models are often trained and tested on medical exam questions but then used in tasks that are pretty far from that, like evaluating the severity of a clinical case,” Gourabathina noted. This discrepancy is crucial: the “cleaned and structured” medical datasets typically used for AI training rarely reflect the messy, often informal nature of real-world patient communication.
The study’s methodology involved deliberately “perturbing” patient records. Researchers swapped or removed gender references, inserted extra spaces or typos, and added “colorful” or “uncertain” language. Colorful language included exclamations like “wow” or adverbs like “really” or “very,” while uncertain language featured hedge words such as “kind of,” “sort of,” “possibly,” or “suppose.” Even with all critical clinical data—like medications and diagnoses—preserved, these linguistic alterations significantly impacted AI output. When presented with this altered data, the four different AIs tested were 7% to 9% more likely to recommend self-care, with colorful language having the most pronounced effect.
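The perturbations the researchers describe can be sketched as a handful of string transformations applied to a clinical note. The sketch below is a minimal illustration, assuming hand-picked hedge words and random edit positions; the function names and exact edits are assumptions for demonstration, not the study's actual code.

```python
import random

# Hedge words modeled on the study's description of "uncertain" language;
# the exact list and insertion rule here are illustrative assumptions.
HEDGES = ["kind of", "sort of", "possibly"]

def add_extra_whitespace(text: str, rng: random.Random) -> str:
    """Double a randomly chosen inter-word space."""
    words = text.split(" ")
    if len(words) < 2:
        return text
    i = rng.randrange(len(words) - 1)
    words[i] = words[i] + " "  # the extra space survives the join below
    return " ".join(words)

def add_typo(text: str, rng: random.Random) -> str:
    """Swap two adjacent characters at a random position."""
    if len(text) < 2:
        return text
    i = rng.randrange(len(text) - 1)
    return text[:i] + text[i + 1] + text[i] + text[i + 2:]

def add_uncertain_language(text: str, rng: random.Random) -> str:
    """Prefix the note with a hedge word."""
    return f"I {rng.choice(HEDGES)} think: {text}"

def perturb(text: str, seed: int = 0) -> str:
    """Apply all three perturbations, deterministically for a given seed."""
    rng = random.Random(seed)
    for fn in (add_extra_whitespace, add_typo, add_uncertain_language):
        text = fn(text, rng)
    return text

note = "Patient reports chest pain radiating to the left arm since morning."
print(perturb(note))
```

Feeding such perturbed notes to a triage model alongside the clean originals, and comparing its recommendations, is the essence of the study's comparison.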
Disproportionate Impact on Women and Black Patients
Perhaps most concerning for healthcare equity, the study found that AI models made approximately 7% more errors for female patients, frequently recommending self-management at home even when explicit gender cues were removed. This finding resonates with existing research on algorithmic bias, where AI systems, due to biases embedded in their training data, can perpetuate and amplify societal inequities.
While the MIT study specifically highlights the impact on women, its implications for Black patients are profound and warrant immediate attention from healthcare professionals. Black patients already face systemic biases in healthcare, including historical mistreatment, implicit bias from providers, and disparities in access to care. When AI systems are fed data that may inadvertently contain linguistic patterns or colloquialisms more prevalent in certain communities, or when data reflecting the nuances of Black patient experiences is underrepresented in training sets, these systems can inadvertently exacerbate existing health inequities.
Consider the potential for “colorful” or “uncertain” language. Linguistic expressions and communication styles can vary significantly across different cultural and racial groups. If AI models are primarily trained on data reflecting a dominant linguistic style, they may misinterpret or devalue information conveyed in other ways. For instance, a Black patient using a common idiom or a more expressive communication style to describe symptoms might be misinterpreted by an AI trained on formal, clinical language, leading to an inaccurate assessment of their condition severity or an inappropriate recommendation for self-care. This could delay critical interventions, worsen health outcomes, and further erode trust in the healthcare system among Black communities.
Furthermore, if historical data used for AI training contains biases related to the underdiagnosis or undertreatment of conditions in Black patients, the AI could learn and perpetuate these harmful patterns. This is particularly concerning given the documented disparities in the diagnosis and treatment of pain, cardiovascular disease, and mental health conditions in Black individuals.
Moving Forward: Mitigating Bias and Ensuring Equity
For healthcare organizations and professionals integrating AI, these findings underscore the urgent need for a multi-faceted approach:
- Diverse and Representative Training Data: Prioritizing the development of AI models trained on vast, diverse, and meticulously curated datasets that accurately reflect the linguistic and demographic variations of the entire patient population, including Black patients. This involves proactive efforts to address historical data gaps and biases.
- Robust Pre-processing and Error Handling: Implementing advanced natural language processing (NLP) techniques that are more resilient to common human errors, slang, and linguistic variations. This includes developing algorithms specifically designed to normalize text and identify intent despite imperfections.
- Bias Auditing and Validation: Regular and rigorous auditing of AI algorithms for bias against specific demographic groups, particularly racial and ethnic minorities. This goes beyond simply testing for accuracy and involves actively looking for disproportionate impacts on patient recommendations.
- Human-in-the-Loop Oversight: Maintaining strong human oversight in AI-driven decision-making. AI should serve as a powerful assistant, not a replacement for human clinical judgment, especially when patient outcomes could be jeopardized by algorithmic errors or biases. Follow-up research from MIT found that human clinicians were unaffected by the same perturbations, underscoring the value of human clinical judgment.
- Patient Education and Transparency: Educating patients about the limitations of AI and encouraging clear, direct communication of symptoms, while also ensuring that AI systems are not used in ways that inadvertently create new barriers to care.
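The bias-auditing step above can be sketched as a simple disparity check on model outputs: compare the rate at which a triage model recommends self-care across demographic groups and flag gaps beyond a tolerance. The data, field names, and threshold below are illustrative assumptions, not figures from the study.

```python
from collections import defaultdict

def self_care_rates(records):
    """records: iterable of (group, recommendation) pairs, where
    recommendation is e.g. 'self-care' or 'clinical-visit'."""
    totals = defaultdict(int)
    self_care = defaultdict(int)
    for group, rec in records:
        totals[group] += 1
        if rec == "self-care":
            self_care[group] += 1
    return {g: self_care[g] / totals[g] for g in totals}

def audit_disparity(records, max_gap=0.05):
    """Flag if the self-care rate gap between any two groups exceeds max_gap."""
    rates = self_care_rates(records)
    gap = max(rates.values()) - min(rates.values())
    return {"rates": rates, "gap": gap, "flagged": gap > max_gap}

# Toy audit data (illustrative only, not study results)
records = [
    ("female", "self-care"), ("female", "self-care"), ("female", "clinical-visit"),
    ("male", "self-care"), ("male", "clinical-visit"), ("male", "clinical-visit"),
]
print(audit_disparity(records))
```

In practice an audit like this would run over held-out patient records with known group labels and statistically test the gap, but even this toy check makes the "disproportionate impact" question concrete and repeatable.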
As AI continues to embed itself in healthcare, understanding and actively mitigating its vulnerabilities, especially those that disproportionately affect marginalized populations like Black patients, is not just a technical challenge—it is an ethical imperative for achieving true health equity.
