Evaluating gender bias in large language models in long-term care

Rickman, S.

(2025). Evaluating gender bias in large language models in long-term care. BMC Medical Informatics and Decision Making, 25(1). https://doi.org/10.1186/s12911-025-03118-0


Abstract

Background: Large language models (LLMs) are being used to reduce the administrative burden in long-term care by automatically generating and summarising case notes. However, LLMs can reproduce bias in their training data. This study evaluates gender bias in summaries of long-term care records generated with two state-of-the-art, open-source LLMs released in 2024: Meta’s Llama 3 and Google’s Gemma.

Methods: Gender-swapped versions were created of long-term care records for 617 older people from a London local authority. Summaries of the male and female versions were generated with Llama 3 and Gemma, as well as with two benchmark models released in 2019: Google’s T5 and Meta’s BART. Counterfactual bias was quantified through sentiment analysis, alongside an evaluation of word frequency and thematic patterns.

Results: The benchmark models exhibited some variation in output on the basis of gender. Llama 3 showed no gender-based differences across any metrics. Gemma displayed the most significant gender-based differences. Summaries of the male versions focused more on physical and mental health issues. Language used for men was more direct, and women’s needs were downplayed more often than men’s.

Conclusion: Care services are allocated on the basis of need. If women’s health issues are underemphasised, this may lead to gender-based disparities in service receipt. LLMs may offer substantial benefits in easing administrative burden. However, the findings highlight the variation between state-of-the-art LLMs and the need to evaluate them for bias. The methods in this paper provide a practical framework for quantitative evaluation of gender bias in LLMs. The code is available on GitHub.
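The counterfactual approach described under Methods can be illustrated with a minimal Python sketch. It is not the study's code: the swap table, the word lists, the `toy_sentiment` scorer and the example summaries are all hypothetical placeholders invented for illustration, and this record does not say which sentiment metrics or statistical models the paper used. The sketch only shows the general shape of the idea: create a gender-swapped copy of a record, summarise both versions with the model under test, and compare a score across the paired summaries.

```python
import re
from statistics import mean

# Hypothetical, deliberately tiny male-to-female swap table.
# Only one direction is shown, because the reverse mapping is ambiguous
# ("her" can correspond to "him" or "his").
SWAPS = {
    "he": "she", "him": "her", "his": "her", "himself": "herself",
    "mr": "mrs", "man": "woman", "male": "female",
}

def swap_gender(text: str) -> str:
    """Replace male terms with female counterparts, keeping capitalisation."""
    def repl(match: re.Match) -> str:
        word = match.group(0)
        swapped = SWAPS.get(word.lower())
        if swapped is None:
            return word  # leave every other word untouched
        return swapped.capitalize() if word[0].isupper() else swapped
    return re.sub(r"\b\w+\b", repl, text)

# Hypothetical word lists standing in for a real sentiment model.
POSITIVE = {"independent", "able", "manages", "well"}
NEGATIVE = {"unable", "poor", "complex", "disabled"}

def toy_sentiment(text: str) -> int:
    """Count positive words minus negative words (a crude stand-in)."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)

def mean_paired_difference(pairs, score=toy_sentiment) -> float:
    """Mean of score(female summary) - score(male summary) over paired summaries."""
    return mean(score(female) - score(male) for male, female in pairs)

if __name__ == "__main__":
    record = "Mr Smith is unable to wash himself. He has poor mobility."
    print(swap_gender(record))

    # Invented placeholder summaries, not output from any model in the study.
    pairs = [
        ("He is unable to manage and has poor mobility.",
         "She manages well despite her limitations."),
    ]
    print(mean_paired_difference(pairs))
```

Because each pair of summaries comes from the same underlying record, a mean difference far from zero would point to the model treating the two versions differently. A real evaluation would replace the toy scorer with a validated sentiment measure and test the difference statistically over many records.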

Item Type	Article
Copyright holders	© 2025 The Author(s)
Departments	LSE > Research Centres > Care Policy and Evaluation Centre
DOI	10.1186/s12911-025-03118-0
Date Deposited	17 July 2025
Acceptance Date	17 July 2025
URI	https://researchonline.lse.ac.uk/id/eprint/128867

Explore Further

Rickman, Samuel

https://www.scopus.com/pages/publications/105013209883 (Scopus publication)

Rickman, S. (2024). samrickman/evaluate-llm-gender-bias-ltc: v1.0.0. [Dataset]. London School of Economics and Political Science. https://doi.org/10.5281/zenodo.14176609


Published Version, licensed under Creative Commons: Attribution 4.0
