American Association for Hand Surgery

Back to 2026 ePosters


ChatGPT as a Translation Tool for Hand Patient Education Documents
Karishma R Desai, BS (1); Ivan F Rubel, MD (1); Viviana M Serra-Lopez, MD, MS (1); Roberto Hernandez-Irizarry, MD (1); Nicole Zelenski, MD (2)
(1) Emory University, Atlanta, GA; (2) Emory School of Medicine, Atlanta, GA

Introduction: This study explores whether ChatGPT can effectively translate English hand patient education documents into Spanish while maintaining readability.

Methods: English patient-education documents for six hand pathologies (carpal tunnel syndrome, de Quervain's tenosynovitis, trigger finger, Dupuytren's contracture, hand fractures, and wrist fractures) were obtained from the ASSH website. Each document was entered into ChatGPT with the instruction "Translate the following text into Spanish" and was also translated using Google Translate. Three bilingual orthopaedic surgeons rated each document for accuracy (1: completely incorrect - 6: completely correct), completeness (1: incomplete response - 3: comprehensive response), fluency (1: no fluency - 5: perfect fluency), adequacy (1: 0% of original information conveyed - 5: 100% of original information conveyed), meaning (1: totally different meaning from original - 5: same meaning as original), and severity (1: dangerous to patient care - 5: no effect on patient care). Readability was scored using average reading level consensus calculators for both English and Spanish documents.
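
The rating scheme above can be illustrated with a minimal Python sketch that averages three raters' scores per category for a single document. The scale bounds come from the Methods; the function name `average_ratings`, the data layout, and the example scores are hypothetical and are not the study's data or analysis code.

```python
from statistics import mean

# Rating scales as stated in the Methods: category -> (minimum, maximum).
SCALES = {
    "accuracy": (1, 6),
    "completeness": (1, 3),
    "fluency": (1, 5),
    "adequacy": (1, 5),
    "meaning": (1, 5),
    "severity": (1, 5),
}


def average_ratings(ratings):
    """Average each category across raters after checking scale bounds.

    `ratings` is a list with one dict per rater, mapping category -> score.
    """
    averages = {}
    for category, (low, high) in SCALES.items():
        scores = [rater[category] for rater in ratings]
        for score in scores:
            if not low <= score <= high:
                raise ValueError(f"{category} score {score} outside {low}-{high}")
        averages[category] = round(mean(scores), 2)
    return averages


# Hypothetical scores from three raters for one translated document.
example = [
    {"accuracy": 6, "completeness": 3, "fluency": 4, "adequacy": 5, "meaning": 5, "severity": 5},
    {"accuracy": 5, "completeness": 3, "fluency": 4, "adequacy": 5, "meaning": 4, "severity": 5},
    {"accuracy": 5, "completeness": 3, "fluency": 5, "adequacy": 4, "meaning": 5, "severity": 4},
]

document_averages = average_ratings(example)
```

Per-document averages of this kind would then be pooled across the six documents for each translation source before any comparison between sources.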

Results: No documents received an average completeness rating below 2 (adequate), fluency below 3 (good), adequacy below 4 (75% original information conveyed), or severity score below 4 (unclear effect on patient care). ChatGPT translations had the highest average scores among the translated documents across all categories (Accuracy 5.33, Completeness 3, Fluency 4.01, Adequacy 4.89, Meaning 4.78, Severity 4.89). Average scores for all categories are visualized in Figures 1-6.

ASSH and Google translations had significantly lower mean accuracy than the English originals. There was no significant difference between mean accuracy of English originals and ChatGPT translations (5.72 vs 5.33). There was no significant difference in completeness ratings between English documents and translations.

There was no significant difference in fluency or severity among the translations. ChatGPT translations had significantly higher average adequacy (4.89 vs 4.28) and meaning (4.77 vs 4.11) than ASSH translations, but not significantly higher than the adequacy (4.89 vs 4.66) or meaning (4.77 vs 4.50) of Google translations.

The average readability of the English documents was 8.89, equivalent to a 9th-grade reading level. All translation types had an average readability corresponding to an 8th-grade reading level: ASSH 8.11, ChatGPT 8.43, and Google 8.25 (Figure 7). There was no significant difference in readability between the English and Spanish texts (p=0.18, 0.61, 0.33), or between Spanish translations.
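
The mapping from average readability score to grade level reported above is consistent with rounding to the nearest whole grade. The short sketch below checks that arithmetic; the rounding convention and the helper name `grade_level` are assumptions, since the abstract does not state how the consensus calculators assign a grade.

```python
import math


def grade_level(score):
    """Round a readability score to the nearest whole grade (assumed convention)."""
    return math.floor(score + 0.5)


# Average readability scores as reported in the Results.
reported = {"English": 8.89, "ASSH": 8.11, "ChatGPT": 8.43, "Google": 8.25}

grades = {name: grade_level(score) for name, score in reported.items()}
```

Under this assumption the English originals map to grade 9 and all three Spanish versions map to grade 8, matching the text.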

Conclusion: ChatGPT may be an effective tool to improve hand patient access to care and health literacy across language barriers.