How does ChatGPT perform on the European Board of Orthopedics and Traumatology examination? A comparative study
dc.contributor.author | Ulus, Sait Anil | |
dc.date.accessioned | 2024-04-24T17:20:46Z | |
dc.date.available | 2024-04-24T17:20:46Z | |
dc.date.issued | 2023 | |
dc.department | Dicle Üniversitesi | en_US |
dc.description.abstract | Objective: The objective of this investigation was to compare the accuracy and responsiveness of GPT-3.5 and GPT-4 in the domain of Orthopedics and Traumatology. Specifically, our aim was to evaluate their capacity to provide accurate answers to a series of sample questions derived from the European Board of Orthopedics and Traumatology (EBOT) exam. Methods: The study was conducted from 10 May 2023 to 15 May 2023. It involved a comparative analysis of two AI language models, GPT-3.5 and GPT-4, in the field of Orthopedics and Traumatology, focusing on sample questions extracted from the EBOT exam. Two separate sets, each containing 80 sample questions (160 questions in total), were compiled from the pool of available EBOT sample questions. Results: A total of 160 questions were included in the study, 80 from each group. In General Orthopedics, GPT-4 achieved a higher success rate (75%) than GPT-3.5 (45%) (p=0.053). In Traumatology, GPT-4 delivered a success rate of 80%, significantly higher than GPT-3.5's (p=0.010). In Oncological Orthopedic Surgery, both models showed a similar trend (p=0.057). Overall, GPT-4 outperformed GPT-3.5 across all domains, with a cumulative success rate of 75% versus 43.75% (p<0.001). Considering the overall responses, the odds ratio between GPT-4 and GPT-3.5 was 3.8. Conclusions: Based on the findings of this comparative study, GPT-4 is markedly superior to GPT-3.5 in answering the EBOT exam sample questions. | en_US |
dc.identifier.doi | 10.3306/AJHS.2023.38.06.43 | |
dc.identifier.issn | 1579-5853 | |
dc.identifier.issn | 2255-0569 | |
dc.identifier.issue | 6 | en_US |
dc.identifier.uri | https://doi.org/10.3306/AJHS.2023.38.06.43 | |
dc.identifier.uri | https://hdl.handle.net/11468/19238 | |
dc.identifier.volume | 38 | en_US |
dc.identifier.wos | WOS:001130784100009 | |
dc.identifier.wosquality | N/A | |
dc.indekslendigikaynak | Web of Science | |
dc.language.iso | es | en_US |
dc.publisher | Reial Acad Medicina Illes Balears | en_US |
dc.relation.ispartof | Medicina Balear | |
dc.relation.publicationcategory | Makale - Uluslararası Hakemli Dergi - Kurum Öğretim Elemanı | en_US |
dc.rights | info:eu-repo/semantics/closedAccess | en_US |
dc.subject | ChatGPT | en_US |
dc.subject | Orthopedics | en_US |
dc.subject | Traumatology | en_US |
dc.subject | EBOT | en_US |
dc.subject | Exam | en_US |
dc.title | How does ChatGPT perform on the European Board of Orthopedics and Traumatology examination? A comparative study | en_US |
dc.type | Article | en_US |