How does ChatGPT perform on the European Board of Orthopedics and Traumatology examination? A comparative study

dc.contributor.authorUlus, Sait Anil
dc.date.accessioned2024-04-24T17:20:46Z
dc.date.available2024-04-24T17:20:46Z
dc.date.issued2023
dc.departmentDicle Üniversitesien_US
dc.description.abstractObjective: The objective of this investigation was to comprehensively compare the precision and responsiveness of GPT-3.5 and GPT-4 within the domain of Orthopedics and Traumatology. Specifically, our aim was to evaluate their capacity to provide accurate answers to a series of sample questions derived from the European Board of Orthopedics and Traumatology (EBOT) exam. Methods: The study was conducted from 10 May 2023 to 15 May 2023. It involved a comparative analysis of two AI language models, GPT-3.5 and GPT-4, in the field of Orthopedics and Traumatology, focusing on sample questions extracted from the EBOT exam. Two separate sets, each containing 80 sample questions (160 questions in total), were compiled from the pool of available EBOT sample questions. Results: A total of 160 questions were included in the study, 80 from each group. In General Orthopedics, GPT-4 achieved a higher success rate (75%) than GPT-3.5 (45%) (p=0.053). In Traumatology, GPT-4 achieved a notable success rate of 80% compared with GPT-3.5's (p=0.010). In Oncological Orthopedic Surgery, both models showed a similar trend (p=0.057). Overall, GPT-4 outperformed GPT-3.5 across all domains, with a cumulative success rate of 75% versus 43.75% (p<0.001). Considering the overall responses, the odds ratio between GPT-4 and GPT-3.5 was 3.8. Conclusions: Based on the findings of this comparative study, GPT-4 is clearly superior to GPT-3.5 in answering the EBOT exam sample questions.en_US
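
The overall odds ratio of 3.8 reported in the abstract follows from the stated success rates: 75% of 80 questions gives 60 correct answers for GPT-4, and 43.75% of 80 gives 35 correct answers for GPT-3.5. Below is a minimal sketch of that check, assuming these back-calculated counts and using SciPy purely as an illustrative tool (the record does not state which statistical software the authors used):

    # Illustrative check of the abstract's overall figures.
    # Assumed counts, back-calculated from the reported percentages:
    #   GPT-4:   75%    of 80 -> 60 correct, 20 incorrect
    #   GPT-3.5: 43.75% of 80 -> 35 correct, 45 incorrect
    from scipy.stats import fisher_exact, chi2_contingency

    table = [[60, 20],   # GPT-4: correct, incorrect
             [35, 45]]   # GPT-3.5: correct, incorrect

    # Sample odds ratio (a*d)/(b*c) = (60*45)/(20*35) = 3.86, i.e. the reported 3.8.
    manual_or = (60 * 45) / (20 * 35)

    # fisher_exact returns the same sample odds ratio plus an exact p-value;
    # chi2_contingency gives a chi-square p-value, which here is well below 0.001.
    odds_ratio, fisher_p = fisher_exact(table)
    chi2, chi2_p, dof, expected = chi2_contingency(table)

    print(f"manual odds ratio  = {manual_or:.2f}")   # 3.86
    print(f"fisher odds ratio  = {odds_ratio:.2f}")  # 3.86
    print(f"fisher p-value     = {fisher_p:.5f}")
    print(f"chi-square p-value = {chi2_p:.5f}")      # consistent with p < 0.001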
dc.identifier.doi10.3306/AJHS.2023.38.06.43
dc.identifier.issn1579-5853
dc.identifier.issn2255-0569
dc.identifier.issue6en_US
dc.identifier.urihttps://doi.org/10.3306/AJHS.2023.38.06.43
dc.identifier.urihttps://hdl.handle.net/11468/19238
dc.identifier.volume38en_US
dc.identifier.wosWOS:001130784100009
dc.identifier.wosqualityN/A
dc.indekslendigikaynakWeb of Science
dc.language.isoesen_US
dc.publisherReial Acad Medicina Illes Balearsen_US
dc.relation.ispartofMedicina Balear
dc.relation.publicationcategoryMakale - Uluslararası Hakemli Dergi - Kurum Öğretim Elemanıen_US
dc.rightsinfo:eu-repo/semantics/closedAccessen_US
dc.subjectChatGPTen_US
dc.subjectOrthopedicsen_US
dc.subjectTraumatologyen_US
dc.subjectEBOTen_US
dc.subjectExamen_US
dc.titleHow does ChatGPT perform on the European Board of Orthopedics and Traumatology examination? A comparative studyen_US
dc.typeArticleen_US
