How does ChatGPT perform on the European Board of Orthopedics and Traumatology examination? A comparative study
dc.contributor.author | Ulus, Sait Anil | |
dc.date.accessioned | 2024-04-24T17:20:46Z | |
dc.date.available | 2024-04-24T17:20:46Z | |
dc.date.issued | 2023 | |
dc.department | Dicle Üniversitesi | en_US |
dc.description.abstract | Objective: The objective of this investigation was to compare the accuracy and responsiveness of GPT-3.5 and GPT-4 in the domain of Orthopedics and Traumatology. Specifically, our aim was to evaluate their capacity to provide accurate answers to a series of sample questions derived from the European Board of Orthopedics and Traumatology (EBOT) exam. Methods: The study was conducted from 10 May 2023 to 15 May 2023. It involved a comparative analysis of two AI language models, GPT-3.5 and GPT-4, in the field of Orthopedics and Traumatology, focusing on sample questions extracted from the EBOT exam. Two separate sets, each containing 80 sample questions (160 questions in total), were compiled from the pool of available EBOT sample questions. Results: A total of 160 questions were included in the study, 80 from each group. In General Orthopedics, GPT-4 achieved a higher success rate (75%) than GPT-3.5 (45%) (p=0.053). In Traumatology, GPT-4 delivered a success rate of 80%, significantly higher than GPT-3.5's (p=0.010). In Oncological Orthopedic Surgery, both models showed a similar trend (p=0.057). Overall, GPT-4 outperformed GPT-3.5 across all domains, with a cumulative success rate of 75% versus 43.75% (p<0.001). Considering the overall responses, the odds ratio between GPT-4 and GPT-3.5 was 3.8. Conclusions: Based on the findings of this comparative study, GPT-4 is markedly superior to GPT-3.5 in answering the EBOT exam sample questions. | en_US |
dc.identifier.doi | 10.3306/AJHS.2023.38.06.43 | |
dc.identifier.issn | 1579-5853 | |
dc.identifier.issn | 2255-0569 | |
dc.identifier.issue | 6 | en_US |
dc.identifier.uri | https://doi.org/10.3306/AJHS.2023.38.06.43 | |
dc.identifier.uri | https://hdl.handle.net/11468/19238 | |
dc.identifier.volume | 38 | en_US |
dc.identifier.wos | WOS:001130784100009 | |
dc.identifier.wosquality | N/A | |
dc.indekslendigikaynak | Web of Science | |
dc.language.iso | es | en_US |
dc.publisher | Reial Acad Medicina Illes Balears | en_US |
dc.relation.ispartof | Medicina Balear | |
dc.relation.publicationcategory | Makale - Uluslararası Hakemli Dergi - Kurum Öğretim Elemanı | en_US |
dc.rights | info:eu-repo/semantics/closedAccess | en_US |
dc.subject | ChatGPT | en_US |
dc.subject | Orthopedics | en_US |
dc.subject | Traumatology | en_US |
dc.subject | EBOT | en_US |
dc.subject | Exam | en_US |
dc.title | How does ChatGPT perform on the European Board of Orthopedics and Traumatology examination? A comparative study | en_US |
dc.type | Article | en_US |