ChatGPT Passes the USMLE
ChatGPT, along with other AI programs, has recently gained a lot of attention for its potential to disrupt sectors by generating human-like responses to given prompts. Developed by OpenAI, ChatGPT is powered by a large language model called GPT-3.5. Large language models differ from other deep learning models in their ability to predict and generate new sequences of words that mimic natural human language. In regard to its role in the field of medicine, many believe that it can be a useful tool for research, medical education, and clinical decision-making.
How Well Did ChatGPT and Other AIs Actually Do on USMLE?
An article published on MedPageToday generated some buzz about the fact that two AI models, ChatGPT and Flan-PaLM, passed the U.S. Medical Licensing Examinations. The original research articles can be found here and here. It was noted that ChatGPT performed with >50% accuracy across all the exams, while Flan-PaLM achieved 67% accuracy. For reference, the historical passing threshold for the USMLE series (Step 1, Step 2 CK, Step 3) is roughly 60%. Furthermore, ChatGPT was found to show “a high level of concordance and insight in its explanations”. To conclude, this article raises the discussion of how AI can potentially contribute to the future of healthcare.
It is important to keep in mind that while ChatGPT may be able to pass the USMLE, that does not mean it can replace the role of physicians anytime soon. Although it shows the potential to “pass”, the score is not very impressive. USMLE scores are one of the main deciding factors for residency acceptance, and competitive scores are generally much higher than just 60%. Since ChatGPT is trained on a large dataset of medical texts and literature, taking the USMLE is essentially taking an open-book test for it. From this perspective, many medical students and physicians clearly still outperform this AI model on the USMLE. One can imagine that it is medical training and experience that allow them to better navigate the nuances of clinical diagnosis and clinical decision-making on these exams. In addition, ChatGPT can neither provide empathy nor take responsibility for clinical decisions. Still, it is undeniable that ChatGPT passing the USMLE is impressive, and it will be interesting to see how it evolves in the future.
How Can ChatGPT Be Utilized by Medical Students?
ChatGPT also has the potential to be a useful interactive tool for medical education because of its human-like responses. In particular, we were curious to see how ChatGPT could help medical students at different points in their medical school careers. We looked into whether it could replace traditional QBanks for USMLE-style practice questions, be used to draft residency and medical school personal statements, create custom study schedules, and more. When looking into creating practice questions in particular, we found that ChatGPT is only as good as the data it is trained on and that it often confidently gives you wrong information. You can read more about our thoughts on that in the blog post here.
Current Trends and Concerns Regarding Future Uses
Another interesting development is that ChatGPT was listed as an author on several scientific research papers. A recently published Nature article explains why many scientists disapprove of this. It is not a surprise that this news brings uneasiness to the scientific community. One objection is that ChatGPT could be misused in academia, allowing people without the proper training and expertise to publish scientific papers. Another is that co-authors on research papers must make “significant scholarly contributions”, which ChatGPT arguably cannot. Lastly, the biggest ethical roadblock is that ChatGPT can neither consent to being a co-author nor take full responsibility for its contributions.
ChatGPT passes the USMLE. ChatGPT is listed as an author on scientific papers. While this is ground-breaking news for the scientific and medical community, it is just the tip of the iceberg. These very early findings demonstrate the potential of large language models to impact healthcare. At this point, it is still unclear whether AI models like ChatGPT can significantly contribute to healthcare in the future. However, we can expect to see many more research papers and medical articles on this topic as it gains traction.