Unleashing the power of generative artificial intelligence (AI) in transfusion medicine

There’s a lot of debate and controversy, and perhaps even fear, about the use of AI in medicine. So what role will AI play in transfusion medicine?

To explore this, we used GPT-3.5 to write an article about a paper on the use of AI in transfusion medicine, published in the October 2023 issue of the journal Transfusion.

Here’s what it came up with. We’ll let you be the judge.  

Generative AI tools like OpenAI's Generative Pretrained Transformer (GPT), Google Bard, and Meta LLaMA are making waves for their ability to create and manipulate text, all thanks to large language models (LLMs). These AI tools are winning hearts with their ease of use, vast knowledge, smooth responses, and a rapidly growing user base: ChatGPT alone gained over 100 million users between December 2022 and February 2023. The progress in LLMs has been astounding. For instance, GPT-4, released in March 2023, scored in the 90th percentile on the legal Uniform Bar Examination, whereas its predecessor GPT-3.5, released in November 2022, scored in the 10th percentile. This leap showcases the concept of "emergent properties" in LLMs, where small changes in training produce unexpectedly large performance gains.

With these advancements, excitement and concern have rippled through various sectors as generative AIs become increasingly capable of handling human work. Economists estimate that more than 300 million jobs in Europe and the US are susceptible to automation using LLMs, particularly in repetitive, information-based tasks.   

Understanding how LLMs work is crucial. They operate by recognising patterns and relationships between pieces of text, known as "tokens," which can be words, word fragments, or even punctuation marks. A model like GPT-3 works with a vocabulary of around 50,000 distinct tokens. During training, the model analyses vast amounts of text, learning the connections between tokens and encoding this information as numerical weights. When given a text prompt, the model predicts the next token based on probabilities, appends it to the text, and repeats. This "autoregressive" approach continues until the model predicts that the answer is complete.
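The autoregressive loop described above can be illustrated with a toy example. The Python sketch below is not how GPT models are built: it swaps the neural network for simple counts of which word follows which (a "bigram" model), and it splits text on spaces rather than into subword tokens. The three-sentence corpus, the function names, and the `<end>` marker are all hypothetical, chosen only for illustration.

```python
import random
from collections import Counter, defaultdict

END = "<end>"  # hypothetical marker meaning "the answer is complete"

# Tiny illustrative corpus (an assumption; real models train on terabytes of text).
CORPUS = [
    "red cells carry oxygen",
    "red cells are stored cold",
    "platelets are stored at room temperature",
]


def tokenize(text):
    """Split on whitespace. Real LLM tokenizers use subword fragments instead."""
    return text.lower().split()


def train_bigram(corpus):
    """Count how often each token follows another. These counts stand in for weights."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = tokenize(sentence) + [END]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def next_token_probs(counts, prev):
    """Turn the follow-up counts for `prev` into a probability for each candidate."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return {tok: n / total for tok, n in followers.items()}


def generate(counts, prompt, max_tokens=20, seed=0):
    """Autoregressive generation: sample a token, append it, repeat until END."""
    rng = random.Random(seed)
    tokens = tokenize(prompt)
    if not tokens:
        return ""
    for _ in range(max_tokens):
        probs = next_token_probs(counts, tokens[-1])
        if not probs:  # token never seen in training: nothing to predict
            break
        candidates, weights = zip(*probs.items())
        nxt = rng.choices(candidates, weights=weights)[0]
        if nxt == END:
            break
        tokens.append(nxt)
    return " ".join(tokens)


model = train_bigram(CORPUS)
print(next_token_probs(model, "cells"))
print(generate(model, "red"))
```

A real LLM conditions each prediction on the whole preceding text rather than on the last word alone, but the generation loop has the same shape: predict, append, repeat.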

However, LLMs demand vast amounts of training data to craft accurate responses. Training sets encompass several terabytes of carefully selected data from the internet. While specifics of training sets for newer LLMs like GPT-4 aren't public, GPT-3.5 was trained on a mix of sources including Common Crawl, Wikipedia, and public domain book texts.   

Studying LLMs in transfusion medicine   

Researchers recently conducted a study to evaluate the potential of three publicly available LLMs—Google Bard, OpenAI GPT-3.5, and OpenAI GPT-4—in performing various knowledge-based tasks related to transfusion medicine. This research aimed to determine if these models could accurately handle transfusion scenarios, clinical consultation questions, and a validated transfusion medicine knowledge test.   

Task 1: Making informed transfusion decisions   

In the first task, the LLMs were presented with 44 diverse scenarios related to red blood cell transfusion. Their goal was to determine if a transfusion was necessary and, if so, specify the minimum post-transfusion haemoglobin goal. Results showed that GPT-4 provided the most accurate responses, closely aligning with AABB guidelines for restrictive transfusion. On the other hand, Bard lagged behind, highlighting the advancements from one LLM generation to the next.   

Task 2: Addressing clinical consultations   

Task 2 involved posing 12 questions about clinical transfusion medicine to the LLMs, simulating common consultations. GPT-4 stood out by providing detailed and accurate responses to both explicitly and implicitly phrased questions. This task also highlighted the impact of question phrasing on the models' performance.   

Task 3: Testing transfusion medicine knowledge   

The third task assessed explicit transfusion medicine knowledge using the Internal Medicine BEST-TEST, a validated assessment. GPT-4 consistently outperformed the other models, surpassing published high scores among postgraduate trainees. Interestingly, all three LLMs struggled to answer correctly a question on the treatment of transfusion-related acute lung injury (TRALI).

The way forward   

Generative AI holds immense potential to transform medical practices by providing valuable insights and aiding decision-making. However, it's crucial to use this technology responsibly, considering potential biases and the importance of human expertise. As we stand on the edge of a transformation in transfusion medicine, judicious use of generative AI can enhance patient care, research, and knowledge dissemination within the field. Let's embrace this technology responsibly and ensure that it serves the best interests of both patients and healthcare professionals.  

Reference: 

Hurley, NC, Schroeder, KM, Hess, AS. Would doctors dream of electric blood bankers? Large language model-based artificial intelligence performs well in many aspects of transfusion medicine. Transfusion. 2023; 63(10): 1833–1840. https://doi.org/10.1111/trf.17526