The race for tailored medical AI models is heating up. Google and DeepMind have just released a new paper describing Med-Gemini, a group of advanced AI models targeting healthcare applications. The authors claim that Med-Gemini outperforms competing models such as OpenAI's GPT-4. OpenAI, however, is not lagging behind in the medical arena, having recently expanded its collaboration with Moderna, a large pharmaceutical company.
Med-Gemini’s striking leap forward, if validated in real-world settings, is its ability to capture context and temporality, a known pitfall in existing health-related AI models. It’s true that we physicians are notorious for our abbreviations and lack of uniformity in documentation. Nonetheless, the true challenge in training medical algorithms is not textual complexity but contextual complexity.
A simple example is one any parent of a toddler knows well: having to visit a pediatrician for your youngster’s fever and rash. The doctor will always ask: what came first, the fever or the rash? Did it spread from the head down or from the legs up? These simple characteristics can differentiate a mild, self-limiting disease, like roseola, from a potentially life-threatening one, such as meningococcal meningitis. Because the answers are multidimensional and depend on the order of events over time, the slightest inaccuracy in these seemingly straightforward questions can throw an AI model completely off.
Med-Gemini seems to have tackled this contextuality by breaking away from the massive undertaking of building an all-encompassing general medical model. Instead, Google’s developers have adopted a vertical-by-vertical approach of related models, referred to as a “family” of models, each optimized for a specific medical domain or scenario. This has reportedly resulted in improved, more nuanced accuracy and more transparent reasoning, with some interpretable feedback, such as why a suggested diagnosis is the most likely one.
Just as doctors are expected to keep abreast of recent research, Google seems to hold Med-Gemini to the same standard. The new model also incorporates a significant additional layer: a web-based search for up-to-date information, which lets the model augment its answers with external knowledge by integrating online results.
Though Med-Gemini has leveraged diverse data sources, such as excerpts from health records, X-rays, photos of skin lesions, medical exam prep questions, and others, it is still important to remember what has yet to happen: a prospective, real-world validation on actual production-level data.
Undoubtedly, multimodal models have ushered in a wave of progress in AI-powered healthcare. Yet the burden of proof has still to be met in real-life clinical settings.







