Papers

Generative Medical Event Models Improve with Scale

S Waxler, P Blazek, D White, D Sneider… - arXiv preprint arXiv …, 2025 - arxiv.org
Computer Science (cs.LG)

… Remarkably for a foundation model with generic pretraining and simulation-based inference, CoMET generally outperformed or matched task-specific supervised models …


BibTeX

@article{waxler2025generative,
Author = {Shane Waxler and Paul Blazek and Davis White and Daniel Sneider and Kevin Chung and Mani Nagarathnam and Patrick Williams and Hank Voeller and Karen Wong and Matthew Swanhorst and Sheng Zhang and Naoto Usuyama and Cliff Wong and Tristan Naumann and Hoifung Poon and Andrew Loza and Daniella Meeker and Seth Hain and Rahul Shah},
Title = {Generative Medical Event Models Improve with Scale},
Eprint = {2508.12104v2},
ArchivePrefix = {arXiv},
PrimaryClass = {cs.LG},
Abstract = {Realizing personalized medicine at scale calls for methods that distill
insights from longitudinal patient journeys, which can be viewed as a sequence
of medical events. Foundation models pretrained on large-scale medical event
data represent a promising direction for scaling real-world evidence generation
and generalizing to diverse downstream tasks. Using Epic Cosmos, a dataset with
medical events from de-identified longitudinal health records for 16.3 billion
encounters over 300 million unique patient records from 310 health systems, we
introduce the Comet models, a family of decoder-only transformer models
pretrained on 118 million patients representing 115 billion discrete medical
events (151 billion tokens). We present the largest scaling-law study of
medical event data, establishing a methodology for pretraining and revealing
power-law scaling relationships for compute, tokens, and model size.
Consequently, we pretrained a series of compute-optimal models with up to 1
billion parameters. Conditioned on a patient's real-world history, Comet
autoregressively predicts the next medical event to simulate patient health
timelines. We studied 78 real-world tasks, including diagnosis prediction,
disease prognosis, and healthcare operations. Remarkably for a foundation model
with generic pretraining and simulation-based inference, Comet generally
outperformed or matched task-specific supervised models on these tasks, without
requiring task-specific fine-tuning or few-shot examples. Comet's predictive
power consistently improves as the model and pretraining scale. Our results
show that Comet, a generative medical event foundation model, can effectively
capture complex clinical dynamics, providing an extensible and generalizable
framework to support clinical decision-making, streamline healthcare
operations, and improve patient outcomes.},
Year = {2025},
Month = {Aug},
Url = {http://arxiv.org/abs/2508.12104v2},
File = {2508.12104v2.pdf}
}
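The abstract reports power-law scaling relationships between pretraining loss and compute, tokens, and model size. As a rough illustration of how such a relationship is typically fit, the sketch below regresses loss against compute in log-log space, where a power law L(C) = a·C^(−b) becomes a straight line. The data points are made up for illustration; they are not taken from the paper.

```python
import numpy as np

# Hypothetical (compute, loss) points, purely illustrative:
# a power law L(C) = a * C**(-b) is linear in log-log space.
compute = np.array([1e18, 1e19, 1e20, 1e21])  # training FLOPs (made-up)
loss = np.array([3.2, 2.7, 2.3, 1.95])        # pretraining loss (made-up)

# Fit log L = log a - b * log C with a degree-1 polynomial.
slope, log_a = np.polyfit(np.log(compute), np.log(loss), 1)
a, b = np.exp(log_a), -slope
print(f"L(C) ~ {a:.3g} * C^(-{b:.3g})")
```

A fitted exponent b > 0 indicates that loss falls predictably as compute grows, which is the kind of trend used to choose compute-optimal model sizes.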
