Foundation Models for Healthcare

What are Foundation Models? #

  • Foundation models are trained on massive amounts of unlabeled data using self-supervised or unsupervised learning.
  • They are “foundational” because they can be adapted to multiple downstream tasks with high efficiency and minimal data.
  • They demonstrate sample efficiency and can handle multiple modalities like text, images, genomics, etc.

Few-Shot vs. Zero-Shot Learning #

  • Few-Shot Learning: Learns from just a few labeled examples per class and generalizes to new examples.
  • Zero-Shot Learning: Learns to perform tasks it hasn’t seen in training, relying on general knowledge from pretraining.

➡️ These abilities allow foundation models to generalize efficiently across healthcare tasks, even with limited supervision.
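
The distinction above is easiest to see in how prompts are constructed. A minimal sketch, assuming a hypothetical patient-message triage task (the labels and example messages are invented for illustration, not from a real dataset):

```python
# Zero-shot: the task is described, but no examples are given.
ZERO_SHOT = (
    "Classify the urgency of this patient message as 'routine' or 'urgent'.\n"
    "Message: {message}\n"
    "Urgency:"
)

# Few-shot: a handful of labeled demonstrations are prepended to the query.
FEW_SHOT_EXAMPLES = [
    ("I'd like to reschedule my annual checkup.", "routine"),
    ("I have crushing chest pain radiating to my left arm.", "urgent"),
]

def build_few_shot_prompt(message: str) -> str:
    """Prepend labeled demonstrations so the model can infer the task format."""
    demos = "\n".join(
        f"Message: {m}\nUrgency: {label}" for m, label in FEW_SHOT_EXAMPLES
    )
    return (
        "Classify the urgency of each patient message as 'routine' or 'urgent'.\n"
        f"{demos}\n"
        f"Message: {message}\nUrgency:"
    )

print(build_few_shot_prompt("My prescription refill is overdue."))
```

Zero-shot relies entirely on pretraining knowledge; few-shot additionally shows the model the expected input/output format.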

What kind of data powers these models? #

Foundation models are trained on multi-modal health data:

  • Text: Clinical notes, EHRs, literature
  • Images: X-rays, MRIs, CTs
  • Sequences: Genomics, proteomics
  • Graphs: Molecular structures
  • Time Series: ECGs, continuous monitoring
  • Video: Ultrasound

Who provides this data? #

  • Hospitals, pharma, insurance payers, academic researchers, patients (via wearables), and public forums.

Downstream Use Cases #

  • For providers: Diagnosis, treatment planning, trial recruitment, drug discovery.
  • For patients: QA, health education, personalized guidance, assistive care.

Foundation models serve as AI interfaces to improve decision-making and patient engagement.

Why foundation models are timely now #

  • Human systems evolve linearly, but technology advances exponentially.
  • Data is growing rapidly, e.g.:
    • 1950s: doubling roughly every 50 years.
    • 2020s: doubling roughly every 73 days.
  • Healthcare data exploded from 150 EB (2013) → over 2,000 EB (2020).
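
A quick back-of-the-envelope check of the figures above: growing from 150 EB in 2013 to 2,000 EB in 2020 implies a compound annual growth rate of roughly 45%, i.e., the data volume doubles about every two years.

```python
import math

# Implied compound annual growth rate and doubling time for healthcare
# data growing from 150 EB (2013) to 2,000 EB (2020).
start_eb, end_eb, years = 150, 2000, 2020 - 2013

annual_growth = (end_eb / start_eb) ** (1 / years) - 1      # compound rate
doubling_time = math.log(2) / math.log(1 + annual_growth)   # in years

print(f"~{annual_growth:.0%}/year, doubling every ~{doubling_time:.1f} years")
```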

The Chessboard Paradox #

  • One grain of rice on the first square, doubled on each subsequent square, yields over 9 quintillion (2^63 ≈ 9.2 × 10^18) grains on square 64 alone.
  • Shows how exponential growth is counterintuitive to humans.
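
The paradox takes only two lines to verify:

```python
# One grain on square 1, doubled each square: square n holds 2**(n-1) grains.
last_square = 2 ** 63       # grains on square 64 alone
total = 2 ** 64 - 1         # grains summed across all 64 squares

print(f"{last_square:.2e} grains on square 64, {total:.2e} in total")
```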

Compute acceleration #

  • Moore’s Law + AI accelerators (e.g., GPUs) made it feasible to train large foundation models.

➡️ These forces combine to make now the critical window to apply foundation models in healthcare.

Narrow vs. General AI #

  • Narrow AI: Performs one task; static.
  • General AI: Learns multiple tasks and evolves over time.
  • Foundation models aim for General AI characteristics.

Emergent Behaviors #

  • Large-scale models exhibit behaviors not explicitly programmed.
  • Example: Google’s PaLM:
    • 8B: Basic QA & language understanding
    • 62B: Summarization, code completion
    • 540B: Common-sense reasoning, joke explanation, logic chaining

Risk: Hallucination #

  • Model may generate confident but false outputs (e.g., imaging results that don’t exist).
  • Needs human oversight for reliability.

Transformer Components #

  • Self-attention: Captures relationships between all tokens.
  • Encoder: Converts input tokens into vector embeddings.
  • Decoder: Generates output from internal representation.
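
The self-attention operation above can be sketched in a few lines. A minimal single-head version with no learned projections (queries, keys, and values are all the raw embeddings, which real transformers do not do):

```python
import numpy as np

def self_attention(x: np.ndarray) -> np.ndarray:
    """x: (seq_len, d_model). Scaled dot-product attention with Q = K = V = x."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                        # pairwise token similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # softmax over keys
    return weights @ x                                   # weighted mix of all tokens

tokens = np.random.randn(5, 8)   # 5 tokens, 8-dim embeddings
out = self_attention(tokens)
print(out.shape)                 # each output token is a blend of all input tokens
```

Each output row is a convex combination of every input token, which is how attention "captures relationships between all tokens."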

RLHF: Reinforcement Learning from Human Feedback #

  • Step 1: Train a supervised model on human examples.
  • Step 2: Collect human preferences to train a reward model.
  • Step 3: Fine-tune the language model using reinforcement learning (PPO).

This aligns model outputs with human intent and preferences.
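
The core objective in Step 2 can be sketched as a pairwise preference loss (a Bradley–Terry style formulation, commonly used for RLHF reward models): the reward model is penalized when it scores the human-rejected response above the human-chosen one. The scalar rewards below are toy values.

```python
import math

def preference_loss(r_chosen: float, r_rejected: float) -> float:
    """-log(sigmoid(r_chosen - r_rejected)): low when the reward model
    ranks the human-preferred response higher, high when it disagrees."""
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

print(preference_loss(2.0, 0.0))   # small loss: reward model agrees with humans
print(preference_loss(0.0, 2.0))   # large loss: reward model disagrees
```

Step 3 then optimizes the language model (e.g., with PPO) to maximize this learned reward.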

What is Prompt Engineering? #

  • Crafting inputs to steer output behavior of foundation models.

Prompt Types #

  • Simple instructions
  • Role-based prompts
  • Few-shot examples
  • Chain-of-Thought (CoT)
  • Zero-shot-CoT (“Let’s think step by step.”)
  • Self-consistency (multiple CoTs, pick majority)
  • Generated knowledge prompting (generate relevant facts before answering)
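
Self-consistency is simple to implement: sample several chain-of-thought completions, extract each final answer, and keep the majority. The sampled answers below are hypothetical placeholders for model outputs.

```python
from collections import Counter

def self_consistency(answers: list[str]) -> str:
    """Return the most common final answer across sampled CoT runs."""
    return Counter(answers).most_common(1)[0][0]

# Hypothetical final answers from 5 independent chain-of-thought samples:
sampled = ["250 mg", "250 mg", "500 mg", "250 mg", "125 mg"]
print(self_consistency(sampled))   # majority answer: "250 mg"
```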

Text-Based Applications in Healthcare #

  • Appointment scheduling
  • Inbox management
  • Chart summarization
  • Trial eligibility
  • Decision support
  • Medical QA and patient communication

➡️ These tools reduce burnout and support both provider productivity and patient engagement.

Modalities Beyond Text #

  • Imaging (X-rays, CTs)
  • Genomics/proteomics
  • Signal data (ECG)
  • Models like VATT (Video-Audio-Text Transformer) process multiple modalities in a unified transformer architecture.

Do Foundation Models “Understand” Imaging? #

  • They can generate plausible results, but:
    • Miss clinical context
    • Can’t compare time-series or integrate history like radiologists

Imaging as a Biomarker Source #

  • A single CT slice at the L3 vertebral level can yield:
    • Fat/muscle measurements
    • Aortic calcification
    • Organomegaly
    • Predictive health markers

Foundation models unlock quantitative phenotyping from visual data.

Clinical Readiness Gaps #

  • 83% of clinicians want AI covered in their training
  • 70% feel overwhelmed by new technology

Academia vs. Industry #

  • Industry dominates in compute + data
  • Collaboration is essential for clinical relevance & ethical development

Model Drift Risks #

  • Data Drift: The input data distribution shifts away from what the model was trained on
  • Model Drift: Model performance degrades over time as a result
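
One simple way to detect data drift is the Population Stability Index (PSI) between a baseline feature distribution and live data. A pure-Python sketch; the 0.1/0.25 thresholds are common rules of thumb, not standards:

```python
import math

def psi(expected: list[float], actual: list[float], bins: int = 10) -> float:
    """Population Stability Index between two 1-D samples over shared bins."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0

    def bin_fractions(data: list[float]) -> list[float]:
        counts = [0] * bins
        for x in data:
            counts[min(int((x - lo) / width), bins - 1)] += 1
        return [max(c / len(data), 1e-6) for c in counts]  # avoid log(0)

    e, a = bin_fractions(expected), bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

baseline = [i / 100 for i in range(100)]          # uniform on [0, 1)
shifted  = [0.5 + i / 200 for i in range(100)]    # mass shifted to [0.5, 1)
print(psi(baseline, baseline))   # ~0: no drift
print(psi(baseline, shifted))    # large: substantial drift, investigate
```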

Deployment Best Practices #

  1. Monitor performance regularly
  2. Update models with new data
  3. Ensure data quality
  4. Audit for fairness and bias
  5. Collaborate across sectors
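
Practice 1 can be sketched as a rolling performance monitor that flags when live accuracy falls below a tolerance band around the validation baseline. The window size and tolerance are illustrative choices, not recommendations:

```python
from collections import deque

class PerformanceMonitor:
    """Flag degradation when rolling live accuracy drops below baseline - tol."""

    def __init__(self, baseline_acc: float, window: int = 100, tol: float = 0.05):
        self.baseline = baseline_acc
        self.tol = tol
        self.outcomes = deque(maxlen=window)   # 1 = correct, 0 = incorrect

    def record(self, correct: bool) -> None:
        self.outcomes.append(int(correct))

    def degraded(self) -> bool:
        if not self.outcomes:
            return False
        live_acc = sum(self.outcomes) / len(self.outcomes)
        return live_acc < self.baseline - self.tol

monitor = PerformanceMonitor(baseline_acc=0.90)
for _ in range(100):
    monitor.record(True)        # healthy period
print(monitor.degraded())       # False
for _ in range(60):
    monitor.record(False)       # performance collapse
print(monitor.degraded())       # True: trigger review / recalibration
```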

➡️ Treat foundation models like medical devices — with continuous monitoring, recalibration, and governance.


📚 Additional Readings #

  1. Attention Is All You Need
  2. The Illustrated Transformer
  3. The Annotated Transformer
  4. Opportunities and Risks of Foundation Models (CRFM)
  5. Shifting ML for Healthcare – Nature Biomedical Engineering