Foundation Models for Healthcare #
What are Foundation Models? #
- Foundation models are trained on massive amounts of unlabeled data using self-supervised or unsupervised learning.
- They are “foundational” because they can be adapted to multiple downstream tasks with high efficiency and minimal data.
- They demonstrate sample efficiency and can handle multiple modalities like text, images, genomics, etc.
Few-Shot vs. Zero-Shot Learning #
- Few-Shot Learning: Learns from just a few labeled examples per class and generalizes to new examples.
- Zero-Shot Learning: Learns to perform tasks it hasn’t seen in training, relying on general knowledge from pretraining.
➡️ These abilities allow foundation models to generalize efficiently across healthcare tasks, even with limited supervision.
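To make the distinction concrete, here is an illustrative pair of prompts (a minimal sketch; the classification task and labels are hypothetical, not taken from these notes):

```python
# Zero-shot: the model gets only an instruction and must rely on pretraining knowledge.
zero_shot = "Classify the body system involved: 'Patient presents with chest pain and dyspnea.'"

# Few-shot: a handful of labeled examples are included directly in the prompt,
# and the model generalizes the pattern to the new case.
few_shot = """Classify the body system involved in each note.
Note: 'Fractured left radius after a fall.' -> Musculoskeletal
Note: 'Newly diagnosed type 2 diabetes.' -> Endocrine
Note: 'Patient presents with chest pain and dyspnea.' ->"""
```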
What kind of data powers these models? #
Foundation models are trained on multi-modal health data:
- Text: Clinical notes, EHRs, literature
- Images: X-rays, MRIs, CTs
- Sequences: Genomics, proteomics
- Graphs: Molecular structures
- Time Series: ECGs, continuous monitoring
- Video: Ultrasound
Who provides this data? #
- Hospitals, pharma, insurance payers, academic researchers, patients (via wearables), and public forums.
Downstream Use Cases #
- For providers: Diagnosis, treatment planning, trial recruitment, drug discovery.
- For patients: QA, health education, personalized guidance, assistive care.
Foundation models serve as AI interfaces to improve decision-making and patient engagement.
Why foundation models are timely now #
- Human systems evolve linearly, but technology advances exponentially.
- Data is growing rapidly, e.g.:
  - 1950s: data doubled roughly every 50 years.
  - 2020s: data doubles roughly every 73 days.
- Healthcare data exploded from 150 EB (2013) → over 2,000 EB (2020).
The Chessboard Paradox #
- Start with one grain of rice and double it on each square: square 64 alone holds 2^63 ≈ 9.2 quintillion grains.
- Shows how exponential growth is counterintuitive to humans.
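A quick arithmetic check of the doubling (plain Python, no dependencies) shows why the numbers outrun intuition:

```python
# Doubling one grain of rice per chessboard square: square n holds 2**(n-1) grains.
grains_on_square_64 = 2 ** 63      # the last square alone
total_grains = 2 ** 64 - 1         # sum over all 64 squares

print(f"{grains_on_square_64:,}")  # 9,223,372,036,854,775,808 (~9.2 quintillion)
print(f"{total_grains:,}")         # 18,446,744,073,709,551,615
```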
Compute acceleration #
- Moore’s Law + AI accelerators (e.g., GPUs) made it feasible to train large foundation models.
➡️ These forces combine to make now the critical window to apply foundation models in healthcare.
Narrow vs. General AI #
- Narrow AI: Performs one task; static.
- General AI: Learns multiple tasks and evolves over time.
- Foundation models aim for General AI characteristics.
Emergent Behaviors #
- Large-scale models exhibit behaviors not explicitly programmed.
- Example: Google’s PaLM at increasing parameter scales:
  - 8B: basic QA & language understanding
  - 62B: summarization, code completion
  - 540B: common-sense reasoning, joke explanation, logic chaining
Risk: Hallucination #
- Model may generate confident but false outputs (e.g., imaging results that don’t exist).
- Needs human oversight for reliability.
Transformer Components #
- Self-attention: Captures relationships between all tokens.
- Encoder: Maps input tokens to contextualized vector representations.
- Decoder: Generates output tokens one at a time, attending to the encoder’s representation and to previously generated tokens.
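The core mechanism fits in a few lines. Below is a minimal single-head, unmasked scaled dot-product attention sketch in NumPy with toy dimensions; real transformers add multiple heads, masking, residual connections, and layer normalization:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention (single head, no masking)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv                 # queries, keys, values per token
    scores = Q @ K.T / np.sqrt(Q.shape[-1])          # pairwise token relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V                               # each output mixes all token values

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                          # 4 tokens, embedding dim 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)           # (4, 8)
```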
RLHF: Reinforcement Learning from Human Feedback #
- Step 1: Train a supervised model on human-written demonstrations.
- Step 2: Collect human preferences to train a reward model.
- Step 3: Fine-tune the language model using reinforcement learning (PPO).
This aligns model outputs with human intent and preferences.
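For Step 2, the reward model is commonly trained with a pairwise (Bradley-Terry-style) preference loss: the human-preferred response should score higher than the rejected one. A minimal sketch, with made-up scores rather than outputs of a real model:

```python
import numpy as np

def preference_loss(r_chosen, r_rejected):
    """Pairwise preference loss: -log sigmoid(r_chosen - r_rejected)."""
    return -np.log(1.0 / (1.0 + np.exp(-(r_chosen - r_rejected))))

print(preference_loss(2.1, 0.3))   # ~0.15: preferred answer already scores higher
print(preference_loss(0.3, 2.1))   # ~1.95: ordering violated, larger penalty
```

Step 3 then uses the reward model’s scores as the optimization signal for PPO.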
What is Prompt Engineering? #
- Crafting inputs to steer output behavior of foundation models.
Prompt Types #
- Simple instructions
- Role-based prompts
- Few-shot examples
- Chain-of-Thought (CoT)
- Zero-shot-CoT (“Let’s think step by step.”)
- Self-consistency (multiple CoTs, pick majority)
- Generated-knowledge prompting (generate relevant facts first, then answer)
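A minimal sketch of the self-consistency idea from the list above: sample several chain-of-thought completions at nonzero temperature and keep the majority final answer. The `generate` callable is a hypothetical placeholder for any LLM call; the stub below only simulates noisy sampling so the function runs end to end:

```python
import random
from collections import Counter

def self_consistent_answer(prompt, generate, n_samples=5):
    """Sample several chain-of-thought completions and return the majority answer."""
    answers = [generate(prompt + "\nLet's think step by step.")[1]
               for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

def fake_generate(prompt):
    # Placeholder: returns (reasoning, final_answer) like a temperature-sampled LLM call.
    return "…reasoning…", random.choice(["A", "A", "A", "B"])

print(self_consistent_answer("Which option is correct?", fake_generate))
```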
Text-Based Applications in Healthcare #
- Appointment scheduling
- Inbox management
- Chart summarization
- Trial eligibility
- Decision support
- Medical QA and patient communication
➡️ These tools reduce burnout and support both provider productivity and patient engagement.
Modalities Beyond Text #
- Imaging (X-rays, CTs)
- Genomics/proteomics
- Signal data (ECG)
- VATT-style models (Video-Audio-Text Transformers) process multiple data types in a unified transformer architecture.
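One way to picture the "unified architecture" idea: each modality is projected into a shared token/embedding space, and a single transformer then attends across all of them. The sketch below is purely schematic (NumPy, made-up feature sizes, no actual transformer layers), not a description of any published model:

```python
import numpy as np

d_model = 64
rng = np.random.default_rng(0)

# Per-modality linear projections into one shared embedding space (illustrative sizes).
proj_text  = rng.normal(size=(300, d_model))   # 300-dim text token features
proj_image = rng.normal(size=(768, d_model))   # 768-dim image patch features
proj_ecg   = rng.normal(size=(250, d_model))   # 250-sample ECG waveform segments

text_tokens  = rng.normal(size=(12, 300)) @ proj_text
image_tokens = rng.normal(size=(16, 768)) @ proj_image
ecg_tokens   = rng.normal(size=(30, 250)) @ proj_ecg

# One shared sequence: a single transformer would attend across all modalities at once.
tokens = np.concatenate([text_tokens, image_tokens, ecg_tokens], axis=0)
print(tokens.shape)   # (58, 64) -> 12 + 16 + 30 tokens in the same d_model space
```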
Do Foundation Models “Understand” Imaging? #
- They can generate plausible outputs, but they:
  - Miss clinical context
  - Can’t compare studies over time or integrate patient history the way radiologists do
Imaging as a Biomarker Source #
- A CT slice at the L3 (third lumbar) vertebral level can yield:
  - Fat/muscle measurements
  - Aortic calcification
  - Organomegaly
  - Predictive health markers
Foundation models unlock quantitative phenotyping from visual data.
Clinical Readiness Gaps #
- 83% of clinicians want AI covered in their training
- 70% feel overwhelmed by new technology
Academia vs. Industry #
- Industry dominates in compute + data
- Collaboration is essential for clinical relevance & ethical development
Model Drift Risks #
- Data Drift: Input data distribution changes
- Model Drift: Degrading performance over time
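One lightweight way to flag the data drift described above: compare the distribution of an input feature seen in production against the training distribution with a two-sample Kolmogorov-Smirnov test. A minimal sketch with simulated values and an arbitrary significance threshold:

```python
import numpy as np
from scipy.stats import ks_2samp

def flag_data_drift(train_values, live_values, alpha=0.01):
    """Two-sample KS test on one input feature; a small p-value suggests drift."""
    res = ks_2samp(train_values, live_values)
    return res.pvalue < alpha, res.statistic, res.pvalue

rng = np.random.default_rng(0)
train_age = rng.normal(55, 12, size=5000)     # patient ages seen during training (simulated)
live_age  = rng.normal(63, 12, size=500)      # live traffic skews older (simulated)
print(flag_data_drift(train_age, live_age))   # (True, ...) -> drift flagged for review
```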
Deployment Best Practices #
- Monitor performance regularly
- Update models with new data
- Ensure data quality
- Audit for fairness and bias
- Collaborate across sectors
➡️ Treat foundation models like medical devices — with continuous monitoring, recalibration, and governance.
📚 Additional Readings #
- Attention Is All You Need
- The Illustrated Transformer
- The Annotated Transformer
- On the Opportunities and Risks of Foundation Models (Stanford CRFM)
- Shifting ML for Healthcare – Nature Biomedical Engineering