Foundation Models for Healthcare #
What are Foundation Models? #
- Foundation models are trained on massive amounts of unlabeled data using self-supervised or unsupervised learning.
- They are “foundational” because they can be adapted to multiple downstream tasks with high efficiency and minimal data.
- They demonstrate sample efficiency and can handle multiple modalities like text, images, genomics, etc.
Few-Shot vs. Zero-Shot Learning #
- Few-Shot Learning: Learns from just a few labeled examples per class and generalizes to new examples.
- Zero-Shot Learning: Learns to perform tasks it hasn’t seen in training, relying on general knowledge from pretraining.
➡️ These abilities allow foundation models to generalize efficiently across healthcare tasks, even with limited supervision.
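To make the distinction concrete, here is an illustrative pair of prompts (a minimal sketch; the classification task and labels are hypothetical, not taken from these notes):

```python
# Zero-shot: the model gets only an instruction and must rely on pretraining knowledge.
zero_shot = "Classify the body system involved: 'Patient presents with chest pain and dyspnea.'"

# Few-shot: a handful of labeled examples are included directly in the prompt,
# and the model generalizes the pattern to the new case.
few_shot = """Classify the body system involved in each note.
Note: 'Fractured left radius after a fall.' -> Musculoskeletal
Note: 'Newly diagnosed type 2 diabetes.' -> Endocrine
Note: 'Patient presents with chest pain and dyspnea.' ->"""
```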
What kind of data powers these models? #
Foundation models are trained on multi-modal health data:
- Text: Clinical notes, EHRs, literature
- Images: X-rays, MRIs, CTs
- Sequences: Genomics, proteomics
- Graphs: Molecular structures
- Time Series: ECGs, continuous monitoring
- Video: Ultrasound
Who provides this data? #
- Hospitals, pharma, insurance payers, academic researchers, patients (via wearables), and public forums.
Downstream Use Cases #
- For providers: Diagnosis, treatment planning, trial recruitment, drug discovery.
- For patients: QA, health education, personalized guidance, assistive care.
Foundation models serve as AI interfaces to improve decision-making and patient engagement.
Why foundation models are timely now #
- Human systems evolve linearly, but technology advances exponentially.
- Data is growing rapidly, e.g.:
  - 1950s: data doubled roughly every 50 years.
  - 2020s: data doubles roughly every 73 days.
- Healthcare data exploded from 150 EB (2013) → over 2,000 EB (2020).
The Chessboard Paradox #
- Start with one grain of rice and double it on each square: square 64 alone holds 2^63 ≈ 9.2 quintillion grains.
- Shows how exponential growth is counterintuitive to humans.
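A quick arithmetic check of the doubling (plain Python, no dependencies) shows why the numbers outrun intuition:

```python
# Doubling one grain of rice per chessboard square: square n holds 2**(n-1) grains.
grains_on_square_64 = 2 ** 63      # the last square alone
total_grains = 2 ** 64 - 1         # sum over all 64 squares

print(f"{grains_on_square_64:,}")  # 9,223,372,036,854,775,808 (~9.2 quintillion)
print(f"{total_grains:,}")         # 18,446,744,073,709,551,615
```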
Compute acceleration #
- Moore’s Law + AI accelerators (e.g., GPUs) made it feasible to train large foundation models.
➡️ These forces combine to make now the critical window to apply foundation models in healthcare.
Narrow vs. General AI #
- Narrow AI: Performs one task; static.
- General AI: Learns multiple tasks and evolves over time.
- Foundation models aim for General AI characteristics.
Emergent Behaviors #
- Large-scale models exhibit behaviors not explicitly programmed.
- Example: Google’s PaLM at increasing parameter scales:
  - 8B: basic QA & language understanding
  - 62B: summarization, code completion
  - 540B: common-sense reasoning, joke explanation, logic chaining
Risk: Hallucination #
- Model may generate confident but false outputs (e.g., imaging results that don’t exist).
- Needs human oversight for reliability.
Transformer Components #
- Self-attention: Captures relationships between all tokens.
- Encoder: Maps input tokens to contextualized vector representations.
- Decoder: Generates output tokens one at a time, attending to the encoder’s representation and to previously generated tokens.
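The core mechanism fits in a few lines. Below is a minimal single-head, unmasked scaled dot-product attention sketch in NumPy with toy dimensions; real transformers add multiple heads, masking, residual connections, and layer normalization:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention (single head, no masking)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv                 # queries, keys, values per token
    scores = Q @ K.T / np.sqrt(Q.shape[-1])          # pairwise token relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V                               # each output mixes all token values

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                          # 4 tokens, embedding dim 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)           # (4, 8)
```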
RLHF: Reinforcement Learning from Human Feedback #
- Step 1: Train a supervised model on human-written demonstrations.
- Step 2: Collect human preferences to train a reward model.
- Step 3: Fine-tune the language model using reinforcement learning (PPO).
This aligns model outputs with human intent and preferences.
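For Step 2, the reward model is commonly trained with a pairwise (Bradley-Terry-style) preference loss: the human-preferred response should score higher than the rejected one. A minimal sketch, with made-up scores rather than outputs of a real model:

```python
import numpy as np

def preference_loss(r_chosen, r_rejected):
    """Pairwise preference loss: -log sigmoid(r_chosen - r_rejected)."""
    return -np.log(1.0 / (1.0 + np.exp(-(r_chosen - r_rejected))))

print(preference_loss(2.1, 0.3))   # ~0.15: preferred answer already scores higher
print(preference_loss(0.3, 2.1))   # ~1.95: ordering violated, larger penalty
```

Step 3 then uses the reward model’s scores as the optimization signal for PPO.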
What is Prompt Engineering? #
- Crafting inputs to steer output behavior of foundation models.
Prompt Types #
- Simple instructions
- Role-based prompts
- Few-shot examples
- Chain-of-Thought (CoT)
- Zero-shot-CoT (“Let’s think step by step.”)
- Self-consistency (multiple CoTs, pick majority)
- Generated-knowledge prompting (generate relevant facts first, then answer)
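A minimal sketch of the self-consistency idea from the list above: sample several chain-of-thought completions at nonzero temperature and keep the majority final answer. The `generate` callable is a hypothetical placeholder for any LLM call; the stub below only simulates noisy sampling so the function runs end to end:

```python
import random
from collections import Counter

def self_consistent_answer(prompt, generate, n_samples=5):
    """Sample several chain-of-thought completions and return the majority answer."""
    answers = [generate(prompt + "\nLet's think step by step.")[1]
               for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

def fake_generate(prompt):
    # Placeholder: returns (reasoning, final_answer) like a temperature-sampled LLM call.
    return "…reasoning…", random.choice(["A", "A", "A", "B"])

print(self_consistent_answer("Which option is correct?", fake_generate))
```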
Text-Based Applications in Healthcare #
- Appointment scheduling
- Inbox management
- Chart summarization
- Trial eligibility
- Decision support
- Medical QA and patient communication
➡️ These tools reduce burnout and support both provider productivity and patient engagement.
Modalities Beyond Text #
- Imaging (X-rays, CTs)
- Genomics/proteomics
- Signal data (ECG)
- VATT-style models (Video-Audio-Text Transformers) process multiple data types in a unified transformer architecture.
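One way to picture the "unified architecture" idea: each modality is projected into a shared token/embedding space, and a single transformer then attends across all of them. The sketch below is purely schematic (NumPy, made-up feature sizes, no actual transformer layers), not a description of any published model:

```python
import numpy as np

d_model = 64
rng = np.random.default_rng(0)

# Per-modality linear projections into one shared embedding space (illustrative sizes).
proj_text  = rng.normal(size=(300, d_model))   # 300-dim text token features
proj_image = rng.normal(size=(768, d_model))   # 768-dim image patch features
proj_ecg   = rng.normal(size=(250, d_model))   # 250-sample ECG waveform segments

text_tokens  = rng.normal(size=(12, 300)) @ proj_text
image_tokens = rng.normal(size=(16, 768)) @ proj_image
ecg_tokens   = rng.normal(size=(30, 250)) @ proj_ecg

# One shared sequence: a single transformer would attend across all modalities at once.
tokens = np.concatenate([text_tokens, image_tokens, ecg_tokens], axis=0)
print(tokens.shape)   # (58, 64) -> 12 + 16 + 30 tokens in the same d_model space
```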
Do Foundation Models “Understand” Imaging? #
- They can generate plausible outputs, but they:
  - Miss clinical context
  - Can’t compare studies over time or integrate patient history the way radiologists do
Imaging as a Biomarker Source #
- A CT slice at the L3 (third lumbar) vertebral level can yield:
  - Fat/muscle measurements
  - Aortic calcification
  - Organomegaly
  - Predictive health markers
Foundation models unlock quantitative phenotyping from visual data.
Clinical Readiness Gaps #
- 83% of clinicians want AI covered in their training
- 70% feel overwhelmed by new technology
Academia vs. Industry #
- Industry dominates in compute + data
- Collaboration is essential for clinical relevance & ethical development
Model Drift Risks #
- Data Drift: Input data distribution changes
- Model Drift: Degrading performance over time
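One lightweight way to flag the data drift described above: compare the distribution of an input feature seen in production against the training distribution with a two-sample Kolmogorov-Smirnov test. A minimal sketch with simulated values and an arbitrary significance threshold:

```python
import numpy as np
from scipy.stats import ks_2samp

def flag_data_drift(train_values, live_values, alpha=0.01):
    """Two-sample KS test on one input feature; a small p-value suggests drift."""
    res = ks_2samp(train_values, live_values)
    return res.pvalue < alpha, res.statistic, res.pvalue

rng = np.random.default_rng(0)
train_age = rng.normal(55, 12, size=5000)     # patient ages seen during training (simulated)
live_age  = rng.normal(63, 12, size=500)      # live traffic skews older (simulated)
print(flag_data_drift(train_age, live_age))   # (True, ...) -> drift flagged for review
```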
Deployment Best Practices #
- Monitor performance regularly
- Update models with new data
- Ensure data quality
- Audit for fairness and bias
- Collaborate across sectors
➡️ Treat foundation models like medical devices — with continuous monitoring, recalibration, and governance.
📚 Additional Readings #
- Attention Is All You Need
- The Illustrated Transformer
- The Annotated Transformer
- On the Opportunities and Risks of Foundation Models (Stanford CRFM)
- Shifting ML for Healthcare – Nature Biomedical Engineering