ToC of Course 2/5: Introduction to Clinical Data
Module 1: Asking and Answering Questions via Clinical Data Mining
- Introduction to the data mining workflow
- Real-life example
- Example: Finding similar patients
- Example: Estimating risk
- Putting patient data on a timeline
- Revisit the data mining workflow steps
- Types of research questions
- Research questions suited for clinical data
- Example: Making a decision to treat
- Properties that make answering a research question useful
Module 2: Data Available from Healthcare Systems
- Review of the healthcare system
- Review of key entities and the data they collect
- Actors with different interests
- Common data types in healthcare
- Strengths and weaknesses of observational data
- Bias and error from the healthcare system perspective
- Bias and error of exposures and outcomes
- How a patient’s exposure might be misclassified
- How a patient’s outcome could be misclassified
- Electronic medical record data
- Claims data
- Pharmacy
- Surveillance datasets and registries
- Population health datasets
- A framework to assess if a data source is useful
Module 3: Representing Time, and Timing of Events, for Clinical Data Mining
- Introduction
- Time, timelines, timescales and representations of time
- Timescale: Choosing the relevant units of time
- What affects the timescale
- Representation of time
- Time series and non-time series data
- Order of events
- Implicit representations of time
- Different ways to put data in bins
- Timing of exposures and outcomes
- Clinical processes are non-stationary
Module 4: Creating Analysis Ready Datasets from Patient Timelines
- Turning clinical data into something you can analyze
- Defining the unit of analysis
- Using features and the presence of features
- How to create features from structured sources
- Standardizing features
- Dealing with too many features
- The origins of missing values
- Dealing with missing values
- Summary recommendations for missing values
- Constructing new features
- Examples of engineered features
- When to consider engineered features
- Main points about creating analysis-ready datasets
- Structured knowledge graphs
- So what exactly is in a knowledge graph
- What are important knowledge graphs
- How to choose which knowledge graph to use
Module 5: Handling Unstructured Healthcare Data: Text, Images, Signals
- Introduction to unstructured data
- What is clinical text
- The value of clinical text
- What makes clinical text difficult to handle
- Privacy and de-identification
- A primer on Natural Language Processing
- Practical approach to processing clinical text
- Summary - Clinical text
- Overview and goals of medical imaging
- Why are images important?
- What are images?
- A typical image management process
- Summary - Images
- Overview of biomedical signals
- Why are signals important?
- What are signals?
- What are the major issues with using signals?
- Summary - Signals
Module 6: Putting the Pieces Together: Electronic Phenotyping
- Introduction to electronic phenotyping
- Challenges in electronic phenotyping
- Specifying an electronic phenotype
- Two approaches to phenotyping
- Rule-based electronic phenotyping
- Examples of rule-based electronic phenotype definitions
- Constructing a rule-based phenotype definition
- Probabilistic phenotyping
- Approaches for creating a probabilistic phenotype definition
- Software for probabilistic phenotype definitions
Module 7: Ethics
- Introduction to Research Ethics and AI
- The Belmont Report: A Framework for Research Ethics
- Ethical Issues in Data Sources for AI
- Secondary Uses of Data
- Return of Results
- AI and The Learning Health System
- Ethics Summary