ToC of Course 2/5: Introduction to Clinical Data


Module 1: Asking and Answering Questions via Clinical Data Mining

  1. Introduction to the data mining workflow
  2. Real Life Example
  3. Example: Finding similar patients
  4. Example: Estimating risk
  5. Putting patient data on timeline
  6. Revisit the data mining workflow steps
  7. Types of research questions
  8. Research questions suited for clinical data
  9. Example: Making decision to treat
  10. Properties that make answering a research question useful

Module 2: Data Available from Healthcare Systems

  1. Review of the healthcare system
  2. Review of key entities and the data they collect
  3. Actors with different interests
  4. Common data types in Healthcare
  5. Strengths and weaknesses of observational data
  6. Bias and error from the healthcare system perspective
  7. Bias and error of exposures and outcomes
  8. How a patient’s exposure might be misclassified
  9. How a patient’s outcome could be misclassified
  10. Electronic medical record data
  11. Claims data
  12. Pharmacy
  13. Surveillance datasets and Registries
  14. Population health data sets
  15. A framework to assess if a data source is useful

Module 3: Representing Time, and Timing of Events, for Clinical Data Mining

  1. Introduction
  2. Time, timelines, timescales and representations of time
  3. Timescale: Choosing the relevant units of time
  4. What affects the timescale
  5. Representation of time
  6. Time series and non-time series data
  7. Order of events
  8. Implicit representations of time
  9. Different ways to put data in bins
  10. Timing of exposures and outcomes
  11. Clinical processes are non-stationary

Module 4: Creating Analysis Ready Datasets from Patient Timelines

  1. Turning clinical data into something you can analyze
  2. Defining the unit of analysis
  3. Using features and the presence of features
  4. How to create features from structured sources
  5. Standardizing features
  6. Dealing with too many features
  7. The origins of missing values
  8. Dealing with missing values
  9. Summary recommendations for missing values
  10. Constructing new features
  11. Examples of engineered features
  12. When to consider engineered features
  13. Main points about creating analysis ready datasets
  14. Structured knowledge graphs
  15. So what exactly is in a knowledge graph
  16. What are important knowledge graphs
  17. How to choose which knowledge graph to use

Module 5: Handling Unstructured Healthcare Data: Text, Images, Signals

  1. Introduction to unstructured data
  2. What is clinical text
  3. The value of clinical text
  4. What makes clinical text difficult to handle
  5. Privacy and de-identification
  6. A primer on Natural Language Processing
  7. Practical approach to processing clinical text
  8. Summary - Clinical text
  9. Overview and goals of medical imaging
  10. Why are images important?
  11. What are images?
  12. A typical image management process
  13. Summary - Images
  14. Overview of biomedical signals
  15. Why are signals important?
  16. What are signals?
  17. What are the major issues with using signals?
  18. Summary - Signals

Module 6: Putting the Pieces Together: Electronic Phenotyping

  1. Introduction to electronic phenotyping
  2. Challenges in electronic phenotyping
  3. Specifying an electronic phenotype
  4. Two approaches to phenotyping
  5. Rule-based electronic phenotyping
  6. Examples of rule-based electronic phenotype definitions
  7. Constructing a rule-based phenotype definition
  8. Probabilistic phenotyping
  9. Approaches for creating a probabilistic phenotype definition
  10. Software for probabilistic phenotype definitions

Module 7: Ethics

  1. Introduction to Research Ethics and AI
  2. The Belmont Report: A Framework for Research Ethics
  3. Ethical Issues in Data sources for AI
  4. Secondary Uses of Data
  5. Return of Results
  6. AI and The Learning Health System
  7. Ethics Summary