ml in medicine


some rough notes on ml in medicine


  • 3 types
    • disease and patient categorization (e.g. classification)
    • fundamental biological study
    • treatment of patients
  • philosophy
    • want to focus on problems doctors can’t do
    • alternatively, focus on automating problems parents can do to screen people at home in a cost-effective way
  • pathology - branch of medicine where you take some tissue from a patient (e.g. tumor), look at it under a microscope, and make an assessment of what the disease is
  • websites are often easier than apps for patients
  • The clinical artificial intelligence department: a prerequisite for success (cosgriff et al. 2020) - we need designated departments for clinical ai so we don’t have to rely on 3rd-party vendors and can test for things like distribution shift
  • challenges in ai healthcare (news)
    • adversarial examples
    • things can’t be de-identified
    • algorithms / data can be biased
    • correlation / causation get confused
  • healthcare is close to 20% of US GDP (roughly 17-18% in recent years)
  • prognosis is a forecast of the likely course or outcome of a disease, with or without treatment
  • diagnosis is actually identifying the problem and giving it a name, such as depression or obsessive-compulsive disorder
  • AI is a technology, but it’s not a product
  • health economics incentives align with health incentives: catching tumor early is cheaper for hospitals
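The distribution-shift testing mentioned in the cosgriff et al. bullet above could start with something as simple as comparing feature distributions between the training data and the data seen at deployment. Below is a minimal sketch, not a method from that paper: the function names, the toy data, and the 0.25 threshold are all assumptions chosen for illustration.

```python
import math
from statistics import mean, stdev


def standardized_mean_difference(reference, current):
    """Difference in means divided by the pooled standard deviation."""
    diff = mean(current) - mean(reference)
    pooled_sd = math.sqrt((stdev(reference) ** 2 + stdev(current) ** 2) / 2)
    if pooled_sd == 0:
        # Both samples are constant: shifted only if the constants differ.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / pooled_sd


def flag_shifted_features(reference, current, threshold=0.25):
    """Return feature names whose |SMD| exceeds the (assumed) threshold."""
    return [
        name
        for name in reference
        if abs(standardized_mean_difference(reference[name], current[name])) > threshold
    ]


# Hypothetical data: the training hospital vs. a new site with younger patients.
train = {"age": [54, 61, 47, 70, 58, 66], "heart_rate": [72, 80, 76, 90, 68, 84]}
deployed = {"age": [31, 28, 40, 35, 25, 38], "heart_rate": [74, 79, 77, 88, 70, 83]}

shifted = flag_shifted_features(train, deployed)
print(shifted)
```

This only catches shifts in the mean of individual features; shifts in the relationship between features and outcome would need labeled data from the new site.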


  • focus on building something you want to deploy
    • clinically useful - more efficient, cutting costs?
    • effective - does it improve the current baseline
    • focused on patient care - what are the unintended consequences
  • need to think a lot about regulation
    • USA: FDA
    • Europe: CE marking (more convoluted)
  • intended use
    • very specific and well-defined


medical system


  • doctors are evaluated infrequently (and things like personal traits are often included)
  • US has pretty good care but it is expensive per patient
  • expensive things (e.g. Da Vinci robot)
  • even if ml is not perfect, it may still outperform some doctors

medical education

  • rarely textbooks (often just slides)
  • 1-2% miss rate for diagnosis can be seen as acceptable
  • how doctors think
    • 2 years: memorizing facts about physiology, pharmacology, and pathology
    • 2 years learning practical applications for this knowledge, such as how to decipher an EKG and how to determine the appropriate dose of insulin for a diabetic
    • little emphasis on mental logic for making a correct diagnosis and avoiding mistakes
    • see work by pat croskerry
    • there is limited data on misdiagnosis rates
    • representativeness error - thinking is overly influenced by what is typically true
    • availability error - tendency to judge the likelihood of an event by the ease with which relevant examples come to mind
      • common infections tend to occur in epidemics, afflicting large numbers of people in a single community at the same time
      • confirmation bias
    • affective error - decisions based on what we wish were true (e.g. caring too much about patient)
    • See one, do one, teach one - teaching axiom

political elements

  • why doctors should organize
  • big pharma
  • day-to-day
    • Doctors now face a burnout epidemic: thirty-five per cent of them show signs of high depersonalization
    • according to one recent report, only thirteen per cent of a physician’s day, on average, is spent on doctor-patient interaction
    • one study: during an average eleven-hour workday, six hours are spent at the keyboard, maintaining electronic health records
    • medicare’s RVU (relative value unit) system - changes how doctors are reimbursed, emphasizing procedural over cognitive work
    • ai could help - make simple diagnoses faster, reduce paperwork, help patients manage their own diseases like diabetes
    • ai could also make things worse - hospitals are mostly run by business people

medical communication

“how do doctors think?”

communicating findings

  • don’t use ROC curves, use deciles
  • need to evaluate use, not just metric
  • internal/external validity ≈ train/test error (external validation usually uses a different patient population, so it is a stronger test)
  • model -> fitted model
  • retrospective study (looks back at existing data, more confounding) vs prospective study
  • sensitivity = recall; specificity = true negative rate, which is not the same as precision (positive predictive value)
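A minimal sketch of two ideas from the list above: how sensitivity, specificity, and precision fall out of the same confusion matrix (and can all differ), and what a decile table of predicted risk looks like as an alternative to an ROC curve. All function names and data here are hypothetical, and the toy metrics assume both classes and both predictions are present.

```python
def screening_metrics(y_true, y_pred):
    """Sensitivity, specificity, and precision from binary labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),  # same as recall
        "specificity": tn / (tn + fp),  # true negative rate
        "precision": tp / (tp + fp),    # positive predictive value
    }


def decile_table(y_true, risk, n_bins=10):
    """Sort patients by predicted risk, split into equal-size bins, and
    return (bin number, mean predicted risk, observed event rate) per bin."""
    order = sorted(range(len(risk)), key=lambda i: risk[i])
    rows = []
    for b in range(n_bins):
        idx = order[b * len(order) // n_bins:(b + 1) * len(order) // n_bins]
        if not idx:
            continue  # fewer patients than bins
        mean_risk = sum(risk[i] for i in idx) / len(idx)
        event_rate = sum(y_true[i] for i in idx) / len(idx)
        rows.append((b + 1, mean_risk, event_rate))
    return rows


# Hypothetical screening result: 4 diseased and 6 healthy patients.
metrics = screening_metrics(
    y_true=[1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    y_pred=[1, 1, 1, 0, 1, 1, 0, 0, 0, 0],
)
print(metrics)

# Hypothetical predicted risks and outcomes for 20 patients.
risk = [0.02, 0.05, 0.08, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40,
        0.45, 0.50, 0.55, 0.60, 0.65, 0.70, 0.80, 0.85, 0.90, 0.95]
outcome = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1]
table = decile_table(outcome, risk)
for row in table:
    print(row)
```

A clinician can read the decile table directly ("in the top tenth of predicted risk, how many patients actually had the event?"), which is harder to do from an ROC curve.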


successful examples of ai in medicine

  • ECG (NEJM, 1991)
  • EKG printouts already include a short automated interpretation
  • there used to be bayesian networks / expert systems but they went away…

icu interpretability example

  • goal: explain the model not the patient (that is the doctor’s job)
  • want to know interactions between features
  • some features are difficult to understand
    • e.g. the max of a measurement over a time window might seem high to a doctor unless they think about how it was computed
  • some features don’t really make sense to change (e.g. was this thing measured)
  • doctors like to see trends - patient health changes over time and must include history
  • feature importance under intervention
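To make the feature issues above concrete, here is a minimal sketch of the kind of windowed features an ICU model might use: a "was this measured" flag, the max over a window, the latest value, and a least-squares slope as a simple trend. The function name, the 6-hour window, and the lab values are hypothetical.

```python
def window_features(times, values, now, window_hours=6.0):
    """Summarize one measurement over the last `window_hours` before `now`.

    `times` are hours since admission; `values` are the measurements.
    """
    recent = sorted(
        (t, v) for t, v in zip(times, values) if now - window_hours <= t <= now
    )
    if not recent:
        # Nothing measured in the window: only the indicator is informative.
        return {"measured": False, "max": None, "latest": None, "slope_per_hour": None}
    ts = [t for t, _ in recent]
    vs = [v for _, v in recent]
    t_bar = sum(ts) / len(ts)
    v_bar = sum(vs) / len(vs)
    denom = sum((t - t_bar) ** 2 for t in ts)
    # Least-squares slope; undefined when all measurements share one timestamp.
    slope = (
        sum((t - t_bar) * (v - v_bar) for t, v in recent) / denom if denom > 0 else None
    )
    return {"measured": True, "max": max(vs), "latest": vs[-1], "slope_per_hour": slope}


# Hypothetical lab values: a spike at hour 4 that has since resolved.
feats = window_features(
    times=[0, 2, 4, 6, 8, 10],
    values=[1.0, 1.2, 4.0, 2.0, 1.8, 1.5],
    now=10,
)
print(feats)
```

In this toy case the windowed max is driven by a single earlier spike while the latest value is much lower and the slope is negative, which is why showing the trend alongside the max is more useful to a doctor than the max alone.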

high-performance ai studies

  • chest-xray: chexnet
  • echocardiograms: madani, ali, et al. 2018
  • skin: esteva, andre, et al. 2017
  • pathology: campanella, gabriele, et al. 2019
  • mammogram: kerlikowske, karla, et al. 2018

medical imaging

improving medical studies

  • Machine learning methods for developing precision treatment rules with observational data (Kessler et al. 2019)
    • goal: find precision treatment rules
    • problem: need large sample sizes but can’t obtain them in RCTs
    • recommendations
      • screen important predictors using large observational medical records rather than RCTs
        • important to do matching / weighting to account for bias in treatment assignments
        • alternatively, can look for natural experiment / instrumental variable / discontinuity analysis
        • has many benefits
      • modeling: should use ensemble methods rather than individual models
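The weighting step above can be illustrated with inverse probability weighting. This is a minimal sketch rather than the procedure from Kessler et al.: it assumes a single categorical confounder (severity), estimates the propensity as the treated fraction within each stratum, and uses made-up records. In practice the propensity model would itself be fit with flexible (e.g. ensemble) methods over many covariates.

```python
from collections import defaultdict


def propensity_by_stratum(records):
    """Estimate P(treated | stratum) as the treated fraction in each stratum."""
    counts = defaultdict(lambda: [0, 0])  # stratum -> [n_treated, n_total]
    for stratum, treated, _ in records:
        counts[stratum][0] += treated
        counts[stratum][1] += 1
    return {s: n_treated / n for s, (n_treated, n) in counts.items()}


def weighted_mean(weighted_outcomes):
    """Mean of outcomes y with weights w, given (w, y) pairs."""
    total_weight = sum(w for w, _ in weighted_outcomes)
    return sum(w * y for w, y in weighted_outcomes) / total_weight


def naive_effect(records):
    """Unadjusted difference in mean outcome, treated minus untreated."""
    treated = [(1.0, y) for _, t, y in records if t == 1]
    control = [(1.0, y) for _, t, y in records if t == 0]
    return weighted_mean(treated) - weighted_mean(control)


def ipw_effect(records):
    """Inverse-probability-weighted difference in mean outcome.

    Assumes every stratum has both treated and untreated patients,
    so each propensity is strictly between 0 and 1.
    """
    p = propensity_by_stratum(records)
    treated = [(1 / p[s], y) for s, t, y in records if t == 1]
    control = [(1 / (1 - p[s]), y) for s, t, y in records if t == 0]
    return weighted_mean(treated) - weighted_mean(control)


# Hypothetical records (severity, treated, outcome score). Severe patients are
# treated more often and do worse, but treatment adds +1 in both strata.
records = (
    [("mild", 1, 9)] * 2 + [("mild", 0, 8)] * 8
    + [("severe", 1, 5)] * 8 + [("severe", 0, 4)] * 2
)

naive = naive_effect(records)
adjusted = ipw_effect(records)
print(naive, adjusted)
```

On this toy data the unadjusted comparison makes the treatment look harmful because sicker patients are treated more often, while the weighted comparison recovers the +1 effect built into the data.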