some rough notes on ml in medicine

1.8. ml in medicine

1.8.1. general

  • 3 types

    • disease and patient categorization (e.g. classification)

    • fundamental biological study

    • treatment of patients

  • philosophy

    • want to focus on problems doctors can’t solve themselves

    • alternatively, focus on automating screening that parents can do at home in a cost-effective way

  • pathology - the branch of medicine where you take some tissue from a patient (e.g. a tumor), look at it under a microscope, and make an assessment of what the disease is

  • websites are often easier than apps for patients

  • The clinical artificial intelligence department: a prerequisite for success (cosgriff et al. 2020) - we need dedicated departments for clinical ai so we don’t have to rely on 3rd-party vendors and can test for things like distribution shift

  • challenges in ai healthcare (news)

    • adversarial examples

    • data often can’t be fully de-identified

    • algorithms / data can be biased

    • correlation / causation get confused

  • healthcare accounts for nearly 20% of US GDP

  • prognosis is a guess as to the outcome of treatment

  • diagnosis is actually identifying the problem and giving it a name, such as depression or obsessive-compulsive disorder

  • AI is a technology, but it’s not a product

  • regulatory clearance for the Paige Prostate product (2019) - received a CE mark

    • helps pathologists identify cancers (fewer misses)

  • health-economic incentives align with health incentives: catching a tumor early is cheaper for hospitals

  • the digital slide viewer also received a CE mark

1.8.1.1. high-level

  • focus on building something you want to deploy

    • clinically useful - does it make care more efficient or cut costs?

    • effective - does it improve on the current baseline?

    • focused on patient care - what are the unintended consequences?

  • need to think a lot about regulation

    • USA: FDA

    • Europe: CE marking (a more convoluted process)

  • intended use

    • must be very specific and well-defined

1.8.2. medical system

1.8.2.1. evaluation

  • doctors are evaluated infrequently (and evaluations often include things like personal traits)

  • the US has pretty good care, but it is expensive per patient

  • expensive equipment (e.g. the Da Vinci surgical robot) adds to costs

  • even if ml is not perfect, it may still outperform some doctors

1.8.2.2. medical education

  • textbooks are rarely used (often just slides)

  • a 1-2% miss rate for diagnosis can be seen as acceptable

  • how doctors think

    • 2 years: memorizing facts about physiology, pharmacology, and pathology

    • 2 years: learning practical applications of this knowledge, such as how to decipher an EKG and how to determine the appropriate dose of insulin for a diabetic

    • little emphasis on the mental logic of making a correct diagnosis and avoiding mistakes

    • see work by pat croskerry

    • there is limited data on misdiagnosis rates

    • representativeness error - thinking is overly influenced by what is typically true

    • availability error - tendency to judge the likelihood of an event by the ease with which relevant examples come to mind

      • common infections tend to occur in epidemics, afflicting large numbers of people in a single community at the same time

      • confirmation bias

    • affective error - decisions based on what we wish were true (e.g. caring too much about a patient)

    • See one, do one, teach one - teaching axiom

1.8.2.3. political elements

  • why doctors should organize

  • big pharma

  • day-to-day

    • doctors now face a burnout epidemic: 35% of them show signs of high depersonalization

    • according to one recent report, only 13% of a physician’s day, on average, is spent on doctor-patient interaction

    • one study found that during an average eleven-hour workday, six hours are spent at the keyboard maintaining electronic health records

    • medicare’s RVU (relative value unit) system changes how doctors are reimbursed, emphasizing procedural over cognitive work

    • ai could help - make simple diagnoses faster, reduce paperwork, help patients manage their own diseases like diabetes

    • ai could also make things worse - hospitals are mostly run by business people

1.8.3. medical communication

1.8.3.1. “how do doctors think?”

1.8.3.2. communicating findings

  • when communicating with clinicians, don’t use ROC curves; use deciles of predicted risk (see the sketch after this list)

  • need to evaluate actual clinical use, not just a metric

  • distinguish the model (specification) from the fitted model (after training on data)

  • retrospective study (looks back at existing data, so more prone to confounding) vs. prospective study

  • internal/external validity ≈ train/test evaluation (although external validation usually uses a different patient population, so it is a stronger test)

  • sensitivity = recall, and precision = positive predictive value (PPV); note that specificity is the true negative rate, not precision
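
To make the terminology mapping above concrete, here is a minimal sketch (the toy data and function names are illustrative, not from any referenced study) that computes sensitivity, specificity, and precision/PPV from binary predictions, and summarizes observed event rates by decile of predicted risk as an alternative to an ROC curve:

```python
# Minimal sketch: clinical vs. ML metric terminology, plus a decile summary.
# The toy data and function names here are illustrative assumptions.
import numpy as np


def confusion_counts(y_true, y_pred):
    """Return TP, FP, TN, FN for binary labels in {0, 1}."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    return tp, fp, tn, fn


def clinical_metrics(y_true, y_pred):
    """Map clinical metric names to their ML counterparts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity (= recall)": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),              # true negative rate
        "precision (= PPV)": tp / (tp + fp),        # positive predictive value
    }


def risk_by_decile(y_true, y_score, n_bins=10):
    """Observed event rate in each decile of predicted risk (lowest to highest)."""
    order = np.argsort(y_score)
    return [float(np.mean(np.asarray(y_true)[idx]))
            for idx in np.array_split(order, n_bins)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y_score = rng.uniform(size=1000)          # toy predicted risks
    y_true = rng.binomial(1, y_score)         # outcomes correlated with the risks
    print(clinical_metrics(y_true, (y_score > 0.5).astype(int)))
    print(risk_by_decile(y_true, y_score))    # event rate should rise across deciles
```

A decile table like this is usually easier for clinicians to act on than an AUC alone.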

1.8.4. examples

1.8.4.1. successful examples of ai in medicine

  • ECG (NEJM, 1991)

  • EKG machines already print a small automated interpretation on the readout

  • there used to be bayesian networks / expert systems but they went away…

1.8.4.2. icu interpretability example

  • goal: explain the model, not the patient (explaining the patient is the doctor’s job)

  • want to know interactions between features

  • some features are difficult to understand

    • e.g. a max over a time window might look alarmingly high to a doctor unless they realize it is a windowed max

  • some features don’t really make sense to intervene on (e.g. an indicator of whether something was measured)

  • doctors like to see trends - patient health changes over time, so explanations must include history

  • feature importance under intervention (see the sketch after this list)
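
One simple way to approximate intervention-style feature importance is permutation importance: perturb one feature at a time and measure the drop in model performance. The sketch below uses a hypothetical ICU-style dataset with made-up feature names (including a “was this measured” indicator, which can rank as important even though it makes no sense to intervene on):

```python
# Sketch of permutation feature importance on a hypothetical ICU-style dataset.
# The feature names, model, and data are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([
    rng.normal(80, 15, n),    # "max heart rate over a window" (hypothetical)
    rng.normal(95, 5, n),     # "min oxygen saturation" (hypothetical)
    rng.binomial(1, 0.3, n),  # "was lactate measured?" indicator (hypothetical)
])
feature_names = ["max_heart_rate", "min_spo2", "lactate_measured"]
logit = 0.04 * (X[:, 0] - 80) - 0.2 * (X[:, 1] - 95) + 1.0 * X[:, 2]
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))   # toy outcome (e.g. deterioration)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
base_auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])

# permute one feature at a time; the AUC drop is that feature's importance
for j, name in enumerate(feature_names):
    X_perm = X_te.copy()
    X_perm[:, j] = rng.permutation(X_perm[:, j])
    drop = base_auc - roc_auc_score(y_te, model.predict_proba(X_perm)[:, 1])
    print(f"{name}: AUC drop = {drop:.3f}")
```

Interactions between features and trends over a patient’s history would need richer tooling than this single-feature sketch.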

1.8.4.3. high-performance ai studies

  • chest-xray: chexnet

  • echocardiograms: madani, ali, et al. 2018

  • skin: esteva, andre, et al. 2017

  • pathology: campanella, gabriele, et al. 2019

  • mammogram: kerlikowske, karla, et al. 2018

1.8.5. medical imaging

1.8.6. improving medical studies

  • Machine learning methods for developing precision treatment rules with observational data (Kessler et al. 2019)

    • goal: find precision treatment rules

    • problem: need large sample sizes but can’t obtain them in RCTs

    • recommendations

      • screen important predictors using large observational medical records rather than RCTs

        • important to do matching / weighting to account for bias in treatment assignments (see the sketch after this list)

        • alternatively, can look for natural experiment / instrumental variable / discontinuity analysis

        • this approach has many benefits (e.g. much larger sample sizes than RCTs allow)

      • modeling: should use ensemble methods rather than individual models
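
As a minimal sketch of the matching / weighting recommendation above, here is a toy inverse-propensity-weighting example; the data-generating process, variable names, and effect sizes are all illustrative assumptions, not from Kessler et al.:

```python
# Sketch: inverse propensity weighting (IPW) to adjust for biased treatment
# assignment in observational data. All names and numbers are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
severity = rng.normal(size=n)                # confounder: how sick the patient is
p_treat = 1 / (1 + np.exp(-1.5 * severity))  # sicker patients get treated more often
treated = rng.binomial(1, p_treat)
outcome = 2.0 * severity - 1.0 * treated + rng.normal(size=n)  # true effect = -1.0

# naive comparison is confounded: treated patients are sicker to begin with
naive = outcome[treated == 1].mean() - outcome[treated == 0].mean()

# estimate propensity scores, then weight each patient by 1 / P(observed treatment)
ps = LogisticRegression().fit(severity.reshape(-1, 1), treated).predict_proba(
    severity.reshape(-1, 1))[:, 1]
weights = treated / ps + (1 - treated) / (1 - ps)
ipw_effect = (np.average(outcome[treated == 1], weights=weights[treated == 1])
              - np.average(outcome[treated == 0], weights=weights[treated == 0]))

print(f"naive difference:        {naive:+.2f}  (pulled off the truth by confounding)")
print(f"IPW-adjusted difference: {ipw_effect:+.2f}  (should be close to the true -1.00)")
```

In practice one would also check overlap/positivity and covariate balance after weighting before trusting such an estimate.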