seeking good explanations with machine learning

senior researcher at microsoft research (deep learning group); phd from berkeley (with prof. bin yu)


Below are some areas I'm currently excited about. If you want to chat about research or are interested in interning at MSR, feel free to reach out over email :)
🔎 Interpretability. I'm interested in rethinking interpretability in the context of LLMs.

augmented imodels - use LLMs to build a transparent model
imodels - build interpretable models in the style of scikit-learn
explanation penalization - regularize explanations to align models with prior knowledge
adaptive wavelet distillation - replace neural nets with simple, performant wavelet models
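The imodels package follows the scikit-learn fit/predict interface. As a minimal, stdlib-only sketch of what "interpretable models in the style of scikit-learn" means (the class and attribute names here are hypothetical, not the actual imodels API), here is a transparent one-rule classifier whose learned rule can be read directly off its attributes:

```python
# Hypothetical sketch: a one-rule "decision stump" classifier with a
# scikit-learn-style fit/predict interface, as imodels models expose.
# This is illustrative only, not code from the imodels package.
from collections import Counter


class DecisionStumpClassifier:
    """Predicts by thresholding a single feature; the learned rule is
    fully transparent (feature_, threshold_, and per-side labels)."""

    def fit(self, X, y):
        best = None
        for j in range(len(X[0])):
            for t in sorted({row[j] for row in X}):
                left = [yi for row, yi in zip(X, y) if row[j] <= t]
                right = [yi for row, yi in zip(X, y) if row[j] > t]
                if not left or not right:
                    continue
                # majority label on each side of the candidate split
                l_lab = Counter(left).most_common(1)[0][0]
                r_lab = Counter(right).most_common(1)[0][0]
                acc = (sum(yi == l_lab for yi in left)
                       + sum(yi == r_lab for yi in right)) / len(y)
                if best is None or acc > best[0]:
                    best = (acc, j, t, l_lab, r_lab)
        _, self.feature_, self.threshold_, self.left_, self.right_ = best
        return self

    def predict(self, X):
        return [self.left_ if row[self.feature_] <= self.threshold_
                else self.right_ for row in X]


# Usage: the whole fitted model is one human-readable rule.
X = [[0, 1], [1, 3], [2, 5], [3, 7]]
y = [0, 0, 1, 1]
model = DecisionStumpClassifier().fit(X, y)
print(f"if x[{model.feature_}] <= {model.threshold_}: "
      f"predict {model.left_}, else {model.right_}")
```

The point of the sklearn-style interface is that such a transparent model can be dropped into any pipeline where a black-box estimator would otherwise go.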
🚗 LLM steering. Interpretability tools can provide ways to better guide and use LLMs.

tree prompting - improve black-box few-shot text classification with decision trees
attention steering - guide LLMs by emphasizing specific input spans
interpretable autoprompting - automatically find fluent natural-language prompts
🧠 Neuroscience. Since joining MSR, I have been focused on building and applying these methods to understand how the human brain represents language (using fMRI in collaboration with the Huth lab at UT Austin).

summarize & score explanations - generate natural-language explanations of fMRI encoding models
💊 Healthcare. I'm also actively working on improving clinical decision instruments by using information spread across various sources in the medical literature (in collaboration with Aaron Kornblith at UCSF and the MSR Health Futures team).

clinical self-verification - self-verification improves performance and interpretability of clinical information extraction
clinical rule vetting - stress testing the performance of a clinical decision instrument for intra-abdominal injury

year title authors tags paper code misc
'24 Rethinking Interpretability in the Era of Large Language Models singh, inala, galley, caruana, & gao 🔎🌀 arxiv
'24 Towards Consistent Natural-Language Explanations via Explanation-Consistency Finetuning chen et al. 🔎🌀 arxiv
'24 Learning a Decision Tree Algorithm with Transformers zhuang et al. 🔎🌀🌳 arxiv
'24 Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs zhang et al. 🔎🌀 iclr
'24 Attribute Structuring Improves LLM-Based Evaluation of Clinical Text Summaries gero et al. 🔎🌀 arxiv
'23 Tree Prompting morris*, singh*, rush, gao, & deng 🔎🌀🌳 emnlp
'23 Augmenting Interpretable Models with LLMs during Training singh, askari, caruana, & gao 🔎🌀🌳 nature communications
'23 Explaining black box text modules in natural language with language models singh*, hsu*, antonello, jain, huth, yu & gao 🔎🌀 neurips workshop
'23 Self-Verification Improves Few-Shot Clinical Information Extraction gero*, singh*, cheng, naumann, galley, gao, & poon 🔎🌀💊 icml workshop (imlh)
'22 Explaining patterns in data with language models via interpretable autoprompting singh*, morris*, aneja, rush, & gao 🔎🌀 emnlp workshop (blackboxnlp)
'22 Stress testing a clinical decision instrument performance for intra-abdominal injury kornblith*, singh* et al. 🔎🌳💊 plos digital health
'22 Fast interpretable greedy-tree sums (FIGS) tan*, singh*, nasseri, agarwal, & yu 🔎🌳 arxiv
'22 Hierarchical shrinkage for trees agarwal*, tan*, ronen, singh, & yu 🔎🌳 icml (spotlight)
'22 VeridicalFlow: a python package for building trustworthy data science pipelines with PCS duncan*, kapoor*, agarwal*, singh*, & yu 💻🔍 joss
'21 imodels: a python package for fitting interpretable models singh*, nasseri*, et al. 💻🔍🌳 joss
'21 Adaptive wavelet distillation from neural networks through interpretations ha, singh, et al. 🔍🌀🌳 neurips
'21 Matched sample selection with GANs for mitigating attribute confounding singh, balakrishnan, & perona 🌀 cvpr workshop
'21 Revisiting complexity and the bias-variance tradeoff dwivedi*, singh*, yu & wainwright 🌀 jmlr
'20 Curating a COVID-19 data repository and forecasting county-level death counts in the United States altieri et al. 🔎🦠 hdsr
'20 Transformation importance with applications to cosmology singh*, ha*, lanusse, boehm, liu & yu 🔎🌀🌌 iclr workshop (spotlight)
'20 Interpretations are useful: penalizing explanations to align neural networks with prior knowledge rieger, singh, murdoch & yu 🔎🌀 icml
'19 Hierarchical interpretations for neural network predictions singh*, murdoch*, & yu 🔍🌀 iclr
'19 Interpretable machine learning: definitions, methods, and applications murdoch*, singh*, et al. 🔍🌳🌀 pnas
'19 Disentangled attribution curves for interpreting random forests and boosted trees devlin, singh, murdoch & yu 🔍🌳 arxiv
'18 Large scale image segmentation with structured loss based deep learning for connectome reconstruction funke*, tschopp*, et al. 🧠🌀 tpami
'18 Linearization of excitatory synaptic integration at no extra cost morel, singh, & levy 🧠 j comp neuro
'17 A consensus layer V pyramidal neuron can sustain interpulse-interval coding singh & levy 🧠 plos one
'17 A constrained, weighted-l1 minimization approach for joint discovery of heterogeneous neural connectivity graphs singh, wang, & qi 🧠 neurips workshop

resources + posts

Notes on machine learning / neuroscience.

Mini personal projects. There's also some dumb stuff here.


I've been lucky to be advised by, and to collaborate with, many amazing people.

It has been my pleasure to help advise some incredible students.

MSR PhD interns

Berkeley undergrad/MS students