notes updates
csinva committed May 29, 2024
1 parent 7660b59 commit 7692eb6
Showing 4 changed files with 31 additions and 14 deletions.
9 changes: 6 additions & 3 deletions _includes/01_research.html
@@ -52,18 +52,20 @@ <h2 style="text-align: center; margin-top: -150px;"> Research</h2>

<div class="research_box">

- <strong>🧠 Neuroscience. </strong> Since joining MSR, I have been focused on building and applying these methods
+ <strong>🧠 Neuroscience. </strong> Since joining MSR, I've been focused on leveraging LLM interpretability
to understand how the human brain represents language (using fMRI in collaboration with the <a
href="https://www.cs.utexas.edu/~huth/index.html">Huth lab</a> at UT Austin).
<br>
<br>
<a href="https://arxiv.org/abs/2405.16714">qa embeddings</a> - build interpretable fMRI encoding models by
asking yes/no questions to LLMs<br>
<a href="https://arxiv.org/abs/2305.09863">summarize &amp; score explanations</a> - generate natural-language
explanations of fMRI encoding models
</div>


<div class="research_box"><strong>💊
- Healthcare. </strong>I'm also actively working in how we can improve clinical decision instruments by using
+ Healthcare. </strong>I'm also actively working on how we can improve clinical decision instruments by using
the information contained across various sources in the medical literature (in collaboration with <a
href="https://profiles.ucsf.edu/aaron.kornblith">Aaron Kornblith</a> at UCSF and the MSR <a
href="https://www.microsoft.com/en-us/research/group/real-world-evidence/">Health Futures team</a>).
@@ -156,7 +158,8 @@ <h2 style="text-align: center; margin-top: -150px;"> Research</h2>
<td>benara*, singh*, morris, antonello, stoica, huth, & gao</td>
<td class="med">🧠🔎🌀</td>
<td class="center"><a href="https://arxiv.org/abs/2405.16714">arxiv</a></td>
<td class="big"></td>
<td class="big"><a href="https://github.com/csinva/interpretable-embeddings"><i
class="fa fa-github fa-fw"></i></a></td>
<td class="med">
</td>
</tr>
16 changes: 8 additions & 8 deletions _includes/02_notes_main.html
@@ -95,20 +95,20 @@ <h3 align="center">research posts</h3>
<div class="coll3">
<h3 align="center">slides</h3>
<ul class="list-items">
<li><a href="{{ site.baseurl }}/pres/189/" style="font-size:medium; font-weight: bolder;"> intro ml - cs 189
</a>(<a href="https://github.com/csinva/csinva.github.io/blob/master/pres/189/_slides_ml.md"><i
class="fa fa-github fa-fw"></i></a>) uc berkeley ⭐
</li>
<li><a href="{{ site.baseurl }}/pres/188/" style="font-size:medium"> intro ai - cs 188 </a>(<a
href="https://github.com/csinva/csinva.github.io/blob/master/pres/188/_slides_ai.md"><i
class="fa fa-github fa-fw"></i></a>) uc berkeley
</li>
<li><a href="https://docs.google.com/presentation/d/1qL_cATZWiwOg4EjgrQ93m2zEpNMYYdIUUqqMO1REbIk/"
style="font-size:medium; font-weight: bolder;"> explanations from text data </a> '23 ⭐</li>
<li><a href="{{ site.baseurl }}/pres/189/" style="font-size:medium; font-weight: bolder;"> intro ml
</a>(<a href="https://github.com/csinva/csinva.github.io/blob/master/pres/189/_slides_ml.md"><i
class="fa fa-github fa-fw"></i></a>) uc berkeley cs 189 ⭐
</li>
<li><a href="https://docs.google.com/presentation/d/101roPSL6AlSKf5iYcxnF9TCHOHZIBGODlQMCbEXnIeM/"
style="font-size:medium"> uniting trees and LLMs </a> '23</li>
<li><a href="https://docs.google.com/presentation/d/1IyxCrB5Ol8RsvFBTTy4Y5DQH3TxS9Wgl7Pxt1thrBBY/"
style="font-size:medium"> phd research ovw (animated) </a> '22</li>
<li><a href="{{ site.baseurl }}/pres/188/" style="font-size:medium"> intro ai </a>(<a
href="https://github.com/csinva/csinva.github.io/blob/master/pres/188/_slides_ai.md"><i
class="fa fa-github fa-fw"></i></a>) uc berkeley cs 188
</li>
<li><a href="https://docs.google.com/presentation/d/19fTICv0pyRiwGE39mqE_eGTq0Arn3OtVnDSrURFbPMA/"
style="font-size:medium"> deep learning interpretation ovw </a> '20</li>
<li><a href="https://docs.google.com/presentation/d/1RIdbV279r20marRrN0b1bu2z9STkrivsMDa_Dauk8kE/"
1 change: 1 addition & 0 deletions _notes/research_ovws/ovw_interp.md
@@ -471,6 +471,7 @@ Symbolic regression learns a symbolic expression (e.g. a mathematical formula) for a function
- Human evaluation: agreement of concept scores and contribution of concept to output
- Concept transformers ([rigotti, ... scotton, 2022](https://openreview.net/pdf?id=kAa9eDS0RdO)) - use human-given concepts and explain predictions as a function of these concepts
- Knowledge-enhanced Bottlenecks (KnoBo) - A Textbook Remedy for Domain Shifts: Knowledge Priors for Medical Image Analysis ([yang...yatskar, 2024](https://yueyang1996.github.io/papers/knobo.pdf)) - CBMs that incorporate knowledge priors constraining them to reason with clinically relevant factors found in medical textbooks or PubMed
+ - Crafting Interpretable Embeddings by Asking LLMs Questions ([benara...gao, 2024](https://arxiv.org/pdf/2405.16714))
- MoIE: Route, Interpret, Repeat: Blurring the Line Between Post hoc Explainability and Interpretable Models ([ghosh, ..., batmanghelich, 2023](https://arxiv.org/abs/2302.10289#)) - mixture of different interpretable models, with black-box routing
- SASC - learn factors from BERT using dictionary learning, assign each factor a natural-language explanation, then build a sparse linear model of these factors ([singh, ..., gao, 2023](https://arxiv.org/abs/2305.09863))
- [Concept Whitening for Interpretable Image Recognition](https://arxiv.org/pdf/2002.01650.pdf) (chen et al. 2020) - force network to separate "concepts" (like in TCAV) along different axes
19 changes: 16 additions & 3 deletions _notes/research_ovws/ovw_llms.md
@@ -165,7 +165,6 @@ See related papers in the [📌 interpretability](https://csinva.io/notes/resear
- How Can We Know What Language Models Know? ([jiang ... neubig, 2020](https://arxiv.org/abs/1911.12543))
- mining-based and paraphrasing-based methods to automatically generate high-quality diverse prompts
- ensemble methods to combine answers from different prompts (e.g. averaging logits and more; see the first sketch below)

- Noisy Channel Language Model Prompting for Few-Shot Text Classification ([min et al. 2022](https://arxiv.org/pdf/2108.04106.pdf))
- Querying $P(question|answer)$ with Bayes rule outperforms standard querying $P(answer|question)$ (see the second sketch below)

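A minimal sketch of the logit-averaging ensemble idea from the jiang et al. entry above. `get_logits` and `label_ids` are hypothetical stand-ins (not from the paper's code): `get_logits(prompt)` is assumed to return the LM's next-token logits for a prompt, and `label_ids` maps each candidate answer to a vocabulary id.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def ensemble_answer(prompts, get_logits, label_ids):
    """Average next-token logits across paraphrased prompts, then pick
    the highest-probability answer label.

    get_logits(prompt) -> (vocab_size,) array of next-token logits
    (hypothetical helper wrapping the LM).
    label_ids: {label_name: vocab_id} for the candidate answers.
    """
    avg_logits = np.mean([get_logits(p) for p in prompts], axis=0)
    probs = softmax(avg_logits)
    return max(label_ids, key=lambda name: probs[label_ids[name]])
```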
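And a sketch of noisy-channel scoring for the min et al. entry, under the assumption of a hypothetical `lm_logprob(prompt, continuation)` helper that returns the LM's log-probability of a continuation given a prompt:

```python
def channel_classify(question, candidate_answers, lm_logprob):
    """Noisy-channel classification: score each candidate answer by
    P(question | answer) rather than the direct P(answer | question).

    lm_logprob(prompt, continuation) -> log P(continuation | prompt)
    under the LM (hypothetical helper).
    """
    # direct scoring would be lm_logprob(question, answer);
    # the channel model flips the conditioning
    return max(candidate_answers, key=lambda a: lm_logprob(a, question))
```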
@@ -408,6 +407,16 @@ See related papers in the [📌 interpretability](https://csinva.io/notes/resear
- LLMLingua ([jiang, wu...qiu, 2023](https://arxiv.org/abs/2310.05736)) - learn BERT-size model to compress prompt (iterative token classification approach from distilled GPT-4 compressed prompts)
- LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression ([jiang, wu...qiu, 2023](https://arxiv.org/abs/2310.06839))

+ ### classifier-guided generation
+ 
+ - Plug and Play Language Models: A Simple Approach to Controlled Text Generation ([dathathri, …, yosinski, & liu, 2020](https://arxiv.org/abs/1912.02164))
+   - gradients from the attribute classifier nudge the LM's hidden activations, and logits are recomputed from the shifted activations to guide generation (optionally averaged with the original logits to maintain fluency)
+ - FUDGE: Controlled Text Generation With Future Discriminators ([yang & klein, 2021](https://arxiv.org/abs/2104.05218))
+   - a classifier predicts the probability of the attribute for the running sequence with each candidate next token appended
+   - these attribute probabilities are multiplied with the LM's next-token probabilities and the next token is sampled from the renormalized distribution (see the sketch below)
+ - Diffusion-LM Improves Controllable Text Generation ([lisa li, thickstun, gulrajani, liang, & hashimoto, 2022](https://arxiv.org/abs/2205.14217))
+ - Mixture of Soft Prompts for Controllable Data Generation ([chen, lee, …, yu, 2023](https://arxiv.org/pdf/2303.01580.pdf)) - trains a small model on data from a big frozen LLM that is then more controllable
+ 
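A rough sketch of the FUDGE-style reweighting described in the list above (PPLM instead nudges hidden activations with classifier gradients). `classifier_prob_fn` is a hypothetical attribute classifier, and restricting to the top-k candidates mirrors FUDGE's trick of only scoring likely tokens; this is illustrative, not the paper's implementation.

```python
import torch

def fudge_step(lm_logits, classifier_prob_fn, input_ids, top_k=50):
    """One FUDGE-style decoding step: reweight the LM's next-token
    probabilities by a classifier's estimate that the target attribute
    will hold if each candidate token is appended.

    lm_logits: 1-D tensor of next-token logits for the current prefix.
    classifier_prob_fn(ids) -> P(attribute | ids) (hypothetical classifier).
    input_ids: 1-D tensor of the tokens generated so far.
    """
    probs = torch.softmax(lm_logits, dim=-1)
    top_p, top_idx = probs.topk(top_k)  # only score the k most likely tokens
    attr_p = torch.tensor([
        classifier_prob_fn(torch.cat([input_ids, tok.view(1)]))
        for tok in top_idx
    ])
    scores = top_p * attr_p         # P(token | prefix) * P(attr | prefix + token)
    scores = scores / scores.sum()  # renormalize over the candidates
    return top_idx[torch.multinomial(scores, 1)]  # sample the next token
```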
# misc

## adaptation / transfer
@@ -921,6 +930,7 @@ mixture of experts models have become popular because of the need for (1) fast s
- customization
- e.g. add prompt or prefixes like *search query*, *search document*, *classification*, *clustering* before embedding so model knows how to match things
- top-performing models
+ - NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models ([lee...ping, 2024](https://arxiv.org/abs/2405.17428))
- LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders ([behnamghader...reddy, 2024](https://arxiv.org/abs/2404.05961))
- Gecko: Versatile Text Embeddings Distilled from LLMs ([lee...naim, 2024](https://arxiv.org/abs/2403.20327))
- GRIT: Generative Representational Instruction Tuning ([muennighoff...kiela, 2024](https://arxiv.org/abs/2402.09906)) - train a single model that, given different instructions, can produce either generations or embeddings
@@ -939,6 +949,7 @@ mixture of experts models have become popular because of the need for (1) fast s
- BGE ([github](https://github.com/FlagOpen/FlagEmbedding))
- Nomic Embed ([nussbaum, morris, duderstadt, & mulyar, 2024](https://static.nomic.ai/reports/2024_Nomic_Embed_Text_Technical_Report.pdf)), ([blog post](https://blog.nomic.ai/posts/nomic-embed-text-v1))
- Older: [SBERT](https://arxiv.org/abs/1908.10084), [SIMCSE](https://arxiv.org/abs/2104.08821), [SGPT](https://arxiv.org/abs/2202.08904)

- embedding approaches [overview](https://github.com/caiyinqiong/Semantic-Retrieval-Models)
- 3 levels of interaction
- bi-encoder: separately encode query & doc
@@ -976,7 +987,8 @@ mixture of experts models have become popular because of the need for (1) fast s
- embeddings consist of answers to questions
- answer models are finetuned on QA datasets
- questions are given ahead of time (see the sketch below)

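A toy sketch of the question-answering embedding idea from the sub-bullets above: each embedding dimension is the answer to a predefined yes/no question, so the dimensions are interpretable by construction. The `ask` helper and the example questions are illustrative stand-ins, not the papers' actual prompts or question sets.

```python
def qa_embed(text, questions, ask):
    """One interpretable embedding dimension per yes/no question.

    ask(question, text) -> probability of "yes" (hypothetical helper,
    e.g. an LLM or finetuned QA model prompted with question + text).
    """
    return [ask(q, text) for q in questions]

# illustrative question set -- the papers curate/generate these carefully
questions = [
    "Does the input mention a person?",
    "Does the input describe a physical action?",
    "Is the input about a place?",
]
embedding = qa_embed("i drove to the store.", questions,
                     ask=lambda q, t: 0.5)  # stub answer model
```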
+ - Learning Interpretable Style Embeddings via Prompting LLMs ([patel, rao, kothary, mckeown, & callison-burch, 2023](https://arxiv.org/abs/2305.12696))

- multimodal
- SPLICE: Interpreting CLIP with Sparse Linear Concept Embeddings ([bhalla…lakkaraju, 2024](https://arxiv.org/abs/2402.10376))
- given CLIP, build an embedding concept dictionary by taking text embeddings of a bunch of individual semantic words (see the sketch below)
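A sketch of the SPLICE-style decomposition just described, assuming precomputed CLIP embeddings; the sparse nonnegative solver here (scikit-learn's `Lasso` with `positive=True`) is a stand-in for the paper's exact method.

```python
import numpy as np
from sklearn.linear_model import Lasso

def splice_decompose(image_emb, concept_embs, concept_names, alpha=0.05):
    """Express a CLIP image embedding as a sparse nonnegative combination
    of concept text embeddings; nonzero weights name concepts in the image.

    image_emb: (d,) image embedding; concept_embs: (n_concepts, d) text
    embeddings of single semantic words (all assumed precomputed).
    """
    lasso = Lasso(alpha=alpha, positive=True, fit_intercept=False)
    lasso.fit(concept_embs.T, image_emb)  # image_emb ~= concept_embs.T @ w
    w = lasso.coef_
    return sorted(((concept_names[i], w[i]) for i in np.flatnonzero(w)),
                  key=lambda t: -t[1])
```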
@@ -986,7 +998,7 @@ mixture of experts models have become popular because of the need for (1) fast s
- Why do These Match? Explaining the Behavior of Image Similarity Models ([plummer…saenko, forsyth, 2020](https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123560630.pdf)) - generate a saliency map + an attribute based on the salient region
- Towards Visually Explaining Similarity Models ([zheng…wu, 2020](https://arxiv.org/abs/2008.06035)) - similarity of cnn embeddings
- Interpretable entity representations through large-scale typing ([onoe & durrett, 2020](https://arxiv.org/abs/2005.00147)) - embedding consists of interpretable predictions for different entity types

- Explaining similarity with different outputs
- Analogies and Feature Attributions for Model Agnostic Explanation of Similarity Learners ([ramamurthy…tariq, 2022](https://arxiv.org/pdf/2202.01153.pdf)) - returned explanation is an analogy (pair from the training set) rather than a saliency map
- Sim2Word: Explaining Similarity with Representative Attribute Words via Counterfactual Explanations ([chen…cao, 2023](https://dl.acm.org/doi/full/10.1145/3563039)) - give both saliency map + counterfactual explanation
@@ -1139,6 +1151,7 @@ mixture of experts models have become popular because of the need for (1) fast s
- TopicGPT: A Prompt-based Topic Modeling Framework ([pham...iyyer, 2023](https://arxiv.org/abs/2311.01449))
- What is different between these datasets? ([babbar, guo, & rudin, 2024](https://arxiv.org/abs/2403.05652)) - combine a variety of different methods to find the difference between (mostly tabular) datasets
- GSCLIP : A Framework for Explaining Distribution Shifts in Natural Language ([zhu...james zou, 2022](https://arxiv.org/abs/2206.15007)) - automatically explain dataset-level distribution shifts (in image datasets) with natural language
+ - Domino: Discovering Systematic Errors with Cross-Modal Embeddings ([eyuboglu...zou, re, 2022](https://arxiv.org/abs/2203.14960))
- MaNtLE: Model-agnostic Natural Language Explainer ([menon, zaman, & srivastava, 2023](https://arxiv.org/pdf/2305.12995.pdf)) - train model to generate explanations on simple tables (they do this for classifier outputs but could easily do it directly for data labels)
- LLMs for Automated Open-domain Scientific Hypotheses Discovery ([yang...cambria, 2023](https://arxiv.org/abs/2309.02726))
- Scaling deep learning for materials discovery ([merchant...cubuk, 2023](https://www.nature.com/articles/s41586-023-06735-9))
