Adding Luke Guerdan's talk (#435)
* fix bugs and add speaker

* commit

* add speaker
cesare-spinoso authored Nov 26, 2024
1 parent 1025b6a commit e3cf872
Showing 2 changed files with 42 additions and 2 deletions.
4 changes: 2 additions & 2 deletions _pages/reading-group.md
@@ -29,8 +29,8 @@ For the Fall 2024 semester, the reading group will meet on Fridays at 11:30AM (w
| November 8th @ 11:30 AM | Boyuan Zheng | Towards a Generalist Web Agent | [click here]({% link _posts/reading-group/fall-2024/2024-11-07-boyuan-zheng.md %}) |
| November 12th to 16th | **EMNLP 2024** | | |
| November 22nd @ 11:30 AM | William Held | Distilling an End-to-End Voice Assistant Without Instruction Training Data | [click here]({% link _posts/reading-group/fall-2024/2024-11-22-william-held.md %}) |
-| November 29th @ 11:30 AM | Luke Guerdan | *TBA* | *TBA* |
-| December 6th @ 11:30 AM | Amal Zouaq | *TBA* | *TBA* |
+| November 29th @ 11:30 AM | Luke Guerdan | Towards Principled Model Evaluation Under Imperfect "Ground Truth" Labels | [click here]({% link _posts/reading-group/fall-2024/2024-11-29-luke-guerdan.md %}) |
+| December 6th @ 11:30 AM | Poster presentations for [Comp 767](https://mcgill-nlp.github.io/teaching/comp767-ling782-F24/) | Various LLM-related topics - To be hosted at the Mila Agora | |

## History

40 changes: 40 additions & 0 deletions _posts/reading-group/fall-2024/2024-11-29-luke-guerdan.md
@@ -0,0 +1,40 @@
---
title: Towards Principled Model Evaluation Under Imperfect "Ground Truth" Labels
venue: Carnegie Mellon University
names: Luke Guerdan
author: Luke Guerdan
tags:
- NLP RG
categories:
- Reading-Group
- Fall-2024
layout: archive
classes:
- wide
- no-sidebar
---

*{{ page.names }}*

**{{ page.venue }}**

{% include display-publication-links.html pub=page %}

The [NLP Reading Group]({% link _pages/reading-group.md %}) is excited to host [Luke Guerdan](https://lukeguerdan.com/), a PhD student at CMU, who will be speaking remotely on Zoom on Friday, November 29th about "Towards Principled Model Evaluation Under Imperfect 'Ground Truth' Labels".


## Talk Description

In many evaluation contexts, “ground truth” labels are an imperfect proxy for the broader capabilities or limitations of interest, such as the “relevance” of retrieval-augmented generation (RAG) outputs or the “toxicity” of chatbot responses. How can we conduct statistically rigorous and informative performance evaluations under an imperfect gold standard?

In this talk, I begin by addressing this question in the context of predictive modeling for algorithmic decision support. I describe an approach that leverages structured human feedback in the form of expert anchor assumptions that better connect observable proxy labels to unobservable constructs of interest. I validate this approach theoretically, and empirically demonstrate that measurement error modeling is critical for learning reliable models. I conclude by illustrating that a similar approach is necessary when evaluating LLMs under violations of the “gold labels” assumption.

## Speaker Bio

Luke Guerdan is a Ph.D. student in the Human-Computer Interaction Institute at Carnegie Mellon University. His research focuses on developing tools to evaluate the capabilities and limitations of human-algorithmic systems under imperfect labels. Luke's work has been recognized with an ACM FAccT Best Paper Award and an NSF Graduate Research Fellowship.

## Logistics

Date: November 29th<br>
Time: 11:30 AM<br>
Location: Zoom (See email)
