Commit

Adding William Held (#434)
* Update reading-group.md

* Create 2024-11-22-william-held.md
oriern authored Nov 21, 2024
1 parent 3d04212 commit 1025b6a
Showing 2 changed files with 39 additions and 1 deletion.
2 changes: 1 addition & 1 deletion _pages/reading-group.md
Expand Up @@ -28,7 +28,7 @@ For the Fall 2024 semester, the reading group will meet on Fridays at 11:30AM (w
| November 1st @ 11:45 AM | Bang Liu | Applications and Enhancements of LLM Agents Across Diverse Environments | [click here]({% link _posts/reading-group/fall-2024/2024-11-1-bang-liu.md %}) |
| November 8th @ 11:30 AM | Boyuan Zheng | Towards a Generalist Web Agent | [click here]({% link _posts/reading-group/fall-2024/2024-11-07-boyuan-zheng.md %}) |
| November 12th to 16th | **EMNLP 2024** | | |
| November 22nd @ 11:30 AM | William Held | *TBA* | *TBA* |
| November 22nd @ 11:30 AM | William Held | Distilling an End-to-End Voice Assistant Without Instruction Training Data | [click here]({% link _posts/reading-group/fall-2024/2024-11-22-william-held.md %}) |
| November 29th @ 11:30 AM | Luke Guerdan | *TBA* | *TBA* |
| December 6th @ 11:30 AM | Amal Zouaq | *TBA* | *TBA* |

Expand Down
38 changes: 38 additions & 0 deletions _posts/reading-group/fall-2024/2024-11-22-william-held.md
@@ -0,0 +1,38 @@
---
title: "Distilling an End-to-End Voice Assistant Without Instruction Training Data"
venue: Georgia Tech
names: William Held
author: William Held
tags:
- NLP RG
categories:
- Reading-Group
- Fall-2024
layout: archive
classes:
- wide
- no-sidebar
---

*{{ page.names }}*

**{{ page.venue }}**

{% include display-publication-links.html pub=page %}

The [NLP Reading Group]({% link _pages/reading-group.md %}) is excited to host [William Held](https://williamheld.com/), a PhD student at Georgia Tech, who will be speaking remotely on Zoom on Friday, November 22nd about "Distilling an End-to-End Voice Assistant Without Instruction Training Data".


## Talk Description

In this talk, I'll cover the methods we used to train our Distilled Voice Assistant (DiVA) model. Recent efforts to simplify spoken NLP with end-to-end Speech Large Language Models (LLMs) trained with supervised finetuning (SFT) have led to models "forgetting" capabilities from text-only LLMs. Our work proposes an alternative paradigm for training Speech LLMs without instruction data, using the response of a text-only LLM to transcripts as self-supervision. Importantly, this process can be performed directly with ASR data. We show that our Distilled Voice Assistant (DiVA) generalizes to unseen tasks and improves user experience, achieving a 72% win rate compared with state-of-the-art open models like Qwen 2 Audio. Finally, I'll cover the open-source efforts we've made to support training and demoing Speech LLM systems.
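To give a rough intuition for the self-supervision idea described above: the speech model (student) sees audio, while a frozen text-only LLM (teacher) sees the transcript, and the student is trained to match the teacher's output distribution. The sketch below is a toy illustration of that distillation objective in plain Python, using a KL divergence between two softmax distributions; the logits, vocabulary size, and loss details here are illustrative assumptions, not the actual DiVA training setup.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q): how far the student distribution q is from the teacher p."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Teacher: a text-only LLM's next-token logits given the *transcript* (toy values).
teacher = softmax([2.0, 0.5, -1.0])

# Student: the speech LLM's next-token logits given the *audio* (toy values).
student = softmax([1.5, 0.7, -0.5])

# The distillation loss pushes the student toward the teacher's behavior,
# requiring only ASR-style (audio, transcript) pairs -- no instruction data.
loss = kl_divergence(teacher, student)
print(f"distillation loss: {loss:.4f}")
```

In an actual training loop this loss would be computed per token and backpropagated through the speech model only, with the text-only teacher kept frozen; see the paper for the real objective.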

## Speaker Bio

William Held is a Machine Learning PhD student at Georgia Tech, advised by Diyi Yang of the Stanford NLP Group. Before that, he was an early engineer at Sunshine. His research focuses on enabling inclusive language technology by modeling linguistic variation.

## Logistics

Date: November 22nd<br>
Time: 11:30 AM<br>
Location: Zoom (see email)
