[Audio] Qwen Audio Example #1082

Draft: wants to merge 24 commits into main

kylesayrs
Collaborator

@kylesayrs kylesayrs commented Jan 19, 2025

Purpose

  • Demonstrate support for compressing audio models through examples

Prerequisites

Changes

  • Examples
    • examples/multimodal_audio/whisper_example.py
    • examples/multimodal_audio/qwen2_audio_example.py
  • Traceable definitions
    • TraceableWhisperForConditionalGeneration
    • TraceableQwen2AudioForConditionalGeneration
  • Add support for the special case where the processor accepts only `**kwargs`, as is the case for the Whisper processor -_-
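
A rough sketch of the `**kwargs` special case from the last bullet, for readers who have not hit it before. This is not the code in this PR: `filter_processor_kwargs` and the two stand-in processors are hypothetical names made up for illustration. The idea is that filtering inputs by parameter name breaks down when a processor's call signature exposes only `**kwargs`, so in that case everything is passed through.

```python
import inspect
from typing import Any, Callable, Dict


def filter_processor_kwargs(
    processor_call: Callable[..., Any], data: Dict[str, Any]
) -> Dict[str, Any]:
    """Hypothetical helper: keep only the entries of `data` the callable accepts."""
    params = list(inspect.signature(processor_call).parameters.values())

    # Special case: a `**kwargs` parameter means names cannot be checked
    # up front, so forward every entry unchanged.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params):
        return dict(data)

    # Otherwise keep only entries that match an explicitly named parameter.
    named = {
        p.name
        for p in params
        if p.kind
        in (inspect.Parameter.POSITIONAL_OR_KEYWORD, inspect.Parameter.KEYWORD_ONLY)
    }
    return {key: value for key, value in data.items() if key in named}


# Stand-ins for real processors (assumptions, not the transformers API).
def text_only_processor(text, padding=False):
    return {"input_ids": [len(text)]}


def kwargs_only_processor(**kwargs):
    return dict(kwargs)


sample = {"text": "hello", "audio": [0.1, 0.2]}
text_inputs = filter_processor_kwargs(text_only_processor, sample)
audio_inputs = filter_processor_kwargs(kwargs_only_processor, sample)
```

With the stand-ins above, the named-parameter processor receives only `text`, while the `**kwargs`-only processor receives both `text` and `audio`.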

TODO

  • Qwen Audio

Testing

Signed-off-by: Kyle Sayers <[email protected]>
@vllm-project vllm-project deleted a comment from github-actions bot Jan 19, 2025
@kylesayrs kylesayrs changed the title Audio Examples [Audio] Support and Examples Jan 19, 2025
mgoin pushed a commit that referenced this pull request Jan 22, 2025
## Purpose ##
* Support oneshot with audio datasets

## Changes ##
* Extend `apply_pad_mask_to_batch` to handle cases where there are no
`input_ids` and where there might be `decoder_input_ids`
* Extend `TextGenerationDataset` to detect if a dataset is already
tokenized based on `processor.model_input_names` rather than only
`input_ids`
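
A minimal sketch of the two changes listed above, assuming plain Python lists in place of tensors. Both function names are hypothetical and are not the actual `apply_pad_mask_to_batch` or `TextGenerationDataset` implementations; the `any` membership check and the pairing of `decoder_input_ids` with `decoder_attention_mask` are also assumptions made for illustration.

```python
from typing import Any, Dict, Iterable


def is_already_tokenized(
    column_names: Iterable[str], model_input_names: Iterable[str]
) -> bool:
    """Treat a dataset as tokenized if any processor input name is a column."""
    columns = set(column_names)
    return any(name in columns for name in model_input_names)


def apply_pad_mask(batch: Dict[str, Any]) -> Dict[str, Any]:
    """Zero out padded token ids for whichever id fields are present."""
    masked = dict(batch)
    pairs = (
        ("input_ids", "attention_mask"),
        ("decoder_input_ids", "decoder_attention_mask"),  # assumed pairing
    )
    for ids_key, mask_key in pairs:
        # Audio-only batches may carry neither field; skip them untouched.
        if ids_key in batch and mask_key in batch:
            masked[ids_key] = [
                token * keep for token, keep in zip(batch[ids_key], batch[mask_key])
            ]
    return masked


# A Whisper-style dataset has `input_features` but no `input_ids`.
whisper_columns = ["input_features", "labels"]
tokenized = is_already_tokenized(whisper_columns, ["input_features"])

text_batch = apply_pad_mask({"input_ids": [5, 6, 7], "attention_mask": [1, 1, 0]})
audio_batch = apply_pad_mask({"input_features": [0.5, 0.25]})
```

Checking against `processor.model_input_names` is what lets a dataset with only `input_features` count as preprocessed, which a check on `input_ids` alone would miss.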

## Testing ##
* Ran `test_processors.py` to completion, which verifies that the
`model_input_names` attribute is defined for most processors
* Ran whisper to completion in
#1082

<details><summary>test_processors.py</summary>

```python3
import pytest
from transformers import AutoProcessor

@pytest.mark.parametrize(
    "model_id,expected",
    [
        ("meta-llama/Meta-Llama-3-8B-Instruct", ["input_ids", "attention_mask"]),
        ("mistralai/Mixtral-8x7B-Instruct-v0.1", ["input_ids", "attention_mask"]),
        (
            "Qwen/Qwen2-VL-2B-Instruct",
            [
                "input_ids",
                "attention_mask",
                "pixel_values",
                "image_grid_thw",
                "pixel_values_videos",
                "video_grid_thw",
            ],
        ),
        ("mgoin/pixtral-12b", ["input_ids", "attention_mask", "pixel_values"]),
        ("openai/whisper-large-v2", ["input_features"]),
        (
            "Qwen/Qwen2-Audio-7B-Instruct",
            ["input_ids", "attention_mask", "input_features", "feature_attention_mask"],
        ),
    ],
)
def test_processor_model_input_names(model_id, expected):
    """
    Tests the model_input_names attribute of common model processors
    """

    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
    assert processor.model_input_names == expected
```
</details>

---------

Signed-off-by: Kyle Sayers <[email protected]>
Co-authored-by: Dipika Sikka <[email protected]>
@kylesayrs kylesayrs changed the title [Audio] Support and Examples [Audio] Whisper and Qwen Examples Jan 25, 2025
@kylesayrs kylesayrs force-pushed the kylesayrs/audio_examples branch from 8897342 to f1bd1d2 Compare January 25, 2025 04:53
@kylesayrs kylesayrs changed the base branch from main to kylesayrs/whisper_audio_example January 28, 2025 21:04
@kylesayrs kylesayrs changed the base branch from kylesayrs/whisper_audio_example to main January 28, 2025 21:04
@kylesayrs kylesayrs changed the title [Audio] Whisper and Qwen Examples [Audio] Qwen Audio Example Jan 28, 2025