Commit
1 parent
06a8dee
commit 264ec73
Showing 11 changed files with 41 additions and 31 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,17 +1,35 @@ | ||
# StereoSampleGAN (WIP) | ||
# StereoSampleGAN | ||
|
||
[![On Push](https://github.com/shuklabhay/stereo-sample-gan/actions/workflows/push.yml/badge.svg)](https://github.com/shuklabhay/stereo-sample-gan/actions/workflows/push.yml/badge.svg) | ||
|
||
StereoSampleGAN: A lightweight approach to high fidelity stereo audio sample generation. Generate a kick drum by running `generate.py`. | ||
StereoSampleGAN: A lightweight approach to high fidelity stereo audio sample generation. | ||
|
||
Generated audio spectrogram examples: | ||
![Audio Example 1](paper/static/generated_audio_example_1.png) | ||
![Audio Example 2](paper/static/generated_audio_example_2.png) | ||
![Audio Example 3](paper/static/generated_audio_example_3.png) | ||
## Model Usage | ||
|
||
1. Prereqs | ||
|
||
- Optional but highly recommended: Set up a [Python virtual environment](https://www.youtube.com/watch?v=e5GL1obY_sI). | ||
- Audio loader package `librosa` requires an outdated version of NumPy | ||
- Install requirements by running `pip3 install -r requirements.txt` | ||
|
||
2. Generate Audio | ||
|
||
- Specify usage parameters in `usage_params.py` | ||
- For `outputs/StereoSampleGAN-DiverseKick.pth`, `training_sample_length = 0.6` | ||
- For `outputs/StereoSampleGAN-Kick.pth`, `training_sample_length = 0.6` | ||
- Generate audio by running `python3 generate.py` | ||
|
||
3. Train model | ||
|
||
- Specify training data parameters in `usage_params.py` | ||
- Process training data by running `python3 encode_audio_data.py` | ||
- Train model by running `python3 stereo_sample_gan.py` | ||
|
||
## Directories | ||
|
||
- `paper`: Research paper and static images | ||
- `model`: Trained model and generated audio | ||
- `paper`: Research paper / model writeup | ||
- `static`: Static images | ||
- `outputs`: Trained model and generated audio | ||
- `src`: Model source code | ||
- `utils`: Model and data utilities | ||
- `data_processing`: Training data processing scripts |
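
The usage steps listed in the README hunk above can be sketched as an ordered command list. This is only an illustrative sketch, not part of the commit: `build_pipeline_commands` is a hypothetical helper, and it assumes the scripts are invoked from the directory that contains them, exactly as the README commands are written. It prints the commands rather than executing them.

```python
import shlex


def build_pipeline_commands(train: bool = False) -> list[list[str]]:
    """Return the README's commands in order (hypothetical helper).

    Installing requirements always comes first. The two training steps
    are included only when `train` is True. Generation comes last.
    """
    commands = [["pip3", "install", "-r", "requirements.txt"]]
    if train:
        # Step 3 in the README: process the training data, then train.
        commands.append(["python3", "encode_audio_data.py"])
        commands.append(["python3", "stereo_sample_gan.py"])
    # Step 2 in the README: generate audio with the selected model.
    commands.append(["python3", "generate.py"])
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so this is safe to try.
    for cmd in build_pipeline_commands(train=True):
        print(shlex.join(cmd))
```

Training is placed before generation here on the assumption that a freshly trained model is the one being generated with. When using one of the provided `.pth` files, `train=False` skips straight to generation.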
File renamed without changes.
Empty file.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,18 @@ | ||
# Processing training data | ||
training_audio_dir = "data/one_shots" # Your training data path | ||
training_audio_dir = "data/kick_samples_diverse" # Your training data path | ||
compiled_data_path = "data/compiled_data.npy" # Your compiled data/output path | ||
training_sample_length = 0.6 # seconds | ||
|
||
# Saving model | ||
outputs_dir = "outputs" # Where to save your generated audio & model | ||
model_save_name = "StereoSampleGAN-OldKick" # What to name your model save | ||
model_save_name = "StereoSampleGAN-DiverseKick" # What to name your model save | ||
model_save_path = f"{outputs_dir}/{model_save_name}.pth" | ||
|
||
# Generating audio | ||
model_to_generate_with = model_save_path # Generation model path | ||
audio_generation_count = 2 # Audio examples to generate | ||
generated_audio_name = "generated_audio" # Output file name | ||
generated_sample_length = 0.6 # Match model training data audio length | ||
visualize_generated = True # SHow generated audio spectrogra,s | ||
generated_sample_length = ( | ||
training_sample_length # Match model training data audio length | ||
) | ||
visualize_generated = True # Show generated audio spectrograms |
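
The hunk above replaces the hard-coded `generated_sample_length = 0.6` with a reference to `training_sample_length`, so the two values can no longer drift apart. The sketch below illustrates that relationship with values copied from the diff. The sample rate and the helpers `samples_for` and `lengths_match` are assumptions for illustration only and do not appear in the commit.

```python
import math

# Values copied from the usage_params.py hunk above.
training_sample_length = 0.6  # seconds
outputs_dir = "outputs"
model_save_name = "StereoSampleGAN-DiverseKick"
model_save_path = f"{outputs_dir}/{model_save_name}.pth"

# Derived rather than hard-coded, as in the new version of the file.
generated_sample_length = training_sample_length

# Assumption: the diff does not state the project's sample rate.
ASSUMED_SAMPLE_RATE = 44100  # Hz


def samples_for(length_seconds: float, sample_rate: int = ASSUMED_SAMPLE_RATE) -> int:
    """Convert a clip length in seconds to a per-channel sample count."""
    return round(length_seconds * sample_rate)


def lengths_match(train_len: float, gen_len: float) -> bool:
    """Check that generation and training clip lengths agree."""
    return math.isclose(train_len, gen_len)


if __name__ == "__main__":
    print(model_save_path)
    print(samples_for(generated_sample_length))
    print(lengths_match(training_sample_length, generated_sample_length))
```

Deriving one setting from the other means a change to `training_sample_length` carries over to generation automatically, which is the mismatch the README's per-model notes warn about.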