more work
shuklabhay committed Aug 19, 2024
1 parent 3490335 commit 2661e10
Showing 2 changed files with 15 additions and 11 deletions.
22 changes: 13 additions & 9 deletions paper/main.md
@@ -1,21 +1,29 @@
# Kick it Out: Audio Generation With a Deep Convolution Generative Network
# Kick it Out: Generating Kick Drum Samples With a Deep Convolution Generative Network

Abhay Shukla\
[email protected]\
Continuation of UCLA COSMOS 2024 Research

## Abstract

Generative adversarial networks have been used with much success to generate images.

## Introduction

[what are kick drums]

Generative Adversarial Networks (GANs) have reshaped machine learning research, setting new standards in image generation [cite] and, more recently, in natural language and audio generation [cite]. Audio generative models often employ Deep Convolutional GANs (DCGANs) to create spectrogram representations of audio.

This work aims specifically to generate kick drums using a similar DCGAN approach.

This investigation aims to determine whether DCGANs can learn to recognize and replicate the spatial patterns and nonspatial distributions of a kick drum.

## Background

## Methodology

## Results

[compare with WaveGAN?]

## Discussion

## Conclusion
@@ -28,17 +36,13 @@ https://arxiv.org/abs/1511.06434
similar result to me
https://openaccess.thecvf.com/content_CVPR_2020/papers/Durall_Watch_Your_Up-Convolution_CNN_Based_Generative_Deep_Neural_Networks_Are_CVPR_2020_paper.pdf

also talk abt like wavenet as other ideas for models

i have to be doing something wrong. it has to be doable. quickly just check it all make sure theres noooothing more i can do bc im sure its possible just limitations here idk what else i can do to improve model or wtv. there has to be some way to improve at least get better, all the changes i made should be making it better bruh.

- go thru code and like clean up vars/make naming consistent (moreso helpers)
- TRY USING WASSERSTEIN LOSS SEE IF THERES ANY IMPROVEMENT IF THERE IS THEN HELL YEAH NEW NOVEL KICK DRUM GEN YAY!
- see if theres anything else i know that can be improved/possible source of error (prob not, but there has to be something it should be better w/ changes i made not "worse" its back to noise)
- at most spend today doing this but thats it. paper has to happen now.

STRUCTURE OF A PAPER (claude generated)

1.
1. title: done
2. Abstract: A brief summary of your paper, including the problem, methods, key results, and conclusions.
3. Introduction: Present the research problem, its importance, and your objectives.
4. Background/Literature Review: Provide context on deep convolution and its applications in audio generation. Review relevant previous work.
4 changes: 2 additions & 2 deletions src/utils/helpers.py
@@ -17,12 +17,12 @@
AUDIO_SAMPLE_LENGTH = 0.5 # 500 ms
GLOBAL_SR = 44100
N_CHANNELS = 2 # Left, right
-N_FRAMES = 352
+N_FRAMES = 176
N_FREQ_BINS = 257

# Initialize STFT Object
GLOBAL_WIN = 2**9
-GLOBAL_HOP = 2**6
+GLOBAL_HOP = 2**7
win = scipy.signal.windows.kaiser(GLOBAL_WIN, beta=14)
STFT = scipy.signal.ShortTimeFFT(
win=win, hop=GLOBAL_HOP, fs=GLOBAL_SR, scale_to="magnitude"
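
The `N_FRAMES` and `GLOBAL_HOP` values in this hunk change together, which suggests the frame count is derived from the hop size. The sketch below is one way to check that arithmetic in plain Python. It is not code from the repository: `expected_n_frames` is a hypothetical helper, and it assumes each frame is centred on a multiple of the hop and is counted whenever its window overlaps the signal. Under that assumption the two hop values reproduce the two `N_FRAMES` values shown here, and `N_FREQ_BINS` matches a one-sided FFT whose length equals the window length. If `ShortTimeFFT` is available, comparing against `STFT.p_num(n_samples)` would be a more direct check.

```python
# Constants copied from src/utils/helpers.py as shown in this diff.
AUDIO_SAMPLE_LENGTH = 0.5  # seconds (500 ms)
GLOBAL_SR = 44100
GLOBAL_WIN = 2**9


def expected_n_frames(n_samples: int, win_len: int, hop: int) -> int:
    """Count frames whose window overlaps the signal (hypothetical helper).

    Assumption: frame p is centred on sample p * hop and spans
    [p * hop - win_len // 2, p * hop + win_len // 2), with win_len even
    and no exactly-zero window samples.
    """
    half = win_len // 2
    # Smallest p with p * hop > -half: the window still reaches sample 0.
    first = -((half - 1) // hop)
    # Largest p with p * hop < n_samples + half: the window still starts
    # before the end of the signal.
    last = (n_samples + half - 1) // hop
    return last - first + 1


n_samples = int(AUDIO_SAMPLE_LENGTH * GLOBAL_SR)  # 22050 samples

print(expected_n_frames(n_samples, GLOBAL_WIN, 2**6))  # 352
print(expected_n_frames(n_samples, GLOBAL_WIN, 2**7))  # 176
print(GLOBAL_WIN // 2 + 1)  # 257 one-sided frequency bins
```

Doubling the hop halves the number of time frames while leaving the frequency resolution unchanged, so the spectrogram the network sees shrinks along the time axis only.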