Any release date? #1
Comments
Thanks! Great paper already! PS: I've made a quick Google Colab adaptation for inference: https://github.com/jarredou/larsnet-colab
Hi @jarredou, we developed LarsNet using an early version of StemGMD, so it will take us some time to refactor the code so that it works with the off-the-shelf version available on Zenodo. We plan to release the training code soon; I am sure it will be available by the time the article is published. In the meantime, we added a section to the README. Thanks for the Colab, it's a great idea!
The inference speed is really mind-blowing, even on CPU; that's really amazing, congrats for that! About the quality: do you think the baseline models would perform better with more epochs? Seen from the outside, 22 epochs seems quite low, and some projects like drumsep (Demucs-based, with a smaller, private dataset but more sound diversity) are getting quite good results, probably with more training epochs per model. What do you think?
Hi @jarredou, we process 110k clips per epoch; with a batch size of 24, this corresponds to just over 4500 batches. This means that each U-Net model is trained for about 100k steps, which is pretty standard. After 100k steps, the validation loss had already stopped decreasing, so I reckon we'd need more than an increased number of epochs to improve the output quality. We already have a few ideas for a v2: most importantly, adding synthetic drums to the dataset, but also improving robustness to stereo imaging, which we noticed can sometimes cause problems. Which artifacts are you most concerned with? I'll try and take a look.
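For reference, here is a quick back-of-the-envelope check of the training budget described above (a sketch using the figures quoted in this comment; the exact values in the released training code may differ):

```python
# Rough check of the training budget quoted above (110k clips/epoch, batch size 24, 22 epochs).
clips_per_epoch = 110_000
batch_size = 24
epochs = 22

batches_per_epoch = clips_per_epoch // batch_size  # ~4583 batches per epoch
total_steps = batches_per_epoch * epochs           # ~100k optimizer steps per U-Net

print(batches_per_epoch, total_steps)  # 4583 100826
```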
I can't speak for everybody, but for most of my own use cases, I prefer separations with occasional bleed but a full-sounding target stem over separations with no bleed but missing content in the target stem (with an "underwater"-like sound in some parts). Using Demucs like drumsep did is a good idea because, until recently (see next message), Demucs was the best open-source architecture for separating drums from a full mixture, better than KUIELab's TFC-TDF-Net when trained on the same dataset. (The original drumsep model download link is dead, but it was shared later in inagoy/drumsep#3 (comment); it was a student project, and there is no publication related to it.)
Side note: lucidrains has open-sourced SAMI-ByteDance's work (which is the current SOTA in music source separation, by quite a big margin). You may also find this work interesting, aimed at enhancing source-separated audio with a GAN: https://github.com/interactiveaudiolab/MSG
Sure thing! Demucs is arguably a better architecture than our Spleeter-like model. Nevertheless, at this point we mainly wanted to showcase StemGMD by releasing a baseline for future research, which is why we decided to start from a simpler architecture. We will try better architectures as we go forward! As for bleed vs. hard-separation artifacts, you may want to play around with the α-Wiener filter. We noticed that choosing α < 1 may sometimes lead to more natural-sounding stems while allowing for more bleed. You can try and specify the corresponding option.
The best α really depends on the input track, but it's worth trying different values, as doing so may produce more appealing results.
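To make the α discussion concrete, here is a minimal sketch of α-Wiener soft masking, assuming you already have per-stem magnitude estimates (e.g., from the U-Net models) and the complex STFT of the mixture. The function and variable names are illustrative, not LarsNet's actual API:

```python
# Minimal sketch of alpha-Wiener filtering for stem separation.
# Assumes: stem_magnitudes has shape (n_stems, freq, time) with non-negative values,
# and mixture_stft is the complex STFT of the mix with shape (freq, time).
import numpy as np

def alpha_wiener_masks(stem_magnitudes, alpha=1.0, eps=1e-8):
    """Soft masks |s_i|^alpha / sum_j |s_j|^alpha for each stem.

    alpha < 1 tends to give fuller-sounding stems with more bleed;
    alpha > 1 gives harder separation with more suppression artifacts.
    """
    powered = np.power(stem_magnitudes, alpha)
    return powered / (powered.sum(axis=0, keepdims=True) + eps)

def separate_with_alpha(mixture_stft, stem_magnitudes, alpha=1.0):
    """Apply the masks to the complex mixture STFT to get per-stem STFTs."""
    masks = alpha_wiener_masks(stem_magnitudes, alpha)
    return masks * mixture_stft[None, ...]  # broadcast over the stem axis

# Usage idea: compare a softer and a harder separation of the same track,
# where `estimates` are the model's magnitude outputs (hypothetical variable names).
# softer = separate_with_alpha(mixture_stft, estimates, alpha=0.8)
# harder = separate_with_alpha(mixture_stft, estimates, alpha=1.5)
```

The masked STFTs would then be inverted back to waveforms with your STFT library of choice; only the mask exponent changes between the "more bleed" and "harder separation" settings.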