Code for paper "Retaining Semantics in Image to Music Conversion"
First, cd ./img2MIDI. We provide an example content input, Amily.jpeg, and a style input, style.mid (a piece of original work from my band).
For the Find Closest, Triad Mapping, and Range of Various Instruments approaches:
python image2MIDI.py <image_path> <interval> <chord_progress_type>
For the Color + Chord Progression approach:
python imageColor2MIDI.py <image_path> <interval> <chord_progress_type>
Tips:
- interval: the duration of each note, as a float.
- chord_progress_type: choose from the chord progression dictionary.
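To make the idea behind these scripts concrete, the sketch below maps the average hue of an image to a root pitch and writes a single triad to a MIDI file. It is only an illustration of the color-to-chord idea, not the code in image2MIDI.py or imageColor2MIDI.py; the use of Pillow and pretty_midi, the hue-to-pitch mapping, and the file names are assumptions.

```python
# Illustrative sketch only (not the repository's implementation):
# map an image's average hue to a root pitch and write one major triad.
import colorsys

from PIL import Image      # assumed dependency
import pretty_midi         # assumed dependency


def image_to_triad_midi(image_path, interval=0.5, out_path="sketch.mid"):
    # Average the image color and convert it to a hue in [0, 1).
    img = Image.open(image_path).convert("RGB").resize((64, 64))
    pixels = list(img.getdata())
    r, g, b = (sum(channel) / len(pixels) for channel in zip(*pixels))
    hue, _, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)

    # Map the hue onto a root pitch between C3 (48) and B4 (71).
    root = 48 + int(hue * 24)
    triad = [root, root + 4, root + 7]  # major triad: root, third, fifth

    # Write the triad as one chord lasting `interval` seconds.
    pm = pretty_midi.PrettyMIDI()
    piano = pretty_midi.Instrument(program=0)
    for pitch in triad:
        piano.notes.append(
            pretty_midi.Note(velocity=90, pitch=pitch, start=0.0, end=interval))
    pm.instruments.append(piano)
    pm.write(out_path)


if __name__ == "__main__":
    image_to_triad_midi("Amily.jpeg", interval=0.5)
```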
To show a histogram of a MIDI file:
cd img2MIDI
python histogram_mid.py <file>.mid
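If you just want a quick look at the pitch distribution without the repository script, a comparable plot can be produced with miditoolkit and matplotlib. This is a minimal sketch under those assumptions, not the contents of histogram_mid.py.

```python
# Sketch: pitch histogram of a MIDI file (assumes miditoolkit and matplotlib).
import sys

import matplotlib.pyplot as plt
from miditoolkit.midi.parser import MidiFile

midi = MidiFile(sys.argv[1])
pitches = [note.pitch for inst in midi.instruments for note in inst.notes]

plt.hist(pitches, bins=range(0, 129))
plt.xlabel("MIDI pitch")
plt.ylabel("Note count")
plt.title(sys.argv[1])
plt.show()
```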
First, provide a content input image.
Then try the interactive demo "BLIP: Inference Demo for image captioning". The only things we need to modify are uploading our image and replacing the image URL there.
This gives us the captioning result.
Next, feed the captioning result to a GPT-2 model; an interface can be found here.
Finally, we get the lyrics, which we feed into TeleMelody.
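If you would rather script these two steps than click through the demos, a rough equivalent can be built with the Hugging Face transformers pipelines. The checkpoint names and generation settings below are assumptions for illustration; the paper's pipeline uses the interactive demos linked above.

```python
# Sketch: image -> caption -> lyrics with Hugging Face pipelines.
# Checkpoint names are assumptions, not necessarily the ones used in the paper.
from transformers import pipeline

# Image captioning with a BLIP checkpoint.
captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")
caption = captioner("Amily.jpeg")[0]["generated_text"]
print("Caption:", caption)

# Continue the caption into lyrics with GPT-2.
generator = pipeline("text-generation", model="gpt2")
lyrics = generator(caption, max_length=100, num_return_sequences=1)[0]["generated_text"]
print("Lyrics:", lyrics)
```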
(1) Prepare the lyric-to-rhythm dataset. Several examples are provided in the directories data/en_example and data/zh_example. We can follow the pipeline described in the paper to build the training data.
(2) Train lyric-to-rhythm model.
cd training/lyric2rhythm/
bash train.sh data/example example 8192
(1) Prepare lmd-matched MIDI dataset.
cd training/template2melody/
wget http://hog.ee.columbia.edu/craffel/lmd/lmd_matched.tar.gz
tar -xzvf lmd_matched.tar.gz
(2) Generate training data and alignments.
python gen.py lmd_matched lmd_matched
python gen_align.py lmd_matched
(3) Train template-to-melody model.
bash preprocess.sh lmd_matched lmd_matched
bash train.sh lmd_matched
2.1 Modify miditoolkit to support Chinese lyrics.
(1)
git clone https://github.com/YatingMusic/miditoolkit.git
cd miditoolkit
(2) Modify miditoolkit/midi/parser.py.
raw:
318 def dump(self,
319 filename=None,
320 file=None,
321 segment=None,
322 shift=True,
323 instrument_idx=None):
...
371 midi_parsed=mido.MidiFile(ticks_per_beat=self.ticks_per_beat)
Modified:
318 def dump(self,
319 filename=None,
320 file=None,
321 segment=None,
322 shift=True,
323 instrument_idx=None,
324 charset ='latin1'):
...
372 midi_parsed=mido.MidiFile(ticks_per_beat=self.ticks_per_beat, charset=charset)
(3) Install miditoolkit.
pip uninstall miditoolkit
python setup.py install
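A quick way to confirm that the patched dump accepts the new charset argument is to write a small file containing a Chinese lyric event. This is a minimal sketch assuming the patch above has been installed; the file name is a placeholder.

```python
# Sketch: check that the patched dump() takes a charset argument.
from miditoolkit.midi.parser import MidiFile
from miditoolkit.midi.containers import Lyric

midi = MidiFile()                                  # empty MIDI object
midi.lyrics.append(Lyric(text="你好", time=0))      # a Chinese lyric event
midi.dump("zh_lyrics_test.mid", charset="utf-8")   # charset is the added argument
```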
2.2 Save checkpoints in checkpoints/{model_prefix} and the dictionary in data-bin/{model_prefix}.
2.3 Prepare word-level (EN) or character-level (ZH) lyrics in data/{lang}/{data_prefix}/lyric.txt and the chord progression in data/{lang}/{data_prefix}/chord.txt. For English lyrics, additionally prepare syllable-level lyrics in data/en/{data_prefix}/syllable.txt as the input of the lyric-to-rhythm model. Examples are provided in data/en/test/ and data/zh/test/.
2.4 Infer.
First, copy the files in ./training/xxx/data-bin/yyy to ./inference/data-bin/yyy, and copy the checkpoints from ./training/xxx/checkpoints/yyy to ./inference/checkpoints/yyy.
cd inference/
(EN):
python infer_en.py {en_lyric2rhythm_prefix} {template2melody_prefix} {en_data_prefix} {en_save_prefix}
(ZH):
python infer_zh.py {zh_lyric2rhythm_prefix} {template2melody_prefix} {zh_data_prefix} {zh_save_prefix}
Results are saved in the directory results/{save_prefix}/midi/.
First, cd ./groove2groove.
- Install the dependencies using one of the following options:
  - Create a new environment using conda:
    conda env create -f environment.yml
    This will also install the correct versions of Python and the CUDA and CuDNN libraries.
  - Using pip (a virtual environment is recommended):
    pip install -r requirements.txt
    You will need Python 3.6 because we use a version of TensorFlow which is not available from PyPI for more recent Python versions. The code has been tested with TensorFlow 1.12, CUDA 9.0 and CuDNN 7.6.0. Other versions of TensorFlow (1.x) may work too.
- Install the package with:
  pip install './code[gpu]'
cd data/synth
bash ./download.sh
bash ./prepare.sh
Afterwards there will be three sub-folders, train, test and val, along with the file groove2groove-data-v1.0.0.tar.gz.
To train the model on the synth dataset, run:
python -m groove2groove.models.roll2seq_style_transfer --logdir $LOGDIR train
Replace $LOGDIR with the model directory, containing the model.yaml configuration file (e.g. one of the directories under experiments).
To generate results with the trained model (place the content input MIDI and style input MIDI in the ./groove2groove folder):
python -m groove2groove.models.roll2seq_style_transfer --logdir ./experiments/v01 run-midi \
--sample --softmax-temperature 0.6 \
<content>.mid <style>.mid <output>.mid
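To connect this back to the earlier steps, the same run-midi command can also be launched from Python with subprocess, for example on the MIDI produced by image2MIDI.py together with the provided style.mid. The file names content.mid and output.mid below are placeholders.

```python
# Sketch: invoke the groove2groove style transfer from Python.
import subprocess

subprocess.run([
    "python", "-m", "groove2groove.models.roll2seq_style_transfer",
    "--logdir", "./experiments/v01", "run-midi",
    "--sample", "--softmax-temperature", "0.6",
    "content.mid", "style.mid", "output.mid",   # placeholders for your own files
], check=True)
```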
Thanks to these open-source repositories:
- Groove2Groove by Ondřej Cífka et al.
- Muzic by Microsoft Research Asia
- BLIP by Salesforce