Video: live demonstration (`live_viz-K309-2_light.mp4`)
Top: score annotations temporally aligned with the audio (each vertical span is labeled with an annotation; note the label names above the audio waveform)
Bottom: the original score, shown so that the music and the precise location of the annotations can be followed during live visualization
(higher-quality version, to see the label names more clearly: )
See the project's presentation
This repository proposes a pipeline to align an audio recording of a piece with its annotations from the DCML Mozart sonatas corpus [1], using the dynamic time warping (DTW) tools of Sync Toolbox [2].
The project was conducted at the Digital and Cognitive Musicology Lab at EPFL, led by Prof. Martin Rohrmeier, and supervised by Dr. Steffen Herff.
- aligner.py: script to perform alignment with the command line interface described below
- utils.py: module containing the alignment functionalities
- examples:
- notes and labels TSV files for the piece K309, movement 2, downloaded from the DCML Mozart sonatas corpus [1]. You can reproduce the alignment by providing an audio recording of K309-2 that you own to the command line interface.
- example of a CSV result output
- playground_notebook.ipynb: notebook that details the alignment pipeline, gives suggestions on how to adapt it to data outside the Annotated Mozart Sonatas corpus and shows how to evaluate the accuracy of the alignment result.
This repository currently runs with:
- python 3.9.7
- libfmp 1.2.2
- librosa 0.9.1
- numpy 1.22.2
- pandas 1.4.2
- scipy 1.8.0
- synctoolbox 1.2.0
- ms3 0.5.2
Install synctoolbox via the recommended procedure here.
First, download the Annotated Mozart Sonatas corpus (from the update branch).
Second, install the ms3 parser, navigate to the top level of the mozart_piano_sonatas repository, and generate the notes and labels files needed for audio-to-annotation alignment by running:
ms3 extract -N [folder_to_write_notes_file_to] -X [folder_to_write_labels_file_to] -q
This will provide additional quarterbeats information needed for alignment.
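As a quick sanity check before aligning, you can verify that the extracted TSVs actually carry the quarterbeat columns. The snippet below is a sketch using a made-up two-row excerpt; real ms3 files contain many more columns:

```python
import io
import pandas as pd

# Made-up excerpt of an ms3 notes TSV (real files have many more columns).
tsv = "quarterbeats\tduration_qb\tmidi\n0\t1.0\t60\n1\t0.5\t64\n"
notes = pd.read_csv(io.StringIO(tsv), sep="\t")

# The alignment pipeline relies on these two columns being present.
missing = {"quarterbeats", "duration_qb"} - set(notes.columns)
print(sorted(missing))  # → []
```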
Once all file locations are identified, you can run:
python aligner.py -a [audio_WAV_file] -n [notes_TSV_file] -l [labels_TSV_file] -o [CSV_file_to_write_results_to]
This default command stores a CSV file with minimal information, i.e. labels and their corresponding timestamps, which is convenient for visualization (e.g. with Sonic Visualiser).
[options to be detailed]
A notebook is provided to give a detailed understanding of the alignment pipeline. It lets one inspect the behaviour of the pipeline's steps independently, visualizes some of them, and shows how the pipeline could be adapted to data outside the Annotated Mozart Sonatas dataset.
For label visualization purposes, only the `compact` mode is needed when running `aligner.py`.
- Open Sonic Visualiser and open the audio file used for the alignment.
- Click File/Import Annotation Layer (Ctrl+L) and open the alignment result CSV file.
- Identify the columns and their heading category: Time for `start` and Label for `label`. Make sure to check "First row contains column headings".
- You can change display settings in the Property Boxes of the right pane with View/Show Property Boxes (X). For example, select Segmentation as the Plot Type of the annotation layer.
- Open the score in MuseScore. With Sonic Visualiser and MuseScore side by side, you can follow the temporal alignment of the labels on both the audio and the score.
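For reference, a minimal CSV with the shape this import expects (a header row plus `start`/`label` columns) can be produced as follows; the timestamp and label values here are made up:

```python
import csv
import io

# Header row first, so "First row contains column headings" can be ticked;
# then one row per annotation: a start time in seconds and a label.
rows = [("start", "label"), ("0.000", "I"), ("2.370", "V7"), ("4.810", "I6")]

buf = io.StringIO()
csv.writer(buf, lineterminator="\n").writerows(rows)
print(buf.getvalue())
```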
- Notes or labels TSV files may contain unexpected additional fields
  - Solution:
    - identify them all and add them to `utils.py/align_warped_notes_labels`
    - rewrite `utils.py/align_warped_notes_labels` so that it is less (or not) sensitive to column lists
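One way to make the processing less sensitive to unexpected fields is to whitelist the columns actually needed instead of enumerating every possible extra. A sketch; the column set and helper name here are illustrative assumptions, not the repository's real API:

```python
import pandas as pd

def keep_known_columns(df: pd.DataFrame, required: list) -> pd.DataFrame:
    """Keep only the columns the alignment needs, dropping any extras."""
    missing = [c for c in required if c not in df.columns]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    return df[required]

# Example with a made-up frame carrying an unexpected extra field.
df = pd.DataFrame({"quarterbeats": [0, 1], "label": ["I", "V"], "extra": [0, 0]})
slim = keep_known_columns(df, ["quarterbeats", "label"])
print(list(slim.columns))  # → ['quarterbeats', 'label']
```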
- Has not been tested on pieces containing repetitions
  - Solution:
    - Test on pieces with repeats (this should not give conclusive results, even if no errors are raised)
    - Investigate the DTW improvements proposed in Automatic Alignment of Music Performances with Structural Differences [3]
- Data outside the Annotated Mozart Sonatas corpus does not come with the `quarterbeats` running index
  - Solution:
    - Provide a function to compute a running index similar to the AMS's `quarterbeats` (and the corresponding `duration_qb`) for data that only contains bar positions, beat-within-bar positions, and time signatures
[1] Hentschel, J., Neuwirth, M. and Rohrmeier, M., 2021. The Annotated Mozart Sonatas: Score, Harmony, and Cadence. Transactions of the International Society for Music Information Retrieval, 4(1), pp.67–80. [https://github.com/DCMLab/mozart_piano_sonatas]
[2] Müller, M., Özer, Y., Krause, K., Prätzlich, T., Driedger, J., and Zalkow, F., 2021. Sync Toolbox: A Python Package for Efficient, Robust, and Accurate Music Synchronization. Journal of Open Source Software (JOSS), 6(64). [https://github.com/meinardmueller/synctoolbox]
[3] Grachten, M., Gasser, M., Arzt, A., and Widmer, G., 2013. Automatic Alignment of Music Performances with Structural Differences. Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), pp. 607–612