TrOMR: Transformer-based Polyphonic Optical Music Recognition

liebharc/Polyphonic-TrOMR

 
 


This code was used in the experiments for the paper "TrOMR: Transformer-Based Polyphonic Optical Music Recognition". The training code will be open-sourced later.

Introduction

Optical music recognition (OMR) provides an intelligent and efficient way to digitize paper scores, and can be widely used in music teaching assistance, music search, derivative music creation, and so on. We propose TrOMR, a transformer-based approach with strong global perceptual capability for end-to-end polyphonic OMR. Extensive experiments demonstrate that TrOMR outperforms current OMR methods, especially in real-world scenarios.

Dataset

The images in MSD are electronic sheet music with clear symbols. The images in CMSD-P are printed and then photographed, so the symbols are rather blurred and the lines are jagged. The images in CMSD-S, photographed from a screen, are slightly blurred and show heavy moiré patterns. The effects of lighting and camera jitter are also simulated. These operations bring our data very close to real application scenarios.

Experiment

To facilitate direct comparison, the model outputs are visualized and the errors are marked with red boxes. TrOMR achieves higher recognition accuracy than the baseline in areas with dense symbols and in areas further away from the staff.

Preparation

  pip install -r requirements.txt

Inference

  python ./tromr/inference.py ./examples/photo4.jpg
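To process several images, the command above can be wrapped in a small script. The following is a minimal sketch, not part of this repository: `build_command` and `run_on_folder` are hypothetical helper names, and the sketch assumes only that `inference.py` accepts a single image path as shown above. It does not assume anything about what the script prints; it simply captures whatever is written to standard output.

```python
import subprocess
import sys
from pathlib import Path
from typing import Dict, List

# Path to the inference script, as used in the command above.
INFERENCE_SCRIPT = "./tromr/inference.py"


def build_command(image_path: str) -> List[str]:
    """Build the same command line as above for one image (hypothetical helper)."""
    return [sys.executable, INFERENCE_SCRIPT, image_path]


def run_on_folder(folder: str, pattern: str = "*.jpg") -> Dict[str, str]:
    """Run inference on every matching image and collect the captured stdout.

    Assumption: the script is run from the repository root, once per image.
    """
    outputs: Dict[str, str] = {}
    for image in sorted(Path(folder).glob(pattern)):
        completed = subprocess.run(
            build_command(str(image)),
            capture_output=True,
            text=True,
        )
        # A non-zero return code means the script failed for this image.
        if completed.returncode != 0:
            print(f"inference failed for {image}: {completed.stderr.strip()}")
            continue
        outputs[str(image)] = completed.stdout
    return outputs
```

For example, calling `run_on_folder("./examples")` from the repository root would run the script once per `.jpg` file in that folder.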

Demonstrations

Each result below is shown in the following order: input image, output visualization, and predicted encoding.

Result1

clef-F4+keySignature-CM+note-E3_eighth.|note-C4_eighth.+note-E3_sixteenth|note-D4_sixteenth+note-E3_eighth.|note-E4_eighth.+note-E3_sixteenth|note-F4_sixteenth+note-E3_eighth.|note-C4_eighth.+note-E3_sixteenth|note-B3_sixteenth+note-E3_eighth.|note-D4_eighth.+note-E3_sixteenth|note-C4_sixteenth+barline+note-D3_eighth.|note-F3_eighth.|note-C4_whole+note-D3_sixteenth|note-F3_sixteenth+note-D3_eighth.|note-F3_eighth.+note-D3_sixteenth|note-F3_sixteenth+note-D3_half|note-F3_half+barline+note-D3_eighth.|note-B3_eighth.+note-E3_sixteenth|note-B3_sixteenth+note-F3_eighth.|note-B3_eighth.+note-G3_sixteenth|note-B3_sixteenth+note-B3_quarter|note-D4_quarter+note-G3_quarter|note-B3_quarter+barline

Result2

clef-G2+keySignature-CM+note-F4_quarter|note-A4_quarter|note-C5_quarter+note-F4_quarter|note-A4_quarter|note-C5_quarter+note-F4_quarter|note-A4_quarter|note-C5_quarter+note-F4_quarter|note-A4_quarter|note-C5_quarter+barline+note-F4#_whole|note-D5_quarter+note-D5_quarter+note-D5_quarter+note-D5_quarter+barline+note-F4#_quarter|note-D5_quarter+note-E4_quarter|note-C5_quarter+note-D4_quarter|note-B4_quarter+note-C4_quarter|note-A4_quarter+barline+note-B3_half|note-A4_half+note-B3_half|note-G4_half+barline+note-B3_half|note-F4_half|note-G4_half+note-B3_half|note-F4_half|note-G4_half+barline+note-C4_whole|note-E4_whole+barline

Result3

clef-G2+keySignature-DM+note-A2_eighth|note-C4N_quarter+note-A2_sixteenth|note-A3_sixteenth+note-A2_sixteenth|note-A3_sixteenth|note-C5N_sixteenth+note-E3_sixteenth|note-A3_eighth|note-C5_sixteenth+note-A2#_eighth.|note-E3_eighth.+note-C4_eighth+note-A2_sixteenth|note-C4_eighth+note-D2#_eighth.|note-A2_eighth.+note-A2N_quarter.+note-D2N_eighth|note-A3_eighth+note-D2_eighth|note-A3_eighth|note-B3_eighth+barline+note-D2_eighth|note-A2_quarter+note-D4#_eighth|note-F4_eighth+note-C4N_half|rest-sixteenth+note-A2_sixteenth+note-A2_sixteenth|note-A2#_sixteenth+note-A2_sixteenth|note-A2_sixteenth|note-D4_sixteenth+note-F2_quarter|note-A2_quarter|note-D4_quarter+note-F2_sixteenth|note-A2N_quarter+note-F2_eighth.|note-A3_eighth.|note-B3_eighth.+barline

Result4

clef-G2+keySignature-DM+note-C2N_quarter|note-F2_eighth|note-C3_eighth|note-A3_eighth+note-F2_sixteenth|note-C3_sixteenth|note-A3#_sixteenth+note-F2_sixteenth|note-C3_sixteenth|note-A3_sixteenth|note-F4N_sixteenth+note-C3_eighth|note-A3_eighth|note-F4_eighth|note-A4#_quarter+note-C3_eighth|note-G4_eighth+note-A2_quarter|note-C3_eighth|note-G4_eighth+note-C2#_quarter|note-C3_quarter+note-A3_quarter+note-A2_eighth|note-C3_eighth+barline+note-E5_sixteenth|rest-eighth+note-A2_eighth.+note-A2_quarter+note-F2_eighth|note-B2_eighth+note-F2_eighth|note-B2_eighth|note-A3_eighth|note-E5_eighth+note-A2_eighth|note-A2_quarter|note-D3_eighth|note-E5_eighth+note-F2_eighth|note-B2_eighth|note-A3_eighth|note-E5_eighth+note-B2_sixteenth|note-A4_sixteenth+note-B2_eighth.|note-A3_eighth.|note-A4_eighth.+note-E5_eighth+barline
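The predicted encodings above can be split into structured symbols for further processing. The sketch below is an interpretation inferred from the sample outputs, not a documented specification: it assumes that `+` separates successive positions in the score, that `|` separates symbols sounding at the same position, and that each symbol has the form `kind-value_duration` (for example `note-E3_eighth.`). The names `Symbol`, `parse_symbol`, and `parse_prediction` are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Symbol(NamedTuple):
    kind: str                 # e.g. "clef", "keySignature", "note", "rest", "barline"
    value: Optional[str]      # e.g. "F4", "CM", "E3", "F4#"; None for rests and barlines
    duration: Optional[str]   # e.g. "eighth.", "quarter"; None when not applicable


def parse_symbol(token: str) -> Symbol:
    """Parse one symbol such as 'note-E3_eighth.' or 'barline'."""
    kind, _, rest = token.partition("-")
    if not rest:
        # Symbols without a payload, such as 'barline'.
        return Symbol(kind, None, None)
    if kind == "note":
        # Assumed layout: pitch (with optional accidental) then '_' then duration.
        pitch, _, duration = rest.partition("_")
        return Symbol(kind, pitch, duration or None)
    if kind == "rest":
        # Rests carry only a duration, e.g. 'rest-sixteenth'.
        return Symbol(kind, None, rest)
    # Clefs and key signatures carry only a value, e.g. 'clef-G2'.
    return Symbol(kind, rest, None)


def parse_prediction(text: str) -> List[List[Symbol]]:
    """Split a prediction into positions ('+') and simultaneous symbols ('|')."""
    return [
        [parse_symbol(token) for token in group.split("|")]
        for group in text.strip().split("+")
        if group
    ]
```

Applied to the start of Result1, `parse_prediction("clef-F4+keySignature-CM+note-E3_eighth.|note-C4_eighth.")` yields three positions, the last of which holds two simultaneous dotted eighth notes.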
