
[WIP] Online beat trackers #314
Open
SebastianPoell wants to merge 3 commits into main
Conversation

@SebastianPoell (Contributor) commented Jul 17, 2017:

This PR extends the BeatTracker with online capability, based on #292.
It also adds a simple CCF-based beat detection method for experimentation.
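Roughly, the intended usage looks like the following sketch (the online flag, process_online() and reset() are the entry points discussed in the review below; their exact signatures may still change):

# Sketch only: `online=True` and `process_online()` are introduced by this PR
# and may still change; RNNBeatProcessor is the existing activation frontend.
import numpy as np
from madmom.features.beats import RNNBeatProcessor, BeatTrackingProcessor

act = RNNBeatProcessor()('some_song.wav')
tracker = BeatTrackingProcessor(fps=100, online=True)

# offline-style call: all activations at once, internal state reset first
beats = tracker.process_online(act, reset=True)

# true online operation: feed the activations frame by frame, keep the state
tracker.reset()
for frame in act:
    beats = tracker.process_online(np.atleast_1d(frame), reset=False)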

@superbock (Collaborator) left a comment:

Nice work! I am not sure about some of the defaults (e.g. the 5-frame look-back or the threshold of 0.1); these should at least be documented somehow or reconsidered completely -- e.g. the 5 frames could maybe be expressed as the minimum interval or something like that.
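For illustration, one way the look-back could be tied to the tempo range instead of being a fixed 5 frames (just a sketch; fps and max_bpm are assumed attributes, not names from this PR):

# Sketch: derive the look-back window from the shortest expected beat
# interval instead of hard-coding 5 frames. `fps` / `max_bpm` are assumptions.
def look_back_frames(fps, max_bpm, fraction=0.25):
    min_interval = 60. * fps / max_bpm   # shortest beat interval in frames
    return max(1, int(round(fraction * min_interval)))

# e.g. fps=100, max_bpm=215 -> min_interval ~ 28 frames -> look back 7 frames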

buffer = self.buffer(activation)
# create an online interval histogram
histogram = self.tempo_estimator.online_interval_histogram(
activation, reset=False)
@superbock (Collaborator):

You should pass the reset argument to the tempo_estimator as well, i.e. reset=reset. Otherwise the beat tracker resets its state, but the tempo estimation still keeps its old state.
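A sketch of the suggested change (the surrounding method body is omitted):

buffer = self.buffer(activation)
# create an online interval histogram, propagating the caller's reset flag
histogram = self.tempo_estimator.online_interval_histogram(
    activation, reset=reset)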

@SebastianPoell (Contributor Author), Jul 20, 2017:

Hmm, actually I think it's okay the way it is? The tempo_estimator gets reset inside the reset() method of the BeatTracker, which may be called whenever process_online() is called.

If I want to compute a whole song, then process_online() gets called only once with all the activations and reset=True. If I used reset=reset here, then while iterating through all activations it would reset the tempo_estimator for every single activation.

@superbock (Collaborator):

Ok, I missed the resetting inside reset(). Anyway, I still think passing reset=reset is a cleaner solution than calling self.tempo_estimator.reset(). When iterating over activations frame by frame, reset must be set to False of course -- and this is exactly what is done in online mode.

@SebastianPoell (Contributor Author):

Ah sorry, I forgot to pass the online=True parameter to process_single() (I'm calling stuff with a script). Never mind, it works now ;)

activation, reset=False)
# get the dominant interval
interval = self.tempo_estimator.dominant_interval(histogram)
# compute current and next beat times
@superbock (Collaborator):

This comment is a bit misleading, especially the part about the next beat. If I get it right, it is the earliest position at which the next beat can possibly occur, right?

# for every frame the tempo may change and therefore the last beat
# can easily be missed.
if len(detections) > 0 and detections[-1] >= len(buffer) - 5 and \
buffer[detections[-1]] > self.threshold:
@superbock (Collaborator):

The indentation is weird and the if-statement is very long. I'd prefer to have multiple logical expressions and then test for them, e.g.:

if detections:
    cond_1 = detections[-1] >= len(buffer) - 5
    cond_2 = buffer[detections[-1]] > self.threshold
    if cond_1 and cond_2:
        ...

Of course, more meaningful names would be nice, but you get the idea. What I still don't like are hard-coded values such as the 5. Why 5 frames?

@SebastianPoell (Contributor Author), Jul 19, 2017:

I just tested different values here and this one seemed to work best. But I know what you mean... maybe I can find something more rational. For the threshold, the intention was that we probably don't want false positive predictions on silent signals. This also seems to improve the dataset results a bit.
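For illustration, the kind of gating the threshold provides (a minimal sketch, not the code in this PR):

import numpy as np

# Minimal sketch: drop detections whose activation value stays below the
# threshold, so silent passages do not produce false positive beats.
def gate_detections(detections, buffer, threshold=0.1):
    detections = np.asarray(detections, dtype=int)
    return detections[buffer[detections] > threshold]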

@superbock (Collaborator):

I was never questioning the rationale behind using a threshold, e.g. DBNBeatTrackingProcessor uses it for the very same reason 😏. Back then I added the threshold for the DBN stuff and did not back-port it to the other methods. Maybe we should do so now and add it to the other methods (after evaluating its positive effect, of course).

@superbock (Collaborator):

Some more notes regarding the threshold:

  • What I do is determine the first/last frame where the activation is above/below the threshold and use only the segment in between (see the sketch after this list). Although this works quite nicely, I am not sure whether this isn't simply some sort of adaptation towards the annotations...
  • The same could be done for multiple segments, not just one per song. Although this sounds logical and straightforward, I am not sure if this segmentation helps in general. We must be aware of the many songs which have segments without any perceivable beats but where counting continues. For those cases, splitting into segments doesn't make sense at all.
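A minimal sketch of the trimming described in the first note (illustrative only, not the actual madmom code):

import numpy as np

# Keep only the segment between the first and last frame whose activation
# exceeds the threshold; also return the offset of that segment so detected
# beat frames can be mapped back to positions in the original sequence.
def trim_to_active_segment(activations, threshold=0.1):
    above = np.nonzero(activations > threshold)[0]
    if not len(above):
        return activations[:0], 0
    first, last = above[0], above[-1]
    return activations[first:last + 1], first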

@SebastianPoell (Contributor Author):

Ah ok, cool! Yeah, using only "active" segments seems like a good idea. Maybe the distance between those active segments gives a hint as to whether tracking should stop or not.
For online mode it's a bit different anyway, because we don't know how long the break will be, so we have to decide on the spot (or maybe with a bit of delay).

# display tempo
display.append('| %5.1f | ' % float(self.fps * 60 / interval))
sys.stderr.write('\r%s' % ''.join(display))
sys.stderr.flush()
@superbock (Collaborator), Jul 19, 2017:

We should consider refactoring the visualisation stuff as a mixin which provides a visualize() method. BeatTrackingProcessor and DBNBeatTrackingProcessor could then inherit from this mixin and simply call visualize() whenever needed. The mixin should have/keep its own state and counters. Some more notes: we should consider using constants to define how long the beats are displayed (e.g. self.beat_counter = DISPLAY_DURATION) or the inertia of the RNN beat activation strength (e.g. DISPLAY_INERTIA).
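A rough sketch of what such a mixin could look like (class, method and constant names are placeholders, not existing madmom API):

import sys

DISPLAY_DURATION = 10   # frames a detected beat stays highlighted
DISPLAY_INERTIA = 0.9   # decay applied to the displayed activation strength

class OnlineVisualizationMixin(object):
    """Keeps its own display state; trackers call visualize() when needed."""

    def __init__(self, *args, **kwargs):
        super(OnlineVisualizationMixin, self).__init__(*args, **kwargs)
        self.beat_counter = 0
        self.strength = 0.

    def visualize(self, activation, beat, tempo=None):
        # let the displayed activation strength decay slowly
        self.strength = max(float(activation), self.strength * DISPLAY_INERTIA)
        if beat:
            self.beat_counter = DISPLAY_DURATION
        marker = 'X' if self.beat_counter else ' '
        self.beat_counter = max(0, self.beat_counter - 1)
        bar = '*' * int(round(self.strength * 10))
        tempo_str = '%5.1f' % tempo if tempo is not None else '  n/a'
        sys.stderr.write('\r%s %-10s | %s | ' % (marker, bar, tempo_str))
        sys.stderr.flush()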

@SebastianPoell (Contributor Author):

I think that would be very nice! Every BeatTracker could show the activation strength, the tempo and whether the current frame is a beat. My only concern is that different trackers may want to show different additional information (like the DBNTracker does with the state space).

@@ -79,6 +79,25 @@ def test_process(self):
self.assertTrue(np.allclose(beats, [0.11, 0.45, 0.79, 1.13, 1.47,
1.81, 2.15, 2.49]))

def test_process_online(self):
processor = BeatTrackingProcessor(fps=sample_blstm_act.fps,
@superbock (Collaborator):

minor note: use fps=sample_lstm_act.fps instead.

@SebastianPoell (Contributor Author):

Thanks for all your comments, I will work through all of them and then amend ;)

@SebastianPoell SebastianPoell force-pushed the online_beat_tracker branch 5 times, most recently from fb6c54f to cb466ac Compare July 20, 2017 19:57
@@ -390,6 +390,51 @@ def recursive(position):
return np.array(positions)


def detect_beats_with_ccf(activations, interval, look_aside=0.2):
@SebastianPoell (Contributor Author), Jul 20, 2017:

I am actually not sure if you want to keep this method :) It is completely interchangeable with detect_beats() but seems to produce results which are a little bit worse. On the other hand, it's much faster. Maybe it's an option for the BeatDetector (with constant tempo).
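For readers of this thread, the gist of the CCF approach in a few lines (a simplified sketch, not the implementation from this PR):

import numpy as np

# Simplified sketch: find the phase at which a pulse grid of the given
# interval best matches the activations, then place beats on that grid,
# letting each beat deviate towards the locally strongest activation by at
# most `look_aside` times the interval.
def detect_beats_with_ccf_sketch(activations, interval, look_aside=0.2):
    interval = int(interval)
    # correlate the activations with every possible phase of the beat grid
    scores = [activations[phase::interval].sum() for phase in range(interval)]
    phase = int(np.argmax(scores))
    deviation = int(look_aside * interval)
    beats = []
    for pos in range(phase, len(activations), interval):
        left = max(0, pos - deviation)
        right = min(len(activations), pos + deviation + 1)
        # snap each beat to the strongest activation inside its window; note
        # that the grid itself stays fixed, which is exactly the limitation
        # discussed in the reply below
        beats.append(left + int(np.argmax(activations[left:right])))
    return np.array(beats)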

@superbock (Collaborator):

The problem you are facing here is the fact that many musical pieces simply don't have a constant tempo.

Although you're allowing some deviation from a constant tempo, this is only considered for each beat individually, i.e. the tempo remains stable and only some jitter is allowed. If one beat is a bit earlier (faster), the position of the next beat is not based on the current beat, but rather on the previously pre-computed positions.

detect_beats, on the other hand, also allows some deviation from the main tempo, but the position of the next beat is based on the current beat's position rather than on the pre-computed ones. This makes the method more flexible, but also slower.

If detect_beats_with_ccf performs worse than detect_beats on most datasets, I don't see a point in including it, even if it is faster. I'd rather speed up detect_beats if speed is really an issue.

@SebastianPoell (Contributor Author):

Yes, I think the same! At least I can use it for my thesis and evaluate it :)

I'm very curious now about the MultiAgent approach because, similar to detect_beats, it always jumps from beat to beat but considers multiple tempo hypotheses, and it should be a very good fit for online mode 😋

@SebastianPoell SebastianPoell force-pushed the online_beat_tracker branch 2 times, most recently from cb466ac to 9d85686 Compare July 21, 2017 01:38
@superbock superbock force-pushed the refactor_tempo branch 5 times, most recently from 48d0cf8 to 886937c Compare July 24, 2017 20:17
@superbock superbock force-pushed the refactor_tempo branch 7 times, most recently from 37e83bb to cbdbe8c Compare August 18, 2017 18:18
@superbock superbock force-pushed the refactor_tempo branch 5 times, most recently from ce27bc1 to 67fb095 Compare August 30, 2017 21:22
@SebastianPoell SebastianPoell force-pushed the online_beat_tracker branch 3 times, most recently from b169c9b to 31c4d33 Compare August 31, 2017 06:27
@SebastianPoell SebastianPoell force-pushed the online_beat_tracker branch 4 times, most recently from 2aa06d6 to 4fd7ab9 Compare September 1, 2017 14:03
@superbock superbock force-pushed the refactor_tempo branch 3 times, most recently from 8777d30 to ceb5c74 Compare September 3, 2017 10:33
@SebastianPoell SebastianPoell force-pushed the online_beat_tracker branch 10 times, most recently from 56eb679 to 55d7f1f Compare September 18, 2017 22:03
@superbock superbock force-pushed the refactor_tempo branch 4 times, most recently from dcdbb8b to 1c640e5 Compare September 19, 2017 13:19
@superbock (Collaborator):

I merged #292; please rebase your branch on master.

@SebastianPoell SebastianPoell changed the base branch from refactor_tempo to master October 17, 2017 22:41
@SebastianPoell (Contributor Author):

Ok, I rebased this branch and applied your stuck-AppVeyor fixes to the tests. I think the -0.1% coverage is due to the CCF commit (which can be omitted). Anyway, it would be awesome if you could take it from here \m/
