[POC - DO NOT MERGE] Use heatmap to fingerprint recording data #273

Draft
wants to merge 8 commits into main
Conversation

@microbit-robert (Author)

The output data of the ML filters for each recording is used to plot heatmaps, which are effectively fingerprints of the data we pass into the ML model for training and inference. The fingerprints use filter outputs normalized to 0..1; the ML model still uses the non-normalized values and is not affected by these changes.
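For illustration, a minimal sketch of the kind of per-filter 0..1 normalization described above (the function name and data shapes are hypothetical, not the actual implementation in this branch):

```ts
// Hypothetical sketch: min-max normalize each filter's outputs across
// all recordings to 0..1, so heatmap cells are comparable between filters.
// filterOutputs[i][j] is the output of filter j for recording i.
const normalizeFilterOutputs = (filterOutputs: number[][]): number[][] => {
  const numFilters = filterOutputs[0].length;
  const mins = new Array<number>(numFilters).fill(Infinity);
  const maxs = new Array<number>(numFilters).fill(-Infinity);
  for (const row of filterOutputs) {
    row.forEach((v, j) => {
      mins[j] = Math.min(mins[j], v);
      maxs[j] = Math.max(maxs[j], v);
    });
  }
  // Map each value into its filter's observed range; a constant filter
  // maps to 0 to avoid division by zero.
  return filterOutputs.map((row) =>
    row.map((v, j) =>
      maxs[j] === mins[j] ? 0 : (v - mins[j]) / (maxs[j] - mins[j])
    )
  );
};
```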

These fingerprints are displayed for each recording on the "Process data" page (shown via an animation after clicking the "Process data" button). The fingerprint of the mean of the filters applied to all recordings for each action is displayed in the same box as the action name. If a micro:bit is connected, the live data is also displayed as a fingerprint on the right-hand side of the bottom panel, next to the x/y/z traces.
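The per-action fingerprint can be read as the element-wise mean of the normalized fingerprints of that action's recordings. A hedged sketch, again with illustrative names:

```ts
// Hypothetical sketch: the "mean" fingerprint for an action is the
// element-wise average of its recordings' normalized fingerprints.
const meanFingerprint = (fingerprints: number[][]): number[] => {
  const sums = new Array<number>(fingerprints[0].length).fill(0);
  for (const fp of fingerprints) {
    fp.forEach((v, j) => (sums[j] += v));
  }
  return sums.map((s) => s / fingerprints.length);
};
```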

These fingerprints are effectively an alternative visualization to the filters view in the upstream ML Machine app.

More discussion is required regarding the UI/UX and the impact on education content. We could merge the "Process data" and "Train model" pages and possibly re-enable the ability to toggle individual filters.

Model accuracy can be highly variable due to non-determinism.
This change improves the consistency of the trained model.
Process data page uses a 50:50 split of the data plot and fingerprint
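As context for the non-determinism note above: one common way to make training runs more reproducible is to seed the weight initializers. The sketch below assumes training happens in TensorFlow.js, as in the upstream ML Machine app; it shows the general technique, not necessarily the change made in these commits:

```ts
import * as tf from "@tensorflow/tfjs";

// Sketch: fixed seeds make the initial weights reproducible, removing one
// source of variance between training runs. (Other sources, e.g. data
// shuffling, would need seeding too.)
const buildModel = (numFeatures: number, numActions: number): tf.Sequential => {
  const model = tf.sequential();
  model.add(
    tf.layers.dense({
      inputShape: [numFeatures],
      units: 16,
      activation: "relu",
      kernelInitializer: tf.initializers.glorotUniform({ seed: 42 }),
    })
  );
  model.add(
    tf.layers.dense({
      units: numActions,
      activation: "softmax",
      kernelInitializer: tf.initializers.glorotUniform({ seed: 43 }),
    })
  );
  model.compile({ optimizer: "adam", loss: "categoricalCrossentropy" });
  return model;
};
```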

github-actions bot commented Jul 9, 2024

Preview build will be at https://review-ml.microbit.org/fingerprinting-vertical

@microbit-robert (Author)

A demo of a different layout where we can compare the live fingerprint with the 'average' fingerprints for each class/action. The average data is never something we pass into the model, and its representation is a bit misleading, but it is useful for seeing how the current action/gesture compares to the recorded data: https://github.com/microbit-foundation/ml-trainer/assets/95928279/c0201a89-8b4e-4e17-bb81-3d0379387630

@microbit-robert (Author)

Hello @Karlo-Emilo, we've been playing with the idea of generating a fingerprint that can represent a recording, or rather the processed recording data that actually gets passed to the model, which is generated by the filters. It's essentially a different take on the filters view of the upstream ML Machine app. Details of the changes we've made in this branch are documented in the PR description above. We'd love to hear your thoughts on this if you have time to play with the review branch here -> https://review-ml.microbit.org/fingerprinting-vertical. Could you please also share this with Magnus? I was unable to find him on GitHub.

I've also added a link to a screen capture in a comment above that was recorded from a different version of this fingerprinting branch, but the comparison between live and recorded data is really easy to see in this video.

@Karlo-Emilo

Hi @microbit-robert, I am back from the holidays.

It sounds super interesting!

However, I cannot connect a micro:bit in the preview version you link to. I am told that I am using a micro:bit V1 even though I use a V2. I have tried different (identically versioned) micro:bits. I have not tried to update the firmware.

This is one of the micro:bits:

# DAPLink Firmware - see https://mbed.com/daplink
Unique ID: 9904360261974e45003c000d00000022000000009796990b
HIC ID: 9796990b
Auto Reset: 1
Automation allowed: 0
Overflow detection: 0
Incompatible image detection: 1
Page erasing: 0
Daplink Mode: Interface
Interface Version: 0255
Bootloader Version: 0255
Git SHA: 1436bdcc67029fdfc0ff03b73e12045bb6a9f272
Local Mods: 0
USB Interfaces: MSD, CDC, HID, WebUSB
Bootloader CRC: 0x828c6069
Interface CRC: 0x5b5cc0f5
Remount count: 0
URL: https://microbit.org/device/?id=9904&v=0255

It looks cool in the video. As you mention in the comment, the average representation can be misleading as it may not represent what the model is doing. The model looks at patterns across multiple features. So, the value of a given feature is interpreted differently depending on the other features. For example, this makes it possible for the model (with enough complexity and data) to detect a circle independently of the orientation of the micro:bit.

However, this is very complex for children/students to understand. With simpler models, the average may provide an adequate representation of what is going on in most cases.

I like the fingerprint visualization of the recordings. It looks good and is more comprehensible. It is really interesting to work with these different representations of the data. We will be teaching a high school math class at the end of August to explore how to make the math in the models more comprehensible. We may also be able to test some of your ideas if you have a working prototype. However, I cannot promise anything before we have planned the lessons.

We have also experimented with using just one axis and fewer filters to visualize a KNN model. You can see what Malthe is currently working on here: microbit-foundation#484

Will you add @r59q (Malthe) and @Magniswerfer (Magnus) to the conversation?

@microbit-robert (Author)

Thanks for checking it out.

I have re-tested with a V2.00 board and it seems OK to me. Could you please provide details about how it fails, perhaps with some screenshots, so we can identify what's going on?

It's interesting that sometimes we're able to see visually the differences in x, y, z data between classes whereas the model can't, and vice versa. Any progress on explaining visually why that's the case would be useful, but you can't show something without explaining it, and then it gets complicated.

I'll have a proper look at the KNN visualization next week, but at a glance it also looks pretty cool.

@Karlo-Emilo

I just tried replicating the issue I had last time, and... now it works.

The tiny fingerprints on the live graph and the data samples are a smart idea. They are very compressed but provide a lot of information. Maybe it should be possible to click on them to get more information, or just a bigger view of the same representation.

> It's interesting that sometimes we're able to see visually the differences in x, y, z data between classes whereas the model can't, and vice versa. Any progress on explaining visually why that's the case would be useful, but you can't show something without explaining it, and then it gets complicated.

If similar visual differences are absent from the dataset, the model will not be able to recognize them. It can be difficult to give truthful explanations of AI models' outputs. There are lots of examples of explanations that follow human intuition rather than what the model is actually doing.

Maybe you could show which data samples the current input is most similar to. I'm not sure if it would just cause more confusion.

It may also be possible to show which representations the model mainly relies on. There may be some values it just ignores. We have used this approach in image recognition, where you can show which pixels the model mainly looks at in its predictions.

For example, here, the model correctly recognizes a drawing of a half-moon:

[image]

In examples where the prediction was wrong, it often looked at random pixels in the image.

You could do something similar with the fingerprint.
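One lightweight way to approximate this for the fingerprint would be occlusion sensitivity: mask one filter value at a time and measure how much the predicted confidence for the chosen action drops. A rough sketch, where predict is an assumed helper mapping a feature vector to per-action confidences, not an existing API in this repo:

```ts
// Hypothetical occlusion-sensitivity sketch: a feature's importance is
// the confidence drop for the target action when that feature is masked.
const featureImportance = (
  features: number[],
  actionIndex: number,
  predict: (fs: number[]) => number[]
): number[] => {
  const baseline = predict(features)[actionIndex];
  return features.map((_, j) => {
    const masked = [...features];
    masked[j] = 0; // a neutral replacement value; 0 = "filter inactive"
    return baseline - predict(masked)[actionIndex];
  });
};
```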

@microbit-robert (Author)

The image recognition representation is cool. If we could do the same with the fingerprint, I think that would be really valuable.

We are very interested in whether you are able to test the fingerprint idea in the classroom. Could you please let us know what you need from us to make that a realistic possibility, if, of course, it fits into your planning and schedule?

@Karlo-Emilo

If you make a pull request to our repo before the 20th August with the fingerprints on the data samples:

[image]

And the fingerprints on the live graph (instead of the 3D model):

[image]

Then we can use this version in activities we have already planned in a high school math class. Some of the activities will be specifically about the data representation, so it would be a good opportunity to see if the high school students understand the fingerprints, and whether they can use them to reason about the capabilities and limitations of their models when they test them.

It will then be part of a formal study, so we may write about it in a research paper and, of course, credit you then.

Because this is a formal study, we will not use the average representations, given the issue with how truthfully they reflect what the model is doing. It would, however, be interesting to test them on another occasion, for example in an informal workshop or similar.

@Karlo-Emilo

I think I have identified the reason for my earlier connectivity issues. I may have used the Brave browser. My bad; I hope you have not spent too much time investigating the issue.

@microbit-robert (Author)

Ahh, OK. This was brought to our attention recently: microbit-foundation/python-editor-v3#1189

@microbit-robert (Author)

@Karlo-Emilo Just to let you know that we are aiming to make this PR this week.
