Providing help and FLOSS stack #12

Open
phhusson opened this issue Sep 24, 2023 · 1 comment

Hello,

Your project looks cool. I was rather sad to see that Rockchip's NN framework failed to load any useful model.

I've done some reverse engineering (plus reading the datasheet) of the RK3588's NPU (https://github.com/phhusson/rknpu-reverse-engineering/), and I think I may be able to help.

Reading your TODO, it looks like you're using RKNN exclusively for matrix multiplications (not higher-order tensors?). Is that intended? The NPU can also do ReLU, max/min/average pooling, and convolutions.

I see you're waiting on Rockchip for int4 matmul. Assuming there is no hardware bug preventing it, I should be able to provide one, if that's the most useful thing you need?

Either way, seeing your usage, I'll try to write a FLOSS reimplementation of Rockchip's matmul to get rid of that proprietary blob.

keveman (Contributor) commented Sep 24, 2023

@phhusson This library is indeed using the proprietary binary blob to perform the matrix multiplications. It is unfortunate that Rockchip is keeping the NPU fully closed. For the transformer models, 8-bit and/or 4-bit matrix multiplications are really all we need. Currently, only FP16 matrix multiplies are used; I didn't see much of a performance improvement for the tiny.en model when using int8.
Reverse engineering just the matrix multiplies would be quite useful for the community in general.
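
For readers unfamiliar with what an 8-bit matrix multiply involves, here is a minimal pure-Python sketch of the arithmetic: quantize both operands to int8, accumulate in integers, then rescale. It is only an illustration under stated assumptions. The function names (`quantize_int8`, `matmul_int8`, `matmul_float`) are hypothetical, symmetric per-tensor scaling is an assumed scheme, and none of this reflects how the Rockchip blob or the NPU hardware actually implements it.

```python
def quantize_int8(matrix):
    """Symmetric per-tensor quantization: floats -> (int8-range ints, scale).

    Assumed scheme for illustration only; not taken from RKNN.
    """
    max_abs = max((abs(v) for row in matrix for v in row), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest step and clamp into [-127, 127].
    q = [[max(-127, min(127, round(v / scale))) for v in row] for row in matrix]
    return q, scale


def matmul_int8(a, b):
    """Quantize both operands, accumulate in integers, rescale to float."""
    qa, sa = quantize_int8(a)
    qb, sb = quantize_int8(b)
    cols_b = list(zip(*qb))  # columns of B, so each output is one dot product
    return [[sum(x * y for x, y in zip(row, col)) * sa * sb for col in cols_b]
            for row in qa]


def matmul_float(a, b):
    """Plain float reference used to check the quantized result."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b]
            for row in a]
```

Comparing `matmul_int8(a, b)` against `matmul_float(a, b)` on small inputs gives a feel for the rounding error. Whether int8 is actually faster than FP16 on a given device depends on the hardware and driver, which this sketch says nothing about.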
