This repository has been archived by the owner on Aug 1, 2024. It is now read-only.

Runtime in half precision #692

Open
Hrovatin opened this issue Jun 13, 2024 · 0 comments

Comments

@Hrovatin

I tried running ESM2 inference (model(seq_tokens).logits) in full and half precision (model.half()) on an Apple M3 Max chip with torch==2.3.1.
I noticed that with half precision the inference time is ~10x longer, while memory usage drops as expected. Any idea why the runtime increases so drastically?
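For context, a minimal sketch of how the fp32 vs. fp16 comparison could be reproduced on the MPS backend, assuming the Hugging Face transformers ESM2 checkpoint facebook/esm2_t12_35M_UR50D and an arbitrary test sequence (the issue does not say which checkpoint, sequence, or timing harness was used). MPS kernels are dispatched asynchronously, so the sketch synchronizes before stopping the clock:

```python
import time
import torch
from transformers import AutoTokenizer, EsmForMaskedLM

# Hypothetical checkpoint and sequence, only for illustration.
model_name = "facebook/esm2_t12_35M_UR50D"
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

tokenizer = AutoTokenizer.from_pretrained(model_name)
seq_tokens = tokenizer(
    "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ", return_tensors="pt"
).input_ids.to(device)

def time_inference(model, n_iters=20):
    model = model.to(device).eval()
    with torch.no_grad():
        model(seq_tokens)  # warm-up pass, not counted
        if device.type == "mps":
            torch.mps.synchronize()
        start = time.perf_counter()
        for _ in range(n_iters):
            _ = model(seq_tokens).logits
        if device.type == "mps":
            torch.mps.synchronize()  # wait for queued MPS work before stopping the clock
    return (time.perf_counter() - start) / n_iters

fp32_model = EsmForMaskedLM.from_pretrained(model_name)
print(f"fp32: {time_inference(fp32_model):.4f} s/iter")

fp16_model = EsmForMaskedLM.from_pretrained(model_name).half()
print(f"fp16: {time_inference(fp16_model):.4f} s/iter")
```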
