feat: Initial state support for Mamba SSM (1) #488

Open. mzusman wants to merge 6 commits into main.

@mzusman mzusman commented Jul 24, 2024

Adds chunked prefill (initial-state support) to the Mamba SSM (Mamba 1) selective scan. The state from the previous forward pass is prepended to the input of the forward-pass kernel, which reads the data accordingly.

Latency is not affected: the benchmark script shows similar latencies for this PR and main (130 ms).
Added tests that check correctness when running on chunks.
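
As an illustration of the property those tests check, here is a minimal pure-Python sketch. It is not the PR's CUDA kernel or its test code. It assumes a single channel with a one-dimensional state and a simplified discretization, and the names `selective_scan_ref` and `h0` are hypothetical. It shows that scanning a sequence in two chunks, passing the final state of the first chunk as the initial state of the second, gives the same outputs as one full pass.

```python
import math


def selective_scan_ref(x, delta, A, B, C, h0=0.0):
    """Reference scan for one channel with a scalar state (illustrative only).

    Recurrence (simplified, assumed for this sketch):
        h_t = exp(delta_t * A) * h_{t-1} + delta_t * B_t * x_t
        y_t = C_t * h_t
    `h0` is the initial state; 0.0 means "start from scratch".
    Returns the outputs and the final state.
    """
    h = h0
    ys = []
    for x_t, d_t, b_t, c_t in zip(x, delta, B, C):
        h = math.exp(d_t * A) * h + d_t * b_t * x_t
        ys.append(c_t * h)
    return ys, h


# Made-up inputs for a sequence of length 6.
x = [0.5, -1.0, 2.0, 0.25, 1.5, -0.75]
delta = [0.1, 0.2, 0.05, 0.3, 0.15, 0.1]
B = [1.0, 0.5, -0.5, 2.0, 1.0, 0.25]
C = [0.3, -1.0, 0.8, 0.5, 1.2, -0.4]
A = -1.0

# One pass over the whole sequence.
y_full, h_full = selective_scan_ref(x, delta, A, B, C)

# Two chunks: the second chunk starts from the first chunk's final state.
split = 4
y1, h1 = selective_scan_ref(x[:split], delta[:split], A, B[:split], C[:split])
y2, h2 = selective_scan_ref(
    x[split:], delta[split:], A, B[split:], C[split:], h0=h1
)
y_chunked = y1 + y2

# The chunked run should match the full run.
outputs_match = all(
    math.isclose(a, b, rel_tol=1e-12, abs_tol=1e-12)
    for a, b in zip(y_chunked, y_full)
)
print("chunked == full:", outputs_match)
```

The same idea carries over to prefix caching and speculative decoding: once the state at a given position is available, the scan can resume from that position instead of recomputing the prefix.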

Limitations:

  • Applies only to the selective scan forward pass (the backward pass is not supported)

This PR enables efficient speculative decoding, prefix caching, and prefill chunking.

FIX #233 #473 #258 #101

@mzusman changed the title from "feat: Chunked prefill for Mamba SSM (1)" to "feat: Initial state support for Mamba SSM (1)" on Jul 24, 2024
daphneOdera-618 commented Sep 2, 2024

@mzusman I've noticed you made changes to files in the csrc directory, but I'm having trouble getting these changes to take effect in my environment. Could you please give me the exact instructions to rebuild and install the mamba_ssm package so that the changes are applied? It seems I always get the original package when I run `pip install .`. Thank you!

mzusman commented Sep 3, 2024

@daphneOdera-618 Yeah, the default setup.py behaviour is to download the upstream wheel when installing. To force a local build, set MAMBA_FORCE_BUILD=TRUE on the install command: `MAMBA_FORCE_BUILD=TRUE pip install .`

Successfully merging this pull request may close these issues.

I guess mamba.step could be deleted if selective_scan_fn can accept ssm_state as an input param.