Cloud-edge collaborative speculative decoding for LLM based on KubeEdge-Ianvs #126

Open
hsj576 opened this issue Jul 23, 2024 · 15 comments · May be fixed by #156
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@hsj576
Member

hsj576 commented Jul 23, 2024

  • Description:
    • The autoregressive decoding mode of LLMs means that tokens can only be generated serially, which limits inference speed. Speculative decoding uses a lightweight draft model to propose candidate tokens that the target LLM then verifies in a single parallel forward pass, improving inference speed without loss of accuracy. However, existing speculative decoding work does not consider deployment in a cloud-edge distributed environment. This project aims to implement cloud-edge collaborative speculative decoding based on KubeEdge-Ianvs, an open-source cloud-edge collaborative distributed machine learning platform, to further improve LLM inference speed in cloud-edge environments (a minimal sketch of the basic decoding loop is included at the end of this description).
  • Expected outcome:
    • Implement an example of cloud-edge collaborative speculative decoding based on the KubeEdge-Ianvs platform.
    • (Optional) Propose a more efficient cloud-edge collaborative speculative decoding algorithm.
  • Recommended Skills:
    • Familiar with LLM-related technologies and experienced in deploying open-source LLMs locally.
    • Proficient in Python and PyTorch.
    • Experienced in deploying KubeEdge-Ianvs.
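
For readers new to the technique, below is a minimal, illustrative sketch of the greedy-verification variant of speculative decoding (the original papers use a sampling-based acceptance rule; the greedy form is simply easier to read). All names here are placeholders: `draft_model` and `target_model` stand in for a small edge model and a large cloud model, and the toy demo uses random lookup tables rather than real LLMs.

```python
# Illustrative sketch only: greedy-verification speculative decoding.
# `draft_model(seq)` is assumed to return next-token logits for the last position,
# and `target_model(seq)` to return next-token logits for every position.
import torch


def greedy_next(logits: torch.Tensor) -> int:
    """Pick the highest-probability next token."""
    return int(torch.argmax(logits, dim=-1))


def speculative_step(prefix, draft_model, target_model, k=4):
    """Propose k draft tokens, then verify them with one target-model pass."""
    # 1. Draft model proposes k tokens autoregressively (cheap; would run at the edge).
    ctx = list(prefix)
    draft_tokens = []
    for _ in range(k):
        t = greedy_next(draft_model(ctx))
        draft_tokens.append(t)
        ctx.append(t)

    # 2. Target model scores prefix + draft tokens in a single pass (would run in the cloud).
    all_logits = target_model(list(prefix) + draft_tokens)

    # 3. Accept the longest prefix of draft tokens the target would also have chosen,
    #    correcting the first mismatch with the target's own choice.
    accepted = []
    for i, t in enumerate(draft_tokens):
        target_choice = greedy_next(all_logits[len(prefix) - 1 + i])
        if target_choice == t:
            accepted.append(t)
        else:
            accepted.append(target_choice)
            break
    else:
        # Every draft token was accepted: take the target's bonus token as well.
        accepted.append(greedy_next(all_logits[-1]))
    return accepted


if __name__ == "__main__":
    # Toy stand-ins: a shared random lookup table instead of real LLMs,
    # so every draft token is accepted and the step emits k + 1 tokens.
    torch.manual_seed(0)
    vocab = 10
    table = torch.randn(vocab, vocab)
    draft_model = lambda seq: table[seq[-1]]
    target_model = lambda seq: table[torch.tensor(seq)]
    print(speculative_step([1, 2, 3], draft_model, target_model, k=4))
```

In a cloud-edge deployment, the main additional questions are where the network round trip sits in this loop and how many tokens to draft per trip.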
@kairveeehh

Hi @hsj576, I am Kairvee and would like to take up this project under LFX mentorship, as it aligns with my interests and skills. Could you assign it to me and let me know further details?

@hsj576
Member Author

hsj576 commented Aug 1, 2024

> Hi @hsj576, I am Kairvee and would like to take up this project under LFX mentorship, as it aligns with my interests and skills. Could you assign it to me and let me know further details?

The details of the project will be discussed in the weekly SIG AI meeting. Feel free to join us at https://zoom.us/j/4167237304 every Thursday at 16:30 UTC+8.

@kairveeehh

> > Hi @hsj576, I am Kairvee and would like to take up this project under LFX mentorship, as it aligns with my interests and skills. Could you assign it to me and let me know further details?
>
> The details of the project will be discussed in the weekly SIG AI meeting. Feel free to join us at https://zoom.us/j/4167237304 every Thursday at 16:30 UTC+8.

Okay, sure, thank you!

@Ytemiloluwa

Hello @hsj576, my name is Temi and I am an open-source developer. I have a genuine interest in working on this project this fall.

@Ytemiloluwa

> > Hi @hsj576, I am Kairvee and would like to take up this project under LFX mentorship, as it aligns with my interests and skills. Could you assign it to me and let me know further details?
>
> The details of the project will be discussed in the weekly SIG AI meeting. Feel free to join us at https://zoom.us/j/4167237304 every Thursday at 16:30 UTC+8.

It seems I missed the meeting; I wasn't familiar with the time zone.

@kairveeehh

Hi @hsj576,
I need your review and guidance on the way forward for contributing.

Proposed Approach

  1. Speculative Decoding Implementation:

    • I plan to set up a basic speculative decoding pipeline in which a draft model proposes tokens that the target LLM verifies in parallel, thus improving the inference speed of the LLM.
  2. Cloud-Edge Architecture:

    • For cloud-edge collaboration, I will deploy the draft model at the edge to handle initial predictions and the full model in the cloud for verification and refinement. This setup aims to optimize resource usage and reduce latency (a rough sketch of this split is included after this list).
  3. Testing and Optimization:

    • I will benchmark the system to evaluate performance improvements and ensure that the solution does not compromise accuracy. I will also explore possible optimizations to enhance the efficiency of the cloud-edge collaboration.
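
To make the intended edge/cloud split more concrete, here is a rough, non-authoritative sketch of what the edge-side loop could look like, assuming one HTTP round trip per draft batch. The `/verify` endpoint, `CLOUD_VERIFY_URL`, the JSON fields, `EOS_ID`, and the `draft_model` callable are all hypothetical placeholders, not existing Ianvs or project APIs.

```python
# Hypothetical edge-side client for the cloud-edge split sketched above.
# Nothing here is an existing Ianvs interface; the endpoint and payload
# shapes are placeholders for illustration only.
import requests

CLOUD_VERIFY_URL = "http://cloud-node:8080/verify"  # placeholder address
EOS_ID = 2  # placeholder end-of-sequence token id


def edge_generate(prompt_ids, draft_model, max_new_tokens=64, k=4):
    """Draft tokens locally on the edge; verify each batch with one cloud round trip."""
    tokens = list(prompt_ids)
    while len(tokens) - len(prompt_ids) < max_new_tokens:
        # 1. Edge: cheaply propose k candidate tokens with the local draft model.
        draft = draft_model(tokens, k)  # hypothetical: returns a list of k token ids

        # 2. Cloud: the large model verifies all k candidates in a single request,
        #    returning the tokens it accepted (plus a correction or bonus token).
        resp = requests.post(
            CLOUD_VERIFY_URL,
            json={"prefix": tokens, "draft": draft},
            timeout=10,
        )
        resp.raise_for_status()
        accepted = resp.json()["accepted"]

        tokens.extend(accepted)
        if accepted and accepted[-1] == EOS_ID:
            break
    return tokens
```

The benchmarking in step 3 could then report the average number of accepted tokens per round trip, since that largely determines how much cloud traffic and latency the collaboration saves.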

@lazyperson1020

@hsj576 Are there any pre-tests to submit? I am interested in contributing.

@aryan0931

@hsj576 I'm Aryan, and I would like to take this project under LFX mentorship as it aligns perfectly with my skills. Are there any pretests to submit?

@Sid260303

@hsj576, hello sir, I am Siddhant. I would like to work on this project under LFX mentorship, as working on LLMs has been my aim and this opportunity is a great way to kick-start my journey in open source. I also missed the weekly meeting and would like to know more about the project and how I can help.
Also, it would be very kind of you if you could share some resources so we can prepare better for the project.

@hsj576
Member Author

hsj576 commented Aug 7, 2024

I will release a pretest next week.

@Ytemiloluwa

> I will release a pretest next week.

Hello @hsj576, the applications for mentees close on 13th August (Tuesday).

@hsj576
Member Author

hsj576 commented Aug 7, 2024

> > I will release a pretest next week.
>
> Hello @hsj576, the applications for mentees close on 13th August (Tuesday).

OK, I will release the pretest as soon as possible (before 9th August).

@FuryMartin
Contributor

Hi, I hope to take on this project. I would like to highlight my strengths and the contributions I can make to the community.

I have a relatively deep understanding of edge computing and LLMs, and I am familiar with the main strategies for LLM cloud-edge collaboration, as well as the principles and implementation methods of speculative decoding.

Additionally, I have strong programming skills, particularly in Python and PyTorch. Last year, I interned at a large AI company specializing in LLMs.

What's more, I am conducting research on an LLM cloud-edge collaboration strategy using Ianvs. After two months of learning, I have gained a good understanding of Ianvs's architecture, interfaces, and features, and I am now attempting to introduce a new feature that is relevant to this issue. If I am accepted, integrating these insights could lead to a more coherent and rational architectural design.

I eagerly await your consideration of my taking on this task!

@sanggusti

Hi @hsj576

I would like to know more about your weekly meeting. Do you have a calendar link that I can add to my own calendar? I would also like to know more about this project.

Also, regarding collaborative speculative decoding, would you mind sharing the paper for this technique? I could only find a collaborative decoding paper on arXiv (https://arxiv.org/html/2406.12295v1) and could not find one on collaborative speculative decoding.

@hsj576
Member Author

hsj576 commented Aug 9, 2024

Details of the pretest are released in #130.

@MooreZheng added the kind/feature label on Aug 13, 2024.