Skip to content

This project focuses on addressing the limitations and challenges associated with using large generative language models like GPT-3.5 and GPT-4. LLM Lingua, a package designed for efficient language model compression., its approach aims to mitigate token limitations and enhance contextual understanding in language model applications.

Notifications You must be signed in to change notification settings

sarmadafzalj/LLMLingua-Prompt-Compression-for-Cost-Effective-RAG-with-LLMs

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Demonstrating LLM Lingua for Efficient Language Model Compression

Introduction

This repository contains a Jupyter notebook that demonstrates the effective use of LLM Lingua, a tool for efficient language model compression. The project showcases how LLM Lingua can be utilized to overcome the limitations of token constraints in generative language models like GPT-3.5 and GPT-4, ensuring cost-effective and contextually rich applications. It covers a comparison of number of Tokens and Cost with and without compression while preserving the quality of Response

image.png

Overview

LLM Lingua addresses significant challenges in language model usage:

  • Token Limitations: By efficiently compressing inputs, it tackles the token limitations in models like GPT-3.
  • Contextual Integrity: It ensures that essential contextual details are retained, especially in extensive interactions.

Project Setup

To replicate and understand this demonstration, the following packages are required:

# pip install llmlingua==0.1.5
# pip install streamlit==1.28.2
# pip install langchain==0.0.336
# pip install openai==1.2.0
# pip install tiktoken==0.4.0
# pip install PyPDF2==3.0.1
# pip install accelerate==0.26.1
# pip install optimum==1.16.1
# pip install auto-gptq

Further Reading

For detailed insights into the methodologies and technicalities, refer to the related academic papers:

Notebook:

About

This project focuses on addressing the limitations and challenges associated with using large generative language models like GPT-3.5 and GPT-4. LLM Lingua, a package designed for efficient language model compression., its approach aims to mitigate token limitations and enhance contextual understanding in language model applications.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published