
Demonstrating LLM Lingua for Efficient Language Model Compression

Introduction

This repository contains a Jupyter notebook that demonstrates LLM Lingua, a tool for efficient prompt compression. The project shows how LLM Lingua can be used to overcome token-limit constraints in generative language models such as GPT-3.5 and GPT-4, enabling cost-effective and contextually rich applications. It compares token counts and cost with and without compression while preserving response quality.
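The token-and-cost comparison above can be sketched with simple arithmetic. The token counts, the compression ratio, and the price of $0.001 per 1K input tokens below are illustrative assumptions, not the notebook's actual numbers; real GPT-3.5/GPT-4 pricing varies by model and over time.

```python
def input_cost(n_tokens: int, price_per_1k: float) -> float:
    """Cost of sending n_tokens as model input at a given price per 1K tokens."""
    return n_tokens / 1000 * price_per_1k

PRICE_PER_1K = 0.001        # assumed input price (USD per 1K tokens)
tokens_original = 9_800     # assumed prompt size before compression
tokens_compressed = 490     # assumed size after ~20x compression

cost_original = input_cost(tokens_original, PRICE_PER_1K)
cost_compressed = input_cost(tokens_compressed, PRICE_PER_1K)

print(f"Original:   {tokens_original} tokens -> ${cost_original:.4f}")
print(f"Compressed: {tokens_compressed} tokens -> ${cost_compressed:.4f}")
print(f"Savings:    {1 - cost_compressed / cost_original:.0%}")
```

With these assumed numbers the compressed prompt costs 95% less, which is the kind of comparison the notebook makes with real measurements.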


Overview

LLM Lingua addresses significant challenges in language model usage:

  • Token Limitations: By efficiently compressing inputs, it tackles the token limits of models like GPT-3.5 and GPT-4.
  • Contextual Integrity: It ensures that essential contextual details are retained, especially in extensive interactions.

Project Setup

To replicate and understand this demonstration, the following packages are required:

```
pip install llmlingua==0.1.5
pip install streamlit==1.28.2
pip install langchain==0.0.336
pip install openai==1.2.0
pip install tiktoken==0.4.0
pip install PyPDF2==3.0.1
pip install accelerate==0.26.1
pip install optimum==1.16.1
pip install auto-gptq
```
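Once the packages are installed, the core compression step looks roughly like the sketch below, based on the `llmlingua` 0.1.5 API. Constructing `PromptCompressor` downloads a sizable default model on first use, so both the import and the construction are kept inside the function; the function name and defaults here are illustrative, not from the notebook.

```python
def compress(context: str, question: str, target_token: int = 200) -> str:
    """Compress `context` with LLM Lingua, keeping roughly `target_token` tokens."""
    from llmlingua import PromptCompressor  # requires `pip install llmlingua`

    # Uses the library's default compression model; pass a model name to
    # PromptCompressor(...) to override (see the llmlingua docs).
    compressor = PromptCompressor()
    result = compressor.compress_prompt(
        context,
        question=question,
        target_token=target_token,
    )
    # `result` is a dict with entries such as "compressed_prompt",
    # "origin_tokens", and "compressed_tokens".
    return result["compressed_prompt"]
```

The compressed prompt can then be sent to GPT-3.5/GPT-4 in place of the original context, which is where the token and cost savings come from.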

Further Reading

For details on the underlying methodology, refer to the related academic papers:

Notebook: