
Which LLM models are supported? #341

Closed
yyyhainan opened this issue Jul 3, 2024 · 15 comments

Comments

@yyyhainan

Are other LLM models supported, such as ChatGLM and Qwen?

@Lbaiall commented Jul 3, 2024

I have the same question. Where can I see the main source code?

@dinobot22

+1

@andysingal

+1

@young169 commented Jul 4, 2024

Also, can we use locally deployed LLMs rather than only via API keys?

@zzk2021 commented Jul 4, 2024

Same question.

@gallypette

+1

@AlonsoGuevara (Contributor)

Hi!
During our research we got the highest quality out of gpt-4, gpt-4-turbo, and gpt-4o, which is why we include out-of-the-box support for these in both OpenAI and Azure environments.

Regarding local hosting, there's a very interesting conversation going on in thread #339.

@bmaltais commented Jul 4, 2024

I have tested gemma2 and llama3 with success. The only thing that does not work locally is the embeddings. There needs to be a fix to accept the style of response coming from Ollama when querying embeddings. Once that is fixed, you will be able to run this 100% locally on a personal computer, but you will probably need an NVIDIA card with 24 GB of VRAM, like a 3090, or an M-series Mac with 32 GB of RAM.
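Until such a fix lands, one workaround is a small translation shim between GraphRAG and Ollama. Below is a minimal sketch, not part of either project: it assumes Ollama's mid-2024 embeddings API (POST /api/embeddings with {"model", "prompt"}, returning {"embedding": [...]}) on the default port 11434, and re-wraps each result in an OpenAI-style envelope. The endpoint paths, the port, and the OLLAMA_URL name are assumptions to verify locally.

```python
# Minimal sketch (not the project's code): an OpenAI-compatible
# /v1/embeddings endpoint that forwards to a local Ollama server.
# Assumes Ollama's mid-2024 API: POST /api/embeddings with
# {"model": ..., "prompt": ...} -> {"embedding": [...]}.
from fastapi import FastAPI
from pydantic import BaseModel
import httpx

OLLAMA_URL = "http://127.0.0.1:11434/api/embeddings"  # default Ollama port (assumption)

app = FastAPI()

class EmbeddingRequest(BaseModel):
    model: str
    input: str | list[str]  # OpenAI clients may send one string or a batch

@app.post("/v1/embeddings")
async def embeddings(req: EmbeddingRequest):
    texts = [req.input] if isinstance(req.input, str) else req.input
    data = []
    async with httpx.AsyncClient() as client:
        for i, text in enumerate(texts):
            # Ollama embeds one prompt per request
            resp = await client.post(
                OLLAMA_URL, json={"model": req.model, "prompt": text}, timeout=120
            )
            data.append({
                "object": "embedding",
                "index": i,
                "embedding": resp.json()["embedding"],
            })
    # Re-wrap in the OpenAI-style response shape GraphRAG expects
    return {
        "object": "list",
        "model": req.model,
        "data": data,
        "usage": {"prompt_tokens": 0, "total_tokens": 0},
    }
```

Run it with, e.g., `uvicorn shim:app --port 9997` and point GRAPHRAG_EMBEDDING_API_BASE at it.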

@zzk2021 commented Jul 5, 2024

> I have tested gemma2 and llama3 with success. The only thing that does not work locally is the embeddings. There needs to be a fix to accept the style of response coming from Ollama when querying embeddings. Once that is fixed, you will be able to run this 100% locally on a personal computer, but you will probably need an NVIDIA card with 24 GB of VRAM, like a 3090, or an M-series Mac with 32 GB of RAM.

Can we use local embeddings?

@vamshi-rvk

> I have tested gemma2 and llama3 with success. […]

Can you help me with running llama3 locally, please?

@ishotoli commented Jul 7, 2024

> Can you help me with running llama3 locally, please?

Here's my .env file; put it under the ./ragtest dir. I hope this helps:

```
GRAPHRAG_LLM_API_KEY=DEFAULTS
GRAPHRAG_LLM_TYPE=openai_chat
GRAPHRAG_LLM_API_BASE=http://127.0.0.1:5081/v1
GRAPHRAG_LLM_MODEL=Hermes-2-Pro-Llama-3-Instruct-Merged-DPO
GRAPHRAG_LLM_REQUEST_TIMEOUT=700
GRAPHRAG_LLM_MODEL_SUPPORTS_JSON=True
GRAPHRAG_LLM_THREAD_COUNT=16
GRAPHRAG_LLM_CONCURRENT_REQUESTS=16
GRAPHRAG_EMBEDDING_TYPE=openai_embedding
GRAPHRAG_EMBEDDING_API_BASE=http://127.0.0.1:9997/v1
GRAPHRAG_EMBEDDING_MODEL=bce-embedding-base_v1
GRAPHRAG_EMBEDDING_BATCH_SIZE=64
GRAPHRAG_EMBEDDING_BATCH_MAX_TOKENS=512
GRAPHRAG_EMBEDDING_THREAD_COUNT=16
GRAPHRAG_EMBEDDING_CONCURRENT_REQUESTS=16
GRAPHRAG_INPUT_FILE_PATTERN=".*.txt$"
```
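For anyone following along: with a config like this, the usual flow at the time of this thread was `python -m graphrag.index --init --root ./ragtest` to scaffold the project and `python -m graphrag.index --root ./ragtest` to build the index; any OpenAI-compatible server (vLLM, Xinference, LM Studio, and the like) can sit behind GRAPHRAG_LLM_API_BASE and GRAPHRAG_EMBEDDING_API_BASE. Check the flags against your installed GraphRAG version, as the CLI has changed across releases.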


@AlonsoGuevara (Contributor)

Hi! We are centralizing other LLM discussions in these threads:
Other LLM/API bases: #339
Ollama: #345
Local embeddings: #370

I'll resolve this issue so we can keep the focus on those threads.

@xpdd123 commented Jul 10, 2024

I tested gemma2 successfully, but glm4 failed. I guess it's because of the input length limit of the LLM.
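If glm4 is failing on context length, shrinking the chunks GraphRAG feeds the model may help. A hedged sketch of the relevant settings, assuming the GRAPHRAG_CHUNK_* environment variable names from the GraphRAG docs of that era (verify against your version's configuration reference):

```
# Smaller chunks so prompts fit a shorter context window (names assumed)
GRAPHRAG_CHUNK_SIZE=512
GRAPHRAG_CHUNK_OVERLAP=64
```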

@whisper-bye

qwen2:7b fails.
