[Bug]: Using Ollama and error occur like:[JSONDecodeError: Expecting ',' delimiter: line 5 column 45] #663

ayanjiushishuai · 2024-07-23T06:24:05Z

Describe the bug

I tried to use local models (qwen2:1.5b and phi3) served by Ollama instead of GPT-4.
Command: python -m graphrag.index --root ./ragtest
Errors occurred while executing the create_final_entities step.
Here is the screenshot:
Here is the screenshot from running GPT-4 in an almost identical environment (only settings.yaml differs), where everything looks fine.

Steps to reproduce

Download Ollama:
curl -fsSL https://ollama.com/install.sh | sh
Set .env and settings.yaml as shown below.
Follow the rest of the documentation.

Expected Behavior

The pipeline should run without errors.

GraphRAG Config Used

The settings.yaml config is:

encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: ollama
  type: openai_chat # or azure_openai_chat
  model: qwen2:1.5b
  model_supports_json: True
  api_base: http://localhost:8004/v1  #this port is my local config


embeddings:
  async_mode: threaded # or asyncio
  llm:
    api_key: ollama
    type: openai_embedding # or azure_openai_embedding
    model: nomic-embed-text:latest
    api_base: http://localhost:8004/api     #this port is my local config

The .env config is:

GRAPHRAG_API_KEY=ollama
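
One thing worth checking in the config above: the chat api_base ends in /v1 while the embeddings api_base ends in /api. An OpenAI-style client appends /embeddings to the base URL, so the second setting would reach Ollama's native /api/embeddings endpoint, which is understood to return {"embedding": [...]} rather than the OpenAI-style {"data": [{"embedding": [...]}]}. The sketch below only illustrates that difference. `embeddings_url` and `extract_embedding` are hypothetical helpers, not GraphRAG or Ollama functions, and both response shapes are assumptions to verify against the Ollama version in use.

```python
from typing import Any, Dict, List


def embeddings_url(api_base: str) -> str:
    """Return the URL an OpenAI-style client would POST embeddings to."""
    return api_base.rstrip("/") + "/embeddings"


def extract_embedding(body: Dict[str, Any]) -> List[float]:
    """Pull the vector out of either response shape (both shapes are assumptions)."""
    if "data" in body:  # OpenAI-style: {"data": [{"embedding": [...]}]}
        return list(body["data"][0]["embedding"])
    if "embedding" in body:  # Ollama-native style: {"embedding": [...]}
        return list(body["embedding"])
    raise ValueError(f"unrecognised embedding response keys: {sorted(body)}")


print(embeddings_url("http://localhost:8004/api"))  # http://localhost:8004/api/embeddings
print(embeddings_url("http://localhost:8004/v1"))   # http://localhost:8004/v1/embeddings
```

If the client only understands the first shape, a response in the second shape would fail to parse even though the HTTP request itself succeeds.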

Logs and screenshots

log file:

16:27:45,459 datashaper.workflow.workflow INFO executing verb create_community_reports
16:27:49,439 httpx INFO HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
16:27:49,485 httpx INFO HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
16:27:49,486 graphrag.llm.openai.utils ERROR error loading json, json={
   "title": "Harmony Assembly Community",
   "summary": "Harmony Assembly is a community that operates in Verdant Oasis Plaza. The organization, Harmony Assembly, is responsible for organizing the Unity March at Verdant Oasis Plaza. This event has significant implications and could pose threats to the community if not managed properly.",
   "rating": 7,
   "rating_explanation": "Harmony Assembly's actions have a high impact on the community, as they are involved in organizing an event that could attract media attention and potentially influence public perception.",
   "findings": [
       {
           "summary": "Harmony Assembly is the organizer of the Unity March at Verdant Oasis Plaza",
           "explanation": "Harmony Assembly is responsible for organizing the Unity March, which is a significant event in the community. The march has the potential to attract media attention and influence public perception."
       },
       {
           "summary": "Harmony Assembly's actions could pose threats if not managed properly",
           "explanation": "The actions of Harmony Assembly could pose threats to the community if they are not managed properly, as their involvement in organizing an event at Verdant Oasis Plaza could attract media attention and potentially influence public perception."
       }
   ]
}
Traceback (most recent call last):
 File "/root/miniconda3/envs/grag/lib/python3.10/site-packages/graphrag/llm/openai/utils.py", line 94, in try_parse_json_object
   result = json.loads(clean_json)
 File "/root/miniconda3/envs/grag/lib/python3.10/json/__init__.py", line 346, in loads
   return _default_decoder.decode(s)
 File "/root/miniconda3/envs/grag/lib/python3.10/json/decoder.py", line 337, in decode
   obj, end = self.raw_decode(s, idx=_w(s, 0).end())
 File "/root/miniconda3/envs/grag/lib/python3.10/json/decoder.py", line 353, in raw_decode
   obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 5 column 45 (char 405)
16:27:49,486 graphrag.index.graph.extractors.community_reports.community_reports_extractor ERROR error generating community report
Traceback (most recent call last):
 File "/root/miniconda3/envs/grag/lib/python3.10/site-packages/graphrag/index/graph/extractors/community_reports/community_reports_extractor.py", line 58, in __call__
   await self._llm(
 File "/root/miniconda3/envs/grag/lib/python3.10/site-packages/graphrag/llm/openai/json_parsing_llm.py", line 34, in __call__
   result = await self._delegate(input, **kwargs)
 File "/root/miniconda3/envs/grag/lib/python3.10/site-packages/graphrag/llm/openai/openai_token_replacing_llm.py", line 37, in __call__
   return await self._delegate(input, **kwargs)
 File "/root/miniconda3/envs/grag/lib/python3.10/site-packages/graphrag/llm/openai/openai_history_tracking_llm.py", line 33, in __call__
   output = await self._delegate(input, **kwargs)
 File "/root/miniconda3/envs/grag/lib/python3.10/site-packages/graphrag/llm/base/caching_llm.py", line 104, in __call__
   result = await self._delegate(input, **kwargs)
 File "/root/miniconda3/envs/grag/lib/python3.10/site-packages/graphrag/llm/base/rate_limiting_llm.py", line 177, in __call__
   result, start = await execute_with_retry()
 File "/root/miniconda3/envs/grag/lib/python3.10/site-packages/graphrag/llm/base/rate_limiting_llm.py", line 159, in execute_with_retry
   async for attempt in retryer:
 File "/root/miniconda3/envs/grag/lib/python3.10/site-packages/tenacity/asyncio/__init__.py", line 166, in __anext__
   do = await self.iter(retry_state=self._retry_state)
 File "/root/miniconda3/envs/grag/lib/python3.10/site-packages/tenacity/asyncio/__init__.py", line 153, in iter
   result = await action(retry_state)
 File "/root/miniconda3/envs/grag/lib/python3.10/site-packages/tenacity/_utils.py", line 99, in inner
   return call(*args, **kwargs)
 File "/root/miniconda3/envs/grag/lib/python3.10/site-packages/tenacity/__init__.py", line 398, in <lambda>
   self._add_action_func(lambda rs: rs.outcome.result())
 File "/root/miniconda3/envs/grag/lib/python3.10/concurrent/futures/_base.py", line 451, in result
   return self.__get_result()
 File "/root/miniconda3/envs/grag/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
   raise self._exception
 File "/root/miniconda3/envs/grag/lib/python3.10/site-packages/graphrag/llm/base/rate_limiting_llm.py", line 165, in execute_with_retry
   return await do_attempt(), start
 File "/root/miniconda3/envs/grag/lib/python3.10/site-packages/graphrag/llm/base/rate_limiting_llm.py", line 147, in do_attempt
   return await self._delegate(input, **kwargs)
 File "/root/miniconda3/envs/grag/lib/python3.10/site-packages/graphrag/llm/base/base_llm.py", line 48, in __call__
   return await self._invoke_json(input, **kwargs)
 File "/root/miniconda3/envs/grag/lib/python3.10/site-packages/graphrag/llm/openai/openai_chat_llm.py", line 92, in _invoke_json
   result = await generate()
 File "/root/miniconda3/envs/grag/lib/python3.10/site-packages/graphrag/llm/openai/openai_chat_llm.py", line 84, in generate
   await self._native_json(input, **{**kwargs, "name": call_name})
 File "/root/miniconda3/envs/grag/lib/python3.10/site-packages/graphrag/llm/openai/openai_chat_llm.py", line 123, in _native_json
   json_output = try_parse_json_object(raw_output)
 File "/root/miniconda3/envs/grag/lib/python3.10/site-packages/graphrag/llm/openai/utils.py", line 94, in try_parse_json_object
   result = json.loads(clean_json)
 File "/root/miniconda3/envs/grag/lib/python3.10/json/__init__.py", line 346, in loads
   return _default_decoder.decode(s)
 File "/root/miniconda3/envs/grag/lib/python3.10/json/decoder.py", line 337, in decode
   obj, end = self.raw_decode(s, idx=_w(s, 0).end())
 File "/root/miniconda3/envs/grag/lib/python3.10/json/decoder.py", line 353, in raw_decode
   obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 5 column 45 (char 405)
16:27:49,487 graphrag.index.reporting.file_workflow_callbacks INFO Community Report Extraction Error details=None
16:27:49,487 graphrag.index.verbs.graph.report.strategies.graph_intelligence.run_graph_intelligence WARNING No report found for community: 2

However, when I copy this JSON string and test it, the format seems correct.

I have also tried some fixes, such as manually changing the JSON format and changing the prompt format. That helps when the output is not a standard JSON string. But my output now looks fine, and the errors still occur.

What's more, I tried different models such as qwen2:1.5b and phi3, but they are all small models. Does this mean GraphRAG doesn't support these small models?
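
To see which character the parser actually stopped on, it can help to print the text around the position reported in the exception. A minimal sketch using only the standard library follows; `locate_json_error` is a hypothetical helper, not part of GraphRAG. The traceback shows GraphRAG parsing a variable named clean_json, which may differ from the raw string printed in the log, so the cleaned string is the one worth testing.

```python
import json
from typing import Optional


def locate_json_error(text: str, width: int = 20) -> Optional[dict]:
    """Return details about the first JSON syntax error in text, or None if it parses."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        start = max(err.pos - width, 0)
        return {
            "message": err.msg,
            "lineno": err.lineno,
            "colno": err.colno,
            # repr() makes invisible characters (tabs, odd quotes, escapes) visible
            "context": repr(text[start : err.pos + width]),
        }
    return None


# An unescaped double quote inside a string value ends the string early.
bad = '{\n  "a": "it"s"\n}'
print(locate_json_error(bad))
print(locate_json_error('{"a": "it\'s"}'))  # None
```

Comparing the reported line and column with the logged output shows whether the failure sits at a quote, an escape, or a truncated value.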

Additional Information

GraphRAG Version:
Operating System:
Python Version:
Related Issues:


yurochang · 2024-07-23T10:07:44Z

When I use llama3 (80k input), I get a similar error in the global search part. When I use qwen2:7b (320k input), it is solved.

However, local search still does not work: ZeroDivisionError: Weights sum to zero, can't be normalized
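
That message is the one NumPy's np.average raises when the supplied weights sum to zero, which could happen if, for example, every chunk embedding failed and contributed a weight of 0. Below is a minimal pure-Python sketch of a weighted average that fails with a clearer error in that case. `weighted_average` is a hypothetical helper for illustration, not GraphRAG code.

```python
from typing import List


def weighted_average(vectors: List[List[float]], weights: List[float]) -> List[float]:
    """Average equal-length vectors, weighting each one by its weight."""
    if len(vectors) != len(weights):
        raise ValueError("vectors and weights must have the same length")
    total = sum(weights)
    if total == 0:
        # The weights sum to zero (or there are no vectors): nothing usable to average.
        raise ValueError("no usable embeddings: weights sum to zero")
    dim = len(vectors[0])
    return [
        sum(vec[i] * w for vec, w in zip(vectors, weights)) / total
        for i in range(dim)
    ]


print(weighted_average([[1.0, 0.0], [0.0, 1.0]], [1, 3]))  # [0.25, 0.75]
```

Raising early like this points at the real problem (the embedding calls returned nothing) instead of at the averaging step.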

natoverse · 2024-07-23T15:25:20Z

Consolidating alternate model issues here: #657

wenwkich · 2024-07-23T19:18:50Z

When I use llama3 (80k input), I get a similar error in the global search part. When I use qwen2:7b (320k input), it is solved.

However, local search still does not work: ZeroDivisionError: Weights sum to zero, can't be normalized

@yurochang
I think you mean 8k input for llama3. Are you also using Ollama for embeddings? I was able to run a local query without error after modifying the code in graphrag\query\llm\oai\embedding.py as follows (you need to pip install ollama first), but it yields completely out-of-context results.

I also tried other solutions from the GitHub issues; they either gave out-of-context results or the same error. I printed the context_text after the context building (before the local search happens): it was somewhat related to the input, but it didn't include anything related to my question.

I think it might be due to the errors when creating the community reports, and to our models being too small. As of today, Ollama supports the newest Llama 3.1 model with a 128k context window, so I'm going to give it a try.

# Copyright (c) 2024 Microsoft Corporation.
# Licensed under the MIT License

"""OpenAI Embedding model implementation."""

import asyncio
from collections.abc import Callable
from typing import Any

import numpy as np
import tiktoken
from tenacity import (
    AsyncRetrying,
    RetryError,
    Retrying,
    retry_if_exception_type,
    stop_after_attempt,
    wait_exponential_jitter,
)

from graphrag.query.llm.base import BaseTextEmbedding
from graphrag.query.llm.oai.base import OpenAILLMImpl
from graphrag.query.llm.oai.typing import (
    OPENAI_RETRY_ERROR_TYPES,
    OpenaiApiType,
)
from graphrag.query.llm.text_utils import chunk_text
from graphrag.query.progress import StatusReporter

import ollama
import json

class OpenAIEmbedding(BaseTextEmbedding, OpenAILLMImpl):
    """Wrapper for OpenAI Embedding models."""

    def __init__(
        self,
        api_key: str | None = None,
        azure_ad_token_provider: Callable | None = None,
        model: str = "text-embedding-3-small",
        deployment_name: str | None = None,
        api_base: str | None = None,
        api_version: str | None = None,
        api_type: OpenaiApiType = OpenaiApiType.OpenAI,
        organization: str | None = None,
        encoding_name: str = "cl100k_base",
        max_tokens: int = 8191,
        max_retries: int = 10,
        request_timeout: float = 180.0,
        retry_error_types: tuple[type[BaseException]] = OPENAI_RETRY_ERROR_TYPES,  # type: ignore
        reporter: StatusReporter | None = None,
    ):
        OpenAILLMImpl.__init__(
            self=self,
            api_key=api_key,
            azure_ad_token_provider=azure_ad_token_provider,
            deployment_name=deployment_name,
            api_base=api_base,
            api_version=api_version,
            api_type=api_type,  # type: ignore
            organization=organization,
            max_retries=max_retries,
            request_timeout=request_timeout,
            reporter=reporter,
        )

        self.model = model
        self.encoding_name = encoding_name
        self.max_tokens = max_tokens
        self.token_encoder = tiktoken.get_encoding(self.encoding_name)
        self.retry_error_types = retry_error_types

    def embed(self, text: str, **kwargs: Any) -> list[float]:
        """
        Embed text using OpenAI Embedding's sync function.

        For text longer than max_tokens, chunk texts into max_tokens, embed each chunk, then combine using weighted average.
        Please refer to: https://github.com/openai/openai-cookbook/blob/main/examples/Embedding_long_inputs.ipynb
        """
        token_chunks = chunk_text(
            text=text, token_encoder=self.token_encoder, max_tokens=self.max_tokens
        )
        chunk_embeddings = []
        chunk_lens = []
        for chunk in token_chunks:
            try:
                embedding, chunk_len = self._embed_with_retry(chunk, **kwargs)
                chunk_embeddings.append(embedding)
                chunk_lens.append(chunk_len)
            # TODO: catch a more specific exception
            except Exception as e:  # noqa BLE001
                self._reporter.error(
                    message="Error embedding chunk",
                    details={self.__class__.__name__: str(e)},
                )

                continue
        chunk_embeddings = np.average(chunk_embeddings, axis=0, weights=chunk_lens)
        chunk_embeddings = chunk_embeddings / np.linalg.norm(chunk_embeddings)
        return chunk_embeddings.tolist()

    async def aembed(self, text: str, **kwargs: Any) -> list[float]:
        """
        Embed text using OpenAI Embedding's async function.

        For text longer than max_tokens, chunk texts into max_tokens, embed each chunk, then combine using weighted average.
        """
        token_chunks = chunk_text(
            text=text, token_encoder=self.token_encoder, max_tokens=self.max_tokens
        )
        chunk_embeddings = []
        chunk_lens = []
        embedding_results = await asyncio.gather(*[
            self._aembed_with_retry(chunk, **kwargs) for chunk in token_chunks
        ])
        embedding_results = [result for result in embedding_results if result[0]]
        chunk_embeddings = [result[0] for result in embedding_results]
        chunk_lens = [result[1] for result in embedding_results]
        chunk_embeddings = np.average(chunk_embeddings, axis=0, weights=chunk_lens)  # type: ignore
        chunk_embeddings = chunk_embeddings / np.linalg.norm(chunk_embeddings)
        return chunk_embeddings.tolist()

    def _embed_with_retry(
        self, text: str | tuple, **kwargs: Any
    ) -> tuple[list[float], int]:
        try:
            retryer = Retrying(
                stop=stop_after_attempt(self.max_retries),
                wait=wait_exponential_jitter(max=10),
                reraise=True,
                retry=retry_if_exception_type(self.retry_error_types),
            )
            for attempt in retryer:
                with attempt:
                    # embedding = (
                    #     self.sync_client.embeddings.create(  # type: ignore
                    #         input=text,
                    #         model=self.model,
                    #         **kwargs,  # type: ignore
                    #     )
                    #     .data[0]
                    #     .embedding
                    #     or []
                    # )

                    if isinstance(text, tuple):
                        text = json.dumps(text)
                    embedding = ollama.embeddings(model="nomic-embed-text", prompt=text)
                    embedding = list(embedding["embedding"])
                    
                    return (embedding, len(text))
        except RetryError as e:
            self._reporter.error(
                message="Error at embed_with_retry()",
                details={self.__class__.__name__: str(e)},
            )
            return ([], 0)
        else:
            # TODO: why not just throw in this case?
            return ([], 0)

    async def _aembed_with_retry(
        self, text: str | tuple, **kwargs: Any
    ) -> tuple[list[float], int]:
        try:
            retryer = AsyncRetrying(
                stop=stop_after_attempt(self.max_retries),
                wait=wait_exponential_jitter(max=10),
                reraise=True,
                retry=retry_if_exception_type(self.retry_error_types),
            )
            async for attempt in retryer:
                with attempt:
                    # embedding = (
                    #     await self.async_client.embeddings.create(  # type: ignore
                    #         input=text,
                    #         model=self.model,
                    #         **kwargs,  # type: ignore
                    #     )
                    # ).data[0].embedding or []

                    if isinstance(text, tuple):
                        text = json.dumps(text)
                    embedding = ollama.embeddings(model="nomic-embed-text", prompt=text)
                    embedding = list(embedding["embedding"])

                    return (embedding, len(text))
        except RetryError as e:
            self._reporter.error(
                message="Error at embed_with_retry()",
                details={self.__class__.__name__: str(e)},
            )
            return ([], 0)
        else:
            # TODO: why not just throw in this case?
            return ([], 0)
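
A possible cause of the out-of-context results, offered as a guess: the `text: str | tuple` signature suggests chunk_text yields tuples of token ids, in which case `json.dumps(text)` sends a string of numbers to the embedding model instead of the original words. A minimal sketch of decoding first follows. `chunk_to_text` is a hypothetical helper, and `decode` stands in for something like `self.token_encoder.decode`; this is untested inside GraphRAG.

```python
import json
from typing import Callable, List, Sequence, Union


def chunk_to_text(
    chunk: Union[str, Sequence[int]],
    decode: Callable[[List[int]], str],
) -> str:
    """Return plain text for a chunk that is either a string or a sequence of token ids."""
    if isinstance(chunk, str):
        return chunk
    return decode(list(chunk))


# Toy vocabulary standing in for a real tokenizer (assumption for the demo only).
VOCAB = {0: "hello", 1: " ", 2: "world"}


def toy_decode(ids: List[int]) -> str:
    """Join the toy vocabulary entries for the given token ids."""
    return "".join(VOCAB[i] for i in ids)


print(json.dumps((0, 1, 2)))                 # [0, 1, 2]
print(chunk_to_text((0, 1, 2), toy_decode))  # hello world
```

With a change along these lines, the length passed back as the chunk weight would also be the length of real text rather than of the serialized id list.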

kangqiaosiyuetian · 2024-08-08T07:26:53Z

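When I use llama3 (80k input), I get a similar error in the global search part. When I use qwen2:7b (320k input), it is solved.

However, local search still does not work: ZeroDivisionError: Weights sum to zero, can't be normalized

I am using qwen2:7b, but the problem still exists.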

ayanjiushishuai added the bug and triage labels Jul 23, 2024

natoverse closed this as not planned Jul 23, 2024

natoverse added the community_support label and removed the bug and triage labels Jul 23, 2024

52doho mentioned this issue Aug 6, 2024

The embeddings api interface is not working properly. ollama/ollama#5870

Closed
