The content of the file generated in the index process is incorrect. #936

Closed
2 tasks
yangxue-1 opened this issue Aug 15, 2024 · 3 comments
Labels
community_support Issue handled by community members

Comments

@yangxue-1

Is there an existing issue for this?

  • I have searched the existing issues
  • I have checked #657 to validate if my issue is covered by community support

Describe the issue

Q1:
The entity descriptions in the create_final_entities.parquet file are incorrect. Some entities contain garbled characters, and some descriptions do not match their entities.

Q2:
The create_summarized_entities.parquet file displays "Nothing to show". Does this mean that the summarize_descriptions.txt file needs to be modified?
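
For anyone trying to narrow down Q1, a rough way to list which entity descriptions look garbled is sketched below. It is only a heuristic under stated assumptions: the helper names `looks_garbled` and `find_garbled` are hypothetical (not part of GraphRAG), and the check only flags the Unicode replacement character and control, private-use, unassigned, or surrogate code points. It will not catch every kind of mojibake, and it cannot tell whether a description matches its entity.

```python
import unicodedata

REPLACEMENT_CHAR = "\ufffd"  # emitted by decoders when bytes cannot be decoded


def looks_garbled(text: str) -> bool:
    """Heuristic check for encoding damage in a single string."""
    if REPLACEMENT_CHAR in text:
        return True
    for ch in text:
        if ch in "\n\r\t":
            continue  # ordinary whitespace is fine
        # Cc = control, Co = private use, Cn = unassigned, Cs = surrogate
        if unicodedata.category(ch) in {"Cc", "Co", "Cn", "Cs"}:
            return True
    return False


def find_garbled(rows):
    """rows: iterable of (name, description) pairs.

    Returns the names whose description looks garbled; None descriptions are skipped.
    """
    return [
        name
        for name, description in rows
        if description is not None and looks_garbled(str(description))
    ]
```

If pandas and a parquet engine are installed, the rows could come from something like `df = pd.read_parquet(path_to_create_final_entities)` followed by `find_garbled(zip(df["name"], df["description"]))`. The column names `name` and `description` are an assumption here and may differ between GraphRAG versions.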

Steps to reproduce

No response

GraphRAG Config Used

# Paste your config here

Logs and screenshots

No response

Additional Information

  • GraphRAG Version:
  • Operating System:
  • Python Version:
  • Related Issues:
@yangxue-1 added the triage label (Default label assignment, indicates new issue needs reviewed by a maintainer) on Aug 15, 2024
@AlonsoGuevara
Contributor

Hi!
Can you please provide more details?

  • Language of your inputs?
  • Are you using prompt tuning?
  • Model or config file.

@yangxue-1
Author

> Hi! Can you please provide more details?
>
>   • Language of your inputs?
>   • Are you using prompt tuning?
>   • Model or config file.

  • The input file is in Chinese.
  • Yes, I am using prompt tuning.
  • LLM: qwen2-7b; embedding: o200k_base

@natoverse
Collaborator

Routing this to #696. Language issues are very difficult for us to diagnose. I also suggest making sure you are on version 0.3.0, which does include some improved handling of language encoding.
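
A quick way to confirm the installed version before re-running the index is sketched below. It assumes the package was installed from PyPI under the name `graphrag`; the helper names `installed_version` and `parse_release` are hypothetical and only for illustration.

```python
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile


def installed_version(package: str = "graphrag"):
    """Return the installed version string, or None if the package is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def parse_release(v: str):
    """Turn '0.3.0' (or '0.3.0rc1') into a comparable tuple such as (0, 3, 0)."""
    parts = []
    for piece in v.split(".")[:3]:
        digits = "".join(takewhile(str.isdigit, piece))  # leading digits only
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("graphrag is not installed in this environment")
    elif parse_release(found) < (0, 3, 0):
        print(f"graphrag {found} is older than 0.3.0")
    else:
        print(f"graphrag {found} is 0.3.0 or newer")
```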

@natoverse closed this as not planned on Aug 16, 2024
@natoverse added the community_support label (Issue handled by community members) and removed the triage label (Default label assignment, indicates new issue needs reviewed by a maintainer) on Aug 16, 2024