Citation Tracking Back All the Way to the Original Text Chunk? #800
Replies: 2 comments 5 replies
-
I have implemented this; please check out my fork of graph tag.
-
It seems like we can only trace back to the original text when the citation is 'source' (which actually refers to 'create_base_text_units.parquet'). For answers with citations that are not 'source' but 'relations', 'communities', 'entities', etc., I believe it is impossible to trace back to the original text, because these are summaries produced by the LLM while the graph is being constructed. Ultimately, the traceback would consist of the series of 'text_unit_ids' used to summarize those entities, communities, etc. (this is also why the RAG answers are coherent: the graph is constructed by the LLM, and it answers us using the 'relationship' info), as shown in the example below in the .parquet file:
To achieve better RAG accuracy, the best approach today is probably a graph, but the graph has this limitation.
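To make the 'text_unit_ids' traceback concrete, here is a minimal sketch, not the library's own API. It assumes the entity (or relationship) rows and the text unit rows have already been read from the .parquet files into plain dicts, and that the fields are named "id", "text_unit_ids" and "text"; the function name and the toy data are hypothetical, and the real column names may differ between versions.

```python
# Sketch only: rows are plain dicts standing in for records read from the
# .parquet files; the "id" and "text" field names are assumptions.

def trace_to_text_units(record_ids, records, text_units):
    """Return the original chunks behind the given entity/relationship IDs."""
    by_id = {r["id"]: r for r in records}
    chunks_by_id = {t["id"]: t["text"] for t in text_units}
    traced = {}
    for rid in record_ids:
        record = by_id.get(rid)
        if record is None:
            traced[rid] = []  # cited ID not present in the table
            continue
        # Keep only text unit IDs that actually exist in the chunk table.
        traced[rid] = [
            (tid, chunks_by_id[tid])
            for tid in record.get("text_unit_ids", [])
            if tid in chunks_by_id
        ]
    return traced


# Hypothetical toy data, not taken from a real index.
entities = [
    {"id": "e1", "text_unit_ids": ["t1", "t2"]},
    {"id": "e2", "text_unit_ids": ["t2"]},
]
text_units = [
    {"id": "t1", "text": "first chunk"},
    {"id": "t2", "text": "second chunk"},
]
traced = trace_to_text_units(["e1", "e3"], entities, text_units)
```

Note that this only gives the set of chunks a summary was built from, not the exact sentence behind a claim, which is the limitation described above.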
-
Hi team,
Question: Is there an existing solution to trace back the citation all the way to the original text?
More background:
I would like to trace back from the citations, e.g. Reports(ID1, ID2), Entities(ID1, ID2), all the way to the original text chunks, so that I can easily check whether the generation is actually relevant. This is critical because even when looking at the traced report or entity level, I would still have to worry about whether those contents are hallucinated.
I didn't find any discussion on that after some research (the most relevant ask seems to be issue_729).
To achieve that, I am looking into the response object from the search. Looking into result.context_data and result.context_text, the contents of these contexts are actually there. However:
result.context_data["reports"], for example, seemingly returns all the reports, not just those cited in this particular search.
context_text contains the internal content, which itself carries further citations to other levels.
It would be nice if the functionality for tracing back to the original text were already built in, and if so, what's the best way to retrieve that?
If not, it would still be great if you could give me some suggestions on how to implement it given a search response.
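As a possible starting point, here is a rough sketch of narrowing the context records down to the ones actually cited. It assumes the answer text carries citations in a form like "[Data: Reports (2, 7); Entities (12, +more)]" and that each context record can be viewed as a dict with an "id" field; both are assumptions, and parse_citations and filter_cited are hypothetical helper names, not part of the library.

```python
import re
from collections import defaultdict

# Assumed citation layout: [Data: Kind (id, id, ...); Kind (id, ...)]
DATA_BLOCK = re.compile(r"\[Data:(.*?)\]", re.DOTALL)
CITATION = re.compile(r"(\w+)\s*\(([^)]*)\)")


def parse_citations(response_text):
    """Map each cited kind (lowercased) to the set of IDs cited for it."""
    cited = defaultdict(set)
    for block in DATA_BLOCK.findall(response_text):
        for kind, ids in CITATION.findall(block):
            for token in ids.split(","):
                token = token.strip()
                if token and not token.startswith("+"):  # skip "+more"
                    cited[kind.lower()].add(token)
    return dict(cited)


def filter_cited(records, cited_ids, id_field="id"):
    """Keep only the records whose ID appears among the cited IDs."""
    return [r for r in records if str(r[id_field]) in cited_ids]


# Hypothetical answer text and context records, for illustration only.
answer = "Sales rose [Data: Reports (2, 7); Entities (12, +more)]."
cited = parse_citations(answer)
reports = [{"id": "2", "title": "A"}, {"id": "3", "title": "B"}]
cited_reports = filter_cited(reports, cited.get("reports", set()))
```

The cited entity IDs obtained this way could then be looked up against the text units they were built from, which is as close to the original chunks as the index allows.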
Thank you!