Graph-constrained Reasoning (GCR)

Official implementation of "Graph-constrained Reasoning: Faithful Reasoning on Knowledge Graphs with Large Language Models".

Graph-constrained Reasoning (GCR) is a novel framework that bridges structured knowledge in KGs with unstructured reasoning in LLMs. GCR ensures faithful KG-grounded reasoning by integrating KG structure into the LLM decoding process through KG-Trie. This allows LLMs to directly reason on graphs and generate faithful reasoning paths grounded in KGs to achieve accurate reasoning with zero reasoning hallucination.
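To make the KG-Trie idea concrete, here is a minimal sketch of a prefix tree over tokenized reasoning paths. It is not the repository's implementation: the `TokenTrie` class, the toy `VOCAB`, and the `<path>` / `</path>` markers are all assumptions chosen for illustration. The point it shows is that, at each decoding step, only tokens that continue some path stored in the trie are allowed.

```python
from typing import Dict, Iterable, List, Sequence


class TokenTrie:
    """Prefix tree over token-ID sequences (a toy stand-in for a KG-Trie)."""

    def __init__(self, sequences: Iterable[Sequence[int]] = ()) -> None:
        self.root: Dict[int, dict] = {}
        for seq in sequences:
            self.add(seq)

    def add(self, seq: Sequence[int]) -> None:
        """Insert one tokenized path into the trie."""
        node = self.root
        for tok in seq:
            node = node.setdefault(tok, {})

    def allowed_next(self, prefix: Sequence[int]) -> List[int]:
        """Return the token IDs that extend `prefix` along some stored path."""
        node = self.root
        for tok in prefix:
            if tok not in node:
                return []  # the prefix has left the graph: nothing is allowed
            node = node[tok]
        return sorted(node)


# Hypothetical toy vocabulary; a real system would use the LLM tokenizer.
VOCAB = {"<path>": 0, "Obama": 1, "born_in": 2, "Honolulu": 3,
         "spouse": 4, "Michelle": 5, "</path>": 6}


def encode(words: Sequence[str]) -> List[int]:
    return [VOCAB[w] for w in words]


paths = [
    ["<path>", "Obama", "born_in", "Honolulu", "</path>"],
    ["<path>", "Obama", "spouse", "Michelle", "</path>"],
]
trie = TokenTrie(encode(p) for p in paths)

print(trie.allowed_next(encode(["<path>", "Obama"])))             # [2, 4]
print(trie.allowed_next(encode(["<path>", "Obama", "born_in"])))  # [3]
print(trie.allowed_next(encode(["<path>", "Honolulu"])))          # []
```

In Hugging Face Transformers, a lookup of this kind can be supplied to `generate` through its `prefix_allowed_tokens_fn` argument. Whether GCR wires the trie in exactly this way is not stated in this README.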

Dependencies

We use Poetry to manage dependencies. CUDA 12.1 is recommended.

Step 1: Install Poetry
curl -sSL https://install.python-poetry.org | python3 -

Step 2: Create a conda environment and install dependencies

conda create -n GCR python=3.12
conda activate GCR
poetry install

Step 3: Install FlashAttention for fast decoding

pip install flash-attn --no-build-isolation

Build graph index

Note

Our code automatically downloads the data from Hugging Face.

Build graph index for training: scripts/build_graph_index.sh

Graph index will be saved under: data/graph_index.

[Optional] Build graph index for evaluation:

You can pre-build the graph index for faster evaluation. Otherwise, the evaluation script builds the graph index on the fly.

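DATA_LIST="RoG-webqsp RoG-cwq"
SPLIT=test
N_PROCESS=8
HOP=2 # or 3
# Build one index per dataset; DATA_LIST is intentionally unquoted so it splits on whitespace.
for DATA_PATH in ${DATA_LIST}; do
    python workflow/build_graph_index.py --d "${DATA_PATH}" --split "${SPLIT}" --n "${N_PROCESS}" --K "${HOP}"
done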
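The `--K` flag above sets the hop limit for the index. As an illustration of what a hop-limited index contains, the sketch below enumerates all simple paths of up to `k` hops from a start entity. The function names, the toy triples, and the choice of depth-first search are assumptions; the actual logic lives in `workflow/build_graph_index.py` and may differ.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)
Path = List[Triple]


def build_adjacency(triples: List[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Map each head entity to its outgoing (relation, tail) edges."""
    adj: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for head, rel, tail in triples:
        adj[head].append((rel, tail))
    return adj


def paths_within_k_hops(adj, start: str, k: int) -> List[Path]:
    """Enumerate all simple paths of length 1..k that begin at `start`."""
    results: List[Path] = []

    def dfs(node: str, path: Path, visited: Set[str]) -> None:
        if len(path) == k:  # hop limit reached
            return
        for rel, nxt in adj.get(node, []):
            if nxt in visited:  # skip cycles
                continue
            new_path = path + [(node, rel, nxt)]
            results.append(new_path)
            dfs(nxt, new_path, visited | {nxt})

    dfs(start, [], {start})
    return results


# Hypothetical toy graph.
triples = [
    ("Obama", "born_in", "Honolulu"),
    ("Honolulu", "located_in", "Hawaii"),
    ("Hawaii", "part_of", "USA"),
    ("Obama", "spouse", "Michelle"),
]
adj = build_adjacency(triples)
for p in paths_within_k_hops(adj, "Obama", k=2):
    print(" -> ".join(f"{h} [{r}] {t}" for h, r, t in p))
```

With `k=2` this toy graph yields three paths; raising `k` to 3 adds the path that continues from Hawaii to USA, which is why a larger hop count produces a larger index.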

Training the lightweight KG-specialized LLM

We provide the training script for fine-tuning the lightweight KG-specialized LLM on the graph-constrained decoding task.

The script provides the following model configurations: Qwen2-0.5B/1.5B/7B, Llama-2-7B, and Llama-3.1-8B. It can be easily extended to other LLMs.

Uncomment the corresponding "model configurations block" (Llama-3.1-8B by default) and run the script: scripts/train_kg_specialized_llm.sh.

Models will be saved at: save_models/${SAVE_NAME}.

The training resources and time for each model configuration are as follows:

Note

We provide pre-trained weights for the lightweight KG-specialized LLMs: Qwen2-0.5B, Llama-2-7B, and Llama-3.1-8B. You can find the pre-trained weights here and use them for inference.

Inference

Step 1: Graph-constrained decoding

We first use the KG-specialized LLM to generate several KG-grounded reasoning paths and hypothesis answers with beam search.

Note

Our code automatically downloads the model weights from Hugging Face.

Run: scripts/graph_constrained_decoding.sh

MODEL_PATH=rmanluo/GCR-Meta-Llama-3.1-8B-Instruct
MODEL_NAME=$(basename "$MODEL_PATH")

python workflow/predict_paths_and_answers.py \
  --data_path rmanluo \
  --d {RoG-webqsp,RoG-cwq} \
  --split test \
  --index_path_length 2 \
  --model_name ${MODEL_NAME} \
  --model_path ${MODEL_PATH} \
  --k 10 \
  --prompt_mode zero-shot \
  --generation_mode group-beam \
  --attn_implementation flash_attention_2

Generated reasoning paths and hypothesis answers will be saved at: results/GenPaths/{dataset}/{model_name}/{split}.
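The decoding step can be pictured as a beam search that expands only continuations present in the trie. The sketch below is a toy illustration under stated assumptions: it uses plain beam search rather than the `group-beam` mode selected above, whole words stand in for tokens, and the `LOGP` table is a made-up replacement for the log-probabilities an LLM would supply.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Beam = Tuple[Tuple[str, ...], float]  # (token prefix, cumulative log-prob)


def build_trie(sequences: Sequence[Sequence[str]]) -> Dict[str, dict]:
    """Nested-dict prefix tree over token sequences."""
    root: Dict[str, dict] = {}
    for seq in sequences:
        node = root
        for tok in seq:
            node = node.setdefault(tok, {})
    return root


def children(trie: Dict[str, dict], prefix: Sequence[str]) -> List[str]:
    """Tokens that may follow `prefix`; empty at a leaf or off the trie."""
    node = trie
    for tok in prefix:
        node = node.get(tok)
        if node is None:
            return []
    return list(node)


def constrained_beam_search(
    trie: Dict[str, dict],
    score_fn: Callable[[Tuple[str, ...], str], float],
    k: int,
    max_len: int,
) -> List[Beam]:
    """Keep the k best prefixes per step, expanding only trie children."""
    beams: List[Beam] = [((), 0.0)]
    finished: List[Beam] = []
    for _ in range(max_len):
        candidates: List[Beam] = []
        for prefix, logp in beams:
            for tok in children(trie, prefix):
                candidates.append((prefix + (tok,), logp + score_fn(prefix, tok)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for cand in candidates[:k]:
            # A prefix with no children is a complete path in the trie.
            if children(trie, cand[0]):
                beams.append(cand)
            else:
                finished.append(cand)
        if not beams:
            break
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished[:k]


# Hypothetical per-token log-probabilities standing in for an LLM.
LOGP = {"Obama": 0.0, "born_in": math.log(0.7), "spouse": math.log(0.3),
        "Honolulu": math.log(0.9), "Hawaii": math.log(0.1), "Michelle": 0.0}

trie = build_trie([
    ("Obama", "born_in", "Honolulu"),
    ("Obama", "born_in", "Hawaii"),
    ("Obama", "spouse", "Michelle"),
])
results = constrained_beam_search(trie, lambda prefix, tok: LOGP[tok], k=2, max_len=3)
for path, logp in results:
    print(" ".join(path), round(math.exp(logp), 2))
```

Every returned sequence is a complete path from the trie, so a beam can never drift outside the graph. The value of `k` here plays the same role as the `--k 10` flag in the command above: it bounds how many paths are returned.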

Step 2: Graph inductive reasoning

We use a general LLM to reason over multiple reasoning paths and hypothesis answers to produce the final answer without additional training.

Run: scripts/graph_inductive_reasoning.sh

python workflow/predict_final_answer.py \
  --data_path rmanluo \
  --d {RoG-webqsp,RoG-cwq} \
  --split test \
  --model_name {gpt-3.5-turbo, gpt-4o-mini} \
  --reasoning_path {REASONING_PATH} \
  --add_path True \
  -n 10
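As a rough picture of what the general LLM receives in this step, the sketch below assembles several reasoning paths and their hypothesis answers into a single prompt. The function name, the prompt wording, and the path format are all assumptions made for illustration; the prompt actually used is defined in the repository code, and no API call is made here.

```python
from typing import List, Tuple


def build_reasoning_prompt(question: str, candidates: List[Tuple[str, str]]) -> str:
    """Format (reasoning path, hypothesis answer) pairs into one prompt string."""
    lines = ["Reasoning paths:"]
    for i, (path, answer) in enumerate(candidates, start=1):
        lines.append(f"{i}. {path} => {answer}")
    lines.append("")
    lines.append(f"Question: {question}")
    lines.append("Based on the reasoning paths above, give the final answer(s).")
    return "\n".join(lines)


# Hypothetical candidates, as they might come out of Step 1.
candidates = [
    ("Obama -> born_in -> Honolulu", "Honolulu"),
    ("Obama -> spouse -> Michelle", "Michelle"),
]
prompt = build_reasoning_prompt("Where was Obama born?", candidates)
print(prompt)
```

Presenting all candidate paths together lets the general LLM compare them and discard ones that are valid in the graph but irrelevant to the question, such as the spouse path in this example.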

Note

You need to set your OpenAI API key in .env to use ChatGPT.

Repository: TrustAGI-Lab/graph-constrained-reasoning