Question on structured data input #2

Open
EvelynBai opened this issue Jan 15, 2025 · 2 comments

@EvelynBai

Hey there! First of all, thank you so much for this fantastic project! The work you've done here is truly impressive and inspiring.

I’m curious whether there’s a specific pipeline for constructing graphs from structured data such as CSV files. I tried running the default bash run.sh with the path to my structured data as the input, but the results weren’t as expected. Are there examples or guidance for handling structured data?

Thank you!

@yuh-yang
Collaborator

Hi,

Thanks for raising the question. What kind of structured data are you aiming to use? Since the data is already structured, could you organize it as a heterogeneous graph built from the CSV file, so that GraphAgent can take it in more easily?

The current run.sh showcases how GraphAgent handles unstructured data by building world knowledge on top of it. If there's a need to run directly on structured data (such as CSV files), we would be happy to release an implementation for that as well. :)
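
To make the suggestion above concrete, here is a minimal sketch of turning CSV rows into a heterogeneous graph structure, using only the Python standard library. It is not GraphAgent's pipeline or input format: the column names (account_id, merchant_id, amount), the relation name "pays", and the helper csv_to_hetero_graph are all hypothetical, chosen only for illustration. Each distinct value in the source and target columns becomes a node of its own type, and each row becomes one typed edge.

```python
import csv
import io

# Hypothetical CSV: each row links an account to a merchant.
CSV_TEXT = """account_id,merchant_id,amount
A1,M1,25.00
A1,M2,40.50
A2,M1,13.75
"""


def csv_to_hetero_graph(text, src_col, dst_col, relation):
    """Build per-type node index maps and one typed edge list from CSV rows.

    Assumes src_col and dst_col are different columns (two node types).
    """
    node_ids = {src_col: {}, dst_col: {}}  # node type -> {raw id -> integer index}
    edge_index = [[], []]                  # row 0: source indices, row 1: target indices
    edge_attrs = []                        # remaining columns, one dict per edge

    for row in csv.DictReader(io.StringIO(text)):
        # setdefault assigns the next free index the first time an id is seen.
        src = node_ids[src_col].setdefault(row[src_col], len(node_ids[src_col]))
        dst = node_ids[dst_col].setdefault(row[dst_col], len(node_ids[dst_col]))
        edge_index[0].append(src)
        edge_index[1].append(dst)
        edge_attrs.append(
            {k: v for k, v in row.items() if k not in (src_col, dst_col)}
        )

    return {
        "nodes": node_ids,
        "edges": {
            (src_col, relation, dst_col): {
                "edge_index": edge_index,
                "attrs": edge_attrs,
            }
        },
    }


graph = csv_to_hetero_graph(CSV_TEXT, "account_id", "merchant_id", "pays")
print(graph["nodes"])
print(graph["edges"][("account_id", "pays", "merchant_id")]["edge_index"])
```

The (source type, relation, target type) key and the two-row edge_index mirror the layout of a heterogeneous graph object, so a structure like this could be copied into one afterwards. How GraphAgent expects such a graph to be passed in is not covered here.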

@EvelynBai
Author

EvelynBai commented Jan 16, 2025

Thanks for the reply! I'm looking for a way to automatically convert complex structured data (like financial records) into a graph. I previously found that feeding CSV files directly into GraphRAG doesn’t yield satisfactory results, as the LLM struggles with structured data.

It would be helpful to have guidance or an implementation for directly handling structured data. Thank you! :D

I also have a question about the heterogeneous graph output. Specifically, does it include relationship information (edges) between scaffold nodes? From what I can see in the demo use case output after running bash run.sh, it only shows scaffold nodes and their associated keywords:
HeteroData(
  scaffold_nodes_dlist=[5],
  keyword_nodes_dict={
    0=[5],
    1=[4],
    2=[5],
    3=[4],
    4=[8],
  },
  paper_strengths={
    x=[2, 1],
    unified_idx=[2],
    description=[2],
  },
  paper_weaknesses={
    x=[2, 1],
    unified_idx=[2],
    description=[2],
  },
  reviewer_recommendation={
    x=[1, 1],
    unified_idx=[1],
    description=[1],
  },
  keyword={
    x=[26, 1],
    unified_idx=[26],
    description=[26],
  },
  (paper_strengths, has_keyword, keyword)={ edge_index=[2, 9] },
  (paper_weaknesses, has_keyword, keyword)={ edge_index=[2, 9] },
  (reviewer_recommendation, has_keyword, keyword)={ edge_index=[2, 8] }
)
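
For reference, the observation can be checked mechanically. The sketch below is plain Python: the edge-type tuples are copied from the printout above, and the helper name scaffold_to_scaffold is hypothetical. It keeps only edge types whose source and target are both non-keyword node types. On a live PyTorch Geometric HeteroData object, the edge_types property should give the same kind of tuple list, though that is not exercised here.

```python
# Edge types copied from the HeteroData printout above.
edge_types = [
    ("paper_strengths", "has_keyword", "keyword"),
    ("paper_weaknesses", "has_keyword", "keyword"),
    ("reviewer_recommendation", "has_keyword", "keyword"),
]


def scaffold_to_scaffold(edge_types, keyword_type="keyword"):
    """Return edge types where neither endpoint is the keyword node type."""
    return [
        (src, rel, dst)
        for src, rel, dst in edge_types
        if src != keyword_type and dst != keyword_type
    ]


scaffold_edges = scaffold_to_scaffold(edge_types)
print(scaffold_edges)  # prints []
```

For this particular printout the result is empty: every edge type runs from a scaffold node type to keyword. Whether GraphAgent can emit scaffold-to-scaffold edges in other configurations is the open question.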
