[DistGB] save as graphbolt graph directly after partition #7690

Open
wants to merge 36 commits into base: master

36 commits
ea588dd
add Friends CfromBU(wenxuanc)
Aug 8, 2024
cb01c6b
upgrade partition.py, simplify the FusedCSCSamplingGraph generation p…
Aug 13, 2024
61f7ac7
upgrade partition.py
Aug 13, 2024
c6555e2
remove an unused variable
Aug 13, 2024
8db950a
delete trailing-whitespace
Aug 13, 2024
b8f4a58
change the format following lintrunner
Aug 13, 2024
5e8895e
change the format following lintrunner_2
Aug 13, 2024
8f9e639
change a variable
Aug 14, 2024
309c015
change a variable
Aug 14, 2024
0180c43
change partition
Aug 14, 2024
a183238
change repeated variable
Aug 14, 2024
1399b38
change partition.py
Aug 14, 2024
af44769
change partition function
Aug 14, 2024
5d66e2e
Merge branch 'master' into cwx
Rhett-Ying Aug 14, 2024
d4921c9
modify the partition.py and test_partition.py
Aug 19, 2024
8dff08c
modify the partition.py and test_partition.py
Aug 19, 2024
3ab12c6
modify the partition.py and test_partition.py
Aug 19, 2024
cf8de78
modify the partition.py and test_partition.py
Aug 19, 2024
6f9cc73
modify an unused variable
Aug 19, 2024
8b93244
ad new line
Aug 19, 2024
0bd5988
add new line
Aug 19, 2024
0c83759
add dist graphbolt and test case
Aug 19, 2024
5c9d82d
fix bug on dist partition
Aug 19, 2024
6044a62
change format
Aug 19, 2024
2b8cb06
fix bug on partition feats test
Aug 20, 2024
530b1c6
fix format
Aug 20, 2024
20760c5
push before new pr
Aug 20, 2024
b9714bc
push before new pr
Aug 20, 2024
b0a4e31
change
CfromBU Aug 21, 2024
b8d1506
change test_dist_part
CfromBU Aug 21, 2024
458f6a3
change format
CfromBU Aug 21, 2024
7406ee1
change format
CfromBU Aug 21, 2024
7f18105
change test_dist_partition
Sep 8, 2024
7786f66
dist partition
Sep 8, 2024
45bb276
change partition
Sep 8, 2024
f6c5e78
convert_partition
Sep 8, 2024
519 changes: 345 additions & 174 deletions python/dgl/distributed/partition.py

Large diffs are not rendered by default.

1,072 changes: 871 additions & 201 deletions tests/distributed/test_partition.py

Large diffs are not rendered by default.

844 changes: 844 additions & 0 deletions tests/tools/test_dist_partition_graphbolt.py

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion tools/chunk_graph.py
@@ -197,7 +197,7 @@ def chunk_graph(

 if __name__ == "__main__":
     logging.basicConfig(level="INFO")
-    input_dir = "/data"
+    input_dir = "/home/ubuntu/workspace/MAG240MDataset/mag240m_kddcup2021"
     output_dir = "/chunked-data"
     (g,), _ = dgl.load_graphs(os.path.join(input_dir, "graph.dgl"))
     chunk_graph(
33 changes: 31 additions & 2 deletions tools/dispatch_data.py
@@ -1,4 +1,5 @@
 """Launching distributed graph partitioning pipeline """
+
 import argparse
 import json
 import logging
@@ -75,6 +76,10 @@ def submit_jobs(args) -> str:
     argslist += "--log-level {} ".format(args.log_level)
     argslist += "--save-orig-nids " if args.save_orig_nids else ""
     argslist += "--save-orig-eids " if args.save_orig_eids else ""
+    argslist += "--use-graphbolt " if args.use_graphbolt else ""
+    argslist += "--store-inner-edge " if args.store_inner_edge else ""
+    argslist += "--store-inner-node " if args.store_inner_node else ""
+    argslist += "--store-eids " if args.store_eids else ""
     argslist += (
         f"--graph-formats {args.graph_formats} " if args.graph_formats else ""
     )
@@ -86,7 +91,6 @@ def submit_jobs(args) -> str:
     launch_cmd = get_launch_cmd(args)
     launch_cmd += '"' + udf_cmd + '"'

-    print(launch_cmd)
     os.system(launch_cmd)

@@ -159,6 +163,31 @@ def main():
         action="store_true",
         help="Save original edge IDs into files",
     )
+    parser.add_argument(
+        "--use-graphbolt",
+        action="store_true",
+        help="Use GraphBolt for distributed train.",
+    )
+    parser.add_argument(
+        "--store-inner-node",
+        action="store_true",
+        default=False,
+        help="Store inner nodes.",
+    )
+
+    parser.add_argument(
+        "--store-inner-edge",
+        action="store_true",
+        default=False,
+        help="Store inner edges.",
+    )
+
+    parser.add_argument(
+        "--store-eids",
+        action="store_true",
+        default=False,
+        help="Store edge IDs.",
+    )
     parser.add_argument(
         "--graph-formats",
         type=str,
@@ -170,7 +199,7 @@ def main():
     )

     args, _ = parser.parse_known_args()
-
+    assert args.store_inner_edge is True
     fmt = "%(asctime)s %(levelname)s %(message)s"
     logging.basicConfig(
         format=fmt,
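
The `tools/dispatch_data.py` hunks above add four boolean switches and forward each one into the argument string that `submit_jobs` builds. The following is a minimal, self-contained sketch of that forwarding pattern, not the PR's code: `build_parser`, `build_argslist`, and `FORWARDED_FLAGS` are hypothetical names introduced only for illustration, and only the four flags shown in the diff are modeled.

```python
import argparse

# The four switches added in the diff, in the order they are appended to argslist.
# FORWARDED_FLAGS is a hypothetical name, not part of the PR.
FORWARDED_FLAGS = (
    "use_graphbolt",
    "store_inner_edge",
    "store_inner_node",
    "store_eids",
)


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical helper: register only the four flags shown in the diff."""
    parser = argparse.ArgumentParser()
    for name in FORWARDED_FLAGS:
        # store_true already defaults to False when the flag is absent.
        parser.add_argument("--" + name.replace("_", "-"), action="store_true")
    return parser


def build_argslist(args: argparse.Namespace) -> str:
    """Hypothetical helper: re-emit every flag that was set, each followed by a space."""
    argslist = ""
    for name in FORWARDED_FLAGS:
        if getattr(args, name):
            argslist += "--" + name.replace("_", "-") + " "
    return argslist


if __name__ == "__main__":
    # parse_known_args ignores unrecognized options, as in main() above.
    args, _ = build_parser().parse_known_args(["--use-graphbolt", "--store-eids"])
    print(build_argslist(args))
```

Because `action="store_true"` already defaults to `False`, the explicit `default=False` in the diff appears redundant but harmless. As written, the added `assert args.store_inner_edge is True` would make the script abort whenever `--store-inner-edge` is not passed.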