Extend functionality of Wandb Config Diff script #687

Open
wants to merge 4 commits into
base: main

@kyleclo (Contributor) commented Aug 2, 2024

  • Add tests for flatten_dict() in utils
  • Extend flatten_dict() to also flatten any dicts nested inside lists
  • Extend the wandb config comparison script to use the extended flatten_dict()

The motivation: when comparing configs, the current implementation skips some key parts of them, namely keys that hold dataset paths (all List[str]) and keys like config["evaluators.value"], which are List[Dict].

The current behavior looks something like this:
image

where we can see that the fields data.value.paths and evaluators.value aren't easily comparable.

The new behavior looks like this:
image

Here the script preserves its original behavior for the old keys, and additionally compares list elements side by side.

The downside, of course, is that with many dataset paths these config diffs can become quite long to sift through.

…nt compare wandb config script to also flatten list dicts
@kyleclo kyleclo requested a review from dirkgr August 2, 2024 09:41
olmo/util.py Outdated
new_list.append(
flatten_dict(
v,
parent_key=root,
Member

The normal way of doing this is actually to treat a List as a Mapping[int, Any]. So the key becomes something like "foo.bar.0" and "foo.bar.1", etc. Then you don't need the extra root parameter either.
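
A minimal sketch of that suggestion, for illustration only. The function name `flatten_with_indices` and its parameters are hypothetical and are not the `flatten_dict()` in olmo/util.py; the sketch only shows how list positions can become ordinary key segments, so no separate root argument is needed.

```python
from typing import Any, Dict, Mapping


def flatten_with_indices(value: Any, parent_key: str = "", separator: str = ".") -> Dict[str, Any]:
    """Flatten nested dicts and lists; list positions become key segments.

    Hypothetical helper, not the project's flatten_dict().
    """
    if isinstance(value, Mapping):
        items = value.items()
    elif isinstance(value, (list, tuple)):
        # Treat the list as Mapping[int, Any]: index -> element.
        items = enumerate(value)
    else:
        # Leaf value: store it under the key path built so far.
        return {parent_key: value}

    flat: Dict[str, Any] = {}
    for key, child in items:
        child_key = f"{parent_key}{separator}{key}" if parent_key else str(key)
        flat.update(flatten_with_indices(child, child_key, separator))
    # Note: empty dicts and empty lists contribute no keys in this sketch.
    return flat


config = {"foo": {"bar": [{"a": 1}, {"a": 2}]}, "paths": ["x", "y"]}
print(flatten_with_indices(config))
# {'foo.bar.0.a': 1, 'foo.bar.1.a': 2, 'paths.0': 'x', 'paths.1': 'y'}
```

Because every list element ends up under its own key, the same recursion handles List[str] and List[Dict] without a special case.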

}
if len(keys_with_differences) > 0:
for k in sorted(keys_with_differences):
if isinstance(left_config[k], list) and isinstance(right_config[k], list):
Member

You also don't need this if you treat lists as Mapping[int, Any]. And it will work right even if the list entries are complex. On the other hand, the output will look different / be less compact.
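
To illustrate the trade-off, here is a rough sketch of a diff over configs that have already been flattened with index keys. The helper `diff_flat_configs` and the sample keys and values are made up for this example and are not taken from the script under review.

```python
from typing import Any, Dict, List, Tuple

_MISSING = object()  # sentinel so that a stored None is not confused with an absent key


def diff_flat_configs(left: Dict[str, Any], right: Dict[str, Any]) -> List[Tuple[str, Any, Any]]:
    """Return (key, left_value, right_value) for every key whose values differ.

    Hypothetical helper: assumes both inputs are flat, with list items stored
    under keys such as "data.paths.0", so no isinstance(..., list) branch is needed.
    """
    rows: List[Tuple[str, Any, Any]] = []
    # Plain string sort: "paths.10" sorts before "paths.2".
    for key in sorted(left.keys() | right.keys()):
        left_value = left.get(key, _MISSING)
        right_value = right.get(key, _MISSING)
        if left_value != right_value:
            rows.append(
                (
                    key,
                    "<missing>" if left_value is _MISSING else left_value,
                    "<missing>" if right_value is _MISSING else right_value,
                )
            )
    return rows


left = {"data.paths.0": "a.npy", "data.paths.1": "b.npy", "evaluators.0.label": "eval-a"}
right = {"data.paths.0": "a.npy", "data.paths.1": "c.npy", "data.paths.2": "d.npy", "evaluators.0.label": "eval-a"}
for row in diff_flat_configs(left, right):
    print(row)
# ('data.paths.1', 'b.npy', 'c.npy')
# ('data.paths.2', '<missing>', 'd.npy')
```

Each differing list element gets its own row, which is the less compact output mentioned above: a list of many dataset paths yields one row per changed or added path.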
