Implement loop fusion optimization inside kernels #253

Merged
merged 27 commits into from
Feb 19, 2024

NaderAlAwar

This PR adds support for loop fusion inside a kernel. It can be enabled by setting the PK_LOOP_FUSE environment variable. Two loops are fused only if all of the following conditions hold:

  1. They belong to the same scope, OR their parent nodes are two for loops that can themselves be fused.
  2. They have identical loop ranges. Right now this is checked by comparing the arguments of the two range calls as strings.
  3. They are adjacent, OR the nodes between them are safe to move to before the first loop.
  4. There are no invalid dependencies (this check is not implemented yet).
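
To make conditions 2 and 3 concrete, here is a minimal sketch built on Python's standard `ast` module. It is not the code from this PR: the helper names `range_args` and `fuse_adjacent_loops` are hypothetical, it only handles strictly adjacent loops in a single scope, it assumes both loops use the same loop variable, and it performs no dependency analysis (condition 4). It requires Python 3.9+ for `ast.unparse`.

```python
import ast


def range_args(loop):
    """Return the arguments of a `range(...)` loop header as strings, or None."""
    call = loop.iter
    if (isinstance(call, ast.Call)
            and isinstance(call.func, ast.Name)
            and call.func.id == "range"):
        return [ast.unparse(arg) for arg in call.args]
    return None


def fuse_adjacent_loops(body):
    """Fuse adjacent for loops whose range arguments match as strings.

    Simplifications (assumptions of this sketch, not of the PR): the loops
    must be strictly adjacent, use the same loop variable, and have no
    `else` clause. Dependencies between the loop bodies are NOT checked.
    """
    fused = []
    for node in body:
        prev = fused[-1] if fused else None
        if (isinstance(node, ast.For) and isinstance(prev, ast.For)
                and range_args(node) is not None
                and range_args(node) == range_args(prev)
                and ast.unparse(node.target) == ast.unparse(prev.target)
                and not node.orelse and not prev.orelse):
            # Append the second loop's statements to the first loop's body.
            prev.body.extend(node.body)
        else:
            fused.append(node)
    return fused


source = """
def kernel(n):
    for i in range(n):
        a(i)
    for i in range(n):
        b(i)
    for j in range(2):
        c(j)
"""

tree = ast.parse(source)
func = tree.body[0]
func.body = fuse_adjacent_loops(func.body)
fused_source = ast.unparse(tree)
print(fused_source)
```

The first two loops are merged because their range arguments compare equal as strings; the third is left alone because `range(2)` differs from `range(n)`. Comparing arguments as strings is conservative: `range(n)` and `range(0, n)` describe the same iteration space but would not be fused.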

@NaderAlAwar NaderAlAwar marked this pull request as draft January 25, 2024 15:59
for k in range(2):
    pk.printf("print 2 %d\n", k)

def main():
Can we format these as proper pytest tests, so that we can actually run them, with assertions at the end of each workunit?

@gliga gliga left a comment

Looks good, except for the comment about tests.

@gliga gliga left a comment

Provided only comments on tests

HannanNaeem and others added 5 commits January 31, 2024 12:58
loop_fusion_kernels.py: Contains bodies of kernels/tests
test_loop_fusion.py: Test class to set the environment, then run, capture, and compare outputs of kernels
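
The set-environment, run, capture, compare flow described above might look roughly like the sketch below. It is an illustration only, not the contents of test_loop_fusion.py: `KERNEL_SCRIPT` is a plain-Python stand-in for a real PyKokkos kernel, `run_script` is a hypothetical helper, the value "1" for PK_LOOP_FUSE is an assumption (the PR only says the variable must be set), and sorting the output lines before comparing is also an assumption, made because fusion can interleave prints from the two loops.

```python
import os
import subprocess
import sys

# Stand-in for a kernel script. A real test would launch a PyKokkos
# workunit here; plain prints keep this sketch self-contained.
KERNEL_SCRIPT = """
for k in range(2):
    print("print 1 %d" % k)
for k in range(2):
    print("print 2 %d" % k)
"""


def run_script(script, fuse):
    """Run `script` in a subprocess and return its sorted stdout lines."""
    env = dict(os.environ)
    env.pop("PK_LOOP_FUSE", None)
    if fuse:
        env["PK_LOOP_FUSE"] = "1"  # assumed value; the PR only says "setting" it
    result = subprocess.run(
        [sys.executable, "-c", script],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    # Sort lines because fusion may change the order of independent prints.
    return sorted(result.stdout.splitlines())


def test_fusion_preserves_output():
    """pytest-style check: output is the same with and without fusion."""
    unfused = run_script(KERNEL_SCRIPT, fuse=False)
    fused = run_script(KERNEL_SCRIPT, fuse=True)
    assert unfused == fused
```

Running each configuration in its own subprocess keeps the environment variable from leaking between tests, which matters if the optimization is decided once at import or translation time.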
@NaderAlAwar NaderAlAwar marked this pull request as ready for review February 13, 2024 00:10
@NaderAlAwar NaderAlAwar merged commit 7bfd51b into kokkos:main Feb 19, 2024
5 checks passed
3 participants