Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

theta: change API for mapping loop variables #675

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

caleridas
Copy link
Collaborator

Provide an API to loop nodes that allows mapping the various representation pieces of a loop variable. Remove
Theta{Input|Output|Argument|Result}.

@caleridas caleridas requested a review from phate December 2, 2024 20:50
@caleridas
Copy link
Collaborator Author

This change again provides names for variable roles that may be controversial:

  • I picked "pre/post" to designate the places of the variables at loop start and end as pre/post are fairly short
  • more precisely, they could be called "preIter" / "postIter" or similar, but that may be too verbose
  • I also considered "entry" resp. "exit" as names; however they are too ambiguous IMHO as they could be construed to mean the "one time" loop entry / exit values (i.e. before first and after last iteration)
  • unlike gamma where there is no such ambiguity with argument / result, the same ambiguity for argument / result exists as for entry / exit

@phate phate requested review from haved and Felix-Rm December 3, 2024 05:39
Copy link
Owner

@phate phate left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks all good to me. I am fine with the names, even though preIteration and postIteration would be a bit more descriptive, but is also quite long.

The only reason why I block this temporarily is because of the loop var removal and index remapping. Where is this happening such that this is necessary?

Comment on lines 32 to +34
class ThetaInput;
class ThetaOutput;
class ThetaResult;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You just removed these classes below. These forward declarations can be removed as well, no?

Comment on lines +194 to +195
LoopVar
AddLoopVar(rvsdg::output * origin);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we get some documentation on it?

Comment on lines +29 to +28
/* theta node */

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be removed. Does not add any value

auto thetaInput = util::AssertedCast<ThetaInput>(input);
return ThetaArgument::Create(region, *thetaInput);
}
auto nf = graph()->node_normal_form(typeid(jlm::rvsdg::operation));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rvsdg::operation

Comment on lines +282 to +306
// Calling RemoveThetaInputsWhere/RemoveThetaOutputsWhere can result
// in inputs (and pre-loop arguments) and outputs (and post-loop results)
// to become unmatched. In this case, the theta node itself has
// "invalid" shape until fixed properly.
// The indices of unmatched inputs/outputs are tracked here to
// detect this situation, and also to provide correct mapping.
// Computing the mapping is a bit fiddly as it requires adjusting
// indices accordingly, should seriously consider whether this
// is really necessary or things can rather be reformulated such that
// inputs/outputs are always consistent.

std::optional<std::size_t>
MapInputToOutputIndex(std::size_t index) const noexcept;

std::optional<std::size_t>
MapOutputToInputIndex(std::size_t index) const noexcept;

void
MarkInputIndexErased(std::size_t index) noexcept;

void
MarkOutputIndexErased(std::size_t index) noexcept;

std::vector<std::size_t> unmatchedInputs;
std::vector<std::size_t> unmatchedOutputs;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really necessary? The only places where theta variables are removed should be DNE, and there we use the RemoveThetaInputsWhere and RemoveThetaOutputsWhere immediately after each other to ensure that the inconsistent state is very local. Where else are theta inputs/outputs are removed such that this becomes necessary?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this, and I have a proposal how to do it better, but I would like to separate this from the present PR:

Outputs that are deemed to be "dead" do not play a role in computation, so we are allowed to assign an arbitrary value to it. So instead of leaving unbalanced inputs/outputs we can always route the pre-loop value through to satisfy it. Then, in a second step, we can eliminate them as loop-invariant, and unused inside the loop. This way we avoid ever having unbalanced inputs/outputs, but the tracking of trying to eliminate them needs to move into DNE.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@caleridas Sounds reasonable to me. I will mark this conversation as resolved then. I would like you to push this fix for the imbalance then rather soon though such that we do not have the code above lingering around for too long.

Provide an API to loop nodes that allows mapping the various
representation pieces of a loop variable. Remove
Theta{Input|Output|Argument|Result}.
@caleridas
Copy link
Collaborator Author

I have a hard time interpreting the failure from the hls test suite. On github I have no clear indication what exactly fails and when I run it locally, I get this:

Building: base/test_return
Building: base/test_arithmetic
Building: base/test_array
Building: base/test_conditional
Building: base/test_load
Building: base/test_gep
Building: base/test_loop
Building: base/test_nested_loops
Building: base/test_global
Building: base/test_matrix
Building: base/test_cpu
Building: benchmarks/DSS/smm
Building: benchmarks/DSS/getTanh
Building: benchmarks/Inter-block/matrixtrans
Building: benchmarks/Inter-block/substring
Building: benchmarks/PNAnalyser/vecTrans
Building: benchmarks/PNAnalyser/vecTrans2
Building: polybench/jacobi_1d
Building: polybench/correlation
Building: dynamatic/fir
Building with gcc: dynamatic/fir
Building: dynamatic/gaussian
Building with gcc: dynamatic/gaussian
Building: dynamatic/if_loop_1
Building with gcc: dynamatic/if_loop_1
Building: dynamatic/if_loop_2
Building with gcc: dynamatic/if_loop_2
Building: dynamatic/if_loop_3
Building with gcc: dynamatic/if_loop_3
Building: dynamatic/iir
Building with gcc: dynamatic/iir
Building: dynamatic/image_resize
Building with gcc: dynamatic/image_resize
Building: dynamatic/insertion_sort
Building with gcc: dynamatic/insertion_sort
Building with gcc: dynamatic/kernel_2mm
Building: dynamatic/kernel_2mm
Building: dynamatic/kernel_3mm
Building with gcc: dynamatic/kernel_3mm
Building: dynamatic/loop_array
Building with gcc: dynamatic/loop_array
Building: dynamatic/matrix
Building with gcc: dynamatic/matrix
Building: dynamatic/memory_loop
Building with gcc: dynamatic/memory_loop
Building: dynamatic/mul_example
Building with gcc: dynamatic/mul_example
Building: dynamatic/pivot
Building with gcc: dynamatic/pivot
Building: dynamatic/simple_example_1
Building with gcc: dynamatic/simple_example_1
Building: dynamatic/simple_example_2
Building with gcc: dynamatic/simple_example_2
Building with gcc: dynamatic/stencil_2d
Building: dynamatic/sumi3_mem
Building: dynamatic/stencil_2d
Building with gcc: dynamatic/sumi3_mem
Building: dynamatic/test_memory_1
Building with gcc: dynamatic/test_memory_1
Building: dynamatic/test_memory_2
Building with gcc: dynamatic/test_memory_2
Building: dynamatic/test_memory_3
Building with gcc: dynamatic/test_memory_3
Building: dynamatic/test_memory_4
Building with gcc: dynamatic/test_memory_4
Building with gcc: dynamatic/test_memory_6
Building: dynamatic/test_memory_6
Building: dynamatic/test_memory_7
Building with gcc: dynamatic/test_memory_7
Building: dynamatic/test_memory_8
Building with gcc: dynamatic/test_memory_8
Building: dynamatic/test_memory_9
Building with gcc: dynamatic/test_memory_9
Building: dynamatic/test_memory_10
Building with gcc: dynamatic/test_memory_10
Building: dynamatic/test_memory_11
Building with gcc: dynamatic/test_memory_11
Building: dynamatic/test_memory_12
Building with gcc: dynamatic/test_memory_12
Building: dynamatic/test_memory_13
Building with gcc: dynamatic/test_memory_13
Building: dynamatic/test_memory_14
Building with gcc: dynamatic/test_memory_14
Building: dynamatic/test_memory_15
Building with gcc: dynamatic/test_memory_15
Building: dynamatic/test_memory_16
Building with gcc: dynamatic/test_memory_16
Building with gcc: dynamatic/test_memory_17
Building: dynamatic/test_memory_17
Building: dynamatic/test_memory_18
Building with gcc: dynamatic/test_memory_18
Building: dynamatic/threshold
Building with gcc: dynamatic/threshold
Building: dynamatic/vector_rescale
Building with gcc: dynamatic/vector_rescale
Building: dynamatic/video_filter
Building with gcc: dynamatic/video_filter
Building: dynamatic/bicg
Building with gcc: dynamatic/bicg
Building: dynamatic/gemver
Building with gcc: dynamatic/gemver
Building: dynamatic/matrix_power
Building with gcc: dynamatic/matrix_power
Building: dynamatic/matvec
Building with gcc: dynamatic/matvec
Building: dynamatic/triangular
Building with gcc: dynamatic/triangular
Building: dynamatic/binary_search
Building with gcc: dynamatic/binary_search
Building: dynamatic/gcd
Building with gcc: dynamatic/gcd
Building with gcc: dynamatic/polyn_mult
Building: dynamatic/polyn_mult
Building with gcc: dynamatic/sobel
Running: base/test_return
Running: base/test_arithmetic
Running: base/test_loop
Running: base/test_nested_loops
Running: base/test_conditional
Running: base/test_load
Running: base/test_array
Running: base/test_gep
Building: dynamatic/sobel
Running: base/test_global
Running: base/test_matrix
Running: base/test_cpu
Running: benchmarks/DSS/getTanh
Running: benchmarks/DSS/smm
Running: benchmarks/Inter-block/matrixtrans
Running: benchmarks/PNAnalyser/vecTrans
Running: benchmarks/PNAnalyser/vecTrans2
Running: polybench/jacobi_1d
Running: dynamatic/fir
Running: dynamatic/gaussian
Running: dynamatic/if_loop_1
Running: dynamatic/if_loop_2
Running: dynamatic/if_loop_3
Running: dynamatic/iir
Running: dynamatic/image_resize
Running: dynamatic/insertion_sort
Running: dynamatic/kernel_2mm
Running: dynamatic/kernel_3mm
Running: dynamatic/loop_array
Running: dynamatic/matrix
Running: dynamatic/memory_loop
Running: dynamatic/mul_example
Running: dynamatic/pivot
Running: dynamatic/simple_example_1
Running: dynamatic/simple_example_2
Running: dynamatic/stencil_2d
Running: dynamatic/sumi3_mem
Running: dynamatic/test_memory_1
Running: dynamatic/test_memory_2
Running: dynamatic/test_memory_3
Running: dynamatic/test_memory_4
Running: dynamatic/test_memory_6
Running: dynamatic/test_memory_7
Running: dynamatic/test_memory_8
Running: dynamatic/test_memory_9
Running: dynamatic/test_memory_10
Running: dynamatic/test_memory_11
Running: dynamatic/test_memory_12
Running: dynamatic/test_memory_13
Running: dynamatic/test_memory_14
Running: dynamatic/test_memory_15
Running: dynamatic/test_memory_16
Running: dynamatic/test_memory_17
Running: dynamatic/test_memory_18
Running: dynamatic/threshold
Running: dynamatic/vector_rescale
Running: dynamatic/video_filter
Running: dynamatic/bicg
Running: dynamatic/gemver
Running: dynamatic/matrix_power
Running: dynamatic/matvec
Running: dynamatic/triangular
Running: dynamatic/binary_search
Running: dynamatic/gcd
The execution time of base/test_matrix has changed
Golden cycle time: 571
Simulated cycles: 588
The execution time of dynamatic/test_memory_7 has changed
Golden cycle time: 52
Simulated cycles: 31
SUCCESS: base/test_return
SUCCESS: base/test_arithmetic
SUCCESS: base/test_loop
SUCCESS: base/test_nested_loops
SUCCESS: base/test_conditional
SUCCESS: base/test_load
SUCCESS: base/test_array
SUCCESS: base/test_gep
SUCCESS: base/test_global
SUCCESS: base/test_matrix
SUCCESS: base/test_cpu
SUCCESS: benchmarks/DSS/getTanh
SUCCESS: benchmarks/DSS/smm
SUCCESS: benchmarks/Inter-block/matrixtrans
SUCCESS: benchmarks/PNAnalyser/vecTrans
SUCCESS: benchmarks/PNAnalyser/vecTrans2
SUCCESS: polybench/jacobi_1d
SUCCESS: dynamatic/fir
SUCCESS: dynamatic/gaussian
SUCCESS: dynamatic/if_loop_1
SUCCESS: dynamatic/if_loop_2
SUCCESS: dynamatic/if_loop_3
SUCCESS: dynamatic/iir
SUCCESS: dynamatic/image_resize
SUCCESS: dynamatic/insertion_sort
SUCCESS: dynamatic/kernel_2mm
SUCCESS: dynamatic/kernel_3mm
SUCCESS: dynamatic/loop_array
SUCCESS: dynamatic/matrix
SUCCESS: dynamatic/memory_loop
SUCCESS: dynamatic/mul_example
SUCCESS: dynamatic/pivot
SUCCESS: dynamatic/simple_example_2
SUCCESS: dynamatic/simple_example_1
SUCCESS: dynamatic/sumi3_mem
SUCCESS: dynamatic/stencil_2d
SUCCESS: dynamatic/test_memory_1
SUCCESS: dynamatic/test_memory_2
SUCCESS: dynamatic/test_memory_3
SUCCESS: dynamatic/test_memory_4
SUCCESS: dynamatic/test_memory_6
SUCCESS: dynamatic/test_memory_7
SUCCESS: dynamatic/test_memory_8
SUCCESS: dynamatic/test_memory_9
SUCCESS: dynamatic/test_memory_10
SUCCESS: dynamatic/test_memory_11
SUCCESS: dynamatic/test_memory_12
SUCCESS: dynamatic/test_memory_13
SUCCESS: dynamatic/test_memory_14
SUCCESS: dynamatic/test_memory_15
SUCCESS: dynamatic/test_memory_16
SUCCESS: dynamatic/test_memory_17
SUCCESS: dynamatic/test_memory_18
SUCCESS: dynamatic/threshold
SUCCESS: dynamatic/vector_rescale
SUCCESS: dynamatic/video_filter
SUCCESS: dynamatic/bicg
SUCCESS: dynamatic/gemver
SUCCESS: dynamatic/matrix_power
SUCCESS: dynamatic/matvec
SUCCESS: dynamatic/triangular
SUCCESS: dynamatic/binary_search
SUCCESS: dynamatic/gcd
Running: dynamatic/sobel
SUCCESS: dynamatic/sobel
Running: dynamatic/polyn_mult
SUCCESS: dynamatic/polyn_mult
Running: benchmarks/Inter-block/substring
SUCCESS: benchmarks/Inter-block/substring
Running: polybench/correlation
The execution time of polybench/correlation has changed
Golden cycle time: 185779
Simulated cycles: 178462
SUCCESS: polybench/correlation
rm /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/test_memory_12.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/loop_array.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/stencil_2d.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/if_loop_2.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/simple_example_2.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/gaussian.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/gcd.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/gemver.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/test_memory_15.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/test_memory_13.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/polyn_mult.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/mul_example.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/if_loop_1.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/iir.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/insertion_sort.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/pivot.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/kernel_3mm.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/matrix_power.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/test_memory_16.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/test_memory_17.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/test_memory_11.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/matvec.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/binary_search.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/kernel_2mm.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/vector_rescale.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/test_memory_2.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/sumi3_mem.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/triangular.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/test_memory_7.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/matrix.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/memory_loop.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/test_memory_9.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/test_memory_3.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/bicg.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/test_memory_6.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/if_loop_3.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/test_memory_1.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/simple_example_1.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/video_filter.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/test_memory_10.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/test_memory_8.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/image_resize.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/fir.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/threshold.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/test_memory_18.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/test_memory_14.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/test_memory_4.gcc.log /home/helge/tmp/jlm/usr/hls-test-suite//build/dynamatic/sobel.gcc.log

While I do not understand why my change causes semantic change to the compilation pipeline, I am a bit wondering whether this causes a "semantically meaningful" change to the compilate, or whether the change is fine and the test is a bit too change-sensitive. How would I distinguish between these? @sjalander @phate

@caleridas
Copy link
Collaborator Author

I tried pushing a change to hls-test-suite:

--- a/src/base/test_matrix.cycles
+++ b/src/base/test_matrix.cycles
@@ -1 +1 @@
-571
\ No newline at end of file
+588
\ No newline at end of file
diff --git a/src/dynamatic/test_memory_7.cycles b/src/dynamatic/test_memory_7.cycles
index 6139554..b74e882 100644
--- a/src/dynamatic/test_memory_7.cycles
+++ b/src/dynamatic/test_memory_7.cycles
@@ -1 +1 @@
-52
\ No newline at end of file
+31
\ No newline at end of file
diff --git a/src/polybench/correlation.cycles b/src/polybench/correlation.cycles
index 9ad41b4..b45d0b7 100644
--- a/src/polybench/correlation.cycles
+++ b/src/polybench/correlation.cycles
@@ -1 +1 @@
-185779
\ No newline at end of file
+178462
\ No newline at end of file

but I lack permission to create a branch on that repository.

@phate
Copy link
Owner

phate commented Jan 3, 2025

@caleridas I invited you the llvm-test-suite repository and the hls-test-suite repository as collaborator. I leave the issues with the HLS test suite to @sjalander to answer, he knows way more about it than I do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants