Thanks a lot for the detailed description! It does look like a bug / missing optimization in |
-
Hi all,
I have a case like this:
I use compute_with to merge the for loops of sum_1 and sum_2. I get the following IR (the IR after Unrolling):
Pay attention to produce input_c:
I expect the allocation size of input_c to be 8 * 8, and its for-loop extents to be 8 and 8. Instead, the allocation size of input_c is 32 * 64; that is, input_c computes all of its data on every iteration of the outer for loop. So, is this a bug in the Halide backend, or is there something wrong with my schedule?
--
I found a way to solve the problem: clone_in the input_c.
The schedule is:
The IR is:
I get the expected allocation size and for-loop extents. But input_c is now computed twice. Is there any way to eliminate this redundant calculation?