Fix bounds inference for uint -> int casts #7814

abadams · 2023-08-28T22:05:00Z

Note that this may cause previously-functioning pipelines to throw a compile-time error, in cases where they cast a uint32 to an int32 for use in an index expression and were relying on Halide to assume the result is positive. Now that the uint32 -> int32 cast is defined to wrap, the result may not be positive. An example might be the expression:

lut(min(cast<int>(some_u32_param), 256))

Previously Halide would have treated this as bounded between 0 and 256. Now it's no longer bounded below, because the uint32 could have been large enough to wrap.
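To make the wrap concrete, here is a minimal sketch in plain C++ (not Halide code) that emulates a wrapping uint32 -> int32 conversion and applies the same `min(..., 256)` as the expression above. The names `wrap_u32_to_i32` and `lut_index` are hypothetical and exist only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Plain C++ emulation (not Halide code) of a wrapping uint32 -> int32 cast.
// Values above INT32_MAX land in the negative half of the int32 range.
inline int32_t wrap_u32_to_i32(uint32_t u) {
    if (u <= 0x7fffffffu) {
        return static_cast<int32_t>(u);
    }
    return static_cast<int32_t>(u - 0x80000000u) + std::numeric_limits<int32_t>::min();
}

// Hypothetical stand-in for the index expression min(cast<int>(u), 256).
inline int32_t lut_index(uint32_t u) {
    return std::min(wrap_u32_to_i32(u), int32_t{256});
}

// Small inputs behave as before; large inputs produce a negative index.
inline void lut_index_examples() {
    assert(lut_index(100u) == 100);
    assert(lut_index(1000u) == 256);
    assert(lut_index(0xffffffffu) == -1);
}
```

The `min` only caps the index from above, so nothing stops a wrapped value from going below zero.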

If this causes too much carnage in production pipelines we may have to revert the decision to define wrapping behavior for uint32 -> int32 casts.

Fixes #7807 Fixes #7811

steven-johnson · 2023-08-29T00:15:56Z

Testing this in Google, I get many many failures of the form Error: Buffer foo may be accessed in an unbounded way in dimension 0

abadams · 2023-08-29T00:24:16Z

The question now is whether to roll back our decision to define uint -> int casts as having wrapping semantics, or to patch those pipelines. I'd like to slowly reduce the number of types of UB we have over time, so if patching seems tractable (e.g. by inserting a max with zero around the index), that would be best.

steven-johnson · 2023-08-29T00:26:38Z

I'll have to figure out how many pipelines it is. (there are lots of failures, but maybe it's just a handful of shared pipelines.) What is the likely failure pattern I'm looking for?

abadams · 2023-08-29T00:37:57Z

Access to an input named "foo" using a computed uint32 index. This is a bit of a guess but I think the needed patch is something like:

Before:

foo(cast<int>(some_u32_expr), ...)

After:

foo(saturating_cast<int>(some_u32_expr), ...)
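For comparison with the wrapping cast, a saturating conversion clamps instead of wrapping. The following is a plain C++ emulation of that idea, not Halide's implementation; `saturate_u32_to_i32` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Plain C++ emulation (not Halide code) of a saturating uint32 -> int32 cast:
// anything above INT32_MAX clamps to INT32_MAX instead of wrapping negative.
inline int32_t saturate_u32_to_i32(uint32_t u) {
    constexpr int32_t kMax = std::numeric_limits<int32_t>::max();
    return u > static_cast<uint32_t>(kMax) ? kMax : static_cast<int32_t>(u);
}

// The result is never negative, so a lower bound of zero survives the cast.
inline void saturate_examples() {
    assert(saturate_u32_to_i32(0u) == 0);
    assert(saturate_u32_to_i32(0xffffffffu) == std::numeric_limits<int32_t>::max());
}
```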

steven-johnson · 2023-08-29T16:49:05Z

> Access to an input named "foo" using a computed uint32 index. This is a bit of a guess but I think the needed patch is something like:

The first case I'm looking at doesn't have anything obviously of that form, but there are a lot of helper functions. I wonder if there is a bottleneck for trapping most/all uint->int cases that would help?

steven-johnson · 2023-08-29T16:54:05Z

Running the failure case with HL_DEBUG_CODEGEN=4 reveals something possibly interesting: just before the failure, I see

simplify let t749 = (int16)1023 in...
new let t749.s = (undefined) in ... (int16)1023 ...

undefined?

abadams · 2023-08-29T16:56:47Z

That undefined is fine. It means the simplifier has determined the new let is not necessary to insert.

steven-johnson · 2023-08-29T17:00:21Z

The immediate failure looks like this:

Input to CSE ((uint16)2048 == (uint16)2048)
After removing lets: ((uint16)2048 == (uint16)2048)
Include: ((uint16)2048 == (uint16)2048); should extract: 1
Include: (uint16)2048; should extract: 0
Include: (uint16)2048; should extract: 0
Canonical form without lets ((uint16)2048 == (uint16)2048)
0: (uint16)2048, 0
1: ((uint16)2048 == (uint16)2048), 1
With variables ((uint16)2048 == (uint16)2048)
With lets: ((uint16)2048 == (uint16)2048)
In image input region touched is:
  0: max(min(((0*16) + input.min.0) + 0, (input.extent.0 + input.min.0) + -1), input.min.0) .. max(min(((((input.extent.0 + -1)/16)*16) + input.min.0) + 15, (input.extent.0 + input.min.0) + -1), input.min.0)
  1: max(min(((0*4) + input.min.1) + 0, (input.extent.1 + input.min.1) + -1), input.min.1) .. max(min(((((input.extent.1 + -1)/4)*4) + input.min.1) + 3, (input.extent.1 + input.min.1) + -1), input.min.1)
Injecting constraints for input.0
Injecting constraints for input.1
In image original_input region touched is:
  0: (void *)neg_inf .. (void *)pos_inf
  1: (void *)neg_inf .. (void *)pos_inf
User error triggered at third_party/halide/halide/src/AddImageChecks.cpp:383

abadams · 2023-08-29T17:12:48Z

I pushed a temporary change that should give a more useful error

steven-johnson · 2023-08-29T17:24:47Z

The failure is coming from INSIDE HALIDE!

Look at mirror_interior() and tell me where the uint32->int32 cast is (implicitly) creeping in:

            Expr limit = extent - 1;
            Expr coord = arg_var - min;   // Enforce zero origin.
            coord = coord % (2 * limit);  // Range is 0 to 2w-1
            coord = coord - limit;        // Range is -w, w
            coord = abs(coord);           // Range is 0, w
            coord = limit - coord;        // Range is 0, w
            coord = coord + min;          // Restore correct min

            // The boundary condition probably doesn't apply
            coord = select(arg_var < min || arg_var >= min + extent, coord,
                           clamp(likely(arg_var), min, min + extent - 1));

(Spoiler alert: abs() in Halide always returns an unsigned.)
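A scalar walk-through of the same coordinate math may help locate the conversion. This is a sketch in plain C++, not the Halide source: it assumes `extent > 1`, assumes the remainder is non-negative for a positive divisor (as the "Range is 0 to 2w-1" comment implies), and makes the unsigned result of `abs` and its conversion back to a signed value explicit. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Remainder that is non-negative for a positive divisor.
inline int32_t floor_mod(int32_t a, int32_t b) {
    int32_t r = a % b;
    return r < 0 ? r + b : r;
}

// Absolute value with an unsigned result type, mimicking the abs() behavior
// described in the comment above.
inline uint32_t unsigned_abs(int32_t v) {
    uint32_t u = static_cast<uint32_t>(v);
    return v < 0 ? 0u - u : u;
}

// Scalar version of the mirror_interior coordinate math; assumes extent > 1.
inline int32_t mirror_interior_coord(int32_t x, int32_t min, int32_t extent) {
    int32_t limit = extent - 1;
    int32_t coord = x - min;                    // zero origin
    coord = floor_mod(coord, 2 * limit);        // 0 .. 2*limit - 1
    coord = coord - limit;                      // -limit .. limit - 1
    uint32_t mag = unsigned_abs(coord);         // unsigned from here on
    coord = limit - static_cast<int32_t>(mag);  // unsigned -> signed conversion
    return coord + min;                         // restore the original min
}

inline void mirror_interior_examples() {
    assert(mirror_interior_coord(-1, 0, 4) == 1);  // reflects about index 0
    assert(mirror_interior_coord(4, 0, 4) == 2);   // reflects about index 3
}
```

In this sketch the value handed to the conversion is at most `limit`, so the result is always in range; the difficulty discussed in the thread is getting bounds inference to see that.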

(In hindsight, I really wish that Halide had adopted the Golang property of never doing implicit type conversions...)

steven-johnson · 2023-08-29T17:27:53Z

Unfortunately, if I change the bad line to coord = saturating_cast<int>(abs(coord));, we still fail with the same result (input is touched in an infinite way), but none of the "dodgy" check warnings you inserted trigger this time (they did previously, which is how I found the mirror issue). I'm gonna guess there is more abs() nonsense going on.

abadams · 2023-08-29T17:42:06Z

Can repro! I'll investigate.

abadams · 2023-08-29T17:48:40Z

Possible fix pushed

steven-johnson · 2023-08-29T17:57:38Z

OK, that fixes the unbounded access... now I get failures with Error: Signed integer overflow occurred during constant-folding. Signed integer overflow for int32 and int64 is undefined behavior in Halide. Let me investigate that.

steven-johnson · 2023-08-29T18:10:26Z

Yeah, this is another one where it is gonna be tricky to find these errors in existing code without some helpers.

abadams · 2023-08-29T18:21:18Z

I pushed a user_warning for when bounds inference does something different to what it used to do.

steven-johnson · 2023-08-29T18:23:05Z

The first one I found basically amounted to code that used explicit casting to do a 'widening' mul (this code predates the widening_mul() etc. operators):

    // f and h are both funcs with type i16
    Expr r = i32(f(x, y, 0));
    Expr g = i32(f(x, y, 1));
    Expr b = i32(f(x, y, 2));
    Expr result = h(0, c) * r + h(1, c) * g + h(2, c) * b;

Switching it to use widening_mul seems to heal it (not sure if the result is correct), but there are many others to track down. Will update later.

abadams · 2023-08-29T18:46:44Z

I'm a little alarmed that widening_mul heals that, because any 16-bit -> i32 cast should be bounded, because i32 can represent i16 and u16. Also, widening_mul just means the same thing.

steven-johnson · 2023-08-29T21:27:55Z

> Also, widening_mul just means the same thing.

The evidence suggests otherwise? Not sure I can give you an easy repro case from it but maybe using that snippet and seeing how the calculated bounds differ would be useful.

steven-johnson · 2023-08-29T21:28:14Z

> I pushed a user_warning for when bounds inference does something different to what it used to do.

I'll apply that and retry.

steven-johnson · 2023-08-29T22:04:39Z

> I'm a little alarmed that widening_mul heals that, because any 16-bit -> i32 cast should be bounded, because i32 can represent i16 and u16. Also, widening_mul just means the same thing.

Update: maybe that's because the Call visitor in Bounds calculation doesn't have a case for widening_mul, so it just defaults to max-of-type for an unknown Func, which in this case is '+/- inf'.

If I add a specialization that is similar to widening_mul_right, e.g.

        } else if (op->is_intrinsic(Call::widening_mul)) {
            Expr mul = Mul::make(cast(op->type, op->args[0]), cast(op->type, op->args[1]));
            mul.accept(this);
        }

...then using widening_mul also fails with signed-integer-overflow.
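One plausible reading of that overflow, offered as an assumption rather than something confirmed in the thread: each i16 x i16 product fits in int32, but the worst-case bound on a sum of three such products does not. The interval arithmetic below is a plain C++ sketch with hypothetical names, using int64 so the overflow can be detected rather than triggered.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Closed interval held in int64 so int32 overflow can be detected safely.
struct Interval {
    int64_t min;
    int64_t max;
};

// Bounds of a product: the extremes are among the four corner products.
inline Interval mul(Interval a, Interval b) {
    int64_t p[4] = {a.min * b.min, a.min * b.max, a.max * b.min, a.max * b.max};
    return {*std::min_element(p, p + 4), *std::max_element(p, p + 4)};
}

inline Interval add(Interval a, Interval b) {
    return {a.min + b.min, a.max + b.max};
}

inline bool fits_in_int32(Interval i) {
    return i.min >= std::numeric_limits<int32_t>::min() &&
           i.max <= std::numeric_limits<int32_t>::max();
}

// Bounds of one term such as h(0, c) * r when both operands span all of int16.
inline Interval single_term_bounds() {
    Interval i16 = {-32768, 32767};
    return mul(i16, i16);
}

// Bounds of the three-term sum h0*r + h1*g + h2*b under the same assumption.
inline Interval three_term_sum_bounds() {
    Interval term = single_term_bounds();
    return add(add(term, term), term);
}

inline void widening_examples() {
    assert(fits_in_int32(single_term_bounds()));      // max is 2^30
    assert(!fits_in_int32(three_term_sum_bounds()));  // max is 3 * 2^30
}
```

Under these assumptions a single product tops out at 1073741824, while the sum can reach 3221225472, which is above the int32 maximum of 2147483647.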

steven-johnson · 2023-08-29T22:39:57Z

Also interesting (though maybe not 100% relevant here): we do specialize for shift_left, but only for signed integers, not for unsigned... in the case of this code, we have a u32 << u32 case where LHS is in 0..65535 and RHS is in 0..16, so bounds should be 0..0xFFFF0000... but since we don't handle it, bounds is neg_inf..pos_inf
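The arithmetic behind the expected 0..0xFFFF0000 range can be sketched as follows. This is plain C++ with hypothetical names, not the code in Bounds.cpp, and it gives up whenever the shifted upper bound would no longer fit in uint32.

```cpp
#include <cassert>
#include <cstdint>

// Result of a bounds query; bounded == false means "give up".
struct ShiftBounds {
    bool bounded;
    uint64_t min;
    uint64_t max;
};

// Bounds of (a << b) for uint32 operands with a in [a_min, a_max] and
// b in [b_min, b_max]. Unsigned left shift is non-decreasing in both
// operands while no bits are shifted out, so the extremes sit at the corners.
inline ShiftBounds shift_left_bounds_u32(uint32_t a_min, uint32_t a_max,
                                         uint32_t b_min, uint32_t b_max) {
    if (b_max >= 32) {
        return {false, 0, 0};
    }
    uint64_t hi = static_cast<uint64_t>(a_max) << b_max;
    if (hi > 0xffffffffull) {
        return {false, 0, 0};  // would wrap in uint32
    }
    uint64_t lo = static_cast<uint64_t>(a_min) << b_min;
    return {true, lo, hi};
}

// The case from the comment above: LHS in 0..65535, RHS in 0..16.
inline void shift_examples() {
    ShiftBounds b = shift_left_bounds_u32(0, 65535, 0, 16);
    assert(b.bounded && b.min == 0 && b.max == 0xffff0000ull);
}
```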

This addresses many (but not all) of the `signed integer overflow` issues we're seeing in Google due to #7814 -- a lot of the issues seem to be in code that uses intrinsics that had no handling in value bounds checking, so the bounds were naively large and overflowed.

- Most of the intrinsics from FindIntrinsics.h weren't handled; now they all are (most by lowering to other IR, though the halving_add variants were modeled directly because the bitwise ops don't mesh well)
- strict_float() is just a pass-through
- round() is a best guess (basically, if bounds exist, expand by one as a worst-case)

There are definitely others we should handle here... trunc/floor/ceil probably?


steven-johnson · 2023-11-28T15:21:25Z

This is fairly old, where does it stand?

abadams · 2023-11-28T15:35:10Z

This one is pending figuring out what breaks inside of google

steven-johnson · 2023-11-28T15:41:19Z

> This one is pending figuring out what breaks inside of google

It's been a while, do we remember the gist of it?

abadams · 2023-11-28T15:45:17Z

I think we were knocking cases down one by one, but there are more failures that we have yet to diagnose.

steven-johnson · 2023-11-29T16:12:07Z

> I think we were knocking cases down one by one, but there are more failures that we have yet to diagnose.

ok, I will pull it in and see where we stand

steven-johnson · 2023-11-29T18:16:12Z

So the fact that abs() has always returned a uint means that this change will likely mean that ~every call to it will need attention (e.g., something like abs(y) -> max(abs(y), 0) to ensure no negative values).

abadams · 2023-11-29T18:19:22Z

I see, so people are doing things like my_func(abs(expr)), and that's implicitly my_func(i32(abs(expr))), and the expr could be the most-negative integer if we know nothing about it, in which case that cast now returns the most-negative integer instead of being UB.
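The round trip described here can be checked on concrete values. The sketch below is plain C++ emulating an unsigned-result `abs` followed by a wrapping conversion back to int32; it is not Halide code and the names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Absolute value with an unsigned result; the magnitude of INT32_MIN
// (2147483648) is representable in uint32.
inline uint32_t unsigned_abs(int32_t v) {
    uint32_t u = static_cast<uint32_t>(v);
    return v < 0 ? 0u - u : u;
}

// Wrapping uint32 -> int32 conversion.
inline int32_t wrap_u32_to_i32(uint32_t u) {
    if (u <= 0x7fffffffu) {
        return static_cast<int32_t>(u);
    }
    return static_cast<int32_t>(u - 0x80000000u) + std::numeric_limits<int32_t>::min();
}

// Emulates i32(abs(expr)) under wrapping semantics.
inline int32_t abs_then_wrap(int32_t v) {
    return wrap_u32_to_i32(unsigned_abs(v));
}

// Every input maps to a non-negative value except the most-negative integer,
// which maps back to itself.
inline void abs_examples() {
    assert(abs_then_wrap(-5) == 5);
    assert(abs_then_wrap(std::numeric_limits<int32_t>::min()) ==
           std::numeric_limits<int32_t>::min());
}
```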

abadams · 2023-11-29T18:20:32Z

Wait, that can't be the example, because if we know nothing about expr in that case the access is unbounded anyway. I guess there are cases where we know an upper bound on expr but not a lower bound?

steven-johnson · 2023-11-29T18:21:51Z

> I see, so people are doing things like my_func(abs(expr)), and that's implicitly my_func(i32(abs(expr))), and the expr could be the most-negative integer if we know nothing about it, in which case that cast now returns the most-negative integer instead of being UB.

yeah, this is pervasive too, it makes me question whether this is the right design for abs() -- maybe we should change it to return the same type for signed. alternately, add iabs() which does something like clamp(abs(x), 0, max-for-type) for int...

steven-johnson · 2023-11-29T18:22:58Z

> Wait, that can't be the example, because if we know nothing about expr in that case the access is unbounded anyway. I guess there are cases where we know an upper bound on expr but not a lower bound?

in this case the full expr is Expr mirror_y = clamp(min(abs(y), 2 * max_y + 1 - y), 0, max_y);

abadams · 2023-11-29T18:23:15Z

If this really becomes the show-stopper, we could declare that abs of the most negative 32/64-bit integer is UB (which it already is in main if you cast the result back to int due to the signed integer overflow), and then handle int casts of abs specially in bounds inference.

steven-johnson · 2023-11-29T18:27:42Z

I don't know if it's a showstopper, but in the first failure I looked at, literally every failure warning was about abs()

steven-johnson · 2023-11-29T18:37:58Z

I tried writing a shim like so:

Halide::Expr iabs(Halide::Expr i) {
  if (i.type().is_int()) {
    return cast<int32_t>(min(abs(i), cast<uint32_t>(i.type().max())));
  } else {
    return abs(i);
  }
}

but it doesn't help, e.g.

Formerly bounded cast is no longer bounded:
 Cast: int32(min((uint32)abs((blur_x.s0.x.guarded/2) + -1), (uint32)2147483647))
 Bounds of arg: (uint32)0 min(max(uint32(((4 - min(input.extent.0, 1))/2)), uint32(((min(max(input.extent.0, 1), min(max(input.extent.0, 1) - min(input.extent.0, 1), 7) + min(max(input.extent.0, 1), min(input.extent.0, 1) + (((max(input.extent.0, 1) - min(input.extent.0, 1))/8)*8))) + -3)/2))), (uint32)2147483647)
 Bounds of output: (void *)neg_inf (void *)pos_inf

not sure if this is just a false positive in the sniffing code or not? but I still fail with a buffer being accessed in an unbounded way.

steven-johnson · 2023-11-29T18:56:38Z

ContainsDodgyIntCast may not be smart enough, it doesn't check to see if the value in question has been constrained to a legal range. Maybe something like changing can_represent() -> can_prove(op->value >= op->type.min() && op->value <= op->type.max())?
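As a rough illustration of the distinction being drawn, the sketch below emulates the `iabs` shim on concrete int32 values and pairs it with a range-based check in the spirit of the suggested `can_prove` condition. It is plain C++ with hypothetical names, and it says nothing about what Halide's bounds inference or `ContainsDodgyIntCast` actually conclude.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

inline uint32_t unsigned_abs(int32_t v) {
    uint32_t u = static_cast<uint32_t>(v);
    return v < 0 ? 0u - u : u;
}

// Emulates the shim for int32 input: clamp the unsigned magnitude to
// INT32_MAX before converting, so the conversion can never wrap.
inline int32_t iabs_emulated(int32_t v) {
    constexpr uint32_t kMax = static_cast<uint32_t>(std::numeric_limits<int32_t>::max());
    return static_cast<int32_t>(std::min(unsigned_abs(v), kMax));
}

// Range-based check: a uint32 -> int32 cast is safe if the argument's known
// upper bound already fits in int32, whatever the argument's type can hold.
inline bool cast_provably_in_range(uint32_t arg_upper_bound) {
    return arg_upper_bound <= static_cast<uint32_t>(std::numeric_limits<int32_t>::max());
}

inline void iabs_examples() {
    assert(iabs_emulated(-7) == 7);
    assert(iabs_emulated(std::numeric_limits<int32_t>::min()) ==
           std::numeric_limits<int32_t>::max());
    assert(cast_provably_in_range(2147483647u));  // the clamp in the log above
}
```

A type-only test would flag any uint32 argument, whereas a value-range test like `cast_provably_in_range` accepts an argument already clamped to 2147483647.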


steven-johnson · 2024-07-29T17:40:53Z

Looking over old outstanding PRs, this one is nearly a year old now -- what's the status on it, do we want to (ever) land it?

abadams · 2024-08-02T16:47:20Z

This is technically a bug fix, but it seemed like it caused too many problems in production due to the abs issue. Not sure what to do about it.

Fix bounds inference for uint -> int casts

05ba843

Fixes #7807 Fixes #7811

abadams mentioned this pull request Aug 28, 2023

Don't track bounds through wrapping casts in simplifier #7813

Closed

abadams added the release_notes For changes that may warrant a note in README for official releases. label Aug 29, 2023

Add diagnostics to help with merge

8b90a5e

Fix bounds inference for mirror_interior

2bd666e

clang-format

1b573c1

Add warning when cast behavior has changed

b29988b

Fix last commit

520256f

steven-johnson mentioned this pull request Aug 29, 2023

Various intrinsics are missing specializations in Bounds.cpp #7822

Open

steven-johnson mentioned this pull request Aug 29, 2023

Handle many more intrinsics in Bounds.cpp #7823

Merged

abadams added 3 commits August 29, 2023 18:39

Fix saturating cast handling

ef9a7d8

and add one-sided handling for monotonic functions

Handle certain instances of signed integer overflow

16ddff5

Merge remote-tracking branch 'origin/main' into abadams/fix_7810

4ad3ba8

Merge branch 'main' into abadams/fix_7810

69a4312

Merge branch 'main' into abadams/fix_7810

fe1eb40

trigger buildbots

fb06e94
