Releases/gcc 12 #65

jacopobrusini · 2022-06-04T17:20:00Z

Support for Apple Silicon!!!

jwakely · 2024-02-21T00:10:33Z

This is an unofficial mirror that has nothing to do with the GCC project, so submitting pull requests here is a waste of time.

Also, I have no idea what this pull request is trying to do but it would never be accepted even if it was submitted to the right place.

…on-r15-7214-g0710024b5bd861 Contracts nonattr rebase on r15 7214 g0710024b5bd861

During combine we may end up with (set (reg:DI 66 [ _6 ]) (ashift:DI (reg:DI 72 [ x ]) (subreg:QI (and:TI (reg:TI 67 [ _1 ]) (const_wide_int 0x0aaaaaaaaaaaaaabf)) 15))) where the shift count operand does not trivially fit the scheme of address operands. Reject those operands, especially since strip_address_mutations() expects expressions of the form (and ... (const_int ...)) and fails for (and ... (const_wide_int ...)). Thus, be more strict here and accept only CONST_INT operands. Done by replacing immediate_operand() with const_int_operand() which is enough since the former only additionally checks for LEGITIMATE_PIC_OPERAND_P and targetm.legitimate_constant_p which are always true for CONST_INT operands. While on it, fix indentation of the if block. gcc/ChangeLog: PR target/118835 * config/s390/s390.cc (s390_valid_shift_count): Reject shift count operands which do not trivially fit the scheme of address operands. gcc/testsuite/ChangeLog: * gcc.target/s390/pr118835.c: New test. (cherry picked from commit ac9806d)

Floating-point emulation in the D front-end is done via a type named `struct longdouble`, which in GDC is a small interface around the real_value type. Because the D code cannot include gcc/real.h directly, a big enough buffer is used for the data instead. On x86_64, this buffer is actually bigger than real_value itself, so when a new longdouble object is created with longdouble r; real_from_string3 (&r.rv (), buffer, mode); return r; there is uninitialized padding at the end of `r`. This was never a problem when D was implemented in C++ (until GCC 12) as comparing two longdouble objects with `==' would be forwarded to the relevant operator== overload that extracted the underlying real_value. However when the front-end was translated to D, such conditions were instead rewritten into identity comparisons return exp.toReal() is CTFloat.zero The `is` operator gets lowered as a call to `memcmp() == 0', which is where the read of uninitialized memory occurs, as seen by valgrind. ==26778== Conditional jump or move depends on uninitialised value(s) ==26778== at 0x911F41: dmd.dstruct._isZeroInit(dmd.expression.Expression) (dstruct.d:635) ==26778== by 0x9123BE: StructDeclaration::finalizeSize() (dstruct.d:373) ==26778== by 0x86747C: dmd.aggregate.AggregateDeclaration.determineSize(ref const(dmd.location.Loc)) (aggregate.d:226) [...] To avoid accidentally reading uninitialized data, explicitly initialize all `longdouble` variables with an empty constructor on C++ side of the implementation before initializing underlying real_value type it holds. PR d/116961 gcc/d/ChangeLog: * d-codegen.cc (build_float_cst): Change new_value type from real_t to real_value. * d-ctfloat.cc (CTFloat::fabs): Default initialize the return value. (CTFloat::ldexp): Likewise. (CTFloat::parse): Likewise. * d-longdouble.cc (longdouble::add): Likewise. (longdouble::sub): Likewise. (longdouble::mul): Likewise. (longdouble::div): Likewise. (longdouble::mod): Likewise. (longdouble::neg): Likewise. * d-port.cc (Port::isFloat32LiteralOutOfRange): Likewise. (Port::isFloat64LiteralOutOfRange): Likewise. gcc/testsuite/ChangeLog: * gdc.dg/pr116961.d: New test. (cherry picked from commit f7bc17e)

…ed in i3 [PR118739] The combine pass is trying to combine: Trying 16, 22, 21 -> 23: 16: r104:QI=flags:CCNO>0 22: {r120:QI=r104:QI^0x1;clobber flags:CC;} REG_UNUSED flags:CC 21: r119:QI=flags:CCNO<=0 REG_DEAD flags:CCNO 23: {r110:QI=r119:QI|r120:QI;clobber flags:CC;} REG_DEAD r120:QI REG_DEAD r119:QI REG_UNUSED flags:CC and creates the following two insn sequence: modifying insn i2 22: r104:QI=flags:CCNO>0 REG_DEAD flags:CC deferring rescan insn with uid = 22. modifying insn i3 23: r110:QI=flags:CCNO<=0 REG_DEAD flags:CC deferring rescan insn with uid = 23. where the REG_DEAD note in i2 is not correct, because the flags register is still referenced in i3. In try_combine() megafunction, we have this part: --cut here-- /* Distribute all the LOG_LINKS and REG_NOTES from I1, I2, and I3. */ if (i3notes) distribute_notes (i3notes, i3, i3, newi2pat ? i2 : NULL, elim_i2, elim_i1, elim_i0); if (i2notes) distribute_notes (i2notes, i2, i3, newi2pat ? i2 : NULL, elim_i2, elim_i1, elim_i0); if (i1notes) distribute_notes (i1notes, i1, i3, newi2pat ? i2 : NULL, elim_i2, local_elim_i1, local_elim_i0); if (i0notes) distribute_notes (i0notes, i0, i3, newi2pat ? i2 : NULL, elim_i2, elim_i1, local_elim_i0); if (midnotes) distribute_notes (midnotes, NULL, i3, newi2pat ? i2 : NULL, elim_i2, elim_i1, elim_i0); --cut here-- where the compiler distributes REG_UNUSED note from i2: 22: {r120:QI=r104:QI^0x1;clobber flags:CC;} REG_UNUSED flags:CC via distribute_notes() using the following: --cut here-- /* Otherwise, if this register is used by I3, then this register now dies here, so we must put a REG_DEAD note here unless there is one already. */ else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3)) && ! (REG_P (XEXP (note, 0)) ? find_regno_note (i3, REG_DEAD, REGNO (XEXP (note, 0))) : find_reg_note (i3, REG_DEAD, XEXP (note, 0)))) { PUT_REG_NOTE_KIND (note, REG_DEAD); place = i3; } --cut here-- Flags register is used in I3, but there already is a REG_DEAD note in I3. The above condition doesn't trigger and continues in the "else" part where REG_DEAD note is put to I2. The proposed solution corrects the above logic to trigger every time the register is referenced in I3, avoiding the "else" part. PR rtl-optimization/118739 gcc/ChangeLog: * combine.cc (distribute_notes) <case REG_UNUSED>: Correct the logic when the register is used by I3. gcc/testsuite/ChangeLog: * gcc.target/i386/pr118739.c: New test. (cherry picked from commit a92dc3f)

Uros' r15-7793 fixed this PR as well, I'm just committing tests from the PR so that it can be closed. 2025-03-04 Jakub Jelinek <[email protected]> PR rtl-optimization/119071 * gcc.dg/pr119071.c: New test. * gcc.c-torture/execute/pr119071.c: New test. (cherry picked from commit ccf9db9)

…485) Commit r9-4307-g89d7557202d25a forgot to accept a fixed PIC register when extending the assert in require_pic_register. arm_pic_register can be set explicitly by the user (e.g. -mpic-register=r9) or implicitly as the default value with -fpic/-fPIC/-fPIE and -mno-pic-data-is-text-relative -mlong-calls, and we want to use/accept it when recording cfun->machine->pic_reg as used to be the case. PR target/115485 gcc/ * config/arm/arm.cc (require_pic_register): Fix typos in comment. Handle fixed arm_pic_register. gcc/testsuite/ * g++.target/arm/pr115485.C: New test. (cherry picked from commit b1d0ac2)

The testcase contains a VNx2QImode pseudo that is live across a call and that cannot be allocated a call-preserved register. LRA quite reasonably tried to save it before the call and restore it afterwards. Unfortunately, the target told it to do that in SImode, even though punning between SImode and VNx2QImode is disallowed by both TARGET_CAN_CHANGE_MODE_CLASS and TARGET_MODES_TIEABLE_P. The natural class to use for SImode is GENERAL_REGS, so this led to an unsalvageable situation in which we had: (set (subreg:VNx2QI (reg:SI A) 0) (reg:VNx2QI B)) where A needed GENERAL_REGS and B needed FP_REGS. We therefore ended up in a reload loop. The hooks above should ensure that this situation can never occur for incoming subregs. It only happened here because the target explicitly forced it. The decision to use SImode for modes smaller than 4 bytes dates back to the beginning of the port, before 16-bit floating-point modes existed. I'm not sure whether promoting to SImode really makes sense for any FPR, but that's a separate performance/QoI discussion. For now, this patch just disallows using SImode when it is wrong for correctness reasons, since that should be safer to backport. gcc/ PR testsuite/116238 * config/aarch64/aarch64.cc (aarch64_hard_regno_caller_save_mode): Only return SImode if we can convert to and from it. gcc/testsuite/ PR testsuite/116238 * gcc.target/aarch64/sve/pr116238.c: New test. (cherry picked from commit ec9d6d4)

The svwhilele folder mishandled the degenerate case in which the second argument is the maximum integer. In that case, the result is all-true regardless of the first parameter: If the second scalar operand is equal to the maximum signed integer value then a condition which includes an equality test can never fail and the result will be an all-true predicate. This is because the conceptual "increment the first operand by 1 after each element" is done modulo the range of the operand. The GCC code was instead treating it as infinite precision. whilele_5.c even had a test for the incorrect behaviour. The easiest fix seemed to be to handle that case specially before doing constant folding. This also copes with variable first operands. gcc/ PR target/116999 PR target/117045 * config/aarch64/aarch64-sve-builtins-base.cc (svwhilelx_impl::fold): Check for WHILELTs of the minimum value and WHILELEs of the maximum value. Fold them to all-false and all-true respectively. gcc/testsuite/ PR target/116999 PR target/117045 * gcc.target/aarch64/sve/acle/general/whilele_5.c: Fix bogus expected result. * gcc.target/aarch64/sve/acle/general/whilele_11.c: New test. * gcc.target/aarch64/sve/acle/general/whilele_12.c: Likewise. (cherry picked from commit 50e7c51)

There was an embarrassing typo in the folding of BIT_NOT_EXPR for POLY_INT_CSTs: it used - rather than ~ on the poly_int. Not sure how that happened, but it might have been due to the way that ~x is implemented as -1 - x internally. gcc/ PR tree-optimization/118976 * fold-const.cc (const_unop): Use ~ rather than - for BIT_NOT_EXPR. * config/aarch64/aarch64.cc (aarch64_test_sve_folding): New function. (aarch64_run_selftests): Run it. (cherry picked from commit 78380fd)

An optimization was added in GDC-12 which sets the TREE_READONLY flag on all local variables with the storage class `const' assigned. For some reason, const is also being added by the front-end to `__result' variables in non-virtual functions, which ends up getting wrong code by the gimplify pass promoting the local to static storage. A bug has been raised upstream, as this looks like an error in the AST. For now, turn off setting TREE_READONLY on all result variables. PR d/119139 gcc/d/ChangeLog: * decl.cc (get_symbol_decl): Don't set TREE_READONLY for __result declarations. gcc/testsuite/ChangeLog: * gdc.dg/pr119139.d: New test. (cherry picked from commit 81582ca)

…decisions [PR119327] The following testcase FAILs because the always_inline function can't be inlined. The rs6000 backend has similarly to other targets a hook which rejects inlining which would bring in new ISAs which aren't there in the caller. And this hook rejects this because of OPTION_MASK_SAVE_TOC_INDIRECT differences. This flag is set if explicitly requested or by default depending on whether the current function looks hot (or at least not cold): if ((rs6000_isa_flags_explicit & OPTION_MASK_SAVE_TOC_INDIRECT) == 0 && flag_shrink_wrap_separate && optimize_function_for_speed_p (cfun)) rs6000_isa_flags |= OPTION_MASK_SAVE_TOC_INDIRECT; The target nodes that are being compared here are actually the default target node (which was created when cfun was NULL) vs. one that was created for the always_inline function when it wasn't NULL, so one doesn't have it, the other does. In any case, this flag feels like a tuning decision rather than hard ISA requirement and I see no problems why we couldn't inline even explicit -msave-toc-indirect function into -mno-save-toc-indirect or vice versa. We already ignore OPTION_MASK_P{8,10}_FUSION which are also more like tuning flags. 2025-04-22 Jakub Jelinek <[email protected]> PR target/119327 * config/rs6000/rs6000.cc (rs6000_can_inline_p): Ignore also OPTION_MASK_SAVE_TOC_INDIRECT differences. * g++.dg/opt/pr119327.C: New test. (cherry picked from commit 4b62cf5)

We need to drop the kind argument from what is passed to the library, but need to do it not only when one uses the argument name for it (so kind=4 etc.) but also when one passes all the arguments to the intrinsics. The following patch uses what gfc_conv_intrinsic_findloc uses, which looks more efficient and cleaner, we already set automatic vars to point to the kind and back actual arguments, so we can just free/clear expr on the former and set name to "%VAL" on the latter. And similarly clears dim argument for the BT_CHARACTER case when using maxloc2/minloc2, again regardless of whether it was named or not. 2025-05-13 Jakub Jelinek <[email protected]> Daniil Kochergin <[email protected]> Tobias Burnus <[email protected]> PR fortran/120191 * trans-intrinsic.cc (strip_kind_from_actual): Remove. (gfc_conv_intrinsic_minmaxloc): Don't call strip_kind_from_actual. Free and clear kind_arg->expr if non-NULL. Set back_arg->name to "%VAL" instead of a loop looking for last argument. Remove actual variable, use array_arg instead. Free and clear dim_arg->expr if non-NULL for BT_CHARACTER cases instead of using a loop. * gfortran.dg/pr120191_1.f90: New test. (cherry picked from commit ec249be)

I've tried to write a testcase for the BT_CHARACTER maxloc/minloc with named or unnamed arguments and indeed the just posted patch fixed the arguments in there in multiple cases to match what the library expects. But the testcase still fails, due to library problems. One dealt with in this patch are _gfortran_s{max,min}loc2_{4,8,16}_s{1,4} functions. Those are trivial wrappers around _gfortrani_{max,min}loc2_{4,8,16}_s{1,4} which should call those functions if the scalar mask is true and just return 0 otherwise. The two bugs I see there is that the back, len arguments are swapped, which means that it always acts as back=.true. and for len will use character length of 1 or 0 instead of the desired one. The _gfortrani_{max,min}loc2_{4,8,16}_s{1,4} functions have prototypes like GFC_INTEGER_4 maxloc2_4_s1 (gfc_array_s1 * const restrict array, GFC_LOGICAL_4 back, gfc_charlen_type len) so back comes before len, ditto for the GFC_INTEGER_4 smaxloc2_4_s1 (gfc_array_s1 * const restrict array, GFC_LOGICAL_4 *mask, GFC_LOGICAL_4 back, gfc_charlen_type len) The other problem is that it was just testing if (mask). In my limited Fortran understanding that means that the optional argument mask was supplied but nothing about its actual value. Other scalar mask generated routines use if (mask == NULL || *mask) as the condition when to call the non-masked function, i.e. when mask is not supplied (then it should act like .true. mask) or when it is supplied and evaluates to .true.). 2025-05-13 Jakub Jelinek <[email protected]> PR fortran/120191 * m4/maxloc2s.m4: For smaxloc2 call maxloc2 if mask is NULL or *mask. Swap back and len arguments. * m4/minloc2s.m4: Likewise. * generated/maxloc2_4_s1.c: Regenerate. * generated/maxloc2_4_s4.c: Regenerate. * generated/maxloc2_8_s1.c: Regenerate. * generated/maxloc2_8_s4.c: Regenerate. * generated/maxloc2_16_s1.c: Regenerate. * generated/maxloc2_16_s4.c: Regenerate. * generated/minloc2_4_s1.c: Regenerate. * generated/minloc2_4_s4.c: Regenerate. * generated/minloc2_8_s1.c: Regenerate. * generated/minloc2_8_s4.c: Regenerate. * generated/minloc2_16_s1.c: Regenerate. * generated/minloc2_16_s4.c: Regenerate. * gfortran.dg/pr120191_2.f90: New test. (cherry picked from commit 482f219)

There is a bug in _gfortran_s{max,min}loc1_{4,8,16}_s{1,4} which the following testcase shows. The functions return but then crash in the caller. Seems that is because buffer overflows, I believe those functions for if (mask == NULL || *mask) condition being false are supposed to fill in the result array with all zeros (or allocate it and fill it with zeros). My understanding is the result array in that case is integer(kind={4,8,16}) and should have the extents the character input array has. The problem is that it uses * string_len in the extent multiplication: extent[n] = GFC_DESCRIPTOR_EXTENT(array,n) * string_len; and extent[n] = GFC_DESCRIPTOR_EXTENT(array,n + 1) * string_len; which is I guess fine and desirable for the extents of the character array, but not for the extents of the destination array. Yet the code uses that extent array for that purpose (and no other purposes). Here it uses it to set the dimensions for the case where it needs to allocate (as well as size): for (n = 0; n < rank; n++) { if (n == 0) str = 1; else str = GFC_DESCRIPTOR_STRIDE(retarray,n-1) * extent[n-1]; GFC_DIMENSION_SET(retarray->dim[n], 0, extent[n] - 1, str); } Here it uses it for bounds checking of the destination: if (unlikely (compile_options.bounds_check)) { for (n=0; n < rank; n++) { index_type ret_extent; ret_extent = GFC_DESCRIPTOR_EXTENT(retarray,n); if (extent[n] != ret_extent) runtime_error ("Incorrect extent in return value of" " MAXLOC intrinsic in dimension %ld:" " is %ld, should be %ld", (long int) n + 1, (long int) ret_extent, (long int) extent[n]); } } and here to find out how many retarray elements to actually fill in each dimension: while(1) { *dest = 0; count[0]++; dest += dstride[0]; n = 0; while (count[n] == extent[n]) { /* When we get to the end of a dimension, reset it and increment the next dimension. */ count[n] = 0; /* We could precalculate these products, but this is a less frequently used path so probably not worth it. */ dest -= dstride[n] * extent[n]; Seems maxloc1s.m4 and minloc1s.m4 are the only users of ifunction-s.m4, so we can change SCALAR_ARRAY_FUNCTION in there without breaking anything else. 2025-05-13 Jakub Jelinek <[email protected]> PR fortran/120191 * m4/ifunction-s.m4 (SCALAR_ARRAY_FUNCTION): Don't multiply GFC_DESCRIPTOR_EXTENT(array,) by string_len. * generated/maxloc1_4_s1.c: Regenerate. * generated/maxloc1_4_s4.c: Regenerate. * generated/maxloc1_8_s1.c: Regenerate. * generated/maxloc1_8_s4.c: Regenerate. * generated/maxloc1_16_s1.c: Regenerate. * generated/maxloc1_16_s4.c: Regenerate. * generated/minloc1_4_s1.c: Regenerate. * generated/minloc1_4_s4.c: Regenerate. * generated/minloc1_8_s1.c: Regenerate. * generated/minloc1_8_s4.c: Regenerate. * generated/minloc1_16_s1.c: Regenerate. * generated/minloc1_16_s4.c: Regenerate. * gfortran.dg/pr120191_3.f90: New test. (cherry picked from commit 781cfc4)

As mentioned in the PR, _gfortran_{,m,s}findloc2_s{1,4} iterate too many times in the back case if nothing is found. For !back, the loops are for (i = 1; i <= extent; i++) so i is in the body [1, extent] if nothing is found, but for back it is for (i = extent; i >= 0; i--) so i is in the body [0, extent] and compares one element before the start of the array. Note, findloc1_s{1,4} uses for (n = len; n > 0; n--, src -= delta * len_array) for the back loop and for (n = 1; n <= len; n++, src += delta * len_array) for !back. This patch fixes that. The testcase fails under valgrind without the libgfortran changes and succeeds with those. 2025-05-13 Jakub Jelinek <[email protected]> PR libfortran/120196 * m4/ifindloc2.m4 (header1, header2): For back use i > 0 rather than i >= 0 as for condition. * generated/findloc2_s1.c: Regenerate. * generated/findloc2_s4.c: Regenerate. * gfortran.dg/pr120196.f90: New test. (cherry picked from commit 748a7bc)

The UB on the following testcase isn't diagnosed by -fsanitize=address, because we see that the array has a single element and optimize the strlen to 0. I think it is fine to assume e.g. for range purposes the lower bound for the strlen as long as we don't try to optimize strlen (str) where we know that it returns [26, 42] to 26 + strlen (str + 26), but for the upper bound we really want to punt on optimizing that for -fsanitize=address to read all the bytes of the string and diagnose if we run to object end etc. 2024-02-06 Jakub Jelinek <[email protected]> PR sanitizer/110676 * gimple-fold.cc (gimple_fold_builtin_strlen): For -fsanitize=address reset maxlen to sizetype maximum. * gcc.dg/asan/pr110676.c: New test. (cherry picked from commit d3eac7d)

mark_vtable_entries already has /* It's OK for the vtable to refer to deprecated virtual functions. */ warning_sentinel w(warn_deprecated_decl); but that doesn't cover __attribute__((unavailable)). We can use the following override to cover both. PR c++/116606 gcc/cp/ChangeLog: * decl2.cc (mark_vtable_entries): Temporarily override deprecated_state to UNAVAILABLE_DEPRECATED_SUPPRESS. Remove a warning_sentinel. gcc/testsuite/ChangeLog: * g++.dg/ext/attr-unavailable-13.C: New test. (cherry picked from commit d9d34f9)

…for methods with such attributes [PR116636] On the following testcase, we emit false positive warnings/errors about using the deprecated or unavailable methods when creating thunks for them, even when nothing (in the testcase so far) actually used those. The following patch temporarily disables that diagnostics when creating the thunks. 2024-09-12 Jakub Jelinek <[email protected]> PR c++/116636 * method.cc: Include decl.h. (use_thunk): Temporarily change deprecated_state to UNAVAILABLE_DEPRECATED_SUPPRESS. * g++.dg/warn/deprecated-19.C: New test. (cherry picked from commit 4026d89)

The following testcases are miscompiled on s390x-linux, because the doloop_optimize /* Ensure that the new sequence doesn't clobber a register that is live at the end of the block. */ { bitmap modified = BITMAP_ALLOC (NULL); for (rtx_insn *i = doloop_seq; i != NULL; i = NEXT_INSN (i)) note_stores (i, record_reg_sets, modified); basic_block loop_end = desc->out_edge->src; bool fail = bitmap_intersect_p (df_get_live_out (loop_end), modified); check doesn't work as intended. The problem is that it uses df, but the df analysis was only done using iv_analysis_loop_init (loop); -> df_analyze_loop (loop); which computes df inside on the bbs of the loop. While loop_end bb is inside of the loop, df_get_live_out computed that way includes registers set in the loop and used at the start of the next iteration, but doesn't include registers set in the loop (or before the loop) and used after the loop. The following patch fixes that by doing whole function df_analyze first, changes the loop iteration mode from 0 to LI_ONLY_INNERMOST (on many targets which use can_use_doloop_if_innermost target hook a so are known to only handle innermost loops) or LI_FROM_INNERMOST (I think only bfin actually allows non-innermost loops) and checking not just df_get_live_out (loop_end) (that is needed for something used by the next iteration), but also df_get_live_in (desc->out_edge->dest), i.e. what will be used after the loop. df of such a bb shouldn't be affected by the df_analyze_loop and so should be from df_analyze of the whole function. 2024-12-05 Jakub Jelinek <[email protected]> PR rtl-optimization/113994 PR rtl-optimization/116799 * loop-doloop.cc: Include targhooks.h. (doloop_optimize): Also punt on intersection of modified with df_get_live_in (desc->out_edge->dest). (doloop_optimize_loops): Call df_analyze. Use LI_ONLY_INNERMOST or LI_FROM_INNERMOST instead of 0 as second loops_list argument. * gcc.c-torture/execute/pr116799.c: New test. * g++.dg/torture/pr113994.C: New test. (cherry picked from commit 0eed816)

The following testcase is miscompiled because of RTL represententation of bt{l,q} insn followed by e.g. j{c,nc} being misleading to what it actually does. Let's look e.g. at (define_insn_and_split "*jcc_bt<mode>" [(set (pc) (if_then_else (match_operator 0 "bt_comparison_operator" [(zero_extract:SWI48 (match_operand:SWI48 1 "nonimmediate_operand") (const_int 1) (match_operand:QI 2 "nonmemory_operand")) (const_int 0)]) (label_ref (match_operand 3)) (pc))) (clobber (reg:CC FLAGS_REG))] "(TARGET_USE_BT || optimize_function_for_size_p (cfun)) && (CONST_INT_P (operands[2]) ? (INTVAL (operands[2]) < GET_MODE_BITSIZE (<MODE>mode) && INTVAL (operands[2]) >= (optimize_function_for_size_p (cfun) ? 8 : 32)) : !memory_operand (operands[1], <MODE>mode)) && ix86_pre_reload_split ()" "#" "&& 1" [(set (reg:CCC FLAGS_REG) (compare:CCC (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2)) (const_int 0))) (set (pc) (if_then_else (match_op_dup 0 [(reg:CCC FLAGS_REG) (const_int 0)]) (label_ref (match_dup 3)) (pc)))] { operands[0] = shallow_copy_rtx (operands[0]); PUT_CODE (operands[0], reverse_condition (GET_CODE (operands[0]))); }) The define_insn part in RTL describes exactly what it does, jumps to op3 if bit op2 in op1 is set (for op0 NE) or not set (for op0 EQ). The problem is with what it splits into. put_condition_code %C1 for CCCmode comparisons emits c for EQ and LTU, nc for NE and GEU and ICEs otherwise. CCCmode is used mainly for carry out of add/adc, borrow out of sub/sbb, in those cases e.g. for add we have (set (reg:CCC flags) (compare:CCC (plus:M x y) x)) and use (ltu (reg:CCC flags) (const_int 0)) for carry set and (geu (reg:CCC flags) (const_int 0)) for carry not set. These cases model in RTL what is actually happening, compare in infinite precision x from the result of finite precision addition in M mode and if it is less than unsigned (i.e. overflow happened), carry is set. Another use of CCCmode is in UNSPEC_* patterns, those are used with (eq (reg:CCC flags) (const_int 0)) for carry set and ne for unset, given the UNSPEC no big deal, the middle-end doesn't know what means set or unset. But for the bt{l,q}; j{c,nc} case the above splits it into (set (reg:CCC flags) (compare:CCC (zero_extract) (const_int 0))) for bt and (set (pc) (if_then_else (eq (reg:CCC flags) (const_int 0)) (label_ref) (pc))) for the bit set case (so that the jump expands to jc) and ne for the bit not set case (so that the jump expands to jnc). Similarly for the different splitters for cmov and set{c,nc} etc. The problem is that when the middle-end reads this RTL, it feels the exact opposite to it. If zero_extract is 1, flags is set to comparison of 1 and 0 and that would mean using ne ne in the if_then_else, and vice versa. So, in order to better describe in RTL what is actually happening, one possibility would be to swap the behavior of put_condition_code and use NE + LTU -> c and EQ + GEU -> nc rather than the current EQ + LTU -> c and NE + GEU -> nc; and adjust everything. The following patch uses a more limited approach, instead of representing bt{l,q}; j{c,nc} case as written above it uses (set (reg:CCC flags) (compare:CCC (const_int 0) (zero_extract))) and (set (pc) (if_then_else (ltu (reg:CCC flags) (const_int 0)) (label_ref) (pc))) which uses the existing put_condition_code but describes what the insns actually do in RTL clearly. If zero_extract is 1, then flags are LTU, 0U < 1U, if zero_extract is 0, then flags are GEU, 0U >= 0U. The patch adjusts the *bt<mode> define_insn and all the splitters to it and its comparisons/conditional moves/setXX. 2025-02-10 Jakub Jelinek <[email protected]> PR target/118623 * config/i386/i386.md (*bt<mode>): Represent bt as compare:CCC of const0_rtx and zero_extract rather than zero_extract and const0_rtx. (*jcc_bt<mode>): Likewise. Use LTU and GEU as flags test instead of EQ and NE. (*jcc_bt<mode>_1): Likewise. (*jcc_bt<mode>_mask): Likewise. (Help combine recognize bt followed by cmov splitter): Likewise. (*bt<mode>_setcqi): Likewise. (*bt<mode>_setncqi): Likewise. (*bt<mode>_setnc<mode>): Likewise. * gcc.c-torture/execute/pr118623.c: New test. (cherry picked from commit 9214201)

…lier implicit instantation [PR113976] Already previously instantiated const variable templates had cp_apply_type_quals_to_decl called when they were instantiated, but if they need runtime initialization, their TREE_READONLY flag has been subsequently cleared. Explicit variable template instantiation calls grokdeclarator which calls cp_apply_type_quals_to_decl on them again, setting TREE_READONLY flag again, but nothing clears it afterwards, so we emit such instantiations into rodata sections and segfault when the dynamic initialization attempts to initialize them. The following patch fixes that by not calling cp_apply_type_quals_to_decl on already instantiated variable declarations. 2024-02-28 Jakub Jelinek <[email protected]> Patrick Palka <[email protected]> PR c++/113976 * decl.cc (grokdeclarator): Don't call cp_apply_type_quals_to_decl on DECL_TEMPLATE_INSTANTIATED VAR_DECLs. * g++.dg/cpp1y/var-templ87.C: New test. (cherry picked from commit 29ac924)

The following testcase ICEs since r15-1579 (addition of late combiner), because *clrmem_short can't be split. The problem is that the define_insn uses (use (match_operand 1 "nonmemory_operand" "n,a,a,a")) (use (match_operand 2 "immediate_operand" "X,R,X,X")) (clobber (match_scratch:P 3 "=X,X,X,&a")) and define_split assumed that if operands[1] is const_int_operand, match_scratch will be always scratch, and it will be reg only if it was the last alternative where operands[1] is a reg. The pattern doesn't guarantee it though, of course RA will not try to uselessly assign a reg there if it is not needed, but during RA on the testcase below we match the last alternative, but then comes late combiner and propagates const_int 3 into operands[1]. And that matches fine, match_scratch matches either scratch or reg and the constraint in that case is X for the first variant, so still just fine. But we won't split that because the splitters only expect scratch. The following patch fixes it by using match_scratch instead of scratch, so that it accepts either. 2025-04-17 Jakub Jelinek <[email protected]> PR target/119834 * config/s390/s390.md (define_split after *cpymem_short): Use (clobber (match_scratch N)) instead of (clobber (scratch)). Use (match_dup 4) and operands[4] instead of (match_dup 3) and operands[3] in the last of those. (define_split after *clrmem_short): Use (clobber (match_scratch N)) instead of (clobber (scratch)). (define_split after *cmpmem_short): Likewise. * g++.target/s390/pr119834.C: New test. (cherry picked from commit 22fe83d)

This got broken with r13-9727 and fixed with either of r13-9729 or r13-9728. 2025-05-30 Jakub Jelinek <[email protected]> PR target/120480 * gcc.dg/pr120480.c: New test. (cherry picked from commit c13d5b9)

If expand_binop_directly fails to add a REG_EQUAL note it tries to unwind and restart. But it can unwind too far if expand_binop changed some of the operands before calling it. We don't need to unwind that far anyway since we should end up taking exactly the same route next time, just without a target rtx. To fix this we remove LAST from the argument list and let the callers (all in expand_binop) do their own unwinding if the call fails. Instead we unwind just as far as the entry to expand_binop_directly and recurse within this function instead of all the way back up. gcc/ChangeLog: PR middle-end/117811 * optabs.cc (expand_binop_directly): Remove LAST as an argument, instead record the last insn on entry. Only delete insns if we need to restart and restart by calling ourself, not expand_binop. (expand_binop): Update callers to expand_binop_directly. If it fails to expand the operation, delete back to LAST. gcc/testsuite: PR middle-end/117811 * gcc.dg/torture/pr117811.c: New test. (cherry picked from commit 7679b82)

PR middle-end/117811 PR testsuite/52641 gcc/testsuite/ * gcc.dg/torture/pr117811.c: Fix for int < 32 bit. (cherry picked from commit 07f229c)

r8-7538 for PR84968 made strip_typedefs_expr diagnose STATEMENT_LIST so that we reject statement-expressions in noexcept-specifiers to match our behavior in template arguments (which the parser diagnoses directly). Later r11-7452 made decltype(auto) deduction canonicalize the expression (as an implementation detail) which in turn calls strip_typedefs_expr, and so ever since we inadvertently reject decltype(auto) deduction of a statement-expression. This patch just removes the diagnostic in strip_typedefs_expr and instead treats statement-expressions similar to lambda-expressions. The function doesn't seem like the right place for such a diagnostic and so it seems easier to just accept rather than try to reject them in a suitable place. PR c++/116418 gcc/cp/ChangeLog: * tree.cc (strip_typedefs_expr) <case STATEMENT_LIST>: Replace this error path with ... <case STMT_EXPR>: ... this, returning the original tree. gcc/testsuite/ChangeLog: * g++.dg/eh/pr84968.C: No longer expect an ahead of time diagnostic for the statement-expresssion. Instantiate the template and expect an incomplete type error instead. * g++.dg/ext/stmtexpr26.C: New test. Reviewed-by: Jason Merrill <[email protected]> (cherry picked from commit 12bdcc3)

r13-6452-g341e6cd8d603a3 made build_extra_args walk evaluated contexts first so that we prefer processing a local specialization in an evaluated context even if its first use is in an unevaluated context. But this means we need to avoid walking a tree that already has extra args/specs saved because the list of saved specs appears to be an evaluated context which we'll now walk first. It seems then that we should be calculating the saved specs from scratch each time, rather than potentially walking the saved specs list from an earlier partial instantiation when calling build_extra_args a second time around. PR c++/114303 gcc/cp/ChangeLog: * constraint.cc (tsubst_requires_expr): Clear REQUIRES_EXPR_EXTRA_ARGS before calling build_extra_args. * pt.cc (tree_extra_args): Define. (extract_locals_r): Assert *_EXTRA_ARGS is empty. (tsubst_stmt) <case IF_STMT>: Clear IF_SCOPE on the new IF_STMT. Call build_extra_args on the new IF_STMT instead of t which might already have IF_STMT_EXTRA_ARGS. gcc/testsuite/ChangeLog: * g++.dg/cpp1z/constexpr-if-lambda6.C: New test. Reviewed-by: Jason Merrill <[email protected]> (cherry picked from commit b262b17)

Here when checking the access of (the injected-class-name) B in c->B::m at parse time, we notice its context B (now the type) is a base of the object type C<T>, so we proceed to use C<T> as the effective qualifying type. But this C<T> is the dependent specialization not the primary template type, so it has empty TYPE_BINFO, which leads to a segfault later from perform_or_defer_access_check. The reason the DERIVED_FROM_P (B, C<T>) test guarding this code path works despite C<T> having empty TYPE_BINFO is because of its currently_open_class logic (added in r9-713-gd9338471b91bbe) which replaces a dependent specialization with the primary template type if we're inside it. So the safest fix seems to be to call currently_open_class in the caller as well. PR c++/116320 gcc/cp/ChangeLog: * semantics.cc (check_accessibility_of_qualified_id): Try currently_open_class when using the object type as the effective qualifying type. gcc/testsuite/ChangeLog: * g++.dg/template/access42.C: New test. Reviewed-by: Jason Merrill <[email protected]> (cherry picked from commit 484f139)

Here we end up ICEing at instantiation time for the call to f<local_static> ultimately because we wrongly consider the call to be non-dependent, and so we specialize f ahead of time and then get confused when fully substituting this specialization. The call is dependent due to [temp.dep.temp]/3 and we miss that because function template-id arguments aren't coerced until overload resolution, and so the local static template argument lacks an implicit cast to reference type that value_dependent_expression_p looks for before considering dependence of the address. Other kinds of template-ids aren't affected since they're coerced ahead of time. So when considering dependence of a function template-id, we need to conservatively consider dependence of the address of each argument (if applicable). PR c++/117792 gcc/cp/ChangeLog: * pt.cc (type_dependent_expression_p): Consider the dependence of the address of each template argument of a function template-id. gcc/testsuite/ChangeLog: * g++.dg/cpp1z/nontype7.C: New test. Reviewed-by: Jason Merrill <[email protected]> (cherry picked from commit 40f0f6a)

atahanozbayram approved these changes Apr 2, 2024

View reviewed changes

NinaRanns pushed a commit to NinaRanns/gcc that referenced this pull request Jan 28, 2025

Merge pull request gcc-mirror#65 from iains/contracts-nonattr-rebase-…

5128291

…on-r15-7214-g0710024b5bd861 Contracts nonattr rebase on r15 7214 g0710024b5bd861

GCC Administrator and others added 27 commits February 23, 2025 00:19

Daily bump.

7466f74

Daily bump.

647dbc9

Daily bump.

07d24a1

Daily bump.

6517238

Daily bump.

9211a99

Daily bump.

90a791b

Daily bump.

b7fd09c

Daily bump.

6c95603

Daily bump.

cc26531

Daily bump.

7e92949

Daily bump.

283e554

Daily bump.

6eec3f7

Daily bump.

c96a7aa

Daily bump.

c5588cb

Daily bump.

11c933c

Daily bump.

ecdf944

Daily bump.

dfc6b28

Daily bump.

be4a99f

jakubjelinek and others added 30 commits June 13, 2025 13:10

testsuite: Add testcase for GCC 13 branch s390 bug [PR120480]

7c4f694

This got broken with r13-9727 and fixed with either of r13-9729 or r13-9728. 2025-05-30 Jakub Jelinek <[email protected]> PR target/120480 * gcc.dg/pr120480.c: New test. (cherry picked from commit c13d5b9)

Daily bump.

05607da

Daily bump.

d020bc8

Daily bump.

0a48efd

Daily bump.

e1cd88e

Fix test case for PR117811 which failed for int < 32 bit.

448750a

PR middle-end/117811 PR testsuite/52641 gcc/testsuite/ * gcc.dg/torture/pr117811.c: Fix for int < 32 bit. (cherry picked from commit 07f229c)

Daily bump.

c091c67

Daily bump.

70e1ccd

Daily bump.

5678064

Daily bump.

82f11c9

Daily bump.

8581635

Daily bump.

54b7cd9

Daily bump.

d6159c3

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Releases/gcc 12 #65

Releases/gcc 12 #65

Uh oh!

jacopobrusini commented Jun 4, 2022

Uh oh!

jwakely commented Feb 21, 2024

Uh oh!

Uh oh!

Releases/gcc 12 #65

Are you sure you want to change the base?

Releases/gcc 12 #65

Uh oh!

Conversation

jacopobrusini commented Jun 4, 2022

Uh oh!

jwakely commented Feb 21, 2024

Uh oh!

Uh oh!