git.ipfire.org Git - thirdparty/gcc.git/log

arm: Optimize arm-mlib.h header inclusion [pr108505].

I have committed a fix [1] into gcc trunk for a build
issue mentioned in pr108505 and latter received few upstream
comments proposing more robust fix for this issue.

In this patch I'm addressing those comments and sending this
as a followup patch.

gcc/ChangeLog:

2023-01-27 Srinath Parvathaneni <srinath.parvathaneni@arm.com>

PR target/108505
* config.gcc (tm_mlib_file): Define new variable.

arm: Fix inclusion of arm-mlib.h header more than once (pr108505).

The patch fixes the build issue for arm-none-eabi target configured with
--with-multilib-list=aprofile,rmprofile, in which case the header file
arm/arm-mlib.h is being included more than once and the toolchain build
is failing (PR108505).

gcc/ChangeLog:

2023-01-24 Srinath Parvathaneni <srinath.parvathaneni@arm.com>

PR target/108505
* config.gcc (tm_file): Move the variable out of loop.

arm: Documentation fix for -mbranch-protection option.

This patch fixes the documentation for -mbranch-protection command line option.

gcc/ChangeLog:

2023-01-23 Srinath Parvathaneni <srinath.parvathaneni@arm.com>

* doc/invoke.texi (-mbranch-protection): Update documentation.

arm: Add support for new frame unwinding instruction "0xb5".

This patch adds support for Arm frame unwinding instruction "0xb5" [1]. When
an exception is taken and "0xb5" instruction is encounter during runtime
stack-unwinding, we use effective vsp as modifier in pointer authentication.
On completion of stack unwinding if "0xb5" instruction is not encountered
then CFA will be used as modifier in pointer authentication.

[1] https://github.com/ARM-software/abi-aa/releases/download/2022Q3/ehabi32.pdf

libgcc/ChangeLog:

2022-11-09 Srinath Parvathaneni <srinath.parvathaneni@arm.com>

* config/arm/pr-support.c (__gnu_unwind_execute): Decode opcode
"0xb5".

arm: Add support for dwarf debug directives and pseudo hard-register for PAC feature.

This patch teaches the DWARF support in gcc about RA_AUTH_CODE pseudo hard-register and also
updates the ".save", ".cfi_register", ".cfi_offset", ".cfi_restore" directives accordingly.
This patch also adds support to emit ".pacspval" directive when "pac ip, lr, sp" instruction
in generated in the assembly.

RA_AUTH_CODE register number is 107 and it's dwarf register number is 143.

Applying this patch on top of PACBTI series posted here
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599658.html and when compiling the following
test.c with "-march=armv8.1-m.main+mve+pacbti -mbranch-protection=pac-ret -mthumb -mfloat-abi=hard
fasynchronous-unwind-tables -g -O0 -S" command line options, the assembly output after this patch
looks like below:

$cat test.c

void fun1(int a);
void fun(int a,...)
{
  fun1(a);
}

int main()
{
  fun (10);
  return 0;
}

$ arm-none-eabi-gcc -march=armv8.1-m.main+mve+pacbti -mbranch-protection=pac-ret -mthumb -mfloat-abi=hard
-fasynchronous-unwind-tables -g -O0 -S test.s

Assembly output:
...
fun:
...
        .pacspval
        pac     ip, lr, sp
        .cfi_register 143, 12
        push    {r3, r7, ip, lr}
        .save {r3, r7, ra_auth_code, lr}
...
        .cfi_offset 143, -24
...
        .cfi_restore 143
...
        aut     ip, lr, sp
        bx      lr
...
main:
...
        .pacspval
        pac     ip, lr, sp
        .cfi_register 143, 12
        push    {r3, r7, ip, lr}
        .save {r3, r7, ra_auth_code, lr}
...
        .cfi_offset 143, -8
...
        .cfi_restore 143
...
        aut     ip, lr, sp
        bx      lr
...

gcc/ChangeLog:

2023-01-11  Srinath Parvathaneni  <srinath.parvathaneni@arm.com>

* config/arm/aout.h (ra_auth_code): Add entry in enum.
* config/arm/arm.cc (emit_multi_reg_push): Add RA_AUTH_CODE register
to dwarf frame expression.
(arm_emit_multi_reg_pop): Restore RA_AUTH_CODE register.
(arm_expand_prologue): Update frame related information and reg notes
for pac/pacbit insn.
(arm_regno_class): Check for pac pseudo reigster.
(arm_dbx_register_number): Assign ra_auth_code register number in dwarf.
(arm_init_machine_status): Set pacspval_needed to zero.
(arm_debugger_regno): Check for PAC register.
(arm_unwind_emit_sequence): Print .save directive with ra_auth_code
register.
(arm_unwind_emit_set): Add entry for IP_REGNUM in switch case.
(arm_unwind_emit): Update REG_CFA_REGISTER case._
* config/arm/arm.h (FIRST_PSEUDO_REGISTER): Modify.
(DWARF_PAC_REGNUM): Define.
(IS_PAC_REGNUM): Likewise.
(enum reg_class): Add PAC_REG entry.
(machine_function): Add pacbti_needed state to structure.
* config/arm/arm.md (RA_AUTH_CODE): Define.

gcc/testsuite/ChangeLog:

2023-01-11  Srinath Parvathaneni  <srinath.parvathaneni@arm.com>

* g++.target/arm/pac-1.C: New test.
* gcc.target/arm/pac-15.c: Likewise.

arm: Add pacbti related multilib support for armv8.1-m.main.

This patch adds the support for pacbti multlilib linking by making
"-mbranch-protection=none" as default multilib option for arm-none-eabi
target.

Eg 1.

If the passed command line flags are (without mbranch-protection):
a) -march=armv8.1-m.main+mve -mfloat-abi=hard -mfpu=auto

"-mbranch-protection=none" will be used in the multilib matching.

Eg 2.

If the passed command line flags are (with mbranch-protection):
a) -march=armv8.1-m.main+mve+pacbti -mfloat-abi=hard -mfpu=auto  -mbranch-protection=pac-ret

"-mbranch-protection=standard" will be used in the multilib matching.

gcc/ChangeLog:

2023-01-11  Srinath Parvathaneni  <srinath.parvathaneni@arm.com>

* config.gcc ($tm_file): Update variable.
* config/arm/arm-mlib.h: Create new header file.
* config/arm/t-rmprofile (MULTI_ARCH_DIRS_RM): Rename mbranch-protection
multilib arch directory.
(MULTILIB_REUSE): Add multilib reuse rules.
(MULTILIB_MATCHES): Add multilib match rules.

gcc/testsuite/ChangeLog:

2023-01-11  Srinath Parvathaneni  <srinath.parvathaneni@arm.com>

* gcc.target/arm/multilib.exp (multilib_config "rmprofile"): Update
tests.
* gcc.target/arm/pac-12.c: New test.
* gcc.target/arm/pac-13.c: Likewise.
* gcc.target/arm/pac-14.c: Likewise.

arm: Add support for Arm Cortex-M85 CPU.

This patch adds the -mcpu support for the Arm Cortex-M85 CPU which is
an Armv8.1-M Mainline CPU supporting MVE and PACBTI by default.

-mpcu=cortex-m85 switch by default matches to -march=armv8.1-m.main+pacbti+mve.fp+fp.dp.

Also following options are provided to disable default features.
+nomve.fp (disables MVE Floating point)
+nomve (disables MVE Integer and MVE Floating point)
+nodsp (disables dsp, MVE Integer and MVE Floating point)
+nopacbti (disables pacbti)
+nofp (disables floating point and MVE floating point)

gcc/ChangeLog:

2022-08-12 Srinath Parvathaneni <srinath.parvathaneni@arm.com>

* config/arm/arm-cpus.in (cortex-m85): Define new CPU.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Likewise.
* doc/invoke.texi (Arm Options): Document -mcpu=cortex-m85.
* (-mfix-cmse-cve-2021-35465): Likewise.

gcc/testsuite/ChangeLog:

2022-08-12 Srinath Parvathaneni <srinath.parvathaneni@arm.com>

* gcc.target/arm/multilib.exp: Add tests for cortex-m85.

[Committed] arm: Document +no options for Cortex-M55 CPU.

This patch documents the following options for Arm Cortex-M55 CPU
under -mcpu= list.

+nomve.fp (disables MVE single precision floating point instructions)
+nomve (disables MVE integer and single precision floating point instructions)
+nodsp (disables dsp, MVE integer and single precision floating point instructions)
+nofp (disables floating point instructions)
Committed as obvious to master.

gcc/ChangeLog:

2022-08-12 Srinath Parvathaneni <srinath.parvathaneni@arm.com>

* doc/invoke.texi (Arm Options): Document -mcpu=cortex-m55 options.

arm: Implement arm Function target attribute 'branch-protection'

gcc/

* config/arm/arm.cc (arm_valid_target_attribute_rec): Add ARM function
attribute 'branch-protection' and parse its options.
* doc/extend.texi: Document ARM Function attribute 'branch-protection'.

gcc/testsuite/

* gcc.target/arm/acle/pacbti-m-predef-13.c: New test.

Co-Authored-By: Tejas Belagod <tbelagod@arm.com>

aarch64: Fix return_address_sign_ab_exception.C regression

Hi all,

this is to fix the regression of
g++.target/aarch64/return_address_sign_ab_exception.C that I
introduced with d8dadbc9a5199bf7bac1ab7376b0f84f45e94350.

'aarch_ra_sign_key' for aarch64 ended up being non defined in the opt
file and the function attribute "branch-protection=pac-ret+leaf+b-key"
stopped working as expected.

This patch moves the definition of 'aarch_ra_sign_key' to the opt
files for both Arm back-ends.

Regards

Andera Corallo

gcc/ChangeLog:

* config/aarch64/aarch64-protos.h (aarch_ra_sign_key): Remove
declaration.
* config/aarch64/aarch64.cc (aarch_ra_sign_key): Remove
definition.
* config/aarch64/aarch64.opt (aarch64_ra_sign_key): Rename
to 'aarch_ra_sign_key'.
* config/arm/aarch-common.cc (aarch_ra_sign_key): Remove
declaration.
* config/arm/arm-protos.h (aarch_ra_sign_key): Likewise.
* config/arm/arm.cc (enum aarch_key_type): Remove definition.
* config/arm/arm.opt: Define.

[PATCH 12/15] arm: implement bti injection

Hi all,

this patch enables Branch Target Identification Armv8.1-M Mechanism
[1].

This is achieved by using the bti pass made common with Aarch64.

The pass iterates through the instructions and adds the necessary BTI
instructions at the beginning of every function and at every landing
pads targeted by indirect jumps.

Best Regards

  Andrea

[1]
<https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension>

gcc/ChangeLog

2022-04-07  Andrea Corallo  <andrea.corallo@arm.com>

* config.gcc (arm*-*-*): Add 'aarch-bti-insert.o' object.
* config/arm/arm-protos.h: Update.
* config/arm/aarch-common-protos.h: Declare
'aarch_bti_arch_check'.
* config/arm/arm.cc (aarch_bti_enabled) Update.
(aarch_bti_j_insn_p, aarch_pac_insn_p, aarch_gen_bti_c)
(aarch_gen_bti_j, aarch_bti_arch_check): New functions.
* config/arm/arm.md (bti_nop): New insn.
* config/arm/t-arm (PASSES_EXTRA): Add 'arm-passes.def'.
(aarch-bti-insert.o): New target.
* config/arm/unspecs.md (VUNSPEC_BTI_NOP): New unspec.
* config/arm/aarch-bti-insert.cc (rest_of_insert_bti): Verify arch
compatibility.
(gate): Make use of 'aarch_bti_arch_check'.
* config/arm/arm-passes.def: New file.
* config/aarch64/aarch64.cc (aarch_bti_arch_check): New function.

gcc/testsuite/ChangeLog

2022-04-07  Andrea Corallo  <andrea.corallo@arm.com>

* gcc.target/arm/bti-1.c: New testcase.
* gcc.target/arm/bti-2.c: Likewise.

[PATCH 11/15] aarch64: Make bti pass generic so it can be used by the arm backend

Hi all,

this patch splits and restructures the aarch64 bti pass code in order
to have it usable by the arm backend as well. These changes have no
functional impact.

Best Regards

Andrea

gcc/Changelog

* config.gcc (aarch64*-*-*): Rename 'aarch64-bti-insert.o' into
'aarch-bti-insert.o'.
* config/aarch64/aarch64-protos.h: Remove 'aarch64_bti_enabled'
proto.
* config/aarch64/aarch64.cc (aarch_bti_enabled): Rename.
(aarch_bti_j_insn_p, aarch_pac_insn_p): New functions.
(aarch64_output_mi_thunk)
(aarch64_print_patchable_function_entry)
(aarch64_file_end_indicate_exec_stack): Update renamed function
calls to renamed functions.
* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Likewise.
* config/aarch64/t-aarch64 (aarch-bti-insert.o): Update
target.
* config/aarch64/aarch64-bti-insert.cc: Delete.
* config/arm/aarch-bti-insert.cc: New file including and
generalizing code from aarch64-bti-insert.cc.
* config/arm/aarch-common-protos.h: Update.

[PATCH 10/15] arm: Implement cortex-M return signing address codegen

Hi all,

this patch enables address return signature and verification based on
Armv8.1-M Pointer Authentication [1].

To sign the return address, we use the PAC R12, LR, SP instruction
upon function entry.  This is signing LR using SP and storing the
result in R12.  R12 will be pushed into the stack.

During function epilogue R12 will be popped and AUT R12, LR, SP will
be used to verify that the content of LR is still valid before return.

Here an example of PAC instrumented function prologue and epilogue:

void foo (void);

int main()
{
  foo ();
  return 0;
}

Compiled with '-march=armv8.1-m.main -mbranch-protection=pac-ret
-mthumb' translates into:

main:
pac ip, lr, sp
push {r3, r7, ip, lr}
add r7, sp, #0
bl foo
movs r3, #0
mov r0, r3
pop {r3, r7, ip, lr}
aut ip, lr, sp
bx lr

The patch also takes care of generating a PACBTI instruction in place
of the sequence BTI+PAC when Branch Target Identification is enabled
contextually.

Ex. the previous example compiled with '-march=armv8.1-m.main
-mbranch-protection=pac-ret+bti -mthumb' translates into:

main:
pacbti ip, lr, sp
push {r3, r7, ip, lr}
add r7, sp, #0
bl foo
movs r3, #0
mov r0, r3
pop {r3, r7, ip, lr}
aut ip, lr, sp
bx lr

As part of previous upstream suggestions a test for varargs has been
added and '-mtpcs-frame' is deemed being incompatible with this return
signing address feature being introduced.

[1] <https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension>

gcc/
* config/arm/arm.h (arm_arch8m_main): Declare it.
* config/arm/arm-protos.h (arm_current_function_pac_enabled_p):
Declare it.
* config/arm/arm.cc (arm_arch8m_main): Define it.
(arm_option_reconfigure_globals): Set arm_arch8m_main.
(arm_compute_frame_layout, arm_expand_prologue)
(thumb2_expand_return, arm_expand_epilogue)
(arm_conditional_register_usage): Update for pac codegen.
(arm_current_function_pac_enabled_p): New function.
(aarch_bti_enabled) New function.
(use_return_insn): Return zero when pac is enabled.
* config/arm/arm.md (pac_ip_lr_sp, pacbti_ip_lr_sp, aut_ip_lr_sp):
Add new patterns.
* config/arm/unspecs.md (UNSPEC_PAC_NOP)
(VUNSPEC_PACBTI_NOP, VUNSPEC_AUT_NOP): Add unspecs.

gcc/testsuite/

* gcc.target/arm/pac.h : New file.
* gcc.target/arm/pac-1.c : New test case.
* gcc.target/arm/pac-2.c : Likewise.
* gcc.target/arm/pac-3.c : Likewise.
* gcc.target/arm/pac-4.c : Likewise.
* gcc.target/arm/pac-5.c : Likewise.
* gcc.target/arm/pac-6.c : Likewise.
* gcc.target/arm/pac-7.c : Likewise.
* gcc.target/arm/pac-8.c : Likewise.
* gcc.target/arm/pac-9.c : Likewise.
* gcc.target/arm/pac-10.c : Likewise.
* gcc.target/arm/pac-11.c : Likewise.

[PATCH 8/15] arm: Introduce multilibs for PACBTI target feature

This patch add the following new multilibs.

thumb/v8.1-m.main+pacbti/mbranch-protection/nofp
thumb/v8.1-m.main+pacbti+dp/mbranch-protection/soft
thumb/v8.1-m.main+pacbti+dp/mbranch-protection/hard
thumb/v8.1-m.main+pacbti+fp/mbranch-protection/soft
thumb/v8.1-m.main+pacbti+fp/mbranch-protection/hard
thumb/v8.1-m.main+pacbti+mve/mbranch-protection/hard

Triggering the following compiler flags:

-mthumb -march=armv8.1-m.main+pacbti -mbranch-protection=standard -mfloat-abi=soft
-mthumb -march=armv8.1-m.main+pacbti+fp -mbranch-protection=standard -mfloat-abi=softfp
-mthumb -march=armv8.1-m.main+pacbti+fp -mbranch-protection=standard -mfloat-abi=hard
-mthumb -march=armv8.1-m.main+pacbti+fp.dp -mbranch-protection=standard -mfloat-abi=softfp
-mthumb -march=armv8.1-m.main+pacbti+fp.dp -mbranch-protection=standard -mfloat-abi=hard
-mthumb -march=armv8.1-m.main+pacbti+mve -mbranch-protection=standard -mfloat-abi=hard

gcc/

* config/arm/t-rmprofile: Add multilib rules for march +pacbti and
mbranch-protection.

gcc/testsuite/

* gcc.target/arm/multilib.exp: Add pacbti related entries.

[PATCH 7/15] arm: Emit build attributes for PACBTI target feature

This patch emits assembler directives for PACBTI build attributes as
defined by the
ABI.

<https://github.com/ARM-software/abi-aa/releases/download/2021Q1/addenda32.pdf>

gcc/ChangeLog:

* config/arm/arm.cc (arm_file_start): Emit EABI attributes for
Tag_PAC_extension, Tag_BTI_extension, TAG_BTI_use, TAG_PACRET_use.

gcc/testsuite/ChangeLog:

* gcc.target/arm/acle/pacbti-m-predef-1.c: New test.
* gcc.target/arm/acle/pacbti-m-predef-3.c: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-6.c: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-7.c: Likewise.

Co-Authored-By: Tejas Belagod <tbelagod@arm.com>

[PATCH 6/15] arm: Add pointer authentication for stack-unwinding runtime

This patch adds authentication for when the stack is unwound when an
exception is taken. All the changes here are done to the runtime code
in libgcc's unwinder code for Arm target. All the changes are guarded
under defined (__ARM_FEATURE_PAC_DEFAULT) and activated only if the
+pacbti feature is switched on for the architecture. This means that
switching on the target feature via -march or -mcpu is sufficient and
-mbranch-protection need not be enabled. This ensures that the
unwinder is authenticated only if the PACBTI instructions are
available in the non-NOP space as it uses AUTG. Just generating
PAC/AUT instructions using -mbranch-protection will not enable
authentication on the unwinder.

Pre-approved with the requested changes here
<https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586555.html>.

gcc/ChangeLog:

* ginclude/unwind-arm-common.h (_Unwind_VRS_RegClass): Introduce
new pseudo register class _UVRSC_PAC.

libgcc/ChangeLog:
* config/arm/pr-support.c (__gnu_unwind_execute): Decode
exception opcode (0xb4) for saving RA_AUTH_CODE and authenticate
with AUTG if found.
* config/arm/unwind-arm.c (struct pseudo_regs): New.
(phase1_vrs): Introduce new field to store pseudo-reg state.
(phase2_vrs): Likewise.
(_Unwind_VRS_Get): Load pseudo register state from virtual reg set.
(_Unwind_VRS_Set): Store pseudo register state to virtual reg set.
(_Unwind_VRS_Pop): Load pseudo register value from stack into VRS.

Co-Authored-By: Tejas Belagod <tbelagod@arm.com>
Co-Authored-By: Srinath Parvathaneni <srinath.parvathaneni@arm.com>

[PATCH 5/15] arm: Implement target feature macros for PACBTI

This patch implements target feature macros when PACBTI is enabled
through the -march option or -mbranch-protection. The target feature
macros __ARM_FEATURE_PAC_DEFAULT and __ARM_FEATURE_BTI_DEFAULT are
specified in ARM ACLE
<https://developer.arm.com/documentation/101028/0012/5--Feature-test-macros?lang=en>
__ARM_FEATURE_PAUTH and __ARM_FEATURE_BTI are specified in the
pull-request <https://github.com/ARM-software/acle/pull/55>.

Approved here
<https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586334.html>.

gcc/

* config/arm/arm-c.cc (arm_cpu_builtins): Define
__ARM_FEATURE_BTI_DEFAULT, __ARM_FEATURE_PAC_DEFAULT,
__ARM_FEATURE_PAUTH and __ARM_FEATURE_BTI.

gcc/testsuite/

* lib/target-supports.exp
(check_effective_target_mbranch_protection_ok): New function.
* gcc.target/arm/acle/pacbti-m-predef-2.c: New test.
* gcc.target/arm/acle/pacbti-m-predef-4.c: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-5.c: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-8.c: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-9.c: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-10.c: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-11.c: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-12.c: Likewise.

Co-Authored-By: Tejas Belagod <tbelagod@arm.com>

[PATCH 4/15] arm: Add testsuite library support for PACBTI target

Add targeting-checking entities for PACBTI in testsuite
framework.

Pre-approved with the requested changes here
<https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586331.html>.

gcc/testsuite/ChangeLog

* lib/target-supports.exp:
(check_effective_target_arm_pacbti_hw): New.

gcc/ChangeLog:
* doc/sourcebuild.texi: Document arm_pacbti_hw.

Co-Authored-By: Tejas Belagod <tbelagod@arm.com>

[PATCH 3/15] arm: Add option -mbranch-protection

Add -mbranch-protection option. This option enables the
code-generation of pointer signing and authentication instructions in
function prologues and epilogues.

gcc/ChangeLog:

* config/arm/arm.cc (arm_configure_build_target): Parse and validate
-mbranch-protection option and initialize appropriate data structures.
* config/arm/arm.opt (-mbranch-protection): New option.
* doc/invoke.texi (Arm Options): Document it.

Co-Authored-By: Tejas Belagod <tbelagod@arm.com>
Co-Authored-By: Richard Earnshaw <Richard.Earnshaw@arm.com>

[PATCH 2/15] arm: Add Armv8.1-M Mainline target feature +pacbti

This patch adds the -march feature +pacbti to Armv8.1-M Mainline.
This feature enables pointer signing and authentication instructions
on M-class architectures.

Pre-approved here
<https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586144.html>.

gcc/Changelog:

* config/arm/arm.h (TARGET_HAVE_PACBTI): New macro.
* config/arm/arm-cpus.in (pacbti): New feature.
* doc/invoke.texi (Arm Options): Document it.

Co-Authored-By: Tejas Belagod <tbelagod@arm.com>

[PATCH 1/15] arm: Make mbranch-protection opts parsing common to AArch32/64

Hi all,

This change refactors all the mbranch-protection option parsing code and
types to make it common to both AArch32 and AArch64 backends.

This change also pulls in some supporting types from AArch64 to make
it common (aarch_parse_opt_result).

The significant changes in this patch are the movement of all branch
protection parsing routines from aarch64.c to aarch-common.c and
supporting data types and static data structures.

This patch also pre-declares variables and types required in the
aarch32 back-end for moved variables for function sign scope and key
to prepare for the impending series of patches that support parsing
the feature mbranch-protection in the aarch32 back-end.

gcc/ChangeLog:

* common/config/aarch64/aarch64-common.cc: Include aarch-common.h.
(all_architectures): Fix comment.
(aarch64_parse_extension): Rename return type, enum value names.
* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Rename
factored out aarch_ra_sign_scope and aarch_ra_sign_key variables.
Also rename corresponding enum values.
* config/aarch64/aarch64-opts.h (aarch64_function_type): Factor
out aarch64_function_type and move it to common code as
aarch_function_type in aarch-common.h.
* config/aarch64/aarch64-protos.h: Include common types header,
move out types aarch64_parse_opt_result and aarch64_key_type to
aarch-common.h
* config/aarch64/aarch64.cc: Move mbranch-protection parsing types
and functions out into aarch-common.h and aarch-common.cc. Fix up
all the name changes resulting from the move.
* config/aarch64/aarch64.md: Fix up aarch64_ra_sign_key type name change
and enum value.
* config/aarch64/aarch64.opt: Include aarch-common.h to import
type move. Fix up name changes from factoring out common code and
data.
* config/arm/aarch-common-protos.h: Export factored out routines to both
backends.
* config/arm/aarch-common.cc: Include newly factored out types.
Move all mbranch-protection code and data structures from
aarch64.cc.
* config/arm/aarch-common.h: New header that declares types shared
between aarch32 and aarch64 backends.
* config/arm/arm-protos.h: Declare types and variables that are
made common to aarch64 and aarch32 backends - aarch_ra_sign_key,
aarch_ra_sign_scope and aarch_enable_bti.
* config/arm/arm.opt (config/arm/aarch-common.h): Include header.
(aarch_ra_sign_scope, aarch_enable_bti): Declare variable.
* config/arm/arm.cc: Add missing includes.

Co-Authored-By: Tejas Belagod <tbelagod@arm.com>

Fix small regression in Ada

gcc/
* gimplify.cc (gimplify_save_expr): Add missing guard.
gcc/testsuite/
* gnat.dg/shift2.adb: New test.

libstdc++: Add missing free functions for atomic_flag [PR103934]

This patch adds -
atomic_flag_test
atomic_flag_test_explicit

Which were missed when commit 491ba6 introduced C++20 atomic flag
test.

libstdc++-v3/ChangeLog:

PR libstdc++/103934
* include/std/atomic (atomic_flag_test): Add.
(atomic_flag_test_explicit): Add.
* testsuite/29_atomics/atomic_flag/test/explicit.cc: Add
test case to cover missing atomic_flag free functions.
* testsuite/29_atomics/atomic_flag/test/implicit.cc:
Likewise.

(cherry picked from commit a8d769045b43e8509490362865a85cb31a855ccf)

Daily bump.

rs6000: Fix typo on vec_vsubcuq in rs6000-overload.def [PR108396]

As Andrew pointed out in PR108396, there is one typo in
rs6000-overload.def on built-in function vec_vsubcuq:

[VEC_VSUBCUQ, vec_vsubcuqP, __builtin_vec_vsubcuq]

"vec_vsubcuqP" should be "vec_vsubcuq", this typo caused
us to define vec_vsubcuqP in rs6000-vecdefines.h instead
of vec_vsubcuq, so that compiler is not able to realize
the built-in function name vec_vsubcuq any more.

Co-authored-By: Andrew Pinski <apinski@marvell.com>
PR target/108396

gcc/ChangeLog:

* config/rs6000/rs6000-overload.def (VEC_VSUBCUQ): Fix typo
vec_vsubcuqP with vec_vsubcuq.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr108396.c: New test.

(cherry picked from commit aaf29ae6cdbaad58b709a77784375d15138174b3)

rs6000: Teach rs6000_opaque_type_invalid_use_p about gcall [PR108348]

PR108348 shows one special case that MMA opaque types are
used in function arguments and treated as pass by reference,
it results in one copying from argument to a temp variable,
since this copying happens before rs6000_function_arg check,
it can cause ICE without MMA support then. This patch is to
teach function rs6000_opaque_type_invalid_use_p to check if
any function argument in a gcall stmt has the invalid use of
MMA opaque types.

btw, I checked the handling on return value, it doesn't have
this kind of issue as its checking and error emission is quite
early, so this doesn't handle function return value.

PR target/108348

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_opaque_type_invalid_use_p): Add the
support for invalid uses of MMA opaque type in function arguments.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr108348-1.c: New test.
* gcc.target/powerpc/pr108348-2.c: New test.

(cherry picked from commit 5d9529687deb9ed009361a16c02a7f6c3e2ebbf3)

rs6000: Teach rs6000_opaque_type_invalid_use_p about inline asm [PR108272]

As PR108272 shows, there are some invalid uses of MMA opaque
types in inline asm statements. This patch is to teach the
function rs6000_opaque_type_invalid_use_p for inline asm,
check and error any invalid use of MMA opaque types in input
and output operands.

PR target/108272

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_opaque_type_invalid_use_p): Add the
support for invalid uses in inline asm, factor out the checking and
erroring to lambda function check_and_error_invalid_use.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr108272-1.c: New test.
* gcc.target/powerpc/pr108272-2.c: New test.
* gcc.target/powerpc/pr108272-3.c: New test.
* gcc.target/powerpc/pr108272-4.c: New test.

(cherry picked from commit 074b0c03eabeb8e9c8de813c81bf87a1f88fdb65)

Daily bump.

Suppress -fstack-protector warning on hppa.

Some package builds enable -fstack-protector and -Werror. Since
-fstack-protector is not supported on hppa because the stack grows
up, these packages must check for the warning generated by
-fstack-protector and suppress it on hppa. This is problematic
since hppa is the only significant architecture where the stack
grows up.

2022-12-16 John David Anglin <danglin@gcc.gnu.org>

gcc/ChangeLog:

* config/pa/pa.cc (pa_option_override): Disable -fstack-protector.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp (check_effective_target_static): Return 0
on hppa*-*-*.

Daily bump.

i386: Fix up -Wuninitialized warnings in avx512erintrin.h [PR105593]

As reported in the PR, there are some -Wuninitialized warnings in
avx512erintrin.h.  One can see that by compiling sse-23.c testcase with
-Wuninitialized (or when actually using those intrinsics).
Those 6 spots use an uninitialized variable and pass it as one of the
argument to a builtin with constant mask -1, because there is no unmasked
builtin.  It is true that expansion of those builtins into RTL will see
mask is all ones and ignore the unneeded argument, but -Wuninitialized
is diagnosed on GIMPLE and on GIMPLE these builtins are just builtin calls.
avx512fintrin.h and other headers use in these cases the _mm*_undefined_* ()
intrinsics, like:
  return (__m512i) __builtin_ia32_psrav8di_mask ((__v8di) __X,
                                                 (__v8di) __Y,
                                                 (__v8di)
                                                 _mm512_undefined_epi32 (),
                                                 (__mmask8) -1);
etc. and the following patch does the same for avx512erintrin.h.
With the recent changes in C++ FE and the _mm*_undefined_* intrinsics,
we don't emit -Wuninitialized warnings for those (previously we didn't
just in C due to self-initialization).  Of course we could also
just self-initialize these uninitialized vars and add the #pragma GCC
diagnostic dances around it, but using the intrinsics is consistent with
the rest and IMHO cleaner.

2023-01-31  Jakub Jelinek  <jakub@redhat.com>

PR c++/105593
* config/i386/avx512erintrin.h (_mm512_exp2a23_round_pd,
_mm512_exp2a23_round_ps, _mm512_rcp28_round_pd, _mm512_rcp28_round_ps,
_mm512_rsqrt28_round_pd, _mm512_rsqrt28_round_ps): Use
_mm512_undefined_pd () or _mm512_undefined_ps () instead of using
uninitialized automatic variable __W.

* gcc.target/i386/sse-23.c: Add -Wuninitialized to dg-options.

(cherry picked from commit 41602390456901c14ecdfa2fa64c3cebd5b6ff09)

x86: Avoid -Wuninitialized warnings on _mm*_undefined_* in C++ [PR105593]

In https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609844.html
I've posted a patch to allow ignoring -Winit-self using GCC diagnostic
pragmas, such that one can mark self-initialization as intentional
disabling of -Wuninitialized warnings.

The following incremental patch uses that in the x86 intrinsic
headers.

2023-01-16 Jakub Jelinek <jakub@redhat.com>

PR c++/105593
gcc/
* config/i386/xmmintrin.h (_mm_undefined_ps): Temporarily
disable -Winit-self using pragma GCC diagnostic ignored.
* config/i386/emmintrin.h (_mm_undefined_pd, _mm_undefined_si128):
Likewise.
* config/i386/avxintrin.h (_mm256_undefined_pd, _mm256_undefined_ps,
_mm256_undefined_si256): Likewise.
* config/i386/avx512fintrin.h (_mm512_undefined_pd,
_mm512_undefined_ps, _mm512_undefined_epi32): Likewise.
* config/i386/avx512fp16intrin.h (_mm_undefined_ph,
_mm256_undefined_ph, _mm512_undefined_ph): Likewise.
gcc/testsuite/
* g++.target/i386/pr105593.C: New test.

(cherry picked from commit 6b0907b4fc455377e5f8109f427d97da02b6aec9)

c, c++: Allow ignoring -Winit-self through pragmas [PR105593]

As mentioned in the PR, various x86 intrinsics need to return
an uninitialized vector. Currently they use self initialization
to avoid -Wuninitialized warnings, which works fine in C, but
doesn't work in C++ where -Winit-self is enabled in -Wall.
We don't have an attribute to mark a variable as knowingly
uninitialized (the uninitialized attribute exists but means
something else, only in the -ftrivial-auto-var-init context),
and trying to suppress either -Wuninitialized or -Winit-self
inside of the _mm_undefined_ps etc. intrinsic definitions
doesn't work, one needs to currently disable through pragmas
-Wuninitialized warning at the point where _mm_undefined_ps etc.
result is actually used, but that goes against the intent of
those intrinsics.

The -Winit-self warning option actually doesn't do any warning,
all we do is record a suppression for -Winit-self if !warn_init_self
on the decl definition and later look that up in uninit pass.

The following patch changes those !warn_init_self tests which
are true only based on the command line option setting, not based
on GCC diagnostic pragma overrides to
!warning_enabled_at (DECL_SOURCE_LOCATION (decl), OPT_Winit_self)
such that it takes them into account.

2023-01-16 Jakub Jelinek <jakub@redhat.com>

PR c++/105593
gcc/c/
* c-parser.cc (c_parser_initializer): Check warning_enabled_at
at the DECL_SOURCE_LOCATION (decl) for OPT_Winit_self instead
of warn_init_self.
gcc/cp/
* decl.cc (cp_finish_decl): Check warning_enabled_at
at the DECL_SOURCE_LOCATION (decl) for OPT_Winit_self instead
of warn_init_self.
gcc/testsuite/
* c-c++-common/Winit-self3.c: New test.
* c-c++-common/Winit-self4.c: New test.
* c-c++-common/Winit-self5.c: New test.

(cherry picked from commit 98b41fd4045b7856e7b85dd58d67c600bd909379)

c-family: Honor -Wno-init-self for cv-qual vars [PR102633]

Since r11-5188-g32934a4f45a721, we drop qualifiers during l-to-r
conversion by creating a NOP_EXPR.  For e.g.

  const int i = i;

that means that the DECL_INITIAL is '(int) i' and not 'i' anymore.
Consequently, we don't suppress_warning here:

711     case DECL_EXPR:
715       if (VAR_P (DECL_EXPR_DECL (*expr_p))
716           && !DECL_EXTERNAL (DECL_EXPR_DECL (*expr_p))
717           && !TREE_STATIC (DECL_EXPR_DECL (*expr_p))
718           && (DECL_INITIAL (DECL_EXPR_DECL (*expr_p)) == DECL_EXPR_DECL (*expr_p))
719           && !warn_init_self)
720         suppress_warning (DECL_EXPR_DECL (*expr_p), OPT_Winit_self);

because of the check on line 718 -- (int) i is not i.  So -Wno-init-self
doesn't disable the warning as it's supposed to.

The following patch fixes it by moving the suppress_warning call from
c_gimplify_expr to the front ends, at points where we haven't created
the NOP_EXPR yet.

PR middle-end/102633

gcc/c-family/ChangeLog:

* c-gimplify.cc (c_gimplify_expr) <case DECL_EXPR>: Don't call
suppress_warning here.

gcc/c/ChangeLog:

* c-parser.cc (c_parser_initializer): Add new tree parameter.  Use it.
Call suppress_warning.
(c_parser_declaration_or_fndef): Pass d down to c_parser_initializer.
(c_parser_omp_declare_reduction): Pass omp_priv down to
c_parser_initializer.

gcc/cp/ChangeLog:

* decl.cc (cp_finish_decl): Call suppress_warning.

gcc/testsuite/ChangeLog:

* c-c++-common/Winit-self1.c: New test.
* c-c++-common/Winit-self2.c: New test.

(cherry picked from commit 04ce2400b35225302e0d6883bb0817378180f5d7)

c++: Handle structured bindings like anon unions in initializers [PR108474]

As reported by Andrew Pinski, structured bindings (with the exception
of the ones using std::tuple_{size,element} and get which are really
standalone variables in addition to the binding one) also use
DECL_VALUE_EXPR and needs the same treatment in static initializers.

On Sun, Jan 22, 2023 at 07:19:07PM -0500, Jason Merrill wrote:
> Though, actually, why not instead fix expand_expr_real_1 (and staticp) to
> look through DECL_VALUE_EXPR?

Doing it when emitting the initializers seems to be too late to me,
we in various spots try to put parts of the static var DECL_INITIAL expressions
into the IL, or e.g. for varpool purposes remember which vars are referenced
there.

This patch moves it to record_reference, which is called from varpool_node::analyze
and so about the same time as gimplification of the bodies which also
replaces DECL_VALUE_EXPRs.

2023-01-24 Jakub Jelinek <jakub@redhat.com>

PR c++/108474
* cp-gimplify.cc (cp_fold_r): Handle structured bindings
vars like anon union artificial vars.

* g++.dg/cpp1z/decomp57.C: New test.
* g++.dg/cpp1z/decomp58.C: New test.

(cherry picked from commit b84e21115700523b4d0ac44275443f7b9c670344)

forwprop: Further fixes for simplify_rotate [PR108440]

As mentioned in the simplify_rotate comment, for e.g.
   ((T) ((T2) X << (Y & (B - 1)))) | ((T) ((T2) X >> ((-Y) & (B - 1))))
we already emit
   X r<< (Y & (B - 1))
as replacement.  This PR is about the
   ((T) ((T2) X << Y)) OP ((T) ((T2) X >> (B - Y)))
   ((T) ((T2) X << (int) Y)) OP ((T) ((T2) X >> (int) (B - Y)))
forms if T2 is wider than T.  Unlike e.g.
   (X << Y) OP (X >> (B - Y))
which is valid just for Y in [1, B - 1], the above 2 forms are actually
valid and do the rotates for Y in [0, B] - for Y 0 the X value is preserved
by the left shift and right logical shift by B adds just zeros (but because
the shift is in wider precision B is still valid shift count), while for
Y equal to B X is preserved through the latter shift and the former adds
just zeros.
Now, it is unclear if we in the middle-end treat rotates with rotate count
equal or larger than precision as UB or not, unlike shifts there are less
reasons to do so, but e.g. expansion of X r<< Y if there is no rotate optab
for the mode is emitted as (X << Y) | (((unsigned) X) >> ((-Y) & (B - 1)))
and so with UB on Y == B.

The following patch does multiple things:
1) for the above 2, asks the ranger if Y could be equal to B and if so,
   instead of using X r<< Y uses X r<< (Y & (B - 1))
2) for the
   ((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
   ((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
   forms that were fixed 2 days ago it only punts if Y might be in the
   [B,B2-1] range but isn't known to be in the
   [0,B][2*B,2*B][3*B,3*B]... range.  Because for Y which is a multiple
   of B but smaller than B2 it acts as a rotate too, left shift provides
   0 and (-Y) & (B - 1) is 0 and so preserves X.  Though, for the cases
   where Y is not known to be in [0,B-1] the patch also uses
   X r<< (Y & (B - 1)) rather than X r<< Y
3) as discussed with Aldy, instead of using global ranger it uses a pass
   specific copy but lazily created on first simplify_rotate that needs it;
   this e.g. handles rotate inside of if body where the guarding condition
   limits the shift count to some range which will not work with the
   global ranger (unless there is some SSA_NAME to attach the range to).

Note, e.g. on x86 X r<< (Y & (B - 1)) and X r<< Y actually emit the
same assembly because rotates work the same even for larger rotate counts,
but that is handled only during combine.

2023-01-19  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/108440
* tree-ssa-forwprop.cc: Include gimple-range.h.
(simplify_rotate): For the forms with T2 wider than T and shift counts of
Y and B - Y add & (B - 1) masking for the rotate count if Y could be equal
to B.  For the forms with T2 wider than T and shift counts of
Y and (-Y) & (B - 1), don't punt if range could be [B, B2], but only if
range doesn't guarantee Y < B or Y = N * B.  If range doesn't guarantee
Y < B, also add & (B - 1) masking for the rotate count.  Use lazily created
pass specific ranger instead of get_global_range_query.
(pass_forwprop::execute): Disable that ranger at the end of pass if it has
been created.

* c-c++-common/rotate-10.c: New test.
* c-c++-common/rotate-11.c: New test.

(cherry picked from commit 05b9868b182bb9ed2013b39a0bc6297354a0db49)

forwprop: Fix up rotate pattern matching [PR106523]

The comment above simplify_rotate roughly describes what patterns
are matched into what:
   We are looking for X with unsigned type T with bitsize B, OP being
   +, | or ^, some type T2 wider than T.  For:
   (X << CNT1) OP (X >> CNT2)                           iff CNT1 + CNT2 == B
   ((T) ((T2) X << CNT1)) OP ((T) ((T2) X >> CNT2))     iff CNT1 + CNT2 == B

   transform these into:
   X r<< CNT1

   Or for:
   (X << Y) OP (X >> (B - Y))
   (X << (int) Y) OP (X >> (int) (B - Y))
   ((T) ((T2) X << Y)) OP ((T) ((T2) X >> (B - Y)))
   ((T) ((T2) X << (int) Y)) OP ((T) ((T2) X >> (int) (B - Y)))
   (X << Y) | (X >> ((-Y) & (B - 1)))
   (X << (int) Y) | (X >> (int) ((-Y) & (B - 1)))
   ((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
   ((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))

   transform these into (last 2 only if ranger can prove Y < B):
   X r<< Y

   Or for:
   (X << (Y & (B - 1))) | (X >> ((-Y) & (B - 1)))
   (X << (int) (Y & (B - 1))) | (X >> (int) ((-Y) & (B - 1)))
   ((T) ((T2) X << (Y & (B - 1)))) | ((T) ((T2) X >> ((-Y) & (B - 1))))
   ((T) ((T2) X << (int) (Y & (B - 1)))) \
     | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))

   transform these into:
   X r<< (Y & (B - 1))

The following testcase shows that 2 of these are problematic.
If T2 is wider than T, then the 2 which yse (-Y) & (B - 1) on one
of the shift counts but Y on the can do something different from
rotate.  E.g.:
__attribute__((noipa)) unsigned char
f7 (unsigned char x, unsigned int y)
{
  unsigned int t = x;
  return (t << y) | (t >> ((-y) & 7));
}
if y is [0, 7], then it is a normal rotate, and if y is in [32, ~0U]
then it is UB, but for y in [9, 31] the left shift in this case
will never leave any bits in the result, while in a rotate they are
left there.  Say for y 5 and x 0xaa the expression gives
0x55 which is the same thing as rotate, while for y 19 and x 0xaa
0x5, which is different.
Now, I believe the
   ((T) ((T2) X << Y)) OP ((T) ((T2) X >> (B - Y)))
   ((T) ((T2) X << (int) Y)) OP ((T) ((T2) X >> (int) (B - Y)))
forms are ok, because B - Y still needs to be a valid shift count,
and if Y > B then B - Y should be either negative or very large
positive (for unsigned types).
And similarly the last 2 cases above which use & (B - 1) on both
shift operands are definitely ok.

The following patch disables the
   ((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
   ((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
unless ranger says Y is not in [B, B2 - 1] range.

And, looking at it again this morning, actually the Y equal to B
case is still fine, if Y is equal to 0, then it is
(T) (((T2) X << 0) | ((T2) X >> 0))
and so X, for Y == B it is
(T) (((T2) X << B) | ((T2) X >> 0))
which is the same as
(T) (0 | ((T2) X >> 0))
which is also X.  So instead of the [B, B2 - 1] range we could use
[B + 1, B2 - 1].  And, if we wanted to go further, even multiplies
of B are ok if they are smaller than B2, so we could construct a detailed
int_range_max if we wanted.

2023-01-17  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/106523
* tree-ssa-forwprop.cc (simplify_rotate): For the
patterns with (-Y) & (B - 1) in one operand's shift
count and Y in another, if T2 has wider precision than T,
punt if Y could have a value in [B, B2 - 1] range.

* c-c++-common/rotate-2.c (f5, f6, f7, f8, f13, f14, f15, f16,
f37, f38, f39, f40, f45, f46, f47, f48): Add assertions using
__builtin_unreachable about shift count.
* c-c++-common/rotate-2b.c: New test.
* c-c++-common/rotate-4.c (f5, f6, f7, f8, f13, f14, f15, f16,
f37, f38, f39, f40, f45, f46, f47, f48): Add assertions using
__builtin_unreachable about shift count.
* c-c++-common/rotate-4b.c: New test.
* gcc.c-torture/execute/pr106523.c: New test.

(cherry picked from commit 001121e8921d5d1a439ce0e64ab04c5959b0bfd8)

c++: Avoid incorrect shortening of divisions [PR108365]

The following testcase is miscompiled, because we shorten the division
in a case where it should not be shortened.
Divisions (and modulos) can be shortened if it is unsigned division/modulo,
or if it is signed division/modulo where we can prove the dividend will
not be the minimum signed value or divisor will not be -1, because e.g.
on sizeof(long long)==sizeof(int)*2 && __INT_MAX__ == 0x7fffffff targets
(-2147483647 - 1) / -1 is UB
but
(int) (-2147483648LL / -1LL) is not, it is -2147483648.
The primary aim of both the C and C++ FE division/modulo shortening I assume
was for the implicit integral promotions of {,signed,unsigned} {char,short}
and because at this point we have no VRP information etc., the shortening
is done if the integral promotion is from unsigned type for the divisor
or if the dividend is an integer constant other than -1.
This works fine for char/short -> int promotions when char/short have
smaller precision than int - unsigned char -> int or unsigned short -> int
will always be a positive int, so never the most negative.

Now, the C FE checks whether orig_op0 is TYPE_UNSIGNED where op0 is either
the same as orig_op0 or that promoted to int, I think that works fine,
if it isn't promoted, either the division/modulo common type will have the
same precision as op0 but then the division/modulo is unsigned and so
without UB, or it will be done in wider precision (e.g. because op1 has
wider precision), but then op0 can't be minimum signed value. Or it has
been promoted to int, but in that case it was again from narrower type and
so never minimum signed int.

But the C++ FE was checking if op0 is a NOP_EXPR from TYPE_UNSIGNED.
First of all, not sure if the operand of NOP_EXPR couldn't be non-integral
type where TYPE_UNSIGNED wouldn't be meaningful, but more importantly,
even if it is a cast from unsigned integral type, we only know it can't be
minimum signed value if it is a widening cast, if it is same precision or
narrowing cast, we know nothing.

So, the following patch for the NOP_EXPR cases checks just in case that
it is from integral type and more importantly checks it is a widening
conversion.

2023-01-14 Jakub Jelinek <jakub@redhat.com>

PR c++/108365
* typeck.cc (cp_build_binary_op): For integral division or modulo,
shorten if type0 is unsigned, or op0 is cast from narrower unsigned
integral type or stripped_op1 is INTEGER_CST other than -1.

* g++.dg/opt/pr108365.C: New test.
* g++.dg/warn/pr108365.C: New test.

(cherry picked from commit 5b3a88640f962d4ffca31ae651bed2d8672f1a8c)

libstdc++: Another merge from fast_float upstream [PR107468]

Upstream fast_float came up with a cheaper test for
fegetround () == FE_TONEAREST using one float addition, one subtraction
and one comparison. If we know we are rounding to nearest, we can use
fast path in more cases as before.
The following patch merges those changes into libstdc++.

2022-11-24 Jakub Jelinek <jakub@redhat.com>

PR libstdc++/107468
* src/c++17/fast_float/fast_float.h: Partial merge from fast_float
2ef9abbcf6a11958b6fa685a89d0150022e82e78 commit.

(cherry picked from commit ec73b55c75baa16c1cf7482fa65928a8d45598d4)

libstdc++: Update from latest fast_float [PR107468]

The following patch partially updates from fast_float trunk.
That way it gets fix for the incorrect rounding case,
PR107468 aka https://github.com/fastfloat/fast_float/issues/149
Using std::fegetround showed in benchmarks too slow, so instead of
doing that the patch limits the fast path where it uses floating
point multiplication rather than integral to cases where we can
prove there will be no rounding (the multiplication will be exact, not
just that the two multiplication or division operation arguments are
exactly representable).

2022-11-07 Jakub Jelinek <jakub@redhat.com>

PR libstdc++/107468
* src/c++17/fast_float/fast_float.h: Partial merge from fast_float
662497742fea7055f0e0ee27e5a7ddc382c2c38e commit.
* testsuite/20_util/from_chars/pr107468.cc: New test.

(cherry picked from commit cb0ceeaee9e041aaac3edd089b07b439621d0f29)

match.pd: When simplifying BFR of an insert, require a mode precision integral type [PR108688]

The same problem as PR 88739 has crept in but
this time in match.pd when simplifying bit_field_ref of
an bit_insert. That is we are generating a BIT_FIELD_REF
of a non-mode-precision integral type.

PR tree-optimization/108688
* match.pd (bit_field_ref [bit_insert]): Avoid generating
BIT_FIELD_REFs of non-mode-precision integral operands.

* gcc.c-torture/compile/pr108688-1.c: New test.

(cherry picked from commit 44f308e59bfa0f93ae05b17e257d8563c12399fd)

vect-patterns: Fix up vect_widened_op_tree [PR108692]

The following testcase is miscompiled on aarch64-linux since r11-5160.
Given
  <bb 3> [local count: 955630225]:
  # i_22 = PHI <i_20(6), 0(5)>
  # r_23 = PHI <r_19(6), 0(5)>
...
  a.0_5 = (unsigned char) a_15;
  _6 = (int) a.0_5;
  b.1_7 = (unsigned char) b_17;
  _8 = (int) b.1_7;
  c_18 = _6 - _8;
  _9 = ABS_EXPR <c_18>;
  r_19 = _9 + r_23;
...
where SSA_NAMEs 15/17 have signed char, 5/7 unsigned char and rest is int
we first pattern recognize c_18 as
patt_34 = (a.0_5) w- (b.1_7);
which is still correct, 5/7 are unsigned char subtracted in wider type,
but then vect_recog_sad_pattern turns it into
SAD_EXPR <a_15, b_17, r_23>
which is incorrect, because 15/17 are signed char and so it is
sum of absolute signed differences rather than unsigned sum of
absolute unsigned differences.
The reason why this happens is that vect_recog_sad_pattern calls
vect_widened_op_tree with MINUS_EXPR, WIDEN_MINUS_EXPR on the
patt_34 = (a.0_5) w- (b.1_7); statement's vinfo and vect_widened_op_tree
calls vect_look_through_possible_promotion on the operands of the
WIDEN_MINUS_EXPR, which looks through the further casts.
vect_look_through_possible_promotion has careful code to stop when there
would be nested casts that need to be preserved, but the problem here
is that the WIDEN_*_EXPR operation itself has an implicit cast on the
operands already - in this case of WIDEN_MINUS_EXPR the unsigned char
5/7 SSA_NAMEs are widened to unsigned short before the subtraction,
and vect_look_through_possible_promotion obviously isn't told about that.

Now, I think when we see those WIDEN_{MULT,MINUS,PLUS}_EXPR codes, we had
to look through possible promotions already when creating those and so
vect_look_through_possible_promotion again isn't really needed, all we need
to do is arrange what that function will do if the operand isn't result
of any cast.  Other option would be let vect_look_through_possible_promotion
know about the implicit promotion from the WIDEN_*_EXPR, but I'm afraid
that would be much harder.

2023-02-08  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/108692
* tree-vect-patterns.cc (vect_widened_op_tree): If rhs_code is
widened_code which is different from code, don't call
vect_look_through_possible_promotion but instead just check op is
SSA_NAME with integral type for which vect_is_simple_use is true
and call set_op on this_unprom.

* gcc.dg/pr108692.c: New test.

(cherry picked from commit 6ad1c1027628f094260037536f6b6fcdb63b5add)

fortran: Fix up hash table usage in gfc_trans_use_stmts [PR108451]

The first testcase in the PR (which I haven't included in the patch because
it is unclear to me if it is supposed to be valid or not) ICEs since extra
hash table checking has been added recently.  The problem is that
gfc_trans_use_stmts does
          tree *slot = entry->decls->find_slot_with_hash (rent->use_name, hash,
                                                          INSERT);
          if (*slot == NULL)
and later on doesn't store anything into *slot and continues.  Another spot
a few lines later correctly clears the slot if it decides not to use the
slot, so the following patch does the same.

2023-02-03  Jakub Jelinek  <jakub@redhat.com>

PR fortran/108451
* trans-decl.cc (gfc_trans_use_stmts): Call clear_slot before
doing continue.

(cherry picked from commit 76f7f0eddcb7c418d1ec3dea3e2341ca99097301)

nested, openmp: Wrap OMP_CLAUSE_*_GIMPLE_SEQ into GIMPLE_BIND for declare_vars [PR108435]

When gimplifying OMP_CLAUSE_{LASTPRIVATE,LINEAR}_STMT, we wrap it always
into a GIMPLE_BIND, but when putting statements directly into
OMP_CLAUSE_{LASTPRIVATE,LINEAR}_GIMPLE_SEQ, we do it only if needed (there
are any temporaries that need to be declared in the sequence).
convert_nonlocal_omp_clauses was relying on the GIMPLE_BIND to be there always
because it called declare_vars on it.

The following patch wraps it into GIMPLE_BIND in tree-nested if we need to
declare_vars on it on demand.

2023-02-02 Jakub Jelinek <jakub@redhat.com>

PR middle-end/108435
* tree-nested.cc (convert_nonlocal_omp_clauses)
<case OMP_CLAUSE_LASTPRIVATE>: If info->new_local_var_chain and *seq
is not a GIMPLE_BIND, wrap the sequence into a new GIMPLE_BIND
before calling declare_vars.
(convert_nonlocal_omp_clauses) <case OMP_CLAUSE_LINEAR>: Merge
with the OMP_CLAUSE_LASTPRIVATE handling except for whether
seq is initialized to &OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ (clause)
or &OMP_CLAUSE_LINEAR_GIMPLE_SEQ (clause).

* gcc.dg/gomp/pr108435.c: New test.

(cherry picked from commit 0f349928e16fdc7dba52561e8d40347909f9f0ff)

ree: Fix -fcompare-debug issues in combine_reaching_defs [PR108573]

The PR78437 r7-4871 changes made combine_reaching_defs punt on
WORD_REGISTER_OPERATIONS targets if a setter of smaller than word
register has wider uses.  This unfortunately breaks -fcompare-debug,
because if such a use appears only in DEBUG_INSN(s), while all other
uses aren't wider than the setter, we can REE optimize it without -g
and not with -g.

Such decisions shouldn't be based on debug instructions.  We could try
to reset them or adjust in some other way after we decide to perform the
change, but at least on the testcase which used to fail on riscv64-linux
the
(debug_insn 8 7 9 2 (var_location:HI s (minus:HI (subreg:HI (and:DI (reg:DI 10 a0 [160])
                (const_int 1 [0x1])) 0)
        (subreg:HI (ashiftrt:DI (reg/v:DI 9 s1 [orig:151 l ] [151])
                (debug_expr:SI D#1)) 0))) "pr108573.c":12:5 -1
     (nil))
clearly doesn't care about the upper bits and I have hard time imaging how
could one end up with DEBUG_INSN which actually cares about those upper
bits.

So, the following patch just ignores uses on DEBUG_INSNs in this case,
if we run into something where we'd need to do something further later on,
let's deal with it when we have a testcase for it.

2023-02-01  Jakub Jelinek  <jakub@redhat.com>

PR debug/108573
* ree.cc (combine_reaching_defs): Don't return false for paradoxical
subregs in DEBUG_INSNs.

* gcc.dg/pr108573.c: New test.

(cherry picked from commit e4473d7cf871c8ddf8f22d105c5af6375ebe37bf)

c++, openmp: Handle some OMP_*/OACC_* constructs during constant expression evaluation [PR108607]

While potential_constant_expression_1 handled most of OMP_* codes (by saying that
they aren't potential constant expressions), OMP_SCOPE was missing in that list.
I've also added OMP_SCAN, though that is less important (similarly to OMP_SECTION
it ought to appear solely inside of OMP_{FOR,SIMD} resp. OMP_SECTIONS).
As the testcase shows, it isn't enough, potential_constant_expression_1
can catch only some cases, as soon as one uses switch or ifs where at least
one of the possible paths could be constant expression, we can run into the
same codes during cxx_eval_constant_expression, so this patch handles those
there as well.

2023-02-01 Jakub Jelinek <jakub@redhat.com>

PR c++/108607
* constexpr.cc (cxx_eval_constant_expression): Handle OMP_*
and OACC_* constructs as non-constant.
(potential_constant_expression_1): Handle OMP_SCAN and OMP_SCOPE.

* g++.dg/gomp/pr108607.C: New test.

(cherry picked from commit bfc070595bfb00abef88a002eee5d9117f5b86a7)

i386: Fix up ix86_convert_const_wide_int_to_broadcast [PR108599]

The following testcase is miscompiled.  The problem is that during
RTL DSE we see a V4DI register is being loaded { 16, 16, 0, 0 }
value and DSE mostly works in terms of scalar modes, so it calls
movoi to set an OImode REG to (const_wide_int 0x100000000000000010)
and ix86_convert_const_wide_int_to_broadcast thinks it can compute
that value by broadcasting DImode 0x10.  While it is true that
for TImode result the broadcast could be used, for OImode/XImode
it can't be, because all but the lowest 2 HOST_WIDE_INTs aren't
present (so are 0 or -1 depending on sign), not 0x10 in this case.
The function checks if the least significant HOST_WIDE_INT elt
of the CONST_WIDE_INT is broadcastable from QI/HI/SI/DImode and then
  /* Check if OP can be broadcasted from VAL.  */
  for (int i = 1; i < CONST_WIDE_INT_NUNITS (op); i++)
    if (val != CONST_WIDE_INT_ELT (op, i))
      return nullptr;
That is needed of course, but nothing checks that
CONST_WIDE_INT_NUNITS (op) isn't too small for the mode in question.
I think if op would be 0 or -1, it ought to be never CONST_WIDE_INT,
but CONST_INT and so we can just punt whenever the number of
CONST_WIDE_INT elts is not the expected one.

2023-01-31  Jakub Jelinek  <jakub@redhat.com>

PR target/108599
* config/i386/i386-expand.cc
(ix86_convert_const_wide_int_to_broadcast): Return nullptr if
CONST_WIDE_INT_NUNITS (op) times HOST_BITS_PER_WIDE_INT isn't
equal to bitsize of mode.

* gcc.target/i386/avx2-pr108599.c: New test.

(cherry picked from commit 963315a922e228c4f6853826666151fc540f111a)

bbpart: Fix up ICE on asm goto [PR108596]

On the following testcase we have asm goto in hot block with 2 successors,
one cold to which it both falls through and has one of the label
pointing to it and another hot successor with another label.

Now, during bbpart we want to ensure that no blocks from one partition fall
through into a block in a different partition.  fix_up_fall_thru_edges
does that by temporarily clearing the EDGE_CROSSING on the fallthrough edge,
calling force_nonfallthru and then depending on whether it created a new
bb either set EDGE_CROSSING on the single successor edge from the new bb
(the new bb is kept in the same partition as the predecessor block), or
if no new bb has been created setting EDGE_CROSSING back on the fallthru
edge which has been forced non-EDGE_FALLTHRU.
For asm goto this doesn't always work, force_nonfallthru can create a new bb
and change the fallthrough edge to point to that, but if the original
fallthru destination block has its label referenced among the asm goto
labels, it will create a new non-fallthru edge for the label(s).
But because we've temporarily cheated and cleared EDGE_CROSSING on the edge,
it is cleared on the new edge as well, then the caller sees we've created
a new bb and just sets EDGE_CROSSING on the single fallthru edge from the
new bb.  But the direct edge from cur_bb to fallthru edge's destination
isn't handled and fails afterwards consistency checks, because it crosses
partitions.

The following patch notes the case and sets EDGE_CROSSING on that edge too.

2023-01-31  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/108596
* bb-reorder.cc (fix_up_fall_thru_edges): Handle the case where cur_bb
ends with asm goto and has a crossing fallthrough edge to the same bb
that contains at least one of its labels by restoring EDGE_CROSSING
flag even on possible edge from cur_bb to new_bb successor.

* gcc.c-torture/compile/pr108596.c: New test.

(cherry picked from commit 603a6fbcaac1e80aa90d1d26318c881a53473066)

doc: Fix up return type of __builtin_va_arg_pack_len [PR108560]

__builtin_va_arg_pack_len as implemented returned int since its introduction
in 2007. The initial documentation didn't mention any return type,
which changed in 2010 in r0-103077-gab940b73bfabe2cec4 during some
documentation formatting cleanups
https://gcc.gnu.org/legacy-ml/gcc-patches/2010-09/msg01632.html
I can understand that for formatting some type was needed there
but what exactly hasn't been really discussed.

So, I think we should change documentation to match the implementation,
rather than change implementation to match the documentation.
Most people don't use more than 2147483647 arguments to inline functions,
and on poor targets with 16-bit ints I bet even having more than 65535
arguments to inline functions would be highly unexpected.

2023-01-27 Jakub Jelinek <jakub@redhat.com>

PR other/108560
* doc/extend.texi: Fix up return type of __builtin_va_arg_pack_len
from size_t to int.

(cherry picked from commit 16f30680f403891556da2ad6329fcef9dc9b47db)

store-merging: Disable string_concatenate mode if start or end aren't byte aligned [PR108498]

The first of the following testcases is miscompiled on powerpc64-linux -O2
-m64 at least, the latter at least on x86_64-linux -m32/-m64.
Since GCC 11 store-merging has a separate string_concatenation mode which
turns stores into setting a MEM_REF from a STRING_CST.
This mode is triggered if at least one of the to be merged stores
is a STRING_CST store and either the first store (to earliest address)
is that STRING_CST store or the first store is 8-bit INTEGER_CST store
and then there are some rules when to turn that mode off or not merge
further stores into it.

The problem with these 2 testcases is that the actual implementation
relies on start/width of the store to be at byte boundaries, as it
simply creates a char array, MEM_REF can be only on byte boundaries
and the char array too, plus obviously STRING_CST as well.
But as can be easily seen in the second testcase, nothing verifies this,
while the first store has to be a STRING_CST (which will be aligned)
or 8-bit INTEGER_CST, that 8-bit INTEGER_CST store could be a bitfield
store, nothing verifies any stores in between whether they actually are
8-bit and aligned, the only major requirement is that all the stores
are consecutive.

For GCC 14 I think we should reconsider this, simply treat STRING_CST
stores during the merging like INTEGER_CST stores and deal with it only
during split_group where we can create multiple parts, this part
would be a normal store, this part would be STRING_CST store, this part
another normal store etc. But that is quite a lot of work, the following
patch just disables the string_concatenate mode if boundaries aren't byte
aligned in the spot where we disable it if it is too short too.
If that happens, we'll just try to do the merging using normal 1/2/4/8 etc.
byte stores as usually with RMW masking for any bits that shouldn't be
touched or punt if we end up with too many stores compared to the original.

Note, an original STRING_CST store will count as one store in that case,
something we might want to reconsider later too (but, after all, CONSTRUCTOR
stores (aka zeroing) already have the same problem, they can be large and
expensive and we still count them as one store).

2023-01-25 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/108498
* gimple-ssa-store-merging.cc (class store_operand_info):
End coment with full stop rather than comma.
(split_group): Likewise.
(merged_store_group::apply_stores): Clear string_concatenation if
start or end aren't on a byte boundary.

* gcc.c-torture/execute/pr108498-1.c: New test.
* gcc.c-torture/execute/pr108498-2.c: New test.

(cherry picked from commit 617be7ba436bcbf9d7b883968c6b3c011206b56c)

options: fix cl_target_option_print_diff() with strings

Fix an obvious copy-and-paste error where ptr1 was used instead of ptr2.
This bug caused the dump file produced by -fdump-ipa-inline-details to
not correctly show the difference in target options when a function
could not be inlined due to a target option mismatch.

gcc/ChangeLog:
PR bootstrap/90543
* optc-save-gen.awk: Fix copy-and-paste error.

Signed-off-by: Eric Biggers <ebiggers@google.com>
(cherry picked from commit 9f0cb3368af735e95776769c4f28fa9cbb60eaf8)

c++: Fix up handling of references to anon union members in initializers [PR53932]

For anonymous union members we create artificial VAR_DECLs which
have DECL_VALUE_EXPR for the actual COMPONENT_REF.  That works
just fine inside of functions (including global dynamic constructors),
because during gimplification such VAR_DECLs are gimplified as
their DECL_VALUE_EXPR.  This is also done during regimplification.

But references to these artificial vars in DECL_INITIAL expressions
aren't ever replaced by the DECL_VALUE_EXPRs, so we end up either
with link failures like on the testcase below, or worse ICEs with
LTO.

The following patch fixes those during cp_fully_fold_init where we
already walk all the trees (!data->genericize means that
function rather than cp_fold_function).

2023-01-19  Jakub Jelinek  <jakub@redhat.com>

PR c++/53932
* cp-gimplify.cc (cp_fold_r): During cp_fully_fold_init replace
DECL_ANON_UNION_VAR_P VAR_DECLs with their corresponding
DECL_VALUE_EXPR.

* g++.dg/init/pr53932.C: New test.

(cherry picked from commit 9b9a989adc042b304572fd6d4ade46b47be6ccb8)

openmp: Fix up OpenMP expansion of non-rectangular loops [PR108459]

expand_omp_for_init_counts was using for the case where collapse(2)
inner loop has init expression dependent on non-constant multiple of
the outer iterator and the condition upper bound expression doesn't
depend on the outer iterator fold_unary (NEGATE_EXPR, ...). This
will just return NULL if it can't be folded, we need fold_build1
instead.

2023-01-19 Jakub Jelinek <jakub@redhat.com>

PR middle-end/108459
* omp-expand.cc (expand_omp_for_init_counts): Use fold_build1 rather
than fold_unary for NEGATE_EXPR.

* testsuite/libgomp.c/pr108459.c: New test.

(cherry picked from commit 46644ec99cb355845b23bb1d02775c057ed8ee88)

c: Don't emit DEBUG_BEGIN_STMTs for K&R function argument declarations [PR105972]

K&R function parameter declarations are handled by calling
recursively c_parser_declaration_or_fndef in a loop, where each such
call will add_debug_begin_stmt at the start.
Now, if the K&R function definition is not a nested function,
building_stmt_list_p () is false and so we don't emit the DEBUG_BEGIN_STMTs
anywhere, but if it is a nested function, we emit it in the containing
function at the point of the nested function definition.
As the following testcase shows, it can cause ICEs if the containing
function has var-tracking disabled but nested function has them enabled,
as the DEBUG_BEGIN_STMTs are added to the containing function which
shouldn't have them but MAY_HAVE_DEBUG_MARKER_STMTS is checked already
for the nested function, or just wrong experience in the debugger.

The following patch ensures we don't emit any such DEBUG_BEGIN_STMTs for the
K&R function parameter declarations even in nested functions.

2023-01-11 Jakub Jelinek <jakub@redhat.com>

PR c/105972
* c-parser.cc (c_parser_declaration_or_fndef): Disable debug non-bind
markers for K&R function parameter declarations of nested functions.

* gcc.dg/pr105972.c: New test.

(cherry picked from commit 23b4ce18379cd336d99d7c71701be28118905b57)

fortran: Fix up function types for realloc and sincos{,f,l} builtins [PR108349]

As reported in the PR, the FUNCTION_TYPE for __builtin_realloc in the
Fortran FE is wrong since r0-100026-gb64fca63690ad which changed
-  tmp = tree_cons (NULL_TREE, pvoid_type_node, void_list_node);
-  tmp = tree_cons (NULL_TREE, size_type_node, tmp);
-  ftype = build_function_type (pvoid_type_node, tmp);
+  ftype = build_function_type_list (pvoid_type_node,
+                                    size_type_node, pvoid_type_node,
+                                    NULL_TREE);
   gfc_define_builtin ("__builtin_realloc", ftype, BUILT_IN_REALLOC,
                      "realloc", false);
The return type is correct, void *, but the first argument should be
void * too and only second one size_t, while the above change changed
realloc to be void *__builtin_realloc (size_t, void *);
I went through all other changes from that commit and found that
__builtin_sincos{,f,l} got broken as well, instead of the former
void __builtin_sincos{,f,l} (ftype, ftype *, ftype *);
where ftype is {double,float,long double} it is now incorrectly
void __builtin_sincos{,f,l} (ftype *, ftype *);

The following patch fixes that, plus some formatting issues around
the spots I've changed.

2023-01-11  Jakub Jelinek  <jakub@redhat.com>

PR fortran/108349
* f95-lang.cc (gfc_init_builtin_function): Fix up function types
for BUILT_IN_REALLOC and BUILT_IN_SINCOS{F,,L}.  Formatting fixes.

(cherry picked from commit 0986c351aa8a9f08b3cb614baec13564dd62c114)

openmp: Fix up finish_omp_target_clauses [PR108286]

The comment in the loop says that we shouldn't add a map clause if such
a clause exists already, but the loop was actually using OMP_CLAUSE_DECL
on any clause. Target construct can have various clauses which don't
have OMP_CLAUSE_DECL at all (e.g. nowait, device or if) or clause
where it means something different (e.g. privatization clauses, allocate,
depend).

So, only check OMP_CLAUSE_DECL on OMP_CLAUSE_MAP clauses.

2023-01-05 Jakub Jelinek <jakub@redhat.com>

PR c++/108286
* semantics.cc (finish_omp_target_clauses): Ignore clauses other than
OMP_CLAUSE_MAP.

* testsuite/libgomp.c++/pr108286.C: New test.

(cherry picked from commit 29c3218618ef6177dc33871b26c8fbd9b21eabe1)

c++: Error recovery in merge_default_template_args [PR108206]

We ICE on the following testcase during error recovery, both new_parm
and old_parm are error_mark_node, the ICE is on
          error ("redefinition of default argument for %q+#D", new_parm);
          inform (DECL_SOURCE_LOCATION (old_parm),
                  "original definition appeared here");
where we don't print anything useful for new_parm and ICE trying to
access DECL_SOURCE_LOCATION of old_parm.  I think we shouldn't diagnose
anything when either of the parms is erroneous, GCC 11 before
merge_default_template_args has been added was doing
      if (TREE_VEC_ELT (tmpl_parms, i) == error_mark_node
          || TREE_VEC_ELT (parms, i) == error_mark_node)
        continue;

      tmpl_parm = TREE_VALUE (TREE_VEC_ELT (tmpl_parms, i));
      if (error_operand_p (tmpl_parm))
        return false;
in redeclare_class_template.

2023-01-04  Jakub Jelinek  <jakub@redhat.com>

PR c++/108206
* decl.cc (merge_default_template_args): Return false if either
new_parm or old_parm are erroneous.

* g++.dg/template/pr108206.C: New test.

(cherry picked from commit fc349931adcf1024ee95e0a0cd98cf4a41996093)

generic-match-head: Don't assume GENERIC folding is done only early [PR108237]

We ICE on the following testcase, because a valid V2DImode
!= comparison is folded into an unsupported V2DImode > comparison.
The match.pd pattern which does this looks like:
/* Transform comparisons of the form (X & Y) CMP 0 to X CMP2 Z
   where ~Y + 1 == pow2 and Z = ~Y.  */
(for cst (VECTOR_CST INTEGER_CST)
(for cmp (eq ne)
      icmp (le gt)
  (simplify
   (cmp (bit_and:c@2 @0 cst@1) integer_zerop)
    (with { tree csts = bitmask_inv_cst_vector_p (@1); }
     (if (csts && (VECTOR_TYPE_P (TREE_TYPE (@1)) || single_use (@2)))
      (with { auto optab = VECTOR_TYPE_P (TREE_TYPE (@1))
                         ? optab_vector : optab_default;
              tree utype = unsigned_type_for (TREE_TYPE (@1)); }
       (if (target_supports_op_p (utype, icmp, optab)
            || (optimize_vectors_before_lowering_p ()
                && (!target_supports_op_p (type, cmp, optab)
                    || !target_supports_op_p (type, BIT_AND_EXPR, optab))))
        (if (TYPE_UNSIGNED (TREE_TYPE (@1)))
         (icmp @0 { csts; })
         (icmp (view_convert:utype @0) { csts; })))))))))
and that optimize_vectors_before_lowering_p () guarded stuff there
already deals with this problem, not trying to fold a supported comparison
into a non-supported one.  The reason it doesn't work in this case is that
it isn't GIMPLE folding which does this, but GENERIC folding done during
forwprop4 - forward_propagate_into_comparison -> forward_propagate_into_comparison_1
-> combine_cond_expr_cond -> fold_binary_loc -> generic_simplify
and we simply assumed that GENERIC folding happens only before
gimplification.

The following patch fixes that by checking cfun properties instead of
always returning true in those cases.

2023-01-04  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/108237
* generic-match-head.cc: Include tree-pass.h.
(canonicalize_math_p, optimize_vectors_before_lowering_p): Define
to false if cfun and cfun->curr_properties has PROP_gimple_opt_math
resp. PROP_gimple_lvec property set.

* gcc.c-torture/compile/pr108237.c: New test.

(cherry picked from commit 345dffd0d4ebff7e705dfff1a8a72017a167120a)

expr: Fix up store_expr into SUBREG_PROMOTED_* target [PR108264]

The following testcase ICEs on s390x-linux (e.g. with -march=z13).
The problem is that target is (subreg/s/u:SI (reg/v:DI 66 [ x+-4 ]) 4)
and we call convert_move from temp to the SUBREG_REG of that, expecting
to extend the value properly. That works nicely if temp has some
scalar integer mode (or partial one), but ICEs when temp has V4QImode
on the assertion that from and to modes have the same bitsize.
store_expr generally allows say store from V4QI to SI target because
they have the same size and if temp is a CONST_INT, we already have code
to convert the constant properly, so the following patch just adds handling
of non-scalar integer modes by converting them to the mode of target
first before convert_move extends them.

2023-01-03 Jakub Jelinek <jakub@redhat.com>

PR middle-end/108264
* expr.cc (store_expr): For stores into SUBREG_PROMOTED_* targets
from source which doesn't have scalar integral mode first convert
it to outer_mode.

* gcc.dg/pr108264.c: New test.

(cherry picked from commit 226a498733e7919de72eb6f1bf3e16883ad159f6)

tree-ssa-dom: can_infer_simple_equiv fixes [PR108068]

As reported in the PR, tree-ssa-dom.cc uses real_zerop call to find
if a floating point constant is zero and it shouldn't try to infer
equivalences from comparison against it if signed zeros are honored.
This doesn't work at all for decimal types, because real_zerop always
returns false for them (one can have different representations of decimal
zero beyond -0/+0), and it doesn't work for vector compares either,
as real_zerop checks if all elements are zero, while we need to avoid
infering equivalences from comparison against vector constants which have
at least one zero element in it (if signed zeros are honored).
Furthermore, as mentioned by Joseph, for decimal types many other values
aren't singleton.

So, this patch stops infering anything if element mode is decimal, and
otherwise uses instead of real_zerop a new function, real_maybe_zerop,
which will work even for decimal types and for complex or vector will
return true if any element is or might be zero (so it returns true
for anything but constants for now).

2022-12-23 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/108068
* tree.h (real_maybe_zerop): Declare.
* tree.cc (real_maybe_zerop): Define.
* tree-ssa-dom.cc (record_edge_info): Use it instead of
real_zerop or TREE_CODE (op1) == SSA_NAME || real_zerop. Always set
can_infer_simple_equiv to false for decimal floating point types.

* gcc.dg/dfp/pr108068.c: New test.

(cherry picked from commit fd1b0aefda5b65f3f841ca6e61ccea6a72daa060)

phiopt: Drop SSA_NAME_RANGE_INFO in maybe equal case [PR108166]

The following place in value_replacement is after proving that
x == cst1 ? cst2 : x
phi result is only used in a comparison with constant which doesn't
care if it compares cst1 or cst2 and replaces it with x.
The testcase is miscompiled because we have after the replacement
incorrect range info for the phi result, we would need to
effectively union the phi result range with cst1 (oarg in the code)
because previously that constant might be missing in the range, but
newly it can appear (we've just verified that the single use stmt
of the phi result doesn't care about that value in particular).

The following patch just resets the info, bootstrapped/regtested
on x86_64-linux and i686-linux, ok for trunk?

Aldy/Andrew, how would one instead union the SSA_NAME_RANGE_INFO
with some INTEGER_CST and store it back into SSA_NAME_RANGE_INFO
(including adjusting non-zero bits and the like)?

2022-12-22 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/108166
* tree-ssa-phiopt.cc (value_replacement): For the maybe_equal_p
case turned into equal_p reset SSA_NAME_RANGE_INFO of phi result.

* g++.dg/torture/pr108166.C: New test.

(cherry picked from commit 5c17adfb5d08e34da7a7f234dfc2ed1f0aaadaa9)

cse: Fix up CSE const_anchor handling [PR108193]

The following testcase ICEs on aarch64, because insert_const_anchor
inserts invalid CONST_INT into the CSE tables - 0x80000000 for SImode.
The second hunk of the patch fixes that, the first one is to avoid
triggering undefined behavior at compile time during compute_const_anchors
computations - performing those additions and subtractions in
HOST_WIDE_INT means it can overflow for certain constants.

2022-12-22 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/108193
* cse.cc (compute_const_anchors): Change n type to
unsigned HOST_WIDE_INT, adjust comparison against it to avoid
warnings. Formatting fix.
(insert_const_anchor): Use gen_int_mode instead of GEN_INT.

* gfortran.dg/pr108193.f90: New test.

(cherry picked from commit 0cb5d7cdbab8e5f8359764ef5f62d93c2bc88552)

openmp: Don't try to destruct DECL_OMP_PRIVATIZED_MEMBER vars [PR108180]

DECL_OMP_PRIVATIZED_MEMBER vars are artificial vars with DECL_VALUE_EXPR
of this->field used just during gimplification and omp lowering/expansion
to privatize individual fields in methods when needed.
As the following testcase shows, when not in templates, they were handled
right, but in templates we actually called cp_finish_decl on them and
that can result in their destruction, which is obviously undesirable,
we should only destruct the privatized copies of them created in omp
lowering.

Fixed thusly.

2022-12-21 Jakub Jelinek <jakub@redhat.com>

PR c++/108180
* pt.cc (tsubst_expr): Don't call cp_finish_decl on
DECL_OMP_PRIVATIZED_MEMBER vars.

* testsuite/libgomp.c++/pr108180.C: New test.

(cherry picked from commit 1119902b6c7c1c50123ed85ec1def8be4772d68c)

testsuite: Fix up pr64536.c for LLP64 targets [PR108151]

Apparently llp64 had 2 further warnings, fixed thusly.

2022-12-19 Jakub Jelinek <jakub@redhat.com>

PR testsuite/108151
* gcc.dg/pr64536.c (bar): Cast long to __INTPTR_TYPE__
before casting to long *.

(cherry picked from commit 6e85f89a7d59a99a3395b6e153b99262a58b2f6c)

testsuite: Fix up pr64536.c for LLP64 targets [PR108151]

The test casts a pointer to long, which is ok for ilp32 and lp64
targets but not for llp64 targets. Nothing reads the values later,
it is a link test, so all we care about is that it is the same
cast on s390x-linux where it used to fail before the PR64536 fix,
and that we don't warn about it.

2022-12-19 Jakub Jelinek <jakub@redhat.com>

PR testsuite/108151
* gcc.dg/pr64536.c (bar): Use casts to __INTPTR_TYPE__ rather than
long when casting pointer to integral type.

(cherry picked from commit ea37e96a37b50dad17b91d46edc518bbb9132d8e)

loop-invariant: Split preheader edge if the preheader bb ends with jump [PR106751]

The RTL loop passes only request simple preheaders, but don't require
fallthru preheaders, while move_invariant_reg apparently assumes the
latter, that it can just append instruction(s) to the end of the preheader
basic block.

The following patch fixes that by splitting the preheader edge if
the preheader bb ends with a JUMP_INSN (asm goto in this case).
Without that we get control flow in the middle of a bb.

2022-12-16 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/106751
* loop-invariant.cc (move_invariant_reg): If preheader bb ends
with a JUMP_INSN, split the preheader edge and emit invariants
into the new preheader basic block.

* gcc.c-torture/compile/pr106751.c: New test.

(cherry picked from commit ddcaa60983b50378bde1b7e327086fe0ce101795)

c++: Ensure !!var is not an lvalue [PR107065]

The TRUTH_NOT_EXPR case in cp_build_unary_op is one of the spots where
we somewhat fold immediately using invert_truthvalue_loc.
I've tried using
  return build1_loc (location, TRUTH_NOT_EXPR, boolean_type_node, arg);
in there instead, but unfortunately that regressed
Wlogical-not-parentheses-*.c pr49706.c pr62199.c pr65120.c sequence-pt-1.C
tests, so at least for backporting that doesn't seem to be a way to go.

So, this patch instead wraps it into NON_LVALUE_EXPR if needed (which also
need a tweak for some tests in the pr47906.c test, but nothing major),
with the intent to make it backportable, and later I'll try to do further
steps to avoid folding here prematurely.  Most of the problems with
build1 TRUTH_NOT_EXPR are that it doesn't even invert comparisons as most
common case and lots of warning code isn't able to deal with ! around
comparisons; so perhaps one way to do this would be fold by hand only
invertable comparisons and for the rest create TRUTH_NOT_EXPR.

2022-12-15  Jakub Jelinek  <jakub@redhat.com>

PR c++/107065
gcc/cp/
* typeck.cc (cp_build_unary_op) <case TRUTH_NOT_EXPR>: If
invert_truthvalue_loc returns obvalue_p, wrap it into NON_LVALUE_EXPR.
* parser.cc (cp_parser_binary_expression): Don't call
warn_logical_not_parentheses if current.lhs is a NON_LVALUE_EXPR
of a decl with boolean type.
gcc/testsuite/
* g++.dg/cpp0x/pr107065.C: New test.

(cherry picked from commit 8b775b4c48a3cc4ef5c50e56144aea02da2e9cc6)

into-ssa: Fix emitting debug stmts after asm goto [PR108095]

The following testcase ICEs, because ccp1 replaced
  s.0_1 = &s;
  __asm__ goto("" : "=r" MEM[(T *)s.0_1] :  :  : "lab" lab);
with
  __asm__ goto("" : "=r" s :  :  : "lab" lab);
and because s is no longer addressable, we are rewriting it into
ssa and want
  __asm__ goto("" : "=r" s_7 :  :  : "lab" lab);
plus debug stmt
  # DEBUG s => s_7
The code assumes that there is at most one non-EH edge in that
case, but with the addition of outputs to asm goto that is no longer the
case, we can have many outgoing edges.

The patch keeps the checking assertion that there is at most one such
edge for everything but asm goto, but moves the addition of the debug
stmt into the loop, so that it can be added on all edges where it is
possible, not just one of them.

Furthermore, looking at gsi_insert_on_edge_immediate
-> gimple_find_edge_insert_loc, the conditions to insert stmt there
to the destination block are
  if (single_pred_p (dest)
      && gimple_seq_empty_p (phi_nodes (dest))
      && dest != EXIT_BLOCK_PTR_FOR_FN (cfun))
(plus there is code to insert it in the previous block but that is
never true when the pred is known to be stmt_ends_bb_p), while
mayube_register_def was just checking
                 if (ef && single_pred_p (ef->dest)
                     && ef->dest != EXIT_BLOCK_PTR_FOR_FN (cfun))
so if for whatever reason ef->dest had any PHIs, we'd split the
edge for -g and not for -g0, something we must avoid for -fcompare-debug
stability.  So, I've added the no phi_nodes check too.

2022-12-15  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/108095
* tree-into-ssa.cc (maybe_register_def): Insert debug stmt
on all non-EH edges from asm goto if they have a single
predecessor rather than asserting there is at most one such edge.
Test whether there are no PHI nodes next to the single predecessor
test.

* gcc.dg/pr108095.c: New test.

(cherry picked from commit bf3ce6f84a7a994a0fc87419b383b9ce4efed442)

ivopts: Fix IP_END handling for asm goto [PR107997]

The following testcase ICEs, because the latch bb ends with
asm goto which has both fallthrough to the header and one or more labels
in the header too. In that case there is just a single edge out of the
latch block, but still the asm goto is stmt_ends_bb_p statement, yet
ivopts decides to emit an IV bump at the IP_END position and inserts
it into the same bb as the asm goto after it, which then fails verification
(control flow in the middle of bb).

The following patch fixes it by splitting the latch -> header edge in that
case and inserting into the newly created bb, where split_edge ->
redirect_edge_and_branch is able to deal with this case correctly.

2022-12-10 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/107997
* tree-ssa-loop-ivopts.cc: Include cfganal.h.
(create_new_iv) <case IP_END>: If ip_end_pos bb is non-empty and ends
with a stmt which ends bb, instead of adding iv update after it split
the latch edge and insert iterator into the new latch bb.

* gcc.c-torture/compile/pr107997.c: New test.

(cherry picked from commit 7676235f690e624b7ed41a22b22ce8ccfac1492f)

cfgbuild: Fix DEBUG_INSN handling in find_bb_boundaries [PR106719]

The following testcase FAILs on aarch64-linux.  We have some atomic
instruction followed by 2 DEBUG_INSNs (if -g only of course) followed
by NOTE_INSN_EPILOGUE_BEG followed by some USE insn.
Now, split3 pass replaces the atomic instruction with a code sequence
which ends with a conditional jump and the split3 pass calls
find_many_sub_basic_blocks.
For -g0, find_bb_boundaries sees the flow_transfer_insn (the new conditional
jump), then NOTE_INSN_EPILOGUE_BEG which can live in between basic blocks
and then the USE insn, so splits block after the NOTE_INSN_EPILOGUE_BEG
and puts the NOTE in between the blocks.
For -g, if sees a DEBUG_INSN after the flow_transfer_insn, so sets
debug_insn to it, then walks over another DEBUG_INSN, NOTE_INSN_EPILOGUE_BEG
until it finally sees the USE insn, and triggers the:
          rtx_insn *prev = PREV_INSN (insn);

          /* If the first non-debug inside_basic_block_p insn after a control
             flow transfer is not a label, split the block before the debug
             insn instead of before the non-debug insn, so that the debug
             insns are not lost.  */
          if (debug_insn && code != CODE_LABEL && code != BARRIER)
            prev = PREV_INSN (debug_insn);
code I've added for PR81325.  If there are only DEBUG_INSNs, that is
the right thing to do, but if in between debug_insn and insn there are
notes which can stay in between basic blocks or simnilarly JUMP_TABLE_DATA
or their associated CODE_LABELs, it causes -fcompare-debug differences.

The following patch fixes it by clearing debug_insn if JUMP_TABLE_DATA
or associated CODE_LABEL is seen (I'm afraid there is no good answer
what to do with DEBUG_INSNs before those; the code then removes them:
              /* Clean up the bb field for the insns between the blocks.  */
              for (x = NEXT_INSN (flow_transfer_insn);
                   x != BB_HEAD (fallthru->dest);
                   x = next)
                {
                  next = NEXT_INSN (x);
                  /* Debug insns should not be in between basic blocks,
                     drop them on the floor.  */
                  if (DEBUG_INSN_P (x))
                    delete_insn (x);
                  else if (!BARRIER_P (x))
                    set_block_for_insn (x, NULL);
                }
but if there are NOTEs, the patch just reorders the NOTEs and DEBUG_INSNs,
such that the NOTEs come first (so that they stay in between basic blocks
like with -g0) and DEBUG_INSNs after those (so that bb is split before
them, so they will be in the basic block after NOTE_INSN_BASIC_BLOCK).

2022-12-08  Jakub Jelinek  <jakub@redhat.com>

PR debug/106719
* cfgbuild.cc (find_bb_boundaries): If there are NOTEs in between
debug_insn (seen after flow_transfer_insn) and insn, move NOTEs
before all the DEBUG_INSNs and split after NOTEs.  If there are
other insns like jump table data, clear debug_insn.

* gcc.dg/pr106719.c: New test.

(cherry picked from commit d9f9d5d30feb33c359955d7030cc6be50ef6dc0a)

i386: Fix up ix86_abi handling [PR106875]

The following testcase fails since my changes to make also
opts_set saved/restored upon function target/optimization changes
(before it has been acting as "has this option be ever explicit
anywhere?").

The problem is that for ix86_abi we depend on the opts_set
value for it in ix86_option_override_internal:
  SET_OPTION_IF_UNSET (opts, opts_set, ix86_abi, DEFAULT_ABI);
but as it is a TargetSave, the backend code is required to
save/restore it manually (it does that) and since gcc 11 also
to save/restore the opts_set bit for it (which isn't done).
We don't do that for various other TargetSave which
ix86_function_specific_{save,restore} saves/restores, but as long
as we never test opts_set for it, it doesn't really matter.
One possible fix would be to introduce some new TargetSave into
which ix86_function_specific_{save,restore} would save/restore a bitmask
of the opts_set bits.  The following patch uses an easier fix, by
making it a TargetVariable instead the saving/restoring is handled
by the generated code.
The differences in options.h are just slight movements on where
*ix86_abi stuff appears in it, ditto for options.cc, the real
differences are just in options-save.cc, where cl_target_option_save
gets:
+  ptr->x_ix86_abi = opts->x_ix86_abi;
...
+  if (opts_set->x_ix86_abi) mask |= HOST_WIDE_INT_1U << 3;
(plus adjustments of following TargetVariables mask related stuff),
cl_target_option_restore gets:
+  opts->x_ix86_abi = ptr->x_ix86_abi;
...
+  opts_set->x_ix86_abi = static_cast<enum calling_abi>((mask & 1) != 0);
+  mask >>= 1;
plus the movements in other functions too.  So, by it being a
TargetVariable, the only thing that changed is that we don't need to
handle it manually in ix86_function_specific_{save,restore} because it
is handled automatically including the opts_set stuff.

2022-11-28  Jakub Jelinek  <jakub@redhat.com>

PR target/106875
* config/i386/i386.opt (x_ix86_abi): Remove TargetSave.
(ix86_abi): Replace it with TargetVariable.
* config/i386/i386-options.cc (ix86_function_specific_save,
ix86_function_specific_restore): Don't save and restore x_ix86_abi.

* g++.target/i386/pr106875.C: New test.

(cherry picked from commit ee629d242d9f93a38e49bed904bb334bbe15dde1)

asan: Fix up error recovery for too large frames [PR107317]

asan_emit_stack_protection and functions it calls have various asserts that
verify sanity of the stack protection instrumentation. But, that
verification can easily fail if we've diagnosed a frame offset overflow.
asan_emit_stack_protection just emits some extra code in the prologue,
if we've reported errors, we aren't producing assembly, so it doesn't
really matter if we don't include the protection code, compilation
is going to fail anyway.

2022-11-24 Jakub Jelinek <jakub@redhat.com>

PR middle-end/107317
* asan.cc: Include diagnostic-core.h.
(asan_emit_stack_protection): Return NULL early if seen_error ().

* gcc.dg/asan/pr107317.c: New test.

(cherry picked from commit b6330a7685476fc30b8ae9bbf3fca1a9b0d4be95)

testsuite: Fix up broken testcase [PR107127]

I've added { dg-options "" } line manually in the patch but
forgot to adjust the number of added lines.

2022-11-24 Jakub Jelinek <jakub@redhat.com>

PR c/107127
* gcc.dg/pr107127.c (foo): Add missing closing }.

(cherry picked from commit add0f941be18cdf962a0f300019acacbf2325d41)

c: Fix compile time hog in c_genericize [PR107127]

The complex multiplications result in deeply nested set of many SAVE_EXPRs,
which takes even on fast machines over 5 minutes to walk.
This patch fixes that by using walk_tree_without_duplicates where it is
instant.

2022-11-23 Andrew Pinski <apinski@marvell.com>
Jakub Jelinek <jakub@redhat.com>

PR c/107127
* c-gimplify.cc (c_genericize): Use walk_tree_without_duplicates
instead of walk_tree for c_genericize_control_r.

* gcc.dg/pr107127.c: New test.

(cherry picked from commit 8a0fce6a51915c29584427fd376b40073c328090)

Daily bump.

Fortran: error handling of global entity appearing in COMMON block [PR103259]

gcc/fortran/ChangeLog:

PR fortran/103259
* resolve.cc (resolve_common_vars): Avoid NULL pointer dereference
when a symbol's location is not set.

gcc/testsuite/ChangeLog:

PR fortran/103259
* gfortran.dg/pr103259.f90: New test.

(cherry picked from commit 7e9f20f5517429cfaadec14d6b04705e59078565)

Daily bump.

Fortran: ASSOCIATE variables should not be TREE_STATIC [PR95107]

gcc/fortran/ChangeLog:

PR fortran/95107
* trans-decl.cc (gfc_finish_var_decl): With -fno-automatic, do not
make ASSOCIATE variables TREE_STATIC.

gcc/testsuite/ChangeLog:

PR fortran/95107
* gfortran.dg/save_7.f90: New test.

(cherry picked from commit c36f3da534e7f411c5bc48c5b6b660e238167480)

Daily bump.

Fix PR 108582: ICE due to PHI-OPT removing a still in use ssa_name.

This patch adds a check in match_simplify_replacement to make sure the middlebb
does not have any phi-nodes as we don't currently move those.
This was just a thinko from before.

Committed on the GCC 12 branch after a bootstrap/test on x86_64-linux-gnu.

PR tree-optimization/108582

gcc/ChangeLog:

* tree-ssa-phiopt.cc (match_simplify_replacement): Add check
for middlebb to have no phi nodes.

gcc/testsuite/ChangeLog:

* gcc.dg/pr108582-1.c: New test.

(cherry picked from commit 876b3e0514bc8cb2256c44db56255403bedfa52d)

tree-optimization/108522 Use component_ref_field_offset

Instead of using TREE_OPERAND (expr, 2) directly, use
component_ref_field_offset instead, which does scaling for us. The
function also substitutes PLACEHOLDER_EXPRs but it is not relevant for
tree-object-size.

gcc/ChangeLog:

PR tree-optimization/108522
* tree-object-size.cc (compute_object_offset): Make EXPR
argument non-const. Call component_ref_field_offset.

gcc/testsuite/ChangeLog:

PR tree-optimization/108522
* gcc.dg/builtin-dynamic-object-size-0.c (DEFSTRUCT): New
macro.
(test_dynarray_struct_member_b, test_dynarray_struct_member_c,
test_dynarray_struct_member_d,
test_dynarray_struct_member_subobj_b,
test_dynarray_struct_member_subobj_c,
test_dynarray_struct_member_subobj_d): New tests.
(main): Call them.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
(cherry picked from commit 0573a0778af88e805f7630ac8640ecd67d692665)

tree-optimization/108522 Use COMPONENT_REF offset when available

Use the offset in TREE_OPERAND(component_ref, 2) when available instead
of DECL_FIELD_OFFSET when trying to compute offset for a COMPONENT_REF.

Co-authored-by: Jakub Jelinek <jakub@redhat.com>
gcc/ChangeLog:

PR tree-optimization/108522
* tree-object-size.cc (compute_object_offset): Use
TREE_OPERAND(ref, 2) for COMPONENT_REF when available.

gcc/testsuite/ChangeLog:

PR tree-optimization/108522
* gcc.dg/builtin-dynamic-object-size-0.c
(test_dynarray_struct_member): New test.
(main): Call it.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
(cherry picked from commit b851ee9fdf0f3023635f0cb1f7c607b2d6801053)

Daily bump.

c++: equivalence of non-dependent calls [PR107461]

After r13-5684-g59e0376f607805 the (pruned) callee of a non-dependent
CALL_EXPR is a bare FUNCTION_DECL rather than ADDR_EXPR of FUNCTION_DECL.
This innocent change revealed that cp_tree_equal doesn't first check
dependence of a CALL_EXPR before treating a FUNCTION_DECL callee as a
dependent name, which leads to us incorrectly accepting the first two
testcases below and rejecting the third:

* In the first testcase, cp_tree_equal incorrectly returns true for
   the two non-dependent CALL_EXPRs f(0) and f(0) (whose CALL_EXPR_FN
   are different FUNCTION_DECLs) which causes us to treat #2 as a
   redeclaration of #1.

* Same issue in the second testcase, for f<int*>() and f<char>().

* In the third testcase, cp_tree_equal incorrectly returns true for
   f<int>() and f<void(*)(int)>() which causes us to conflate the two
   dependent specializations A<decltype(f<int>()(U()))> and
   A<decltype(f<void(*)(int)>()(U()))>.

This patch fixes this by making called_fns_equal treat two callees as
dependent names only if the overall CALL_EXPRs are dependent, via a new
convenience function call_expr_dependent_name that is like dependent_name
but also checks dependence of the overall CALL_EXPR.

PR c++/107461

gcc/cp/ChangeLog:

* cp-tree.h (call_expr_dependent_name): Declare.
* pt.cc (iterative_hash_template_arg) <case CALL_EXPR>: Use
call_expr_dependent_name instead of dependent_name.
* tree.cc (call_expr_dependent_name): Define.
(called_fns_equal): Adjust to take two CALL_EXPRs instead of
CALL_EXPR_FNs thereof.  Use call_expr_dependent_name instead
of dependent_name.
(cp_tree_equal) <case CALL_EXPR>: Adjust call to called_fns_equal.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/overload5.C: New test.
* g++.dg/cpp0x/overload5a.C: New test.
* g++.dg/cpp0x/overload6.C: New test.

(cherry picked from commit 31924665c86d47af6b1f22a74f594f2e1dc0ed2d)

Daily bump.

fortran: Set name for *LOC default BACK argument [PR108450]

This change fixes an ICE caused by the double resolution of MINLOC,
MAXLOC and FINDLOC expressions which get a default value for the BACK
argument at resolution time. That argument is added without name,
and argument reordering code is not prepared to handle unnamed arguments
coming after named ones, so the second resolution causes a NULL pointer
dereference.
The problem is fixed by explicitly setting the argument name.

PR fortran/108450

gcc/fortran/ChangeLog:

* check.cc (gfc_check_minloc_maxloc): Explicitly set argument name.
(gfc_check_findloc): Ditto.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/minmaxloc_1.f90: New test.

(cherry picked from commit 2e32a12c04c72f692a7bd119fd3e4e5b74392c9d)

Fortran: error recovery on invalid array section [PR108609]

The testcase for PR108527 uncovered a latent issue with invalid array
sections that resulted in different paths being taken on different
architectures. Detect the invalid array declaration for a clean recovery.

gcc/fortran/ChangeLog:

PR fortran/108609
* expr.cc (find_array_section): Add check to prevent interpreting an
mpz non-integer constant as an integer.

gcc/testsuite/ChangeLog:

PR fortran/108609
* gfortran.dg/pr108527.f90: Adjust test pattern.

(cherry picked from commit 88a2a09dd4529107e7ef7a6e7ce43acf96457173)

Fortran: fix ICE in compare_bound_int [PR108527]

gcc/fortran/ChangeLog:

PR fortran/108527
* resolve.cc (compare_bound_int): Expression to compare must be of
type INTEGER.
(compare_bound_mpz_t): Likewise.
(check_dimension): Fix comment on checks applied to array section
and clean up associated logic.

gcc/testsuite/ChangeLog:

PR fortran/108527
* gfortran.dg/pr108527.f90: New test.

Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
(cherry picked from commit 22afa4947584c701633a79fd8750c9ceeaa96711)

Daily bump.

c++: unexpected ADDR_EXPR after overload set pruning [PR107461]

Here the ahead-of-time overload set pruning in finish_call_expr is
unintentionally returning a CALL_EXPR whose (pruned) callee is wrapped
in an ADDR_EXPR, despite the original callee not being wrapped in an
ADDR_EXPR. This ends up causing a bogus declaration mismatch error in
the below testcase because the call to min in #1 gets expressed as a
CALL_EXPR of ADDR_EXPR of FUNCTION_DECL, whereas the level-lowered call
to min in #2 gets expressed instead as a CALL_EXPR of FUNCTION_DECL.

This patch fixes this by stripping the spurious ADDR_EXPR appropriately.
Thus the first call to min now also gets expressed as a CALL_EXPR of
FUNCTION_DECL, matching the behavior before r12-6075-g2decd2cabe5a4f.

PR c++/107461

gcc/cp/ChangeLog:

* semantics.cc (finish_call_expr): Strip ADDR_EXPR from
the selected callee during overload set pruning.

gcc/testsuite/ChangeLog:

* g++.dg/template/call9.C: New test.

(cherry picked from commit 59e0376f607805ef9b67fd7b0a4a3084ab3571a5)

Daily bump.

Fortran: diagnose USE associated symbols in COMMON blocks [PR108453]

gcc/fortran/ChangeLog:

PR fortran/108453
* match.cc (gfc_match_common): A USE associated name shall not appear
in a COMMON block (F2018:C8121).

gcc/testsuite/ChangeLog:

PR fortran/108453
* gfortran.dg/common_27.f90: New test.

(cherry picked from commit aba9ff8f30d4245294ea2583de1dc28f1c7ccf7b)

libstdc++: Fix std::random_device for avr

This fixes a build failure that affects avr, but could affect other
targets in theory. The _M_fini function should not try to use ::open or
::fopen if _GLIBCXX_USE_DEV_RANDOM is not defined, because no file can
ever have been opened.

libstdc++-v3/ChangeLog:

* src/c++11/random.cc (random_device::_M_fini): Do not try to
close the file handle if the target doesn't support the
/dev/random and /dev/urandom files.

(cherry picked from commit 277dd6ea416718ba5493023b5a4660ecdbaf936c)

libstdc++: Fix build failures for avr

The abr-libc <errno.h> does not define EOVERFLOW, which means that
std::errc::value_too_large is not defined, and so <charconv> cannot be
compiled. Define value_too_large for avr with a value that does not
clash with any that is defined in <errno.h>. This is a kluge to fix
bootstrap for avr; it can be removed after PR libstdc++/104883 is
resolved.

The avr-libc <errno.h> fails to meet the C and POSIX requirements that
each error macro has a distinct integral value, and is usable in #if
directives. Add a special case for avr to system_error.cc so that only
the valid errors are recognized. Also disable the errno checks in
std::filesystem::remove_all that assume a meaningful value for errno.

On avr-libc <unistd.h> exists but does not define the POSIX functions
needed by std::filesystem, so _GLIBCXX_HAVE_UNISTD_H is not sufficient
to check for basic POSIX APIs. Check !defined __AVR__ as well as
_GLIBCXX_HAVE_UNISTD_H before using those functions. This is a kluge and
we should really have a specific macro that says the required functions
are available.

libstdc++-v3/ChangeLog:

* config/os/generic/error_constants.h (errc::value_too_large)
[__AVR__]: Define.
* src/c++11/system_error.cc
(system_category::default_error_condition) [__AVR__]: Only match
recognize values equal to EDOM, ERANGE, ENOSYS and EINTR.
* src/c++17/fs_ops.cc (fs::current_path) [__AVR__]: Do not check
for ENOENT etc. in switch.
(fs::remove_all) [__AVR__]: Likewise.
* src/filesystem/ops-common.h [__AVR__]: Do not use POSIX open,
close etc.

(cherry picked from commit 2d2e163d12f64a5e68f9e0381751ed9d5d6d3617)

libstdc++: Declare const global variables inline

The changes inside the regex_constants and execution namespaces seem to
be (the only) unimplemented parts of P0607R0 "Inline Variable for the
Standard Library"; the rest of the changes are to implementation details.

libstdc++-v3/ChangeLog:

* include/bits/atomic_wait.h (_detail::__platform_wait_alignment):
Declare inline.  Remove redundant static specifier.
(__detail::__atomic_spin_count_relax): Declare inline.
(__detail::__atomic_spin_count): Likewise.
* include/bits/regex_automaton.h (__detail::_S_invalid_state_id):
Declare inline for C++17.  Declare constexpr.  Remove
redundant const and static specifiers.
* include/bits/regex_error.h (regex_constants::error_collate):
Declare inline for C++17 as per P0607R0.
(regex_constants::error_ctype): Likewise.
(regex_constants::error_escape): Likewise.
(regex_constants::error_backref): Likewise.
(regex_constants::error_brack): Likewise.
(regex_constants::error_paren): Likewise.
(regex_constants::error_brace): Likewise.
(regex_constants::error_badbrace): Likewise.
(regex_constants::error_range): Likewise.
(regex_constants::error_space): Likewise.
(regex_constants::error_badrepeat): Likewise.
(regex_constants::error_complexity): Likewise.
(regex_constants::error_stack): Likewise.
* include/ext/concurrence.h (__gnu_cxx::__default_lock_policy):
Likewise.  Remove redundant static specifier.
* include/pstl/execution_defs.h (execution::seq): Declare inline
for C++17 as per P0607R0.
(execution::par): Likewise.
(execution::par_unseq): Likewise.
(execution::unseq): Likewise.

(cherry picked from commit e3b10249119fb4258fb83d21d805a04f30b63152)

Daily bump.

ipa: Release body more carefully when removing nodes (PR 107944)

The code removing function bodies when the last call graph clone of a
node is removed is too aggressive when there are nodes up the
clone_of chain which still need them. Fixed by expanding the check.

gcc/ChangeLog:

2023-01-18 Martin Jambor <mjambor@suse.cz>

PR ipa/107944
* cgraph.cc (cgraph_node::remove): Check whether nodes up the
lcone_of chain also do not need the body.

(cherry picked from commit db959e250077ae6b4fc08f53fb322719582c5de6)

c++: ICE with -Wlogical-op [PR107755]

Here we crash in the middle end because warn_logical_operator calls
build_range_check which calls various fold_* functions and those
don't work too well when we're still processing template trees. For
instance here we crash because we're converting a RECORD_TYPE to bool.
At this point VIEW_CONVERT_EXPR<struct Foo>(b) hasn't yet been converted
to Foo::operator bool (&b).

I was excited to fix this with instantiation_dependent_expression_p
which can now be called from c-family/ as well, but the problem isn't
that the expression is dependent. So, p_t_d it is.

PR c++/107755

gcc/cp/ChangeLog:

* call.cc (build_new_op): Don't call warn_logical_operator when
processing a template.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wlogical-op-4.C: New test.

(cherry picked from commit 5ce8961b46f050a96e8c542b34b1cf024ba95f1b)