tree-parloops: Enable runtime thread detection with -ftree-parallelize-loops
This patch adds runtime thread count detection to auto-parallelization.
The -ftree-parallelize-loops option, when given without an explicit thread
count, generates parallelized loops without a fixed number of threads.
The decision is deferred to program execution time, where it is
controlled by the OMP_NUM_THREADS environment variable.
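For illustration only (not part of the patch), a loop of the following
shape is the kind of candidate the auto-parallelizer looks at. The
function name scale_add and its parameters are hypothetical, and whether
the loop is actually parallelized depends on the pass's own dependence
and profitability checks.

```c
#include <stddef.h>

/* Hypothetical kernel: every iteration is independent of the others,
   so the iterations can in principle be split across threads.  */
void
scale_add (double *restrict out, const double *restrict a,
	   const double *restrict b, double k, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = a[i] * k + b[i];
}
```

Built into a program with optimization enabled and
-ftree-parallelize-loops, the number of threads used for such a loop
would be chosen at run time, for example by setting OMP_NUM_THREADS=4 in
the environment before running the program.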
Bootstrapped and regression tested on aarch64-linux. Compiled SPEC HPC pot3d
https://www.spec.org/hpc2021/docs/benchmarks/628.pot3d_s.html with
-ftree-parallelize-loops and tested it both with OMP_NUM_THREADS unset in
the environment and with OMP_NUM_THREADS set to different values.
gcc/ChangeLog:
* doc/invoke.texi (ftree-parallelize-loops): Update.
* common.opt (ftree-parallelize-loops): Add alias that maps to
special value INT_MAX for runtime thread detection.
* tree-parloops.cc (create_parallel_loop): Use INT_MAX for runtime
detection. Call gimple_build_omp_parallel without building an
OMP_CLAUSE_NUM_THREADS clause.
(gen_parallel_loop): For auto-detection, use a conservative
estimate of 2 threads.
(parallelize_loops): Likewise.