From: Viktor Szakats
Date: Sat, 8 Feb 2025 23:50:07 +0000 (+0100)
Subject: GHA/windows: improve build perf with cmake unity batches
X-Git-Tag: curl-8_12_1~19
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=29e4eda631f46368c2adf833ba3065b1b46c2a7d;p=thirdparty%2Fcurl.git

GHA/windows: improve build perf with cmake unity batches

Default curl unity builds make a single unit for each target. It means
all target sources are batched together and built in a single compiler
invocation. With multi-core CPUs this doesn't always result in the best
possible performance.

This patch enables smaller batches for jobs where this resulted in
shorter build times. These are the Cygwin, MSYS2 and MinGW jobs, all
running on the Windows runners.

Use a batch size of 30 (meaning 30 sources batched into each unit), and
32 for Cygwin/MSYS2 to avoid a unity fallout that's subject to
a different PR. (CMake allows setting the number of sources per unit,
not the number of units, though the latter may be more practical to max
out CPU cores.)

Also override the batch size for the `curlu` target so that it is built
as a single unit, because smaller batches lost a little bit of time
there, due to the already existing parallelism when building the
`testdeps` targets.

On the macOS and Linux runners, build times were already mostly in the
single digits or low teens of seconds, and batching didn't improve on
them noticeably.

On VM jobs, the virtual CPUs are limited, so I didn't try.

In AppVeyor and GHA vcpkg jobs (using msbuild), batching didn't result
in conclusive gains, or any gains at all.

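To illustrate the sources-per-unit arithmetic above, here is a minimal
sketch that is not part of this patch. `unity_units` is a hypothetical
helper and the source count of 150 is a made-up figure. It assumes a
batch size of 0 means "no limit" (one unit per target) and ignores
details such as sources excluded from unity builds.

```shell
#!/bin/sh
# Hypothetical helper: estimate how many unity units a target gets.
# $1 = number of sources in the target (illustrative value only)
# $2 = batch size, i.e. sources per unit; 0 is taken to mean "no limit"
unity_units() {
  sources=$1
  batch=$2
  if [ "$sources" -eq 0 ]; then
    echo 0
  elif [ "$batch" -eq 0 ]; then
    echo 1
  else
    # Ceiling division: the last unit may hold fewer sources.
    echo $(( (sources + batch - 1) / batch ))
  fi
}

unity_units 150 30   # prints 5
unity_units 150 32   # prints 5
unity_units 150 0    # prints 1
```

With such a target, both batch sizes used here would give 5 units that
can be compiled in parallel, instead of 1.
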
Build times in seconds (curl + testdeps):

Job                  | Before          | After w curlu=0 | Gain
:--------------------| :-------------- | :-------------- | :---
cygwin, CM           | 19 + 32 = 51    | 12 + 32 = 44    | 7
msys2, CM            | 7 + 15 = 22     | 5 + 14 = 19     | 3
mingw gcc U, CM      | 19 + 30 = 49    | 13 + 32 = 45    | 4
mingw ucrt, CM       | 32 + 42 = 74    | 15 + 43 = 58    | 16
mingw clang, CM      | 15 + 21 = 36    | 8 + 21 = 29     | 7
mingw uwp, CM        | 30 + 40 = 70    | 14 + 40 = 54    | 16
mingw gcc, CM        | 20 + 31 = 51    | 12 + 31 = 43    | 8
mingw x86, CM        | 35 + 40 = 75    | 15 + 38 = 53    | 22
dl-mingw, CM 9.5.0   | 88 + 99 = 187   | 42 + 101 = 143  | 44
dl-mingw, CM 7.3.0 U | 24 + 32 = 56    | 17 + 35 = 52    | 4
Total                |                 |                 | 131

Total gain per GHA/windows workflow runs: 2m11s

Runs:
Before:
https://github.com/curl/curl/actions/runs/13220256084/job/36904342259
After:
https://github.com/curl/curl/actions/runs/13220383702/job/36904602981
https://github.com/curl/curl/actions/runs/13220613141/job/36905170104
https://github.com/curl/curl/actions/runs/13222019443/job/36908358550
With curlu tweak:
https://github.com/curl/curl/actions/runs/13222239255/job/36908782462

Ref: 116950a25066257f86461f9d1dfa5f787f55e73c #16265

Closes #16272
---

diff --git a/.github/workflows/windows.yml b/.github/workflows/windows.yml
index 8dabd34c82..21c2d219dd 100644
--- a/.github/workflows/windows.yml
+++ b/.github/workflows/windows.yml
@@ -86,7 +86,7 @@ jobs:
           if [ '${{ matrix.build }}' = 'cmake' ]; then
             PATH="/usr/bin:$(cygpath "${SYSTEMROOT}")/System32"
             cmake -B bld -G Ninja ${options} \
-              -DCMAKE_UNITY_BUILD=ON -DCURL_TEST_BUNDLES=ON \
+              -DCMAKE_UNITY_BUILD=ON -DCMAKE_UNITY_BUILD_BATCH_SIZE=32 -DCURL_TEST_BUNDLES=ON \
               -DCURL_WERROR=ON \
               ${{ matrix.config }}
           else
@@ -252,7 +252,7 @@ jobs:
             cmake -B bld -G Ninja ${options} \
               -DCMAKE_C_FLAGS="${{ matrix.cflags }} ${CFLAGS_CMAKE} ${CPPFLAGS}" \
               -DCMAKE_BUILD_TYPE='${{ matrix.type }}' \
-              -DCMAKE_UNITY_BUILD=ON -DCURL_TEST_BUNDLES=ON \
+              -DCMAKE_UNITY_BUILD=ON -DCMAKE_UNITY_BUILD_BATCH_SIZE=32 -DCURL_TEST_BUNDLES=ON \
               -DCURL_WERROR=ON \
               ${{ matrix.config }}
           else
@@ -438,7 +438,7 @@ jobs:
             cmake -B bld -G 'MSYS Makefiles' ${options} \
               -DCMAKE_C_COMPILER=gcc \
               -DCMAKE_BUILD_TYPE='${{ matrix.type }}' \
-              -DCMAKE_UNITY_BUILD=ON -DCURL_TEST_BUNDLES=ON \
+              -DCMAKE_UNITY_BUILD=ON -DCMAKE_UNITY_BUILD_BATCH_SIZE=30 -DCURL_TEST_BUNDLES=ON \
               -DCURL_WERROR=ON \
               -DCURL_USE_LIBPSL=OFF \
               ${{ matrix.config }}
diff --git a/lib/CMakeLists.txt b/lib/CMakeLists.txt
index 168fa20cf3..5f1b395f0a 100644
--- a/lib/CMakeLists.txt
+++ b/lib/CMakeLists.txt
@@ -55,6 +55,9 @@ if(CURL_BUILD_TESTING)
   )
   target_compile_definitions(curlu PUBLIC "UNITTESTS" "CURL_STATICLIB")
   target_link_libraries(curlu PRIVATE ${CURL_LIBS})
+  # There is plenty of parallelism when building the testdeps target.
+  # Override the curlu batch size with the maximum to optimize performance.
+  set_target_properties(curlu PROPERTIES UNITY_BUILD_BATCH_SIZE 0)
 endif()
 
 if(ENABLE_CURLDEBUG)