GHA/windows: improve build perf with cmake unity batches
author: Viktor Szakats <commit@vsz.me>
Sat, 8 Feb 2025 23:50:07 +0000 (00:50 +0100)
committer: Viktor Szakats <commit@vsz.me>
Mon, 10 Feb 2025 11:54:11 +0000 (12:54 +0100)
By default, curl unity builds create a single unit per target, meaning
all of a target's sources are batched together and built in a single
compiler invocation. On multi-core CPUs this doesn't always give the
best possible performance. This patch enables smaller batches for the
jobs where they shortened build times: Cygwin, MSYS2 and MinGW, all
running on the Windows runners.

Use a batch size of 30 (meaning 30 sources batched into each unit), and
32 for Cygwin/MSYS2 to avoid a unity fallout that is the subject of
a different PR.

(CMake lets you set the number of sources per unit, not the number
of units, though the latter may be more practical to max out CPU cores.)
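
Because only the per-unit source count is configurable, a desired unit count has to be converted with a ceiling division. A minimal sketch, assuming a hypothetical helper named `batch_size` and illustrative source counts (neither is part of this patch):

```shell
# Hypothetical helper, not part of this patch: derive a value for
# CMAKE_UNITY_BUILD_BATCH_SIZE from a source count and a desired unit count.
batch_size() {
  sources=$1
  units=$2
  # Ceiling division, so that at most $units units are produced.
  echo $(( (sources + units - 1) / units ))
}

# Illustrative numbers only: 120 sources split across 4 units.
batch_size 120 4   # prints 30
```

The result could then be passed as `-DCMAKE_UNITY_BUILD_BATCH_SIZE=<value>`, though the source count differs per target, so a single global value remains a compromise.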

Also override the `curlu` target to keep it as a single unit (batch
size 0), because splitting it into batches lost a little time there, due
to the parallelism that already exists when building the `testdeps`
targets.

On the macOS and Linux runners, build times were already mostly in the
single digits or low double digits of seconds, and batching didn't
improve them noticeably. On VM jobs, the virtual CPUs are limited, so
I didn't try it. In the AppVeyor and GHA vcpkg jobs (using msbuild),
batching gave no conclusive gains, or none at all.

Build times in seconds (curl + testdeps):
Job                  |          Before | After w curlu=0 | Gain
:--------------------| :-------------- | :-------------- | :---
cygwin, CM           |   19 + 32 =  51 |  12 +  32 =  44 |    7
msys2, CM            |    7 + 15 =  22 |   5 +  14 =  19 |    3
mingw gcc U, CM      |   19 + 30 =  49 |  13 +  32 =  45 |    4
mingw ucrt, CM       |   32 + 42 =  74 |  15 +  43 =  58 |   16
mingw clang, CM      |   15 + 21 =  36 |   8 +  21 =  29 |    7
mingw uwp, CM        |   30 + 40 =  70 |  14 +  40 =  54 |   16
mingw gcc, CM        |   20 + 31 =  51 |  12 +  31 =  43 |    8
mingw x86, CM        |   35 + 40 =  75 |  15 +  38 =  53 |   22
dl-mingw, CM 9.5.0   |   88 + 99 = 187 |  42 + 101 = 143 |   44
dl-mingw, CM 7.3.0 U |   24 + 32 =  56 |  17 +  35 =  52 |    4
Total                |                 |                 |  131

Total gain per GHA/windows workflow run: 2m11s
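
The total can be cross-checked from the "Gain" column. A small sketch, with the per-job values copied from the table above and variable names chosen only for illustration:

```shell
# Per-job gains in seconds, copied from the "Gain" column above.
gains="7 3 4 16 7 16 8 22 44 4"

total=0
for g in $gains; do
  total=$(( total + g ))
done

# Format the sum as minutes and seconds.
summary=$(printf '%dm%02ds' $(( total / 60 )) $(( total % 60 )))
echo "${total}s = ${summary}"   # prints "131s = 2m11s"
```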

Runs:
Before: https://github.com/curl/curl/actions/runs/13220256084/job/36904342259
After: https://github.com/curl/curl/actions/runs/13220383702/job/36904602981
       https://github.com/curl/curl/actions/runs/13220613141/job/36905170104
       https://github.com/curl/curl/actions/runs/13222019443/job/36908358550
With curlu tweak: https://github.com/curl/curl/actions/runs/13222239255/job/36908782462

Ref: 116950a25066257f86461f9d1dfa5f787f55e73c #16265

Closes #16272

.github/workflows/windows.yml
lib/CMakeLists.txt

diff --git a/.github/workflows/windows.yml b/.github/workflows/windows.yml
index 8dabd34c826b9d90d4fc58f5f9772752c6f8241c..21c2d219dd0b095005be6a14642a6c5e74f44395 100644
@@ -86,7 +86,7 @@ jobs:
           if [ '${{ matrix.build }}' = 'cmake' ]; then
             PATH="/usr/bin:$(cygpath "${SYSTEMROOT}")/System32"
             cmake -B bld -G Ninja ${options} \
-              -DCMAKE_UNITY_BUILD=ON -DCURL_TEST_BUNDLES=ON \
+              -DCMAKE_UNITY_BUILD=ON -DCMAKE_UNITY_BUILD_BATCH_SIZE=32 -DCURL_TEST_BUNDLES=ON \
               -DCURL_WERROR=ON \
               ${{ matrix.config }}
           else
@@ -252,7 +252,7 @@ jobs:
             cmake -B bld -G Ninja ${options} \
               -DCMAKE_C_FLAGS="${{ matrix.cflags }} ${CFLAGS_CMAKE} ${CPPFLAGS}" \
               -DCMAKE_BUILD_TYPE='${{ matrix.type }}' \
-              -DCMAKE_UNITY_BUILD=ON -DCURL_TEST_BUNDLES=ON \
+              -DCMAKE_UNITY_BUILD=ON -DCMAKE_UNITY_BUILD_BATCH_SIZE=32 -DCURL_TEST_BUNDLES=ON \
               -DCURL_WERROR=ON \
               ${{ matrix.config }}
           else
@@ -438,7 +438,7 @@ jobs:
           cmake -B bld -G 'MSYS Makefiles' ${options} \
             -DCMAKE_C_COMPILER=gcc \
             -DCMAKE_BUILD_TYPE='${{ matrix.type }}' \
-            -DCMAKE_UNITY_BUILD=ON -DCURL_TEST_BUNDLES=ON \
+            -DCMAKE_UNITY_BUILD=ON -DCMAKE_UNITY_BUILD_BATCH_SIZE=30 -DCURL_TEST_BUNDLES=ON \
             -DCURL_WERROR=ON \
             -DCURL_USE_LIBPSL=OFF \
             ${{ matrix.config }}
diff --git a/lib/CMakeLists.txt b/lib/CMakeLists.txt
index 168fa20cf3cbe4f6554912859782b1e8899d9caa..5f1b395f0a341cdd527d829d21963df0ecdbe5ff 100644
@@ -55,6 +55,9 @@ if(CURL_BUILD_TESTING)
   )
   target_compile_definitions(curlu PUBLIC "UNITTESTS" "CURL_STATICLIB")
   target_link_libraries(curlu PRIVATE ${CURL_LIBS})
+  # There is plenty of parallelism when building the testdeps target.
+  # Override the curlu batch size with the maximum to optimize performance.
+  set_target_properties(curlu PROPERTIES UNITY_BUILD_BATCH_SIZE 0)
 endif()
 
 if(ENABLE_CURLDEBUG)