tune.runqueue-depth <number>
Sets the maximum amount of task that can be processed at once when running
- tasks. The default value is 200. Increasing it may incur latency when
- dealing with I/Os, making it too small can incur extra overhead. When
- experimenting with much larger values, it may be useful to also enable
- tune.sched.low-latency to limit the maximum latency to the lowest possible.
+ tasks. The default value is 40 which tends to show the highest request rates
+ and lowest latencies. Increasing it may incur latency when dealing with I/Os,
+ making it too small can incur extra overhead. When experimenting with much
+ larger values, it may be useful to also enable tune.sched.low-latency and
+ possibly tune.fd.edge-triggered to limit the maximum latency to the lowest
+ possible.
tune.sched.low-latency { on | off }
Enables ('on') or disables ('off') the low-latency task scheduler. By default
#define MAX_POLL_EVENTS 200
#endif
-// the max number of tasks to run at once
+// the max number of tasks to run at once. Tests have shown the following
+// number of requests/s for 1 to 16 threads (1c1t, 1c2t, 2c4t, 4c8t, 4c16t):
+//
+// rq\thr|    1     2     4     8    16
+// ------+------------------------------
+//     32| 120k  159k  276k  477k  698k
+//     40| 122k  160k  276k  478k  722k
+//     48| 121k  159k  274k  482k  720k
+//     64| 121k  160k  274k  469k  710k
+//    200| 114k  150k  247k  415k  613k
+//
#ifndef RUNQUEUE_DEPTH
-#define RUNQUEUE_DEPTH 200
+#define RUNQUEUE_DEPTH 40
#endif
// cookie delimiter in "prefix" mode. This character is inserted between the