OPTIM: proxy: move atomically accessed fields out of the read-only ones
Perf top showed that h1_snd_buf() was stalling heavily when accessing
the proxy's server_id_hdr_name field in the middle of the headers loop.
Moving the assignment out of the loop into a local variable simply moved
the stall to that assignment:
         |      if (!(h1m->flags & H1_MF_RESP) && isttest(h1c->px->server_id_hdr_n
    0.10 |20b0:   mov    -0x120(%rbp),%rdi
    1.33 |        mov    0x60(%rdi),%r10
    0.01 |        test   %eax,%eax
    0.18 |        jne    2118
   12.87 |        mov    0x350(%r10),%rdi
    0.01 |        test   %rdi,%rdi
    0.05 |        je     2118
         |        mov    0x358(%r10),%r11
It turns out that there are several atomically accessed fields in its
vicinity, causing the cache line to bounce all the time. Let's collect
the few frequently changed fields and place them together at the end
of the structure, and plug the 32-bit hole with another isolated field.
Doing so also slightly reduced the cost of decrementing be->be_conn
in process_stream(), and overall HTTP/1 performance increased by
about 1% on both ARM and x86_64.
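
For illustration only, the layout principle described above can be sketched
as follows. This is not the actual struct proxy: the structure, its field
names, and the 64-byte cache line size are all hypothetical, and C11
alignas() is used here where the real code may rely on its own alignment
macros. The read-mostly fields stay at the top, and the frequently updated
atomic counters are grouped at the end on their own cache line, so that
writes to them no longer invalidate the line holding the read-only data.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>

/* Assumed cache line size; real targets may differ (e.g. 128 bytes). */
#define DEMO_CACHE_LINE 64

/* Hypothetical, heavily simplified stand-in for a proxy-like structure. */
struct demo_proxy {
	/* Read-mostly part: set at configuration time, only read afterwards. */
	const char *hdr_name;
	size_t hdr_len;
	int options;

	/* Frequently updated part: grouped at the end of the structure and
	 * forced onto its own cache line so that atomic updates do not make
	 * the read-mostly line above bounce between CPUs.
	 */
	alignas(DEMO_CACHE_LINE) atomic_uint conn_count;
	atomic_uint served;
	atomic_ullong bytes_out;
};

/* Compile-time check that the hot fields start on a cache line boundary. */
static_assert(offsetof(struct demo_proxy, conn_count) % DEMO_CACHE_LINE == 0,
              "hot fields must start on a cache line boundary");

/* Returns the index of the cache line containing the given offset. */
static inline size_t demo_line_of(size_t off)
{
	return off / DEMO_CACHE_LINE;
}

/* Returns 1 if the last read-mostly field and the first hot field live
 * in different cache lines, 0 otherwise.
 */
static inline int demo_hot_fields_isolated(void)
{
	size_t ro_line  = demo_line_of(offsetof(struct demo_proxy, options));
	size_t hot_line = demo_line_of(offsetof(struct demo_proxy, conn_count));

	return ro_line != hot_line;
}
```

A check such as demo_hot_fields_isolated() can be run once at startup or in
a unit test to catch later field additions that would put a hot counter
back next to read-only data.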