Jacques Heunis from bloomberg reported on the mailing list [1] that
with haproxy 2.8 up to master, yielding from a Lua tcp service while
data was still buffered inside haproxy would eat some data which was
definitely lost.
He provided the reproducer below which turned out to be really helpful:
global
log stdout format raw local0 info
lua-load haproxy_yieldtest.lua
defaults
log global
timeout connect 10s
timeout client 1m
timeout server 1m
listen echo
bind *:9090
mode tcp
tcp-request content use-service lua.print_input
haproxy_yieldtest.lua:
core.register_service("print_input", "tcp", function(applet)
core.Info("Start printing input...")
while true do
local inputs = applet:getline()
if inputs == nil or string.len(inputs) == 0 then
core.Info("closing input connection")
return
end
core.Info("Received line: "..inputs)
core.yield()
end
end)
And the script below:
#!/usr/bin/bash
for i in $(seq 1 9999); do
for j in $(seq 1 50); do
echo "${i}_foo_${j}"
done
sleep 2
done
Using it like this:
./test_seq.sh | netcat localhost 9090
We can clearly see the missing data for every "foo" burst (every 2
seconds), as they are holes in the numbering.
Thanks to the reproducer, it was quickly found that only versions
>= 2.8 were affected, and that in fact this regression was introduced
by commit
31572229e ("MEDIUM: hlua/applet: Use the sedesc to report and
detect end of processing")
In fact in
31572229e 2 mistakes were made during the refaco.
Indeed, both in hlua_applet_tcp_fct() (which is involved in the reproducer
above) and hlua_applet_http_fct(), the request (buffer) is now
systematically consumed when returning from the function, which wasn't the
case prior to this commit: when HLUA_E_AGAIN is returned, it means a
yield was requested and that the processing is not done yet, thus we
should not consume any data, like we did prior to the refacto.
Big thanks to Jacques who did a great job reproducing and reporting this
issue on the mailing list.
[1]: https://www.mail-archive.com/haproxy@formilux.org/msg45778.html
It should be backported up to 2.8 with commit
31572229e
struct act_rule *rule = ctx->rule;
struct proxy *px = strm->be;
struct hlua *hlua = tcp_ctx->hlua;
+ int yield = 0;
if (unlikely(se_fl_test(ctx->sedesc, (SE_FL_EOS|SE_FL_ERROR|SE_FL_SHR|SE_FL_SHW))))
goto out;
/* yield. */
case HLUA_E_AGAIN:
+ yield = 1;
if (hlua->wake_time != TICK_ETERNITY)
task_schedule(tcp_ctx->task, hlua->wake_time);
break;
}
out:
- /* eat the whole request */
+ /* eat the whole request unless yield was requested which means
+ * we are not done yet
+ */
+ if (yield)
+ return;
+
if (ctx->flags & APPCTX_FL_INOUT_BUFS)
b_reset(&ctx->inbuf);
else
struct proxy *px = strm->be;
struct hlua *hlua = http_ctx->hlua;
struct htx *req_htx, *res_htx;
+ int yield = 0;
res_htx = htx_from_buf(&res->buf);
/* yield. */
case HLUA_E_AGAIN:
+ yield = 1;
if (hlua->wake_time != TICK_ETERNITY)
task_schedule(http_ctx->task, hlua->wake_time);
goto out;
out:
htx_to_buf(res_htx, &res->buf);
- /* eat the whole request */
+
+ /* eat the whole request unless yield was requested which means
+ * we are not done yet
+ */
+ if (yield)
+ return;
+
if (co_data(req)) {
req_htx = htx_from_buf(&req->buf);
co_htx_skip(req, req_htx, co_data(req));