BMP: simplified update queuing and better memory performance
This commit is quite a substantial rework of the underlying layers in
BMP TX:
- several unnecessary layers of indirection dropped, including most of
the original BMP's buffer machinery
- all messages are now written directly into one protocol's buffer
allocated for the whole time big enough to fit every possible message
- output blocks are allocated by pages and immediately returned when
used, improving the overall memory footprint
- no intermediary allocation is done from the heap altogether
- there is a documented and configurable limit on the TX queue size