When IB1 is split into multiple cmd buffers, we'd emit multiple
RD_CMDSTREAM_ADDR per submit. But after this packet is handled
by the cffdump parser, it resets it's known buffers on the next
GPUADDR packet, so subsequent RD_CMDSTREAM_ADDR packets from the
same submit would not find their buffers.
Re-work the loop to snapshot all buffers before RD_CMDSTREAM_ADDR
to avoid this problem.