gdb: pagination fix for emoji and unicode box output
I noticed that, on an 80x24 terminal, the new unicode boxed hint text
was causing the pager to trigger unexpectedly:
$ gdb -nx -nh
GNU gdb (GDB) 18.0.50.20260107-git
Copyright (C) 2026 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Find the GDB manual online at: ┃
┃ http://www.gnu.org/software/gdb/documentation/. ┃
┃ For help, type "help". ┃
┃ Type "apropos <word>" to search for commands related to <word>. ┃
--Type <RET> for more, q to quit, c to continue without paging--
At this point there are 6 unused lines remaining in my terminal, so
the pager should not have triggered yet.
There are two problems here, both in pager_file::puts (in utils.c).
Let's start with the easy problem first. When content is written to
the pager_file we process it within a loop looking for a newline
character. We handle some special cases, but if none of them apply we
handle all general, printable, content with this block:
    else
      {
        m_wrap_buffer.push_back (*linebuffer);
        chars_printed++;
        linebuffer++;
      }
This copies one byte from LINEBUFFER to M_WRAP_BUFFER, and increments
CHARS_PRINTED. The problem is that the Unicode box characters are
multi-byte, which means we over-increment CHARS_PRINTED by counting
each byte of a Unicode character as one output character.
GDB believes that the top line of the box is actually going to span
over multiple screen lines due to the large number of bytes within the
line. In reality of course, the multi-byte characters fill exactly
one screen line.
I propose fixing this by making use of mbrlen to spot multi-byte
characters and count them as a single character.
If mbrlen returns anything less than 1 (which indicates a null
character, or an invalid character), then I just treat this as a
single byte character and continue as before. This means if any
"weird" output is sent to the pager then it will still be printed.
The null wide character case shouldn't occur as the null wide
character is still all zeros, which the outer control loop in ::puts
should catch, so all I'm really concerned about is the invalid wide
character case.
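As a rough illustration of the counting described above, here is a
minimal standalone sketch. It is not the code from this patch:
count_output_chars is a hypothetical helper, and it assumes the
LC_CTYPE locale has already been set to match the encoding of the
string.

```cpp
#include <cassert>
#include <clocale>
#include <cstring>
#include <cwchar>

// Hypothetical helper, not GDB code: count the characters in the
// NUL-terminated string S, treating each multi-byte sequence as one
// character.  Relies on the current LC_CTYPE locale to decode S.
static int
count_output_chars (const char *s)
{
  std::mbstate_t state {};
  const char *end = s + std::strlen (s);
  int chars = 0;

  while (s < end)
    {
      std::size_t len = std::mbrlen (s, end - s, &state);

      // 0 means a null character, (size_t) -1 an invalid sequence and
      // (size_t) -2 an incomplete one.  Fall back to consuming a
      // single byte so that odd output is still counted and printed.
      if (len == 0 || len == (std::size_t) -1 || len == (std::size_t) -2)
        {
          len = 1;
          // The conversion state is unspecified after an error.
          state = std::mbstate_t {};
        }

      s += len;
      chars++;
    }

  return chars;
}
```

In a UTF-8 locale this should count a string made of three box
characters (nine bytes) as three characters, whereas a plain byte
count would report nine.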
Handling multi-byte wide characters does make things a little better,
but doesn't fix everything. The pager still activates unnecessarily,
but just a little later. On the same 80x24 terminal, the output is
now:
$ gdb -nx -nh
GNU gdb (GDB) 18.0.50.20260107-git
Copyright (C) 2026 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Find the GDB manual online at: ┃
┃ http://www.gnu.org/software/gdb/documentation/. ┃
┃ For help, type "help". ┃
┃ Type "apropos <word>" to search for commands related to <word>. ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
--Type <RET> for more, q to quit, c to continue without paging--
We managed to get an extra line of output printed, but there is still
enough room on the terminal to print everything, so why is the pager
triggering?
The problem now is how we deal with lines that entirely fill the
terminal, and how we handle newlines.
Within the pager_file::puts inner loop we process input from
LINEBUFFER and copy it to the M_WRAP_BUFFER, incrementing
CHARS_PRINTED as we do. This continues until we reach the end of
LINEBUFFER, or until we reach a newline within LINEBUFFER.
After each character is added to M_WRAP_BUFFER, we check CHARS_PRINTED
to see if we have filled a line. If we have then we flush
M_WRAP_BUFFER and increment LINES_PRINTED. If enough lines have now
been printed then we activate the pager.
Alternatively, if we encounter a newline character in LINEBUFFER then
we flush M_WRAP_BUFFER and increment LINES_PRINTED, then we re-enter
the inner loop, which includes performing a check to see if the pager
should trigger.
The problem here is that when we print the box, we entirely fill a
screen line, and then print the newline character. Printing the
final non-newline character is enough to trigger the line-full
logic, which flushes the line and increments LINES_PRINTED. The
CHARS_PRINTED count is also reset to zero.
Then we print the newline. This never enters the inner loop, but just
goes straight to the newline handling code, which increments
LINES_PRINTED and also resets CHARS_PRINTED to zero.
Notice that we've now incremented LINES_PRINTED twice. This is the
cause of the premature pager activation; lines that are exactly one
screen width wide end up being double counted.
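The double counting can be reproduced with a toy model of the
accounting described above. This is only a sketch, not GDB's actual
code: old_lines_counted is a hypothetical function, it ignores
buffering, wrapping and styling, and it assumes every byte occupies
one screen column.

```cpp
#include <cassert>
#include <string>

// Toy model of the old accounting, not GDB's actual code.  Returns
// the number of screen lines counted for TEXT on a terminal that is
// WIDTH characters wide.  Assumes one byte is one screen column.
static int
old_lines_counted (const std::string &text, int width)
{
  int chars_printed = 0;
  int lines_printed = 0;

  for (char c : text)
    {
      if (c == '\n')
        {
          // Newline handling: finish the current line.
          chars_printed = 0;
          lines_printed++;
          continue;
        }

      chars_printed++;

      // Old behaviour: check for a full line straight after adding
      // the character.
      if (chars_printed >= width)
        {
          chars_printed = 0;
          lines_printed++;
        }
    }

  return lines_printed;
}
```

With a width of 80, a line of 79 characters followed by a newline is
counted once, but a line of exactly 80 characters followed by a
newline is counted twice.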
My initial thoughts when trying to fix this were to move the full line
check before the code which copies content from LINEBUFFER to
M_WRAP_BUFFER, inside the pager_file::puts inner loop. This would
mean we only check for a full line when processing the next byte of
output after we've filled a screen line, but we'd never encounter this
check if the first byte after a full screen line was the newline
character, as in this case we'd never enter the inner loop.
This does indeed fix the immediate problem but is, I think, still not
correct.
On an 80 character wide terminal, what we actually care about is when
we try to add the 81st _printable_ character. If the 81st character
was a tab then this doesn't wrap onto the next line. Or if the 81st
character was \r, then this certainly doesn't wrap to a new line, it
just resets the current line. And the same is true for the 82nd
character, and so on. The only time we need to trigger a new screen
line is when we try to actually print something that will be displayed
to the user.
It turns out, I think, that we only want to check for a full line
inside the block that I mentioned above, the one I just updated to use
mbrlen. This is the only place where printable content is copied from
LINEBUFFER into M_WRAP_BUFFER.
There are still some edge cases here that are not being handled
correctly: some Unicode characters are non-printable, or stack on the
previous character, occupying zero width. And even some of the basic
ASCII characters that we don't cover are non-printable. But I'm
choosing to ignore all of these for now. These cases were broken
before this patch, and continue to be broken afterwards. Broken here
simply means that including these characters in GDB's output will
confuse the pager, likely resulting in the pager triggering too
early.
But for printable characters that are 1 terminal character wide,
things should now be a little better. The pager will trigger only
when we try to add the first character that wraps onto the next screen
line. In our original problem with the box, this means that when the
top border of the box is printed this will no longer cause an
increment of LINES_PRINTED. When the newline is added then this does
finish off the current line and increments LINES_PRINTED as expected.
We now only increment LINES_PRINTED once for each line of the box,
rather than twice, and so the pager no longer needs to trigger during
startup.
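The deferred check can be sketched with a toy model along the same
lines. Again this is not the code from this patch:
new_lines_counted is a hypothetical function that assumes one byte per
screen column, handles only '\n' and '\r' specially, and leaves out
tabs, buffering and styling.

```cpp
#include <cassert>
#include <string>

// Toy model of the deferred full-line check, not GDB's actual code.
// Returns the number of screen lines counted for TEXT on a terminal
// that is WIDTH characters wide.
static int
new_lines_counted (const std::string &text, int width)
{
  int chars_printed = 0;
  int lines_printed = 0;

  for (char c : text)
    {
      if (c == '\n')
        {
          // A newline always finishes the current line, even an
          // empty one.
          chars_printed = 0;
          lines_printed++;
        }
      else if (c == '\r')
        {
          // Back to the start of the same screen line.
          chars_printed = 0;
        }
      else
        {
          // Only wrap when another printable character arrives
          // after the line is already full.
          if (chars_printed >= width)
            {
              chars_printed = 0;
              lines_printed++;
            }
          chars_printed++;
        }
    }

  return lines_printed;
}
```

In this model a line of exactly 80 characters followed by a newline
is counted once on an 80 column terminal, the 81st printable character
is what starts a second line, and filling a line, emitting '\r', then
filling the line again still counts as a single line.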
To make the code cleaner, I moved the full line check into a new
function, pager_file::check_for_overfull_line(), and added comments in
the LINEBUFFER handling code to explain when the new function should
be called.
The test gdb.python/py-color-pagination.exp needed some updates after
this patch, as the expected output was tied to how the pager used to
operate. Now that we defer starting a new line until we see some
printable characters, GDB is better able to coalesce adjacent style
changes; this accounts for the large volume of changes to this test.
I've also added a couple of new tests to the
gdb.python/py-color-pagination.exp file. An initial failed idea I had
for fixing this problem caused a bug such that empty lines would never
trigger the pager; there's a new test that covers this case.
There's also a test that lines can exceed the screen width so long as
the extra content is non-printable; the test fills a line then prints
some tabs and a '\r' before filling the same line a second time.
There are also a couple of new pager self tests that I added. I wrote
these because early on while investigating this issue, I thought I'd
spotted a bug in pager_file::wrap_here, so I wrote these tests to
expose that bug. It turned out not to be a bug, but a gap in my
understanding. I think retaining these tests isn't going to hurt, and
increases coverage.
Approved-By: Tom Tromey <tom@tromey.com>