@c \input texinfo
@c %**start of header
@c @setfilename agentexpr.info
@c @settitle GDB Agent Expressions
@c @setchapternewpage off
@c %**end of header

@c This file is part of the GDB manual.
@c
@c Copyright (C) 2003--2024 Free Software Foundation, Inc.
@c
@c See the file gdb.texinfo for copying conditions.

@node Agent Expressions
@appendix The GDB Agent Expression Mechanism

In some applications, it is not feasible for the debugger to interrupt
the program's execution long enough for the developer to learn anything
helpful about its behavior. If the program's correctness depends on its
real-time behavior, delays introduced by a debugger might cause the
program to fail, even when the code itself is correct. It is useful to
be able to observe the program's behavior without interrupting it.

Using GDB's @code{trace} and @code{collect} commands, the user can
specify locations in the program, and arbitrary expressions to evaluate
when those locations are reached. Later, using the @code{tfind}
command, she can examine the values those expressions had when the
program hit the tracepoints. The expressions may also denote objects
in memory --- structures or arrays, for example --- whose values GDB
should record; while visiting a particular tracepoint, the user may
inspect those objects as if they were in memory at that moment.
However, because GDB records these values without interacting with the
user, it can do so quickly and unobtrusively, hopefully not disturbing
the program's behavior.

When GDB is debugging a remote target, the GDB @dfn{agent} code running
on the target computes the values of the expressions itself. To avoid
having a full symbolic expression evaluator on the agent, GDB translates
expressions in the source language into a simpler bytecode language, and
then sends the bytecode to the agent; the agent then executes the
bytecode, and records the values for GDB to retrieve later.

The bytecode language is simple; there are forty-odd opcodes, the bulk
of which are the usual vocabulary of C operators (addition, subtraction,
shifts, and so on) and various sizes of literals and memory reference
operations. The bytecode interpreter operates strictly on machine-level
values --- various sizes of integers and floating point numbers --- and
requires no information about types or symbols; thus, the interpreter's
internal data structures are simple, and each bytecode requires only a
few native machine instructions to implement it. The interpreter is
small, and strict limits on the memory and time required to evaluate an
expression are easy to determine, making it suitable for use by the
debugging agent in real-time applications.

@menu
* General Bytecode Design::     Overview of the interpreter.
* Bytecode Descriptions::       What each one does.
* Using Agent Expressions::     How agent expressions fit into the big picture.
* Varying Target Capabilities:: How to discover what the target can do.
* Rationale::                   Why we did it this way.
@end menu


@c @node Rationale
@c @section Rationale


@node General Bytecode Design
@section General Bytecode Design

The agent represents bytecode expressions as an array of bytes. Each
instruction is one byte long (thus the term @dfn{bytecode}). Some
instructions are followed by operand bytes; for example, the @code{goto}
instruction is followed by a destination for the jump.

The bytecode interpreter is a stack-based machine; most instructions pop
their operands off the stack, perform some operation, and push the
result back on the stack for the next instruction to consume. Each
element of the stack may contain either an integer or a floating point
value; these values are as many bits wide as the largest integer that
can be directly manipulated in the source language. Stack elements
carry no record of their type; bytecode could push a value as an
integer, then pop it as a floating point value. However, GDB will not
generate code which does this. In C, one might define the type of a
stack element as follows:
@example
union agent_val @{
  LONGEST l;
  DOUBLEST d;
@};
@end example
@noindent
where @code{LONGEST} and @code{DOUBLEST} are @code{typedef} names for
the largest integer and floating point types on the machine.

By the time the bytecode interpreter reaches the end of the expression,
the value of the expression should be the only value left on the stack.
For tracing applications, @code{trace} bytecodes in the expression will
have recorded the necessary data, and the value on the stack may be
discarded. For other applications, like conditional breakpoints, the
value may be useful.

Separate from the stack, the interpreter has two registers:
@table @code
@item pc
The address of the next bytecode to execute.

@item start
The address of the start of the bytecode expression, necessary for
interpreting the @code{goto} and @code{if_goto} instructions.

@end table
@noindent
Neither of these registers is directly visible to the bytecode language
itself, but they are useful for defining the meanings of the bytecode
operations.

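The stack machine described so far can be sketched in C roughly as
follows. This is only an illustration under stated assumptions, not
the code of any real agent: the names @code{agent_eval},
@code{AGENT_STACK_MAX} and @code{enum agent_status} are hypothetical,
@code{long long} stands in for @code{LONGEST}, and only the
@code{const8}, @code{add}, @code{mul} and @code{end} bytecodes are
handled, using the opcode values given in the Bytecode Descriptions
section.

```c
#include <stddef.h>

/* Hypothetical limit; a real agent would choose its own stack depth.  */
#define AGENT_STACK_MAX 32

enum agent_status
{
  AGENT_OK, AGENT_ERR_STACK, AGENT_ERR_OPCODE, AGENT_ERR_RANGE
};

/* Evaluate the bytecode in code[0..len).  On success, store the top of
   the stack in *result.  long long stands in for LONGEST.  */
enum agent_status
agent_eval (const unsigned char *code, size_t len, long long *result)
{
  long long stack[AGENT_STACK_MAX];
  size_t sp = 0;                      /* number of live stack elements */
  const unsigned char *start = code;  /* the `start' register */
  const unsigned char *pc = start;    /* the `pc' register */

  for (;;)
    {
      unsigned char op;

      if ((size_t) (pc - start) >= len)
        return AGENT_ERR_RANGE;       /* ran off the end without `end' */
      op = *pc++;

      switch (op)
        {
        case 0x22:                    /* const8 n */
          if ((size_t) (pc - start) >= len)
            return AGENT_ERR_RANGE;
          if (sp >= AGENT_STACK_MAX)
            return AGENT_ERR_STACK;
          stack[sp++] = *pc++;        /* pushed without sign extension */
          break;

        case 0x02:                    /* add: a b => a+b */
        case 0x04:                    /* mul: a b => a*b */
          if (sp < 2)
            return AGENT_ERR_STACK;
          {
            /* Use unsigned arithmetic so that overflow wraps.  */
            unsigned long long b = (unsigned long long) stack[--sp];
            unsigned long long a = (unsigned long long) stack[--sp];
            stack[sp++] = (long long) (op == 0x02 ? a + b : a * b);
          }
          break;

        case 0x27:                    /* end */
          if (sp < 1)
            return AGENT_ERR_STACK;
          *result = stack[sp - 1];
          return AGENT_OK;

        default:
          return AGENT_ERR_OPCODE;
        }
    }
}
```
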
There are no instructions to perform side effects on the running
program, or call the program's functions; we assume that these
expressions are only used for unobtrusive debugging, not for patching
the running code.

Most bytecode instructions do not distinguish between the various sizes
of values, and operate on full-width values; the upper bits of the
values are simply ignored, since they do not usually make a difference
to the value computed. The exceptions to this rule are:
@table @asis

@item memory reference instructions (@code{ref}@var{n})
There are distinct instructions to fetch different word sizes from
memory. Once on the stack, however, the values are treated as full-size
integers. They may need to be sign-extended; the @code{ext} instruction
exists for this purpose.

@item the sign-extension instruction (@code{ext} @var{n})
This instruction needs to know which portion of its operand is to be
extended to occupy the full length of the word.

@end table

If the interpreter is unable to evaluate an expression completely for
some reason (a memory location is inaccessible, or a divisor is zero,
for example), we say that interpretation ``terminates with an error''.
This means that the problem is reported back to the interpreter's caller
in some helpful way. In general, code using agent expressions should
assume that they may attempt to divide by zero, fetch arbitrary memory
locations, and misbehave in other ways.

Even complicated C expressions compile to a few bytecode instructions;
for example, the expression @code{x + y * z} would typically produce
code like the following, assuming that @code{x} and @code{y} live in
registers, and @code{z} is a global variable holding a 32-bit
@code{int}:
@example
reg 1
reg 2
const32 @i{address of z}
ref32
ext 32
mul
add
end
@end example

In detail, these mean:
@table @code

@item reg 1
Push the value of register 1 (presumably holding @code{x}) onto the
stack.

@item reg 2
Push the value of register 2 (holding @code{y}).

@item const32 @i{address of z}
Push the address of @code{z} onto the stack.

@item ref32
Fetch a 32-bit word from the address at the top of the stack; replace
the address on the stack with the value. Thus, we replace the address
of @code{z} with @code{z}'s value.

@item ext 32
Sign-extend the value on the top of the stack from 32 bits to full
length. This is necessary because @code{z} is a signed integer.

@item mul
Pop the top two numbers on the stack, multiply them, and push their
product. Now the top of the stack contains the value of the expression
@code{y * z}.

@item add
Pop the top two numbers, add them, and push the sum. Now the top of the
stack contains the value of @code{x + y * z}.

@item end
Stop executing; the value left on the stack top is the value to be
recorded.

@end table

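Using the opcode values and operand encodings given in the Bytecode
Descriptions section, the sequence for @code{x + y * z} might be laid
out in memory as in the sketch below. The register numbers 1 and 2 come
from the example; the address @code{0x00401000} for @code{z} and the
array name are purely hypothetical.

```c
/* Encoded form of `x + y * z'.  Multi-byte operands are stored most
   significant byte first.  0x00401000 is a made-up address for z.  */
static const unsigned char x_plus_y_times_z[] = {
  0x26, 0x00, 0x01,               /* reg 1 */
  0x26, 0x00, 0x02,               /* reg 2 */
  0x24, 0x00, 0x40, 0x10, 0x00,   /* const32 0x00401000 */
  0x19,                           /* ref32 */
  0x16, 0x20,                     /* ext 32 */
  0x04,                           /* mul */
  0x02,                           /* add */
  0x27                            /* end */
};
```

With these encodings the whole expression occupies 17 bytes.
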

@node Bytecode Descriptions
@section Bytecode Descriptions

Each bytecode description has the following form:

@table @asis

@item @code{add} (0x02): @var{a} @var{b} @result{} @var{a+b}

Pop the top two stack items, @var{a} and @var{b}, as integers; push
their sum, as an integer.

@end table

In this example, @code{add} is the name of the bytecode, and
@code{(0x02)} is the one-byte value used to encode the bytecode, in
hexadecimal. The phrase ``@var{a} @var{b} @result{} @var{a+b}'' shows
the stack before and after the bytecode executes. Beforehand, the stack
must contain at least two values, @var{a} and @var{b}; since the top of
the stack is to the right, @var{b} is on the top of the stack, and
@var{a} is underneath it. After execution, the bytecode will have
popped @var{a} and @var{b} from the stack, and replaced them with a
single value, @var{a+b}. There may be other values on the stack below
those shown, but the bytecode affects only those shown.

Here is another example:

@table @asis

@item @code{const8} (0x22) @var{n}: @result{} @var{n}
Push the 8-bit integer constant @var{n} on the stack, without sign
extension.

@end table

In this example, the bytecode @code{const8} takes an operand @var{n}
directly from the bytecode stream; the operand follows the @code{const8}
bytecode itself. We write any such operands immediately after the name
of the bytecode, before the colon, and describe the exact encoding of
the operand in the bytecode stream in the body of the bytecode
description.

For the @code{const8} bytecode, there are no stack items given before
the @result{}; this simply means that the bytecode consumes no values
from the stack. If a bytecode consumes no values, or produces no
values, the list on either side of the @result{} may be empty.

If a value is written as @var{a}, @var{b}, or @var{n}, then the bytecode
treats it as an integer. If a value is written as @var{addr}, then the
bytecode treats it as an address.

We do not fully describe the floating point operations here; although
this design can be extended in a clean way to handle floating point
values, they are not of immediate interest to the customer, so we avoid
describing them, to save time.


@table @asis

@item @code{float} (0x01): @result{}

Prefix for floating-point bytecodes. Not implemented yet.

@item @code{add} (0x02): @var{a} @var{b} @result{} @var{a+b}
Pop two integers from the stack, and push their sum, as an integer.

@item @code{sub} (0x03): @var{a} @var{b} @result{} @var{a-b}
Pop two integers from the stack, subtract the top value from the
next-to-top value, and push the difference.

@item @code{mul} (0x04): @var{a} @var{b} @result{} @var{a*b}
Pop two integers from the stack, multiply them, and push the product on
the stack. Note that, when one multiplies two @var{n}-bit numbers
yielding another @var{n}-bit number, it is irrelevant whether the
numbers are signed or not; the results are the same.

@item @code{div_signed} (0x05): @var{a} @var{b} @result{} @var{a/b}
Pop two signed integers from the stack; divide the next-to-top value by
the top value, and push the quotient. If the divisor is zero, terminate
with an error.

@item @code{div_unsigned} (0x06): @var{a} @var{b} @result{} @var{a/b}
Pop two unsigned integers from the stack; divide the next-to-top value
by the top value, and push the quotient. If the divisor is zero,
terminate with an error.

@item @code{rem_signed} (0x07): @var{a} @var{b} @result{} @var{a modulo b}
Pop two signed integers from the stack; divide the next-to-top value by
the top value, and push the remainder. If the divisor is zero,
terminate with an error.

@item @code{rem_unsigned} (0x08): @var{a} @var{b} @result{} @var{a modulo b}
Pop two unsigned integers from the stack; divide the next-to-top value
by the top value, and push the remainder. If the divisor is zero,
terminate with an error.

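As an illustration of the zero-divisor rule, the division bytecodes
could be implemented along these lines. This is a sketch only: the
function and enumerator names are hypothetical, and @code{long long}
stands in for @code{LONGEST}. The descriptions above do not say what
should happen when the most negative value is divided by @minus{}1;
treating that case as an error is an assumption made here because the
division is undefined in C.

```c
#include <limits.h>

/* Hypothetical status codes.  */
enum div_status { DIV_OK, DIV_ERR_ZERO, DIV_ERR_OVERFLOW };

/* div_signed: a b => a/b.  long long stands in for LONGEST.  */
enum div_status
agent_div_signed (long long a, long long b, long long *quotient)
{
  if (b == 0)
    return DIV_ERR_ZERO;          /* "terminate with an error" */
  if (a == LLONG_MIN && b == -1)
    return DIV_ERR_OVERFLOW;      /* undefined in C; this policy is assumed */
  *quotient = a / b;
  return DIV_OK;
}

/* rem_unsigned: a b => a modulo b.  */
enum div_status
agent_rem_unsigned (unsigned long long a, unsigned long long b,
                    unsigned long long *remainder)
{
  if (b == 0)
    return DIV_ERR_ZERO;
  *remainder = a % b;
  return DIV_OK;
}
```
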
@item @code{lsh} (0x09): @var{a} @var{b} @result{} @var{a<<b}
Pop two integers from the stack; let @var{a} be the next-to-top value,
and @var{b} be the top value. Shift @var{a} left by @var{b} bits, and
push the result.

@item @code{rsh_signed} (0x0a): @var{a} @var{b} @result{} @code{(signed)}@var{a>>b}
Pop two integers from the stack; let @var{a} be the next-to-top value,
and @var{b} be the top value. Shift @var{a} right by @var{b} bits,
inserting copies of the top bit at the high end, and push the result.

@item @code{rsh_unsigned} (0x0b): @var{a} @var{b} @result{} @var{a>>b}
Pop two integers from the stack; let @var{a} be the next-to-top value,
and @var{b} be the top value. Shift @var{a} right by @var{b} bits,
inserting zero bits at the high end, and push the result.

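The three shift bytecodes might be written in portable C as sketched
below. Two things here are assumptions rather than part of the
descriptions above: stack elements are taken to be 64 bits wide
(@code{AGENT_BITS}), and a shift count of 64 or more is taken to shift
every original bit out, since C itself leaves such shifts undefined.
The function names are hypothetical.

```c
#include <stdint.h>

/* Assumed width of a stack element, in bits.  */
#define AGENT_BITS 64

/* lsh: a b => a<<b.  */
uint64_t
agent_lsh (uint64_t a, uint64_t b)
{
  return b >= AGENT_BITS ? 0 : a << b;
}

/* rsh_unsigned: zero bits enter at the high end.  */
uint64_t
agent_rsh_unsigned (uint64_t a, uint64_t b)
{
  return b >= AGENT_BITS ? 0 : a >> b;
}

/* rsh_signed: copies of the top bit enter at the high end.  */
uint64_t
agent_rsh_signed (uint64_t a, uint64_t b)
{
  uint64_t sign = a >> (AGENT_BITS - 1);      /* 0 or 1 */
  uint64_t fill = sign ? ~(uint64_t) 0 : 0;   /* all ones if negative */

  if (b >= AGENT_BITS)
    return fill;
  if (b == 0)
    return a;
  return (a >> b) | (fill << (AGENT_BITS - b));
}
```

Building the signed shift from unsigned operations avoids relying on
how a particular C compiler shifts negative values.
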
@item @code{log_not} (0x0e): @var{a} @result{} @var{!a}
Pop an integer from the stack; if it is zero, push the value one;
otherwise, push the value zero.

@item @code{bit_and} (0x0f): @var{a} @var{b} @result{} @var{a&b}
Pop two integers from the stack, and push their bitwise @code{and}.

@item @code{bit_or} (0x10): @var{a} @var{b} @result{} @var{a|b}
Pop two integers from the stack, and push their bitwise @code{or}.

@item @code{bit_xor} (0x11): @var{a} @var{b} @result{} @var{a^b}
Pop two integers from the stack, and push their bitwise
exclusive-@code{or}.

@item @code{bit_not} (0x12): @var{a} @result{} @var{~a}
Pop an integer from the stack, and push its bitwise complement.

@item @code{equal} (0x13): @var{a} @var{b} @result{} @var{a=b}
Pop two integers from the stack; if they are equal, push the value one;
otherwise, push the value zero.

@item @code{less_signed} (0x14): @var{a} @var{b} @result{} @var{a<b}
Pop two signed integers from the stack; if the next-to-top value is less
than the top value, push the value one; otherwise, push the value zero.

@item @code{less_unsigned} (0x15): @var{a} @var{b} @result{} @var{a<b}
Pop two unsigned integers from the stack; if the next-to-top value is less
than the top value, push the value one; otherwise, push the value zero.

@item @code{ext} (0x16) @var{n}: @var{a} @result{} @var{a}, sign-extended from @var{n} bits
Pop an unsigned value from the stack; treating it as an @var{n}-bit
twos-complement value, extend it to full length. This means that all
bits to the left of bit @var{n-1} (where the least significant bit is bit
0) are set to the value of bit @var{n-1}. Note that @var{n} may be
larger than or equal to the width of the stack elements of the bytecode
engine; in this case, the bytecode should have no effect.

The number of source bits to preserve, @var{n}, is encoded as a single
byte unsigned integer following the @code{ext} bytecode.

@item @code{zero_ext} (0x2a) @var{n}: @var{a} @result{} @var{a}, zero-extended from @var{n} bits
Pop an unsigned value from the stack; zero all but the bottom @var{n}
bits.

The number of source bits to preserve, @var{n}, is encoded as a single
byte unsigned integer following the @code{zero_ext} bytecode.

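The two extension bytecodes can be sketched as follows, assuming 64-bit
stack elements. The function names and @code{AGENT_BITS} are
hypothetical. The behavior of @code{ext} for @var{n} equal to zero, and
of @code{zero_ext} for @var{n} at least the element width, is not
specified above; the sketch simply leaves the value unchanged in those
cases.

```c
#include <stdint.h>

/* Assumed width of a stack element, in bits.  */
#define AGENT_BITS 64

/* ext n: sign-extend a from n bits to full width.  */
uint64_t
agent_ext (uint64_t a, unsigned n)
{
  uint64_t sign_bit, low_mask;

  if (n == 0 || n >= AGENT_BITS)
    return a;                        /* no effect; n == 0 is an assumption */
  sign_bit = (uint64_t) 1 << (n - 1);
  low_mask = (sign_bit << 1) - 1;    /* the bottom n bits */
  a &= low_mask;
  return (a ^ sign_bit) - sign_bit;  /* copies bit n-1 into the upper bits */
}

/* zero_ext n: zero all but the bottom n bits.  */
uint64_t
agent_zero_ext (uint64_t a, unsigned n)
{
  if (n >= AGENT_BITS)
    return a;                        /* nothing to clear; an assumption */
  return a & (((uint64_t) 1 << n) - 1);
}
```
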
@item @code{ref8} (0x17): @var{addr} @result{} @var{a}
@itemx @code{ref16} (0x18): @var{addr} @result{} @var{a}
@itemx @code{ref32} (0x19): @var{addr} @result{} @var{a}
@itemx @code{ref64} (0x1a): @var{addr} @result{} @var{a}
Pop an address @var{addr} from the stack. For bytecode
@code{ref}@var{n}, fetch an @var{n}-bit value from @var{addr}, using the
natural target endianness. Push the fetched value as an unsigned
integer.

Note that @var{addr} may not be aligned in any particular way; the
@code{ref@var{n}} bytecodes should operate correctly for any address.

If attempting to access memory at @var{addr} would cause a processor
exception of some sort, terminate with an error.

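One common way to fetch a value from an address of unknown alignment in
C is to copy it byte by byte with @code{memcpy}, as in this sketch of
@code{ref32}. It assumes the agent runs in the target's own address
space, so that the host byte order is the target byte order; the name
@code{agent_ref32} is hypothetical. The sketch does not show how an
inaccessible address would be detected, which a real agent would still
have to handle in order to terminate with an error.

```c
#include <stdint.h>
#include <string.h>

/* ref32: fetch a 32-bit value, in the machine's own byte order, from
   an address with no particular alignment.  */
uint64_t
agent_ref32 (const void *addr)
{
  uint32_t v;

  memcpy (&v, addr, sizeof v);   /* byte copy, safe for unaligned addr */
  return v;                      /* widened as an unsigned integer */
}
```
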
@item @code{ref_float} (0x1b): @var{addr} @result{} @var{d}
@itemx @code{ref_double} (0x1c): @var{addr} @result{} @var{d}
@itemx @code{ref_long_double} (0x1d): @var{addr} @result{} @var{d}
@itemx @code{l_to_d} (0x1e): @var{a} @result{} @var{d}
@itemx @code{d_to_l} (0x1f): @var{d} @result{} @var{a}
Not implemented yet.

@item @code{dup} (0x28): @var{a} @result{} @var{a} @var{a}
Push another copy of the stack's top element.

@item @code{swap} (0x2b): @var{a} @var{b} @result{} @var{b} @var{a}
Exchange the top two items on the stack.

@item @code{pop} (0x29): @var{a} @result{}
Discard the top value on the stack.

@item @code{pick} (0x32) @var{n}: @var{a} @dots{} @var{b} @result{} @var{a} @dots{} @var{b} @var{a}
Duplicate an item from the stack and push it on the top of the stack.
@var{n}, a single byte, indicates the stack item to copy. If @var{n}
is zero, this is the same as @code{dup}; if @var{n} is one, it copies
the item under the top item, etc. If @var{n} exceeds the number of
items on the stack, terminate with an error.

@item @code{rot} (0x33): @var{a} @var{b} @var{c} @result{} @var{c} @var{a} @var{b}
Rotate the top three items on the stack. The top item (@var{c})
becomes the third item, the next-to-top item (@var{b}) becomes the top
item, and the third item from the top (@var{a}) becomes the next-to-top
item.

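The index arithmetic for @code{pick} and @code{rot} is easy to get
wrong, so here is a small sketch. The stack layout
(@code{struct agent_stack}, with a fixed depth of 32) and the function
names are hypothetical, and @code{long long} stands in for
@code{LONGEST}. The sketch rejects any @var{n} that does not name an
existing item, which includes the case where @var{n} equals the number
of items.

```c
#include <stddef.h>

/* Hypothetical stack: v[0] is the bottom, v[sp - 1] is the top.  */
struct agent_stack
{
  long long v[32];
  size_t sp;
};

/* pick n: push a copy of the item n places below the top.
   Returns 0 on success, -1 to "terminate with an error".  */
int
agent_pick (struct agent_stack *s, unsigned n)
{
  if (n >= s->sp || s->sp >= 32)
    return -1;
  s->v[s->sp] = s->v[s->sp - 1 - n];
  s->sp++;
  return 0;
}

/* rot: a b c => c a b.  */
int
agent_rot (struct agent_stack *s)
{
  long long c;

  if (s->sp < 3)
    return -1;
  c = s->v[s->sp - 1];
  s->v[s->sp - 1] = s->v[s->sp - 2];   /* b becomes the top */
  s->v[s->sp - 2] = s->v[s->sp - 3];   /* a becomes next-to-top */
  s->v[s->sp - 3] = c;                 /* c becomes the third item */
  return 0;
}
```
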
@item @code{if_goto} (0x20) @var{offset}: @var{a} @result{}
Pop an integer off the stack; if it is non-zero, branch to the given
offset in the bytecode string. Otherwise, continue to the next
instruction in the bytecode stream. In other words, if @var{a} is
non-zero, set the @code{pc} register to @code{start} + @var{offset}.
Thus, an offset of zero denotes the beginning of the expression.

The @var{offset} is stored as a sixteen-bit unsigned value, stored
immediately following the @code{if_goto} bytecode. It is always stored
most significant byte first, regardless of the target's normal
endianness. The offset is not guaranteed to fall at any particular
alignment within the bytecode stream; thus, on machines where fetching a
16-bit value at an unaligned address raises an exception, you should
fetch the offset one byte at a time.

@item @code{goto} (0x21) @var{offset}: @result{}
Branch unconditionally to @var{offset}; in other words, set the
@code{pc} register to @code{start} + @var{offset}.

The offset is stored in the same way as for the @code{if_goto} bytecode.

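Fetching the branch offset one byte at a time, and applying it relative
to @code{start}, might look like the following sketch. The function
names are hypothetical. A real interpreter would presumably also check
that the resulting address lies inside the expression before jumping;
that check is omitted here.

```c
#include <stddef.h>

/* Read a 16-bit operand stored most significant byte first, one byte
   at a time, so that the read works at any alignment.  */
unsigned
agent_fetch_u16 (const unsigned char *p)
{
  return ((unsigned) p[0] << 8) | p[1];
}

/* if_goto: PC points at the operand and A is the popped value.
   Return the new pc.  START is the first byte of the expression.  */
const unsigned char *
agent_if_goto (const unsigned char *start, const unsigned char *pc,
               long long a)
{
  if (a != 0)
    return start + agent_fetch_u16 (pc);   /* branch taken */
  return pc + 2;                           /* skip the operand */
}
```
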
@item @code{const8} (0x22) @var{n}: @result{} @var{n}
@itemx @code{const16} (0x23) @var{n}: @result{} @var{n}
@itemx @code{const32} (0x24) @var{n}: @result{} @var{n}
@itemx @code{const64} (0x25) @var{n}: @result{} @var{n}
Push the integer constant @var{n} on the stack, without sign extension.
To produce a small negative value, push a small twos-complement value,
and then sign-extend it using the @code{ext} bytecode.

The constant @var{n} is stored in the appropriate number of bytes
following the @code{const}@var{b} bytecode. The constant @var{n} is
always stored most significant byte first, regardless of the target's
normal endianness. The constant is not guaranteed to fall at any
particular alignment within the bytecode stream; thus, on machines where
fetching a 16-bit value at an unaligned address raises an exception, you
should fetch @var{n} one byte at a time.

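A single helper can decode the operand of any of the
@code{const}@var{b} bytecodes, as in this sketch; the name
@code{agent_fetch_const} is hypothetical, and the caller is assumed to
pass a width of 1, 2, 4 or 8 bytes.

```c
#include <stdint.h>

/* Read a constant NBYTES wide (1, 2, 4 or 8), stored most significant
   byte first, without sign extension.  */
uint64_t
agent_fetch_const (const unsigned char *p, unsigned nbytes)
{
  uint64_t n = 0;
  unsigned i;

  for (i = 0; i < nbytes; i++)
    n = (n << 8) | p[i];
  return n;
}
```

Because no sign extension takes place, the bytes @code{0x22 0xff} push
255 rather than @minus{}1; following them with @code{0x16 0x08}
(@code{ext 8}) yields @minus{}1.
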
@item @code{reg} (0x26) @var{n}: @result{} @var{a}
Push the value of register number @var{n}, without sign extension. The
registers are numbered following GDB's conventions.

The register number @var{n} is encoded as a 16-bit unsigned integer
immediately following the @code{reg} bytecode. It is always stored most
significant byte first, regardless of the target's normal endianness.
The register number is not guaranteed to fall at any particular
alignment within the bytecode stream; thus, on machines where fetching a
16-bit value at an unaligned address raises an exception, you should
fetch the register number one byte at a time.

@item @code{getv} (0x2c) @var{n}: @result{} @var{v}
Push the value of trace state variable number @var{n}, without sign
extension.

The variable number @var{n} is encoded as a 16-bit unsigned integer
immediately following the @code{getv} bytecode. It is always stored most
significant byte first, regardless of the target's normal endianness.
The variable number is not guaranteed to fall at any particular
alignment within the bytecode stream; thus, on machines where fetching a
16-bit value at an unaligned address raises an exception, you should
fetch the variable number one byte at a time.

@item @code{setv} (0x2d) @var{n}: @var{v} @result{} @var{v}
Set trace state variable number @var{n} to the value found on the top
of the stack. The stack is unchanged, so that the value is readily
available if the assignment is part of a larger expression. The
handling of @var{n} is as described for @code{getv}.

@item @code{trace} (0x0c): @var{addr} @var{size} @result{}
Record the contents of the @var{size} bytes at @var{addr} in a trace
buffer, for later retrieval by GDB.

@item @code{trace_quick} (0x0d) @var{size}: @var{addr} @result{} @var{addr}
Record the contents of the @var{size} bytes at @var{addr} in a trace
buffer, for later retrieval by GDB. @var{size} is a single byte
unsigned integer following the @code{trace_quick} opcode.

This bytecode is equivalent to the sequence @code{dup const8 @var{size}
trace}, but we provide it anyway to save space in bytecode strings.

@item @code{trace16} (0x30) @var{size}: @var{addr} @result{} @var{addr}
Identical to @code{trace_quick}, except that @var{size} is a 16-bit
big-endian unsigned integer, not a single byte. This should probably
have been named @code{trace_quick16}, for consistency.

@item @code{tracev} (0x2e) @var{n}: @result{} @var{a}
Record the value of trace state variable number @var{n} in the trace
buffer. The handling of @var{n} is as described for @code{getv}.

@item @code{tracenz} (0x2f): @var{addr} @var{size} @result{}
Record the bytes at @var{addr} in a trace buffer, for later retrieval
by GDB. Stop at either the first zero byte, or when @var{size} bytes
have been recorded, whichever occurs first.

@item @code{printf} (0x34) @var{numargs} @var{string}: @result{}
Do a formatted print, in the style of the C function @code{printf}.
The value of @var{numargs} is the number of arguments to expect on the
stack, while @var{string} is the format string, prefixed with a
two-byte length. The last byte of the string must be zero, and is
included in the length. The format string includes escaped sequences
just as it appears in C source, so for instance the format string
@code{"\t%d\n"} is six characters long, and the output will consist of
a tab character, a decimal number, and a newline. At the top of the
stack, above the values to be printed, this bytecode will pop a
``function'' and ``channel''. If the function is nonzero, then the
target may treat it as a function and call it, passing the channel as
a first argument, as with the C function @code{fprintf}. If the
function is zero, then the target may simply call a standard formatted
print function of its choice. In all, this bytecode pops 2 +
@var{numargs} stack elements, and pushes nothing.

@item @code{end} (0x27): @result{}
Stop executing bytecode; the result should be the top element of the
stack. If the purpose of the expression was to compute an lvalue or a
range of memory, then the next-to-top of the stack is the lvalue's
address, and the top of the stack is the lvalue's size, in bytes.

@end table


@node Using Agent Expressions
@section Using Agent Expressions

Agent expressions can be used in several different ways by @value{GDBN},
and the debugger can generate different bytecode sequences as appropriate.

One possibility is to do expression evaluation on the target rather
than the host, such as for the conditional of a conditional
tracepoint. In such a case, @value{GDBN} compiles the source
expression into a bytecode sequence that simply gets values from
registers or memory, does arithmetic, and returns a result.

Another way to use agent expressions is for tracepoint data
collection. @value{GDBN} generates a different bytecode sequence for
collection; in addition to bytecodes that do the calculation,
@value{GDBN} adds @code{trace} bytecodes to save the pieces of
memory that were used.

@itemize @bullet

@item
The user selects tracepoints in the program's code at which GDB should
collect data.

@item
The user specifies expressions to evaluate at each tracepoint. These
expressions may denote objects in memory, in which case those objects'
contents are recorded as the program runs, or computed values, in which
case the values themselves are recorded.

@item
GDB transmits the tracepoints and their associated expressions to the
GDB agent, running on the debugging target.

@item
The agent arranges to be notified when a tracepoint is hit.

@item
When execution on the target reaches a tracepoint, the agent evaluates
the expressions associated with that tracepoint, and records the
resulting values and memory ranges.

@item
Later, when the user selects a given trace event and inspects the
objects and expression values recorded, GDB talks to the agent to
retrieve recorded data as necessary to meet the user's requests. If the
user asks to see an object whose contents have not been recorded, GDB
reports an error.

@end itemize


@node Varying Target Capabilities
@section Varying Target Capabilities

Some targets don't support floating-point, and some would rather not
have to deal with @code{long long} operations. Also, different targets
will have different stack sizes, and different bytecode buffer lengths.

Thus, GDB needs a way to ask the target about itself. We haven't worked
out the details yet, but in general, GDB should be able to send the
target a packet asking it to describe itself. The reply should be a
packet whose length is explicit, so we can add new information to the
packet in future revisions of the agent, without confusing old versions
of GDB, and it should contain a version number. It should contain at
least the following information:

@itemize @bullet

@item
whether floating point is supported

@item
whether @code{long long} is supported

@item
maximum acceptable size of bytecode stack

@item
maximum acceptable length of bytecode expressions

@item
which registers are actually available for collection

@item
whether the target supports disabled tracepoints

@end itemize

@node Rationale
@section Rationale

Some of the design decisions apparent above are arguable.

@table @b

@item What about stack overflow/underflow?
GDB should be able to query the target to discover its stack size.
Given that information, GDB can determine at translation time whether a
given expression will overflow the stack. But this spec isn't about
what kinds of error-checking GDB ought to do.

@item Why are you doing everything in LONGEST?

Speed isn't important, but agent code size is; using LONGEST brings in a
bunch of support code to do things like division, etc. So this is a
serious concern.

First, note that you don't need different bytecodes for different
operand sizes. You can generate code without @emph{knowing} how big the
stack elements actually are on the target. If the target only supports
32-bit ints, and you don't send any 64-bit bytecodes, everything just
works. The observation here is that the MIPS and the Alpha have only
fixed-size registers, and you can still get C's semantics even though
most instructions only operate on full-sized words. You just need to
make sure everything is properly sign-extended at the right times. So
there is no need for 32- and 64-bit variants of the bytecodes. Just
implement everything using the largest size you support.

GDB should certainly check to see what sizes the target supports, so the
user can get an error earlier, rather than later. But this information
is not necessary for correctness.


@item Why don't you have @code{>} or @code{<=} operators?
I want to keep the interpreter small, and we don't need them. We can
combine the @code{less_} opcodes with @code{log_not}, and swap the order
of the operands, yielding all four asymmetrical comparison operators.
For example, @code{(x <= y)} is @code{! (x > y)}, which is @code{! (y <
x)}.

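For concreteness, the four signed comparisons could be encoded as
below, using the opcode values from the Bytecode Descriptions section.
The sketch assumes that the operands @var{a} and @var{b} are already on
the stack in that order and uses the @code{swap} bytecode to exchange
them; a compiler could equally well emit the operands in the opposite
order and omit the @code{swap}. The array names are hypothetical.

```c
/* Stack before each sequence: a b (b on top).  Signed variants shown.  */
static const unsigned char cmp_lt[] = { 0x14 };              /* a <  b: less_signed */
static const unsigned char cmp_gt[] = { 0x2b, 0x14 };        /* a >  b: swap less_signed */
static const unsigned char cmp_ge[] = { 0x14, 0x0e };        /* a >= b: less_signed log_not */
static const unsigned char cmp_le[] = { 0x2b, 0x14, 0x0e };  /* a <= b: swap less_signed log_not */
```
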
@item Why do you have @code{log_not}?
@itemx Why do you have @code{ext}?
@itemx Why do you have @code{zero_ext}?
These are all easily synthesized from other instructions, but I expect
them to be used frequently, and they're simple, so I include them to
keep bytecode strings short.

@code{log_not} is equivalent to @code{const8 0 equal}; it's used in half
the relational operators.

@code{ext @var{n}} is equivalent to @code{const8 @var{s-n} lsh const8
@var{s-n} rsh_signed}, where @var{s} is the size of the stack elements;
it follows @code{ref@var{m}} and @code{reg} bytecodes when the value
should be signed. See the next item.

@code{zero_ext @var{n}} is equivalent to @code{const@var{m} @var{mask}
bit_and}; it's used whenever we push the value of a register, because we
can't assume the upper bits of the register aren't garbage.

@item Why not have sign-extending variants of the @code{ref} operators?
Because that would double the number of @code{ref} operators, and we
need the @code{ext} bytecode anyway for accessing bitfields.

@item Why not have constant-address variants of the @code{ref} operators?
Because that would double the number of @code{ref} operators again, and
@code{const32 @var{address} ref32} is only one byte longer.

@item Why do the @code{ref@var{n}} operators have to support unaligned fetches?
GDB will generate bytecode that fetches multi-byte values at unaligned
addresses whenever the executable's debugging information tells it to.
Furthermore, GDB does not know the value the pointer will have when GDB
generates the bytecode, so it cannot determine whether a particular
fetch will be aligned or not.

In particular, structure bitfields may be several bytes long, but follow
no alignment rules; members of packed structures are not necessarily
aligned either.

In general, there are many cases where unaligned references occur in
correct C code, either at the programmer's explicit request, or at the
compiler's discretion. Thus, it is simpler to make the GDB agent
bytecodes work correctly in all circumstances than to make GDB guess in
each case whether the compiler did the usual thing.

@item Why are there no side-effecting operators?
Because our current client doesn't want them? That's a cheap answer. I
think the real answer is that I'm afraid of implementing function
calls. We should re-visit this issue after the present contract is
delivered.

@item Why aren't the @code{goto} ops PC-relative?
The interpreter has the base address around anyway for PC bounds
checking, and it seemed simpler.

@item Why is there only one offset size for the @code{goto} ops?
Offsets are currently sixteen bits. I'm not happy with this situation
either:

Suppose we have multiple branch ops with different offset sizes. As I
generate code left-to-right, all my jumps are forward jumps (there are
no loops in expressions), so I never know the target when I emit the
jump opcode. Thus, I have to either always assume the largest offset
size, or do jump relaxation on the code after I generate it, which seems
like a big waste of time.

I can imagine a reasonable expression being longer than 256 bytes. I
can't imagine one being longer than 64k. Thus, we need 16-bit offsets.
This kind of reasoning is so bogus, but relaxation is pathetic.

The other approach would be to generate code right-to-left. Then I'd
always know my offset size. That might be fun.

@item Where is the function call bytecode?

When we add side-effects, we should add this.

@item Why does the @code{reg} bytecode take a 16-bit register number?

Intel's IA-64 architecture has 128 general-purpose registers,
and 128 floating-point registers, and I'm sure it has some random
control registers.

@item Why do we need @code{trace} and @code{trace_quick}?
Because GDB needs to record all the memory contents and registers an
expression touches. If the user wants to evaluate an expression
@code{x->y->z}, the agent must record the values of @code{x} and
@code{x->y} as well as the value of @code{x->y->z}.

@item Don't the @code{trace} bytecodes make the interpreter less general?
They do mean that the interpreter contains special-purpose code, but
that doesn't mean the interpreter can only be used for that purpose. If
an expression doesn't use the @code{trace} bytecodes, they don't get in
its way.

@item Why doesn't @code{trace_quick} consume its arguments the way everything else does?
In general, you do want your operators to consume their arguments; it's
consistent, and generally reduces the amount of stack rearrangement
necessary. However, @code{trace_quick} is a kludge to save space; it
only exists so we needn't write @code{dup const8 @var{SIZE} trace}
before every memory reference. Therefore, it's okay for it not to
consume its arguments; it's meant for a specific context in which we
know exactly what it should do with the stack. If we're going to have a
kludge, it should be an effective kludge.

@item Why does @code{trace16} exist?
That opcode was added by the customer that contracted Cygnus for the
data tracing work. I personally think it is unnecessary; objects that
large will be quite rare, so it is okay to use @code{dup const16
@var{size} trace} in those cases.

Whatever we decide to do with @code{trace16}, we should at least leave
opcode 0x30 reserved, to remain compatible with the customer who added
it.

@end table