1 \input texinfo
2 @setfilename bfdint.info
3
4 @settitle BFD Internals
5 @iftex
6 @titlepage
7 @title{BFD Internals}
8 @author{Ian Lance Taylor}
9 @author{Cygnus Solutions}
10 @page
11 @end iftex
12
13 @node Top
14 @top BFD Internals
15 @raisesections
16 @cindex bfd internals
17
18 This document describes some BFD internal information which may be
19 helpful when working on BFD. It is very incomplete.
20
21 This document is not updated regularly, and may be out of date. It was
22 last modified on $Date$.
23
24 The initial version of this document was written by Ian Lance Taylor
25 @email{ian@@cygnus.com}.
26
27 @menu
28 * BFD glossary:: BFD glossary
29 * BFD guidelines:: BFD programming guidelines
30 * BFD target vector:: BFD target vector
31 * BFD generated files:: BFD generated files
32 * BFD multiple compilations:: Files compiled multiple times in BFD
33 * BFD relocation handling:: BFD relocation handling
34 * BFD ELF support:: BFD ELF support
35 * Index:: Index
36 @end menu
37
38 @node BFD glossary
39 @section BFD glossary
40 @cindex glossary for bfd
41 @cindex bfd glossary
42
43 This is a short glossary of some BFD terms.
44
45 @table @asis
46 @item a.out
47 The a.out object file format. The original Unix object file format.
48 Still used on SunOS, though not Solaris. Supports only three sections.
49
50 @item archive
51 A collection of object files produced and manipulated by the @samp{ar}
52 program.
53
54 @item BFD
55 The BFD library itself. Also, each object file, archive, or executable
56 opened by the BFD library has the type @samp{bfd *}, and is sometimes
57 referred to as a bfd.
58
59 @item COFF
60 The Common Object File Format. Used on Unix SVR3. Used by some
61 embedded targets, although ELF is normally better.
62
63 @item DLL
64 A shared library on Windows.
65
66 @item dynamic linker
67 When a program linked against a shared library is run, the dynamic
68 linker will locate the appropriate shared library and arrange to somehow
69 include it in the running image.
70
71 @item dynamic object
72 Another name for an ELF shared library.
73
74 @item ECOFF
75 The Extended Common Object File Format. Used on Alpha Digital Unix
76 (formerly OSF/1), as well as Ultrix and Irix 4. A variant of COFF.
77
78 @item ELF
79 The Executable and Linking Format. The object file format used on most
80 modern Unix systems, including GNU/Linux, Solaris, Irix, and SVR4. Also
81 used on many embedded systems.
82
83 @item executable
84 A program, with instructions and symbols, and perhaps dynamic linking
85 information. Normally produced by a linker.
86
87 @item NLM
88 NetWare Loadable Module. Used to describe the format of an object which
89 can be loaded into NetWare, which is some kind of PC based network server
90 program.
91
92 @item object file
93 A binary file including machine instructions, symbols, and relocation
94 information. Normally produced by an assembler.
95
96 @item object file format
97 The format of an object file. Typically object files and executables
98 for a particular system are in the same format, although executables
99 will not contain any relocation information.
100
101 @item PE
102 The Portable Executable format. This is the object file format used for
103 Windows (specifically, Win32) object files. It is based closely on
104 COFF, but has a few significant differences.
105
106 @item PEI
107 The Portable Executable Image format. This is the object file format
108 used for Windows (specifically, Win32) executables. It is very similar
109 to PE, but includes some additional header information.
110
111 @item relocations
112 Information used by the linker to adjust section contents. Also called
113 relocs.
114
115 @item section
116 Object files and executables are composed of sections. Sections have
117 optional data and optional relocation information.
118
119 @item shared library
120 A library of functions which may be used by many executables without
121 actually being linked into each executable. There are several different
122 implementations of shared libraries, each having slightly different
123 features.
124
125 @item symbol
126 Each object file and executable may have a list of symbols, often
127 referred to as the symbol table. A symbol is basically a name and an
128 address. There may also be some additional information like the type of
129 symbol, although the type of a symbol is normally something simple like
130 function or object, and should not be confused with the more complex C
131 notion of type. Typically every global function and variable in a C
132 program will have an associated symbol.
133
134 @item Win32
135 The current Windows API, implemented by Windows 95 and later and Windows
136 NT 3.51 and later, but not by Windows 3.1.
137
138 @item XCOFF
139 The eXtended Common Object File Format. Used on AIX. A variant of
140 COFF, with a completely different symbol table implementation.
141 @end table
142
143 @node BFD guidelines
144 @section BFD programming guidelines
145 @cindex bfd programming guidelines
146 @cindex programming guidelines for bfd
147 @cindex guidelines, bfd programming
148
149 There is a lot of poorly written and confusing code in BFD. New BFD
150 code should be written to a higher standard. Merely because some BFD
151 code is written in a particular manner does not mean that you should
152 emulate it.
153
154 Here are some general BFD programming guidelines:
155
156 @itemize @bullet
157 @item
158 Follow the GNU coding standards.
159
160 @item
161 Avoid global variables. We ideally want BFD to be fully reentrant, so
162 that it can be used in multiple threads. All uses of global or static
163 variables interfere with that. Initialized constant variables are OK,
164 and they should be explicitly marked with const. Instead of global
165 variables, use data attached to a BFD or to a linker hash table.
166
167 @item
168 All externally visible functions should have names which start with
169 @samp{bfd_}. All such functions should be declared in some header file,
170 typically @file{bfd.h}. See, for example, the various declarations near
171 the end of @file{bfd-in.h}, which mostly declare functions required by
172 specific linker emulations.
173
174 @item
175 All functions which need to be visible from one file to another within
176 BFD, but should not be visible outside of BFD, should start with
177 @samp{_bfd_}. Although external names beginning with @samp{_} are
178 prohibited by the ANSI standard, in practice this usage will always
179 work, and it is required by the GNU coding standards.
180
181 @item
182 Always remember that people can compile using @samp{--enable-targets} to build
183 several, or all, targets at once. It must be possible to link together
184 the files for all targets.
185
186 @item
187 BFD code should compile with few or no warnings using @samp{gcc -Wall}.
188 Some warnings are OK, like the absence of certain function declarations
189 which may or may not be declared in system header files. Warnings about
190 ambiguous expressions and the like should always be fixed.
191 @end itemize
192
193 @node BFD target vector
194 @section BFD target vector
195 @cindex bfd target vector
196 @cindex target vector in bfd
197
198 BFD supports multiple object file formats by using the @dfn{target
199 vector}. This is simply a set of function pointers which implement
200 behaviour that is specific to a particular object file format.
201
202 In this section I list all of the entries in the target vector and
203 describe what they do.
204
205 @menu
206 * BFD target vector miscellaneous:: Miscellaneous constants
207 * BFD target vector swap:: Swapping functions
208 * BFD target vector format:: Format type dependent functions
209 * BFD_JUMP_TABLE macros:: BFD_JUMP_TABLE macros
210 * BFD target vector generic:: Generic functions
211 * BFD target vector copy:: Copy functions
212 * BFD target vector core:: Core file support functions
213 * BFD target vector archive:: Archive functions
214 * BFD target vector symbols:: Symbol table functions
215 * BFD target vector relocs:: Relocation support
216 * BFD target vector write:: Output functions
217 * BFD target vector link:: Linker functions
218 * BFD target vector dynamic:: Dynamic linking information functions
219 @end menu
220
221 @node BFD target vector miscellaneous
222 @subsection Miscellaneous constants
223
224 The target vector starts with a set of constants.
225
226 @table @samp
227 @item name
228 The name of the target vector. This is an arbitrary string. This is
229 how the target vector is named in command line options for tools which
230 use BFD, such as the @samp{-oformat} linker option.
231
232 @item flavour
233 A general description of the type of target. The following flavours are
234 currently defined:
235 @table @samp
236 @item bfd_target_unknown_flavour
237 Undefined or unknown.
238 @item bfd_target_aout_flavour
239 a.out.
240 @item bfd_target_coff_flavour
241 COFF.
242 @item bfd_target_ecoff_flavour
243 ECOFF.
244 @item bfd_target_elf_flavour
245 ELF.
246 @item bfd_target_ieee_flavour
247 IEEE-695.
248 @item bfd_target_nlm_flavour
249 NLM.
250 @item bfd_target_oasys_flavour
251 OASYS.
252 @item bfd_target_tekhex_flavour
253 Tektronix hex format.
254 @item bfd_target_srec_flavour
255 Motorola S-record format.
256 @item bfd_target_ihex_flavour
257 Intel hex format.
258 @item bfd_target_som_flavour
259 SOM (used on HP/UX).
260 @item bfd_target_os9k_flavour
261 os9000.
262 @item bfd_target_versados_flavour
263 VERSAdos.
264 @item bfd_target_msdos_flavour
265 MS-DOS.
266 @item bfd_target_evax_flavour
267 openVMS.
268 @end table
269
270 @item byteorder
271 The byte order of data in the object file. One of
272 @samp{BFD_ENDIAN_BIG}, @samp{BFD_ENDIAN_LITTLE}, or
273 @samp{BFD_ENDIAN_UNKNOWN}. The last would be used for a format such
274 as S-records which do not record the architecture of the data.
275
276 @item header_byteorder
277 The byte order of header information in the object file. Normally the
278 same as the @samp{byteorder} field, but there are certain cases where it
279 may be different.
280
281 @item object_flags
282 Flags which may appear in the @samp{flags} field of a BFD with this
283 format.
284
285 @item section_flags
286 Flags which may appear in the @samp{flags} field of a section within a
287 BFD with this format.
288
289 @item symbol_leading_char
290 A character which the C compiler normally puts before a symbol. For
291 example, an a.out compiler will typically generate the symbol
292 @samp{_foo} for a function named @samp{foo} in the C source, in which
293 case this field would be @samp{_}. If there is no such character, this
294 field will be @samp{0}.
295
296 @item ar_pad_char
297 The padding character to use at the end of an archive name. Normally
298 @samp{/}.
299
300 @item ar_max_namelen
301 The maximum length of a short name in an archive. Normally @samp{14}.
302
303 @item backend_data
304 A pointer to constant backend data. This is used by backends to store
305 whatever additional information they need to distinguish similar target
306 vectors which use the same sets of functions.
307 @end table
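
As a small illustration of how a caller might use the
@samp{symbol_leading_char} field described above, here is a sketch of a
helper which maps an object file symbol name back to the name used in the
C source. The function @samp{toy_strip_leading_char} is hypothetical and
is not part of BFD.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a BFD function.  Given a symbol name as it
   appears in the object file and the target's symbol_leading_char
   (0 if the target has none), return the name as written in C.  */
static const char *
toy_strip_leading_char (const char *name, char leading_char)
{
  assert (name != NULL);
  if (leading_char != 0 && name[0] == leading_char)
    return name + 1;
  return name;
}
```

With an a.out style leading character of @samp{_}, the name @samp{_foo}
maps to @samp{foo}. With a leading character of @samp{0}, the name is
returned unchanged.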
308
309 @node BFD target vector swap
310 @subsection Swapping functions
311
312 Every target vector has function pointers used for swapping information
313 in and out of the target representation. There are two sets of
314 functions: one for data information, and one for header information.
315 Each set has three sizes: 64-bit, 32-bit, and 16-bit. Each size has
316 three actual functions: put, get unsigned, and get signed.
317
318 These 18 functions are used to convert data between the host and target
319 representations.
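
The count of 18 comes from two sets (data and header) times three sizes
times three functions. The following is a minimal sketch of what the
32-bit functions might look like. The names, all prefixed with
@samp{toy_}, are made up for this example and are not BFD's own; only the
technique of assembling the value one byte at a time is the point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical swapping routines, not BFD's own.  They work a byte at a
   time, so the result does not depend on the byte order of the host.  */

/* Get unsigned, big-endian target.  */
static uint32_t
toy_getb32 (const unsigned char *p)
{
  assert (p != NULL);
  return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
         | ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/* Get unsigned, little-endian target.  */
static uint32_t
toy_getl32 (const unsigned char *p)
{
  assert (p != NULL);
  return ((uint32_t) p[3] << 24) | ((uint32_t) p[2] << 16)
         | ((uint32_t) p[1] << 8) | (uint32_t) p[0];
}

/* Get signed, big-endian target: widen, then sign extend by hand.  */
static int32_t
toy_getb_signed_32 (const unsigned char *p)
{
  int64_t wide = toy_getb32 (p);

  if (wide >= INT64_C (0x80000000))
    wide -= INT64_C (0x100000000);
  return (int32_t) wide;
}

/* Put, big-endian target.  */
static void
toy_putb32 (uint32_t v, unsigned char *p)
{
  assert (p != NULL);
  p[0] = (unsigned char) (v >> 24);
  p[1] = (unsigned char) (v >> 16);
  p[2] = (unsigned char) (v >> 8);
  p[3] = (unsigned char) v;
}
```

A target vector for a big-endian format would point its 32-bit entries at
functions like the big-endian ones here, and a little-endian format at
their little-endian counterparts.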
320
321 @node BFD target vector format
322 @subsection Format type dependent functions
323
324 Every target vector has three arrays of function pointers which are
325 indexed by the BFD format type. The BFD format types are as follows:
326 @table @samp
327 @item bfd_unknown
328 Unknown format. Not used for anything useful.
329 @item bfd_object
330 Object file.
331 @item bfd_archive
332 Archive file.
333 @item bfd_core
334 Core file.
335 @end table
336
337 The three arrays of function pointers are as follows:
338 @table @samp
339 @item bfd_check_format
340 Check whether the BFD is of a particular format (object file, archive
341 file, or core file) corresponding to this target vector. This is called
342 by the @samp{bfd_check_format} function when examining an existing BFD.
343 If the BFD matches the desired format, this function will initialize any
344 format specific information such as the @samp{tdata} field of the BFD.
345 This function must be called before any other BFD target vector function
346 on a file opened for reading.
347
348 @item bfd_set_format
349 Set the format of a BFD which was created for output. This is called by
350 the @samp{bfd_set_format} function after creating the BFD with a
351 function such as @samp{bfd_openw}. This function will initialize format
352 specific information required to write out an object file or whatever of
353 the given format. This function must be called before any other BFD
354 target vector function on a file opened for writing.
355
356 @item bfd_write_contents
357 Write out the contents of the BFD in the given format. This is called
358 by the @samp{bfd_close} function for a BFD opened for writing. This really
359 should not be an array selected by format type, as the
360 @samp{bfd_set_format} function provides all the required information.
361 In fact, BFD will fail if a different format is used when calling
362 through the @samp{bfd_set_format} and the @samp{bfd_write_contents}
363 arrays; fortunately, since @samp{bfd_close} gets it right, this is a
364 difficult error to make.
365 @end table
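
The idea of an array of function pointers indexed by format type can be
sketched as follows. This is a toy model, not BFD's definitions: the
@samp{toy_} names, the simplified function signature, and the @samp{OBJ}
magic string are all assumptions made for the example. The archive magic
string @samp{!<arch>} followed by a newline is the real one used by
@samp{ar}.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical mirror of the BFD format types.  */
enum toy_format { toy_unknown, toy_object, toy_archive, toy_core,
                  toy_type_end };

struct toy_target
{
  /* One checker per format type, indexed by enum toy_format.  */
  int (*check_format[toy_type_end]) (const char *contents);
};

static int
toy_never (const char *contents)
{
  (void) contents;
  return 0;
}

static int
toy_is_object (const char *contents)
{
  /* "OBJ" is an invented magic string for this example.  */
  return strncmp (contents, "OBJ", 3) == 0;
}

static int
toy_is_archive (const char *contents)
{
  return strncmp (contents, "!<arch>\n", 8) == 0;
}

/* This toy target has no core file support.  */
static const struct toy_target toy_vec =
{
  .check_format =
    {
      [toy_unknown] = toy_never,
      [toy_object] = toy_is_object,
      [toy_archive] = toy_is_archive,
      [toy_core] = toy_never
    }
};

/* Generic code picks the entry by format type and calls through it.  */
static int
toy_check_format (const struct toy_target *vec, enum toy_format format,
                  const char *contents)
{
  assert (vec != NULL && format < toy_type_end);
  return vec->check_format[format] (contents);
}
```

The caller names the format it expects, and the target vector decides
whether the file matches; a target without support for some format simply
supplies an entry which always fails.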
366
367 @node BFD_JUMP_TABLE macros
368 @subsection @samp{BFD_JUMP_TABLE} macros
369 @cindex @samp{BFD_JUMP_TABLE}
370
371 Most target vectors are defined using @samp{BFD_JUMP_TABLE} macros.
372 These macros take a single argument, which is a prefix applied to a set
373 of functions. The macros are then used to initialize the fields in the
374 target vector.
375
376 For example, the @samp{BFD_JUMP_TABLE_RELOCS} macro defines three
377 functions: @samp{_get_reloc_upper_bound}, @samp{_canonicalize_reloc},
378 and @samp{_bfd_reloc_type_lookup}. A reference like
379 @samp{BFD_JUMP_TABLE_RELOCS (foo)} will expand into three functions
380 prefixed with @samp{foo}: @samp{foo_get_reloc_upper_bound}, etc. The
381 @samp{BFD_JUMP_TABLE_RELOCS} macro will be placed such that those three
382 functions initialize the appropriate fields in the BFD target vector.
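
The mechanism is ordinary token pasting. The sketch below uses a
hypothetical @samp{TOY_JUMP_TABLE_RELOCS} macro and a three field
@samp{struct toy_target} to show the shape of it. The function signatures
are invented and much simpler than the real ones; only the three suffixes
are taken from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced target vector holding only the reloc fields.  */
struct toy_target
{
  long (*get_reloc_upper_bound) (size_t reloc_count);
  long (*canonicalize_reloc) (size_t reloc_count);
  const char *(*reloc_type_lookup) (int code);
};

/* Expands to three initializers, each the prefix pasted onto a fixed
   suffix, in the same order as the fields above.  */
#define TOY_JUMP_TABLE_RELOCS(NAME) \
  NAME##_get_reloc_upper_bound, \
  NAME##_canonicalize_reloc, \
  NAME##_bfd_reloc_type_lookup

static long
foo_get_reloc_upper_bound (size_t reloc_count)
{
  /* Room for one pointer per reloc plus a trailing NULL.  */
  return (long) ((reloc_count + 1) * sizeof (void *));
}

static long
foo_canonicalize_reloc (size_t reloc_count)
{
  /* A real backend would fill in an array; this just reports a count.  */
  return (long) reloc_count;
}

static const char *
foo_bfd_reloc_type_lookup (int code)
{
  assert (code >= 0);
  return code == 0 ? "TOY_NONE" : NULL;
}

/* The macro is placed where the three fields sit in the structure.  */
static const struct toy_target foo_vec =
{
  TOY_JUMP_TABLE_RELOCS (foo)
};
```

A second target which handles relocations in the same way would write
@samp{TOY_JUMP_TABLE_RELOCS (foo)} in its own vector and thereby share
the three functions.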
383
384 This is done because it turns out that many different target vectors can
385 share certain classes of functions. For example, archives are similar
386 on most platforms, so most target vectors can use the same archive
387 functions. Those target vectors all use @samp{BFD_JUMP_TABLE_ARCHIVE}
388 with the same argument, calling a set of functions which is defined in
389 @file{archive.c}.
390
391 Each of the @samp{BFD_JUMP_TABLE} macros is mentioned below along with
392 the description of the function pointers which it defines. The function
393 pointers will be described using the name without the prefix which the
394 @samp{BFD_JUMP_TABLE} macro defines. This name is normally the same as
395 the name of the field in the target vector structure. Any differences
396 will be noted.
397
398 @node BFD target vector generic
399 @subsection Generic functions
400 @cindex @samp{BFD_JUMP_TABLE_GENERIC}
401
402 The @samp{BFD_JUMP_TABLE_GENERIC} macro is used for some catch all
403 functions which don't easily fit into other categories.
404
405 @table @samp
406 @item _close_and_cleanup
407 Free any target specific information associated with the BFD. This is
408 called when any BFD is closed (the @samp{bfd_write_contents} function
409 mentioned earlier is only called for a BFD opened for writing). Most
410 targets use @samp{bfd_alloc} to allocate all target specific
411 information, and therefore don't have to do anything in this function.
412 This function pointer is typically set to
413 @samp{_bfd_generic_close_and_cleanup}, which simply returns true.
414
415 @item _bfd_free_cached_info
416 Free any cached information associated with the BFD which can be
417 recreated later if necessary. This is used to reduce the memory
418 consumption required by programs using BFD. This is normally called via
419 the @samp{bfd_free_cached_info} macro. It is used by the default
420 archive routines when computing the archive map. Most targets do not
421 do anything special for this entry point, and just set it to
422 @samp{_bfd_generic_free_cached_info}, which simply returns true.
423
424 @item _new_section_hook
425 This is called from @samp{bfd_make_section_anyway} whenever a new
426 section is created. Most targets use it to initialize section specific
427 information. This function is called whether or not the section
428 corresponds to an actual section in an actual BFD.
429
430 @item _get_section_contents
431 Get the contents of a section. This is called from
432 @samp{bfd_get_section_contents}. Most targets set this to
433 @samp{_bfd_generic_get_section_contents}, which does a @samp{bfd_seek}
434 based on the section's @samp{filepos} field and a @samp{bfd_read}. The
435 corresponding field in the target vector is named
436 @samp{_bfd_get_section_contents}.
437
438 @item _get_section_contents_in_window
439 Set a @samp{bfd_window} to hold the contents of a section. This is
440 called from @samp{bfd_get_section_contents_in_window}. The
441 @samp{bfd_window} idea never really caught on, and I don't think this is
442 ever called. Pretty much all targets implement this as
443 @samp{bfd_generic_get_section_contents_in_window}, which uses
444 @samp{bfd_get_section_contents} to do the right thing. The
445 corresponding field in the target vector is named
446 @samp{_bfd_get_section_contents_in_window}.
447 @end table
448
449 @node BFD target vector copy
450 @subsection Copy functions
451 @cindex @samp{BFD_JUMP_TABLE_COPY}
452
453 The @samp{BFD_JUMP_TABLE_COPY} macro is used for functions which are
454 called when copying BFDs, and for a couple of functions which deal with
455 internal BFD information.
456
457 @table @samp
458 @item _bfd_copy_private_bfd_data
459 This is called when copying a BFD, via @samp{bfd_copy_private_bfd_data}.
460 If the input and output BFDs have the same format, this will copy any
461 private information over. This is called after all the section contents
462 have been written to the output file. Only a few targets do anything in
463 this function.
464
465 @item _bfd_merge_private_bfd_data
466 This is called when linking, via @samp{bfd_merge_private_bfd_data}. It
467 gives the backend linker code a chance to set any special flags in the
468 output file based on the contents of the input file. Only a few targets
469 do anything in this function.
470
471 @item _bfd_copy_private_section_data
472 This is similar to @samp{_bfd_copy_private_bfd_data}, but it is called
473 for each section, via @samp{bfd_copy_private_section_data}. This
474 function is called before any section contents have been written. Only
475 a few targets do anything in this function.
476
477 @item _bfd_copy_private_symbol_data
478 This is called via @samp{bfd_copy_private_symbol_data}, but I don't
479 think anything actually calls it. If it were defined, it could be used
480 to copy private symbol data from one BFD to another. However, most BFDs
481 store extra symbol information by allocating space which is larger than
482 the @samp{asymbol} structure and storing private information in the
483 extra space. Since @samp{objcopy} and other programs copy symbol
484 information by copying pointers to @samp{asymbol} structures, the
485 private symbol information is automatically copied as well. Most
486 targets do not do anything in this function.
487
488 @item _bfd_set_private_flags
489 This is called via @samp{bfd_set_private_flags}. It is basically a hook
490 for the assembler to set magic information. For example, the PowerPC
491 ELF assembler uses it to set flags which appear in the e_flags field of
492 the ELF header. Most targets do not do anything in this function.
493
494 @item _bfd_print_private_bfd_data
495 This is called by @samp{objdump} when the @samp{-p} option is used. It
496 is called via @samp{bfd_print_private_data}. It prints any interesting
497 information about the BFD which can not be otherwise represented by BFD
498 and thus can not be printed by @samp{objdump}. Most targets do not do
499 anything in this function.
500 @end table
501
502 @node BFD target vector core
503 @subsection Core file support functions
504 @cindex @samp{BFD_JUMP_TABLE_CORE}
505
506 The @samp{BFD_JUMP_TABLE_CORE} macro is used for functions which deal
507 with core files. Obviously, these functions only do something
508 interesting for targets which have core file support.
509
510 @table @samp
511 @item _core_file_failing_command
512 Given a core file, this returns the command which was run to produce the
513 core file.
514
515 @item _core_file_failing_signal
516 Given a core file, this returns the signal number which produced the
517 core file.
518
519 @item _core_file_matches_executable_p
520 Given a core file and a BFD for an executable, this returns whether the
521 core file was generated by the executable.
522 @end table
523
524 @node BFD target vector archive
525 @subsection Archive functions
526 @cindex @samp{BFD_JUMP_TABLE_ARCHIVE}
527
528 The @samp{BFD_JUMP_TABLE_ARCHIVE} macro is used for functions which deal
529 with archive files. Most targets use COFF style archive files
530 (including ELF targets), and these use @samp{_bfd_archive_coff} as the
531 argument to @samp{BFD_JUMP_TABLE_ARCHIVE}. Some targets use BSD/a.out
532 style archives, and these use @samp{_bfd_archive_bsd}. (The main
533 difference between BSD and COFF archives is the format of the archive
534 symbol table). Targets with no archive support use
535 @samp{_bfd_noarchive}. Finally, a few targets have unusual archive
536 handling.
537
538 @table @samp
539 @item _slurp_armap
540 Read in the archive symbol table, storing it in private BFD data. This
541 is normally called from the archive @samp{check_format} routine. The
542 corresponding field in the target vector is named
543 @samp{_bfd_slurp_armap}.
544
545 @item _slurp_extended_name_table
546 Read in the extended name table from the archive, if there is one,
547 storing it in private BFD data. This is normally called from the
548 archive @samp{check_format} routine. The corresponding field in the
549 target vector is named @samp{_bfd_slurp_extended_name_table}.
550
551 @item _construct_extended_name_table
552 Build and return an extended name table if one is needed to write out
553 the archive. This also adjusts the archive headers to refer to the
554 extended name table appropriately. This is normally called from the
555 archive @samp{write_contents} routine. The corresponding field in the
556 target vector is named @samp{_bfd_construct_extended_name_table}.
557
558 @item _truncate_arname
559 This copies a file name into an archive header, truncating it as
560 required. It is normally called from the archive @samp{write_contents}
561 routine. This function is more interesting in targets which do not
562 support extended name tables, but I think the GNU @samp{ar} program
563 always uses extended name tables anyhow. The corresponding field in the
564 target vector is named @samp{_bfd_truncate_arname}.
565
566 @item _write_armap
567 Write out the archive symbol table using calls to @samp{bfd_write}.
568 This is normally called from the archive @samp{write_contents} routine.
569 The corresponding field in the target vector is named @samp{write_armap}
570 (no leading underscore).
571
572 @item _read_ar_hdr
573 Read and parse an archive header. This handles expanding the archive
574 header name into the real file name using the extended name table. This
575 is called by routines which read the archive symbol table or the archive
576 itself. The corresponding field in the target vector is named
577 @samp{_bfd_read_ar_hdr_fn}.
578
579 @item _openr_next_archived_file
580 Given an archive and a BFD representing a file stored within the
581 archive, return a BFD for the next file in the archive. This is called
582 via @samp{bfd_openr_next_archived_file}. The corresponding field in the
583 target vector is named @samp{openr_next_archived_file} (no leading
584 underscore).
585
586 @item _get_elt_at_index
587 Given an archive and an index, return a BFD for the file in the archive
588 corresponding to that entry in the archive symbol table. This is called
589 via @samp{bfd_get_elt_at_index}. The corresponding field in the target
590 vector is named @samp{_bfd_get_elt_at_index}.
591
592 @item _generic_stat_arch_elt
593 Do a stat on an element of an archive, returning information read from
594 the archive header (modification time, uid, gid, file mode, size). This
595 is called via @samp{bfd_stat_arch_elt}. The corresponding field in the
596 target vector is named @samp{_bfd_stat_arch_elt}.
597
598 @item _update_armap_timestamp
599 After the entire contents of an archive have been written out, update
600 the timestamp of the archive symbol table to be newer than that of the
601 file. This is required for a.out style archives. This is normally
602 called by the archive @samp{write_contents} routine. The corresponding
603 field in the target vector is named @samp{_bfd_update_armap_timestamp}.
604 @end table
605
606 @node BFD target vector symbols
607 @subsection Symbol table functions
608 @cindex @samp{BFD_JUMP_TABLE_SYMBOLS}
609
610 The @samp{BFD_JUMP_TABLE_SYMBOLS} macro is used for functions which deal
611 with symbols.
612
613 @table @samp
614 @item _get_symtab_upper_bound
615 Return a sensible upper bound on the amount of memory which will be
616 required to read the symbol table. In practice most targets return the
617 amount of memory required to hold @samp{asymbol} pointers for all the
618 symbols plus a trailing @samp{NULL} entry, and store the actual symbol
619 information in BFD private data. This is called via
620 @samp{bfd_get_symtab_upper_bound}. The corresponding field in the
621 target vector is named @samp{_bfd_get_symtab_upper_bound}.
622
623 @item _get_symtab
624 Read in the symbol table. This is called via
625 @samp{bfd_canonicalize_symtab}. The corresponding field in the target
626 vector is named @samp{_bfd_canonicalize_symtab}.
627
628 @item _make_empty_symbol
629 Create an empty symbol for the BFD. This is needed because most targets
630 store extra information with each symbol by allocating a structure
631 larger than an @samp{asymbol} and storing the extra information at the
632 end. This function will allocate the right amount of memory, and return
633 what looks like a pointer to an empty @samp{asymbol}. This is called
634 via @samp{bfd_make_empty_symbol}. The corresponding field in the target
635 vector is named @samp{_bfd_make_empty_symbol}.
636
637 @item _print_symbol
638 Print information about the symbol. This is called via
639 @samp{bfd_print_symbol}. One of the arguments indicates what sort of
640 information should be printed:
641 @table @samp
642 @item bfd_print_symbol_name
643 Just print the symbol name.
644 @item bfd_print_symbol_more
645 Print the symbol name and some interesting flags. I don't think
646 anything actually uses this.
647 @item bfd_print_symbol_all
648 Print all information about the symbol. This is used by @samp{objdump}
649 when run with the @samp{-t} option.
650 @end table
651 The corresponding field in the target vector is named
652 @samp{_bfd_print_symbol}.
653
654 @item _get_symbol_info
655 Return a standard set of information about the symbol. This is called
656 via @samp{bfd_symbol_info}. The corresponding field in the target
657 vector is named @samp{_bfd_get_symbol_info}.
658
659 @item _bfd_is_local_label_name
660 Return whether the given string would normally represent the name of a
661 local label. This is called via @samp{bfd_is_local_label} and
662 @samp{bfd_is_local_label_name}. Local labels are normally discarded by
663 the assembler. In the linker, this defines the difference between the
664 @samp{-x} and @samp{-X} options.
665
666 @item _get_lineno
667 Return line number information for a symbol. This is only meaningful
668 for a COFF target. This is called when writing out COFF line numbers.
669
670 @item _find_nearest_line
671 Given an address within a section, use the debugging information to find
672 the matching file name, function name, and line number, if any. This is
673 called via @samp{bfd_find_nearest_line}. The corresponding field in the
674 target vector is named @samp{_bfd_find_nearest_line}.
675
676 @item _bfd_make_debug_symbol
677 Make a debugging symbol. This is only meaningful for a COFF target,
678 where it simply returns a symbol which will be placed in the
679 @samp{N_DEBUG} section when it is written out. This is called via
680 @samp{bfd_make_debug_symbol}.
681
682 @item _read_minisymbols
683 Minisymbols are used to reduce the memory requirements of programs like
684 @samp{nm}. A minisymbol is a cookie pointing to internal symbol
685 information which the caller can use to extract complete symbol
686 information. This permits BFD to not convert all the symbols into
687 generic form, but to instead convert them one at a time. This is called
688 via @samp{bfd_read_minisymbols}. Most targets do not implement this,
689 and just use generic support which is based on using standard
690 @samp{asymbol} structures.
691
692 @item _minisymbol_to_symbol
693 Convert a minisymbol to a standard @samp{asymbol}. This is called via
694 @samp{bfd_minisymbol_to_symbol}.
695 @end table
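
Two of the conventions above, the upper bound which counts a trailing
@samp{NULL} entry and the symbol structure which is larger than the
generic one, can be sketched as follows. All of the @samp{toy_} names are
hypothetical; @samp{struct toy_symbol} merely stands in for
@samp{asymbol}, and the @samp{storage_class} field is an arbitrary
example of private information.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical generic symbol, standing in for asymbol.  */
struct toy_symbol
{
  const char *name;
  unsigned long value;
};

/* Hypothetical backend symbol.  The generic part comes first, so a
   pointer to the whole structure is also a pointer to the generic part,
   and the private information follows at the end.  */
struct toy_coff_symbol
{
  struct toy_symbol symbol;
  int storage_class;
};

/* Allocate the larger structure but hand back the generic view.  */
static struct toy_symbol *
toy_coff_make_empty_symbol (void)
{
  struct toy_coff_symbol *sym = calloc (1, sizeof *sym);

  return sym == NULL ? NULL : &sym->symbol;
}

/* Bytes needed for an array of symbol pointers plus the trailing NULL.  */
static long
toy_get_symtab_upper_bound (size_t symcount)
{
  return (long) ((symcount + 1) * sizeof (struct toy_symbol *));
}

/* Backend code recovers its private information by casting back.  */
static int
toy_coff_storage_class (const struct toy_symbol *sym)
{
  assert (sym != NULL);
  return ((const struct toy_coff_symbol *) sym)->storage_class;
}
```

Because generic code only ever copies the pointer, the private
information travels with the symbol without the generic code knowing
that it exists.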
696
697 @node BFD target vector relocs
698 @subsection Relocation support
699 @cindex @samp{BFD_JUMP_TABLE_RELOCS}
700
701 The @samp{BFD_JUMP_TABLE_RELOCS} macro is used for functions which deal
702 with relocations.
703
704 @table @samp
705 @item _get_reloc_upper_bound
706 Return a sensible upper bound on the amount of memory which will be
707 required to read the relocations for a section. In practice most
708 targets return the amount of memory required to hold @samp{arelent}
709 pointers for all the relocations plus a trailing @samp{NULL} entry, and
710 store the actual relocation information in BFD private data. This is
711 called via @samp{bfd_get_reloc_upper_bound}.
712
713 @item _canonicalize_reloc
714 Return the relocation information for a section. This is called via
715 @samp{bfd_canonicalize_reloc}. The corresponding field in the target
716 vector is named @samp{_bfd_canonicalize_reloc}.
717
718 @item _bfd_reloc_type_lookup
719 Given a relocation code, return the corresponding howto structure
720 (@pxref{BFD relocation codes}). This is called via
721 @samp{bfd_reloc_type_lookup}. The corresponding field in the target
722 vector is named @samp{reloc_type_lookup}.
723 @end table
724
725 @node BFD target vector write
726 @subsection Output functions
727 @cindex @samp{BFD_JUMP_TABLE_WRITE}
728
729 The @samp{BFD_JUMP_TABLE_WRITE} macro is used for functions which deal
730 with writing out a BFD.
731
732 @table @samp
733 @item _set_arch_mach
734 Set the architecture and machine number for a BFD. This is called via
735 @samp{bfd_set_arch_mach}. Most targets implement this by calling
736 @samp{bfd_default_set_arch_mach}. The corresponding field in the target
737 vector is named @samp{_bfd_set_arch_mach}.
738
739 @item _set_section_contents
740 Write out the contents of a section. This is called via
741 @samp{bfd_set_section_contents}. The corresponding field in the target
742 vector is named @samp{_bfd_set_section_contents}.
743 @end table
744
745 @node BFD target vector link
746 @subsection Linker functions
747 @cindex @samp{BFD_JUMP_TABLE_LINK}
748
749 The @samp{BFD_JUMP_TABLE_LINK} macro is used for functions called by the
750 linker.
751
752 @table @samp
753 @item _sizeof_headers
754 Return the size of the header information required for a BFD. This is
755 used to implement the @samp{SIZEOF_HEADERS} linker script function. It
756 is normally used to align the first section at an efficient position on
757 the page. This is called via @samp{bfd_sizeof_headers}. The
758 corresponding field in the target vector is named
759 @samp{_bfd_sizeof_headers}.
760
761 @item _bfd_get_relocated_section_contents
762 Read the contents of a section and apply the relocation information.
763 This handles both a final link and a relocateable link; in the latter
764 case, it adjusts the relocation information as well. This is called via
765 @samp{bfd_get_relocated_section_contents}. Most targets implement it by
766 calling @samp{bfd_generic_get_relocated_section_contents}.
767
768 @item _bfd_relax_section
769 Try to use relaxation to shrink the size of a section. This is called
770 by the linker when the @samp{-relax} option is used. This is called via
771 @samp{bfd_relax_section}. Most targets do not support any sort of
772 relaxation.
773
774 @item _bfd_link_hash_table_create
775 Create the symbol hash table to use for the linker. This linker hook
776 permits the backend to control the size and information of the elements
777 in the linker symbol hash table. This is called via
778 @samp{bfd_link_hash_table_create}.
779
780 @item _bfd_link_add_symbols
781 Given an object file or an archive, add all symbols into the linker
782 symbol hash table. Use callbacks to the linker to include archive
783 elements in the link. This is called via @samp{bfd_link_add_symbols}.
784
785 @item _bfd_final_link
786 Finish the linking process. The linker calls this hook after all of the
787 input files have been read, when it is ready to finish the link and
788 generate the output file. This is called via @samp{bfd_final_link}.
789
790 @item _bfd_link_split_section
791 I don't know what this is for. Nothing seems to call it. The only
792 non-trivial definition is in @file{som.c}.
793 @end table
794
795 @node BFD target vector dynamic
796 @subsection Dynamic linking information functions
797 @cindex @samp{BFD_JUMP_TABLE_DYNAMIC}
798
799 The @samp{BFD_JUMP_TABLE_DYNAMIC} macro is used for functions which read
800 dynamic linking information.
801
802 @table @samp
803 @item _get_dynamic_symtab_upper_bound
804 Return a sensible upper bound on the amount of memory which will be
805 required to read the dynamic symbol table. In practice most targets
806 return the amount of memory required to hold @samp{asymbol} pointers for
807 all the symbols plus a trailing @samp{NULL} entry, and store the actual
808 symbol information in BFD private data. This is called via
809 @samp{bfd_get_dynamic_symtab_upper_bound}. The corresponding field in
810 the target vector is named @samp{_bfd_get_dynamic_symtab_upper_bound}.
811
812 @item _canonicalize_dynamic_symtab
813 Read the dynamic symbol table. This is called via
814 @samp{bfd_canonicalize_dynamic_symtab}. The corresponding field in the
815 target vector is named @samp{_bfd_canonicalize_dynamic_symtab}.
816
817 @item _get_dynamic_reloc_upper_bound
818 Return a sensible upper bound on the amount of memory which will be
819 required to read the dynamic relocations. In practice most targets
820 return the amount of memory required to hold @samp{arelent} pointers for
821 all the relocations plus a trailing @samp{NULL} entry, and store the
822 actual relocation information in BFD private data. This is called via
823 @samp{bfd_get_dynamic_reloc_upper_bound}. The corresponding field in
824 the target vector is named @samp{_bfd_get_dynamic_reloc_upper_bound}.
825
826 @item _canonicalize_dynamic_reloc
827 Read the dynamic relocations. This is called via
828 @samp{bfd_canonicalize_dynamic_reloc}. The corresponding field in the
829 target vector is named @samp{_bfd_canonicalize_dynamic_reloc}.
830 @end table
831
832 @node BFD generated files
833 @section BFD generated files
834 @cindex generated files in bfd
835 @cindex bfd generated files
836
837 BFD contains several automatically generated files. This section
838 describes them. Some files are created at configure time, when you
839 configure BFD. Some files are created at make time, when you build
840 BFD. Some files are automatically rebuilt at make time, but only if
841 you configure with the @samp{--enable-maintainer-mode} option. Some
842 files live in the object directory---the directory from which you run
843 configure---and some live in the source directory. All files that live
844 in the source directory are checked into the CVS repository.
845
846 @table @file
847 @item bfd.h
848 @cindex @file{bfd.h}
849 @cindex @file{bfd-in3.h}
850 Lives in the object directory. Created at make time from
851 @file{bfd-in2.h} via @file{bfd-in3.h}. @file{bfd-in3.h} is created at
852 configure time from @file{bfd-in2.h}. There are automatic dependencies
853 to rebuild @file{bfd-in3.h} and hence @file{bfd.h} if @file{bfd-in2.h}
854 changes, so you can normally ignore @file{bfd-in3.h}, and just think
855 about @file{bfd-in2.h} and @file{bfd.h}.
856
857 @file{bfd.h} is built by replacing a few strings in @file{bfd-in2.h}.
858 To see them, search for @samp{@@} in @file{bfd-in2.h}. They mainly
859 control whether BFD is built for a 32 bit target or a 64 bit target.
860
861 @item bfd-in2.h
862 @cindex @file{bfd-in2.h}
863 Lives in the source directory. Created from @file{bfd-in.h} and several
864 other BFD source files. If you configure with the
865 @samp{--enable-maintainer-mode} option, @file{bfd-in2.h} is rebuilt
866 automatically when a source file changes.
867
868 @item elf32-target.h
869 @itemx elf64-target.h
870 @cindex @file{elf32-target.h}
871 @cindex @file{elf64-target.h}
872 Live in the object directory. Created from @file{elfxx-target.h}.
873 These files are versions of @file{elfxx-target.h} customized for either
874 a 32 bit ELF target or a 64 bit ELF target.
875
876 @item libbfd.h
877 @cindex @file{libbfd.h}
878 Lives in the source directory. Created from @file{libbfd-in.h} and
879 several other BFD source files. If you configure with the
880 @samp{--enable-maintainer-mode} option, @file{libbfd.h} is rebuilt
881 automatically when a source file changes.
882
883 @item libcoff.h
884 @cindex @file{libcoff.h}
885 Lives in the source directory. Created from @file{libcoff-in.h} and
886 @file{coffcode.h}. If you configure with the
887 @samp{--enable-maintainer-mode} option, @file{libcoff.h} is rebuilt
888 automatically when a source file changes.
889
890 @item targmatch.h
891 @cindex @file{targmatch.h}
892 Lives in the object directory. Created at make time from
893 @file{config.bfd}. This file is used to map configuration triplets into
894 BFD target vector variable names at run time.
895 @end table
896
897 @node BFD multiple compilations
898 @section Files compiled multiple times in BFD
899 Several files in BFD are compiled multiple times. By this I mean that
900 there are header files which contain function definitions. These header
901 files are included by other files, and thus the functions are compiled
902 once per file which includes them.
903
904 Preprocessor macros are used to control the compilation, so that each
905 time the files are compiled the resulting functions are slightly
906 different. Naturally, if they weren't different, there would be no
907 reason to compile them multiple times.
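
The effect can be imitated in a single file, as in the sketch below. This is only a stand-in: BFD really does include a header several times with different macro settings, whereas here a macro, @samp{DEMO_DEFINE_ADDR_BYTES}, plays the role of the header body. All of the @samp{demo} and @samp{DEMO_} names are hypothetical. The token pasting is similar in spirit to the way a @samp{NAME} style macro gives each compiled copy a distinct function name.

```c
#include <assert.h>
#include <stdint.h>

/* Two-level paste so that the size argument is expanded before ## is
   applied.  */
#define DEMO_PASTE3_(a, b, c) a##b##c
#define DEMO_PASTE3(a, b, c) DEMO_PASTE3_ (a, b, c)

/* Plays the role of the shared header body: each expansion yields a
   function whose name and address type depend on the parameters.  */
#define DEMO_DEFINE_ADDR_BYTES(SIZE, TYPE)          \
  static unsigned int                               \
  DEMO_PASTE3 (demo, SIZE, _addr_bytes) (void)      \
  {                                                 \
    return (unsigned int) sizeof (TYPE);            \
  }

DEMO_DEFINE_ADDR_BYTES (32, uint32_t)   /* defines demo32_addr_bytes */
DEMO_DEFINE_ADDR_BYTES (64, uint64_t)   /* defines demo64_addr_bytes */

/* The two generated functions are distinct and behave differently.  */
static void
demo_check_variants (void)
{
  assert (demo32_addr_bytes () != demo64_addr_bytes ());
}
```

Both functions come from the same source text, which is why a debugger has trouble saying which version a given line belongs to.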
908
909 This is not a particularly good programming technique, and future BFD
910 work should avoid it.
911
912 @itemize @bullet
913 @item
914 Since this technique is rarely used, even experienced C programmers find
915 it confusing.
916
917 @item
918 It is difficult to debug programs which use BFD, since there is no way
919 to describe which version of a particular function you are looking at.
920
921 @item
922 Programs which use BFD wind up incorporating two or more slightly
923 different versions of the same function, which wastes space in the
924 executable.
925
926 @item
927 This technique is never required nor is it especially efficient. It is
928 always possible to use statically initialized structures holding
929 function pointers and magic constants instead.
930 @end itemize
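
The alternative mentioned in the last item might look something like the following sketch, in which a single compiled function serves both sizes by consulting a statically initialized structure. The structure layout and every @samp{demo} name are assumptions made for illustration, and little-endian byte order is assumed for simplicity.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-size description: constants plus a function pointer.  */
struct demo_size_info
{
  unsigned int arch_size;    /* 32 or 64 */
  unsigned int addr_bytes;   /* bytes in an external address */
  uint64_t (*get_addr) (const unsigned char *);
};

/* Read N little-endian bytes starting at P.  */
static uint64_t
demo_get_le (const unsigned char *p, unsigned int n)
{
  uint64_t v = 0;

  assert (n <= 8);
  while (n-- > 0)
    v = (v << 8) | p[n];
  return v;
}

static uint64_t
demo_get_addr32 (const unsigned char *p)
{
  return demo_get_le (p, 4);
}

static uint64_t
demo_get_addr64 (const unsigned char *p)
{
  return demo_get_le (p, 8);
}

static const struct demo_size_info demo_info32 = { 32, 4, demo_get_addr32 };
static const struct demo_size_info demo_info64 = { 64, 8, demo_get_addr64 };

/* One compiled copy of this function handles both sizes.  */
static uint64_t
demo_read_addr (const struct demo_size_info *info, const unsigned char *buf)
{
  return info->get_addr (buf);
}
```

The cost is one indirect call per access; in exchange there is exactly one copy of @samp{demo_read_addr} to debug.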
931
932 The following is a list of the files which are compiled multiple times.
933
934 @table @file
935 @item aout-target.h
936 @cindex @file{aout-target.h}
937 Describes a few functions and the target vector for a.out targets. This
938 is used by individual a.out targets with different definitions of
939 @samp{N_TXTADDR} and similar a.out macros.
940
941 @item aoutf1.h
942 @cindex @file{aoutf1.h}
943 Implements standard SunOS a.out files. In principle it supports 64 bit
944 a.out targets based on the preprocessor macro @samp{ARCH_SIZE}, but
945 since all known a.out targets are 32 bits, this code may or may not
946 work. This file is only included by a few other files, and it is
947 difficult to justify its existence.
948
949 @item aoutx.h
950 @cindex @file{aoutx.h}
951 Implements basic a.out support routines. This file can be compiled for
952 either 32 or 64 bit support. Since all known a.out targets are 32 bits,
953 the 64 bit support may or may not work. I believe the original
954 intention was that this file would only be included by @samp{aout32.c}
955 and @samp{aout64.c}, and that other a.out targets would simply refer to
956 the functions it defined. Unfortunately, some other a.out targets
957 started including it directly, leading to a somewhat confused state of
958 affairs.
959
960 @item coffcode.h
961 @cindex @file{coffcode.h}
962 Implements basic COFF support routines. This file is included by every
963 COFF target. It implements code which handles COFF magic numbers as
964 well as various hook functions called by the generic COFF functions in
965 @file{coffgen.c}. This file is controlled by a number of different
966 macros, and more are added regularly.
967
968 @item coffswap.h
969 @cindex @file{coffswap.h}
970 Implements COFF swapping routines. This file is included by
971 @file{coffcode.h}, and thus by every COFF target. It implements the
972 routines which swap COFF structures between internal and external
973 format. The main control for this file is the external structure
974 definitions in the files in the @file{include/coff} directory. A COFF
975 target file will include one of those files before including
976 @file{coffcode.h} and thus @file{coffswap.h}. There are a few other
977 macros which affect @file{coffswap.h} as well, mostly describing whether
978 certain fields are present in the external structures.
979
980 @item ecoffswap.h
981 @cindex @file{ecoffswap.h}
982 Implements ECOFF swapping routines. This is like @file{coffswap.h}, but
983 for ECOFF. It is included by the ECOFF target files (of which there are
984 only two). The control is the preprocessor macro @samp{ECOFF_32} or
985 @samp{ECOFF_64}.
986
987 @item elfcode.h
988 @cindex @file{elfcode.h}
989 Implements ELF functions that use external structure definitions. This
990 file is included by two other files: @file{elf32.c} and @file{elf64.c}.
991 It is controlled by the @samp{ARCH_SIZE} macro which is defined to be
992 @samp{32} or @samp{64} before including it. The @samp{NAME} macro is
993 used internally to give the functions different names for the two target
994 sizes.
995
996 @item elfcore.h
997 @cindex @file{elfcore.h}
998 Like @file{elfcode.h}, but for functions that are specific to ELF core
999 files. This is included only by @file{elfcode.h}.
1000
1001 @item elflink.h
1002 @cindex @file{elflink.h}
1003 Like @file{elfcode.h}, but for functions used by the ELF linker. This
1004 is included only by @file{elfcode.h}.
1005
1006 @item elfxx-target.h
1007 @cindex @file{elfxx-target.h}
1008 This file is the source for the generated files @file{elf32-target.h}
1009 and @file{elf64-target.h}, one of which is included by every ELF target.
1010 It defines the ELF target vector.
1011
1012 @item freebsd.h
1013 @cindex @file{freebsd.h}
1014 Presumably intended to be included by all FreeBSD targets, but in fact
1015 there is only one such target, @samp{i386-freebsd}. This defines a
1016 function used to set the right magic number for FreeBSD, as well as
1017 various macros, and includes @file{aout-target.h}.
1018
1019 @item netbsd.h
1020 @cindex @file{netbsd.h}
1021 Like @file{freebsd.h}, except that there are several files which include
1022 it.
1023
1024 @item nlm-target.h
1025 @cindex @file{nlm-target.h}
1026 Defines the target vector for a standard NLM target.
1027
1028 @item nlmcode.h
1029 @cindex @file{nlmcode.h}
1030 Like @file{elfcode.h}, but for NLM targets. This is only included by
1031 @file{nlm32.c} and @file{nlm64.c}, both of which define the macro
1032 @samp{ARCH_SIZE} to an appropriate value. There are no 64 bit NLM
1033 targets anyhow, so this is sort of useless.
1034
1035 @item nlmswap.h
1036 @cindex @file{nlmswap.h}
1037 Like @file{coffswap.h}, but for NLM targets. This is included by each
1038 NLM target, but I think it winds up compiling to the exact same code for
1039 every target, and as such is fairly useless.
1040
1041 @item peicode.h
1042 @cindex @file{peicode.h}
1043 Provides swapping routines and other hooks for PE targets.
1044 @file{coffcode.h} will include this rather than @file{coffswap.h} for a
1045 PE target. This defines PE specific versions of the COFF swapping
1046 routines, and also defines some macros which control @file{coffcode.h}
1047 itself.
1048 @end table
1049
1050 @node BFD relocation handling
1051 @section BFD relocation handling
1052 @cindex bfd relocation handling
1053 @cindex relocations in bfd
1054
1055 The handling of relocations is one of the more confusing aspects of BFD.
1056 Relocation handling has been implemented in various different ways, all
1057 somewhat incompatible, none perfect.
1058
1059 @menu
1060 * BFD relocation concepts:: BFD relocation concepts
1061 * BFD relocation functions:: BFD relocation functions
1062 * BFD relocation codes:: BFD relocation codes
1063 * BFD relocation future:: BFD relocation future
1064 @end menu
1065
1066 @node BFD relocation concepts
1067 @subsection BFD relocation concepts
1068
1069 A relocation is an action which the linker must take when linking. It
1070 describes a change to the contents of a section. The change is normally
1071 based on the final value of one or more symbols. Relocations are
1072 created by the assembler when it creates an object file.
1073
1074 Most relocations are simple. A typical simple relocation is to set 32
1075 bits at a given offset in a section to the value of a symbol. This type
1076 of relocation would be generated for code like @code{int *p = &i;} where
1077 @samp{p} and @samp{i} are global variables. A relocation for the symbol
1078 @samp{i} would be generated such that the linker would initialize the
1079 area of memory which holds the value of @samp{p} to the value of the
1080 symbol @samp{i}.
1081
1082 Slightly more complex relocations may include an addend, which is a
1083 constant to add to the symbol value before using it. In some cases a
1084 relocation will require adding the symbol value to the existing contents
1085 of the section in the object file. In others the relocation will simply
1086 replace the contents of the section with the symbol value. Some
1087 relocations are PC relative, so that the value to be stored in the
1088 section is the difference between the value of a symbol and the final
1089 address of the section contents.
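
The arithmetic behind these cases can be sketched as follows. This is a simplified model, not BFD code: it assumes 32 bit values, little-endian storage, and hypothetical function names, and it ignores overflow checking entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Absolute relocation: symbol value plus addend.  */
static uint32_t
demo_reloc_abs32 (uint32_t symbol, int32_t addend)
{
  return symbol + (uint32_t) addend;
}

/* PC relative relocation: symbol value plus addend, minus the final
   address of the place being patched.  */
static uint32_t
demo_reloc_pcrel32 (uint32_t symbol, int32_t addend, uint32_t place)
{
  return symbol + (uint32_t) addend - place;
}

/* Relocation whose addend lives in the section contents: add the
   symbol value to the 32 bits already stored at WHERE, store the sum
   back, and return it.  */
static uint32_t
demo_apply_inplace32 (unsigned char *where, uint32_t symbol)
{
  uint32_t old;
  uint32_t val;

  assert (where != 0);
  old = ((uint32_t) where[0] | ((uint32_t) where[1] << 8)
         | ((uint32_t) where[2] << 16) | ((uint32_t) where[3] << 24));
  val = old + symbol;
  where[0] = (unsigned char) (val & 0xff);
  where[1] = (unsigned char) ((val >> 8) & 0xff);
  where[2] = (unsigned char) ((val >> 16) & 0xff);
  where[3] = (unsigned char) ((val >> 24) & 0xff);
  return val;
}
```

Unsigned arithmetic wraps modulo 2^32, so a backward PC relative reference simply produces the two's complement encoding of a negative displacement.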
1090
1091 In general, relocations can be arbitrarily complex. For
1092 example, relocations used in dynamic linking systems often require the
1093 linker to allocate space in a different section and use the offset
1094 within that section as the value to store. In the IEEE object file
1095 format, relocations may involve arbitrary expressions.
1096
1097 When doing a relocateable link, the linker may or may not have to do
1098 anything with a relocation, depending upon the definition of the
1099 relocation. Simple relocations generally do not require any special
1100 action.
1101
1102 @node BFD relocation functions
1103 @subsection BFD relocation functions
1104
1105 In BFD, each section has an array of @samp{arelent} structures. Each
1106 structure has a pointer to a symbol, an address within the section, an
1107 addend, and a pointer to a @samp{reloc_howto_struct} structure. The
1108 howto structure has a bunch of fields describing the reloc, including a
1109 type field. The type field is specific to the object file format
1110 backend; none of the generic code in BFD examines it.
1111
1112 Originally, the function @samp{bfd_perform_relocation} was supposed to
1113 handle all relocations. In theory, many relocations would be simple
1114 enough to be described by the fields in the howto structure. For those
1115 that weren't, the howto structure included a @samp{special_function}
1116 field to use as an escape.
1117
1118 While this seems plausible, a look at @samp{bfd_perform_relocation}
1119 shows that it failed. The function has odd special cases. Some of the
1120 fields in the howto structure, such as @samp{pcrel_offset}, were not
1121 adequately documented.
1122
1123 The linker uses @samp{bfd_perform_relocation} to do all relocations when
1124 the input and output file have different formats (e.g., when generating
1125 S-records). The generic linker code, which is used by all targets which
1126 do not define their own special purpose linker, uses
1127 @samp{bfd_get_relocated_section_contents}, which for most targets turns
1128 into a call to @samp{bfd_generic_get_relocated_section_contents}, which
1129 calls @samp{bfd_perform_relocation}. So @samp{bfd_perform_relocation}
1130 is still widely used, which makes it difficult to change, since it is
1131 difficult to test all possible cases.
1132
1133 The assembler used @samp{bfd_perform_relocation} for a while. This
1134 turned out to be the wrong thing to do, since
1135 @samp{bfd_perform_relocation} was written to handle relocations on an
1136 existing object file, while the assembler needed to create relocations
1137 in a new object file. The assembler was changed to use the new function
1138 @samp{bfd_install_relocation} instead, and @samp{bfd_install_relocation}
1139 was created as a copy of @samp{bfd_perform_relocation}.
1140
1141 Unfortunately, the work did not progress any farther, so
1142 @samp{bfd_install_relocation} remains a simple copy of
1143 @samp{bfd_perform_relocation}, with all the odd special cases and
1144 confusing code. This again is difficult to change, because again any
1145 change can affect any assembler target, and so is difficult to test.
1146
1147 The new linker, when using the same object file format for all input
1148 files and the output file, does not convert relocations into
1149 @samp{arelent} structures, so it can not use
1150 @samp{bfd_perform_relocation} at all. Instead, users of the new linker
1151 are expected to write a @samp{relocate_section} function which will
1152 handle relocations in a target specific fashion.
1153
1154 There are two helper functions for target specific relocation:
1155 @samp{_bfd_final_link_relocate} and @samp{_bfd_relocate_contents}.
1156 These functions use a howto structure, but they @emph{do not} use the
1157 @samp{special_function} field. Since the functions are normally called
1158 from target specific code, the @samp{special_function} field adds
1159 little; any relocations which require special handling can be handled
1160 without calling those functions.
1161
1162 So, if you want to add a new target, or add a new relocation to an
1163 existing target, you need to do the following:
1164 @itemize @bullet
1165 @item
1166 Make sure you clearly understand what the contents of the section should
1167 look like after assembly, after a relocateable link, and after a final
1168 link. Make sure you clearly understand the operations the linker must
1169 perform during a relocateable link and during a final link.
1170
1171 @item
1172 Write a howto structure for the relocation. The howto structure is
1173 flexible enough to represent any relocation which should be handled by
1174 setting a contiguous bitfield in the destination to the value of a
1175 symbol, possibly with an addend, possibly adding the symbol value to the
1176 value already present in the destination.
1177
1178 @item
1179 Change the assembler to generate your relocation. The assembler will
1180 call @samp{bfd_install_relocation}, so your howto structure has to be
1181 able to handle that. You may need to set the @samp{special_function}
1182 field to handle assembly correctly. Be careful to ensure that any code
1183 you write to handle the assembler will also work correctly when doing a
1184 relocateable link. For example, see @samp{bfd_elf_generic_reloc}.
1185
1186 @item
1187 Test the assembler. Consider the cases of relocation against an
1188 undefined symbol, a common symbol, a symbol defined in the object file
1189 in the same section, and a symbol defined in the object file in a
1190 different section. These cases may not all be applicable for your
1191 reloc.
1192
1193 @item
1194 If your target uses the new linker, which is recommended, add any
1195 required handling to the target specific relocation function. In simple
1196 cases this will just involve a call to @samp{_bfd_final_link_relocate}
1197 or @samp{_bfd_relocate_contents}, depending upon the definition of the
1198 relocation and whether the link is relocateable or not.
1199
1200 @item
1201 Test the linker. Test the case of a final link. If the relocation can
1202 overflow, use a linker script to force an overflow and make sure the
1203 error is reported correctly. Test a relocateable link, whether the
1204 symbol is defined or undefined in the relocateable output. For both the
1205 final and relocateable link, test the case when the symbol is a common
1206 symbol, when the symbol looked like a common symbol but became a defined
1207 symbol, when the symbol is defined in a different object file, and when
1208 the symbol is defined in the same object file.
1209
1210 @item
1211 In order for linking to another object file format, such as S-records,
1212 to work correctly, @samp{bfd_perform_relocation} has to do the right
1213 thing for the relocation. You may need to set the
1214 @samp{special_function} field to handle this correctly. Test this by
1215 doing a link in which the output object file format is S-records.
1216
1217 @item
1218 Using the linker to generate relocateable output in a different object
1219 file format is impossible in the general case, so you generally don't
1220 have to worry about that. Linking input files of different object file
1221 formats together is quite unusual, but if you're really dedicated you
1222 may want to consider testing this case, both when the output object file
1223 format is the same as your format, and when it is different.
1224 @end itemize
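
To make the second step more concrete, here is a much reduced model of what a howto structure describes: a contiguous bitfield in the destination that receives the symbol value, possibly with an addend, possibly added to what is already there. The structure @samp{demo_howto} is hypothetical and has far fewer fields than the real @samp{reloc_howto_struct}; the sketch also assumes that the field starts at bit 0 and that overflow is checked as for an unsigned quantity.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, much reduced howto.  */
struct demo_howto
{
  unsigned int bitsize;   /* width of the field, used for overflow checks */
  uint32_t dst_mask;      /* bits of the destination word to replace */
  int partial_inplace;    /* nonzero: add the field's existing contents */
};

/* Apply one relocation to *WORD.  Return 0 on success, or -1 if the
   value does not fit in the field, in which case *WORD is unchanged.  */
static int
demo_apply (const struct demo_howto *h, uint32_t *word,
            uint32_t symbol, uint32_t addend)
{
  uint32_t value = symbol + addend;

  assert (h->bitsize >= 1 && h->bitsize <= 32);
  if (h->partial_inplace)
    value += *word & h->dst_mask;
  if (h->bitsize < 32 && (value >> h->bitsize) != 0)
    return -1;
  *word = (*word & ~h->dst_mask) | (value & h->dst_mask);
  return 0;
}
```

The overflow branch corresponds to the case that the linker test above should provoke with a linker script.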
1225
1226 @node BFD relocation codes
1227 @subsection BFD relocation codes
1228
1229 BFD has another way of describing relocations besides the howto
1230 structures described above: the enum @samp{bfd_reloc_code_real_type}.
1231
1232 Every known relocation type can be described as a value in this
1233 enumeration. The enumeration contains many target specific relocations,
1234 but where two or more targets have the same relocation, a single code is
1235 used. For example, the single value @samp{BFD_RELOC_32} is used for all
1236 simple 32 bit relocation types.
1237
1238 The main purpose of this relocation code is to give the assembler some
1239 mechanism to create @samp{arelent} structures. In order for the
1240 assembler to create an @samp{arelent} structure, it has to be able to
1241 obtain a howto structure. The function @samp{bfd_reloc_type_lookup},
1242 which simply calls the target vector entry point
1243 @samp{reloc_type_lookup}, takes a relocation code and returns a howto
1244 structure.
1245
1246 The function @samp{bfd_get_reloc_code_name} returns the name of a
1247 relocation code. This is mainly used in error messages.
1248
1249 Using both howto structures and relocation codes can be somewhat
1250 confusing. There are many processor specific relocation codes.
1251 However, the relocation is only fully defined by the howto structure.
1252 The same relocation code will map to different howto structures in
1253 different object file formats. For example, the addend handling may be
1254 different.
1255
1256 Most of the relocation codes are not really general. The assembler can
1257 not use them without already understanding what sorts of relocations can
1258 be used for a particular target. It might be possible to replace the
1259 relocation codes with something simpler.
1260
1261 @node BFD relocation future
1262 @subsection BFD relocation future
1263
1264 Clearly the current BFD relocation support is in bad shape. A
1265 wholesale rewrite would be very difficult, because it would require
1266 thorough testing of every BFD target. So some sort of incremental
1267 change is required.
1268
1269 My vague thoughts on this would involve defining a new, clearly defined,
1270 howto structure. Some mechanism would be used to determine which type
1271 of howto structure was being used by a particular format.
1272
1273 The new howto structure would clearly define the relocation behaviour in
1274 the case of an assembly, a relocateable link, and a final link. At
1275 least one special function would be defined as an escape, and it might
1276 make sense to define more.
1277
1278 One or more generic functions similar to @samp{bfd_perform_relocation}
1279 would be written to handle the new howto structure.
1280
1281 This should make it possible to write a generic version of the relocate
1282 section functions used by the new linker. The target specific code
1283 would provide some mechanism (a function pointer or an initial
1284 conversion) to convert target specific relocations into howto
1285 structures.
1286
1287 Ideally it would be possible to use this generic relocate section
1288 function for the generic linker as well. That is, it would replace the
1289 @samp{bfd_generic_get_relocated_section_contents} function which is
1290 currently normally used.
1291
1292 For the special case of ELF dynamic linking, more consideration needs to
1293 be given to writing ELF specific but ELF target generic code to handle
1294 special relocation types such as GOT and PLT.
1295
1296 @node BFD ELF support
1297 @section BFD ELF support
1298 @cindex elf support in bfd
1299 @cindex bfd elf support
1300
1301 The ELF object file format is defined in two parts: a generic ABI and a
1302 processor specific supplement. The ELF support in BFD is split in a
1303 similar fashion. The processor specific support is largely kept within
1304 a single file. The generic support is provided by several other files.
1305 The processor specific support provides a set of function pointers and
1306 constants used by the generic support.
1307
1308 @menu
1309 * BFD ELF generic support:: BFD ELF generic support
1310 * BFD ELF processor specific support:: BFD ELF processor specific support
1311 * BFD ELF future:: BFD ELF future
1312 @end menu
1313
1314 @node BFD ELF generic support
1315 @subsection BFD ELF generic support
1316
1317 In general, functions which do not read external data from the ELF file
1318 are found in @file{elf.c}. They operate on the internal forms of the
1319 ELF structures, which are defined in @file{include/elf/internal.h}. The
1320 internal structures are defined in terms of @samp{bfd_vma}, and so may
1321 be used for both 32 bit and 64 bit ELF targets.
1322
1323 The file @file{elfcode.h} contains functions which operate on the
1324 external data. @file{elfcode.h} is compiled twice, once via
1325 @file{elf32.c} with @samp{ARCH_SIZE} defined as @samp{32}, and once via
1326 @file{elf64.c} with @samp{ARCH_SIZE} defined as @samp{64}.
1327 @file{elfcode.h} includes functions to swap the ELF structures in and
1328 out of external form, as well as a few more complex functions.
1329
1330 Linker support is found in @file{elflink.c} and @file{elflink.h}. The
1331 latter file is compiled twice, for both 32 and 64 bit support. The
1332 linker support is only used if the processor specific file defines
1333 @samp{elf_backend_relocate_section}, which is required to relocate the
1334 section contents. If that macro is not defined, the generic linker code
1335 is used, and relocations are handled via @samp{bfd_perform_relocation}.
1336
1337 The core file support is in @file{elfcore.h}, which is compiled twice,
1338 for both 32 and 64 bit support. The more interesting cases of core file
1339 support only work on a native system which has the @file{sys/procfs.h}
1340 header file. Without that file, the core file support does little more
1341 than read the ELF program segments as BFD sections.
1342
1343 The BFD internal header file @file{elf-bfd.h} is used for communication
1344 among these files and the processor specific files.
1345
1346 The default entries for the BFD ELF target vector are found mainly in
1347 @file{elf.c}. Some functions are found in @file{elfcode.h}.
1348
1349 The processor specific files may override particular entries in the
1350 target vector, but most do not, with one exception: the
1351 @samp{bfd_reloc_type_lookup} entry point is always processor specific.
1352
1353 @node BFD ELF processor specific support
1354 @subsection BFD ELF processor specific support
1355
1356 By convention, the processor specific support for a particular processor
1357 will be found in @file{elf@var{nn}-@var{cpu}.c}, where @var{nn} is
1358 either 32 or 64, and @var{cpu} is the name of the processor.
1359
1360 @menu
1361 * BFD ELF processor required:: Required processor specific support
1362 * BFD ELF processor linker:: Processor specific linker support
1363 * BFD ELF processor other:: Other processor specific support options
1364 @end menu
1365
1366 @node BFD ELF processor required
1367 @subsubsection Required processor specific support
1368
1369 When writing a @file{elf@var{nn}-@var{cpu}.c} file, you must do the
1370 following:
1371 @itemize @bullet
1372 @item
1373 Define either @samp{TARGET_BIG_SYM} or @samp{TARGET_LITTLE_SYM}, or
1374 both, to a unique C name to use for the target vector. This name should
1375 appear in the list of target vectors in @file{targets.c}, and will also
1376 have to appear in @file{config.bfd} and @file{configure.in}. Define
1377 @samp{TARGET_BIG_SYM} for a big-endian processor,
1378 @samp{TARGET_LITTLE_SYM} for a little-endian processor, and define both
1379 for a bi-endian processor.
1380 @item
1381 Define either @samp{TARGET_BIG_NAME} or @samp{TARGET_LITTLE_NAME}, or
1382 both, to a string used as the name of the target vector. This is the
1383 name which a user of the BFD tool would use to specify the object file
1384 format. It would normally appear in a linker emulation parameters
1385 file.
1386 @item
1387 Define @samp{ELF_ARCH} to the BFD architecture (an element of the
1388 @samp{bfd_architecture} enum, typically @samp{bfd_arch_@var{cpu}}).
1389 @item
1390 Define @samp{ELF_MACHINE_CODE} to the magic number which should appear
1391 in the @samp{e_machine} field of the ELF header. As of this writing,
1392 these magic numbers are assigned by SCO; if you want to get a magic
1393 number for a particular processor, try sending a note to
1394 @email{registry@@sco.com}. In the BFD sources, the magic numbers are
1395 found in @file{include/elf/common.h}; they have names beginning with
1396 @samp{EM_}.
1397 @item
1398 Define @samp{ELF_MAXPAGESIZE} to the maximum size of a virtual page in
1399 memory. This can normally be found at the start of chapter 5 in the
1400 processor specific supplement. For a processor which will only be used
1401 in an embedded system, or which has no memory management hardware, this
1402 can simply be @samp{1}.
1403 @item
1404 If the format should use @samp{Rel} rather than @samp{Rela} relocations,
1405 define @samp{USE_REL}. This is normally defined in chapter 4 of the
1406 processor specific supplement. In the absence of a supplement, it's
1407 usually easier to work with @samp{Rela} relocations, although they will
1408 require more space in object files (but not in executables, except when
1409 using dynamic linking). It is possible, though somewhat awkward, to
1410 support both @samp{Rel} and @samp{Rela} relocations for a single target;
1411 @file{elf64-mips.c} does it by overriding the relocation reading and
1412 writing routines.
1413 @item
1414 Define howto structures for all the relocation types.
1415 @item
1416 Define a @samp{bfd_reloc_type_lookup} routine. This must be named
1417 @samp{bfd_elf@var{nn}_bfd_reloc_type_lookup}, and may be either a
1418 function or a macro. It must translate a BFD relocation code into a
1419 howto structure. This is normally a table lookup or a simple switch.
1420 @item
1421 If using @samp{Rel} relocations, define @samp{elf_info_to_howto_rel}.
1422 If using @samp{Rela} relocations, define @samp{elf_info_to_howto}.
1423 Either way, this is a macro defined as the name of a function which
1424 takes an @samp{arelent} and a @samp{Rel} or @samp{Rela} structure, and
1425 sets the @samp{howto} field of the @samp{arelent} based on the
1426 @samp{Rel} or @samp{Rela} structure. This normally uses
1427 @samp{ELF@var{nn}_R_TYPE} to get the ELF relocation type and uses it as
1428 an index into a table of howto structures.
1429 @end itemize
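
The list above can be sketched roughly as follows, for a made-up
processor called @samp{foo}. This is only an illustration, not code
from any real backend: the @samp{*_sketch} types stand in for the real
BFD structures, and every @samp{foo}, @samp{FOO} and @samp{CODE_} name
is hypothetical. A real backend would use the types from @file{bfd.h}
and pick up the rest of the target vector from the generated ELF target
header.

```c
#include <stddef.h>

/* Required macros, all for a hypothetical little-endian "foo" CPU.
   bfd_arch_foo and EM_FOO do not exist; they show the naming only.  */
#define TARGET_LITTLE_SYM   bfd_elf32_littlefoo_vec
#define TARGET_LITTLE_NAME  "elf32-littlefoo"
#define ELF_ARCH            bfd_arch_foo
#define ELF_MACHINE_CODE    EM_FOO
#define ELF_MAXPAGESIZE     0x1000

/* Simplified stand-ins for reloc_howto_type, arelent and the internal
   Rela structure.  The real structures have many more fields.  */
typedef struct { unsigned int type; const char *name; int bitsize; } howto_sketch;
typedef struct { const howto_sketch *howto; } arelent_sketch;
typedef struct { unsigned long r_info; } rela_sketch;

/* Hypothetical BFD relocation codes (the real ones are BFD_RELOC_*).  */
enum reloc_code_sketch { CODE_NONE, CODE_32, CODE_32_PCREL, CODE_UNSUPPORTED };

/* Hypothetical ELF relocation types for the foo processor.  */
enum { R_FOO_NONE = 0, R_FOO_32 = 1, R_FOO_PC32 = 2, R_FOO_max };

/* One howto entry per relocation type, indexed by ELF relocation type.  */
static const howto_sketch foo_howto_table[R_FOO_max] = {
  { R_FOO_NONE, "R_FOO_NONE", 0 },
  { R_FOO_32,   "R_FOO_32",   32 },
  { R_FOO_PC32, "R_FOO_PC32", 32 },
};

/* Plays the role of bfd_elfNN_bfd_reloc_type_lookup: translate a BFD
   relocation code into a howto structure with a simple switch.  */
const howto_sketch *
foo_reloc_type_lookup (enum reloc_code_sketch code)
{
  switch (code)
    {
    case CODE_NONE:     return &foo_howto_table[R_FOO_NONE];
    case CODE_32:       return &foo_howto_table[R_FOO_32];
    case CODE_32_PCREL: return &foo_howto_table[R_FOO_PC32];
    default:            return NULL;   /* unsupported code */
    }
}

/* Same arithmetic as ELF32_R_TYPE: the low 8 bits of r_info.  */
#define SKETCH_R_TYPE(info) ((info) & 0xff)

/* Plays the role of the function named by elf_info_to_howto: set the
   howto field of the arelent from the relocation's ELF type.  */
void
foo_info_to_howto (arelent_sketch *cache_ptr, const rela_sketch *dst)
{
  unsigned long r_type = SKETCH_R_TYPE (dst->r_info);

  cache_ptr->howto = r_type < R_FOO_max ? &foo_howto_table[r_type] : NULL;
}
```

Indexing the table by ELF relocation type keeps the
@samp{info_to_howto} function down to a bounds check and an array
access, which is why the howto table is usually laid out in that order.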
1430
1431 You must also add the magic number for this processor to the
1432 @samp{prep_headers} function in @file{elf.c}.
1433
1434 @node BFD ELF processor linker
1435 @subsubsection Processor specific linker support
1436
1437 The linker will be much more efficient if you define a relocate section
1438 function. This will permit BFD to use the ELF specific linker support.
1439
1440 If you do not define a relocate section function, BFD must use the
1441 generic linker support, which requires converting all symbols and
1442 relocations into BFD @samp{asymbol} and @samp{arelent} structures. In
1443 this case, relocations will be handled by calling
1444 @samp{bfd_perform_relocation}, which will use the howto structures you
1445 have defined. @xref{BFD relocation handling}.
1446
1447 In order to support linking into a different object file format, such as
1448 S-records, @samp{bfd_perform_relocation} must work correctly with your
1449 howto structures, so you can't skip that step. However, if you define
1450 the relocate section function, then in the normal case of linking into
1451 an ELF file the linker will not need to convert symbols and relocations,
1452 and will be much more efficient.
1453
1454 To use a relocate section function, define the macro
1455 @samp{elf_backend_relocate_section} as the name of a function which will
1456 take the contents of a section, as well as relocation, symbol, and other
1457 information, and modify the section contents according to the relocation
1458 information. In simple cases, this is little more than a loop over the
1459 relocations which computes the value of each relocation and calls
1460 @samp{_bfd_final_link_relocate}. The function must check for a
1461 relocatable link, and in that case normally needs to do nothing other
1462 than adjust the addend for relocations against a section symbol.
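
As an illustration of the simple case, the loop might look like the
sketch below. It is self-contained and does not use the real BFD
interfaces: the @samp{*_sketch} structures, the @samp{R_FOO_} relocation
types, and the @samp{store_le32} helper (which stands in for the call to
@samp{_bfd_final_link_relocate}) are all assumptions made for the
example, and a little-endian target with @samp{Rela} style addends is
assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical relocation types for a made-up foo processor.  */
enum { R_FOO_NONE = 0, R_FOO_32 = 1, R_FOO_PC32 = 2 };

typedef struct
{
  uint32_t offset;      /* where in the section to apply the reloc */
  unsigned int type;    /* one of the R_FOO_* values */
  size_t symndx;        /* index into the symbol array */
  int32_t addend;       /* explicit addend, as with Rela */
} reloc_sketch;

typedef struct
{
  uint32_t value;              /* final address of the symbol */
  int is_section;              /* nonzero for a section symbol */
  uint32_t sec_output_offset;  /* offset of its input section in the output */
} sym_sketch;

/* Stand-in for _bfd_final_link_relocate: store 32 bits, little-endian.  */
static void
store_le32 (unsigned char *p, uint32_t v)
{
  p[0] = v & 0xff;
  p[1] = (v >> 8) & 0xff;
  p[2] = (v >> 16) & 0xff;
  p[3] = (v >> 24) & 0xff;
}

/* Returns 1 on success, 0 on a bad relocation.  */
int
foo_relocate_section (int relocatable, unsigned char *contents, size_t size,
                      uint32_t section_vma, reloc_sketch *relocs,
                      size_t nrelocs, const sym_sketch *syms)
{
  for (size_t i = 0; i < nrelocs; i++)
    {
      reloc_sketch *rel = &relocs[i];
      const sym_sketch *sym = &syms[rel->symndx];

      if (relocatable)
        {
          /* Relocatable link: leave the contents alone and only adjust
             the addend of relocations against a section symbol.  */
          if (sym->is_section)
            rel->addend += (int32_t) sym->sec_output_offset;
          continue;
        }

      if (rel->type == R_FOO_NONE)
        continue;
      if (rel->type != R_FOO_32 && rel->type != R_FOO_PC32)
        return 0;
      if (size < 4 || rel->offset > size - 4)
        return 0;

      /* Compute the value of the relocation.  */
      uint32_t value = sym->value + (uint32_t) rel->addend;
      if (rel->type == R_FOO_PC32)
        value -= section_vma + rel->offset;

      store_le32 (contents + rel->offset, value);
    }
  return 1;
}
```

The real function receives considerably more context (the output BFD,
the link info, the local symbols and their sections), but the division
between the relocatable case and the final link case is the same.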
1463
1464 The complex cases generally have to do with dynamic linker support. GOT
1465 and PLT relocations must be handled specially, and the linker normally
1466 arranges to set up the GOT and PLT sections while handling relocations.
1467 When generating a shared library, other relocations must normally be
1468 copied into the shared library, or converted to RELATIVE relocations
1469 when possible.
1470
1471 @node BFD ELF processor other
1472 @subsubsection Other processor specific support options
1473
1474 There are many other macros which may be defined in
1475 @file{elf@var{nn}-@var{cpu}.c}. These macros may be found in
1476 @file{elfxx-target.h}.
1477
1478 Macros may be used to override some of the generic ELF target vector
1479 functions.
1480
1481 There are several processor specific hook functions which may be defined as
1482 macros. These functions are found as function pointers in the
1483 @samp{elf_backend_data} structure defined in @file{elf-bfd.h}. In
1484 general, a hook function is set by defining a macro
1485 @samp{elf_backend_@var{name}}.
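
The mechanism can be sketched as below. This is a reduced model rather
than the contents of @file{elfxx-target.h}: the structure is a stand-in
for @samp{elf_backend_data}, the hook signatures are simplified to
@samp{int (*) (int)}, and @samp{foo_symbol_processing} is a
hypothetical processor specific function. The idea is that the
processor file defines only the macros it needs, and the shared header
supplies a null default for the rest before building the structure.

```c
#include <stddef.h>

/* Stand-in for struct elf_backend_data; the real structure in
   elf-bfd.h has many more hooks, with different signatures.  */
struct backend_data_sketch
{
  int (*symbol_processing) (int);
  int (*add_symbol_hook) (int);
};

/* What a hypothetical elfNN-foo.c would contain.  */
int
foo_symbol_processing (int x)
{
  return x + 1;
}
#define elf_backend_symbol_processing foo_symbol_processing
/* elf_backend_add_symbol_hook is deliberately left undefined.  */

/* What the shared target header does: default every hook that the
   processor file did not define.  */
#ifndef elf_backend_symbol_processing
#define elf_backend_symbol_processing 0
#endif
#ifndef elf_backend_add_symbol_hook
#define elf_backend_add_symbol_hook 0
#endif

/* The backend data is then initialized from the macros.  */
const struct backend_data_sketch foo_backend_data =
{
  elf_backend_symbol_processing,
  elf_backend_add_symbol_hook,
};
```

Generic code that calls a hook must therefore check the function
pointer for null before using it.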
1486
1487 There are a few processor specific constants which may also be defined.
1488 These are again found in the @samp{elf_backend_data} structure.
1489
1490 I will not define the various functions and constants here; see the
1491 comments in @file{elf-bfd.h}.
1492
1493 Normally any odd characteristic of a particular ELF processor is handled
1494 via a hook function. For example, the special @samp{SHN_MIPS_SCOMMON}
1495 section number found in MIPS ELF is handled via the hooks
1496 @samp{section_from_bfd_section}, @samp{symbol_processing},
1497 @samp{add_symbol_hook}, and @samp{output_symbol_hook}.
1498
1499 Dynamic linking support, which involves processor specific relocations
1500 requiring special handling, is also implemented via hook functions.
1501
1502 @node BFD ELF future
1503 @subsection BFD ELF future
1504
1505 The current dynamic linking support has too much code duplication.
1506 While each processor has particular differences, much of the dynamic
1507 linking support is quite similar for each processor. The GOT and PLT
1508 are handled in fairly similar ways, the details of -Bsymbolic linking
1509 are generally similar, etc. This code should be reworked to use more
1510 generic functions, eliminating the duplication.
1511
1512 Similarly, the relocation handling has too much duplication. Many of
1513 the @samp{reloc_type_lookup} and @samp{info_to_howto} functions are
1514 quite similar. The relocate section functions are also often quite
1515 similar, both in the standard linker handling and the dynamic linker
1516 handling. Many of the COFF processor specific backends share a single
1517 relocate section function (@samp{_bfd_coff_generic_relocate_section}),
1518 and it should be possible to do something like this for the ELF targets
1519 as well.
1520
1521 The appearance of the processor specific magic number in
1522 @samp{prep_headers} in @file{elf.c} is somewhat bogus. It should be
1523 possible to add support for a new processor without changing the generic
1524 support.
1525
1526 The processor function hooks and constants are ad hoc and need better
1527 documentation.
1528
1529 @node Index
1530 @unnumberedsec Index
1531 @printindex cp
1532
1533 @contents
1534 @bye