Julian Seward [Fri, 10 May 2002 21:07:22 +0000 (21:07 +0000)]
New and hopefully more reliable method for finding argc/argv/envp at
startup, by looking for the ELF frame created on the process' stack
at startup. This avoids having to deal with problems caused by glibc
magic offsets.
WARNING: only works for 2.2 kernels right now. 2.4 is broken.
Julian Seward [Fri, 10 May 2002 21:03:56 +0000 (21:03 +0000)]
Modify the startup mechanism so that any call into valgrind's libpthread.so
will start up valgrind if it is not already running. This more or less
sidesteps the problem that sometimes valgrind.so isn't init'd first by
the dynamic linker.
Julian Seward [Fri, 10 May 2002 03:03:57 +0000 (03:03 +0000)]
Insert hacks, only partially successful, to make 'make distcheck' work
with the new vg_libpthread.vs linker script. Problem is that builds
where builddir != srcdir don't work now. Don't know how to fix.
Julian Seward [Thu, 9 May 2002 17:38:13 +0000 (17:38 +0000)]
Remove valgrind's use of libc-supplied stat() and sbrk(). Now the only
symbols we need from libc are __umoddi3 and __udivdi3; other than that
valgrind.so is completely self-contained.
Julian Seward [Thu, 9 May 2002 12:01:14 +0000 (12:01 +0000)]
Reinstate a condition in the IPCOP_shmctl wrapper without which the
system dies due to the recently-rejuvenated
first-and-last-secondaries-look-plausible assertions around syscalls.
Julian Seward [Wed, 8 May 2002 00:32:50 +0000 (00:32 +0000)]
Improvements to the error-collecting machinery:
- Don't waste a potentially huge amount of time calling describe_addr
on addresses in errors we aren't going to show.
- If an invalid address is just below %ESP, say that it might be due
to a gcc bug. Increase the window in which this is allowed to
1024 bytes below %ESP.
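A minimal sketch of the "just below %ESP" test described above. The
function name and the inclusive boundary are assumptions made for
illustration, not taken from the Valgrind sources; only the 1024-byte
window comes from the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Window size from the commit message. */
#define BELOW_ESP_WINDOW 1024u

/* Hypothetical helper: true if addr lies in the window directly below
   the stack pointer, i.e. the case where the error report would add
   the "might be due to a gcc bug" hint. */
static bool is_just_below_esp(uint32_t esp, uint32_t addr)
{
    return addr < esp && (esp - addr) <= BELOW_ESP_WINDOW;
}
```

Addresses at or above %ESP, or more than 1024 bytes below it, fall
outside the window and would get the ordinary description.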
Julian Seward [Tue, 7 May 2002 23:45:03 +0000 (23:45 +0000)]
Actually call VG_(first_and_last_secondaries_look_plausible) and make
assertions about the return value, rather than asserting the
non-NULL-ness of the function's address :) Classic beginner's mistake,
compounded by C's crappy (non-existent) type system, which allows me
to silently confuse Bool with Pointer-to-Function. What a great
programming language. Come back Haskell, all is forgiven.
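The mistake above can be sketched as follows. All names here are
hypothetical stand-ins, not the real VG_() functions; the point is only
that testing a function's address always succeeds, whereas testing its
return value actually runs the check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the real plausibility check; the result is
   controlled by a flag so both outcomes can be exercised. */
static bool plausible_result = true;
static bool secondaries_look_plausible(void) { return plausible_result; }

/* Buggy form: the condition inspects the function's address, which is
   never NULL, so the check can never fail. */
static bool buggy_condition(void)
{
    bool (*fn)(void) = secondaries_look_plausible;
    return fn != NULL;
}

/* Fixed form: the function is called and its result is tested. */
static bool fixed_condition(void)
{
    return secondaries_look_plausible();
}
```

With plausible_result set to false, buggy_condition() still yields true
while fixed_condition() yields false.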
Julian Seward [Tue, 7 May 2002 23:38:30 +0000 (23:38 +0000)]
Generate better ucode for back-to-back sequences of register pushes and
pops, as appear at function prologues/epilogues. Specifically, update %ESP
just once for the whole sequence. This reduces by about 20% the number
of calls to handle_esp_assignment (for kate in KDE 3.0, -O), which is a
good thing since that is quite expensive.
vg_symtab2.c:
- No longer aborting when encountering an N_SOL symbol after the 65535th
line in a file, just printing a warning/apology that annotations/messages
might be wrong.
This is a pain to fix properly, since it requires first guessing when a
line number overflow happens, then switching to one or more other files,
then switching back.
Julian Seward [Fri, 3 May 2002 21:01:35 +0000 (21:01 +0000)]
From: Tom Hughes <thh@i'm sure he doesn't want to be spammed.com>
The attached patch improves the validation done on sockaddr structures
passed to system calls by extending the existing code for AF_UNIX to
cover AF_INET and AF_INET6 as well so that errors are not raised when
an unused part of the sockaddr structure is not filled in.
It also applies this new code to bind and sendto as well as connect.
Julian Seward [Fri, 3 May 2002 19:09:05 +0000 (19:09 +0000)]
Change the way Valgrind exits.
Until now, valgrind waited for ld.so to call the .fini code in
valgrind.so, and took this as its cue to switch back to the real CPU
for the rest of the journey.
This is a problem if ld.so subsequently calls other .so's .fini code
and threading is in use, because they do pthread_* calls which cannot
be handled by valgrind's libpthread.so without valgrind actually being
active.
So we ignore the call to valgrind's .fini code, and run the program
all the way up to the point where it calls syscall exit() to
disappear. This makes the order in which the .fini sections are run
irrelevant, since Valgrind has control during all of them, and so
threading facilities are still available for all of them.
This change means Mozilla 1.0RC1 now exits a lot more cleanly than it
did.
vg_symtab2.c:
- Can now handle file sizes > 65536 lines, despite the stabs format only
storing line numbers in a short. Do this heuristically, by looking for
line number sequences that go from 65000-odd to 0-odd within the same
file.
This required changing the RiLoc.lineno field to 20 bits, which gives a
maximum file length of 1,000,000-odd lines, which seems reasonable.
In order to keep RiLoc at 12 bytes (important because there are lots of
them) this required stealing four bits from the RiLoc.size field,
reducing it to 12 bits. This isn't too bad because the size is unlikely
to be larger than 4096 bytes -- we were already ignoring any ones larger
than 10,000 bytes because they were suspicious anyway (and see next
point).
- Tightened up the sanity checking on line address ranges. Previously any
range that looked suspicious (eg. > 10000 bytes, or not within the bound
of the segment info) was simply ignored(!) Now it prints a warning when
this happens and truncates the size to 1 to be safe; also there are some
extra assertions for totally space-cadet numbers.
(At first these checks were all assertions, but I tried a version of GNU
gas that produces a small handful of dodgy stabs entries; warnings
seemed a reasonable compromise.)
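A rough sketch of the line-number handling described above, under
stated assumptions: the struct layout, the field names other than
lineno and size, and the exact thresholds (65000 and 1000) are
illustrative guesses, not the real vg_symtab2.c code. It shows a 20-bit
line number sharing one word with a 12-bit size, and a heuristic that
treats a jump from 65000-odd to 0-odd as a wrap of the 16-bit stabs
counter.

```c
/* Hypothetical layout: 20 + 12 bits share one 32-bit word. */
typedef struct {
    unsigned int addr;         /* assumed: start address of the range */
    unsigned int name_offset;  /* assumed: offset of the file name */
    unsigned int lineno : 20;  /* up to 1,048,575 lines */
    unsigned int size   : 12;  /* up to 4095 bytes */
} RiLocSketch;

/* State for undoing 16-bit wraparound within one file. */
typedef struct {
    unsigned int base;      /* multiples of 65536 accumulated so far */
    unsigned int prev_raw;  /* previous raw 16-bit line number */
} LineUnwrapper;

/* Returns the reconstructed line number for a raw stabs value. */
static unsigned int unwrap_line(LineUnwrapper *u, unsigned short raw)
{
    /* Assumed thresholds for "65000-odd to 0-odd". */
    if (u->prev_raw >= 65000u && raw < 1000u)
        u->base += 65536u;
    u->prev_raw = raw;
    return u->base + raw;
}
```

The unwrapper would need to be reset whenever the reader switches to a
different source file, since the heuristic only makes sense within one
file.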
vg_cachesim.c:
- Removed the requirement that both types of cost centre (iCC, idCC) have
instr_addr as their second word. Less fragile -- now the only
requirement is that they both have their type tag as their first byte.
Julian Seward [Thu, 2 May 2002 03:57:00 +0000 (03:57 +0000)]
Remove comments about Mozilla 1.0RC1 crashing, since that's not a Valgrind
bug, and explain, for the benefit of Mozilla hackers, how to make 1.0RC1
work on Valgrind.
Julian Seward [Thu, 2 May 2002 03:47:01 +0000 (03:47 +0000)]
Jack up the size of the translation cache from 16 MB to 40 MB (!).
This is needed to give reasonable behaviour for the insanity of a
Mozilla debug build, apparently even worse than the insanity of a
KDE 3 debug build. Change some limit calculations to use double
rather than int, so as to avoid overflows.
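To illustrate why such calculations can overflow, here is a small
sketch. The function name and the percentage-based limit are
assumptions; the commit does not say which calculations were changed.
With a 40 MB cache, multiplying the byte count by a percentage exceeds
the range of a 32-bit int, whereas doing the arithmetic in double does
not.

```c
/* Hypothetical helper: a limit expressed as a percentage of the cache
   size. An int version such as (cache_bytes * percent) / 100 would
   overflow for 40 MB and percent = 98, since 41943040 * 98 is larger
   than 2^31 - 1. */
static unsigned int limit_from_percent(unsigned int cache_bytes,
                                       unsigned int percent)
{
    double limit = (double)cache_bytes * (double)percent / 100.0;
    return (unsigned int)limit;  /* truncates towards zero */
}
```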
Julian Seward [Wed, 1 May 2002 23:05:12 +0000 (23:05 +0000)]
Improve my implementations of strcmp() and memcpy() since Nick's profiler
indicates that KDE apps spend 20% of their simulated insns in these two
functions alone.
Julian Seward [Wed, 1 May 2002 01:58:35 +0000 (01:58 +0000)]
Reinstate use of VG_(do_sanity_checks), although at a lower frequency
than before. Turns out they were wasting 25-50% of total execution
time in valgrinds of the 200203XX vintage. Apologies, KDE hackers!
vg_symtab2.c:
Discovered sometimes a SLINE stabs entry is the last one (which broke an
assertion). In such a case, we must guess the line's instruction address
range -- I've guessed 4, arbitrarily.
vg_cachegen.in, vg_cachesim_{I1,D1,L2}.c:
Discovered a bad bug in the cache simulation: when determining if a
reference straddles two memory blocks, to find the end of the range I was
adding 'size' to the base address, rather than 'size - 1'. This was
causing way too many straddled references, which would inflate the miss
counts.
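The off-by-one can be sketched like this. The function names and the
line_size parameter are hypothetical and the real cache simulator is
organised differently; the sketch only contrasts 'size' with
'size - 1' when locating the last byte of an access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Correct: the last byte touched is addr + size - 1. */
static bool straddles(uintptr_t addr, unsigned size, unsigned line_size)
{
    assert(size > 0 && line_size > 0);
    return addr / line_size != (addr + size - 1) / line_size;
}

/* Buggy: addr + size is one past the last byte, so an access that ends
   exactly on a block boundary is wrongly counted as straddling. */
static bool straddles_buggy(uintptr_t addr, unsigned size,
                            unsigned line_size)
{
    assert(size > 0 && line_size > 0);
    return addr / line_size != (addr + size) / line_size;
}
```

For example, a 4-byte access at address 60 with 64-byte blocks touches
bytes 60..63 only, so it does not straddle, but the buggy form reports
that it does.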
vg_improve() -- the ucode optimiser: consistently apply the
no-deferred-updates of %ESP rule, regardless of end use of the ucode.
This seems more consistent, and was exposed following examination of
code causing an assertion failure in the cache profiler. Added an
assertion to check this too, and was surprised I hadn't had an
assertion there in the first place.
New files:
- vg_cachesim.c
- vg_cachesim_{I1,D1,L2}.c
- vg_annotate.in
- vg_cachegen.in
Changes to existing files:
- valgrind/valgrind.in, added option:
--cachesim=no|yes [no]
- Makefile/Makefile.am:
* added vg_cachesim.c to valgrind_so_SOURCES var
* added vg_cachesim_I1.c, vg_cachesim_D1.c, vg_cachesim_L2.c to
noinst_HEADERS var
* added vg_annotate, vg_cachegen to 'bin_SCRIPTS' var, and added empty
targets for them
- vg_main.c:
* added two offsets for cache sim functions (put in positions 17a,17b)
* added option handling (detection of --cachesim=yes, which turns off
--instrument);
* added calls to cachesim initialisation/finalisation functions
- vg_mylibc: added some system call wrappers (for chmod, open_write, etc) for
file writing
- vg_symtab2.c:
* allow it to read symbols if either of --instrument or --cachesim is
used
* made vg_symtab2.c:vg_what_{line,fn}_is_this extern, renaming it as
VG_(what_line_is_this) (and added to vg_include.h)
* completely rewrote the read loop in vg_read_lib_symbols, fixing
several bugs. Much better now, although probably not perfect. It's
also relatively fragile -- I'm using the "die immediately if anything
unexpected happens" approach.
- vg_to_ucode.c:
* in VG_(disBB), patching in x86 instruction size into extra4b field of
JMP instructions at the end of basic blocks if --cachesim=yes.
Shifted things around to do this; also had to fiddle around with
single-step stuff to get this to work, by not sticking extra JMPs on
the end of the single-instruction block if there was already one
there (to avoid breaking an assertion in vg_cachesim.c). Did a
similar thing to avoid an extra JMP on huge basic blocks that are
split.
- vg_translate.c:
* if --cachesim=yes call the cachesim instrumentation phase
* made some functions extern and renamed:
allocCodeBlock() --> VG_(allocCodeBlock)()
freeCodeBlock() --> VG_(freeCodeBlock)()
copyUInstr() --> VG_(copyUInstr)()
(added to vg_include.h too)
- vg_include.c: declared
* cachesim offsets
* exports of vg_cachesim.c
* added four new profiling events (increasing VGP_M_CCS to 24 -- I kept
the spare ones)
* added comment about UInstr.extra4b field being used for instr size in
JMPs for cache simulation
- docs/manual.html:
* Added --cachesim option to section 2.5.
* Added cache profiling stuff as section 7.
Fix really stupid error in computation of timeout point in nonblocking
poll(). After this change, Mozilla-0.9.2.1 and Galeon 0.11.3 finally
behave reasonably on my box.
Fix a subtle (?) bug in sched_do_syscall to do with read/write calls for
which the client has already got the fd in nonblocking mode. In such
cases, do not wait for an IO completion -- since the client presumably
handles that somehow.
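One way to detect that case, sketched with the standard fcntl()
interface. The helper name is hypothetical, and whether the scheduler
queries the flag this way is an assumption; the commit does not say
how the check is done.

```c
#include <fcntl.h>

/* Hypothetical helper: returns 1 if fd is open and already in
   nonblocking mode, 0 otherwise (including when fd is invalid). */
static int fd_is_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return 0;
    return (flags & O_NONBLOCK) != 0;
}
```

A scheduler following the rule above would issue the read/write
directly when this returns 1, and only wait for I/O completion when it
returns 0.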
handle_signal_return: when a waiting read/write syscall is interrupted
by a signal which has been set to non-SA_RESTART, clean up the waiting_fds
table correctly.
Mess around with aliases to make the exported T/D/W syms look like those
of the real libpthread.so. This is a Good Thing, despite the fact it
temporarily breaks some threaded programs.
Fix many holes and bugs in an attempt to get my libpthread.so to export
the same set of symbols as the real one, which I now realise is crucial
for it to work at all.
Try and give at least some minimal binding for all functions exported
by the real libpthread.so. In the process fix a bunch of stuff, including
adding thread-specific h_errno and resolver state storage. This fixes
licq crashing at startup.