Slightly improve x86 unwind-intensive workloads.
E.g. perf/memrw is improved by 2% to 3% with this patch.
The unwinding code on x86 tries to unwind using
either the %ebp chain or CFI unwinding.
If these 2 techniques fail, it then tries to unwind
using FPO (PDB) debug info.
However, unless running Wine or similar, there will never be
such FPO/PDB info.
The function VG_(use_FPO_info) is thus called for nothing
at each 'end of stack': it scans all the loaded DebugInfos
looking for one that has some FPO data, only to find nothing.
With this patch, the unwind code on x86 only calls VG_(use_FPO_info) if
some FPO/PDB info was actually loaded.
Whether FPO/PDB info was loaded is cached and updated similarly to
the CFI cache: each time new debug info is loaded, the cached value is
refreshed using the debug info generation.
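The caching follows roughly the pattern below (a minimal sketch; the exact
signature of VG_(debuginfo_generation) and the helper
scan_debuginfos_for_fpo() are illustrative assumptions, not the real code):

  static Bool have_fpo_info = False;
  static UInt fpo_info_generation = 0;

  static Bool FPO_info_present ( void )
  {
     /* recompute the cached answer only if debug info was loaded or
        unloaded since the last call */
     if (fpo_info_generation != VG_(debuginfo_generation)()) {
        fpo_info_generation = VG_(debuginfo_generation)();
        have_fpo_info = scan_debuginfos_for_fpo(); /* hypothetical helper */
     }
     return have_fpo_info;
  }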
The patch also changes the name of VG_(CF_info_generation)
to VG_(debuginfo_generation), as this generation is changed for
any kind of load or unload of debug info, not only for CFI-based debug
info.
Some platforms such as x86 and amd64 have efficient unaligned access.
On these platforms, implement read_/write_<type> by doing a direct
access, rather than calling a function that reads or writes
'byte per byte'.
For platforms that do not have efficient unaligned access,
or that do not support unaligned access at all, the functions
readUAS_/writeUAS_<type> are called and work as before.
Currently, direct access is activated only for x86 and amd64.
It is unclear which other platforms support (efficient) unaligned access.
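As a hedged sketch, the idea looks like this (the <type> shown and the
direct-cast fast path are illustrative, not the exact Valgrind code):

  #if defined(VGA_x86) || defined(VGA_amd64)
  /* efficient unaligned access: read the value directly */
  static inline UInt read_UInt ( const UChar* data ) {
     return *(const UInt*)data;
  }
  #else
  /* no (efficient) unaligned access: assemble the value byte per byte */
  static inline UInt read_UInt ( const UChar* data ) {
     return readUAS_UInt(data);
  }
  #endif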
On unwind-intensive code (such as perf/memrw on amd64), this patch
gives up to a 5% improvement.
This patch significantly decreases the memory needed for OldRef and
slightly improves performance. It also moderately increases
the number of cases where helgrind can provide the stack trace of the old
access (when using the same amount of memory for the OldRef entries).
The patch also provides a new helgrind monitor command to show
the recorded accesses for an address+len, and adds an optional
lock_address argument to the monitor command 'info locks', to show the info
about just that lock.
Currently, OldRef entries are maintained in a sparse WA that points to N
entries, as specified by --conflict-cache-size=N.
Each entry (associated with an address) holds the last 5 accesses.
Old entries are recycled in exact LRU order.
But inside an entry, we could have one recent access and 4 very
old accesses that are kept 'alive' merely because a single thread keeps
repetitively accessing the address, refreshing the whole entry.
The attached patch replaces the sparse WA that maintains the OldRef
entries by a hash table.
Each OldRef now also maintains only one single access for an address.
As an OldRef now maintains only one access, all entries are now
recycled in strict LRU order.
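As a hedged sketch, the new per-entry layout is roughly the following
(field names and exact types are illustrative, not the actual declaration):

  typedef
     struct OldRef {
        struct OldRef* ht_next; /* next OldRef on the hash table chain       */
        UWord          ga;      /* hash table key: the accessed address      */
        Thr*           thr;     /* the single recorded access: which thread, */
        ULong          stamp;   /*   when (used for the LRU ordering),       */
        UInt           szB;     /*   how many bytes,                         */
        Bool           isW;     /*   and whether it was a write              */
     }
     OldRef;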
Memory used for OldRef
-----------------------
For the trunk, an OldRef has a size of 72 bytes (on 32-bit archs),
maintaining up to 5 accesses to the same address.
On 64-bit archs, an OldRef is 104 bytes.
With the patch, an OldRef has a size of 32 bytes (on 32-bit archs)
or 56 bytes (on 64-bit archs).
So, for one single access, the new code needs (on 32 bits)
32 bytes, while the trunk needs only 14.4 bytes.
However, that is the worst case for the patch, as it assumes that all
5 entries in the accs array are used.
Looking at 2 big apps (one of them being firefox), we see that
very few OldRef entries have all 5 slots occupied.
On a firefox startup, of the 5x1,000,000 access slots, only
1,406,939 are actually used.
So, on average, the trunk in reality uses around 52 bytes per access
(the 72 bytes per entry are spread over far fewer than 5 used slots).
The default value for --conflict-cache-size has been doubled to 2,000,000.
This ensures that the memory used for the OldRef entries is more or less the
same as in the trunk (104Mb for OldRef entries).
Memory used for sparseWA versus hashtable
-----------------------------------------
Looking at 2 big apps (one of them being firefox), we see that
there are big variations in the size of the WA: it can go in a few
seconds from 10MB to 250MB, or can decrease back to 10MB.
This all depends on where the last N accesses were done: if well localised,
the WA will be small.
If the last N accesses were distributed over a big address space,
then the WA will be big: the last level of the WA (the biggest memory consumer)
uses slightly more than 1KB (2KB on 64 bits) for each 256-byte memory
zone containing an OldRef. So, in the worst case, on 32 bits, we
need more than 1,000,000,000 bytes of sparseWA memory to keep 1,000,000
OldRef entries.
The hash table has between 1 and 2 Words of overhead per OldRef
(as the chain array is roughly doubled each time the hash table is full).
So, unless the OldRef entries are extremely localised, the overhead of the
hash table will be significantly smaller.
With the patch, the core arena total alloc is: 5299535/1201448632 totalloc-blocks/bytes
The trunk is 6693111/3959050280 totalloc-blocks/bytes
(so, around 1.20Gb versus 3.95Gb).
This big difference is due to the fact that the sparseWA repetitively
allocates and then frees a Level0 or LevelN when the OldRef entries in the
region covered by that Level0/N have all been recycled.
In terms of CPU
---------------
With the patch, on amd64, a firefox startup seems slightly faster (around 1%).
The peak memory mmaped/used decreases by 200Mb.
For a libreoffice test, the memory decreases by 230Mb. CPU also decreases
slightly (1%).
In terms of correctness:
-----------------------
The trunk could potentially fail to show the most recent access
to the memory of a race: the first OldRef entry matching the raced-upon
address was used, while a more recent access could exist in a following
OldRef entry. In other words, the trunk only guaranteed to find the
most recent access within an OldRef, but not across the several OldRef
entries that could cover the raced-upon address.
So, assuming it is important to show the most recent access, this patch
ensures we really show the most recent access, even in the presence of
overlapping accesses.
Mark Wielaard [Fri, 22 May 2015 09:20:03 +0000 (09:20 +0000)]
Add procfs-non-linux.stderr.exp variants to EXTRA_DIST.
For bz#344936, procfs-non-linux.stderr.exp was renamed and split into
procfs-non-linux.stderr.exp-with-readlinkat and
procfs-non-linux.stderr.exp-without-readlinkat; add both to EXTRA_DIST.
Fixes make post-regtest-checks.
Improve presentation of first line of --profile-heap=yes
(i.e. use
-------- Arena "client": 4,194,304/4,194,304 max/cu...
instead of
-------- Arena "client": 4194304/4194304 max/cu....
Rhys Kidd [Wed, 20 May 2015 11:31:35 +0000 (11:31 +0000)]
Improve documentation of syscall: unix: 44 profil() which was deprecated around OS X 10.6 and removed from the xnu kernel shipped with OS X 10.7. See unresolved bz#264253.
Carl Love [Tue, 19 May 2015 16:08:05 +0000 (16:08 +0000)]
Fix for the HWCAP2 aux vector.
The support assumed that if HWCAP2 is present, the system also supports
ISA2.07. That assumption is not correct, as we have found a few systems (OS)
where the HWCAP2 entry is present but the ISA2.07 bit is not set. This patch
fixes the assertion test to specifically check the ISA2.07 support bit setting
in the HWCAP2 and vex_archinfo->hwcaps variable. The setting for the
ISA2.07 support must be the same in both variables if the HWCAP2 entry exists.
This patch reduces the memory needed for the linesF.
Currently, each SecMap has an array of linesF, referenced by those linesZ
of the SecMap that need a lineF, via an index stored in dict[1].
When the array is full, its size is doubled.
The linesF array of a SecMap is freed when the SecMap is GC-ed.
The above strategy has the following consequences:
A. on average, 25% of the linesF are unused (since the array is doubled
   when full, it is typically only 50% to 100% occupied).
B. if a SecMap 'temporarily' needs linesF, but these linesF are afterwards
   converted back to the normal lineZ representation, the linesF
   will not be recovered unless the SecMap is GC-ed (i.e. fully marked
   no access).
The patch replaces the per-SecMap private linesF array
by a pool allocator of linesF shared between all SecMaps.
A lineZ that needs a lineF now points directly to its lineF (using a pointer
stored in dict[1]), instead of having in dict[1] the index into the SecMap
linesF array.
When a lineZ needs a lineF, it is allocated from the pool allocator.
When a lineZ no longer needs a lineF, it is returned to the
pool allocator.
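A hedged sketch of the alloc/free path (the helper names alloc_LineF_for_Z
and clear_LineF_of_Z are discussed further below; the other details are
simplified assumptions, not the exact code):

  /* one pool allocator of LineF, shared between all SecMaps */
  static PoolAlloc* lineF_pool_allocator;

  static LineF* alloc_LineF_for_Z ( LineZ* lineZ )
  {
     LineF* lineF = VG_(allocEltPA)( lineF_pool_allocator );
     lineZ->dict[1] = (SVal)lineF;  /* direct pointer, no per-SecMap index */
     return lineF;
  }

  static void clear_LineF_of_Z ( LineZ* lineZ )
  {
     VG_(freeEltPA)( lineF_pool_allocator, (LineF*)lineZ->dict[1] );
     lineZ->dict[1] = SVal_INVALID;
  }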
On a firefox startup, the above strategy reduces the memory for linesF
by about 42Mb. It seems that the more firefox is used (e.g. to visit
a few websites), the bigger the memory gain.
After opening the home page of valgrind, wikipedia and google, the memory
gain is about 94Mb:
trunk:
linesF: 392,181 allocd ( 203,934,120 bytes occupied) ( 173,279 used)
patch:
linesF: 212,966 allocd ( 109,038,592 bytes occupied) ( 170,252 used)
There are also fewer alloc/free operations in the core arena with the patch:
trunk:
core : 810,680,320/ 802,291,712 max/curr mmap'd, 17/19 unsplit/split sb unmmap'd, 759,441,224/ 703,191,896 max/curr, 40631760/16376828248 totalloc-blocks/bytes, 188015696 searches 8 rzB
patch:
core : 701,628,416/ 690,753,536 max/curr mmap'd, 12/29 unsplit/split sb unmmap'd, 643,041,944/ 577,793,712 max/curr, 32050040/14056017712 totalloc-blocks/bytes, 174097728 searches 8 rzB
In terms of performance, no CPU impact was detected on Firefox startup.
Note we have no representative, reproducible (and preferably small)
perf test that uses linesF extensively. Firefox is a good heavy lineF
user, but is far from reproducible, and very far from small.
Theoretically, in terms of CPU performance, the patch might have some
small benefits here and there for read operations, as the lineF pointer
is retrieved directly from the lineZ, rather than via an indirection
through the linesF array.
For write operations, the patch might need a little bit more CPU,
as an assignment of the lineF inUse boolean to False (and then probably back
to True when the cacheline is written back) is replaced by a call to the
pool allocator VG_(freeEltPA) (and then probably a call to
VG_(allocEltPA) when the cacheline is written back).
These PA functions are small, so the cost should be ok.
We might however still maintain in clear_LineF_of_Z the last cleared lineF
and re-use it in alloc_LineF_for_Z. It is not sure how many calls to the PA
functions would be avoided by this '1 elt cache' (and the needed
'if elt == NULL' check in both clear_LineF_of_Z and alloc_LineF_for_Z).
This possible optimisation will be looked at later.
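Such a '1 elt cache' could look roughly like this (a hedged sketch of the
idea only, not something this patch implements):

  static LineF* last_cleared_lineF = NULL;   /* the 1 elt cache */

  /* in clear_LineF_of_Z: keep the cleared element instead of freeing it */
  if (last_cleared_lineF == NULL)
     last_cleared_lineF = lineF;
  else
     VG_(freeEltPA)( lineF_pool_allocator, lineF );

  /* in alloc_LineF_for_Z: re-use the kept element if there is one */
  if (last_cleared_lineF != NULL) {
     lineF = last_cleared_lineF;
     last_cleared_lineF = NULL;
  } else {
     lineF = VG_(allocEltPA)( lineF_pool_allocator );
  }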
Carl Love [Fri, 15 May 2015 20:09:05 +0000 (20:09 +0000)]
Patch 5 in a revised series of cleanup patches from Will Schmidt
Add a .exp file for pth_cond_destroy_busy for PPC64 big endian.
This is specifically to cover the last line of output as
seen on ppc64BE, which is "ERROR SUMMARY: X errors from 3 contexts",
where X is 6, versus 3 as seen on other architectures.
The additional errors show up on BE during the "Thread #1:
pthread_cond_destroy: destruction of condition variable being waited upon."
Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
This patch fixes Valgrind bugzilla 347686.
Carl Love [Fri, 15 May 2015 16:50:06 +0000 (16:50 +0000)]
Patch 6 in a revised series of cleanup patches from Will Schmidt
Fix multipleinheritance heuristic for ppc64LE (leak_cpp_interior test).
Adjust the PPC64 #ifdeffery to indicate that ppc64BE uses a thunk table,
but ppc64LE (in particular, the ELF ABIV2) does not. In this case, thunk
table == function descriptors.
Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
--
This patch replaces the previously posted "[6/7] add leak_cpp_interior
test .exp results ....."
This patch (re-)gains performance in helgrind following revision 15207, which
reduced memory use by doing SecMap GC, but was slowing down some workloads
(typically, workloads doing a lot of malloc/free).
A significant part of the slowdown came from the clearing of the filter,
which was not optimised for big ranges: the filter was working byte
per byte up to 8-byte alignment, then 8 bytes at a time.
With the patch, the filter clear is done the following way:
* all the bytes till 8-byte alignment are done together
* then 8 bytes at a time till filter_line alignment (32 bytes)
* then 32 bytes at a time.
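A hedged sketch of that splitting (helper names are illustrative; the real
code works on Helgrind's filter cache lines):

  static void Filter__clear_range ( Filter* fi, Addr a, UWord len )
  {
     Addr end = a + len;
     /* leading bytes until 8-byte alignment (handled together in the
        real code) */
     while (a < end && (a & 7) != 0) {
        clear_1byte(fi, a); a += 1;
     }
     /* 8 bytes at a time until filter-line (32 bytes) alignment */
     while (a + 8 <= end && (a & 31) != 0) {
        clear_8bytes(fi, a); a += 8;
     }
     /* whole 32-byte filter lines */
     while (a + 32 <= end) {
        clear_32bytes(fi, a); a += 32;
     }
     /* the trailing, not 32-byte aligned, part is then handled with
        8-byte and 1-byte steps again -- omitted here */
  }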
Moreover, as the filter cache is small (1024 lines of 32 bytes),
clearing the filter for ranges bigger than 32Kb was uselessly checking
the same entry several times. This is now avoided by using a range
check rather than a tag equality check.
As the new filter clear is significantly more complex than the previous simple
algorithm, the old algorithm is kept and used to check the new algorithm
when CHECK_ZSM is defined as 1.
The patch also contains a few micro-optimisations and
disables
// VG_(track_die_mem_stack) ( evh__die_mem );
as this had no effect and was somewhat costly.
With this patch, we have, for all perf tests, almost reached the same
performance as we had before revision 15207. Some tests are still
slightly slower than before the SecMap GC (max 2% difference).
Some tests are now significantly faster (e.g. sarp).
For almost all tests, we are now faster than valgrind 3.10.1.
Details below.
Regtested on x86/amd64/ppc64 (and regtested with all compile time
checks set).
I have also regtested with libreoffice and firefox.
(with firefox, also with CHECK_ZSM set to 1).
Details about performance:
hgtrace = this patch
trunk_untouched = trunk
base_secmap = trunk before secmap GC
valgrind 3.10.1 included for comparison
Measured on core i5 2.53GHz
Carl Love [Thu, 14 May 2015 21:52:59 +0000 (21:52 +0000)]
Patch 4 in a revised series of cleanup patches from Will Schmidt
Add a suppression to handle a "Jump to the invalid address..." message
that gets generated on power. This is a variation of the existing
suppressions.
While here, I also updated the "prog:" line in the vgtest file to reference
the supp_unknown executable, versus the badjump executable. They share the
same source code, so I think this is effectively cosmetic.
Add the lwpid to the scheduler status information.
E.g. we now have:
Thread 1: status = VgTs_Runnable (lwpid 15782)
==15782== at 0x8048EB5: main (sleepers.c:188)
client stack range: [0xBE836000 0xBE839FFF] client SP: 0xBE838F80
valgrind stack top usage: 10264 of 1048576
Thread 2: status = VgTs_WaitSys (lwpid 15828)
==15782== at 0x2E9451: ??? (syscall-template.S:82)
==15782== by 0x8048AD3: sleeper_or_burner (sleepers.c:84)
==15782== by 0x39B924: start_thread (pthread_create.c:297)
==15782== by 0x2F107D: clone (clone.S:130)
client stack range: [0x442F000 0x4E2EFFF] client SP: 0x4E2E338
valgrind stack top usage: 2288 of 1048576
This allows attaching GDB to the right lwpid in case
you want to examine the valgrind state rather than the guest state.
(It is needed to attach to the specific lwpid because valgrind is not
linked with libpthread, so GDB cannot discover the threads
of the process.)
Implement 'qXfer:exec-file:read' packet in Valgrind gdbserver.
Thanks to this packet, with recent GDB (>= 7.9.50.20150514-cvs), the
command 'target remote' will automatically load the executable file of
the process running under Valgrind. This means you do not need to
specify the executable file yourself; GDB will discover it by itself.
See GDB documentation about 'qXfer:exec-file:read' packet for more
info.
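For example, with such a GDB there is no need to give a 'file' command
before connecting; a plain

  (gdb) target remote | vgdb

is enough, and GDB then fetches the executable and its symbols by itself.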
Carl Love [Wed, 13 May 2015 21:46:47 +0000 (21:46 +0000)]
Patch 2 in a revised series of cleanup patches from Will Schmidt
Add a deep-D test .exp values for ppc64.
Depending on the system and the system's endianness, there are variances
in the library reference, and in the specific line number in the library.
I was able to add and modify existing filters to cover most of the variations,
but did need to add a .exp to cover the additional call stack entry as seen
on power.
This change allows the ppc64 targets to pass the massif/deep-D test.
Carl Love [Wed, 13 May 2015 21:10:12 +0000 (21:10 +0000)]
Patch 1 in a revised series of cleanup patches from Will Schmidt
Update the massif/big-alloc test for ppc64*.
In comparison to the existing .exp files, the time,total,extra-heap
values generated on ppc64* vary from the other architectures.
This .exp allows the ppc64 targets to pass the test.
* avoid indirection via function pointers to call SVal__rcinc and SVal__rcdec
* declare these functions inline
* transform 2 asserts on the hot path into checks conditionally compiled
  on CHECK_ZSM
This slightly optimises some perf tests with helgrind
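As a hedged sketch of the pattern (the assert and the helper shown are
illustrative, not the actual helgrind code):

  static inline void SVal__rcinc ( SVal s ) {
  #if CHECK_ZSM
     tl_assert(is_sane_SVal(s));  /* hot-path check, only when CHECK_ZSM is 1 */
  #endif
     rcinc_payload_of(s);         /* direct, inlinable call; previously this
                                     went through a function pointer */
  }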
Improves the way arena statistics are shown
The mmap'd max/curr and max/curr nr of bytes will be shown e.g. as
11,440,408/ 4,508,968
instead of 11440656/ 4509200
So it uses more space, but is more readable (in particular when the
numbers exceed the field width and so are not aligned anymore).
This patch decreases the memory used by the helgrind SecMaps,
by implementing a garbage collection for the SecMaps.
The basic change is that freed memory is marked as noaccess
(while before, it kept the previous marking, on the basis that
non-buggy applications do not access freed memory in any case).
Keeping the previous marking avoids the CPU/memory work needed
to mark noaccess.
However, marking freed memory noaccess and GC-ing the SecMaps reduces
the memory on big apps.
For example, a firefox test needs 220Mb less (on about 2.06 Gb).
Similar reduction for libreoffice batch (260 MB less on 1.09 Gb).
On such applications, the performance with the patch is similar to the trunk.
There is a performance decrease for applications that are doing
a lot of malloc/free repetitively: e.g. on some perf tests, an increase
in CPU of up to 15% has been observed.
Several performance optimisations can be done afterwards so as not to lose
too much performance. The decrease in memory is expected to produce
in any case a significant benefit in memory-constrained environments
(e.g. android phones).
So, after discussion with Julian, it was decided to commit as-is
and (re-)gain (part of) performance in follow-up commits.
Add some CFI directives to the code (in Valgrind itself) that performs
syscalls.
This allows attaching to Valgrind when Valgrind is blocked in a syscall
and having GDB produce a stack trace, rather than being unable
to unwind.
I.e. instead of having:
(gdb) bt
#0 0x380460f2 in do_syscall_WRK ()
(gdb)
with the directives, we obtain:
(gdb) bt
#0 vgPlain_mk_SysRes_x86_linux (val=1) at m_syscall.c:65
#1 vgPlain_do_syscall (sysno=168, a1=944907996, a2=1, a3=4294967295, a4=0, a5=0, a6=0, a7=0, a8=0) at m_syscall.c:791
#2 0x38031986 in vgPlain_poll (fds=0x385226dc <remote_desc_pollfdread_activity>, nfds=1, timeout=-1) at m_libcfile.c:535
#3 0x3807479f in vgPlain_poll_no_eintr (fds=0x385226dc <remote_desc_pollfdread_activity>, nfds=1, timeout=-1)
at m_gdbserver/remote-utils.c:86
#4 0x380752f0 in readchar (single=4096) at m_gdbserver/remote-utils.c:938
#5 0x38075ae3 in getpkt (buf=0x61f35020 "") at m_gdbserver/remote-utils.c:997
#6 0x38076fcb in server_main () at m_gdbserver/server.c:1048
#7 0x38072af2 in call_gdbserver (tid=1, reason=init_reason) at m_gdbserver/m_gdbserver.c:721
#8 0x380735ba in vgPlain_gdbserver (tid=1) at m_gdbserver/m_gdbserver.c:788
#9 0x3802c6ef in do_actions_on_error (allow_db_attach=<optimized out>, err=<optimized out>) at m_errormgr.c:532
#10 pp_Error (err=0x61f580e0, allow_db_attach=1 '\001', xml=1 '\001') at m_errormgr.c:644
#11 0x3802cc34 in vgPlain_maybe_record_error (tid=1643479264, ekind=8, a=2271560481, s=0x0, extra=0x62937f1c)
at m_errormgr.c:851
#12 0x38028821 in vgMemCheck_record_free_error (tid=1, a=2271560481) at mc_errors.c:836
#13 0x38007b65 in vgMemCheck_free (tid=1, p=0x87654321) at mc_malloc_wrappers.c:496
#14 0x3807e261 in do_client_request (tid=1) at m_scheduler/scheduler.c:1840
#15 vgPlain_scheduler (tid=1) at m_scheduler/scheduler.c:1406
#16 0x3808b6b2 in thread_wrapper (tidW=<optimized out>) at m_syswrap/syswrap-linux.c:102
#17 run_a_thread_NORETURN (tidW=1) at m_syswrap/syswrap-linux.c:155
#18 0x00000000 in ?? ()
(gdb)
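For reference, the kind of annotation involved looks roughly like the
hedged sketch below (not the actual Valgrind syscall stub; it only shows
CFI directives describing a stack adjustment so that an unwinder can step
over it):

  long demo_syscall_wrk ( long sysno )
  {
     long res;
     __asm__ volatile(
        "pushq %%rbx\n\t"
        ".cfi_adjust_cfa_offset 8\n\t"  /* tell the unwinder the CFA moved */
        ".cfi_rel_offset %%rbx, 0\n\t"  /* ... and where %rbx was saved    */
        "movq %1, %%rax\n\t"
        "syscall\n\t"
        "movq %%rax, %0\n\t"
        "popq %%rbx\n\t"
        ".cfi_adjust_cfa_offset -8\n\t"
        ".cfi_restore %%rbx\n\t"
        : "=r"(res) : "r"(sysno)
        : "rax", "rbx", "rcx", "r11", "memory");
     return res;
  }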
Carl Love [Wed, 6 May 2015 21:11:35 +0000 (21:11 +0000)]
Patch 8 in a series of cleanup patches from Will Schmidt
Add a helper script to determine if the platform is ppc64le.
This is specifically used to help exclude the 32-bit tests from being
run on a ppc64LE (ABIV2) platform. The 32-bit targets, specifically ppc32/*,
are not built on LE.
Carl Love [Wed, 6 May 2015 19:44:14 +0000 (19:44 +0000)]
Patch 2 in a series of cleanup patches from Will Schmidt
Adjust the badjump2 test for ppc64le/ABIV2. Under the ABIV2 there
is no function descriptor, so the fn[] setup does not apply.
This fixes the badjump2 test failure as seen on ppc64le.
* The out-of-memory message was using 'bytes have already been allocated.',
while this number is in fact the total anonymously mmap-ed.
Change the message so as to reflect the shown number.
* Also show the total anonymous mmap-ed in the non-OOM memory statistics.
This patch reduces the memory needed for a VtsTE by 25% (one word)
on 32-bit platforms. There is no memory reduction on 64-bit platforms,
due to alignment.
The patch also shows the vts stats when showing the helgrind stats.
The perf/memrw.c perf test also gets some new features,
allowing e.g. control of the size of the read or written blocks.
This patch adds a function that allows directly sizing an xarray properly
when the size is known in advance.
3 places were identified where this function can be used trivially.
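A hedged sketch of the typical use (the sizing call shown and the variables
around it are illustrative assumptions, not necessarily the exact new API):

  /* build an XArray of Addr whose final size nElems is known up front */
  XArray* xa = VG_(newXA)( VG_(malloc), "demo.1", VG_(free), sizeof(Addr) );
  VG_(hintSizeXA)( xa, nElems );       /* size it once, directly         */
  for (Word i = 0; i < nElems; i++)
     VG_(addToXA)( xa, &addrs[i] );    /* no intermediate reallocs now   */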
The result is a reduction of 'realloc' operations in the core
arena, and a small reduction in the ttaux arena
(it is the number of operations that decreases; the memory usage itself
stays the same, ignoring some 'rounding' effects).
E.g. for perf/bigcode 0, we change from
core 1085742/ 216745904 totalloc-blocks/bytes, 1085733 searches
ttaux 5348/ 6732560 totalloc-blocks/bytes, 5326 searches
to
core 712666/ 190998592 totalloc-blocks/bytes, 712657 searches
ttaux 5319/ 6731808 totalloc-blocks/bytes, 5296 searches
For bz2, we switch from
core 50285/ 32383664 totalloc-blocks/bytes, 50256 searches
ttaux 670/ 245160 totalloc-blocks/bytes, 669 searches
to
core 32564/ 29971984 totalloc-blocks/bytes, 32535 searches
ttaux 605/ 243280 totalloc-blocks/bytes, 604 searches
Performance-wise, on amd64, this improves memcheck performance
on perf tests by 0.0, 0.1 or 0.2 seconds, depending on the test.