]> git.ipfire.org Git - thirdparty/zlib-ng.git/commit
return an index for hash map collisions in insert_string
authorSebastian Pop <s.pop@samsung.com>
Thu, 6 Dec 2018 19:23:17 +0000 (13:23 -0600)
committerHans Kristian Rosbach <hk-github@circlestorm.org>
Thu, 13 Dec 2018 08:03:25 +0000 (09:03 +0100)
commit13619fd2b6d0d5e2c2b5d8e8c08bc97097415c11
tree18f6f448389c38585a5f6f9de24341b703b523b0
parent2aebdb5c7023de0b9ac7c19301216616997f961a
return an index for hash map collisions in insert_string

The current version of insert_string_c and variations for sse2, arm, and aarch64
in zlib-ng has changed semantics from the original code of INSERT_STRING macro
in zlib:

 #define INSERT_STRING(s, str, match_head) \
   (UPDATE_HASH(s, s->ins_h, s->window[(str) + (MIN_MATCH-1)]), \
    match_head = s->prev[(str) & s->w_mask] = s->head[s->ins_h], \
    s->head[s->ins_h] = (Pos)(str))

The code of INSERT_STRING assigns match_head with the content of s->head[s->ins_h].

In zlib-ng, the assignment to match_head happens in the caller of insert_string().
zlib-ng's insert_string_*() functions return 0 instead of str+idx in case of
collision, i.e., when if (s->head[s->ins_h] == str+idx).

The effect of returning 0 instead of the content of s->head[s->ins_h] is that
the search for a longest_match through s->prev[] chains will be cut short when
arriving at 0. This leads to a shorter compression time at the expense of a
worse compression rate: returning 0 cuts out the search space.

With this patch:

 Performance counter stats for './minigzip -9 llvm.tar':

      13422.379017      task-clock (msec)         #    1.000 CPUs utilized
                20      context-switches          #    0.001 K/sec
                 0      cpu-migrations            #    0.000 K/sec
               130      page-faults               #    0.010 K/sec
    58,926,104,511      cycles                    #    4.390 GHz
   <not supported>      stalled-cycles-frontend
   <not supported>      stalled-cycles-backend
    77,543,740,646      instructions              #    1.32  insns per cycle
    17,158,892,214      branches                  # 1278.379 M/sec
       198,433,680      branch-misses             #    1.16% of all branches

      13.423365095 seconds time elapsed

45408 -rw-rw-r-- 1 spop spop 46493896 Dec 11 11:47 llvm.tar.gz

Without this patch the compressed file is larger:

 Performance counter stats for './minigzip -9 llvm.tar':

      13459.342312      task-clock (msec)         #    1.000 CPUs utilized
                25      context-switches          #    0.002 K/sec
                 0      cpu-migrations            #    0.000 K/sec
               129      page-faults               #    0.010 K/sec
    59,088,391,808      cycles                    #    4.390 GHz
   <not supported>      stalled-cycles-frontend
   <not supported>      stalled-cycles-backend
    77,600,766,958      instructions              #    1.31  insns per cycle
    17,486,130,785      branches                  # 1299.182 M/sec
       196,281,761      branch-misses             #    1.12% of all branches

      13.463512830 seconds time elapsed

45408 -rw-rw-r-- 1 spop spop 46493896 Dec 11 11:48 llvm.tar.gz
arch/aarch64/insert_string_acle.c
arch/arm/insert_string_acle.c
arch/x86/insert_string_sse.c
deflate_p.h