]> git.ipfire.org Git - thirdparty/tar.git/commit
Fix O(n^2) time bug in --delay-directory-restore
authorBenjamin Woodruff <bgw@meta.com>
Mon, 21 Aug 2023 20:06:45 +0000 (13:06 -0700)
committerPaul Eggert <eggert@cs.ucla.edu>
Mon, 21 Aug 2023 20:42:14 +0000 (13:42 -0700)
commita5afb367655ac6dcbc79ba7802a2479044184b38
tree73a4afd76244aee52be97e36dabe8e6f66121f06
parentbfee1d44a3d09a25586fe364c0cde74ebf0fa638
Fix O(n^2) time bug in --delay-directory-restore

delayed_set_stat avoids inserting duplicate entries into
delayed_set_stat_head. It was doing this by scanning the entire
list.

Normally this list is small, but if --delay-directory-restore is
used (including automatically for incremental archives), this list
grows with the total number of directories in the archive.

The entire scan takes O(n) time. Extracting an archive with n
directories could therefore take O(n^2) time.

The included test uses AT_SKIP_LARGE_FILES, allowing it to optionally be
skipped. It may execute slowly on certain filesystems or disks, as it
creates thousands of directories.

There are still potentially problematic O(n) scans in
find_direct_ancestor and remove_delayed_set_stat, which this patch does
not attempt to fix.

* NEWS: Update.
* src/extract.c (delayed_set_stat_table): Create a table for O(1)
lookups of entries in the delayed_set_stat_head list. The list
remains, as tracking insertion order is important.
(dl_hash, dl_compare): New hash table helper functions.
(delay_set_stat): Create the hash table, replace the O(n) list scan
with a hash_lookup, insert new entries into the hash table.
(remove_delayed_set_stat): Also remove entry from hash table.
(apply_nonancestor_delayed_set_stat): Also remove entry from hash
table.
(extract_finish): Free the (empty) hash table.
* tests/extrac26.at: New file.
* tests/Makefile.am (TESTSUITE_AT): Include extrac26.at.
* tests/testsuite.at: Include extrac26.at.
NEWS
src/extract.c
tests/Makefile.am
tests/extrac26.at [new file with mode: 0644]
tests/testsuite.at