Paul Eggert [Sat, 13 Nov 2004 00:50:56 +0000 (00:50 +0000)]
Avoid O(N**2) behavior when there are many temporary files.
(temptail): New variable, so that we can easily append to list.
(create_temp_file): Create new files at end of list, so that
searching the list has O(N*NMERGE) behavior instead of O(N**2).
(zaptemp): Update temptail if needed.
(mergefps, merge): Accept new arg that counts temp files, and keep it
up to date as we create and remove temporaries. This is for
efficiency, so that we don't call zaptemp so often.
All callers changed.
(sort): Don't create array in reverse order, since the list of
temporaries is now in the correct order.
(zaptemp): Protect against race condition: if 'sort' is
interrupted in the middle of zaptemp, it might unlink the
temporary file twice, and the second time this happens the file
might already have been created by some other process.
(create_temp_file): Use offsetof for clarity.
(die): Move it up earlier, to clean up the code a bit.
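A minimal sketch of the tail-pointer idea described in this entry, not the actual sort.c code. Only the name temptail comes from the entry above; the node layout and the helpers append_temp and remove_temp are hypothetical. Appending through a pointer to the last 'next' field is O(1), and removal must reset the tail when the last node goes away, which is presumably what the zaptemp change refers to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type; the real struct in sort.c may differ.  */
struct tempnode
{
  struct tempnode *next;
  char name[];                  /* File name, allocated to fit.  */
};

static struct tempnode *temphead = NULL;

/* Points at the 'next' field of the last node, or at temphead
   when the list is empty.  */
static struct tempnode **temptail = &temphead;

/* Append a node named NAME in O(1).  Return the node, or NULL if
   allocation fails.  */
static struct tempnode *
append_temp (const char *name)
{
  assert (name != NULL);
  size_t len = strlen (name);
  struct tempnode *node = malloc (offsetof (struct tempnode, name) + len + 1);
  if (!node)
    return NULL;
  memcpy (node->name, name, len + 1);
  node->next = NULL;
  *temptail = node;
  temptail = &node->next;
  return node;
}

/* Remove the node named NAME, updating temptail if it was the last
   node.  Return 1 if a node was removed, 0 otherwise.  */
static int
remove_temp (const char *name)
{
  struct tempnode **pnode;
  for (pnode = &temphead; *pnode; pnode = &(*pnode)->next)
    if (strcmp ((*pnode)->name, name) == 0)
      {
        struct tempnode *node = *pnode;
        *pnode = node->next;
        if (!node->next)
          temptail = pnode;
        free (node);
        return 1;
      }
  return 0;
}
```

Because new files go at the end, the oldest temporaries (the ones merged and removed first) sit near the head, so a search from the head tends to find them quickly.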
Paul Eggert [Fri, 12 Nov 2004 19:35:54 +0000 (19:35 +0000)]
(strtoumax): Declare if not declared.
(skip_to_page, first_page_number, last_page_number, page_number,
first_last_page, print_header):
Use uintmax_t for page numbers.
(first_last_page): Remove unnecessary forward declaration.
Do not modify arg (it is now a const pointer).
Return true if successful, false (without printing a diagnostic)
otherwise.
(main): If +XXX does not specify a valid page range, treat it
as a file name. This follows the response to Open Group XCU ERN 41
<http://www.opengroup.org/sophocles/show_mail.tpl?source=L&listname=austin-group-l&id=7717>,
which says the behavior is allowed.
(skip_to_page): When starting page number exceeds page count,
print both numbers in the diagnostic.
(print_header): Detect page number overflow.
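A rough sketch of how a +FIRST[:LAST] argument might be parsed into uintmax_t page numbers, returning false without a diagnostic so the caller can fall back to treating the argument as a file name. The helper names parse_page_range and next_page are hypothetical, as is the use of UINTMAX_MAX to mean "no last page"; this is not the code from pr.c.

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>

/* Parse ARG (the text after '+') as "FIRST" or "FIRST:LAST".
   On success store the range and return true.  On failure return
   false and print nothing, leaving *FIRST and *LAST untouched.  */
static bool
parse_page_range (const char *arg, uintmax_t *first, uintmax_t *last)
{
  assert (arg && first && last);
  char *end;

  /* Require a leading digit: strtoumax alone would accept
     whitespace and a sign.  */
  if (*arg < '0' || '9' < *arg)
    return false;
  errno = 0;
  uintmax_t lo = strtoumax (arg, &end, 10);
  if (errno == ERANGE || lo == 0)
    return false;

  uintmax_t hi = UINTMAX_MAX;   /* Assumption: means "no last page".  */
  if (*end == ':')
    {
      const char *p = end + 1;
      if (*p < '0' || '9' < *p)
        return false;
      errno = 0;
      hi = strtoumax (p, &end, 10);
      if (errno == ERANGE || hi < lo)
        return false;
    }
  if (*end != '\0')
    return false;

  *first = lo;
  *last = hi;
  return true;
}

/* Advance *PAGE, returning false instead of wrapping around.  */
static bool
next_page (uintmax_t *page)
{
  if (*page == UINTMAX_MAX)
    return false;
  ++*page;
  return true;
}
```

With this shape, a caller could try parse_page_range on "+notes.txt", see it fail, and then open the argument as a file.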
Jim Meyering [Tue, 9 Nov 2004 20:31:39 +0000 (20:31 +0000)]
[__APPLE__]: Include <mach/machine.h> and <mach-o/arch.h>.
(main) [__APPLE__]: Get the processor type via syscall rather than
hard-coding "powerpc". From toby@opendarwin.org.
Paul Eggert [Sat, 6 Nov 2004 23:46:47 +0000 (23:46 +0000)]
(first_same_file): Remove. Move most of the code to....
(avoid_trashing_input): New function.
(merge): Avoid some silly merges, e.g., copying a single file to
a temporary file when there are exactly 17 input files to merge.
Take a count of temporary files rather than a max_merge arg.
All uses changed.
Paul Eggert [Fri, 5 Nov 2004 23:02:09 +0000 (23:02 +0000)]
(inittables, sort_buffer_size, getmonth, mergefps,
first_same_file, merge, sort, main): Use size_t for indexes into arrays.
This fixes some unlikely havoc-wreaking bugs (e.g., more than INT_MAX
temporary files).
(getmonth, keycompare, compare): Rewrite to avoid need for alloca,
thus avoiding unchecked stack overflow in some cases. As a side
effect this improves the performance of "sort -M" by a factor of 4
on my benchmarks.
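One way to look up a month name without copying the key to the stack is to compare it in place against a fixed table. The sketch below illustrates that idea only; the function name month_of, the English-only table, and the linear scan are assumptions, and the real getmonth may work differently.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Assumed table of upper-case month abbreviations.  */
static const char month_names[12][4] = {
  "JAN", "FEB", "MAR", "APR", "MAY", "JUN",
  "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"
};

/* Return 1..12 if the LEN bytes at S start (after optional blanks)
   with a month abbreviation, ignoring case; return 0 otherwise.
   The key is compared in place, so no alloca or malloc is needed.
   Indexes are size_t, matching the entry above.  */
static int
month_of (const char *s, size_t len)
{
  assert (s != NULL);
  size_t i = 0;
  while (i < len && isblank ((unsigned char) s[i]))
    i++;

  for (size_t m = 0; m < 12; m++)
    {
      size_t k = 0;
      while (k < 3 && i + k < len
             && toupper ((unsigned char) s[i + k]) == month_names[m][k])
        k++;
      if (k == 3)
        return (int) m + 1;
    }
  return 0;
}
```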
Paul Eggert [Fri, 29 Oct 2004 23:22:09 +0000 (23:22 +0000)]
Document TZ better, and adjust to new getdate.texi.
(Top): Update menu.
(pr invocation, Formatting file timestamps, touch invocation,
stat invocation, who invocation, date invocation, Options for date):
Mention TZ.
Jim Meyering [Fri, 29 Oct 2004 21:55:15 +0000 (21:55 +0000)]
`tac /proc/modules' would print nothing
(copy_to_temp): Renamed from save_stdin, since
now it copies a general file descriptor, not just stdin.
(tac_nonseekable): Renamed/adapted from tac_stdin.
(tac_file): Get fd via `open' directly rather than via fopen/fileno,
since we never used the stream. Perform "-" to stdin mapping here
rather than in main. Determine whether a file is seekable,
by trying to `lseek' to its end, and dispatch to tac_seekable or
tac_nonseekable accordingly.
(main): Rewrite argument handling now that it uses only tac_file.
Reported by Harald Dunkel in http://bugs.debian.org/278604.
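The seekability test described for tac_file can be sketched as follows, assuming a POSIX host. The helper names open_input and fd_is_seekable are hypothetical; only the use of open and of lseek to the end of the file comes from the entry above.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Open NAME read-only, mapping "-" to standard input.
   Return a file descriptor, or -1 on failure.  */
static int
open_input (const char *name)
{
  assert (name != NULL);
  return strcmp (name, "-") == 0 ? STDIN_FILENO : open (name, O_RDONLY);
}

/* Return true if FD supports seeking to its end.  Pipes, FIFOs and
   sockets fail with ESPIPE.  Side effect: on success the file
   offset is left at end of file.  */
static bool
fd_is_seekable (int fd)
{
  return lseek (fd, 0, SEEK_END) != (off_t) -1;
}
```

A caller would dispatch on the result: read backward from the end when the descriptor is seekable, and otherwise copy the input to a temporary file first.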
Paul Eggert [Thu, 28 Oct 2004 07:50:51 +0000 (07:50 +0000)]
(Standards conformance): Use "head -10" rather than "head -1" as
example of obsolete usage, since the POSIX consensus is that "head -1"
could be supported even if we don't yet have clear consensus on "head
-10". See today's revision to the SUS FAQ
<http://www.opengroup.org/austin/papers/single_unix_faq.html>.
Jim Meyering [Thu, 21 Oct 2004 10:37:18 +0000 (10:37 +0000)]
Correct my patch of 2004-10-18.
(rm): Destroy the saved_cwd here (via cwd_state),
if necessary, not in remove_dir. Otherwise, removing multiple
`.'-relative nonempty directories no longer worked.
Jim Meyering [Mon, 18 Oct 2004 08:59:12 +0000 (08:59 +0000)]
Plug a leak that would cause rm or a cross-device mv to fail when
operating on too many command-line-specified nonempty directories.
(remove_dir): Destroy the `struct saved_cwd' on the
top of the stack before returning. This usually closes the file
descriptor that was used to return to the original working directory.
Reported by Cyril Bouthors in
http://article.gmane.org/gmane.comp.gnu.core-utils.bugs/3048
Jim Meyering [Mon, 18 Oct 2004 08:19:26 +0000 (08:19 +0000)]
(validate_file_name): Give a more descriptive
diagnostic when pathconf fails. This also avoids an unwarranted
warning from gcc-3.3.5 about a format not being a string literal.
Paul Eggert [Mon, 18 Oct 2004 06:30:49 +0000 (06:30 +0000)]
(AUTHORS): Add self.
Change "path" to "file name" whenever possible.
Remove usage comment, as it was a duplication of the code or doc.
Include <wchar.h> if available.
(mbrlen, mbstate_t) [! (HAVE_MBRLEN && HAVE_MBSTATE_T)]: Define.
(NEED_PATHCONF_WRAPPER, PATH_MAX, PATH_MAX_FOR, NAME_MAX,
pathconf_wrapper, portable_chars, dir_ok): Remove.
(NAME_MAX_MINIMUM, PATH_MAX_MINIMUM): New macros.
(pathconf, _PC_NAME_MAX, _PC_PATH_MAX): Define if nonexistent.
(portable_chars_only): New arg FILELEN.
Don't assume ASCII; we might be on an EBCDIC host.
Don't assume unibyte locale in diagnostic.
(component_start, component_len): New functions.
(validate_file_name): Renamed from validate_path. All uses changed.
Pretty much a complete rewrite.
Don't make copy of file arg. Always append trailing slash to
pathconf arg, just in case it's a symlink (this is pure paranoia;
we don't know of any hosts where the trailing slash is required).
Use size_t instead of long int when possible.
Avoid need to call pathconf in most practical cases.
Don't use euidaccess several times to test searchability;
just use lstat once. Reword diagnostic to put the (often very long)
file names last.
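A small sketch of a portable-character check that does not assume ASCII: it lists the POSIX portable file name characters (plus '/') explicitly and lets strspn do the scan, rather than testing character code ranges. The function name first_nonportable is hypothetical and this is not the pathchk code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The POSIX portable file name character set, plus '/' as the
   component separator.  Spelling the letters out avoids relying on
   'A'..'Z' being contiguous, which is false on EBCDIC hosts.  */
static const char portable_set[] =
  "/"
  "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
  "abcdefghijklmnopqrstuvwxyz"
  "0123456789"
  "._-";

/* Return a pointer to the first byte of FILE that is outside the
   portable set, or NULL if every byte is portable.  */
static const char *
first_nonportable (const char *file)
{
  assert (file != NULL);
  const char *p = file + strspn (file, portable_set);
  return *p ? p : NULL;
}
```

The returned pointer lets a caller report the offending character; in a multibyte locale the diagnostic would need to print the whole character starting there, not a single byte.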