git.ipfire.org Git - thirdparty/public-inbox.git/log
2 years agodoc: technical/ds: update blurb to note more daemons
Eric Wong [Thu, 9 Mar 2023 19:28:38 +0000 (19:28 +0000)] 
doc: technical/ds: update blurb to note more daemons

And add a note about the various wakeup modes of kqueue|epoll
while we're at it; we use all of them!

2 years agodoc: technical/memory: add note about mwrap-perl
Eric Wong [Thu, 9 Mar 2023 19:28:37 +0000 (19:28 +0000)] 
doc: technical/memory: add note about mwrap-perl

It's already fixed memory usage problems not only in our codebase,
but also the standard `Encode' XS module and `git pack-objects'.

2 years agolei_mirror: unlink FETCH_HEAD when fetching forkgroups
Eric Wong [Wed, 8 Mar 2023 11:02:58 +0000 (11:02 +0000)] 
lei_mirror: unlink FETCH_HEAD when fetching forkgroups

Apparently, --no-write-fetch-head is broken in current git[1].
It also wasn't in older git, at all.  So just unlink FETCH_HEAD
as we see it, but keep using --no-write-fetch-head to avoid the
syscall and I/O overhead when we can.

[1] https://yhbt.net/lore/git/20230308100438.908471-1-e@80x24.org/
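
The unlink fallback described above could look roughly like the
following.  This is a minimal sketch, not the actual lei_mirror code:
the helper name `unlink_fetch_head' and its return values are
assumptions made for illustration.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not the actual lei_mirror code): remove a stale
# FETCH_HEAD from a git directory, treating "already gone" as success.
sub unlink_fetch_head {
	my ($git_dir) = @_;
	my $f = File::Spec->catfile($git_dir, 'FETCH_HEAD');
	return 1 if unlink($f); # removed it
	return 0 if $!{ENOENT}; # nothing to remove
	die "unlink($f): $!"; # unexpected failure
}
```

Ignoring ENOENT keeps the call harmless on git versions where
--no-write-fetch-head works and the file never appears.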

2 years agotest_common: run_script: drop special-case for -clone
Eric Wong [Tue, 7 Mar 2023 09:54:15 +0000 (09:54 +0000)] 
test_common: run_script: drop special-case for -clone

`make check' and `make check-run' actually work fine with it,
and TMPDIR=/dev/shm prove -lvw t/clone-coderepo.t is 2-3x faster

2 years agocgit: fix smart HTTP clone interception
Eric Wong [Tue, 7 Mar 2023 09:32:37 +0000 (09:32 +0000)] 
cgit: fix smart HTTP clone interception

We need to use the proper hash and key to do coderepo lookups
since we culled a redundant data structure a few months back.

Fixes: 1802dc29bda25a54 ("www_coderepo: do not copy {-code_repos} from config")
2 years agosha: fix compatibility with old OpenSSL + Net::SSLeay
Eric Wong [Tue, 7 Mar 2023 08:47:15 +0000 (08:47 +0000)] 
sha: fix compatibility with old OpenSSL + Net::SSLeay

In older OpenSSL, EVP_get_digestbyname() didn't work properly
without calling OpenSSL_add_all_digests() first.  However,
OpenSSL_add_all_digests() is deprecated as of OpenSSL 1.1.0 in
favor of OPENSSL_init_crypto().  Of course, OPENSSL_init_crypto()
isn't available in OpenSSL 1.0.1k, nor in Net::SSLeay as of 1.93_02
(2023-02-22).

Thus, instead of relying on string lookups and conditional
subroutine calls, just call EVP_sha1() and EVP_sha256() which
work on both old and new systems.

Tested with Net::SSLeay 1.55 and OpenSSL 1.0.1k on CentOS 7.x.

2 years agodoc: update public-inbox-clone examples and help
Eric Wong [Sun, 5 Mar 2023 22:18:11 +0000 (22:18 +0000)] 
doc: update public-inbox-clone examples and help

Basically, public-inbox-clone has become grok-pull without
config files or absolute paths.

2 years agodoc: drop hosted.txt
Eric Wong [Thu, 2 Mar 2023 00:13:14 +0000 (00:13 +0000)] 
doc: drop hosted.txt

I'll have to downsize the server due to increased hosting costs,
so stop advertising these mirrors.

The inboxes still exist for now, but will probably be proxied
behind an ssh tunnel via a slow DSL connection, so it's not worth
increasing traffic to them.

2 years agodoc: update clone+fetch with 2.0+ switches
Eric Wong [Mon, 27 Feb 2023 10:21:05 +0000 (10:21 +0000)] 
doc: update clone+fetch with 2.0+ switches

Because old versions will exist for a long time and our latest
documentation is visible on the web, we must document when a
switch appears to avoid confusing users of old versions.

2 years agoprocess_pipe: BINMODE: pass LAYER argument
Eric Wong [Mon, 27 Feb 2023 07:18:34 +0000 (07:18 +0000)] 
process_pipe: BINMODE: pass LAYER argument

We'll end up using this to handle `:utf8', probably.

2 years agodoc: note "lei q -tt" is broken with HTTP(S) remotes
Eric Wong [Sun, 26 Feb 2023 17:15:06 +0000 (17:15 +0000)] 
doc: note "lei q -tt" is broken with HTTP(S) remotes

I'm still trying to decide how to handle HTTP(S) remotes
properly...

Link: https://public-inbox.org/meta/20230226170931.M947721@dcvr/
2 years agods: write: do not assume final wbuf entry is tmpio
Eric Wong [Fri, 24 Feb 2023 16:59:10 +0000 (16:59 +0000)] 
ds: write: do not assume final wbuf entry is tmpio

The final entry of {wbuf} may be a CODE ref and not a
tmpio ARRAY ref, so we must ensure it's an ARRAY before
attempting to use `->[INDEX]' to access it.

This fixes:
  forward ->close error: Not an ARRAY reference at PublicInbox/DS.pm line 544.
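
The type check described above amounts to something like this
sketch.  The helper name and the buffer layout (an ARRAY ref whose
second element is an offset) are hypothetical and not taken from
PublicInbox::DS.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a write buffer whose entries are either
# ARRAY refs (buffered file data) or CODE refs (callbacks).
sub last_entry_offset {
	my ($wbuf) = @_;
	my $last = $wbuf->[-1];
	# only index into the entry once we know it is an ARRAY ref
	return ref($last) eq 'ARRAY' ? $last->[1] : undef;
}
```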

2 years agoexamples: remove `Standard{Error,Output} = syslog' lines
Eric Wong [Wed, 22 Feb 2023 18:17:39 +0000 (18:17 +0000)] 
examples: remove `Standard{Error,Output} = syslog' lines

systemd (247.3-7+deb11u1 on Debian 11.x) considers them "obsolete" and
emits the following to my syslog:

  Standard output type syslog is obsolete, automatically updating to journal.
  Please update your unit file, and consider removing the setting altogether.

So we'll remove it altogether, as I'm sticking with rsyslog for now.

2 years agotreewide: simplify File::Path mkpath/make_path callers
Eric Wong [Wed, 22 Feb 2023 17:25:55 +0000 (17:25 +0000)] 
treewide: simplify File::Path mkpath/make_path callers

File::Path already accounts for the existence of directories,
handles races from redundant mkdir(2), and croaks on
unrecoverable errors.  So there's no point in doing any
of that on our end.

Furthermore, avoiding the overhead of loading File::Path doesn't
seem worth it to save 20-60ms given the overhead of loading
our other code.  Instead, try to reduce optree overhead in
our own code, since File::Path gets used in a bunch of
places.

We'll also favor the newer make_path for multi-directory
invocations to avoid bloating our own optree to create an
arrayref, but mkpath is one fewer subroutine call within
File::Path itself, right now.
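
As a rough illustration of why caller-side checks are redundant,
here is a small sketch.  The wrapper name `ensure_dirs' is made up
for this example; only make_path itself comes from File::Path.

```perl
use strict;
use warnings;
use File::Path qw(make_path);

# make_path returns the list of directories it actually created, so
# an already-existing directory simply yields an empty list; it croaks
# on unrecoverable errors, so no pre-checks are needed here.
sub ensure_dirs {
	my (@dirs) = @_;
	my @created = make_path(@dirs);
	return scalar(@created);
}
```

Calling it twice with the same arguments is safe: the second call
creates nothing and returns 0.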

2 years agosendmsg: prefix sleep message with `#'
Eric Wong [Wed, 22 Feb 2023 17:25:52 +0000 (17:25 +0000)] 
sendmsg: prefix sleep message with `#'

It's an informative message that's harmless, so hopefully
the `#' prefix puts the user's mind at ease.

(I saw it on an `lei import' against an IMAP source)

2 years agolei_mirror: support --remote-manifest=URL
Eric Wong [Tue, 21 Feb 2023 12:17:44 +0000 (12:17 +0000)] 
lei_mirror: support --remote-manifest=URL

Since PublicInbox::WWW already generates manifest.js.gz, I'm
using an alternate path with PublicInbox::WwwStatic to host the
manifest.js.gz for coderepos at an alternate location.  The
following snippet lets me host
https://yhbt.net/lore/pub/manifest.js.gz for mirrored git
repositories, while https://yhbt.net/lore/manifest.js.gz
(no `pub') remains for inbox mirroring.

==> sample.psgi <==
use PublicInbox::WWW;
use PublicInbox::WwwStatic;
use Plack::Builder; # provides builder, enable and mount
my $www = PublicInbox::WWW->new; # use default PI_CONFIG
my $st = PublicInbox::WwwStatic->new(docroot => '/path/to/code');
my $www_cb = sub {
	my ($env) = @_;
	# serve the coderepo manifest from the static docroot
	if ($env->{PATH_INFO} eq '/pub/manifest.js.gz') {
		local $env->{PATH_INFO} = '/manifest.js.gz';
		my $res = $st->call($env);
		return $res if $res->[0] != 404;
	}
	# everything else (and 404s from above) goes to the inbox WWW
	$www->call($env);
};
builder {
	enable 'ReverseProxy';
	enable 'Head';
	mount '/lore' => $www_cb;
}

2 years agoviewvcs: handle non-UTF-8 commit message
Eric Wong [Tue, 21 Feb 2023 11:17:58 +0000 (11:17 +0000)] 
viewvcs: handle non-UTF-8 commit message

Back in the old days, git didn't store commit encodings
and allowed messages in various encodings to enter history.
Assuming such a commit is UTF-8 trips up s/// operations
on buffers read with the `:utf8' PerlIO layer.  So clear
Perl's internal UTF-8 flag if we end up with something
which isn't valid UTF-8.

An example is commit 7eb93c89651c47c8095d476251f2e4314656b292
in git.git ([PATCH] Simplify git script, 2005-09-07)
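
One way to avoid trusting the encoding up front is sketched below.
Note this is a different technique from the one in this commit (which
clears the flag after reading through `:utf8'): it starts from raw
bytes and relies on utf8::decode refusing malformed input.  The
helper name is hypothetical.

```perl
use strict;
use warnings;

# Sketch of an alternative approach: start from raw bytes and only
# mark the buffer as UTF-8 when it actually decodes cleanly.
sub decode_if_utf8 {
	my ($bytes) = @_;
	my $buf = $bytes; # work on a copy
	utf8::decode($buf); # returns false and leaves $buf alone if invalid
	return $buf;
}
```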

2 years agoREADME: add POP3 bits
Eric Wong [Mon, 20 Feb 2023 11:06:03 +0000 (11:06 +0000)] 
README: add POP3 bits

Maybe this can make our newish support of POP3 more
noticeable...

2 years agosearchidx: do not index quoted Base-85 patches
Eric Wong [Mon, 20 Feb 2023 09:21:50 +0000 (09:21 +0000)] 
searchidx: do not index quoted Base-85 patches

Base-85 binary patches were a source of false positives in results
and we've filtered them out of non-quoted text since July 2022.
Unfortunately, people were quoting binary patch contents
in replies (*sigh*) and triggering false positives in search
results.  So we must filter out base-85-looking contents from
quoted text, too.

Followup-to: 8fda04081acde705 (search: do not index base-85 binary patches, 2022-06-20)
Followup-to: 840785917bc74c8e (searchidx: skip "delta $N" sections for base-85, 2022-07-19)
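
A filter for base-85-looking lines might be sketched as follows.
This is not the regexp searchidx uses: the function names, the
quote-prefix handling and the four-group minimum length are all
assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Characters used by git's base-85 encoding.
my $B85 = qr/[0-9A-Za-z!#\$%&()*+\-;<=>?\@^_`{|}~]/;

# A binary patch line is one length character followed by groups of
# five base-85 characters; the 4-group minimum is an arbitrary guard
# against matching short ordinary words.
sub looks_like_base85 {
	my ($line) = @_;
	return $line =~ /\A[A-Za-z](?:(?:$B85){5}){4,}\z/ ? 1 : 0;
}

# Drop base-85-looking lines, whether quoted ("> " prefixed) or not.
sub strip_base85 {
	my ($text) = @_;
	my @keep;
	for my $line (split(/^/m, $text)) {
		(my $bare = $line) =~ s/\A(?:>[ \t]?)+//;
		$bare =~ s/\r?\n\z//;
		push @keep, $line unless looks_like_base85($bare);
	}
	return join('', @keep);
}
```

Any heuristic like this can still drop a long unbroken token from
ordinary text, which is why the sketch requires a minimum length.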
2 years agomulti_git: do not set include.path if already set
Eric Wong [Mon, 20 Feb 2023 05:32:02 +0000 (05:32 +0000)] 
multi_git: do not set include.path if already set

The epoch may already be read-only, and we don't need to cause
more I/O traffic and disk wear for no-op stuff.  This fixes
idempotent use of public-inbox-clone to update multi-epoch
inboxes.

2 years agogit_async_cat: don't mis-abort replaced process
Eric Wong [Mon, 20 Feb 2023 08:19:43 +0000 (08:19 +0000)] 
git_async_cat: don't mis-abort replaced process

When a git process gets replaced (e.g. due to new
epochs/alternates), we must be careful and not abort the wrong
one.

I suspect this fixes the problem exacerbated by --batch-command.
It was theoretically possible w/o --batch-command, but it seems
to have made it surface more readily.

This should fix "Failed to retrieve generated blob" errors from
PublicInbox/ViewVCS.pm appearing in syslog

Link: https://public-inbox.org/meta/20230209012932.M934961@dcvr/
2 years agosearch: translate d: to dt: in query
Eric Wong [Sun, 19 Feb 2023 08:18:14 +0000 (08:18 +0000)] 
search: translate d: to dt: in query

dt: is higher resolution and the YYYYMMDD column will be dropped
if there's ever another SCHEMA_VERSION update.  While the
upcoming code repo index is independent of the mail schemas,
it'll use similar query prefixes and likely use d:/dt: for
Author Date of git commits.

2 years agosearch: move query transform + enquire setup out of retry loop
Eric Wong [Fri, 17 Feb 2023 10:36:14 +0000 (10:36 +0000)] 
search: move query transform + enquire setup out of retry loop

The Xapian query transformation and Enquire object setup aren't
subject to MVCC and retries, so move it outside the retry loop
to save some cycles in case we need to retry on a busy DB.
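
The shape of that change can be sketched generically.  The helper
below is hypothetical and does not use Xapian; `$setup' stands in for
the query transform and Enquire setup, and `$attempt' for the part
which may fail on a busy DB.

```perl
use strict;
use warnings;

# Hypothetical retry helper: $setup runs once, $attempt is retried.
sub with_retries {
	my ($setup, $attempt, $max) = @_;
	my $prepared = $setup->(); # invariant work, done once
	for my $try (1 .. $max) {
		my $res = eval { $attempt->($prepared) };
		return $res unless $@;
		die $@ if $try == $max; # out of retries, propagate
	}
}
```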

2 years agopublic-inbox.cgi(1): Mention AllowEncodedSlashes for Apache setups
Uwe Kleine-König [Fri, 17 Feb 2023 11:08:50 +0000 (12:08 +0100)] 
public-inbox.cgi(1): Mention AllowEncodedSlashes for Apache setups

When AllowEncodedSlashes is Off (the default setting), requests for
URLs containing %2f get a 404 response without the CGI being called.
To (maybe) spare others from debugging this issue, add a hint with
the solution.

2 years agoTODO: handle more cases of unencoded slashes
Eric Wong [Fri, 17 Feb 2023 10:32:22 +0000 (10:32 +0000)] 
TODO: handle more cases of unencoded slashes

Nowadays, mutt defaults to Message-IDs with `/' in them :<

2 years agoMakefile.PL: drop update-copyrights rule
Eric Wong [Wed, 15 Feb 2023 08:01:12 +0000 (08:01 +0000)] 
Makefile.PL: drop update-copyrights rule

I'm no longer updating them since it's noisy and acceptable
to not have them:

  https://www.linuxfoundation.org/blog/copyright-notices-in-open-source-software-projects/

I'm tired of being reminded what year it is :<

2 years agodoc: extindex update on configuration and union section
Eric Wong [Wed, 15 Feb 2023 08:01:11 +0000 (08:01 +0000)] 
doc: extindex update on configuration and union section

The coderepo indexer will use similar ideas, I think...

2 years agodoc: flow: update with newer tools, note forkability
Eric Wong [Wed, 15 Feb 2023 08:01:10 +0000 (08:01 +0000)] 
doc: flow: update with newer tools, note forkability

public-inbox-{clone,fetch,netd} are all relatively new
developments which we can document, here.

We'll also update the generator Makefile snippet since there may
be more Graph::Easy-based docs coming.

2 years agodoc: WWW + cgi: favor -netd over -httpd
Eric Wong [Wed, 15 Feb 2023 08:01:09 +0000 (08:01 +0000)] 
doc: WWW + cgi: favor -netd over -httpd

-netd is strictly more powerful and a gateway drug for
imapd/nntpd/pop3d instances :>

2 years agowww_coderepo: handle unborn/dead branches in summary
Eric Wong [Tue, 14 Feb 2023 13:17:39 +0000 (13:17 +0000)] 
www_coderepo: handle unborn/dead branches in summary

We need to account for `git log' showing nothing for invalid
branches and continue to render properly.  We'll also quiet down
`git log' stderr to avoid cluttering logs, too.

2 years agowww_coderepo: quiet 404s on Atom feeds for dead branches
Eric Wong [Tue, 14 Feb 2023 13:17:38 +0000 (13:17 +0000)] 
www_coderepo: quiet 404s on Atom feeds for dead branches

No need to clutter up logs when a request hits a dead branch.

2 years agolei q: do not collapse threads with `-tt'
Eric Wong [Tue, 14 Feb 2023 02:42:32 +0000 (02:42 +0000)] 
lei q: do not collapse threads with `-tt'

While having Xapian collapse threads is an easy way to reduce
the amount of deduplication work we need to do when writing
out threads, we can't rely on it when using `lei q -tt' since
that needs to flag all hits.

Reported-by: Maxim Mikityanskiy <maxtram95@gmail.com>
Link: https://public-inbox.org/git/Y+pgBmj0jxR+cVkD@mail.gmail.com/
2 years agoimap: quiet Parse::RecDescent errors on bad search queries
Eric Wong [Mon, 13 Feb 2023 01:02:12 +0000 (01:02 +0000)] 
imap: quiet Parse::RecDescent errors on bad search queries

Parse::RecDescent emits giant errors to STDERR by default
(bypassing $SIG{__WARN__}, even).  Shut it up since there's
no good way to pass those back to a client, and we don't want
clients flooding logs with bogus requests.

2 years agolei_mirror: fetch most-recently-updated repos, first
Eric Wong [Sun, 12 Feb 2023 23:18:28 +0000 (23:18 +0000)] 
lei_mirror: fetch most-recently-updated repos, first

Within the same forkgroup, we can assume the most recently updated
repo has the most data, so fetch those, first.  We'll save new clones
for last since we can preserve {reference} ordering for them.

2 years agolei_mirror: further reduce `git config' calls
Eric Wong [Sun, 12 Feb 2023 23:18:27 +0000 (23:18 +0000)] 
lei_mirror: further reduce `git config' calls

We can parse the config at once and avoid clobbering variables
which do not need changing.  We'll also do some prep work for
fetch.hideRefs proposal being discussed at
<https://public-inbox.org/git/20230209122857.M669733@dcvr/>

2 years agot/lei-refresh-mail-sync: avoid kill+sleep loop
Eric Wong [Sun, 12 Feb 2023 03:12:03 +0000 (03:12 +0000)] 
t/lei-refresh-mail-sync: avoid kill+sleep loop

While we can't waitpid() on a daemonized process, we can abuse the
lack of FD_CLOEXEC to detect a process death.  This saves
roughly 400ms for this slow test.

2 years agogit_async_cat: use awaitpid
Eric Wong [Fri, 10 Feb 2023 08:56:41 +0000 (08:56 +0000)] 
git_async_cat: use awaitpid

While awaitpid already registered a no-op callback in
_bidi_pipe, we can still call it again when registering it into
our event loop to ensure EPOLL_CTL_DEL fires.

2 years agolei_mirror: avoid dir/file conflicts in update-ref
Eric Wong [Fri, 10 Feb 2023 03:58:52 +0000 (03:58 +0000)] 
lei_mirror: avoid dir/file conflicts in update-ref

Using the files ref backend for git, `delete' and `create'
operations for `update-ref --stdin' need to be processed in
separate transactions to avoid conflicts in cases where a file
becomes a directory (or presumably, vice versa).

2 years agospawn_pp: fix incorrect `use'
Eric Wong [Thu, 9 Feb 2023 21:53:20 +0000 (21:53 +0000)] 
spawn_pp: fix incorrect `use'

We can't `use PublicInbox::Spawn' from SpawnPP because
PublicInbox::Spawn loads SpawnPP from BEGIN.

Fixes: 9eb8baf199cd148b (spawn_pp: use `which()' properly for pure-Perl spawn, 2023-01-29)
2 years agolei_mirror: show non-ASCII owner properly w/ --verbose
Eric Wong [Thu, 9 Feb 2023 12:30:59 +0000 (12:30 +0000)] 
lei_mirror: show non-ASCII owner properly w/ --verbose

This makes the verbose progress output look nicer, but doesn't
affect the actual config file generation.

2 years agolei_mirror: reduce `git config' usage
Eric Wong [Mon, 6 Feb 2023 05:56:35 +0000 (05:56 +0000)] 
lei_mirror: reduce `git config' usage

We can use `git -c $KEY=$VAL fetch' with a random remote name
that never makes it to a config file.
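
A sketch of building such a command follows.  The helper name, the
`tmp-' remote naming scheme and the refspec are assumptions for
illustration and not what lei_mirror actually uses.

```perl
use strict;
use warnings;

# Build a fetch command whose remote exists only on the command line.
# The remote name and refspec here are illustrative assumptions.
sub fetch_cmd {
	my ($git_dir, $url) = @_;
	my $rn = sprintf('tmp-%d-%d', $$, int(rand(1_000_000)));
	return ('git', "--git-dir=$git_dir",
		'-c', "remote.$rn.url=$url",
		'-c', "remote.$rn.fetch=+refs/*:refs/*",
		'fetch', $rn);
}
```

Since the remote is only defined via `-c', nothing is written to the
repository's config file and no cleanup is needed afterwards.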

2 years agowww: sort all /$INBOX/ topics by Received: timestamp
Eric Wong [Sat, 4 Feb 2023 20:41:10 +0000 (20:41 +0000)] 
www: sort all /$INBOX/ topics by Received: timestamp

Our previous pinning prevention only worked to prevent older
(non-most-recent) topics from being pinned to the landing page,
but not the most recent window of messages.

We still sort messages within threads by Date: because that
makes git-send-email patchsets display more nicely, but we
don't want recent topics pinned due to future Date: headers.

I nearly switched sort_ds() back to sorting by Received: until
I looked back on commit 8e52e5fdea416d6fda0b8d301144af0c043a5a76
(use both Date: and Received: times, 2018-03-21) and was reminded
git-send-email relies on Date: for large series, so I added a
note about it for sort_ds().

Reported-by: Kyle Meyer <kyle@kyleam.com>
Tested-by: Kyle Meyer <kyle@kyleam.com>
Link: https://public-inbox.org/meta/87edr5gx63.fsf@kyleam.com/
2 years agolei_mirror: use --no-write-fetch-head on git 2.29+
Eric Wong [Fri, 3 Feb 2023 03:46:03 +0000 (03:46 +0000)] 
lei_mirror: use --no-write-fetch-head on git 2.29+

This avoids unnecessary writes to the FETCH_HEAD file, which is
worthless in multi-remote mirrors.  Actually, I haven't found
FETCH_HEAD useful anywhere since the `/remotes/' namespace
became popular...

2 years agowww: diff: fix encoding problems when showing diff
Eric Wong [Tue, 31 Jan 2023 10:31:57 +0000 (10:31 +0000)] 
www: diff: fix encoding problems when showing diff

We need to use the utf8 layer when writing files to be diffed,
and utf8::decode the `git diff' output.  Furthermore, do the
CRLF > LF conversion early to avoid showing CRLF vs LF
differences in the diff, since that doesn't matter to MUAs
(nor our normal HTML views)

2 years agolei: drop -watches and -lei_note_event from workers
Eric Wong [Tue, 31 Jan 2023 00:05:15 +0000 (00:05 +0000)] 
lei: drop -watches and -lei_note_event from workers

I noticed these while tracking down circular refs for commit
7b654d175cf2e31b (ipc: drop awaitpid_init to avoid circular refs, 2023-01-30).
While they're not the cause of circular refs, they're still
a waste of memory in worker processes.

2 years agotests: make require_git and require_cmd easier-to-use
Eric Wong [Mon, 30 Jan 2023 22:50:07 +0000 (22:50 +0000)] 
tests: make require_git and require_cmd easier-to-use

We'll rely on defined(wantarray) to implicitly skip subtests,
and memoize these to reduce syscalls, since tests should
be short-lived enough to not be affected by new installations or
removals of git/xapian-compact/curl/etc...
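
The context-sensitive behavior can be sketched like this.  It is not
the test_common implementation: `find_cmd' is a made-up PATH search,
and dying stands in for skipping a subtest.

```perl
use strict;
use warnings;

my %cmd_cache; # memoized lookups: command name => path or undef

# Hypothetical PATH search, cached so repeated checks cost no syscalls.
sub find_cmd {
	my ($cmd) = @_;
	return $cmd_cache{$cmd} if exists $cmd_cache{$cmd};
	my ($path) = grep { -x $_ && !-d _ }
		map { "$_/$cmd" } split(/:/, $ENV{PATH} // '');
	return $cmd_cache{$cmd} = $path;
}

# In void context (wantarray is undef), a missing command dies, which
# stands in for skipping the test; otherwise the caller gets the result.
sub require_cmd {
	my ($cmd) = @_;
	my $path = find_cmd($cmd);
	return $path if defined($path) || defined(wantarray);
	die "SKIP: $cmd missing\n";
}
```

A caller who only wants to probe writes `my $curl = require_cmd('curl')'
and checks the result, while a bare `require_cmd('curl');' bails out
on its own.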

2 years agotests: make slow tests easier-to-find
Eric Wong [Mon, 30 Jan 2023 04:30:58 +0000 (04:30 +0000)] 
tests: make slow tests easier-to-find

t/run.perl now prints slowest 10 tests at startup, and I've
added ./devel/longest-tests to print all tests sorted by
elapsed time.

This should allow us to notice outliers more quickly in the
future.

2 years agoipc: drop awaitpid_init to avoid circular refs
Eric Wong [Mon, 30 Jan 2023 04:30:57 +0000 (04:30 +0000)] 
ipc: drop awaitpid_init to avoid circular refs

This brings t/lei-index.t back down from ~8 to ~3s.  I didn't
notice this before because the LeiNoteEvent timer was firing
every 5s and clearing circular refs, and parallel testing meant
the delay got hidden.

Fixes: 4a2a95bbc78f99c8 (ipc+lei: switch to awaitpid, 2023-01-17)
2 years agoxt/lei-auth-fail: use valid label name
Eric Wong [Sun, 29 Jan 2023 22:58:35 +0000 (22:58 +0000)] 
xt/lei-auth-fail: use valid label name

Uppercase characters aren't allowed for labels due to Xapian
boolean limitations, so we need to use lowercase labels.

Fixes: 27015c3365fd0690 (lei_input: disallow uppercase characters for labels, 2021-10-31)
2 years agolei_input: give a hint for upper-case in labels
Eric Wong [Sun, 29 Jan 2023 22:58:34 +0000 (22:58 +0000)] 
lei_input: give a hint for upper-case in labels

I just encountered this error in xt/lei-auth-fail.t

2 years agocontent_digest_dbg: convert to arrayref and limit to lei
Eric Wong [Sun, 29 Jan 2023 10:30:42 +0000 (10:30 +0000)] 
content_digest_dbg: convert to arrayref and limit to lei

Since it's an extremely small class and not subclassed or
anything, we'll make it even smaller as an arrayref.

We also don't load this for PublicInbox::WWW or anything that
runs in public-facing daemons.

2 years agouse Net::SSLeay (OpenSSL) for SHA-(1|256) if installed
Eric Wong [Sun, 29 Jan 2023 10:30:41 +0000 (10:30 +0000)] 
use Net::SSLeay (OpenSSL) for SHA-(1|256) if installed

On my x86-64 machine, OpenSSL SHA-256 is nearly twice as fast as
the Digest::SHA implementation from Perl, most likely due to an
optimized assembly implementation.  SHA-1 is a few percent
faster, too.

2 years agospawn_pp: use `which()' properly for pure-Perl spawn
Eric Wong [Sun, 29 Jan 2023 09:45:11 +0000 (09:45 +0000)] 
spawn_pp: use `which()' properly for pure-Perl spawn

I have no idea if mod_perl/mod_perl2 is used nowadays, but
we're stuck supporting it as long as mod_perl exists.  So
add some tests and make minor updates to existing ones to
ensure it stays working.

2 years agowww_coderepo: summary: fix mis-linkification of `...'
Eric Wong [Sat, 28 Jan 2023 11:02:55 +0000 (11:02 +0000)] 
www_coderepo: summary: fix mis-linkification of `...'

We need to use the ternary operator in assignments to clobber
previous values of `$last'.

2 years agowww_coderepo: support $REPO/refs/{heads,tags}/ endpoints
Eric Wong [Sat, 28 Jan 2023 11:02:54 +0000 (11:02 +0000)] 
www_coderepo: support $REPO/refs/{heads,tags}/ endpoints

These are also in cgit, but we'll include CLI hints to show
viewers how our data is generated.  We don't have "$REPO/refs/"
without (heads|tags) yet, though...

2 years agorepo_atom: translate: account for multiple args
Eric Wong [Sat, 28 Jan 2023 11:02:53 +0000 (11:02 +0000)] 
repo_atom: translate: account for multiple args

->translate should handle unlimited args, even if we don't
currently use it that way...

2 years agowww_coderepo: reduce utf8::decode calls
Eric Wong [Sat, 28 Jan 2023 11:02:52 +0000 (11:02 +0000)] 
www_coderepo: reduce utf8::decode calls

It's safe to call utf8::decode on data where "\0" exists.

2 years agowww_coderepo: fix snapshot link generation
Eric Wong [Sat, 28 Jan 2023 11:02:51 +0000 (11:02 +0000)] 
www_coderepo: fix snapshot link generation

Do not assume ".git" exists as a suffix in the repo nickname,
and filter out all trailing slashes in case it didn't get
filtered from Config.

2 years agowww_coderepo: support /$REPO/tags.atom endpoint
Eric Wong [Sat, 28 Jan 2023 11:02:50 +0000 (11:02 +0000)] 
www_coderepo: support /$REPO/tags.atom endpoint

Providing an Atom feed for tags can be a nice way for users
to subscribe to new releases without excessive noise.

2 years agowww_coderepo: tree: quiet and 404 on non-existent refs
Eric Wong [Sat, 28 Jan 2023 11:02:49 +0000 (11:02 +0000)] 
www_coderepo: tree: quiet and 404 on non-existent refs

Clients should see 404s when attempting to hit files for deleted
branches or tags.

2 years agogit: drop needless checks for old git
Eric Wong [Thu, 26 Jan 2023 09:32:57 +0000 (09:32 +0000)] 
git: drop needless checks for old git

`ambiguous' was added in git 2.21, and `dangling' was the only
other possible phrase which was inadvertently slipped in prior
to 2.21.  Thus there's no need to check for `notdir' or `loop'
responses since we aren't using `git cat-file --follow-symlinks'
anywhere.

2 years agogit: use --batch-command in git 2.36+ to save processes
Eric Wong [Thu, 26 Jan 2023 09:32:56 +0000 (09:32 +0000)] 
git: use --batch-command in git 2.36+ to save processes

`git cat-file --batch-command' combines the functionality of
`--batch' and `--batch-check' into a single process.  This
reduces the number of running processes and is primarily
useful for coderepos (e.g. solver).

This also fixes prior use of `print { $git->{out} }' which is
a potential (but unlikely) bug since commit d4ba8828ab23f278
(git: fix asynchronous batching for deep pipelines, 2023-01-04)

Lack of libgit2 on one of my test machines also uncovered fixes
necessary for t/imapd.t, t/nntpd.t and t/nntpd-v2.t.

2 years agogit: reduce delete ops in _destroy
Eric Wong [Wed, 25 Jan 2023 10:18:35 +0000 (10:18 +0000)] 
git: reduce delete ops in _destroy

We can avoid some extra returns and branches by just relying on
variadic arguments.

2 years agogit: drop needless ENOENT import
Eric Wong [Wed, 25 Jan 2023 10:18:34 +0000 (10:18 +0000)] 
git: drop needless ENOENT import

I imported it in commit 356439a571c536eaa487031802b436d087113f4f
(gcf2 + extsearch: check for unlinked files on Linux, 2021-09-22)
but never used it.

2 years agoprocess_pipe: warn hackers off using it for bidirectional pipes
Eric Wong [Wed, 25 Jan 2023 10:18:33 +0000 (10:18 +0000)] 
process_pipe: warn hackers off using it for bidirectional pipes

While most uses of ->DESTROY happen in a predictable order in
long-lived daemons, process teardown on exit is chaotic and not
subject to ordering guarantees, so we must keep both ends of a
`git cat-file --batch*' pipe at the same level in the object
hierarchy.

Drop an old Carp import while I'm in the area.

2 years agogit: use core.abbrev=no on git 2.31+
Eric Wong [Wed, 25 Jan 2023 10:18:32 +0000 (10:18 +0000)] 
git: use core.abbrev=no on git 2.31+

This makes it easier to support SHA-256 inboxes in the future.
Tested with both git 2.30.2 (Debian stable) and 2.39.1

2 years agoviewvcs: improve tree glossary view
Eric Wong [Tue, 24 Jan 2023 09:49:40 +0000 (09:49 +0000)] 
viewvcs: improve tree glossary view

Add an <hr> to help delineate the glossary, note that
submodules are rare, and avoid needlessly defining the
commits-in-trees case since the extra information is likely
to overwhelm new users.

2 years agowww_coderepo: remove some needless return statements
Eric Wong [Tue, 24 Jan 2023 09:49:39 +0000 (09:49 +0000)] 
www_coderepo: remove some needless return statements

Maybe it makes control flow a little easier to rely on
implicit return (IIRC, it's slightly faster, too).

2 years agosolver_git: remove extraneous leading `-'
Eric Wong [Tue, 24 Jan 2023 09:49:38 +0000 (09:49 +0000)] 
solver_git: remove extraneous leading `-'

It was a harmless negation, I must've pasted a line from a diff
and forgotten to chop off the first character :x

Fixes: 6f5b238bae5c "solver: early make hints detection more robust"
2 years agoviewvcs: show message for 404||500 errors
Eric Wong [Tue, 24 Jan 2023 09:49:37 +0000 (09:49 +0000)] 
viewvcs: show message for 404||500 errors

The debug log isn't present for /$REPO/ URLs, and its absence
makes 404s look confusing.

2 years agoviewvcs: expand on path names being "non-authoritative"
Eric Wong [Tue, 24 Jan 2023 09:49:36 +0000 (09:49 +0000)] 
viewvcs: expand on path names being "non-authoritative"

Hopefully this makes sense...

2 years agohttp: reuse STDIN if it's already /dev/null
Eric Wong [Tue, 24 Jan 2023 09:49:35 +0000 (09:49 +0000)] 
http: reuse STDIN if it's already /dev/null

It's typical for -netd/-httpd to have STDIN pointed to
/dev/null, so try to use that instead of opening another
file description.

2 years agowww_coderepo: eliminate debug log footer
Eric Wong [Tue, 24 Jan 2023 09:49:34 +0000 (09:49 +0000)] 
www_coderepo: eliminate debug log footer

WwwCoderepo is for viewing blobs already in code repositories,
so there's no place for a debug log showing which mails were
used to arrive at a given blob.  The debug footer remains for
/$INBOX/$OID/s/ URLs, of course.

2 years agowww_coderepo: show /$INBOX/?t=$DATE link for commits
Eric Wong [Tue, 24 Jan 2023 09:49:33 +0000 (09:49 +0000)] 
www_coderepo: show /$INBOX/?t=$DATE link for commits

While we can't inexpensively search for git commits based on the
timestamp, coderepos configured for inboxes can still look up
messages based on the inbox URL.

2 years agoviewvcs: prepopulate search bar with dfpost + dfn
Eric Wong [Tue, 24 Jan 2023 09:49:32 +0000 (09:49 +0000)] 
viewvcs: prepopulate search bar with dfpost + dfn

I'm not sure if this will get overlooked by users, but maybe
it can serve as a hint...

2 years agoviewvcs: add path name hint based on `b=' query param
Eric Wong [Tue, 24 Jan 2023 09:49:31 +0000 (09:49 +0000)] 
viewvcs: add path name hint based on `b=' query param

Of course, we need a note saying it's non-authoritative since
anybody can fiddle with the `b=' parameter in the URL.

2 years agoqspawn: drop lineno from command failure warning
Eric Wong [Tue, 24 Jan 2023 09:49:30 +0000 (09:49 +0000)] 
qspawn: drop lineno from command failure warning

git, cgit, or any other command failing isn't an error
we can do anything about in qspawn, so don't have Perl
emit line number info and needlessly pollute logs.

2 years agods: awaitpid: do not clobber entries for reaped processes
Eric Wong [Sat, 21 Jan 2023 08:58:19 +0000 (08:58 +0000)] 
ds: awaitpid: do not clobber entries for reaped processes

We must only write to $AWAIT_PIDS on the initial reap attempt.
While we're at it, avoid triggering an extra wakeup if we're
doing synchronous awaitpid.  This seems to eliminate most
reliance on Qspawn->DESTROY to call Qspawn->finalize.

2 years agoqspawn: drop unnecessary awaitpid import
Eric Wong [Thu, 19 Jan 2023 20:32:37 +0000 (20:32 +0000)] 
qspawn: drop unnecessary awaitpid import

We don't actually need to call awaitpid here, ProcessPipe
will take care of that.

2 years agods: improve error handling of synchronous awaitpid
Eric Wong [Thu, 19 Jan 2023 20:32:36 +0000 (20:32 +0000)] 
ds: improve error handling of synchronous awaitpid

EINTR needs to be retried for non-kqueue|signalfd users,
and ECHILD indicates a bug in our code.
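
The error handling described above might look like the sketch below.
The function name is hypothetical and this is not the PublicInbox::DS
code; it only shows the EINTR/ECHILD distinction.

```perl
use strict;
use warnings;

# Hypothetical blocking reap: retry on EINTR, treat ECHILD as a bug.
sub sync_reap {
	my ($pid) = @_;
	while (1) {
		my $r = waitpid($pid, 0);
		return $? if $r == $pid; # reaped, return wait status
		next if $!{EINTR}; # interrupted by a signal, try again
		die "BUG: waitpid($pid): $!"; # e.g. ECHILD: not our child
	}
}
```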

2 years agoqspawn: psgi_qx: do not call async_pass on errors
Eric Wong [Thu, 19 Jan 2023 20:32:35 +0000 (20:32 +0000)] 
qspawn: psgi_qx: do not call async_pass on errors

This makes control flow slightly less confusing.

2 years agoqspawn: {quiet} only affects normal command exit
Eric Wong [Thu, 19 Jan 2023 20:32:34 +0000 (20:32 +0000)] 
qspawn: {quiet} only affects normal command exit

{quiet} is nice for quieting normal/expected errors (e.g. `git diff'),
but we still want to show the command in case there are errors in
our own code.

2 years agods: drop dwaitpid, switch to waitpid(-1)
Eric Wong [Tue, 17 Jan 2023 07:19:11 +0000 (07:19 +0000)] 
ds: drop dwaitpid, switch to waitpid(-1)

With no remaining users, we can drop dwaitpid and switch
awaitpid to rely on waitpid(-1) to save syscalls.

2 years agoipc+lei: switch to awaitpid
Eric Wong [Tue, 17 Jan 2023 07:19:10 +0000 (07:19 +0000)] 
ipc+lei: switch to awaitpid

This avoids awkwardly stuffing an arrayref into callbacks
which expect multiple arguments.  IPC->awaitpid_init now
allows pre-registering callbacks before spawning workers.

2 years agoipc: drop unused $args from ->ipc_worker_stop
Eric Wong [Tue, 17 Jan 2023 07:19:09 +0000 (07:19 +0000)] 
ipc: drop unused $args from ->ipc_worker_stop

It's not used anywhere, and simplifies the next commit.

2 years agowatch: IMAP and NNTP polling can use the same interval
Eric Wong [Tue, 17 Jan 2023 07:19:08 +0000 (07:19 +0000)] 
watch: IMAP and NNTP polling can use the same interval

An obvious error :x

2 years agoeofpipe: drop {arg} support for now
Eric Wong [Tue, 17 Jan 2023 07:19:07 +0000 (07:19 +0000)] 
eofpipe: drop {arg} support for now

The only user of EOFpipe has no args, so avoid wasting a hash
slot on it.  If we need it again in the future, EOFpipe will
allow an array of args, instead.

2 years agowatch: simplify internal data structures
Eric Wong [Tue, 17 Jan 2023 07:19:06 +0000 (07:19 +0000)] 
watch: simplify internal data structures

We can flatten arrays and avoid distinguishing between PID
types now that more of that logic and argument passing logic
is offloaded to awaitpid.

2 years agowatch: switch to awaitpid
Eric Wong [Tue, 17 Jan 2023 07:19:05 +0000 (07:19 +0000)] 
watch: switch to awaitpid

-watch relies on our event_loop anyway, and awaitpid lets us
avoid the extra overhead of EOFpipe.  Add an extra {quit} check
in imap_idle_fork while we're at it.

2 years agogit|gcf2: switch to awaitpid
Eric Wong [Tue, 17 Jan 2023 07:19:04 +0000 (07:19 +0000)] 
git|gcf2: switch to awaitpid

This is a trivial change compared to Qspawn in the previous
commit.

2 years agoqspawn: use ->DESTROY to force ->finalize
Eric Wong [Wed, 18 Jan 2023 02:10:11 +0000 (02:10 +0000)] 
qspawn: use ->DESTROY to force ->finalize

There are apparently a few places where we do not call ->finalize
or ->finish and leave dangling limiter slots occupied.  I can't
reproduce this easily, so it's likely in error-handling paths.

I already made ->finalize idempotent when switching to awaitpid
since I wanted to rely entirely on DESTROY.  However, DESTROY
doesn't always fire soon enough (and the client has already seen
a response), but using DESTROY as a fallback seems reasonable.

This does the minimum to ensure the limiter is freed up on
process exit, but ensuring a finish/finalize call always happens
is the goal.
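
The fallback pattern described above can be sketched roughly as follows.
This is an illustration in Python, not the Qspawn code: the Limiter and
Job classes and their fields are hypothetical, and Python's __del__
stands in for Perl's DESTROY.  The point is only that an idempotent
finalize makes it safe for the destructor to call it again.

```python
import gc


class Limiter:
    """Hypothetical limiter: counts how many slots are occupied."""

    def __init__(self, max_slots):
        self.max_slots = max_slots
        self.running = 0


class Job:
    """Hypothetical job that occupies one limiter slot until finalized."""

    def __init__(self, limiter):
        self._finalized = False
        self.limiter = limiter
        limiter.running += 1

    def finalize(self):
        # Idempotent: a second call must not release the slot twice.
        if self._finalized:
            return
        self._finalized = True
        self.limiter.running -= 1

    def __del__(self):
        # Fallback only; an explicit finalize() call remains the goal.
        self.finalize()


limiter = Limiter(2)

job = Job(limiter)
job.finalize()
job.finalize()  # harmless repeat
after_explicit = limiter.running

leaked = Job(limiter)  # an error path that forgets to call finalize()
del leaked             # destructor releases the slot as a fallback
gc.collect()           # not needed on CPython, kept for other runtimes
after_fallback = limiter.running
```

As in the commit message, the destructor may run later than an explicit
call would, so it only guards against leaked slots.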

2 years agods: introduce awaitpid, switch ProcessPipe users
Eric Wong [Tue, 17 Jan 2023 07:19:03 +0000 (07:19 +0000)] 
ds: introduce awaitpid, switch ProcessPipe users

awaitpid is the new API which will eventually replace dwaitpid.
It enables early registration of callback handlers.  Eventually
(once dwaitpid is gone) it'll be able to use fewer waitpid
calls.

The avoidance of waitpid(-1) in our earlier days was driven by
the belief that threads may eventually become relevant for Perl 5,
but that's extremely unlikely at this stage.  I will still
introduce optional threads via C, but they definitely won't be
spawning/reaping processes.

Argument order to callbacks is swapped (PID first) to allow
flattened multiple arguments more naturally.  The previous API
(allowing only a single argument, as influenced by
pthread_create(3)) was more tedious as it involved packing
multiple arguments into yet another array.

2 years agoqspawn: drop {psgi_env} deref
Eric Wong [Tue, 17 Jan 2023 07:19:02 +0000 (07:19 +0000)] 
qspawn: drop {psgi_env} deref

We don't use the assigned variable anywhere, and just access
PATH_INFO directly in the subsequent warning message.

2 years agot/solver_git.t: fix test message
Eric Wong [Tue, 17 Jan 2023 07:19:01 +0000 (07:19 +0000)] 
t/solver_git.t: fix test message

2 years agoipc: remove {-reap_async} field
Eric Wong [Tue, 17 Jan 2023 07:19:00 +0000 (07:19 +0000)] 
ipc: remove {-reap_async} field

We can just test for {-reap_do} instead, to save us a few bytes.

2 years agosearchview: fix uninitialized variable
Eric Wong [Tue, 17 Jan 2023 18:25:43 +0000 (18:25 +0000)] 
searchview: fix uninitialized variable

Seems harmless, but noise in logs is not good.

2 years agocoderepo: consolidate git --batch-check users
Eric Wong [Fri, 13 Jan 2023 10:35:50 +0000 (10:35 +0000)] 
coderepo: consolidate git --batch-check users

And another opportunity to simplify our code between different
PSGI-ish implementations.  The snapshot retrieval is simpler,
but potentially slower since we waste cycles scanning for tags
even after we've found one.  It's probably not a big deal since
it's only short info lines and we can utilize pipelining.

2 years agoviewvcs: use git(1) for coderepo access
Eric Wong [Fri, 13 Jan 2023 10:35:49 +0000 (10:35 +0000)] 
viewvcs: use git(1) for coderepo access

libgit2 development has fallen behind git.git and I've been
using objectformat=sha256 somewhere else for over 18 months.

Hoist out do_cat_async() into its own sub to hide generic PSGI
vs -httpd differences while we're at it to save us some code.

2 years agoqspawn: import Scalar::Util::blessed properly
Eric Wong [Fri, 13 Jan 2023 10:35:48 +0000 (10:35 +0000)] 
qspawn: import Scalar::Util::blessed properly

Scalar::Util may not be loaded by other modules in the future.

2 years agowww_coderepo: tree: do not break #n$LINENO
Eric Wong [Fri, 13 Jan 2023 04:01:32 +0000 (04:01 +0000)] 
www_coderepo: tree: do not break #n$LINENO

We can't use 302 redirects at the /tree/ endpoint as originally
intended since "#n$LINENO" fragment links aren't preserved
across redirects (since clients don't typically send that part
of the URL in requests).

So we'll have to make sure we handle prefixes properly and show
trees directly.  Oh well :<  At least the history-aware 404
handling remains :>
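
To illustrate why the server never sees the "#n$LINENO" part, here is
a small Python sketch using a made-up URL.  The host and path are
hypothetical; only the split between request target and fragment
matters.

```python
from urllib.parse import urlsplit

# Hypothetical link to line 42 of a file under the /tree/ endpoint
url = "https://example.org/repo/tree/lib/Foo.pm#n42"
parts = urlsplit(url)

# What a client puts on the HTTP request line: path plus optional query.
request_target = parts.path
if parts.query:
    request_target += "?" + parts.query

# The fragment stays on the client side and is not part of the request.
fragment = parts.fragment
```

Since the request target carries no fragment, the server has nothing
to copy into a redirect's Location header.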