From: Eric Wong
Date: Tue, 8 Oct 2024 05:18:37 +0000 (+0000)
Subject: v2writable: more debug output for `lei import' failures
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=a9d0d8527deba1fc6e31e90f61685bf8db498ffa;p=thirdparty%2Fpublic-inbox.git

v2writable: more debug output for `lei import' failures

A difficult-to-reproduce bug in `lei import' introduced in
4ff8e8d21ab5 (lei/store: stop shard workers + cat-file on idle,
2024-04-16) causes {idx_shards} to not be recreated properly due to
a shard being locked while attempting to get a write lock on it:

	Exception: Unable to get write lock on /path/to/shard0: already locked

I'm not sure if the bug is from unnecessarily holding onto a shard
too long, or incorrectly attempting to open an already-open shard.
In either case, hopefully a more complete backtrace can be obtained,
since setting PERL5OPT=-MCarp=verbose in the shared lei/store worker
process isn't straightforward[*].

AFAIK, this doesn't affect normal v2 and -extindex activity, only
lei users who import mail[*]

[*] lei attempts to ensure read-after-write consistency across
    parallel instances to satisfy users who import multiple IMAP
    mailboxes at once over high-latency networks.  However, Xapian,
    SQLite, and multi-epoch git only work well with one
    writer-at-a-time, so lei jumps through hoops to avoid
    introducing surprising local delays and waits.
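As background for the PERL5OPT=-MCarp=verbose mention above: loading Carp in "verbose" mode sets $Carp::Verbose, which upgrades every carp/croak to a full stack backtrace (the behavior of confess). A minimal standalone sketch, with hypothetical `inner`/`outer` subs standing in for the real call chain:

```perl
#!/usr/bin/perl
# Sketch of what PERL5OPT=-MCarp=verbose enables: with $Carp::Verbose
# set, croak() reports the entire call stack instead of only the
# caller's location, which is what a shared worker process would need
# to pinpoint where the bad idx_shard() call originates.
use strict;
use warnings;
use Carp qw(croak);

$Carp::Verbose = 1; # -MCarp=verbose does this at interpreter startup

# hypothetical call chain, not public-inbox code
sub inner { croak 'Exception: Unable to get write lock' }
sub outer { inner() }

eval { outer() };
print $@; # message followed by main::inner/main::outer stack frames
```

Without $Carp::Verbose, croak would report only outer()'s line; with it, $@ carries every frame, which is the extra context this commit tries to capture via confess instead.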
---
diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index dab5e64d9..19bb66ada 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -23,6 +23,7 @@ use PublicInbox::Search;
 use PublicInbox::SearchIdx qw(log2stack is_ancestor check_size is_bad_blob);
 use IO::Handle; # ->autoflush
 use POSIX ();
+use Carp qw(confess);
 
 my $OID = qr/[a-f0-9]{40,}/;
 
 # an estimate of the post-packed size to the raw uncompressed size
@@ -90,9 +91,9 @@ sub init_inbox {
 
 sub idx_shard ($$) {
 	my ($self, $num) = @_;
-	use Carp qw(confess); # FIXME: lei_store bug somewhere..
+	# FIXME: lei_store bug somewhere..
 	confess 'BUG: {idx_shards} unset' if !$self->{idx_shards};
-	confess 'BUG: {idx_shards} empty' if !@{$self->{idx_shards}};
+	confess 'BUG: {idx_shards} empty' if !@{$self->{idx_shards}};
 	$self->{idx_shards}->[$num % scalar(@{$self->{idx_shards}})];
 }
 
@@ -227,11 +228,18 @@ sub _idx_init { # with_umask callback
 	# need to create all shards before initializing msgmap FD
 	# idx_shards must be visible to all forked processes
-	my $max = $self->{shards} - 1;
-	my $idx = $self->{idx_shards} = [];
-	push @$idx, PublicInbox::SearchIdxShard->new($self, $_) for (0..$max);
-	$self->{-need_xapian} = $idx->[0]->need_xapian;
-
+	eval {
+		my $max = $self->{shards} - 1;
+		my $idx = $self->{idx_shards} = [];
+		for (0..$max) {
+			push @$idx, PublicInbox::SearchIdxShard->new($self, $_)
+		}
+		$self->{-need_xapian} = $idx->[0]->need_xapian;
+	};
+	if ($@) {
+		delete $self->{idx_shards};
+		confess $@;
+	}
 	# SearchIdxShard may do their own flushing, so don't scale
 	# until after forking
 	$self->{batch_bytes} *= $self->{shards} if $self->{parallel};
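The _idx_init hunk follows a common Perl pattern: build state inside eval {}, and on failure delete the partially-built state before re-raising with confess so {idx_shards} is never left half-populated. A minimal standalone sketch of the same pattern, using a hypothetical new_shard() in place of PublicInbox::SearchIdxShard->new:

```perl
#!/usr/bin/perl
# Sketch of the eval-guard pattern from _idx_init: if any shard fails
# to open (e.g. "already locked"), discard the partial shard list so a
# later idx_shard() call hits the 'BUG: {idx_shards} unset' confess
# instead of silently indexing into an incomplete array.
use strict;
use warnings;
use Carp qw(confess);

my %self = (shards => 3);

sub new_shard { # hypothetical stand-in for SearchIdxShard->new
	my ($num) = @_;
	die "shard $num: already locked\n" if $num == 2; # simulated failure
	return { num => $num };
}

eval {
	my $idx = $self{idx_shards} = [];
	push @$idx, new_shard($_) for (0 .. $self{shards} - 1);
};
if ($@) {
	delete $self{idx_shards}; # don't leave a half-built shard list
	print 'cleaned up: ', (exists $self{idx_shards} ? 'no' : 'yes'), "\n";
	# confess $@;  # the real code re-raises here with a backtrace
}
```

The confess re-raise (commented out above so the sketch exits cleanly) is the debugging payoff: the caller still sees the original error, now with the full stack leading into shard creation.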