git.ipfire.org Git - thirdparty/public-inbox.git/log
thirdparty/public-inbox.git
8 years ago  repobrowse: shorten "repo_info" to "-repo"
Eric Wong [Thu, 16 Feb 2017 20:53:42 +0000 (20:53 +0000)] 
repobrowse: shorten "repo_info" to "-repo"

This makes it more consistent with how we use the Inbox
objects for the main code.

8 years ago  repo: only read description if git
Eric Wong [Thu, 16 Feb 2017 20:39:08 +0000 (20:39 +0000)] 
repo: only read description if git

Other VCSes have other means of providing the description.

8 years ago  repobrowse: switch to new URL format to avoid query strings
Eric Wong [Wed, 15 Feb 2017 22:35:18 +0000 (22:35 +0000)] 
repobrowse: switch to new URL format to avoid query strings

Query strings make endpoint caching more difficult since
they're order-independent.  They are also more likely lost
or truncated inadvertently when copy+pasting, so try to
avoid them for default endpoints.

There are still some things which are broken and followup
commits will be needed to fix them.
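
To illustrate the caching concern, here is a minimal sketch in Python (not the project's Perl). The URLs and the `cache_key` helper are hypothetical and not part of repobrowse. Two query strings that differ only in parameter order name the same resource yet compare unequal as raw strings, so a cache keyed on the raw URL would store both unless it normalizes first:

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

def cache_key(url):
    """Hypothetical helper: sort query parameters so their order no longer matters."""
    parts = urlsplit(url)
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    return parts.path + ("?" + query if query else "")

# Hypothetical URLs, for illustration only.
a = "/repo.git/log?h=master&ofs=50"
b = "/repo.git/log?ofs=50&h=master"

print(a == b)                        # False: the raw strings differ
print(cache_key(a) == cache_key(b))  # True: the normalized keys match
```

A path-only URL needs no such normalization, which is presumably part of the appeal of the new format.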

8 years ago  config: avoid circular loading dependency
Eric Wong [Wed, 15 Feb 2017 00:06:06 +0000 (00:06 +0000)] 
config: avoid circular loading dependency

We must lazily load one of them, so load Inbox later
since we need to parse the config, first.

8 years ago  repobrowse: do not unescape PATH_INFO twice
Eric Wong [Tue, 14 Feb 2017 23:19:34 +0000 (23:19 +0000)] 
repobrowse: do not unescape PATH_INFO twice

PSGI specs already require PATH_INFO to be unescaped.

Followup-to: commit 364de65f8a6b5729027cb70228312a141430122f
("www: do not unescape PATH_INFO twice")

8 years ago  Merge remote-tracking branch 'origin/master' into repobrowse
Eric Wong [Tue, 14 Feb 2017 22:56:37 +0000 (22:56 +0000)] 
Merge remote-tracking branch 'origin/master' into repobrowse

* origin/master:
  www: do not unescape PATH_INFO twice
  t/mime: quiet warnings for old versions of Email::Simple
  handle repeated References and In-Reply-To headers

8 years ago  searchidx: switch to accounting by message bytes
Eric Wong [Sun, 12 Feb 2017 09:04:54 +0000 (09:04 +0000)] 
searchidx: switch to accounting by message bytes

Xapian memory usage is tied to the size of the indexed
text, so take the raw message size into account when
deciding when to flush Xapian data.

More importantly, we now flush Xapian before we have it
buffer beyond our maximum; and we do it unconditionally
to prevent even high priority processes from OOM-ing.
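
The flush policy described above can be sketched roughly as follows, in Python rather than the project's Perl. `ByteBudgetIndexer`, `max_bytes`, and the recorded `flushes` list are hypothetical names for illustration and are not the PublicInbox::SearchIdx API. The point is only that the check happens before a message is buffered, so the buffer does not grow past the limit unless a single message is itself larger than the limit:

```python
class ByteBudgetIndexer:
    """Hypothetical stand-in for an indexer that flushes by buffered bytes."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.pending = []       # raw messages buffered since the last flush
        self.pending_bytes = 0  # total size of the buffered messages
        self.flushes = []       # size of each flushed batch, for inspection

    def flush(self):
        # A real indexer would commit to the search database here.
        if self.pending:
            self.flushes.append(self.pending_bytes)
        self.pending = []
        self.pending_bytes = 0

    def add(self, raw):
        size = len(raw)
        # Flush *before* buffering would push us past the limit.
        if self.pending and self.pending_bytes + size > self.max_bytes:
            self.flush()
        self.pending.append(raw)
        self.pending_bytes += size

idx = ByteBudgetIndexer(max_bytes=10)
for raw in (b"a" * 4, b"b" * 4, b"c" * 4, b"d" * 12):
    idx.add(raw)
idx.flush()
print(idx.flushes)  # [8, 4, 12]
```

Counting messages instead of bytes would have put the first three messages in one 12-byte batch, already over this sketch's limit.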

8 years ago  www: do not unescape PATH_INFO twice
Eric Wong [Tue, 14 Feb 2017 22:45:15 +0000 (22:45 +0000)] 
www: do not unescape PATH_INFO twice

PSGI specs already require PATH_INFO to be unescaped;
so our tests were wrong, too.
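
A small Python sketch of why the second unescape is harmful (the request path is a made-up example, and `urllib.parse.unquote` stands in for whatever URI-unescaping routine the Perl code used). Once the server has decoded the path, any literal "%" in it looks like a fresh escape sequence to a second pass:

```python
from urllib.parse import unquote

raw_request_path = "/inbox/100%2525/"   # hypothetical path as sent by the client
path_info = unquote(raw_request_path)   # decoded once, as the server already does

print(path_info)           # /inbox/100%25/
print(unquote(path_info))  # /inbox/100%/  (a second pass corrupts the path)
```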

8 years ago  t/mime: quiet warnings for old versions of Email::Simple
Eric Wong [Sun, 12 Feb 2017 02:41:22 +0000 (02:41 +0000)] 
t/mime: quiet warnings for old versions of Email::Simple

This is fixed in the newest versions of Email::Simple,
but not the version in Debian jessie (2.203)

8 years ago  handle repeated References and In-Reply-To headers
Eric Wong [Sat, 11 Feb 2017 23:54:48 +0000 (23:54 +0000)] 
handle repeated References and In-Reply-To headers

It seems possible for git-send-email(1) to generate
repeated instances of References and In-Reply-To headers,
as evidenced in:

https://public-inbox.org/git/20161111124541.8216-17-vascomalmeida@sapo.pt/raw

This causes a mismatch between how our search indexer threads
and how our HTML view handles threading.  In the future, View.pm
will use the smsg-parsed {references} field and avoid redoing
Email::MIME header parsing.

We will still need to figure out a way to deal with messages
with repeated Message-IDs, at some point, too.
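
As an illustration of the repeated-header problem, here is a sketch using Python's standard `email` package rather than the Email::MIME code in this project. The message and the `referenced_ids` helper are hypothetical. Asking for a header by name yields only its first instance, so a threader that does so misses IDs carried by later instances:

```python
import re
from email import message_from_string

# Hypothetical message with References given twice.
raw = (
    "Message-ID: <c@example>\n"
    "In-Reply-To: <b@example>\n"
    "References: <a@example>\n"
    "References: <b@example>\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
msg = message_from_string(raw)

def referenced_ids(msg):
    """Collect Message-IDs from every References and In-Reply-To instance,
    in order of appearance and without duplicates."""
    ids = []
    for name in ("References", "In-Reply-To"):
        for value in msg.get_all(name, []):
            for mid in re.findall(r"<([^<>\s]+)>", value):
                if mid not in ids:
                    ids.append(mid)
    return ids

print(msg["References"])    # <a@example>  (first instance only)
print(referenced_ids(msg))  # ['a@example', 'b@example']
```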

8 years ago  repo: lazily read description and cloneurl
Eric Wong [Sat, 11 Feb 2017 00:41:29 +0000 (00:41 +0000)] 
repo: lazily read description and cloneurl

This improves startup speed at the cost of CoW-friendliness
for long-lived daemons (which can be fixed, later).

8 years ago  config: move try_cat function from inbox
Eric Wong [Fri, 10 Feb 2017 21:27:11 +0000 (21:27 +0000)] 
config: move try_cat function from inbox

This allows RepoConfig to be independent of the
PublicInbox::Inbox class.

8 years ago  repo: add class for representing a code repo
Eric Wong [Fri, 10 Feb 2017 21:23:01 +0000 (21:23 +0000)] 
repo: add class for representing a code repo

This should hopefully allow us to organize our code better

8 years ago  repogit: add prototypes for error checking
Eric Wong [Fri, 10 Feb 2017 21:19:11 +0000 (21:19 +0000)] 
repogit: add prototypes for error checking

And add a note to remove git_commit_title

8 years ago  repo: search index flushes for excessive active refs
Eric Wong [Fri, 10 Feb 2017 03:30:40 +0000 (03:30 +0000)] 
repo: search index flushes for excessive active refs

For certain repos, having too many active refs will cause
memory usage problems.  Mitigate the Xapian problems, for
now, and consider a switch to GDBM_File or similar for
repos with more refs.

8 years ago  search: remove unnecessary abstractions and functionality
Eric Wong [Fri, 10 Feb 2017 01:51:05 +0000 (01:51 +0000)] 
search: remove unnecessary abstractions and functionality

This simplifies the code a bit and reduces the translation
overhead for looking directly at data from tools shipped
with Xapian.

While we're at it, fix thread-all.t :)

8 years ago  repo: search index no longer indexes for --contains
Eric Wong [Thu, 9 Feb 2017 23:50:25 +0000 (23:50 +0000)] 
repo: search index no longer indexes for --contains

It's extraordinarily expensive to add these terms for
each and every commit.

8 years ago  repo: increase search index flush granularity
Eric Wong [Thu, 9 Feb 2017 21:11:00 +0000 (21:11 +0000)] 
repo: increase search index flush granularity

We need to flush Xapian more frequently to account for
gigantic commits which introduce lots of text, so do
it when accounting for each line processed, and not
for each commit processed.

8 years ago  repobrowse: shorten internal names
Eric Wong [Thu, 9 Feb 2017 01:37:03 +0000 (01:37 +0000)] 
repobrowse: shorten internal names

We'll still be keeping "repobrowse" for the public API
for use with .psgi files, but shortening the name means
less typing and we may have command-line tools, too.

8 years ago  Merge remote-tracking branch 'origin/master' into repobrowse
Eric Wong [Thu, 9 Feb 2017 00:43:02 +0000 (00:43 +0000)] 
Merge remote-tracking branch 'origin/master' into repobrowse

* origin/master:
  config: do not slurp lines into memory
  TODO: several updates
  search: schema version bump for empty References/In-Reply-To
  Revert "searchidx: reindex clobbers old thread IDs"
  searchidx: reindex clobbers old thread IDs
  searchidx: deal with empty In-Reply-To and References headers
  searchview: increase limit for displaying search results
  searchview: clarify numeric summary at bottom
  add filter for Subject: tags
  watchmaildir: allow arguments for filters
  watchmaildir: limit live importer processes
  learn: implement "rm" only functionality
  mime: avoid SUPER usage in Email::MIME subclass
  inbox: reinstate periodic cleanup of Xapian and SQLite objects
  introduce PublicInbox::MIME wrapper class

8 years ago  repobrowse: avoid slurping lines
Eric Wong [Thu, 9 Feb 2017 00:26:52 +0000 (00:26 +0000)] 
repobrowse: avoid slurping lines

"foreach (<$fh>)" in Perl requests lines in list
context, so use "while" instead for lazy reading.

This follows ba4c50c20b95679580beba1ef290a4281d5285b7
in master ("config: do not slurp lines into memory")
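
The difference between slurping and lazy reading can be shown with a rough Python analogy. This is not Perl and not the project's code; `numbered_source` is a hypothetical line source that records how many lines were actually produced. Building a list first plays the role of "foreach (<$fh>)", while iterating directly plays the role of "while":

```python
def numbered_source(n, pulled):
    """Hypothetical line source that records each line it produces."""
    for i in range(n):
        pulled.append(i)
        yield f"line {i}\n"

# Slurping: every line is materialized before the loop body runs once.
pulled = []
for line in list(numbered_source(5, pulled)):
    break
slurp_pulled = len(pulled)

# Lazy: lines are produced one at a time, only as the loop asks for them.
pulled = []
for line in numbered_source(5, pulled):
    break
lazy_pulled = len(pulled)

print(slurp_pulled, lazy_pulled)  # 5 1
```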

8 years ago  config: do not slurp lines into memory
Eric Wong [Wed, 8 Feb 2017 21:41:38 +0000 (21:41 +0000)] 
config: do not slurp lines into memory

There's no need to hold everything in memory, here,
since apparently "foreach" will read everything at
once in list context.

(for some reason, I thought Perl5 was smart enough
 to avoid creating a temporary array, here...)

8 years ago  repobrowse: start wiring up git search
Eric Wong [Sat, 4 Feb 2017 02:20:35 +0000 (02:20 +0000)] 
repobrowse: start wiring up git search

Much more work on this will be needed, but at least explicit
flush points prevent OOMs on my system.

8 years ago  TODO: several updates
Eric Wong [Tue, 7 Feb 2017 22:27:52 +0000 (22:27 +0000)] 
TODO: several updates

Always plenty to do while working on this...

8 years ago  search: hoist out git directory search index helper
Eric Wong [Tue, 31 Jan 2017 22:55:58 +0000 (22:55 +0000)] 
search: hoist out git directory search index helper

We will be reusing this for indexing normal (code) repositories
using git and Xapian, too.

8 years ago  search: schema version bump for empty References/In-Reply-To
Eric Wong [Mon, 6 Feb 2017 21:39:45 +0000 (21:39 +0000)] 
search: schema version bump for empty References/In-Reply-To

We cannot distinguish between legitimate ghosts and mis-threaded
messages before commit 83425ef12e4b65cdcecd11ddcb38175d4a91d5a0
("searchidx: deal with empty In-Reply-To and References headers")
so we must rebuild the index in parallel to fix it.

8 years ago  Revert "searchidx: reindex clobbers old thread IDs"
Eric Wong [Mon, 6 Feb 2017 21:37:26 +0000 (21:37 +0000)] 
Revert "searchidx: reindex clobbers old thread IDs"

Oops, that's broken, too.  I guess the only way to reindex
after fixing the thread detection is to start from scratch.

This reverts commit 5d91adedf5f33ef1cb87df2a86306ddf370b4f8d.

8 years ago  searchidx: reindex clobbers old thread IDs
Eric Wong [Mon, 6 Feb 2017 21:08:13 +0000 (21:08 +0000)] 
searchidx: reindex clobbers old thread IDs

We cannot always reuse thread IDs since our threading
logic may change as bugs are fixed.

8 years ago  searchidx: deal with empty In-Reply-To and References headers
Eric Wong [Mon, 6 Feb 2017 19:54:25 +0000 (19:54 +0000)] 
searchidx: deal with empty In-Reply-To and References headers

In some messages, these headers exist, but have empty values.
Do not let empty values throw off our search indexer to tie
threads together, as it can make nonsensical threads grouped
to a Message-Id of "" (empty string).

See
<https://public-inbox.org/git/11340844841342-git-send-email-mailing-lists.git@rawuncut.elitemail.org/raw>
for an example of such a message.

Thanks-to: Johannes Schindelin <Johannes.Schindelin@gmx.de>
  <https://public-inbox.org/git/alpine.DEB.2.20.1702041206130.3496@virtualbox/>

8 years ago  searchview: increase limit for displaying search results
Eric Wong [Mon, 6 Feb 2017 02:38:37 +0000 (02:38 +0000)] 
searchview: increase limit for displaying search results

We are in no danger of excessive buffering or OOM-ing:
the main page for every inbox already loads 200 results,
and thread page views even load 1000!  Increase this to
200 for now.

8 years ago  searchview: clarify numeric summary at bottom
Eric Wong [Mon, 6 Feb 2017 02:07:24 +0000 (02:07 +0000)] 
searchview: clarify numeric summary at bottom

Xapian can only give estimated results when a result limit is
given to it, so make clear it is an estimate to avoid showing
nonsensical ranges when no results are returned.

8 years ago  repobrowse: git tag listing is now async
Eric Wong [Sat, 4 Feb 2017 02:21:06 +0000 (02:21 +0000)] 
repobrowse: git tag listing is now async

I'm unsure if this is even a good idea to support,
but we have it, for now.

8 years ago  repobrowse/git/atom: remove unused subroutine
Eric Wong [Thu, 26 Jan 2017 07:58:28 +0000 (07:58 +0000)] 
repobrowse/git/atom: remove unused subroutine

We never ended up using it.

8 years ago  repobrowse: simplify command generation for git commands
Eric Wong [Thu, 26 Jan 2017 04:27:02 +0000 (04:27 +0000)] 
repobrowse: simplify command generation for git commands

This shortens the code quite a bit at a negligible performance cost,
and the diffstat agrees.

8 years ago  add filter for Subject: tags
Eric Wong [Thu, 26 Jan 2017 02:09:36 +0000 (02:09 +0000)] 
add filter for Subject: tags

Some mailing lists add annoying tags into the Subject line which
discourages readers from doing proper mail organization on the
client side.  They also waste precious screen space and
attention span.

Remove them from our archives to reduce clutter.
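
A minimal sketch of what such a filter might do, in Python rather than the project's Perl. The `strip_subject_tag` function and the "[foo]" tag are hypothetical; the actual filter's interface and matching rules may differ:

```python
import re

def strip_subject_tag(subject, tag):
    """Remove every literal occurrence of `tag` (e.g. "[foo]") from a
    Subject line, then collapse the leftover whitespace."""
    cleaned = re.sub(re.escape(tag) + r"\s*", "", subject)
    return re.sub(r"\s+", " ", cleaned).strip()

print(strip_subject_tag("[foo] Re: [foo] weekly status", "[foo]"))
# Re: weekly status
```

The tag is matched literally, so other bracketed text in the Subject is left alone.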

8 years ago  watchmaildir: allow arguments for filters
Eric Wong [Wed, 25 Jan 2017 21:39:06 +0000 (21:39 +0000)] 
watchmaildir: allow arguments for filters

We'll want to allow some degree of configuration for
various mailing lists.

8 years ago  repobrowse: git summary view uses psgi_qx
Eric Wong [Sun, 22 Jan 2017 22:10:46 +0000 (22:10 +0000)] 
repobrowse: git summary view uses psgi_qx

This reduces one synchronous dependency from the hot path,
and psgi_return will be used in the future.

8 years ago  t/httpd-unix: better diagnostics and comments for test
Eric Wong [Sun, 22 Jan 2017 01:52:25 +0000 (01:52 +0000)] 
t/httpd-unix: better diagnostics and comments for test

I've hit random test failures on this, so attempt to improve
diagnostics and improve documentation for this test.

8 years ago  repobrowse: preserve newlines in Atom feed
Eric Wong [Sat, 21 Jan 2017 11:50:58 +0000 (11:50 +0000)] 
repobrowse: preserve newlines in Atom feed

Commit messages are assumed to be displayed in a terminal
with a fixed width font, so we must preserve newlines and
all whitespace as-is so ASCII art may be displayed properly.

8 years ago  repobrowse: simplify git log parsing implementation
Eric Wong [Sat, 21 Jan 2017 11:34:31 +0000 (11:34 +0000)] 
repobrowse: simplify git log parsing implementation

Based on what was done for the Atom feed, this will allow us to
simplify state management through metaprogramming and avoid
placeholder characters ('D' for decoration) for empty fields.

8 years ago  repobrowse: fix full URL generation in Atom feed
Eric Wong [Sat, 21 Jan 2017 04:41:06 +0000 (04:41 +0000)] 
repobrowse: fix full URL generation in Atom feed

We must not drop the leading slash in the URI.  This
regression was introduced when we dropped Plack::Request
dependency.

8 years ago  repobrowse: avoid extra hash assignments for Atom feed
Eric Wong [Sat, 21 Jan 2017 04:35:27 +0000 (04:35 +0000)] 
repobrowse: avoid extra hash assignments for Atom feed

This should make the code somewhat easier-to-follow.

8 years ago  repobrowse: git Atom feed uses Qspawn->psgi_return
Eric Wong [Sat, 21 Jan 2017 02:29:52 +0000 (02:29 +0000)] 
repobrowse: git Atom feed uses Qspawn->psgi_return

This allows us to wait on "git log" output in a non-blocking manner
while being able to throttle on backpressure from slow clients
when used with pi-httpd.

8 years ago  repobrowse: git Atom feed uses Qspawn->psgi_qx
Eric Wong [Sat, 21 Jan 2017 02:29:51 +0000 (02:29 +0000)] 
repobrowse: git Atom feed uses Qspawn->psgi_qx

This allows pi-httpd to service other I/O while we wait on "git
symbolic-ref" to run.  And psgi_return will be used in the next
commit...

8 years ago  qspawn: better annotate where $qx_cb is called
Eric Wong [Sat, 21 Jan 2017 02:29:50 +0000 (02:29 +0000)] 
qspawn: better annotate where $qx_cb is called

Hopefully this makes the code easier-to-follow for random
readers.  This requires a small amount of modification to
our one caller, but this is a new, unstable API (as is
nearly all of our code).

8 years ago  watchmaildir: limit live importer processes
Eric Wong [Wed, 18 Jan 2017 19:13:09 +0000 (19:13 +0000)] 
watchmaildir: limit live importer processes

We don't want to be triggering OOM or swapping on weaker
systems when we have dozens of inboxes as potential targets.

8 years ago  learn: implement "rm" only functionality
Eric Wong [Thu, 19 Jan 2017 00:31:30 +0000 (00:31 +0000)] 
learn: implement "rm" only functionality

Do not consider this interface stable, but I just needed a
way to remove mis-imported multipart messages so
public-inbox-watch could pick them up again from my Maildir.

8 years ago  mime: avoid SUPER usage in Email::MIME subclass
Eric Wong [Wed, 18 Jan 2017 23:50:57 +0000 (23:50 +0000)] 
mime: avoid SUPER usage in Email::MIME subclass

We must call Email::Simple methods directly in our monkey patch
for Email::MIME to call the intended method.  Using SUPER in our
subclass would instead hit a different, unintended method in
Email::MIME.

Reported-by: Junio C Hamano <gitster@pobox.com>
<xmqq4m0wb43w.fsf@gitster.mtv.corp.google.com>

8 years ago  repobrowse: expath is always defined
Eric Wong [Wed, 18 Jan 2017 08:17:50 +0000 (08:17 +0000)] 
repobrowse: expath is always defined

Remove an outdated comment while we're at it, too.

8 years ago  http: cast a wider net to prevent circular references
Eric Wong [Wed, 18 Jan 2017 07:35:35 +0000 (07:35 +0000)] 
http: cast a wider net to prevent circular references

We can more effectively nuke circular references by clearing
the entire PSGI $env, not just particular keys, when
there are self-referential fields such as "qspawn.response"
in our environment.

8 years ago  repobrowse: git snapshot waits for all commands asynchronously
Eric Wong [Wed, 18 Jan 2017 07:27:03 +0000 (07:27 +0000)] 
repobrowse: git snapshot waits for all commands asynchronously

This new asynchronous API, psgi_qx, will allow us to take
advantage of non-blocking I/O from even small commands;
as those may still need to wait for slow operations.

8 years ago  qspawn: better description
Eric Wong [Tue, 17 Jan 2017 19:38:36 +0000 (19:38 +0000)] 
qspawn: better description

We'll probably use this in a lot of places...

8 years ago  repobrowse: verbose git tree display uses qspawn for ls-tree
Eric Wong [Sun, 15 Jan 2017 03:11:14 +0000 (03:11 +0000)] 
repobrowse: verbose git tree display uses qspawn for ls-tree

For now, qspawn provides resource management for dealing with
expensive "git ls-tree" processes.

8 years ago  repobrowse: use qspawn for plain tree views
Eric Wong [Sun, 15 Jan 2017 02:26:39 +0000 (02:26 +0000)] 
repobrowse: use qspawn for plain tree views

We may eventually handle tree parsing ourselves (since we
already git cat-file), but for now we can rely on ls-tree
to give good output and qspawn to manage resource allocation.

8 years ago  repobrowse: git: drop unused diff parsing routines
Eric Wong [Wed, 11 Jan 2017 08:46:35 +0000 (08:46 +0000)] 
repobrowse: git: drop unused diff parsing routines

We don't need these legacy routines anymore and use the
newer stream-friendly _sed interface.

8 years ago  httpd/async: stop running command if client disconnects
Eric Wong [Fri, 13 Jan 2017 23:10:25 +0000 (23:10 +0000)] 
httpd/async: stop running command if client disconnects

If an HTTP client disconnects while we're piping the output of a
process to them, break the pipe of the process to reclaim
resources as soon as possible.

8 years ago  repobrowse: simplify conditional for cat-file input
Eric Wong [Fri, 13 Jan 2017 22:53:20 +0000 (22:53 +0000)] 
repobrowse: simplify conditional for cat-file input

expath is always defined, even to an empty string,
so simplify the conditional for checking it.

8 years ago  rename "GitAsyncRd" to "GitAsync"
Eric Wong [Fri, 13 Jan 2017 22:28:36 +0000 (22:28 +0000)] 
rename "GitAsyncRd" to "GitAsync"

This wrapper class actually does both reading and
writing, and a shorter name is nicer.

8 years ago  gitasyncrd: pass a reference to Danga::Socket::write
Eric Wong [Fri, 13 Jan 2017 22:24:45 +0000 (22:24 +0000)] 
gitasyncrd: pass a reference to Danga::Socket::write

D::S creates a reference for this, anyways, so avoid
the extra work by doing it ourselves.

8 years ago  repobrowse: comment describing Git wrapper creation
Eric Wong [Fri, 13 Jan 2017 22:05:10 +0000 (22:05 +0000)] 
repobrowse: comment describing Git wrapper creation

Metaprogramming can be difficult to read after several
months, so leave comments in place to describe the
results of common usage.

8 years ago  repobrowse: port git log view to qspawn streaming interface
Eric Wong [Fri, 13 Jan 2017 02:13:18 +0000 (02:13 +0000)] 
repobrowse: port git log view to qspawn streaming interface

This will prevent too many processes from being spawned at once
while also allowing us to respond to backpressure from slow
clients.

8 years ago  inbox: reinstate periodic cleanup of Xapian and SQLite objects
Eric Wong [Wed, 11 Jan 2017 10:13:00 +0000 (10:13 +0000)] 
inbox: reinstate periodic cleanup of Xapian and SQLite objects

We may need to do this even more aggressively, since the
Xapian database does not always give the latest results.
This time, we'll do it without relying on weak references,
and instead check refcounts.

8 years ago  repobrowse: make git diff output use qspawn
Eric Wong [Wed, 11 Jan 2017 04:12:29 +0000 (04:12 +0000)] 
repobrowse: make git diff output use qspawn

This is a potentially expensive operation, so we may want to
give it its own limiter channel.

8 years ago  diff: note the dangers of gigantic anchors hash
Eric Wong [Wed, 11 Jan 2017 04:12:28 +0000 (04:12 +0000)] 
diff: note the dangers of gigantic anchors hash

8 years ago  async: improve and fix out-of-date comments
Eric Wong [Wed, 11 Jan 2017 04:12:27 +0000 (04:12 +0000)] 
async: improve and fix out-of-date comments

8 years ago  repobrowse: qspawn + streaming for git commit display
Eric Wong [Wed, 11 Jan 2017 04:12:26 +0000 (04:12 +0000)] 
repobrowse: qspawn + streaming for git commit display

This prevents "git show" processes from monopolizing
the system and allows us to better handle backpressure
from gigantic commits.

8 years ago  qspawn: fix bad error reporting on errors
Eric Wong [Wed, 11 Jan 2017 04:12:25 +0000 (04:12 +0000)] 
qspawn: fix bad error reporting on errors

Oops :x

8 years ago  introduce PublicInbox::MIME wrapper class
Eric Wong [Tue, 10 Jan 2017 21:40:37 +0000 (21:40 +0000)] 
introduce PublicInbox::MIME wrapper class

This should fix problems with multipart messages where
text/plain parts lack a header.

cf. git clone --mirror https://github.com/rjbs/Email-MIME.git
    refs/pull/28/head

In the future, we may still introduce a streaming
interface to reduce memory usage on large emails.

8 years ago  githttpbackend: use psgi_return shortcut
Eric Wong [Sun, 8 Jan 2017 04:39:18 +0000 (04:39 +0000)] 
githttpbackend: use psgi_return shortcut

This drastically cuts down the amount of duplicate code
we have in this branch.

8 years ago  httpd/async: remove needless sysread wrapper
Eric Wong [Sun, 8 Jan 2017 04:31:30 +0000 (04:31 +0000)] 
httpd/async: remove needless sysread wrapper

We don't appear to be using it anywhere

8 years ago  Merge remote-tracking branch 'origin/master' into repobrowse
Eric Wong [Sun, 8 Jan 2017 04:25:51 +0000 (04:25 +0000)] 
Merge remote-tracking branch 'origin/master' into repobrowse

* origin/master:
  inbox: properly register cleanup timer for git processes
  search: remove subject_summary
  searchmsg: favor direct hash access over accessor methods
  remove incorrect comment about strftime + locales
  config: allow per-inbox nntpserver
  inbox: eliminate weaken usage entirely
  inbox: describe the full key name
  config: remove unused get() method
  config: always use namespaced "publicinboxlimiter"
  qspawn: prepare to support runtime reloading of Limiter
  http: remove weaken usage, reduce anonsub capture scope
  httpd/async: remove weaken usage
  http: fix spelling error
  watch: watchspam affects all configured inboxes
  doc: minor updates to design notes

8 years ago  initial git async work
Eric Wong [Sat, 31 Dec 2016 11:16:47 +0000 (11:16 +0000)] 
initial git async work

This will allow us to handle network operations while waiting
on "git cat-file" to seek and unpack things.

8 years ago  inbox: drop $ref arg for writing destination buffer
Eric Wong [Sat, 7 Jan 2017 22:56:03 +0000 (22:56 +0000)] 
inbox: drop $ref arg for writing destination buffer

We never used this feature, so let's drop it for now
since we can have fine-grained memory release with
reference counting, anyways.

8 years ago  inbox: properly register cleanup timer for git processes
Eric Wong [Sat, 7 Jan 2017 02:10:23 +0000 (02:10 +0000)] 
inbox: properly register cleanup timer for git processes

We still need to cleanup git processes occasionally, since
"git cat-file --batch" does not release old packs (and
git processes are fairly expensive).

For SQLite and Xapian file handles, they should be capable
of managing themselves without too much trouble, so let's
try keeping them for the lifetime of a process.

8 years ago  search: remove subject_summary
Eric Wong [Sat, 7 Jan 2017 01:44:52 +0000 (01:44 +0000)] 
search: remove subject_summary

Apparently it never actually got used, and the world seems
fine without it, so we can drop it.

While we're at it, consider removing our subject_path
usage from existence, too.  We are not using fancy subject-line
based URLs, here.

8 years ago  searchmsg: favor direct hash access over accessor methods
Eric Wong [Sat, 7 Jan 2017 01:44:51 +0000 (01:44 +0000)] 
searchmsg: favor direct hash access over accessor methods

This is faster, smaller, and more straightforward to me with
fewer layers of indirection.

8 years ago  remove incorrect comment about strftime + locales
Eric Wong [Sat, 7 Jan 2017 01:44:50 +0000 (01:44 +0000)] 
remove incorrect comment about strftime + locales

We only need strftime to be locale-independent when generating
dates for email and HTTP headers.  Purely numeric dates can
use strftime for ease-of-readability.

8 years ago  config: allow per-inbox nntpserver
Eric Wong [Sat, 7 Jan 2017 01:44:49 +0000 (01:44 +0000)] 
config: allow per-inbox nntpserver

This allows certain inboxes to override the global nntpserver
(perhaps under a different domain).

8 years ago  inbox: eliminate weaken usage entirely
Eric Wong [Sat, 7 Jan 2017 01:44:48 +0000 (01:44 +0000)] 
inbox: eliminate weaken usage entirely

We can do a better job initializing the data structure
so we no longer need to rely on weak references to cleanup
when we ditch the config on reload.

8 years ago  inbox: describe the full key name
Eric Wong [Sat, 7 Jan 2017 01:44:47 +0000 (01:44 +0000)] 
inbox: describe the full key name

Hopefully make this easier for future generations to understand.

8 years ago  config: remove unused get() method
Eric Wong [Sat, 7 Jan 2017 01:44:46 +0000 (01:44 +0000)] 
config: remove unused get() method

This seems like an unnecessary abstraction, or an abstraction
on the wrong level.

8 years ago  config: always use namespaced "publicinboxlimiter"
Eric Wong [Sat, 7 Jan 2017 01:44:45 +0000 (01:44 +0000)] 
config: always use namespaced "publicinboxlimiter"

I'm not sure if we'll ever support sharing a config file
with other tools, but maybe we will, and "limiter" is
too generic.

8 years ago  qspawn: prepare to support runtime reloading of Limiter
Eric Wong [Sat, 7 Jan 2017 01:44:44 +0000 (01:44 +0000)] 
qspawn: prepare to support runtime reloading of Limiter

We may allow the {max} value of a limiter to be changed
in the future, so let's start accounting for it before we
spawn followup processes.
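
The general idea of a limiter whose maximum can change at runtime might look like the following Python sketch. The `Limiter` class, its method names, and the job callbacks are all hypothetical and do not reflect the actual Qspawn or Limiter interfaces in this project. The limit is re-read each time a queued job is considered, so lowering it takes effect before any followup job is started:

```python
from collections import deque

class Limiter:
    """Hypothetical limiter: at most `max` jobs run at once, the rest queue."""

    def __init__(self, max_running):
        self.max = max_running
        self.running = 0
        self.queue = deque()

    def submit(self, start):
        """Start the job now if a slot is free, otherwise leave it queued."""
        self.queue.append(start)
        self._drain()

    def finished(self):
        """Call when a running job exits; queued jobs may start as a result."""
        self.running -= 1
        self._drain()

    def _drain(self):
        # self.max is consulted on every pass, so a changed limit is honored.
        while self.queue and self.running < self.max:
            self.running += 1
            self.queue.popleft()()

started = []
lim = Limiter(max_running=2)
for name in ("a", "b", "c", "d"):
    lim.submit(lambda name=name: started.append(name))
print(started)   # ['a', 'b']

lim.max = 1      # limit lowered at runtime
lim.finished()   # running drops to 1, which is not below the new limit
print(started)   # ['a', 'b']
lim.finished()   # running drops to 0, so one queued job starts
print(started)   # ['a', 'b', 'c']
```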

8 years ago  http: remove weaken usage, reduce anonsub capture scope
Eric Wong [Wed, 4 Jan 2017 11:20:51 +0000 (11:20 +0000)] 
http: remove weaken usage, reduce anonsub capture scope

Avoiding weaken here is no more dangerous than the existing
circular refs (e.g. psgix.io) we create and manage throughout
the lifetime of the connection.  So, trust ourselves to maintain
the data structure properly and avoid triggering extra memory
usage.

While we're at it, avoid having anonymous subroutines capture
more variables than necessary to simplify reference auditing.

8 years ago  httpd/async: remove weaken usage
Eric Wong [Wed, 4 Jan 2017 11:20:50 +0000 (11:20 +0000)] 
httpd/async: remove weaken usage

We do not need to use weaken() here, so avoid it to simplify our
interactions with Perl; as weaken requires additional storage
and (it seems) time complexity.

8 years ago  http: fix spelling error
Eric Wong [Wed, 4 Jan 2017 11:20:49 +0000 (11:20 +0000)] 
http: fix spelling error

Oops.  And we'll be fixing circular references from now on...

8 years ago  watch: watchspam affects all configured inboxes
Eric Wong [Mon, 2 Jan 2017 13:16:15 +0000 (13:16 +0000)] 
watch: watchspam affects all configured inboxes

If a message is spam in one mailbox, it is spam in all others a
particular user/group will care about.

8 years ago  repobrowse: avoid empty pathspecs for future git compatibility
Eric Wong [Mon, 26 Dec 2016 03:04:08 +0000 (03:04 +0000)] 
repobrowse: avoid empty pathspecs for future git compatibility

At the moment, we always set expath, so it will always be
defined.

8 years ago  doc: minor updates to design notes
Eric Wong [Mon, 26 Dec 2016 21:41:15 +0000 (21:41 +0000)] 
doc: minor updates to design notes

ssoma is not worth marketing, but perhaps our mirror of
the git mailing list archives is...

8 years ago  spawn: remove non-blocking support, here
Eric Wong [Mon, 26 Dec 2016 09:58:02 +0000 (09:58 +0000)] 
spawn: remove non-blocking support, here

It is never used, and inappropriate to support in generic code.

HTTPD::Async already sets non-blocking, and it's better to do it
in -httpd-specific code since we know our -httpd can handle it.

8 years ago  repobrowse: port git snapshot over to qspawn
Eric Wong [Mon, 26 Dec 2016 09:44:44 +0000 (09:44 +0000)] 
repobrowse: port git snapshot over to qspawn

This is expensive, so we will utilize the qspawn system
to prevent excessive overhead.

8 years ago  repobrowse: port patch generation over to qspawn
Eric Wong [Sun, 25 Dec 2016 08:53:19 +0000 (08:53 +0000)] 
repobrowse: port patch generation over to qspawn

And start generalizing the qspawn usage code for PSGI
with psgi_return.

8 years ago  git: lazy initialization of error output
Eric Wong [Sun, 25 Dec 2016 08:52:41 +0000 (08:52 +0000)] 
git: lazy initialization of error output

We may not keep this feature after all, but for now we'll hold
off on creating it to cheapen instantiation.

8 years ago  Merge remote-tracking branch 'origin/master' into repobrowse
Eric Wong [Mon, 26 Dec 2016 05:25:36 +0000 (05:25 +0000)] 
Merge remote-tracking branch 'origin/master' into repobrowse

* origin/master: (25 commits)
  evcleanup: ensure deferred close from timers are handled ASAP
  httpd/async: improve variable naming
  githttpbackend: minor cleanups to improve readability
  githttpbackend: simplify compatibility code
  githttpbackend: minor readability improvement
  http: fix clobbering of $null_io
  linkify: modify argument in place
  view: do not modify array during iteration
  view: stop chomping off whitespace at ends of messages
  view: remove unused parameter
  search: lookup_mail handles modified DBs
  doc: various comments on async handling
  searchthread: simplify API and remove needless OO
  searchthread: update comment about loop prevention
  searchmsg: remove ensure_metadata
  tests: add thread-all testing for benchmarking
  searchmsg: do not memoize {date} field
  searchmsg: remove locale-dependency for ->date
  t/config.t: fix feedmax default
  wwwtext: link to RFC4685 (Atom Threading)
  ...

8 years ago  evcleanup: ensure deferred close from timers are handled ASAP
Eric Wong [Mon, 26 Dec 2016 03:05:15 +0000 (03:05 +0000)] 
evcleanup: ensure deferred close from timers are handled ASAP

Danga::Socket defers close() syscalls until the end of the event
loop to avoid FD recycling.  Unfortunately, this is dependent on
IO events firing and waking the process up from
poll/kevent/epoll_wait.

Without any I/O activity, a socket could remain in the
@Danga::Socket::ToClose array indefinitely.  Thus, we will
trigger a fake IO event after running all timers to trigger
the deferred close in Danga::Socket::PostEventLoop.

8 years ago  t/repobrowse_git_httpd: remove XS parser dependency
Eric Wong [Mon, 26 Dec 2016 02:15:29 +0000 (02:15 +0000)] 
t/repobrowse_git_httpd: remove XS parser dependency

Relying on the XS parser has been optional since March 2016:
commit 7dd78012da81d48e5e73e56c3255895dfa9de1f5
("http: use Plack::HTTPParser for HTTP parsing")

8 years ago  httpd/async: improve variable naming
Eric Wong [Sun, 25 Dec 2016 08:09:48 +0000 (08:09 +0000)] 
httpd/async: improve variable naming

We only refer to PublicInbox::HTTP objects here, so '$io'
was a bad name.

8 years ago  githttpbackend: minor cleanups to improve readability
Eric Wong [Sun, 25 Dec 2016 07:33:02 +0000 (07:33 +0000)] 
githttpbackend: minor cleanups to improve readability

Fewer returns improves readability and the diffstat agrees.

8 years ago  githttpbackend: simplify compatibility code
Eric Wong [Sun, 25 Dec 2016 06:52:03 +0000 (06:52 +0000)] 
githttpbackend: simplify compatibility code

Fewer conditionals means there are fewer code paths to test
and makes things easier-to-read.

8 years ago  githttpbackend: minor readability improvement
Eric Wong [Sun, 25 Dec 2016 06:39:13 +0000 (06:39 +0000)] 
githttpbackend: minor readability improvement

Use a more meaningful variable name for the Qspawn
object, since this module is the reference for its
use.