git.ipfire.org Git - thirdparty/public-inbox.git/log
thirdparty/public-inbox.git
8 years agosearch: allow searching within mail diffs repobrowse
Eric Wong [Wed, 12 Apr 2017 20:17:47 +0000 (20:17 +0000)] 
search: allow searching within mail diffs

This can be tied into a repository browser to browse
in-flight topics on a mailing list.

8 years agoMerge remote-tracking branch 'origin/master' into repobrowse
Eric Wong [Wed, 12 Apr 2017 21:10:05 +0000 (21:10 +0000)] 
Merge remote-tracking branch 'origin/master' into repobrowse

* origin/master:
  search: fix help message for searching within quotes
  learn: scan all inboxes when learning spam
  watchmaildir: do not reject lowercase flags on Maildir files
  searchview: show full (&x=t) messages in ascending chronological order
  searchview: add "t" id to link to thread overview
  extmsg: use updated mail-archive.com URL
  view: escape HTML description name

8 years agosearch: fix help message for searching within quotes
Eric Wong [Tue, 11 Apr 2017 23:39:54 +0000 (23:39 +0000)] 
search: fix help message for searching within quotes

I'm not sure if people use either and it's not in mairix
(where we base our abbreviations off of).  Let's go
with the shorter prefix since it's easier-to-type.

8 years agolearn: scan all inboxes when learning spam
Eric Wong [Wed, 5 Apr 2017 01:41:28 +0000 (01:41 +0000)] 
learn: scan all inboxes when learning spam

This matches the behavior of the -watch daemon since
6d534038285ddd760709ba76ea007f9108200097
("watch: watchspam affects all configured inboxes")

8 years agowatchmaildir: do not reject lowercase flags on Maildir files
Eric Wong [Tue, 4 Apr 2017 18:25:47 +0000 (18:25 +0000)] 
watchmaildir: do not reject lowercase flags on Maildir files

Dovecot uses 'a'..'z' (lowercase) to designate keywords
in Maildir flags.  This was preventing certain messages
from being marked as spam.

https://wiki2.dovecot.org/MailboxFormat/Maildir
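
To illustrate the failure mode described above, here is a minimal Python sketch (public-inbox itself is Perl, and neither regex below is taken from its source). It assumes flags follow the ":2," info marker at the end of a Maildir filename; the names STRICT_FLAGS, PERMISSIVE_FLAGS and maildir_flags are hypothetical.

```python
import re

# Assumption: flags follow the ":2," info marker at the end of the filename.
STRICT_FLAGS = re.compile(r":2,([A-Z]*)$")         # rejects Dovecot keywords
PERMISSIVE_FLAGS = re.compile(r":2,([A-Za-z]*)$")  # accepts 'a'..'z' as well


def maildir_flags(filename, pattern=PERMISSIVE_FLAGS):
    """Return the flag characters of a Maildir filename, or None if unparseable."""
    match = pattern.search(filename)
    return match.group(1) if match else None
```

With the strict pattern, a file whose flags include a lowercase keyword is not recognized at all, so a watcher built on it would silently skip that message.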

8 years agosearchview: show full (&x=t) messages in ascending chronological order
Eric Wong [Fri, 24 Mar 2017 01:41:11 +0000 (01:41 +0000)] 
searchview: show full (&x=t) messages in ascending chronological order

When displaying search results with full messages, it makes
more sense to show them in ascending chronological order when
going by date.  Reverse chronological order makes more sense
for search results which only show the subject.

8 years agosearchview: add "t" id to link to thread overview
Eric Wong [Fri, 24 Mar 2017 00:15:08 +0000 (00:15 +0000)] 
searchview: add "t" id to link to thread overview

At least for the thread view (&x=t); this will make it
easy to link to the overview.

8 years agoextmsg: use updated mail-archive.com URL
Eric Wong [Wed, 22 Mar 2017 02:14:19 +0000 (02:14 +0000)] 
extmsg: use updated mail-archive.com URL

Apparently mid.mail-archive.com does not support HTTPS,
and the HTTP version redirects to the search query, anyways.

8 years agoview: escape HTML description name
Eric Wong [Tue, 14 Mar 2017 21:23:39 +0000 (21:23 +0000)] 
view: escape HTML description name

Otherwise funky filenames can cause HTML injection
vulnerabilities (hope you have JavaScript disabled!)
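
The class of bug fixed here can be sketched in Python with the standard library's html.escape (the actual fix is in Perl; render_description is a hypothetical helper, not a public-inbox function):

```python
from html import escape


def render_description(name):
    """Wrap an untrusted name in markup, escaping HTML metacharacters first."""
    return "<title>" + escape(name, quote=True) + "</title>"
```

Any name that reaches the page without passing through an escaping step like this one can carry markup of its own.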

8 years agorepobrowse: explicit EOF handling for git async callback
Eric Wong [Sat, 4 Mar 2017 03:52:29 +0000 (03:52 +0000)] 
repobrowse: explicit EOF handling for git async callback

We need to ensure we've fully-drained the pipe before
signalling EOF to the callback, since pipelining may
not be the best choice with detachable processes
in the future.

8 years agorepobrowse: stop abbreviating object names
Eric Wong [Sat, 4 Mar 2017 02:15:29 +0000 (02:15 +0000)] 
repobrowse: stop abbreviating object names

Ending up with potentially ambiguous identifiers in the
future is not worth saving some bytes, in this case.

8 years agorepobrowse: fixup format-patch display
Eric Wong [Sat, 4 Mar 2017 00:32:45 +0000 (00:32 +0000)] 
repobrowse: fixup format-patch display

We need to take the revision into account when generating
patches :P  While we're at it, disambiguate URLs by resolving
refnames to (un-SHAttered) hex identifiers.

8 years agorepobrowse: raw: show the resulting tree for commits and tags
Eric Wong [Fri, 3 Mar 2017 22:31:28 +0000 (22:31 +0000)] 
repobrowse: raw: show the resulting tree for commits and tags

Seeing the raw tag or commit is not very useful, but people
tend to treat them as trees.  This behavior is also shared
by the "plain" endpoint in cgit.

8 years agorepobrowse: src: show a nicer message for big files
Eric Wong [Fri, 3 Mar 2017 22:28:43 +0000 (22:28 +0000)] 
repobrowse: src: show a nicer message for big files

It should be unlikely for a code repository to need
any source files over 64K; and we can't display binaries
in a meaningful way in HTML, anyways.

8 years agorepobrowse: src/ endpoint requires a tip to be specified
Eric Wong [Fri, 3 Mar 2017 22:07:19 +0000 (22:07 +0000)] 
repobrowse: src/ endpoint requires a tip to be specified

Implying a tip would make for ambiguous URLs and ruin
caching, so try to get everybody to hit the same URL.
This also simplifies some of our other code since
the tip is always in the request.

8 years agorepobrowse: raw display avoids forking for small files
Eric Wong [Fri, 3 Mar 2017 21:12:02 +0000 (21:12 +0000)] 
repobrowse: raw display avoids forking for small files

This is more efficient for the majority of source files
which fit into a stock 64K Linux pipe buffer used by
our interaction with git-cat-file.

8 years agorepobrowse: avoid excessive buffering in raw endpoint
Eric Wong [Fri, 3 Mar 2017 04:09:34 +0000 (04:09 +0000)] 
repobrowse: avoid excessive buffering in raw endpoint

Relying on qspawn allows us to serve arbitrarily large
files without excessive buffering.  We'll special-case
small files in the future to avoid qspawn, as those
small files should fit comfortably in socket buffers.

8 years agorepobrowse: remove unused "blob" endpoint
Eric Wong [Fri, 3 Mar 2017 01:14:07 +0000 (01:14 +0000)] 
repobrowse: remove unused "blob" endpoint

This is redundant with the "raw" endpoint.

8 years agorepobrowse: consistently set text charset
Eric Wong [Fri, 3 Mar 2017 00:55:07 +0000 (00:55 +0000)] 
repobrowse: consistently set text charset

For everything with relevant content, we'll try to set
UTF-8 charset and reduce duplication when generating
response headers.

8 years agorepobrowse: rename "tree" endpoint to "src"
Eric Wong [Thu, 2 Mar 2017 23:39:49 +0000 (23:39 +0000)] 
repobrowse: rename "tree" endpoint to "src"

This is shorter, and makes more sense as the endpoint
displays both tree listings and actual blob sources.
This will also make rewriting existing URLs from cgit
installations easier.

8 years agorepobrowse: rework source view to use async cat-file API
Eric Wong [Thu, 2 Mar 2017 03:36:24 +0000 (03:36 +0000)] 
repobrowse: rework source view to use async cat-file API

This will allow most source files to be displayed without
blocking public-inbox-httpd on slow disk access.  However, we no
longer support displaying source files larger than 65536 bytes
(the size of a pipe on current Linux).

8 years agorepobrowse: update documentation and variable naming
Eric Wong [Fri, 24 Feb 2017 02:52:23 +0000 (02:52 +0000)] 
repobrowse: update documentation and variable naming

Another change from abandoning the cgit URL format.

8 years agorepobrowse: update documentation for git patch generation
Eric Wong [Fri, 24 Feb 2017 02:41:17 +0000 (02:41 +0000)] 
repobrowse: update documentation for git patch generation

We abandoned cgit-compatible URLs, so update the documentation to match.

8 years agorepobrowse: git tree view checks object asynchronously
Eric Wong [Fri, 24 Feb 2017 02:02:38 +0000 (02:02 +0000)] 
repobrowse: git tree view checks object asynchronously

... when inside public-inbox-httpd.  This will allow
the server to handle other requests/responses while
waiting on "git cat-file --batch-check"

8 years agogit: move async detection to runtime
Eric Wong [Fri, 24 Feb 2017 00:47:45 +0000 (00:47 +0000)] 
git: move async detection to runtime

We don't actually know what context we'll be called under,
so detecting the mere use-ability of Danga::Socket is not
sufficient.

8 years agorepobrowse: eliminate unused query parameters
Eric Wong [Wed, 22 Feb 2017 03:10:43 +0000 (03:10 +0000)] 
repobrowse: eliminate unused query parameters

We will try to reduce the number of query parameters as
much as possible to make URLs more amenable to caching
at various levels.

8 years agorepobrowse: fixup revision handling
Eric Wong [Wed, 22 Feb 2017 03:01:24 +0000 (03:01 +0000)] 
repobrowse: fixup revision handling

Revisions passed in the URL must not be ignored.
This fixes some bugs introduced in commit
f6244586ba4f5a5e7575e1254be8c9bbe303fce9
("repobrowse: switch to new URL format to avoid query strings")

8 years agorepobrowse: stop abbreviating commit hashes
Eric Wong [Tue, 21 Feb 2017 23:05:46 +0000 (23:05 +0000)] 
repobrowse: stop abbreviating commit hashes

Abbreviations can become ambiguous over time, and it seems other
tools are fine with displaying unabbreviated hashes for commits.
This should reduce workload for the search engines, too.

8 years agorepobrowse: unconditionally remove trailing slash handling
Eric Wong [Sun, 19 Feb 2017 19:01:39 +0000 (19:01 +0000)] 
repobrowse: unconditionally remove trailing slash handling

We do not need specialized trailing slashes if we break URL
compatibility from cgit, here.  Removing trailing (and redundant)
slashes improves our cache hit rates across both server-side
(varnish, squid) and client-side (browser) layers.
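
A rough Python sketch of the normalization idea, assuming the goal is one canonical path per resource (normalize_path is a hypothetical name; the real routing code is Perl and may differ):

```python
import re


def normalize_path(path):
    """Collapse repeated slashes and drop a trailing slash, keeping the root as "/"."""
    path = re.sub(r"/{2,}", "/", path)
    if len(path) > 1 and path.endswith("/"):
        path = path[:-1]
    return path
```

Every spelling of the same path then maps to a single cache entry in the proxy and in the browser.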

8 years agorepobrowse: return git errors as text/plain, for now
Eric Wong [Sun, 19 Feb 2017 03:44:27 +0000 (03:44 +0000)] 
repobrowse: return git errors as text/plain, for now

For now, this avoids an HTML injection vector.  We'll try to
have more consistent error reporting in the future.

8 years agorepobrowse: minor style cleanups
Eric Wong [Fri, 17 Feb 2017 23:40:37 +0000 (23:40 +0000)] 
repobrowse: minor style cleanups

Avoid using '=>' arrow notation for arrays and array references;
it is confusing and more verbose.  Additionally, combine
"use constant" statements when possible.

8 years agorepobrowse: remove unnecessary import
Eric Wong [Fri, 17 Feb 2017 23:08:39 +0000 (23:08 +0000)] 
repobrowse: remove unnecessary import

We do not need to escape URIs in this file.

8 years agorepobrowse: rename "plain" endpoint to "raw"
Eric Wong [Fri, 17 Feb 2017 03:31:16 +0000 (03:31 +0000)] 
repobrowse: rename "plain" endpoint to "raw"

This name is shorter and matches terminology in gitweb and
other popular git web viewers.

8 years agorepobrowse: memoize git symbolic-ref resolution
Eric Wong [Thu, 16 Feb 2017 23:26:01 +0000 (23:26 +0000)] 
repobrowse: memoize git symbolic-ref resolution

The "HEAD" symbolic ref is rarely changed, so
memoize it for now and avoid exposing it in URLs.
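
Memoization of this kind can be sketched in Python as follows. HeadCache and its resolve callable are hypothetical; in practice resolve might shell out to "git symbolic-ref HEAD", but here it is injected so the sketch stays self-contained.

```python
class HeadCache:
    """Remember the resolved HEAD ref per repository directory."""

    def __init__(self, resolve):
        self._resolve = resolve  # callable: git_dir -> ref name (assumed)
        self._cache = {}

    def head(self, git_dir):
        # Only call the (potentially slow) resolver on the first lookup.
        if git_dir not in self._cache:
            self._cache[git_dir] = self._resolve(git_dir)
        return self._cache[git_dir]
```

The trade-off is staleness: if HEAD is repointed, the cached value is wrong until the process restarts or the cache is cleared.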

8 years agorepobrowse: shorten "repo_info" to "-repo"
Eric Wong [Thu, 16 Feb 2017 20:53:42 +0000 (20:53 +0000)] 
repobrowse: shorten "repo_info" to "-repo"

This makes it more consistent with how we use the Inbox
objects for the main code.

8 years agorepo: only read description if git
Eric Wong [Thu, 16 Feb 2017 20:39:08 +0000 (20:39 +0000)] 
repo: only read description if git

Other VCSes have other means of providing the description.

8 years agorepobrowse: switch to new URL format to avoid query strings
Eric Wong [Wed, 15 Feb 2017 22:35:18 +0000 (22:35 +0000)] 
repobrowse: switch to new URL format to avoid query strings

Query strings make endpoint caching more difficult since
they're order-independent.  They are also more likely lost
or truncated inadvertently when copy+pasting, so try to
avoid them for default endpoints.
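
The caching problem can be shown with a small Python sketch: two URLs that differ only in parameter order name the same resource but are distinct cache keys unless something canonicalizes them. cache_key is a hypothetical helper, and the example URLs are made up.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_key(url):
    """Return the URL with its query parameters sorted and any fragment dropped."""
    parts = urlsplit(url)
    pairs = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(pairs), ""))
```

Path-based URLs have a single spelling, so no such step is needed.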

There are still some things which are broken, and followup
commits will be needed to fix them.

8 years agoconfig: avoid circular loading dependency
Eric Wong [Wed, 15 Feb 2017 00:06:06 +0000 (00:06 +0000)] 
config: avoid circular loading dependency

We must lazily load one of them, so load Inbox later
since we need to parse the config, first.

8 years agorepobrowse: do not unescape PATH_INFO twice
Eric Wong [Tue, 14 Feb 2017 23:19:34 +0000 (23:19 +0000)] 
repobrowse: do not unescape PATH_INFO twice

PSGI specs already require PATH_INFO to be unescaped.

Followup-to: commit 364de65f8a6b5729027cb70228312a141430122f
("www: do not unescape PATH_INFO twice")

8 years agoMerge remote-tracking branch 'origin/master' into repobrowse
Eric Wong [Tue, 14 Feb 2017 22:56:37 +0000 (22:56 +0000)] 
Merge remote-tracking branch 'origin/master' into repobrowse

* origin/master:
  www: do not unescape PATH_INFO twice
  t/mime: quiet warnings for old versions of Email::Simple
  handle repeated References and In-Reply-To headers

8 years agosearchidx: switch to accounting by message bytes
Eric Wong [Sun, 12 Feb 2017 09:04:54 +0000 (09:04 +0000)] 
searchidx: switch to accounting by message bytes

Xapian memory usage is tied to the size of the indexed
text, so take the raw message size into account when
deciding when to flush Xapian data.

More importantly, we now flush Xapian before we have it
buffer beyond our maximum; and we do it unconditionally
to prevent even high priority processes from OOM-ing.
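
A minimal Python sketch of byte-based flush accounting, assuming a flush callback and a byte budget (FlushAccounting, index and flush are hypothetical names; the real indexer is Perl on top of Xapian):

```python
class FlushAccounting:
    """Flush before the pending byte count would exceed max_bytes."""

    def __init__(self, max_bytes, flush):
        self.max_bytes = max_bytes
        self.flush = flush      # callable that commits buffered data (assumed)
        self.pending = 0

    def add(self, raw_message, index):
        size = len(raw_message)
        # Flush first, so the buffer never grows past the maximum.
        if self.pending and self.pending + size > self.max_bytes:
            self.flush()
            self.pending = 0
        index(raw_message)
        self.pending += size
```

A single message larger than the budget is still indexed; it simply starts from an empty buffer.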

8 years agowww: do not unescape PATH_INFO twice
Eric Wong [Tue, 14 Feb 2017 22:45:15 +0000 (22:45 +0000)] 
www: do not unescape PATH_INFO twice

PSGI specs already require PATH_INFO to be unescaped;
so our tests were wrong, too.

8 years agot/mime: quiet warnings for old versions of Email::Simple
Eric Wong [Sun, 12 Feb 2017 02:41:22 +0000 (02:41 +0000)] 
t/mime: quiet warnings for old versions of Email::Simple

This is fixed in the newest versions of Email::Simple,
but not the version in Debian jessie (2.203)

8 years agohandle repeated References and In-Reply-To headers
Eric Wong [Sat, 11 Feb 2017 23:54:48 +0000 (23:54 +0000)] 
handle repeated References and In-Reply-To headers

It seems possible for git-send-email(1) to generate
repeated instances of References and In-Reply-To headers,
as evidenced in:

https://public-inbox.org/git/20161111124541.8216-17-vascomalmeida@sapo.pt/raw

This causes a mismatch between how our search indexer threads
and how our HTML view handles threading.  In the future, View.pm
will use the smsg-parsed {references} field and avoid redoing
Email::MIME header parsing.

We will still need to figure out a way to deal with messages
with repeated Message-IDs, at some point, too.
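
One way to handle repeated headers, sketched in Python with the standard email package: collect every instance of both headers and keep each Message-ID once, in first-seen order. The function name and regex are assumptions, not the parsing public-inbox uses.

```python
import re
from email import message_from_string

_MID_RE = re.compile(r"<([^<>\s]+)>")


def references(msg):
    """Return the unique Message-IDs from all References and In-Reply-To headers."""
    seen, ordered = set(), []
    for name in ("References", "In-Reply-To"):
        for value in msg.get_all(name, []):
            for mid in _MID_RE.findall(value):
                if mid not in seen:
                    seen.add(mid)
                    ordered.append(mid)
    return ordered
```

If the indexer and the HTML view both consume one parsed list like this, they cannot disagree about a message's parents.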

8 years agorepo: lazily read description and cloneurl
Eric Wong [Sat, 11 Feb 2017 00:41:29 +0000 (00:41 +0000)] 
repo: lazily read description and cloneurl

This improves startup speed at the cost of CoW-friendliness
for long-lived daemons (which can be fixed, later).

8 years agoconfig: move try_cat function from inbox
Eric Wong [Fri, 10 Feb 2017 21:27:11 +0000 (21:27 +0000)] 
config: move try_cat function from inbox

This allows RepoConfig to be independent of the
PublicInbox::Inbox class.

8 years agorepo: add class for representing a code repo
Eric Wong [Fri, 10 Feb 2017 21:23:01 +0000 (21:23 +0000)] 
repo: add class for representing a code repo

This should hopefully allow us to organize our code better

8 years agorepogit: add prototypes for error checking
Eric Wong [Fri, 10 Feb 2017 21:19:11 +0000 (21:19 +0000)] 
repogit: add prototypes for error checking

And add a note to remove git_commit_title

8 years agorepo: search index flushes for excessive active refs
Eric Wong [Fri, 10 Feb 2017 03:30:40 +0000 (03:30 +0000)] 
repo: search index flushes for excessive active refs

For certain repos, having too many active refs will cause
memory usage problems.  Mitigate the Xapian problems, for
now, and consider a switch to GDBM_File or similar for
repos with more refs.

8 years agosearch: remove unnecessary abstractions and functionality
Eric Wong [Fri, 10 Feb 2017 01:51:05 +0000 (01:51 +0000)] 
search: remove unnecessary abstractions and functionality

This simplifies the code a bit and reduces the translation
overhead for looking directly at data from tools shipped
with Xapian.

While we're at it, fix thread-all.t :)

8 years agorepo: search index no longer indexes for --contains
Eric Wong [Thu, 9 Feb 2017 23:50:25 +0000 (23:50 +0000)] 
repo: search index no longer indexes for --contains

It's extraordinarily expensive to add these terms for
each and every commit.

8 years agorepo: increase search index flush granularity
Eric Wong [Thu, 9 Feb 2017 21:11:00 +0000 (21:11 +0000)] 
repo: increase search index flush granularity

We need to flush Xapian more frequently to account for
gigantic commits which introduce lots of text, so do
it when accounting for each line processed, and not
for each commit processed.

8 years agorepobrowse: shorten internal names
Eric Wong [Thu, 9 Feb 2017 01:37:03 +0000 (01:37 +0000)] 
repobrowse: shorten internal names

We'll still be keeping "repobrowse" for the public API
for use with .psgi files, but shortening the name means
less typing and we may have command-line tools, too.

8 years agoMerge remote-tracking branch 'origin/master' into repobrowse
Eric Wong [Thu, 9 Feb 2017 00:43:02 +0000 (00:43 +0000)] 
Merge remote-tracking branch 'origin/master' into repobrowse

* origin/master:
  config: do not slurp lines into memory
  TODO: several updates
  search: schema version bump for empty References/In-Reply-To
  Revert "searchidx: reindex clobbers old thread IDs"
  searchidx: reindex clobbers old thread IDs
  searchidx: deal with empty In-Reply-To and References headers
  searchview: increase limit for displaying search results
  searchview: clarify numeric summary at bottom
  add filter for Subject: tags
  watchmaildir: allow arguments for filters
  watchmaildir: limit live importer processes
  learn: implement "rm" only functionality
  mime: avoid SUPER usage in Email::MIME subclass
  inbox: reinstate periodic cleanup of Xapian and SQLite objects
  introduce PublicInbox::MIME wrapper class

8 years agorepobrowse: avoid slurping lines
Eric Wong [Thu, 9 Feb 2017 00:26:52 +0000 (00:26 +0000)] 
repobrowse: avoid slurping lines

"foreach (<$fh>)" in Perl requests lines in array
context, so use "while" instead for lazy reading.

This follows ba4c50c20b95679580beba1ef290a4281d5285b7
in master ("config: do not slurp lines into memory")

8 years agoconfig: do not slurp lines into memory
Eric Wong [Wed, 8 Feb 2017 21:41:38 +0000 (21:41 +0000)] 
config: do not slurp lines into memory

There's no need to hold everything in memory, here,
since apparently "foreach" will read everything at
once in array context

(for some reason, I thought Perl5 was smart enough
 to avoid creating a temporary array, here...)
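
The same distinction exists in Python and can be sketched there: materializing all lines up front (list(...), like Perl's "foreach" over "<$fh>") versus pulling them one at a time (like "while"). The helper names are hypothetical, and a counting generator stands in for a file handle.

```python
def numbered_lines(count, consumed):
    """Yield `count` lines, recording each line index as it is produced."""
    for i in range(count):
        consumed.append(i)
        yield f"line {i}\n"


def first_match_slurp(lines, needle):
    # list() reads every line into memory before the loop starts.
    for line in list(lines):
        if needle in line:
            return line
    return None


def first_match_lazy(lines, needle):
    # Iterating directly reads only as far as the loop needs.
    for line in lines:
        if needle in line:
            return line
    return None
```

With 1000 lines and a match on the second one, the slurping version produces all 1000 lines while the lazy version produces two.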

8 years agorepobrowse: start wiring up git search
Eric Wong [Sat, 4 Feb 2017 02:20:35 +0000 (02:20 +0000)] 
repobrowse: start wiring up git search

Much more work on this will be needed, but at least explicit
flush points prevent OOMs on my system.

8 years agoTODO: several updates
Eric Wong [Tue, 7 Feb 2017 22:27:52 +0000 (22:27 +0000)] 
TODO: several updates

Always plenty to do while working on this...

8 years agosearch: hoist out git directory search index helper
Eric Wong [Tue, 31 Jan 2017 22:55:58 +0000 (22:55 +0000)] 
search: hoist out git directory search index helper

We will be reusing this for indexing normal (code) repositories
using git and Xapian, too.

8 years agosearch: schema version bump for empty References/In-Reply-To
Eric Wong [Mon, 6 Feb 2017 21:39:45 +0000 (21:39 +0000)] 
search: schema version bump for empty References/In-Reply-To

We cannot distinguish between legitimate ghosts and mis-threaded
messages before commit 83425ef12e4b65cdcecd11ddcb38175d4a91d5a0
("searchidx: deal with empty In-Reply-To and References headers")
so we must rebuild the index in parallel to fix it.

8 years agoRevert "searchidx: reindex clobbers old thread IDs"
Eric Wong [Mon, 6 Feb 2017 21:37:26 +0000 (21:37 +0000)] 
Revert "searchidx: reindex clobbers old thread IDs"

Oops, that's broken, too.  I guess the only way to reindex
after fixing the thread detection is to start from scratch.

This reverts commit 5d91adedf5f33ef1cb87df2a86306ddf370b4f8d.

8 years agosearchidx: reindex clobbers old thread IDs
Eric Wong [Mon, 6 Feb 2017 21:08:13 +0000 (21:08 +0000)] 
searchidx: reindex clobbers old thread IDs

We cannot always reuse thread IDs since our threading
logic may change as bugs are fixed.

8 years agosearchidx: deal with empty In-Reply-To and References headers
Eric Wong [Mon, 6 Feb 2017 19:54:25 +0000 (19:54 +0000)] 
searchidx: deal with empty In-Reply-To and References headers

In some messages, these headers exist, but have empty values.
Do not let empty values throw off our search indexer to tie
threads together, as it can make non-sensical threads grouped
to a Message-Id of "" (empty string).

See
<https://public-inbox.org/git/11340844841342-git-send-email-mailing-lists.git@rawuncut.elitemail.org/raw>
for an example of such a message.

Thanks-to: Johannes Schindelin <Johannes.Schindelin@gmx.de>
  <https://public-inbox.org/git/alpine.DEB.2.20.1702041206130.3496@virtualbox/>
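
The guard described in this commit can be sketched in Python: an empty or ID-less header yields no parent instead of a parent of "". parent_message_id and its regex are hypothetical and only illustrate the idea.

```python
import re

_MID_RE = re.compile(r"<([^<>\s]+)>")


def parent_message_id(in_reply_to, references):
    """Pick the parent Message-ID for threading, or None if neither header has one."""
    for value in (in_reply_to, references):
        mids = _MID_RE.findall(value or "")
        if mids:
            # The last ID listed in References is the direct parent.
            return mids[-1]
    return None
```

Returning None lets the caller treat the message as a thread root and not attach it to a bogus empty-string parent.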

8 years agosearchview: increase limit for displaying search results
Eric Wong [Mon, 6 Feb 2017 02:38:37 +0000 (02:38 +0000)] 
searchview: increase limit for displaying search results

We are in no danger of excessive buffering or OOM-ing,
the main page for every inbox already loads 200 results;
and thread page views even load 1000!  Increase this to
200 for now.

8 years agosearchview: clarify numeric summary at bottom
Eric Wong [Mon, 6 Feb 2017 02:07:24 +0000 (02:07 +0000)] 
searchview: clarify numeric summary at bottom

Xapian can only give estimated results when a result limit is
given to it, so make clear it is an estimate to avoid showing
non-sensical ranges when no results are returned.

8 years agorepobrowse: git tag listing is now async
Eric Wong [Sat, 4 Feb 2017 02:21:06 +0000 (02:21 +0000)] 
repobrowse: git tag listing is now async

I'm unsure if this is even a good idea to support,
but we have it, for now.

8 years agorepobrowse/git/atom: remove unused subroutine
Eric Wong [Thu, 26 Jan 2017 07:58:28 +0000 (07:58 +0000)] 
repobrowse/git/atom: remove unused subroutine

We never ended up using it.

8 years agorepobrowse: simplify command generation for git commands
Eric Wong [Thu, 26 Jan 2017 04:27:02 +0000 (04:27 +0000)] 
repobrowse: simplify command generation for git commands

This shortens the code quite a bit at a negligible performance cost,
and the diffstat agrees.

8 years agoadd filter for Subject: tags
Eric Wong [Thu, 26 Jan 2017 02:09:36 +0000 (02:09 +0000)] 
add filter for Subject: tags

Some mailing lists add annoying tags into the Subject line which
discourages readers from doing proper mail organization on the
client side.  They also waste precious screen space and
attention span.

Remove them from our archives to reduce clutter.

8 years agowatchmaildir: allow arguments for filters
Eric Wong [Wed, 25 Jan 2017 21:39:06 +0000 (21:39 +0000)] 
watchmaildir: allow arguments for filters

We'll want to allow some degree of configuration for
various mailing lists.

8 years agorepobrowse: git summary view uses psgi_qx
Eric Wong [Sun, 22 Jan 2017 22:10:46 +0000 (22:10 +0000)] 
repobrowse: git summary view uses psgi_qx

This reduces one synchronous dependency from the hot path,
and psgi_return will be used in the future.

8 years agot/httpd-unix: better diagnostics and comments for test
Eric Wong [Sun, 22 Jan 2017 01:52:25 +0000 (01:52 +0000)] 
t/httpd-unix: better diagnostics and comments for test

I've hit random test failures on this, so attempt to improve
diagnostics and improve documentation for this test.

8 years agorepobrowse: preserve newlines in Atom feed
Eric Wong [Sat, 21 Jan 2017 11:50:58 +0000 (11:50 +0000)] 
repobrowse: preserve newlines in Atom feed

Commit messages are assumed to be displayed in a terminal
with a fixed width font, so we must preserve newlines and
all whitespace as-is so ASCII art may be displayed properly.

8 years agorepobrowse: simplify git log parsing implementation
Eric Wong [Sat, 21 Jan 2017 11:34:31 +0000 (11:34 +0000)] 
repobrowse: simplify git log parsing implementation

Based on what was done for the Atom feed, this will allow us to
simplify state management through metaprogramming and avoid
placeholder characters ('D' for decoration) for empty fields.

8 years agorepobrowse: fix full URL generation in Atom feed
Eric Wong [Sat, 21 Jan 2017 04:41:06 +0000 (04:41 +0000)] 
repobrowse: fix full URL generation in Atom feed

We must not drop the leading slash in the URI.  This
regression was introduced when we dropped Plack::Request
dependency.

8 years agorepobrowse: avoid extra hash assignments for Atom feed
Eric Wong [Sat, 21 Jan 2017 04:35:27 +0000 (04:35 +0000)] 
repobrowse: avoid extra hash assignments for Atom feed

This should make the code somewhat easier-to-follow.

8 years agorepobrowse: git Atom feed uses Qspawn->psgi_return
Eric Wong [Sat, 21 Jan 2017 02:29:52 +0000 (02:29 +0000)] 
repobrowse: git Atom feed uses Qspawn->psgi_return

This allows us to wait on "git log" output in a non-blocking manner
while being able to throttle on backpressure from slow clients
when used with pi-httpd.

8 years agorepobrowse: git Atom feed uses Qspawn->psgi_qx
Eric Wong [Sat, 21 Jan 2017 02:29:51 +0000 (02:29 +0000)] 
repobrowse: git Atom feed uses Qspawn->psgi_qx

This allows pi-httpd to service other I/O while we wait on "git
symbolic-ref" to run.  And psgi_return will be used in the next
commit...

8 years agoqspawn: better annotate where $qx_cb is called
Eric Wong [Sat, 21 Jan 2017 02:29:50 +0000 (02:29 +0000)] 
qspawn: better annotate where $qx_cb is called

Hopefully this makes the code easier-to-follow for random
readers.  This requires a small amount of modification to
our one caller, but this is a new, unstable API (as is
nearly all of our code).

8 years agowatchmaildir: limit live importer processes
Eric Wong [Wed, 18 Jan 2017 19:13:09 +0000 (19:13 +0000)] 
watchmaildir: limit live importer processes

We don't want to be triggering OOM or swapping on weaker
systems when we have dozens of inboxes as potential targets.

8 years agolearn: implement "rm" only functionality
Eric Wong [Thu, 19 Jan 2017 00:31:30 +0000 (00:31 +0000)] 
learn: implement "rm" only functionality

Do not consider this interface stable, but I just needed a
way to remove mis-imported multipart messages so
public-inbox-watch could pick them up again from my Maildir.

8 years agomime: avoid SUPER usage in Email::MIME subclass
Eric Wong [Wed, 18 Jan 2017 23:50:57 +0000 (23:50 +0000)] 
mime: avoid SUPER usage in Email::MIME subclass

We must call Email::Simple methods directly in our monkey patch
for Email::MIME to call the intended method.  Using SUPER in our
subclass would instead hit a different, unintended method in
Email::MIME.

Reported-by: Junio C Hamano <gitster@pobox.com>
<xmqq4m0wb43w.fsf@gitster.mtv.corp.google.com>

8 years agorepobrowse: expath is always defined
Eric Wong [Wed, 18 Jan 2017 08:17:50 +0000 (08:17 +0000)] 
repobrowse: expath is always defined

Remove an outdated comment while we're at it, too.

8 years agohttp: cast a wider net to prevent circular references
Eric Wong [Wed, 18 Jan 2017 07:35:35 +0000 (07:35 +0000)] 
http: cast a wider net to prevent circular references

We can more effectively nuke circular references by clearing
the entire PSGI $env, not just particular keys, when
there are self-referential fields such as "qspawn.response"
in our environment.
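
The effect can be demonstrated in CPython, which, like Perl, frees objects by reference count (CPython's cycle collector is disabled here to mimic Perl, which has none). Response and the two helper functions are hypothetical stand-ins; only the "qspawn.response" key comes from the commit message.

```python
import gc
import weakref


class Response:
    """Stand-in for an object stored in the request environment."""


def build_env():
    env = {"PATH_INFO": "/"}
    resp = Response()
    resp.env = env                 # the response points back at env ...
    env["qspawn.response"] = resp  # ... and env points at it: a cycle
    return env, weakref.ref(resp)


def freed_after_clear():
    """Clearing the whole env breaks the cycle, so refcounting frees the response."""
    gc.disable()  # rule out the cycle collector; rely on refcounts only
    try:
        env, resp_ref = build_env()
        env.clear()
        return resp_ref() is None
    finally:
        gc.enable()


def leaked_without_clear():
    """Dropping our own reference is not enough; the cycle keeps both objects alive."""
    gc.disable()
    try:
        env, resp_ref = build_env()
        del env
        return resp_ref() is not None
    finally:
        gc.enable()
        gc.collect()  # clean up the cycle left behind by this demonstration
```

Clearing every key avoids having to know in advance which fields are self-referential.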

8 years agorepobrowse: git snapshot waits for all commands asynchronously
Eric Wong [Wed, 18 Jan 2017 07:27:03 +0000 (07:27 +0000)] 
repobrowse: git snapshot waits for all commands asynchronously

This new asynchronous API, psgi_qx, will allow us to take
advantage of non-blocking I/O from even small commands;
as those may still need to wait for slow operations.

8 years agoqspawn: better description
Eric Wong [Tue, 17 Jan 2017 19:38:36 +0000 (19:38 +0000)] 
qspawn: better description

We'll probably use this in a lot of places...

8 years agorepobrowse: verbose git tree display uses qspawn for ls-tree
Eric Wong [Sun, 15 Jan 2017 03:11:14 +0000 (03:11 +0000)] 
repobrowse: verbose git tree display uses qspawn for ls-tree

For now, qspawn provides resource management for dealing with
expensive "git ls-tree" processes.

8 years agorepobrowse: use qspawn for plain tree views
Eric Wong [Sun, 15 Jan 2017 02:26:39 +0000 (02:26 +0000)] 
repobrowse: use qspawn for plain tree views

We may eventually handle tree parsing ourselves (since we
already git cat-file), but for now we can rely on ls-tree
to give good output and qspawn to manage resource allocation.

8 years agorepobrowse: git: drop unused diff parsing routines
Eric Wong [Wed, 11 Jan 2017 08:46:35 +0000 (08:46 +0000)] 
repobrowse: git: drop unused diff parsing routines

We don't need these legacy routines anymore and use the
newer stream-friendly _sed interface.

8 years agohttpd/async: stop running command if client disconnects
Eric Wong [Fri, 13 Jan 2017 23:10:25 +0000 (23:10 +0000)] 
httpd/async: stop running command if client disconnects

If an HTTP client disconnects while we're piping the output of a
process to them, break the pipe of the process to reclaim
resources as soon as possible.

8 years agorepobrowse: simplify conditional for cat-file input
Eric Wong [Fri, 13 Jan 2017 22:53:20 +0000 (22:53 +0000)] 
repobrowse: simplify conditional for cat-file input

expath is always defined, even to an empty string,
so simplify the conditional for checking it.

8 years agorename "GitAsyncRd" to "GitAsync"
Eric Wong [Fri, 13 Jan 2017 22:28:36 +0000 (22:28 +0000)] 
rename "GitAsyncRd" to "GitAsync"

This wrapper class actually does both reading and
writing, and a shorter name is nicer.

8 years agogitasyncrd: pass a reference to Danga::Socket::write
Eric Wong [Fri, 13 Jan 2017 22:24:45 +0000 (22:24 +0000)] 
gitasyncrd: pass a reference to Danga::Socket::write

D::S creates a reference for this, anyways, so avoid
the extra work by doing it ourselves.

8 years agorepobrowse: comment describing Git wrapper creation
Eric Wong [Fri, 13 Jan 2017 22:05:10 +0000 (22:05 +0000)] 
repobrowse: comment describing Git wrapper creation

Metaprogramming can be difficult-to-read after several
months, so leave comments in place to describe the results
of common usage.

8 years agorepobrowse: port git log view to qspawn streaming interface
Eric Wong [Fri, 13 Jan 2017 02:13:18 +0000 (02:13 +0000)] 
repobrowse: port git log view to qspawn streaming interface

This will prevent too many processes from being spawned at once
while also allowing us to respond to backpressure from slow
clients.
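
The limiting half of this idea can be sketched in Python with asyncio. This is not the qspawn API: Limiter is a hypothetical class, and a short sleep stands in for a spawned "git log" process.

```python
import asyncio


class Limiter:
    """Allow at most max_running jobs at once; the rest wait their turn."""

    def __init__(self, max_running):
        self._sem = asyncio.Semaphore(max_running)
        self.running = 0
        self.peak = 0

    async def run(self, job):
        async with self._sem:
            self.running += 1
            self.peak = max(self.peak, self.running)
            try:
                return await job()
            finally:
                self.running -= 1


async def demo():
    limiter = Limiter(2)

    async def job():
        await asyncio.sleep(0.01)  # stand-in for waiting on a child process
        return 1

    results = await asyncio.gather(*(limiter.run(job) for _ in range(5)))
    return results, limiter.peak
```

Running asyncio.run(demo()) completes all five jobs while never having more than two in flight. Backpressure from slow clients is a separate concern that this sketch does not model.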

8 years agoinbox: reinstate periodic cleanup of Xapian and SQLite objects
Eric Wong [Wed, 11 Jan 2017 10:13:00 +0000 (10:13 +0000)] 
inbox: reinstate periodic cleanup of Xapian and SQLite objects

We may need to do this even more aggressively, since the
Xapian database does not always give the latest results.
This time, we'll do it without relying on weak references,
and instead check refcounts.

8 years agorepobrowse: make git diff output use qspawn
Eric Wong [Wed, 11 Jan 2017 04:12:29 +0000 (04:12 +0000)] 
repobrowse: make git diff output use qspawn

This is a potentially expensive operation, so we may want to
give it its own limiter channel.

8 years agodiff: note the dangers of gigantic anchors hash
Eric Wong [Wed, 11 Jan 2017 04:12:28 +0000 (04:12 +0000)] 
diff: note the dangers of gigantic anchors hash

8 years agoasync: improve and fix out-of-date comments
Eric Wong [Wed, 11 Jan 2017 04:12:27 +0000 (04:12 +0000)] 
async: improve and fix out-of-date comments

8 years agorepobrowse: qspawn + streaming for git commit display
Eric Wong [Wed, 11 Jan 2017 04:12:26 +0000 (04:12 +0000)] 
repobrowse: qspawn + streaming for git commit display

This prevents "git show" processes from monopolizing
the system and allows us to better handle backpressure
from gigantic commits.