models: Validate Project.linkname does not contain forward slash
I started by creating a project that contained a forward slash
(importing patches from https://lists.sr.ht/~sircmpwn/sr.ht-dev/) and
it fails to render the "projects" main page.
The specific error reads:
NoReverseMatch at /
Reverse for 'patch-list' with keyword arguments
'{'project_id': 'foo/bar'}' not found. 1 pattern(s) tried:
['project/(?P<project_id>[^/]+)/list/$']
which appears to explicitly disallow forward slashes.
So I think it makes sense to validate that project linkname doesn't
contain forward slahes.
This implementation uses the validate_unicode_slug validator instead of just
rejecting inputs that contain forward slashes.
Signed-off-by: Thomas Bracht Laumann Jespersen <t@laumann.xyz> Signed-off-by: Stephen Finucane <stephen@that.guru> Closes: #380
Django 3.1 adds a new admin sidebar feature that requires the
django.template.context_processors.request context processor to be enabled
in the settings.
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Stephen Finucane <stephen@that.guru>
Andrew Donnellan [Thu, 27 Aug 2020 14:14:03 +0000 (00:14 +1000)]
urls: Update url pattern functions
Django 3.1 deprecates django.conf.urls.url() as an alias for
django.urls.re_path(). Also switch to using django.urls.include() rather
than django.conf.urls.include().
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Stephen Finucane <stephen@that.guru>
Rohit Sarkar [Sat, 1 Aug 2020 06:24:15 +0000 (11:54 +0530)]
management: introduce replacerelations command
The replacerelations script is used to ingest relations into Patchwork's
patch database. A patch groups file is taken as input, which on each
line contains a space separated list of patchwork ids denoting a
relation. All the existing relations in Patchwork's database are removed
and the relations read from the patch groups file are ingested.
Signed-off-by: Rohit Sarkar <rohitsarkar5398@gmail.com>
[dja: pep8, drop relations directory as empty dirs don't get stored by git,
comment about how lines are generated.] Signed-off-by: Daniel Axtens <dja@axtens.net>
Stephen Finucane [Thu, 30 Apr 2020 21:22:10 +0000 (22:22 +0100)]
tox: Add default Django version
I occasionally forget myself and run e.g. 'tox -e pyNN' when I want to
sanity check something instead of 'tox -e pyNN-djangoMM'. Add fallback
Django versions so that this doesn't crash and burn. It's less succict
than it could be since tox doesn't seem to support '!django{22,30}'
(yet!).
Signed-off-by: Stephen Finucane <stephen@that.guru>
Stephen Finucane [Sun, 26 Apr 2020 13:17:57 +0000 (14:17 +0100)]
migrations: Resolve issues with other DB backends for 0042, 0043
0042 was using MySQL-specific SQL to delete entries in the
'patchwork_comment' table that were associated with entries in the
'patchwork_coverletter' table, while 0043 only considered MySQL and
PostgrSQL when attempting to copy fields from 'patchwork_patch' to
'patchwork_submission'. Both issues are resolved.
Signed-off-by: Stephen Finucane <stephen@that.guru>
Stephen Finucane [Fri, 10 Apr 2020 11:08:32 +0000 (12:08 +0100)]
migrations: Squash 0001 to 0040
Now that we're moving a new major version, we can squash the migrations
we have to date. This coexists alongside the old migrations, which we
can remove and replace entirely when we release 4.0, per the advice of
the 'squashmigrations' tool.
You should commit this migration but leave the old ones in place; the
new migration will be used for new installs. Once you are sure all
instances of the codebase have applied the migrations you squashed,
you can delete them.
The 'squashmigrations' tool can't parse the output of 'RunPython' blocks
so the output of the tool was less optimized than it could be. As a
result, we've manually modified this change to remove the 'RunPython'
block and unnecessary 'AlterField' entries. This was done by removing
all migrations and generating a new "initial" migration, which was then
modified to mark all strings as byte strings (as they were when we were
using Python 2 to generate these migrations) so that 0041 would apply
cleanly.
The main benefit of this change is that it significantly reduces the
startup time for unit tests. Executed on my host, the run time for a
single test goes from ~ 22 seconds to ~ 14 seconds. This is obviously
reduced for additional tests.
Signed-off-by: Stephen Finucane <stephen@that.guru>
Oh, the follies of youth. Time to undo the damage of 2.0.0, specifically
commit 86172ccc16, which split Patch into two separate models using
concrete inheritance. As noted previously, this introduced a large
number of unavoidable JOINs across large tables and the performance
impacts these introduce are blocking other features we want, such as
improved tagging functionality. To combine these two models, we must do
the following:
- Update any references to the 'Patch' model to point to the
'Submission' model instead
- Move everything from 'Patch' to 'Submission', including both fields
and options
- Delete the 'Patch' model
- Rename the 'Submission' model to 'Patch'
With this change, our model "hierarchy" goes from:
Submission
Patch
PatchComment
Cover
CoverComment
To a nice, flat:
Patch
PatchComment
Cover
CoverComment
I expect this migration to be intensive, particularly for MySQL users
who will see their entire tables rewritten. Unfortunately I don't see
any way to resolve this in an easier manner.
Signed-off-by: Stephen Finucane <stephen@that.guru>
Stephen Finucane [Sat, 29 Feb 2020 17:20:50 +0000 (17:20 +0000)]
models: Split 'CoverLetter' from 'Submission'
We want to get rid of the split between 'Patch' and 'Submission' because
of the cost of using JOINs basically everywhere we use 'Patch'. Before
we do that, we need to move the other users of 'Submission' to other
models and other models that rely on these users sharing the common
'Submission' base. For the former, there is only one user,
'CoverLetter', while for the latter there is only the 'Comment' model.
As a result, we must do the following:
- Create a new 'Cover' model
- Create a new 'CoverComment' model
- Move everything from 'CoverLetter' to 'Cover' and all entries
associated with a 'CoverLetter' from 'Comment' to 'CoverComment'
- Delete the 'CoverLetter' model
- Rename the 'Comment' model to 'PatchComment'
This means our model "hierarchy" goes from:
Submission
Patch
CoverLetter
Comment
To:
Submission
Patch
PatchComment
Cover
CoverComment
A future change will flatten the 'Submission' and 'Patch' model.
Note that this actually highlighted a bug in Django, which has since
been reported upstream [1]. As noted there, the issue stems from MySQL's
refusal to remove an index from a foreign key when DB constraints are
used and the workaround is to remove the foreign key constraint before
altering the indexes and then re-add the constraint after.
[1] https://code.djangoproject.com/ticket/31335
Signed-off-by: Stephen Finucane <stephen@that.guru>
Stephen Finucane [Sat, 18 Apr 2020 12:21:13 +0000 (13:21 +0100)]
travis: Resolve warnings, info messages from Travis
The following were reported by Travis' build config validation:
- root: deprecated key 'sudo' (The key `sudo` has no effect anymore.)
- env: key 'matrix' is an alias for 'jobs', using 'jobs'
- root: key 'matrix' is an alias for 'jobs', using 'jobs'
- root: missing 'os', using the default 'linux'
Resolve all of the above.
Signed-off-by: Stephen Finucane <stephen@that.guru>
Jeremy Kerr [Thu, 16 Apr 2020 01:29:25 +0000 (09:29 +0800)]
tests: ensure we don't see database errors during duplicate insert
Currently, the parser causes IntegrityErrors while inserting duplicate
patches; these tend to pollute database logs.
This change adds a check, which currently fails, to ensure we do not
cause errors during a duplicate patch parse.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Stephen Finucane <stephen@that.guru>
[stephenfin: Add 'expectedFailure' marker to keep all tests green]
Andrew Donnellan [Wed, 15 Apr 2020 09:06:56 +0000 (19:06 +1000)]
parser: Don't crash when From: is list email but has weird mangle format
get_original_sender() tries to demangle DMARC-mangled From headers, in
the case where the email's From address is the list address. It knows how
to handle Google Groups and Mailman style mangling, where the original
submitter's name will be turned into e.g. "Andrew Donnellan via
linuxppc-dev".
If an email has the From header set to the list address but has a name that
doesn't include " via ", we'll throw an exception because stripped_name
hasn't been set. Sometimes this is because the list name is seemingly
empty, resulting in a mangled name like "Andrew Donnellan via"
without the space after "via" that we detect. Handle this as well as we can
instead, and add a test.
Fixes: 8279a84238c10 ("parser: Unmangle From: headers that have been mangled for DMARC purposes") Reported-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Stephen Finucane <stephen@that.guru>
Stephen Finucane [Wed, 15 Apr 2020 15:52:08 +0000 (16:52 +0100)]
tests: Switch to openapi-core 0.13.x
We've done the necessary work here already so this is a relatively easy
switchover. However, we do have to work around an issue whereby the
first possible matching route is used rather than the best one [1]. In
addition, we have to install from master since there are fixes missing
from the latest release, 0.13.3. Hopefully both issues will be resolved
in a future release.
Stephen Finucane [Wed, 15 Apr 2020 15:44:32 +0000 (16:44 +0100)]
tests: Drop Django 1.x support
openapi-core 0.13.x has added support for Django validation. Before we
migrate to that version and presumably remove most of this code, remove
the stuff that is *definitely* dead.
Signed-off-by: Stephen Finucane <stephen@that.guru>
Stephen Finucane [Thu, 16 Apr 2020 13:03:54 +0000 (14:03 +0100)]
docs: Resolve issues with 'relations'
Two issues here:
- 'PATCH /patches/{id}' and 'PUT /patches/{id}' expect a list of
integers on the 'related' field - not strings
- 'GET /patches' and 'GET /patches/{id}' return a list of embedded patch
objects on the 'related' field - not strings
Signed-off-by: Stephen Finucane <stephen@that.guru>
Stephen Finucane [Thu, 16 Apr 2020 12:45:28 +0000 (13:45 +0100)]
docs: Resolve issues with 'events'
Four things to change here:
- The response is any array that can contain any type of event, not one
of them.
- The 'actor' field is nullable.
- The 'cover' field of the 'cover-created' event is an embedded cover
letter, not a string.
- The specifications for the 'current_delegate' and 'previous_delegate'
fields of the 'patch-delegated' field were apparently invalid.
Signed-off-by: Stephen Finucane <stephen@that.guru>
Stephen Finucane [Wed, 15 Apr 2020 23:51:17 +0000 (00:51 +0100)]
docs: Resolve issues with 'covers'
Two issues to correct:
- Each header in the 'headers' field can be either a string or a list
value.
- We state that the 'content' field will always have content but our
tests were configuring otherwise.
Signed-off-by: Stephen Finucane <stephen@that.guru>
Stephen Finucane [Wed, 15 Apr 2020 22:36:59 +0000 (23:36 +0100)]
docs: Resolve issues with 'patches'
Four issues to resolve:
- The 'tags' field is a key-value mapping, not an array.
- Each header in the 'headers' field can be either a string or a list
value.
- Errors are reported as a mapping of the field name to an array of
errors, not a string.
- The security type information isn't complete and doesn't account for
security types. Skip it for now.
Signed-off-by: Stephen Finucane <stephen@that.guru>
Stephen Finucane [Wed, 15 Apr 2020 22:37:17 +0000 (23:37 +0100)]
docs: Make embedded fields nullable by default
As discussed at [1], "subtypes can add restrictions, but they cannot
relax restrictions that are already in place." These fields need to be
marked nullable and then "subclassed" to set non-nullability if
required.
Stephen Finucane [Fri, 17 Apr 2020 22:07:27 +0000 (23:07 +0100)]
REST: Allow update of bundle without patches
Presently, when updating a patch we assume that patches are provided.
This isn't necessary - you might just want to make it public - and isn't
enforced by the API itself. However, because we make this assumption, we
see a HTTP 500. Resolve the issue and add tests to prevent a regression.
Signed-off-by: Stephen Finucane <stephen@that.guru>
Resolves: #357
Daniel Axtens [Tue, 14 Apr 2020 05:34:40 +0000 (15:34 +1000)]
api: allow filtering patches and covers by msgid
In the process of fixing the previous bug, I realised that:
a) /api/patches/msgid is a perfectly reasonable thing to attempt
b) We have no way of finding a patch by message id in the API
We can't actualy make /api/patches/msgid work because it may not
be unique, but we can add a filter.
I'm shoehorning this into stable/2.2, even though it's technically
an API change: it's minor, not incompatible and in hindsight a
glaring hole.
Cc: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Jeremy Kerr <jk@ozlabs.org> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Stephen Finucane <stephen@that.guru> Signed-off-by: Daniel Axtens <dja@axtens.net>
Daniel Axtens [Tue, 14 Apr 2020 04:26:58 +0000 (14:26 +1000)]
api: do not fetch every patch in a patch detail view 404
mpe and jk and sfr found that the OzLabs server was melting due
to some queries downloading every patch.
Turns out if you 404 the patch detail view in the API, d-r-f attempts
to render a listbox with every single patch to fill in the 'related'
field. The bundle API also has a similar field.
Replace the multiple selection box with a text field. You can still
(AIUI) populate the relevant patch IDs manually.
This is the recommended approach per
https://www.django-rest-framework.org/topics/browsable-api/#handling-choicefield-with-large-numbers-of-items
Reported-by: Jeremy Kerr <jk@ozlabs.org> Reported-by: Michael Ellerman <mpe@ellerman.id.au> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Tested-by: Jeremy Kerr <jk@ozlabs.org> Server-no-longer-on-fire-by: Jeremy Kerr <jk@ozlabs.org> Reviewed-by: Stephen Finucane <stephen@that.guru> Signed-off-by: Daniel Axtens <dja@axtens.net>
Stephen Finucane [Fri, 10 Apr 2020 16:33:26 +0000 (17:33 +0100)]
pre-commit: Use Python 3 for everything
This lets us use e.g. f-strings. We also bump the version of the default
pre-commit lib and migrate to the upstream flake8 plugin, since the old
one is now deprecated.
Signed-off-by: Stephen Finucane <stephen@that.guru>
Note that we need to use a subclass because the 'ServerProxy' class,
rather annoyingly, does not expose a 'close()' method. Instead, you're
expected to use a context manager, which isn't useful from the context
of a 'setUp' call. We could call '__enter__' and '__exit__' manually but
this seems cleaner. Also note that 'Server' was an alias of
'ServerProxy' [1], and we're taking the opportunity to switch here.
Daniel Axtens [Tue, 17 Mar 2020 13:59:14 +0000 (00:59 +1100)]
REST: fix patch listing query
The patch listing query is punishingly slow under even very simple
filters.
The new data model in 3.0 will help _a lot_, so this is a simple fix: I
did try indexes but haven't got really deeply into the weeds of what we can do
with them.
Move a number of things from select_related to prefetch_related: we trade off
one big, inefficient query for a slightly larger number of significantly more
efficient queries.
On my laptop with 2 copies of the canonical kernel team list loaded into the
database, and considering only the API view (the JSON view is always faster)
with warm caches and considering the entire set of SQL queries:
- /api/patches/?project=1
~1.4-1.5s -> <100ms, something like 14x better
- /api/patches/?project=1&since=2010-11-01T00:00:00
~1.7-1.8s -> <560ms, something like 3x better (now dominated by the
counting query only invoked on the HTML API view,
not the pure JSON API view.)
The things I moved:
* project: this was generating SQL that looked like:
INNER JOIN `patchwork_project` T5
ON (`patchwork_submission`.`project_id` = T5.`id`)
This is correct but we've already had to join the patchwork_submission
table and perhaps as a result it seems to be inefficient.
* series__project: Likewise we've already had to join the series table,
doing another join is possibly why it is inefficient.
* delegate: I do not know why this was tanking performance. I think it
might relate to the strategy mysql was using.
Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org> Signed-off-by: Daniel Axtens <dja@axtens.net> Reviewed-by: Stephen Finucane <stephen@that.guru>
Daniel Axtens [Tue, 17 Mar 2020 13:59:13 +0000 (00:59 +1100)]
REST: massively improve the patch counting query under filters
The DRF web view counts the patches as part of pagination.
The query it uses is a disaster zone:
SELECT Count(*) FROM (
SELECT DISTINCT
`patchwork_submission`.`id` AS Col1,
`patchwork_submission`.`msgid` AS Col2,
`patchwork_submission`.`date` AS Col3,
`patchwork_submission`.`submitter_id` AS Col4,
`patchwork_submission`.`project_id` AS Col5,
`patchwork_submission`.`name` AS Col6,
`patchwork_patch`.`submission_ptr_id` AS Col7,
`patchwork_patch`.`commit_ref` AS Col8,
`patchwork_patch`.`pull_url` AS Col9,
`patchwork_patch`.`delegate_id` AS Col10,
`patchwork_patch`.`state_id` AS Col11,
`patchwork_patch`.`archived` AS Col12,
`patchwork_patch`.`hash` AS Col13,
`patchwork_patch`.`patch_project_id` AS Col14,
`patchwork_patch`.`series_id` AS Col15,
`patchwork_patch`.`number` AS Col16,
`patchwork_patch`.`related_id` AS Col17
FROM `patchwork_patch`
INNER JOIN `patchwork_submission`
ON (`patchwork_patch`.`submission_ptr_id`=`patchwork_submission`.`id`)
WHERE `patchwork_submission`.`project_id`=1
)
This is because django-filters adds a DISTINCT qualifier on a
ModelMultiChoiceFilter by default. I guess it makes sense and they do a
decent job of justifying it, but it causes the count to be made with
this awful subquery. (The justification is that they don't know if you're
filtering on a to-many relationship, in which case there could be
duplicate values that need to be removed.)
While fixing that, we can also tell the filter to filter on patch_project
rather than submission's project, which allows us in some cases to avoid
the join entirely.
The resultant SQL is beautiful when filtering by project only:
SELECT COUNT(*) AS `__count`
FROM `patchwork_patch`
WHERE `patchwork_patch`.`patch_project_id` = 1
On my test setup (2x canonical kernel mailing list in the db, warm cache,
my laptop) this query goes from >1s to ~10ms, a ~100x improvement.
If we filter by project and date the query is still nice, but still also
very slow:
SELECT COUNT(*) AS `__count`
FROM `patchwork_patch`
INNER JOIN `patchwork_submission`
ON (`patchwork_patch`.`submission_ptr_id`=`patchwork_submission`.`id`)
WHERE (`patchwork_patch`.`patch_project_id`=1 AND `patchwork_submission`.`date`>='2010-11-01 00:00:00')
This us from ~1.3s to a bit under 400ms - still not ideal, but I'll take
the 3x improvement!
Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org> Signed-off-by: Daniel Axtens <dja@axtens.net> Reviewed-by: Stephen Finucane <stephen@that.guru>
Mete Polat [Thu, 27 Feb 2020 23:29:32 +0000 (23:29 +0000)]
REST: Add patch relations
View relations and add/update/delete them as a maintainer. Maintainers
can only create relations of patches which are part of a project they
maintain. Because this is a writable many-many nested relationship, it
behaves a little unusually. In short:
- All operations use PATCH to the 'related' field of a patch
- To relate a patch to another patch, say 7 to 19, either:
PATCH /api/patch/7 related := [19]
PATCH /api/patch/19 related := [7]
- To delete a patch from a relation, say 1, 21 and 42 are related but we
only want it to be 1 and 42:
PATCH /api/patch/21 related := []
* You _cannot_ remove a patch from a relation by patching another
patch in the relation: I'm trying to avoid read-modify-write loops.
* Relations that would be left with just 1 patch are deleted. This is
only ensured in the API - the admin interface will let you do this.
- Break-before-make: if you have [1, 12, 24] and [7, 15, 42] and you want
to end up with [1, 12, 15, 42], you have to remove 15 from the second
relation first:
PATCH /api/patch/1 related := [15] will fail with 409 Conflict.
Instead do:
PATCH /api/patch/15 related := []
PATCH /api/patch/1 related := [15]
-> 200 OK, [1, 12, 15, 42] and [7, 42] are the resulting relations
Signed-off-by: Mete Polat <metepolat2000@gmail.com> Signed-off-by: Stephen Finucane <stephen@that.guru> Signed-off-by: Daniel Axtens <dja@axtens.net>
Mete Polat [Thu, 27 Feb 2020 23:29:31 +0000 (23:29 +0000)]
models, templates: Add patch relations
Introduces the ability to add relations between patches. Relations are
displayed in the details page of a patch under 'Related'. Related
patches located in another projects can be viewed as well.
Changes to relations are tracked in events. Currently the display of
this is very bare in the API but that will be fixed in a subsequent patch:
this is the minimum required to avoid throwing errors when you view the
events feed.
Signed-off-by: Mete Polat <metepolat2000@gmail.com>
[dja: address some review comments from Stephen, add an admin view,
move to using Events, misc tidy-ups.] Signed-off-by: Daniel Axtens <dja@axtens.net>
Migration 0039 attempts to move patches that have ended up in an
arbitrary series due to race conditions into the correct series.
However, there are a number of race conditions that can occur here that
make this particularly tricky to do. Given that series are really just
arbitary metadata, it's not really necessary to do this...so don't.
Instead, just delete the series references that identical message IDs
and below to the same project, allowing us to add the uniqueness
constraint and prevent the issue bubbling up in the future. This means
we're still left with orphaned series but these could be fixed manually,
if necessary.
Signed-off-by: Stephen Finucane <stephen@that.guru> Tested-by: Ali Alnubani <alialnu@mellanox.com> Closes: #340
parser: Don't group patches with different versions in a series
As noted in #340 [1], if a patch from a series is dropped or
miscategorised, patches from a later revision of that series can end up
included in the earlier series rather than in their own series. This was
actually intentional as part of the fix for #105 [2]. However,
completely ignoring this information can be problematic. Refine things
by checking for versions and, if they don't match, using timeboxing to
try guess if they should be kept together. This would resolve the issue
seen in #340 while preventing a regression for #105.
Daniel Axtens [Tue, 28 Jan 2020 13:55:20 +0000 (00:55 +1100)]
pyenv: also install requirements for python2
The first time you do a migration with python3, you get a whole
lot of seemingly null changes. This is a bit annoying so I use
py2 to generate the changes. To do that, first fix the pyenv
transition so requirements are still installed for python2.
Signed-off-by: Daniel Axtens <dja@axtens.net> Reviewed-by: Stephen Finucane <stephen@that.guru>
Pranav Annam [Wed, 12 Feb 2020 03:35:47 +0000 (04:35 +0100)]
docker: update dependency for current build
The dependency libssl1.0-dev in the Dockerfile makes docker build fail:
The following packages have unmet dependencies:
libmysqlclient-dev : Depends: libssl-dev (>= 1.1.1-1ubuntu2.1~18.04.5~)
but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
There seems to be a conflict with different versions of libssl and
libmysqlclient that did not exist previously with Ubuntu 18.04.
Just use the current libssl-dev from Ubuntu 18.04 to fix the build.
Mete Polat [Wed, 29 Jan 2020 19:01:22 +0000 (20:01 +0100)]
REST: Fix duplicate project queries
Eliminates duplicate project queries caused by calling
get_absolute_url() in the embedded serializers. Following foreign keys
with 'series__project' will cache the project of the series as well as
the series itself.
Signed-off-by: Mete Polat <metepolat2000@gmail.com> Signed-off-by: Stephen Finucane <stephen@that.guru> Closes: #335
Another fix for copy-pasted pull requests, this time for cases
when something is copy-pasted from a terminal and retains all
the bogus trailing whitespace.