Denis Laxalde [Tue, 10 May 2022 07:01:07 +0000 (09:01 +0200)]
test: parametrize transaction tests with pipeline on/off
The 'conn' fixture is overloaded in transaction tests, pulling the
'pipeline' fixture to have them run with or without a pipeline.
When checking the connection's transaction state just after entering
the transaction() block (e.g. in test_basic), we need to sync the
pipeline in order to get the state right. This is just for tests, as the
exact transaction state in pipeline mode is not deterministic, since
query results are not fetched directly.
This is because the connection state is not correct at the end of the
transaction; we'll fix this in the next commit.
Some others were adjusted as follows:
* when entering the transaction() block, e.g. in test_begins_on_enter,
we eventually call pipeline.sync() in order to get the state right
(this is just for tests as the exact transaction state, in pipeline
mode, is not deterministic since query results are not fetched
directly);
* test_context_inerror_rollback_no_clobber[pipeline=on] is skipped, as
explained in comment inline;
* test_prohibits_use_of_commit_rollback_autocommit[pipeline=on] is
xfail, because Connection._check_intrans() does not check for pipeline
mode (see TODO inline about a possible solution);
* test_str now checks for [<txstatus>, pipeline=ON] in str(Transaction).
fix: fix integration between pipelines and nested transaction
Refactor to remove all the knowledge about pipelines from the
transaction object. Replaced by just syncing before entering the
transaction.
We had to drop the pipeline exit step, which will now fetch results
whatever the nesting level. Otherwise, a final sync would have had to be
forced in order to exit an inner transaction block with a clean state.
test: drop TODO points and emergency rollbacks from pipeline tests
The previous changeset restored some sanity. The behaviour in
`test_errors_raised_on_commit` is not inconsistent with the non-pipeline
mode because the two are not comparable: in non-pipeline mode the
exception is raised by `execute()`, not by `commit()`. If an exception
is raised in `commit()` itself, it is already handled consistently, as
`test_error_on_commit` shows.
Denis Laxalde [Mon, 9 May 2022 19:14:59 +0000 (21:14 +0200)]
fix: wrap transaction in pipelines if the connection has one
When entering a transaction on a connection in pipeline mode, we open an
inner pipeline to ensure that a Sync is emitted at the end of
transaction thus restoring the connection in its expected state (i.e.
the same as in non-pipeline mode).
Make sure we receive the result of the commit, so we can raise an
immediate exception, as we do for connection-handled transactions.
Highlight, in the test, a behaviour difference between commit in
pipeline and non-pipeline mode. In the latter, after a failed commit,
the connection is left IDLE; in pipeline mode it is left INERROR, so we
need a further rollback to keep using it. We'll look into whether this
can be made consistent.
fix: fix nested transaction entering in pipeline mode
If the connection is idle, it might be that we are missing results.
Receive them to align the connection state to what the transaction
object expects, otherwise the assert in _push_savepoint() will fail.
This maintains the expectation that, after commit(), the operation is
really persisted. As per the Postgres pipeline docs:
The client must not assume that work is committed when it sends a
COMMIT — only when the corresponding result is received to confirm the
commit is complete.
With this change we are effectively receiving the result of the commit,
raising an exception if it failed.
This allows a sync() call to establish a high-level synchronization
point between the application and the server. If there were errors in
the processing so far, sync() will raise the exception.
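As a sketch of this behaviour, here is a toy model (illustrative only, not psycopg's actual implementation; all names are invented) of a pipeline whose sync() surfaces errors accumulated while queries were queued:

```python
class PipelineError(Exception):
    pass


class ToyPipeline:
    """Toy model: operations are queued; failures surface only at sync()."""

    def __init__(self):
        self.queued = []

    def execute(self, op):
        # In pipeline mode the query is only queued; any failure is
        # discovered when the corresponding result is received.
        self.queued.append(op)

    def sync(self):
        # Establish a synchronization point: receive all pending results
        # and raise the first error encountered, if any.
        first_error = None
        while self.queued:
            op = self.queued.pop(0)
            try:
                op()
            except Exception as exc:
                first_error = first_error or exc
        if first_error is not None:
            raise PipelineError(str(first_error))


p = ToyPipeline()
p.execute(lambda: 1 + 1)   # succeeds, but nothing is checked yet
p.execute(lambda: 1 / 0)   # fails, but only once results are received
try:
    p.sync()
except PipelineError as exc:
    print("error surfaced at sync():", exc)
```

The point of the sketch: execute() never raises; the error only travels back to the application at the synchronization point, mirroring what the commit describes.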
test: add test to verify the wrong array oid with numbers
The array binary dumper does the right thing; the text one picks
numeric[] unconditionally. It was clearly made on purpose, but #293
shows that it's a bad idea.
It's not needed to force a results fetch when exiting a nested pipeline,
as this would naturally happen later in the outer pipeline or
eventually at the end of the outermost one. Rather, just emit a Sync, as
it allows the outer pipeline to resume normal execution in case the
inner one got into an aborted state.
During previous refactorings, we made Pipeline.sync() also fetch results
from the server. But this somewhat breaks the semantics of the
synchronization point as defined by Postgres, because the user might be
interested in emitting a Sync message solely as a way to close the
current series of queries in the pipeline: i.e., flush queries from
client to server and reset the pipeline error state. In this respect,
the 'fetch' step should be explicit.
BasePipeline._sync_gen() is changed to only emit a Sync and a new
_exit_gen() method is introduced doing what _sync_gen() previously did.
Accordingly, the warning emitted when calling this _exit_gen() at
pipeline exit is adjusted to say "terminating" instead of "syncing".
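A minimal sketch of the split (the _sync_gen/_exit_gen names come from the text above; the rest is a toy, not psycopg's code): sync() only emits a Sync message, while fetching results is a separate, explicit step performed at pipeline exit:

```python
class ToyPipeline:
    """Toy model separating 'emit Sync' from 'fetch results'."""

    def __init__(self):
        self.sent = []      # protocol messages "sent" to the server
        self.results = []   # results fetched back so far

    def queue(self, query):
        self.sent.append(("Query", query))

    def sync(self):
        # Close the current series of queries and reset the pipeline
        # error state -- but do NOT fetch results.
        self.sent.append(("Sync",))

    def exit(self):
        # The explicit fetch step (what _exit_gen does in the text).
        self.sync()
        self.results = [m[1] for m in self.sent if m[0] == "Query"]


p = ToyPipeline()
p.queue("select 1")
p.sync()
assert p.results == []          # sync alone fetched nothing
p.exit()
assert p.results == ["select 1"]
```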
The details shown are only the communication internals. So we go from:
Traceback (most recent call last):
File "trace.py", line 301, in <module>
conn.execute("select pg_sleep(0.2)")
File "/usr/lib/python3.8/contextlib.py", line 120, in __exit__
next(self.gen)
File "/home/piro/dev/psycopg3/psycopg/psycopg/connection.py", line 881, in pipeline
yield pipeline
File "/home/piro/dev/psycopg3/psycopg/psycopg/_pipeline.py", line 197, in __exit__
self.sync()
File "/home/piro/dev/psycopg3/psycopg/psycopg/_pipeline.py", line 183, in sync
self._conn.wait(self._sync_gen())
File "/home/piro/dev/psycopg3/psycopg/psycopg/connection.py", line 896, in wait
return waiting.wait(gen, self.pgconn.socket, timeout=timeout)
File "/home/piro/dev/psycopg3/psycopg/psycopg/waiting.py", line 237, in wait_epoll
s = gen.send(ready)
File "/home/piro/dev/psycopg3/psycopg/psycopg/_pipeline.py", line 92, in _sync_gen
yield from self._communicate_gen()
File "/home/piro/dev/psycopg3/psycopg/psycopg/_pipeline.py", line 105, in _communicate_gen
self._process_results(queued, results)
File "/home/piro/dev/psycopg3/psycopg/psycopg/_pipeline.py", line 149, in _process_results
raise e.error_from_result(result, encoding=pgconn_encoding(self.pgconn))
psycopg.errors.UndefinedColumn: column "foo" does not exist
LINE 1: select foo
^
to:
Traceback (most recent call last):
File "trace.py", line 301, in <module>
conn.execute("select pg_sleep(0.2)")
File "/usr/lib/python3.8/contextlib.py", line 120, in __exit__
next(self.gen)
File "/home/piro/dev/psycopg3/psycopg/psycopg/connection.py", line 881, in pipeline
yield pipeline
File "/home/piro/dev/psycopg3/psycopg/psycopg/_pipeline.py", line 205, in __exit__
raise exc2.with_traceback(None)
psycopg.errors.UndefinedColumn: column "foo" does not exist
LINE 1: select foo
^
which shows much more clearly that the traceback was raised at pipeline
block exit.
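The trimming relies on resetting the exception's traceback before re-raising. A small self-contained demonstration of the with_traceback(None) idiom (independent of psycopg; the helper names are invented):

```python
import traceback


def fail_deep():
    def inner():
        raise ValueError("boom")
    inner()


def frame_count(trim):
    """Re-raise an exception caught from fail_deep(); count tb frames."""
    try:
        try:
            fail_deep()
        except ValueError as exc:
            if trim:
                # Drop the accumulated frames: only the re-raise remains.
                raise exc.with_traceback(None)
            raise exc
    except ValueError as exc:
        return len(traceback.extract_tb(exc.__traceback__))


full = frame_count(trim=False)
trimmed = frame_count(trim=True)
assert trimmed < full   # the trimmed traceback is much shorter
```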
perf: avoid unnecessary recvfrom() in cursor.stream()
Call PQisBusy() before PQconsumeInput() on fetching results. If not
busy, don't call PQconsumeInput() at all but just go to fetching results
and notifications.
This is especially useful in single-row mode because, most of the time,
the libpq can produce several results after a single network fetch.
Previously we were calling PQconsumeInput() even when results were
already on the client and there was nothing new to read, which forced
the libpq to run a select() to tell apart a lack of data from an EOF
(see `the grumble`_) and caused the overhead reported in #286.
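A toy model of the fetch loop (FakeLibpq is invented for illustration; the real code goes through the libpq's PQisBusy()/PQconsumeInput()): checking busy-ness first means the socket is only touched when a result is not already buffered client-side:

```python
class FakeLibpq:
    """Fake libpq: counts how often consume_input() touches the socket."""

    def __init__(self, results):
        self.buffered = list(results)   # results already parsed client-side
        self.consume_calls = 0          # each call models a recvfrom()

    def is_busy(self):
        # Busy means no complete result is buffered yet.
        return not self.buffered

    def consume_input(self):
        self.consume_calls += 1         # no new data arrives in this toy

    def get_result(self):
        return self.buffered.pop(0) if self.buffered else None


def fetch(pq, check_busy_first):
    out = []
    while True:
        # The fix: only consume input when the libpq is actually busy.
        if not check_busy_first or pq.is_busy():
            pq.consume_input()
        r = pq.get_result()
        if r is None:
            break
        out.append(r)
    return out


old = FakeLibpq(["row1", "row2", "row3"])
fetch(old, check_busy_first=False)
new = FakeLibpq(["row1", "row2", "row3"])
fetch(new, check_busy_first=True)
assert new.consume_calls < old.consume_calls   # fewer reads, same results
```

The toy simplifies reality (a real busy libpq would need to wait for the socket to become readable), but it shows why the unconditional consume_input() of the old loop cost one system call per result.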
refactor(enum): rename python_type -> enum, enum_labels -> labels on EnumInfo
The previous names were lifted from the composite adapters. However,
that class would have had a more ambiguous "type" attribute, and "types"
for the fields, hence the decision to use the "python_" and "fields_"
prefixes to disambiguate.
In the enum adapters' context, "enum" is not ambiguous, so a more
natural name seems preferable.
This removes the need for enums to be str-based. It also removes the
asymmetry whereby automatically generated enums were *not* string based,
but pure enums.
- New version scheme is "starting from PG 10", not "after PG 10" (also,
the chunky bit moved from the docstring to the docs, as for most other
docs).
- The transaction status is that of the session, not of the server.
Daniele Varrazzo [Tue, 29 Mar 2022 23:57:06 +0000 (01:57 +0200)]
fix: consistent sync/exit and error management in pipeline contexts
Don't clobber an exception on exit of the nested block either. In order
to simplify the code, make the pipeline count the number of times it is
entered, and call _exit() only the last time it exits.
Drop the assert that we have left pipeline mode on leaving the block:
if we end up in an unrecoverable state, we will not have. By now we
should probably just close the connection; however, leaving it this way
is a better indication that the connection is broken because of
something related to pipeline mode; closing it would hide that, and even
if we raised a warning, it would be much easier to miss than the
exceptions raised in broken state.
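The entry-counting can be sketched with a toy reentrant context manager (illustrative only, not psycopg's implementation): _exit() runs once, on the outermost exit, and an exception raised inside the block is never clobbered:

```python
class ToyPipeline:
    """Toy reentrant pipeline: real teardown only on the outermost exit."""

    def __init__(self):
        self.level = 0
        self.exits = 0

    def __enter__(self):
        self.level += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        self.level -= 1
        if self.level == 0:
            self._exit()
        # Returning False never swallows an exception from the block.
        return False

    def _exit(self):
        self.exits += 1


p = ToyPipeline()
with p:
    with p:             # nested entry: no teardown on this exit
        pass
    assert p.exits == 0
assert p.exits == 1     # teardown ran exactly once

try:
    with p:
        raise RuntimeError("from the block")
except RuntimeError:
    pass                # the exception propagated, not clobbered
```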
Daniele Varrazzo [Tue, 29 Mar 2022 23:54:56 +0000 (01:54 +0200)]
fix: forbid COPY in pipeline mode
COPY is not supported. Attempting it puts the connection in
unrecoverable state, with pipeline sync failing and pipeline exit
complaining that there are still results. So let's try to not get in
that state.
Daniele Varrazzo [Tue, 29 Mar 2022 20:03:49 +0000 (22:03 +0200)]
fix: restore sending a Sync on block exit but not on executemany end
It was removed a few commits ago but, as Denis suggests, it makes sense
to keep it. Reintroduce it (and a test for it), but make sure
executemany doesn't send an extra Sync.
Daniele Varrazzo [Tue, 29 Mar 2022 12:29:11 +0000 (14:29 +0200)]
fix: make no-tuple results available after executemany returning
There could be useful info there, if the user asks for it. Now this
doesn't slow down the happy execute path, because the default of
returning is False.
Daniele Varrazzo [Tue, 29 Mar 2022 12:26:26 +0000 (14:26 +0200)]
fix: only flush the pipeline after executemany if returning
This has the side effect of breaking rowcount, as before, but allows
sending many executemany() calls in the same pipeline.
`rowcount` is correct if `returning=True` is set instead, which is
something we can at least document, and it makes sense: "if you want a
result, you flush the pipeline, dude".
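The rowcount trade-off can be modelled with a toy cursor (a sketch: executemany(), returning and rowcount are real psycopg names, but the body below is invented for illustration). Without returning=True the queries stay queued and rowcount is unusable; with it, the pipeline is flushed and rowcount is correct:

```python
class ToyCursor:
    """Toy model: executemany() only flushes the pipeline if returning."""

    def __init__(self):
        self.rowcount = -1
        self.queued = []

    def executemany(self, params_seq, returning=False):
        self.queued.extend(params_seq)
        if returning:
            # Flush: results are fetched, so rowcount can be computed.
            self.rowcount = len(self.queued)
            self.queued = []
        else:
            # Queries stay queued in the pipeline: rowcount is unknown.
            self.rowcount = -1


cur = ToyCursor()
cur.executemany([(1,), (2,), (3,)])
assert cur.rowcount == -1   # not flushed: rowcount broken, as documented

cur2 = ToyCursor()
cur2.executemany([(1,), (2,)], returning=True)
assert cur2.rowcount == 2   # flushed: the result is available
```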