From: Daniele Varrazzo
Date: Fri, 13 Nov 2020 03:09:06 +0000 (+0000)
Subject: Added more documentation on COPY
X-Git-Tag: 3.0.dev0~368
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=9939c68ba4b30016808c56350f11af1e3604767b;p=thirdparty%2Fpsycopg.git

Added more documentation on COPY
---

diff --git a/docs/adaptation.rst b/docs/adaptation.rst
new file mode 100644
index 000000000..78a67ce86
--- /dev/null
+++ b/docs/adaptation.rst
@@ -0,0 +1,6 @@
+.. _adaptation:
+
+Adaptation of data between Python and PostgreSQL
+================================================
+
+TODO
diff --git a/docs/cursor.rst b/docs/cursor.rst
index 281bccb2c..6debec8a7 100644
--- a/docs/cursor.rst
+++ b/docs/cursor.rst
@@ -128,14 +128,36 @@ Cursor support objects
 
 .. autoclass:: Copy
 
+    The object is normally returned by `Cursor.copy()`. It can be used as a
+    context manager (useful to load data into a database using :sql:`COPY FROM`)
+    and can be iterated (useful to read data after a :sql:`COPY TO`).
+
+    See :ref:`copy` for details.
+
     .. automethod:: read
+
+        Alternatively, you can iterate on the `Copy` object to read its data
+        row by row.
+
     .. automethod:: write
     .. automethod:: write_row
+
+        The data in the tuple will be converted as configured on the cursor;
+        see :ref:`adaptation` for details.
+
     .. automethod:: finish
 
+        If an *error* is specified, the :sql:`COPY` operation is cancelled.
+
+        The method is called automatically at the end of a `!with` block.
+
 
 .. autoclass:: AsyncCopy
 
+    The object is normally returned by `AsyncCursor.copy()`. Its methods are
+    the same of the `Copy` object but offering an `asyncio` interface
+    (`await`, `async for`, `async with`).
+
     .. automethod:: read
     .. automethod:: write
     .. automethod:: write_row
diff --git a/docs/from_pg2.rst b/docs/from_pg2.rst
index 15db99548..32c60e807 100644
--- a/docs/from_pg2.rst
+++ b/docs/from_pg2.rst
@@ -34,6 +34,17 @@ PostgreSQL will also reject the execution of several queries at once
 you should use distinct `execute()` calls; otherwise you may consider merging
 the query client-side, using `psycopg3.sql` module.
 
+Certain commands cannot be used with server-side binding, for instance
+:sql:`SET` or :sql:`NOTIFY`::
+
+    >>> cur.execute("SET timezone TO %s", ["utc"])
+    ...
+    psycopg3.errors.SyntaxError: syntax error at or near "$1"
+
+Sometimes PostgreSQL offers an alternative (e.g. :sql:`SELECT set_config()`,
+:sql:`SELECT pg_notify()`). If no alternative exist you can use `psycopg3.sql`
+to compose the query client-side.
+
 Different adaptation system
 ---------------------------
 
@@ -43,7 +54,7 @@ server-side parameters adaptation, but also to consider performance,
 flexibility, ease of customization.
 
 Builtin data types should work as expected; if you have wrapped a custom data
-type you should check the ` Adaptation` topic.
+type you should check the :ref:`Adaptation` topic.
 
 
 Other differences
diff --git a/docs/index.rst b/docs/index.rst
index 7f841ecac..0e4890b10 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -21,9 +21,10 @@ the COPY support.
 
     install
     usage
-    from_pg2
+    adaptation
     connection
     cursor
+    from_pg2
 
 
 Indices and tables
diff --git a/docs/usage.rst b/docs/usage.rst
index 4e86a787c..d1d1f87ab 100644
--- a/docs/usage.rst
+++ b/docs/usage.rst
@@ -87,9 +87,8 @@ The main entry points of `!psycopg3` are:
   - send commands to the database using methods such as `~Cursor.execute()`
     and `~Cursor.executemany()`,
 
-  - retrieve data from the database :ref:`by iteration ` or
-    using methods such as `~Cursor.fetchone()`, `~Cursor.fetchmany()`,
-    `~Cursor.fetchall()`.
+  - retrieve data from the database, iterating on the cursor or using methods
+    such as `~Cursor.fetchone()`, `~Cursor.fetchmany()`, `~Cursor.fetchall()`.
 
 
 
@@ -129,6 +128,18 @@ TODO: lift from psycopg2 docs
 
 
 
+.. index::
+    pair: Query; Parameters
+
+.. _binary-data:
+
+Binary parameters and results
+-----------------------------
+
+TODO: lift from psycopg2 docs
+
+
+
 .. _transactions:
 
 Transaction management
@@ -146,7 +157,59 @@ TODO
 Using COPY TO and COPY FROM
 ---------------------------
 
-TODO
+`psycopg3` allows to operate with `PostgreSQL COPY protocol`__. :sql:`COPY` is
+one of the most efficient ways to load data into the database (and to modify
+it, with some SQL creativity).
+
+.. __: https://www.postgresql.org/docs/current/sql-copy.html
+
+Using `!psycopg3` you can do three things:
+
+- loading data into the database row-by-row, from a stream of Python objects;
+- loading data into the database block-by-block, with data already formatted in
+  a way suitable for :sql:`COPY FROM`;
+- reading data from the database block-by-block, with data emitted by a
+  :sql:`COPY TO` statement.
+
+The missing quadrant, copying data from database row-by-row, is not covered by
+COPY because that's pretty much normal querying, and :sql:`COPY TO` doesn't
+offer enough metadata to decode the data to Python objects.
+
+The first option is the most powerful, because it allows to load data into the
+database from any Python iterable (a list of tuple, or any iterable of
+sequences): the Python values are adapted as they would be in normal querying.
+To perform such operation use a :sql:`COPY [table] FROM STDIN` with
+`Cursor.copy()` and use `~Copy.write_row()` on the resulting object in a
+`!with` block. On exiting the block the operation will be concluded:
+
+.. code:: python
+
+    with cursor.copy("COPY table_name (col1, col2) FROM STDIN") as copy:
+        for row in source:
+            copy.write_row(row)
+
+If an exception is raised inside the block, the operation is interrupted and
+the records inserted so far discarded.
+
+If data is already formatted in a way suitable for copy (for instance because
+it is coming from a file resulting from a previous `COPY TO` operation) it can
+be loaded using `Copy.write()` instead.
+
+In order to read data in :sql:`COPY` format you can use a :sql:`COPY TO
+STDOUT` statement and iterate over the resulting `Copy` object, which will
+produce `!bytes`:
+
+.. code:: python
+
+    with open("data.out", "wb") as f:
+        for data in cursor.copy("COPY table_name TO STDOUT") as copy:
+            f.write(data)
+
+Asynchronous operations are supported using the same patterns on an
+`AsyncConnection`.
+
+Binary data can be produced and consumed using :sql:`FORMAT BINARY` in the
+:sql:`COPY` command: see :ref:`binary-data` for details and limitations.
 
 
 .. index:: async
@@ -161,7 +224,6 @@ The design of the asynchronous objects is pretty much the same of the sync
 ones: in order to use them you will only have to scatter the ``async`` keyword
 here and there.
 
-
 .. code:: python
 
     async with await psycopg3.AsyncConnection.connect(
diff --git a/psycopg3/psycopg3/copy.py b/psycopg3/psycopg3/copy.py
index f616ee7da..4dfc95526 100644
--- a/psycopg3/psycopg3/copy.py
+++ b/psycopg3/psycopg3/copy.py
@@ -132,7 +132,13 @@ _bsrepl_re = re.compile(b"[\b\t\n\v\f\r\\\\]")
 
 
 class Copy(BaseCopy["Connection"]):
+    """Manage a :sql:`COPY` operation."""
+
     def read(self) -> Optional[bytes]:
+        """Read a row after a :sql:`COPY TO` operation.
+
+        Return `None` when the data is finished.
+        """
         if self._finished:
             return None
 
@@ -144,14 +150,17 @@ class Copy(BaseCopy["Connection"]):
         return rv
 
     def write(self, buffer: Union[str, bytes]) -> None:
+        """Write a block of data after a :sql:`COPY FROM` operation."""
         conn = self.connection
         conn.wait(copy_to(conn.pgconn, self._ensure_bytes(buffer)))
 
     def write_row(self, row: Sequence[Any]) -> None:
+        """Write a record after a :sql:`COPY FROM` operation."""
         data = self.format_row(row)
         self.write(data)
 
     def finish(self, error: str = "") -> None:
+        """Terminate a :sql:`COPY FROM` operation."""
         conn = self.connection
         berr = error.encode(conn.client_encoding, "replace") if error else None
         conn.wait(copy_end(conn.pgconn, berr))
@@ -170,13 +179,15 @@ class Copy(BaseCopy["Connection"]):
         if self.pgresult.status == ExecStatus.COPY_OUT:
             return
 
-        if exc_val is None:
+        if not exc_type:
             if self.format == Format.BINARY and not self._first_row:
                 # send EOF only if we copied binary rows (_first_row is False)
                 self.write(b"\xff\xff")
             self.finish()
         else:
-            self.finish(str(exc_val) or type(exc_val).__qualname__)
+            self.finish(
+                f"error from Python: {exc_type.__qualname__} - {exc_val}"
+            )
 
     def __iter__(self) -> Iterator[bytes]:
         while 1:
@@ -187,6 +198,8 @@ class Copy(BaseCopy["Connection"]):
 
 
 class AsyncCopy(BaseCopy["AsyncConnection"]):
+    """Manage an asynchronous :sql:`COPY` operation."""
+
     async def read(self) -> Optional[bytes]:
         if self._finished:
             return None
@@ -225,13 +238,15 @@ class AsyncCopy(BaseCopy["AsyncConnection"]):
         if self.pgresult.status == ExecStatus.COPY_OUT:
             return
 
-        if exc_val is None:
+        if not exc_type:
             if self.format == Format.BINARY and not self._first_row:
                 # send EOF only if we copied binary rows (_first_row is False)
                 await self.write(b"\xff\xff")
             await self.finish()
         else:
-            await self.finish(str(exc_val))
+            await self.finish(
+                f"error from Python: {exc_type.__qualname__} - {exc_val}"
+            )
 
     async def __aiter__(self) -> AsyncIterator[bytes]:
         while 1:
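
Note on the read example added to docs/usage.rst: `for data in cursor.copy(...) as copy:` is not valid Python, since `as` belongs to `with`, not `for`. The following is a minimal sketch of the usage the `Copy` documentation in this patch describes (context manager for writing, iteration for reading, and the error string built in `__exit__`). `FakeCopy`, `load_rows` and `dump_rows` are hypothetical in-memory stand-ins written for illustration only; they are not part of psycopg3 and do not talk to a database.

```python
import io
from typing import Any, Iterable, Iterator, List, Optional, Sequence


class FakeCopy:
    """Hypothetical stand-in mimicking the documented Copy interface."""

    def __init__(self, rows_out: Sequence[bytes] = ()) -> None:
        self._rows_out = list(rows_out)  # what a COPY TO would emit
        self.written: List[Sequence[Any]] = []  # rows sent by a COPY FROM
        self.error: Optional[str] = None  # message passed to finish()
        self.finished = False

    def write_row(self, row: Sequence[Any]) -> None:
        self.written.append(row)

    def finish(self, error: str = "") -> None:
        # Assumption modelled on the docs: a non-empty error cancels the
        # operation, so the rows written so far are discarded.
        self.finished = True
        if error:
            self.error = error
            self.written.clear()

    def __iter__(self) -> Iterator[bytes]:
        return iter(self._rows_out)

    def __enter__(self) -> "FakeCopy":
        return self

    def __exit__(self, exc_type, exc_val, exc_tb) -> None:
        # Same branching and message format as the patched __exit__.
        if not exc_type:
            self.finish()
        else:
            self.finish(
                f"error from Python: {exc_type.__qualname__} - {exc_val}"
            )
        # Returning None lets the original exception propagate.


def load_rows(copy: FakeCopy, source: Iterable[Sequence[Any]]) -> None:
    """Row-by-row load: write_row() inside a with block."""
    with copy:
        for row in source:
            copy.write_row(row)


def dump_rows(copy: FakeCopy, f: io.BufferedIOBase) -> None:
    """Block-by-block read: enter the with block, then iterate the object."""
    with copy:
        for data in copy:
            f.write(data)


def failing_source() -> Iterator[Sequence[Any]]:
    yield (1, "a")
    raise ValueError("boom")


ok = FakeCopy()
load_rows(ok, [(1, "a"), (2, "b")])

failed = FakeCopy()
try:
    load_rows(failed, failing_source())
except ValueError:
    pass  # the exception still reaches the caller

buf = io.BytesIO()
dump_rows(FakeCopy([b"1\ta\n", b"2\tb\n"]), buf)
```

With a real cursor the same shape would presumably read `with cursor.copy("COPY table_name TO STDOUT") as copy:` followed by `for data in copy:`, which matches the `__iter__` method shown in psycopg3/psycopg3/copy.py.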