Mike Bayer [Fri, 19 Jun 2020 14:25:42 +0000 (10:25 -0400)]
Propose black py27 + py35 mode for the rest of Py2K
py27 mode produces one failure for flake8 which is the
space added between exec and parenthesis. however apparently
we can add multiple versions to target-versions which allows
the exec() calls to come out in python 3 style.
The issue we want to improve is issues of trailing
commas being added. I'm not really able to get black to
consistently add or remove these trailing commas in any
case no matter what I set target-version towards.
Mike Bayer [Fri, 19 Jun 2020 04:32:00 +0000 (00:32 -0400)]
perf tweaks
- avoid abc checks in distill_20
- ColumnEntity subclasses are unique to their compile state and
have no querycontext specific state. They can do a simple memoize of their
fetch_column without using attributes, and they can memoize their
_getter() too so that it goes into the cache, just like
instance_processor() does.
- unify ORMColumnEntity and RawColumnEntity for the row processor part,
add some test coverage for the case where it is used in a from_statement
- do a faster generate if there are no memoized entries
- query._params is always immutabledict
Change-Id: I1e2dfe607a1749b5b434fc11f9348ee631501dfa
Mike Bayer [Fri, 12 Jun 2020 17:09:15 +0000 (13:09 -0400)]
Warn when transaction context manager ends on inactive transaction
if .rollback() or .commit() is called inside the transaction
context manager, the transaction object is deactivated.
the context manager continues but will not be able to correctly
fulfill it's closing state. Ensure a warning is emitted when
this happens.
Mike Bayer [Thu, 11 Jun 2020 18:31:57 +0000 (14:31 -0400)]
Add version token to error URL
the sqlalche.me redirector now supports the numerical version
code in the URL, e.g. /13/, /14/, /20/, etc., so that we can
redirect to the error codes for the appropriate version
of SQLAlchemy in use without going through the catch-all "latest"
link. If a particular version of the docs is no longer on the
site, the redirect will revert to falling through the "latest"
link (which ultimately lands on the current release version,
/13/ at the time of this writing).
Added "exists" to the list of reserved words for SQLite so that this word
will be quoted when used as a label or column name. Pull request courtesy
Thodoris Sotiropoulos.
Mike Bayer [Sun, 7 Jun 2020 00:40:43 +0000 (20:40 -0400)]
Turn on caching everywhere, add logging
A variety of caching issues found by running
all tests with statement caching turned on.
The cache system now has a more conservative approach where
any subclass of a SQL element will by default invalidate
the cache key unless it adds the flag inherit_cache=True
at the class level, or if it implements its own caching.
Add working caching to a few elements that were
omitted previously; fix some caching implementations
to suit lesser used edge cases such as json casts
and array slices.
Refine the way BaseCursorResult and CursorMetaData
interact with caching; to suit cases like Alembic
modifying table structures, don't cache the
cursor metadata if it were created against a
cursor.description using non-positional matching,
e.g. "select *". if a table re-ordered its columns
or added/removed, now that data is obsolete.
Additionally we have to adapt the cursor metadata
_keymap regardless of if we just processed
cursor.description, because if we ran against
a cached SQLCompiler we won't have the right
columns in _keymap.
Other refinements to how and when we do this
adaption as some weird cases
were exposed in the Postgresql dialect,
a text() construct that names just one column that
is not actually in the statement. Fixed that
also as it looks like a cut-and-paste artifact
that doesn't actually affect anything.
Various issues with re-use of compiled result maps
and cursor metadata in conjunction with tables being
changed, such as change in order of columns.
mappers can be cleared but the class remains, meaning
a mapper has to use itself as the cache key not the class.
lots of bound parameter / literal issues, due to Alembic
creating a straight subclass of bindparam that renders
inline directly. While we can update Alembic to not
do this, we have to assume other people might be doing
this, so bindparam() implements the inherit_cache=True
logic as well that was a bit involved.
turn on cache stats in logging.
Includes a fix to subqueryloader which moves all setup to
the create_row_processor() phase and elminates any storage
within the compiled context. This includes some changes
to create_row_processor() signature and a revising of the
technique used to determine if the loader can participate
in polymorphic queries, which is also applied to
selectinloading.
DML update.values() and ordered_values() now coerces the
keys as we have tests that pass an arbitrary class here
which only includes __clause_element__(), so the
key can't be cached unless it is coerced. this in turn
changed how composite attributes support bulk update
to use the standard approach of ClauseElement with
annotations that are parsed in the ORM context.
memory profiling successfully caught that the Session
from Query was getting passed into _statement_20()
so that was a big win for that test suite.
Apparently Compiler had .execute() and .scalar() methods
stuck on it, these date back to version 0.4 and there
was a single test in the PostgreSQL dialect tests
that exercised it for no apparent reason. Removed
these methods as well as the concept of a Compiler
holding onto a "bind".
Mike Bayer [Wed, 3 Jun 2020 21:38:35 +0000 (17:38 -0400)]
Convert bulk update/delete to new execution model
This reorganizes the BulkUD model in sqlalchemy.orm.persistence
to be based on the CompileState concept and to allow plain
update() / delete() to be passed to session.execute() where
the ORM synchronize session logic will take place.
Also gets "synchronize_session='fetch'" working with horizontal
sharding.
Adding a few more result.scalar_one() types of methods
as scalar_one() seems like what is normally desired.
### Description
There were a few remnant uses of master/slave in the code and docs. The project previously made a decision to move away from them to use modern and inclusive terminology.
This PR does not cover a bug or necessitate a documented entry into the changelog, so an issue ticket was not created.
### Checklist
This pull request is:
- [x] A documentation / typographical error fix
- [x] A short code fix
- [ ] A new feature implementation
Mike Bayer [Thu, 4 Jun 2020 17:28:21 +0000 (13:28 -0400)]
Document that type_coerce does not currently imply parenthesization
We've had a few issues where the current solution
is to use the self_group() method, so document that as
the current approach for the parenthesization use case.
Whether or not type_coerce() is changed later, this is
how it works at the moment.
Mike Bayer [Tue, 2 Jun 2020 18:21:03 +0000 (14:21 -0400)]
Default create_constraint to False
The :paramref:`.Enum.create_constraint` and
:paramref:`.Boolean.create_constraint` parameters now default to False,
indicating when a so-called "non-native" version of these two datatypes is
created, a CHECK constraint will not be generated by default. These CHECK
constraints present schema-management maintenance complexities that should
be opted in to, rather than being turned on by default.
Added new reflection method :meth:`.Inspector.get_sequence_names` which
returns all the sequences defined. Support for this method has been added
to the backend that support :class:`.Sequence`: PostgreSql, Oracle,
MSSQL and MariaDB >= 10.3.
Elmer de Looff [Wed, 3 Jun 2020 16:24:41 +0000 (12:24 -0400)]
Folds two identical exception handlers into a single one
Fixes a `TODO` that searches for py2/3 compatible syntax to match multiple exception types.
### Description
Merges the two exception clauses using the syntax that exists for both Python 2 and 3 as per the exception handling tutorials ([Python 2](https://docs.python.org/2/tutorial/errors.html#handling-exceptions), [Python 3](https://docs.python.org/3/tutorial/errors.html#handling-exceptions))
### Checklist
This pull request is:
- [ ] A documentation / typographical error fix
- Good to go, no issue or tests are needed
- [x] A short code fix
- please include the issue number, and create an issue if none exists, which
must include a complete example of the issue. one line code fixes without an
issue and demonstration will not be accepted.
- Please include: `Fixes: #<issue number>` in the commit message
- please include tests. one line code fixes without tests will not be accepted.
- [ ] A new feature implementation
- please include the issue number, and create an issue if none exists, which must
include a complete example of how the feature would look.
- Please include: `Fixes: #<issue number>` in the commit message
- please include tests.
Mike Bayer [Mon, 1 Jun 2020 23:11:19 +0000 (19:11 -0400)]
Refine IN and scalar subquery coercions
Ensure IN emits a warning when it coerces a FromClause
into a select(), however that it continues to allow the
scalar_subquery() coercion to be automatic, particularly
since it's not clear that "col IN (select)" is necessarily
"scalar" in the case of tuples.
Convert the "scalar_subquery()" warning emitted in other
cases to be a warning, rather than a deprecation warning.
I can't imagine taking this coercion out as it is intuitive
and is always going to happen; we just would like to note that
an implicit coercion is occurring.
Mike Bayer [Mon, 1 Jun 2020 00:34:03 +0000 (20:34 -0400)]
Support multiple dotted sections in mssql schema names
Refined the logic used by the SQL Server dialect to interpret multi-part
schema names that contain many dots, to not actually lose any dots if the
name does not have bracking or quoting used, and additionally to support a
"dbname" token that has many parts including that it may have multiple,
independently-bracketed sections.
This fix addresses #5364 to some degree but probably does not
resolve it fully.
Haoyu Sun [Fri, 29 May 2020 18:31:07 +0000 (14:31 -0400)]
Add default expression to query_expression()
Added a new parameter :paramref:`_orm.query_expression.default_expr` to the
:func:`_orm.query_expression` construct, which will be appled to queries
automatically if the :func:`_orm.with_expression` option is not used. Pull
request courtesy Haoyu Sun.
Gord Thompson [Fri, 29 May 2020 13:20:54 +0000 (07:20 -0600)]
Fix is_disconnect false positive for mssql+pyodbc
Fixed an issue where the ``is_disconnect`` function in the SQL Server
pyodbc dialect was incorrectly reporting the disconnect state when the
exception messsage had a substring that matched a SQL Server ODBC error
code.
Mike Bayer [Wed, 29 Apr 2020 23:46:43 +0000 (19:46 -0400)]
Improve rendering of core statements w/ ORM elements
This patch contains a variety of ORM and expression layer
tweaks to support ORM constructs in select() statements,
without the 1.3.x requiremnt in Query that a full
_compile_context() + new select() is needed in order to
get a working statement object.
Includes such tweaks as the ability to implement
aliased class of an aliased class,
as we are looking to fully support ACs against subqueries,
as well as the ability to access anonymously-labeled
ColumnProperty expressions within subqueries by
naming the ".key" of the label after the property
key. Some tuning to query.join() as well
as ORMJoin internals to allow things to work more
smoothly.
Added support for "CREATE SEQUENCE" and full :class:`.Sequence` support for
Microsoft SQL Server. This removes the deprecated feature of using
:class:`.Sequence` objects to manipulate IDENTITY characteristics which
should now be performed using ``mssql_identity_start`` and
``mssql_identity_increment`` as documented at :ref:`mssql_identity`. The
change includes a new parameter :paramref:`.Sequence.data_type` to
accommodate SQL Server's choice of datatype, which for that backend
includes INTEGER and BIGINT. The default starting value for SQL Server's
version of :class:`.Sequence` has been set at 1; this default is now
emitted within the CREATE SEQUENCE DDL for all backends.
Mike Bayer [Tue, 26 May 2020 02:36:44 +0000 (22:36 -0400)]
callcount reductions and refinement for cached queries
This commit includes that we've removed the "_orm_query"
attribute from compile state as well as query context.
The attribute created reference cycles and also added
method call overhead. As part of this change,
the interface for ORMExecuteState changes a bit, as well
as the interface for the horizontal sharding extension
which now deprecates the "query_chooser" callable
in favor of "execute_chooser", which receives the contextual
object. This will also work more nicely when we implement
the new execution path for bulk updates and deletes.
Pre-merge execution options for statement, connection,
arguments all up front in Connection. that way they
can be passed to the before_execute / after_execute events,
and the ExecutionContext doesn't have to merge as second
time. Core execute is pretty close to 1.3 now.
baked wasn't using the new one()/first()/one_or_none() methods,
fixed that.
Convert non-buffered cursor strategy to be a stateless
singleton. inline all the paths by which the strategy
gets chosen, oracle and SQL Server dialects make use of the
already-invoked post_exec() hook to establish the alternate
strategies, and this is actually much nicer than it was before.
Add caching to mapper instance processor for getters.
Identified a reference cycle per query that was showing
up as a lot of gc cleanup, fixed that.
After all that, performance not budging much. Even
test_baked_query now runs with significantly fewer function
calls than 1.3, still 40% slower.
Basically something about the new patterns just makes
this slower and while I've walked a whole bunch of them
back, it hardly makes a dent. that said, the performance
issues are relatively small, in the 20-40% time increase
range, and the new caching feature
does provide for regular ORM and Core queries that
are cached, and they are faster than non-cached.
Mark Amery [Thu, 28 May 2020 12:57:26 +0000 (08:57 -0400)]
Fix 'email_address' being typoed as 'email_addres' in two places
### Description
Fixes a typo that coincidentally occurs in a couple of different places - once in the docs, and another time in a test, where it was presumably neutering one of the test's assertions by making it always pass.
### Checklist
<!-- go over following points. check them with an `x` if they do apply, (they turn into clickable checkboxes once the PR is submitted, so no need to do everything at once)
-->
This pull request is:
- [x] A documentation / typographical error fix
- Good to go, no issue or tests are needed
- [ ] A short code fix
- please include the issue number, and create an issue if none exists, which
must include a complete example of the issue. one line code fixes without an
issue and demonstration will not be accepted.
- Please include: `Fixes: #<issue number>` in the commit message
- please include tests. one line code fixes without tests will not be accepted.
- [ ] A new feature implementation
- please include the issue number, and create an issue if none exists, which must
include a complete example of how the feature would look.
- Please include: `Fixes: #<issue number>` in the commit message
- please include tests.
Mike Bayer [Wed, 27 May 2020 14:18:33 +0000 (10:18 -0400)]
Render table hints in generic SQL
Added :meth:`.Select.with_hint` output to the generic SQL string that is
produced when calling ``str()`` on a statement. Previously, this clause
would be omitted under the assumption that it was dialect specific.
The hint text is presented within brackets to indicate the rendering
of such hints varies among backends.
Mike Bayer [Tue, 26 May 2020 02:36:44 +0000 (22:36 -0400)]
Small callcount reductions and refinement for cached queries
baked wasn't using the new one()/first()/one_or_none() methods,
fixed that.
loading._instance_processor() can skip setting up the
quick populators every time because it can cache the getters.
Callcounts have gone below what 1.3 does for the
test_baked_query performance suite, however runtime for
continued inexplicable reasons has not :(. still suspecting
the result tuples but this seems so hard to believe.
Mike Bayer [Mon, 27 Apr 2020 16:58:12 +0000 (12:58 -0400)]
Convert execution to move through Session
This patch replaces the ORM execution flow with a
single pathway through Session.execute() for all queries,
including Core and ORM.
Currently included is full support for ORM Query,
Query.from_statement(), select(), as well as the
baked query and horizontal shard systems. Initial
changes have also been made to the dogpile caching
example, which like baked query makes use of a
new ORM-specific execution hook that replaces the
use of both QueryEvents.before_compile() as well
as Query._execute_and_instances() as the central
ORM interception hooks.
select() and Query() constructs alike can be passed to
Session.execute() where they will return ORM
results in a Results object. This API is currently
used internally by Query. Full support for
Session.execute()->results to behave in a fully
2.0 fashion will be in later changesets.
bulk update/delete with ORM support will also
be delivered via the update() and delete()
constructs, however these have not yet been adapted
to the new system and may follow in a subsequent
update.
Performance is also beginning to lag as of this
commit and some previous ones. It is hoped that
a few central functions such as the coercions
functions can be rewritten in C to re-gain
performance. Additionally, query caching
is now available and some subsequent patches
will attempt to cache more of the per-execution
work from the ORM layer, e.g. column getters
and adapters.
This patch also contains initial "turn on" of the
caching system enginewide via the query_cache_size
parameter to create_engine(). Still defaulting at
zero for "no caching". The caching system still
needs adjustments in order to gain adequate performance.
Mike Bayer [Sun, 1 Dec 2019 22:24:27 +0000 (17:24 -0500)]
Unify Query and select() , move all processing to compile phase
Convert Query to do virtually all compile state computation
in the _compile_context() phase, and organize it all
such that a plain select() construct may also be used as the
source of information in order to generate ORM query state.
This makes it such that Query is not needed except for
its additional methods like from_self() which are all to
be deprecated.
The construction of ORM state will occur beyond the
caching boundary when the new execution model is integrated.
future select() gains a working join() and filter_by() method.
as we continue to rebase and merge each commit in the steps,
callcounts continue to bump around. will have to look at
the final result when it's all in.
Federico Caselli [Thu, 21 May 2020 19:50:49 +0000 (21:50 +0200)]
Avoid proxy functions in row functions
This streamlines a bit for non-C implementations, however
also adds and tests behavioral contracts that mappings should
not allow integer or slice access and should behave like a
Python mapping in that it raises KeyError for an integer
and TypeError for a slice. Py3/Py2/C/noC :)
Federico Caselli [Fri, 22 May 2020 20:48:16 +0000 (22:48 +0200)]
Add python 3.8 profiles; remove zoomark tests
The zoomark tests have served us well for many years.
At this point, they have been using a very antiquated
calling style for many years and are no longer where we catch
performance issues.
The performance suite now has a large number of individual
tests that catch issues very specifically and additionally
record just one performance count per test. This also
allows us to remove the "replay" fixtures that were not
used for anything else.
Mike Bayer [Fri, 22 May 2020 04:06:06 +0000 (00:06 -0400)]
Add immutabledict C code
Start trying to convert fundamental objects to
C as we now rely on a fairly small core of things,
and 1.4 is having problems with complexity added being
slower than the performance gains we are trying to build in.
immutabledict here does seem to bench as twice as fast as the
Python one, see below. However, it does not appear to be
used prominently enough to make any dent in the performance
tests.
at the very least it may provide us some more lift-and-copy
code for more C extensions.
import timeit
from sqlalchemy.util._collections import not_immutabledict, immutabledict
def run(dict_cls):
for i in range(1000000):
d1 = dict_cls({"x": 5, "y": 4})
d2 = d1.union({"x": 17, "new key": "some other value"}, None)
Mike Bayer [Fri, 22 May 2020 17:14:58 +0000 (13:14 -0400)]
Don't emit pyodbc "no driver" warning for empty URL
Fixed an issue in the pyodbc connector such that a warning about pyodbc
"drivername" would be emitted when using a totally empty URL. Empty URLs
are normal when producing a non-connected dialect object or when using the
"creator" argument to create_engine(). The warning now only emits if the
driver name is missing but other parameters are still present.
Mike Bayer [Fri, 22 May 2020 14:13:02 +0000 (10:13 -0400)]
Don't incref on new reference key_style
in 4550983e0ce2f35b3585e53894c941c23693e71d we
added a new attribute key_style. remove an erroneous
Py_INCREF when we acquire it from PyLong_FromLong
as we already own the reference. since this
is a new reference we actualy need to Py_DECREF
it because we aren't returning it.
Miguel Grinberg [Thu, 21 May 2020 09:54:47 +0000 (05:54 -0400)]
Fix query string escaping in engine URLs
Fixed issue in :class:`.URL` object where stringifying the object
would not URL encode special characters, preventing the URL from being
re-consumable as a real URL. Pull request courtesy Miguel Grinberg.
Mike Bayer [Wed, 20 May 2020 17:41:44 +0000 (13:41 -0400)]
Performance fixes for new result set
A few small mistakes led to huge callcounts. Additionally,
the warn-on-get behavior which is attempting to warn for
deprecated access in SQLAlchemy 2.0 is very expensive; it's not clear
if its feasible to have this warning or to somehow alter how it
works.
Mike Bayer [Thu, 7 May 2020 14:53:15 +0000 (10:53 -0400)]
Disable "check unicode returns" under Python 3
Disabled the "unicode returns" check that runs on dialect startup when
running under Python 3, which for many years has occurred in order to test
the current DBAPI's behavior for whether or not it returns Python Unicode
or Py2K strings for the VARCHAR and NVARCHAR datatypes. The check still
occurs by default under Python 2, however the mechanism to test the
behavior will be removed in SQLAlchemy 2.0 when Python 2 support is also
removed.
This logic was very effective when it was needed, however now that Python 3
is standard, all DBAPIs are expected to return Python 3 strings for
character datatypes. In the unlikely case that a third party DBAPI does
not support this, the conversion logic within :class:`.String` is still
available and the third party dialect may specify this in its upfront
dialect flags by setting the dialect level flag ``returns_unicode_strings``
to one of :attr:`.String.RETURNS_CONDITIONAL` or
:attr:`.String.RETURNS_BYTES`, both of which will enable Unicode conversion
even under Python 3.
As part of this change, disabling testing of the doctest tutorials under
Python 2.
Mike Bayer [Mon, 18 May 2020 20:08:33 +0000 (16:08 -0400)]
Streamline visitors.iterate
This method might be used more significantly in the
ORM refactor, so further refine it.
* all get_children() methods now work entirely based on iterators.
Basically only select() was sensitive to this anymore and it now
chains the iterators together
* remove all kinds of flags like column_collections, schema_visitor
that apparently aren't used anymore.
* remove the "depthfirst" visitors as these don't seem to be
used either.
* make sure select() yields its columns first as these will be used
to determine the current mapper.
Mike Bayer [Thu, 14 May 2020 16:50:11 +0000 (12:50 -0400)]
Update transaction / connection handling
step one, do away with __connection attribute and using
awkward AttributeError logic
step two, move all management of "connection._transaction"
into the transaction objects themselves where it's easier
to follow.
build MarkerTransaction that takes the role of
"do-nothing block"
new connection datamodel is: connection._transaction, always
a root, connection._nested_transaction, always a nested.
nested transactions still chain to each other as this
is still sort of necessary but they consider the root
transaction separately, and the marker transactions
not at all.
introduce new InvalidRequestError subclass
PendingRollbackError. Apply to connection and session
for all cases where a transaction needs to be rolled
back before continuing. Within Connection,
both PendingRollbackError as well as ResourceClosedError
are now raised directly without being handled by
handle_dbapi_error(); this removes these two exception
cases from the handle_error event handler as well as
from StatementError wrapping, as these two exceptions are
not statement oriented and are instead programmatic
issues, that the application is failing to handle database
errors properly.
Revise savepoints so that when a release fails, they set
themselves as inactive so that their rollback() method
does not throw another exception.
Give savepoints another go on MySQL, can't get release working
however get support for basic round trip going
Mike Bayer [Sat, 16 May 2020 17:05:00 +0000 (13:05 -0400)]
Reword delete-orphan on many error message and document
For many years we have encountered users making use of the
"single_parent" flag in response to the error message for
"delete-orphan" expressing this as a means to cancel the current
error. However, the actual issue here is usually a misuse
of the delete-orphan cascade setting. Reword the error message to
be much more descriptive about what this means and add new
error link sections describing the situation in as much detail
as possible.
Fixes: #4860
# Description
Add nowait, skip_lock, of arguments to for_update_clause for mysql
### Checklist
This pull request is:
- [ ] A documentation / typographical error fix
- Good to go, no issue or tests are needed
- [ ] A short code fix
- please include the issue number, and create an issue if none exists, which
must include a complete example of the issue. one line code fixes without an
issue and demonstration will not be accepted.
- Please include: `Fixes: #<issue number>` in the commit message
- please include tests. one line code fixes without tests will not be accepted.
- [x] A new feature implementation
- please include the issue number, and create an issue if none exists, which must
include a complete example of how the feature would look.
- Please include: `Fixes: #<issue number>` in the commit message
- please include tests.