--- /dev/null
+.. change::
+ :tags: bug, mysql
+ :tickets: 4283
+
+ Fixed bug in the MySQLdb dialect and variants such as PyMySQL where an
+ additional "unicode returns" check upon connection made explicit use of
+ the "utf8" character set, which in MySQL 8.0 emits a warning that utf8mb4
+ should be used. The check now uses the utf8mb4 equivalent.
+ Documentation for the MySQL dialect is also updated to specify utf8mb4 in
+ all examples. Additional changes have been made to the test suite to use
+ utf8mb3 charsets and databases (there seem to be collation issues in some
+ edge cases with utf8mb4), and to support configuration default changes made
+ in MySQL 8.0 such as explicit_defaults_for_timestamp, as well as new errors
+ raised for invalid MyISAM indexes.
+
+
``INSERT_METHOD``, and many more.
To accommodate the rendering of these arguments, specify the form
``mysql_argument_name="value"``. For example, to specify a table with
-``ENGINE`` of ``InnoDB``, ``CHARSET`` of ``utf8``, and ``KEY_BLOCK_SIZE``
+``ENGINE`` of ``InnoDB``, ``CHARSET`` of ``utf8mb4``, and ``KEY_BLOCK_SIZE``
of ``1024``::
Table('mytable', metadata,
Column('data', String(32)),
mysql_engine='InnoDB',
- mysql_charset='utf8',
+ mysql_charset='utf8mb4',
mysql_key_block_size="1024"
)
in the URL, such as::
e = create_engine(
- "mysql+pymysql://scott:tiger@localhost/test?charset=utf8")
+ "mysql+pymysql://scott:tiger@localhost/test?charset=utf8mb4")
This charset is the **client character set** for the connection. Some
MySQL DBAPIs will default this to a value such as ``latin1``, and some
The encoding used for Unicode has traditionally been ``'utf8'``. However,
for MySQL versions 5.5.3 on forward, a new MySQL-specific encoding
-``'utf8mb4'`` has been introduced. The rationale for this new encoding
-is due to the fact that MySQL's utf-8 encoding only supports
+``'utf8mb4'`` has been introduced, and as of MySQL 8.0 a warning is emitted
+by the server if plain ``utf8`` is specified within any server-side
+directives, which the server treats as ``utf8mb3``. The rationale for this
+new encoding is that MySQL's legacy utf-8 encoding only supports
codepoints up to three bytes instead of four. Therefore,
when communicating with a MySQL database
that includes codepoints more than three bytes in size,
e = create_engine(
"mysql+pymysql://scott:tiger@localhost/test?charset=utf8mb4")
-At the moment, up-to-date versions of MySQLdb and PyMySQL support the
-``utf8mb4`` charset. Other DBAPIs such as MySQL-Connector and OurSQL
-may **not** support it as of yet.
+All modern DBAPIs should support the ``utf8mb4`` charset.
-In order to use ``utf8mb4`` encoding, changes to
-the MySQL schema and/or server configuration may be required.
+In order to use ``utf8mb4`` encoding for a schema that was created with legacy
+``utf8``, changes to the MySQL schema and/or server configuration may be
+required.
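As a rough illustration of the three-byte limit described above, the following sketch checks whether a Python string contains code points that take four bytes in UTF-8 and so would not fit in the legacy ``utf8`` (``utf8mb3``) charset. The helper names ``needs_utf8mb4`` and ``max_utf8_width`` are hypothetical and are not part of SQLAlchemy or any DBAPI.

```python
def needs_utf8mb4(text: str) -> bool:
    """Return True if any code point lies above U+FFFF.

    Such code points are encoded as four bytes in UTF-8, which is
    more than the three bytes per character that MySQL's legacy
    utf8 / utf8mb3 charset can store.
    """
    return any(ord(ch) > 0xFFFF for ch in text)


def max_utf8_width(text: str) -> int:
    """Return the largest UTF-8 byte width of any single code point."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)


if __name__ == "__main__":
    for sample in ["plain ascii", "caf\u00e9", "\u20ac", "\U0001F600"]:
        print(repr(sample), needs_utf8mb4(sample), max_utf8_width(sample))
```

A check of this kind only indicates whether existing data would require ``utf8mb4``; whether a given schema or server needs changes still depends on how it was created.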
.. seealso::
All modern MySQL DBAPIs all offer the service of handling the encoding and
decoding of unicode data between the Python application space and the database.
-As this was not always the case, SQLAlchemy also includes a comprehensive system
-of performing the encode/decode task as well. As only one of these systems
-should be in use at at time, SQLAlchemy has long included functionality
-to automatically detect upon first connection whether or not the DBAPI is
-automatically handling unicode.
-
-Whether or not the MySQL DBAPI will handle encoding can usually be configured
-using a DBAPI flag ``use_unicode``, which is known to be supported at least
-by MySQLdb, PyMySQL, and MySQL-Connector. Setting this value to ``0``
-in the "connect args" or query string will have the effect of disabling the
-DBAPI's handling of unicode, such that it instead will return data of the
-``str`` type or ``bytes`` type, with data in the configured charset::
-
- # connect while disabling the DBAPI's unicode encoding/decoding
+As this was not always the case, SQLAlchemy also includes a comprehensive
+system for performing the encode/decode task, which for MySQL dialects
+can be enabled by passing the flag ``use_unicode=0`` in the query string, as
+in::
+
e = create_engine(
- "mysql+mysqldb://scott:tiger@localhost/test?charset=utf8&use_unicode=0")
-
-Current recommendations for modern DBAPIs are as follows:
-
-* It is generally always safe to leave the ``use_unicode`` flag set at
- its default; that is, don't use it at all.
-* Under Python 3, the ``use_unicode=0`` flag should **never be used**.
- SQLAlchemy under Python 3 generally assumes the DBAPI receives and returns
- string values as Python 3 strings, which are inherently unicode objects.
-* Under Python 2 with MySQLdb, the ``use_unicode=0`` flag will **offer
- superior performance**, as MySQLdb's unicode converters under Python 2 only
- have been observed to have unusually slow performance compared to SQLAlchemy's
- fast C-based encoders/decoders.
-
-In short: don't specify ``use_unicode`` *at all*, with the possible
-exception of ``use_unicode=0`` on MySQLdb with Python 2 **only** for a
-potential performance gain.
+ "mysql+mysqldb://scott:tiger@localhost/test?charset=utf8mb4&use_unicode=0")
+
+The current recommendation is to **not** use this flag. All modern MySQL DBAPIs
+handle unicode natively, as is required on Python 3 in any case.
Ansi Quoting Style
------------------
def _check_unicode_returns(self, connection):
# work around issue fixed in
# https://github.com/farcepest/MySQLdb1/commit/cd44524fef63bd3fcb71947392326e9742d520e8
- # specific issue w/ the utf8_bin collation and unicode returns
+ # specific issue w/ the utf8mb4_bin collation and unicode returns
- has_utf8_bin = self.server_version_info > (5, ) and \
+ has_utf8mb4_bin = self.server_version_info > (5, ) and \
connection.scalar(
- "show collation where %s = 'utf8' and %s = 'utf8_bin'"
+ "show collation where %s = 'utf8mb4' and %s = 'utf8mb4_bin'"
% (
self.identifier_preparer.quote("Charset"),
self.identifier_preparer.quote("Collation")
))
- if has_utf8_bin:
+ if has_utf8mb4_bin:
additional_tests = [
sql.collate(sql.cast(
sql.literal_column(
"'test collated returns'"),
- TEXT(charset='utf8')), "utf8_bin")
+ TEXT(charset='utf8mb4')), "utf8mb4_bin")
]
else:
additional_tests = []
_mysql_drop_db(cfg, conn, ident)
except Exception:
pass
- conn.execute("CREATE DATABASE %s" % ident)
- conn.execute("CREATE DATABASE %s_test_schema" % ident)
- conn.execute("CREATE DATABASE %s_test_schema_2" % ident)
+
+ # using utf8mb4 we are getting collation errors on UNIONs, e.g. in
+ # test/orm/inheritance/test_polymorphic_rel.py:
+ # 1271, u"Illegal mix of collations for operation 'UNION'"
+ conn.execute("CREATE DATABASE %s CHARACTER SET utf8mb3" % ident)
+ conn.execute(
+ "CREATE DATABASE %s_test_schema CHARACTER SET utf8mb3" % ident)
+ conn.execute(
+ "CREATE DATABASE %s_test_schema_2 CHARACTER SET utf8mb3" % ident)
@_configure_follower.for_db("mysql")
kw.update(table_options)
if exclusions.against(config._current, 'mysql'):
- if 'mysql_engine' not in kw and 'mysql_type' not in kw:
+ if 'mysql_engine' not in kw and 'mysql_type' not in kw and \
+ "autoload_with" not in kw:
if 'test_needs_fk' in test_opts or 'test_needs_acid' in test_opts:
kw['mysql_engine'] = 'InnoDB'
else:
cls.define_index(metadata, users)
if not schema:
+ # test_needs_fk is used here, for now, to force MySQL to use InnoDB
noncol_idx_test_nopk = Table(
'noncol_idx_test_nopk', metadata,
Column('q', sa.String(5)),
+ test_needs_fk=True,
)
+
noncol_idx_test_pk = Table(
'noncol_idx_test_pk', metadata,
Column('id', sa.Integer, primary_key=True),
Column('q', sa.String(5)),
+ test_needs_fk=True,
)
Index('noncol_idx_nopk', noncol_idx_test_nopk.c.q.desc())
Index('noncol_idx_pk', noncol_idx_test_pk.c.q.desc())
assert not c.execute('SELECT @@autocommit;').scalar()
def test_isolation_level(self):
- values = {
- # sqlalchemy -> mysql
- 'READ UNCOMMITTED': 'READ-UNCOMMITTED',
- 'READ COMMITTED': 'READ-COMMITTED',
- 'REPEATABLE READ': 'REPEATABLE-READ',
- 'SERIALIZABLE': 'SERIALIZABLE'
- }
- for sa_value, mysql_value in values.items():
+ values = [
+ 'READ UNCOMMITTED',
+ 'READ COMMITTED',
+ 'REPEATABLE READ',
+ 'SERIALIZABLE'
+ ]
+ for value in values:
c = testing.db.connect().execution_options(
- isolation_level=sa_value
+ isolation_level=value
)
- assert c.execute('SELECT @@tx_isolation;').scalar() == mysql_value
+ eq_(
+ testing.db.dialect.get_isolation_level(c.connection),
+ value)
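The rewritten test compares against the space-separated names that SQLAlchemy accepts, rather than the hyphenated form the removed mapping expected from the server. A minimal sketch of that normalization follows, assuming the server reports values such as ``READ-COMMITTED``. The helper ``normalize_isolation_level`` is hypothetical and is not presented as the dialect's actual implementation.

```python
def normalize_isolation_level(server_value: str) -> str:
    """Convert a hyphenated, MySQL-style isolation level name into
    the space-separated, upper-case form used on the SQLAlchemy side.
    """
    return server_value.upper().replace("-", " ")


if __name__ == "__main__":
    for raw in ["READ-UNCOMMITTED", "READ-COMMITTED",
                "REPEATABLE-READ", "SERIALIZABLE"]:
        print(raw, "->", normalize_isolation_level(raw))
```

Comparing normalized names keeps the test independent of the server variable name and of the value format that a particular MySQL version reports.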
class ParseVersionTest(fixtures.TestBase):
# this is ideally one table, but older MySQL versions choke
# on the multiple TIMESTAMP columns
+ row = testing.db.execute(
+ "show variables like '%%explicit_defaults_for_timestamp%%'"
+ ).first()
+ explicit_defaults_for_timestamp = row[1].lower() in ('on', '1', 'true')
+
reflected = []
for idx, cols in enumerate([
[
{'name': 'p', 'nullable': True,
'default': current_timestamp},
{'name': 'r', 'nullable': False,
- 'default':
+ 'default': None if explicit_defaults_for_timestamp else
"%(current_timestamp)s ON UPDATE %(current_timestamp)s" %
{"current_timestamp": current_timestamp}},
{'name': 's', 'nullable': False,
'default': current_timestamp},
- {'name': 't', 'nullable': False,
- 'default':
+ {'name': 't',
+ 'nullable': True if explicit_defaults_for_timestamp else
+ False,
+ 'default': None if explicit_defaults_for_timestamp else
"%(current_timestamp)s ON UPDATE %(current_timestamp)s" %
{"current_timestamp": current_timestamp}},
- {'name': 'u', 'nullable': False,
+ {'name': 'u',
+ 'nullable': True if explicit_defaults_for_timestamp else
+ False,
'default': current_timestamp},
]
)
# will raise without quoting
"postgresql": "POSIX",
- "mysql": "latin1_general_ci",
+ # note MySQL databases need to be created w/ utf8mb3 charset
+ # for the test suite
+ "mysql": "utf8mb3_bin",
"sqlite": "NOCASE",
# will raise *with* quoting
t.insert().execute({}, {}, {})
ctexec = currenttime.scalar()
- result = t.select().execute()
+ result = t.select().order_by(t.c.col1).execute()
today = datetime.date.today()
eq_(result.fetchall(),
[(51, 'imthedefault', f, ts, ts, ctexec, True, False,
t.insert().values([{}, {}, {}]).execute()
ctexec = currenttime.scalar()
- result = t.select().execute()
+ result = t.select().order_by(t.c.col1).execute()
today = datetime.date.today()
eq_(result.fetchall(),
[(51, 'imthedefault', f, ts, ts, ctexec, True, False,