*object*, *length*, *start*, *end* and *reason*. *encoding* and *reason* are
UTF-8 encoded strings.
-.. c:function:: PyObject* PyUnicodeEncodeError_Create(const char *encoding, const Py_UNICODE *object, Py_ssize_t length, Py_ssize_t start, Py_ssize_t end, const char *reason)
-
- Create a :class:`UnicodeEncodeError` object with the attributes *encoding*,
- *object*, *length*, *start*, *end* and *reason*. *encoding* and *reason* are
- UTF-8 encoded strings.
-
- .. deprecated:: 3.3 3.11
-
- ``Py_UNICODE`` is deprecated since Python 3.3. Please migrate to
- ``PyObject_CallFunction(PyExc_UnicodeEncodeError, "sOnns", ...)``.
-
-.. c:function:: PyObject* PyUnicodeTranslateError_Create(const Py_UNICODE *object, Py_ssize_t length, Py_ssize_t start, Py_ssize_t end, const char *reason)
-
- Create a :class:`UnicodeTranslateError` object with the attributes *object*,
- *length*, *start*, *end* and *reason*. *reason* is a UTF-8 encoded string.
-
- .. deprecated:: 3.3 3.11
-
- ``Py_UNICODE`` is deprecated since Python 3.3. Please migrate to
- ``PyObject_CallFunction(PyExc_UnicodeTranslateError, "Onns", ...)``.
-
.. c:function:: PyObject* PyUnicodeDecodeError_GetEncoding(PyObject *exc)
PyObject* PyUnicodeEncodeError_GetEncoding(PyObject *exc)
:c:func:`PyUnicode_ReadChar` or similar new APIs.
-.. c:function:: PyObject* PyUnicode_TransformDecimalToASCII(Py_UNICODE *s, Py_ssize_t size)
-
- Create a Unicode object by replacing all decimal digits in
- :c:type:`Py_UNICODE` buffer of the given *size* by ASCII digits 0--9
- according to their decimal value. Return ``NULL`` if an exception occurs.
-
- .. deprecated-removed:: 3.3 3.11
- Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
- :c:func:`Py_UNICODE_TODECIMAL`.
-
-
.. c:function:: Py_UNICODE* PyUnicode_AsUnicodeAndSize(PyObject *unicode, Py_ssize_t *size)
Like :c:func:`PyUnicode_AsUnicode`, but also saves the :c:type:`Py_UNICODE`
the codec.
-.. c:function:: PyObject* PyUnicode_Encode(const Py_UNICODE *s, Py_ssize_t size, \
- const char *encoding, const char *errors)
-
- Encode the :c:type:`Py_UNICODE` buffer *s* of the given *size* and return a Python
- bytes object. *encoding* and *errors* have the same meaning as the
- parameters of the same name in the Unicode :meth:`~str.encode` method. The codec
- to be used is looked up using the Python codec registry. Return ``NULL`` if an
- exception was raised by the codec.
-
- .. deprecated-removed:: 3.3 3.11
- Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
- :c:func:`PyUnicode_AsEncodedString`.
-
-
UTF-8 Codecs
""""""""""""
The return type is now ``const char *`` rather than ``char *``.
-.. c:function:: PyObject* PyUnicode_EncodeUTF8(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
-
- Encode the :c:type:`Py_UNICODE` buffer *s* of the given *size* using UTF-8 and
- return a Python bytes object. Return ``NULL`` if an exception was raised by
- the codec.
-
- .. deprecated-removed:: 3.3 3.11
- Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
- :c:func:`PyUnicode_AsUTF8String`, :c:func:`PyUnicode_AsUTF8AndSize` or
- :c:func:`PyUnicode_AsEncodedString`.
-
-
UTF-32 Codecs
"""""""""""""
Return ``NULL`` if an exception was raised by the codec.
-.. c:function:: PyObject* PyUnicode_EncodeUTF32(const Py_UNICODE *s, Py_ssize_t size, \
- const char *errors, int byteorder)
-
- Return a Python bytes object holding the UTF-32 encoded value of the Unicode
- data in *s*. Output is written according to the following byte order::
-
- byteorder == -1: little endian
- byteorder == 0: native byte order (writes a BOM mark)
- byteorder == 1: big endian
-
- If byteorder is ``0``, the output string will always start with the Unicode BOM
- mark (U+FEFF). In the other two modes, no BOM mark is prepended.
-
- If ``Py_UNICODE_WIDE`` is not defined, surrogate pairs will be output
- as a single code point.
-
- Return ``NULL`` if an exception was raised by the codec.
-
- .. deprecated-removed:: 3.3 3.11
- Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
- :c:func:`PyUnicode_AsUTF32String` or :c:func:`PyUnicode_AsEncodedString`.
-
-
UTF-16 Codecs
"""""""""""""
Return ``NULL`` if an exception was raised by the codec.
-.. c:function:: PyObject* PyUnicode_EncodeUTF16(const Py_UNICODE *s, Py_ssize_t size, \
- const char *errors, int byteorder)
-
- Return a Python bytes object holding the UTF-16 encoded value of the Unicode
- data in *s*. Output is written according to the following byte order::
-
- byteorder == -1: little endian
- byteorder == 0: native byte order (writes a BOM mark)
- byteorder == 1: big endian
-
- If byteorder is ``0``, the output string will always start with the Unicode BOM
- mark (U+FEFF). In the other two modes, no BOM mark is prepended.
-
- If ``Py_UNICODE_WIDE`` is defined, a single :c:type:`Py_UNICODE` value may get
- represented as a surrogate pair. If it is not defined, each :c:type:`Py_UNICODE`
- values is interpreted as a UCS-2 character.
-
- Return ``NULL`` if an exception was raised by the codec.
-
- .. deprecated-removed:: 3.3 3.11
- Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
- :c:func:`PyUnicode_AsUTF16String` or :c:func:`PyUnicode_AsEncodedString`.
-
-
UTF-7 Codecs
""""""""""""
bytes that have been decoded will be stored in *consumed*.
-.. c:function:: PyObject* PyUnicode_EncodeUTF7(const Py_UNICODE *s, Py_ssize_t size, \
- int base64SetO, int base64WhiteSpace, const char *errors)
-
- Encode the :c:type:`Py_UNICODE` buffer of the given size using UTF-7 and
- return a Python bytes object. Return ``NULL`` if an exception was raised by
- the codec.
-
- If *base64SetO* is nonzero, "Set O" (punctuation that has no otherwise
- special meaning) will be encoded in base-64. If *base64WhiteSpace* is
- nonzero, whitespace will be encoded in base-64. Both are set to zero for the
- Python "utf-7" codec.
-
- .. deprecated-removed:: 3.3 3.11
- Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
- :c:func:`PyUnicode_AsEncodedString`.
-
-
Unicode-Escape Codecs
"""""""""""""""""""""
raised by the codec.
-.. c:function:: PyObject* PyUnicode_EncodeUnicodeEscape(const Py_UNICODE *s, Py_ssize_t size)
-
- Encode the :c:type:`Py_UNICODE` buffer of the given *size* using Unicode-Escape and
- return a bytes object. Return ``NULL`` if an exception was raised by the codec.
-
- .. deprecated-removed:: 3.3 3.11
- Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
- :c:func:`PyUnicode_AsUnicodeEscapeString`.
-
-
Raw-Unicode-Escape Codecs
"""""""""""""""""""""""""
was raised by the codec.
-.. c:function:: PyObject* PyUnicode_EncodeRawUnicodeEscape(const Py_UNICODE *s, \
- Py_ssize_t size)
-
- Encode the :c:type:`Py_UNICODE` buffer of the given *size* using Raw-Unicode-Escape
- and return a bytes object. Return ``NULL`` if an exception was raised by the codec.
-
- .. deprecated-removed:: 3.3 3.11
- Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
- :c:func:`PyUnicode_AsRawUnicodeEscapeString` or
- :c:func:`PyUnicode_AsEncodedString`.
-
-
Latin-1 Codecs
""""""""""""""
raised by the codec.
-.. c:function:: PyObject* PyUnicode_EncodeLatin1(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
-
- Encode the :c:type:`Py_UNICODE` buffer of the given *size* using Latin-1 and
- return a Python bytes object. Return ``NULL`` if an exception was raised by
- the codec.
-
- .. deprecated-removed:: 3.3 3.11
- Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
- :c:func:`PyUnicode_AsLatin1String` or
- :c:func:`PyUnicode_AsEncodedString`.
-
-
ASCII Codecs
""""""""""""
raised by the codec.
-.. c:function:: PyObject* PyUnicode_EncodeASCII(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
-
- Encode the :c:type:`Py_UNICODE` buffer of the given *size* using ASCII and
- return a Python bytes object. Return ``NULL`` if an exception was raised by
- the codec.
-
- .. deprecated-removed:: 3.3 3.11
- Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
- :c:func:`PyUnicode_AsASCIIString` or
- :c:func:`PyUnicode_AsEncodedString`.
-
-
Character Map Codecs
""""""""""""""""""""
``None`` are treated as "undefined mapping" and cause an error.
-.. c:function:: PyObject* PyUnicode_EncodeCharmap(const Py_UNICODE *s, Py_ssize_t size, \
- PyObject *mapping, const char *errors)
-
- Encode the :c:type:`Py_UNICODE` buffer of the given *size* using the given
- *mapping* object and return the result as a bytes object. Return ``NULL`` if
- an exception was raised by the codec.
-
- .. deprecated-removed:: 3.3 3.11
- Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
- :c:func:`PyUnicode_AsCharmapString` or
- :c:func:`PyUnicode_AsEncodedString`.
-
-
The following codec API is special in that it maps Unicode to Unicode.
.. c:function:: PyObject* PyUnicode_Translate(PyObject *str, PyObject *table, const char *errors)
use the default error handling.
-.. c:function:: PyObject* PyUnicode_TranslateCharmap(const Py_UNICODE *s, Py_ssize_t size, \
- PyObject *mapping, const char *errors)
-
- Translate a :c:type:`Py_UNICODE` buffer of the given *size* by applying a
- character *mapping* table to it and return the resulting Unicode object.
- Return ``NULL`` when an exception was raised by the codec.
-
- .. deprecated-removed:: 3.3 3.11
- Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
- :c:func:`PyUnicode_Translate`. or :ref:`generic codec based API
- <codec-registry>`
-
-
MBCS codecs for Windows
"""""""""""""""""""""""
.. versionadded:: 3.3
-.. c:function:: PyObject* PyUnicode_EncodeMBCS(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
-
- Encode the :c:type:`Py_UNICODE` buffer of the given *size* using MBCS and return
- a Python bytes object. Return ``NULL`` if an exception was raised by the
- codec.
-
- .. deprecated-removed:: 3.3 4.0
- Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
- :c:func:`PyUnicode_AsMBCSString`, :c:func:`PyUnicode_EncodeCodePage` or
- :c:func:`PyUnicode_AsEncodedString`.
-
-
Methods & Slots
"""""""""""""""
PyUnicode_AsUnicode:Py_UNICODE*:::
PyUnicode_AsUnicode:PyObject*:unicode:0:
-PyUnicode_TransformDecimalToASCII:PyObject*::+1:
-PyUnicode_TransformDecimalToASCII:Py_UNICODE*:s::
-PyUnicode_TransformDecimalToASCII:Py_ssize_t:size::
-
PyUnicode_AsUnicodeAndSize:Py_UNICODE*:::
PyUnicode_AsUnicodeAndSize:PyObject*:unicode:0:
PyUnicode_AsUnicodeAndSize:Py_ssize_t*:size::
PyUnicode_DecodeUTF8Stateful:const char*:errors::
PyUnicode_DecodeUTF8Stateful:Py_ssize_t*:consumed::
-PyUnicode_Encode:PyObject*::+1:
-PyUnicode_Encode:const Py_UNICODE*:s::
-PyUnicode_Encode:Py_ssize_t:size::
-PyUnicode_Encode:const char*:encoding::
-PyUnicode_Encode:const char*:errors::
-
PyUnicode_AsEncodedString:PyObject*::+1:
PyUnicode_AsEncodedString:PyObject*:unicode:0:
PyUnicode_AsEncodedString:const char*:encoding::
PyUnicode_DecodeUTF7Stateful:const char*:errors::
PyUnicode_DecodeUTF7Stateful:Py_ssize_t*:consumed::
-PyUnicode_EncodeUTF7:PyObject*::+1:
-PyUnicode_EncodeUTF7:const Py_UNICODE*:s::
-PyUnicode_EncodeUTF7:Py_ssize_t:size::
-PyUnicode_EncodeUTF7:int:base64SetO::
-PyUnicode_EncodeUTF7:int:base64WhiteSpace::
-PyUnicode_EncodeUTF7:const char*:errors::
-
PyUnicode_DecodeUTF8:PyObject*::+1:
PyUnicode_DecodeUTF8:const char*:s::
PyUnicode_DecodeUTF8:Py_ssize_t:size::
PyUnicode_DecodeUTF8:const char*:errors::
-PyUnicode_EncodeUTF8:PyObject*::+1:
-PyUnicode_EncodeUTF8:const Py_UNICODE*:s::
-PyUnicode_EncodeUTF8:Py_ssize_t:size::
-PyUnicode_EncodeUTF8:const char*:errors::
-
PyUnicode_AsUTF8String:PyObject*::+1:
PyUnicode_AsUTF8String:PyObject*:unicode:0:
PyUnicode_DecodeUTF16:const char*:errors::
PyUnicode_DecodeUTF16:int*:byteorder::
-PyUnicode_EncodeUTF16:PyObject*::+1:
-PyUnicode_EncodeUTF16:const Py_UNICODE*:s::
-PyUnicode_EncodeUTF16:Py_ssize_t:size::
-PyUnicode_EncodeUTF16:const char*:errors::
-PyUnicode_EncodeUTF16:int:byteorder::
-
PyUnicode_AsUTF16String:PyObject*::+1:
PyUnicode_AsUTF16String:PyObject*:unicode:0:
PyUnicode_AsUTF32String:PyObject*::+1:
PyUnicode_AsUTF32String:PyObject*:unicode:0:
-PyUnicode_EncodeUTF32:PyObject*::+1:
-PyUnicode_EncodeUTF32:const Py_UNICODE*:s::
-PyUnicode_EncodeUTF32:Py_ssize_t:size::
-PyUnicode_EncodeUTF32:const char*:errors::
-PyUnicode_EncodeUTF32:int:byteorder::
-
PyUnicode_DecodeUnicodeEscape:PyObject*::+1:
PyUnicode_DecodeUnicodeEscape:const char*:s::
PyUnicode_DecodeUnicodeEscape:Py_ssize_t:size::
PyUnicode_DecodeUnicodeEscape:const char*:errors::
-PyUnicode_EncodeUnicodeEscape:PyObject*::+1:
-PyUnicode_EncodeUnicodeEscape:const Py_UNICODE*:s::
-PyUnicode_EncodeUnicodeEscape:Py_ssize_t:size::
-
PyUnicode_AsUnicodeEscapeString:PyObject*::+1:
PyUnicode_AsUnicodeEscapeString:PyObject*:unicode:0:
PyUnicode_DecodeRawUnicodeEscape:Py_ssize_t:size::
PyUnicode_DecodeRawUnicodeEscape:const char*:errors::
-PyUnicode_EncodeRawUnicodeEscape:PyObject*::+1:
-PyUnicode_EncodeRawUnicodeEscape:const Py_UNICODE*:s::
-PyUnicode_EncodeRawUnicodeEscape:Py_ssize_t:size::
-
PyUnicode_AsRawUnicodeEscapeString:PyObject*::+1:
PyUnicode_AsRawUnicodeEscapeString:PyObject*:unicode:0:
PyUnicode_DecodeLatin1:Py_ssize_t:size::
PyUnicode_DecodeLatin1:const char*:errors::
-PyUnicode_EncodeLatin1:PyObject*::+1:
-PyUnicode_EncodeLatin1:const Py_UNICODE*:s::
-PyUnicode_EncodeLatin1:Py_ssize_t:size::
-PyUnicode_EncodeLatin1:const char*:errors::
-
PyUnicode_AsLatin1String:PyObject*::+1:
PyUnicode_AsLatin1String:PyObject*:unicode:0:
PyUnicode_DecodeASCII:Py_ssize_t:size::
PyUnicode_DecodeASCII:const char*:errors::
-PyUnicode_EncodeASCII:PyObject*::+1:
-PyUnicode_EncodeASCII:const Py_UNICODE*:s::
-PyUnicode_EncodeASCII:Py_ssize_t:size::
-PyUnicode_EncodeASCII:const char*:errors::
-
PyUnicode_AsASCIIString:PyObject*::+1:
PyUnicode_AsASCIIString:PyObject*:unicode:0:
PyUnicode_DecodeCharmap:PyObject*:mapping:0:
PyUnicode_DecodeCharmap:const char*:errors::
-PyUnicode_EncodeCharmap:PyObject*::+1:
-PyUnicode_EncodeCharmap:const Py_UNICODE*:s::
-PyUnicode_EncodeCharmap:Py_ssize_t:size::
-PyUnicode_EncodeCharmap:PyObject*:mapping:0:
-PyUnicode_EncodeCharmap:const char*:errors::
-
PyUnicode_AsCharmapString:PyObject*::+1:
PyUnicode_AsCharmapString:PyObject*:unicode:0:
PyUnicode_AsCharmapString:PyObject*:mapping:0:
-PyUnicode_TranslateCharmap:PyObject*::+1:
-PyUnicode_TranslateCharmap:const Py_UNICODE*:s::
-PyUnicode_TranslateCharmap:Py_ssize_t:size::
-PyUnicode_TranslateCharmap:PyObject*:mapping:0:
-PyUnicode_TranslateCharmap:const char*:errors::
-
PyUnicode_DecodeMBCS:PyObject*::+1:
PyUnicode_DecodeMBCS:const char*:s::
PyUnicode_DecodeMBCS:Py_ssize_t:size::
PyUnicode_EncodeCodePage:PyObject*:unicode:0:
PyUnicode_EncodeCodePage:const char*:errors::
-PyUnicode_EncodeMBCS:PyObject*::+1:
-PyUnicode_EncodeMBCS:const Py_UNICODE*:s::
-PyUnicode_EncodeMBCS:Py_ssize_t:size::
-PyUnicode_EncodeMBCS:const char*:errors::
-
PyUnicode_AsMBCSString:PyObject*::+1:
PyUnicode_AsMBCSString:PyObject*:unicode:0:
PyUnicodeDecodeError_SetStart:PyObject*:exc:0:
PyUnicodeDecodeError_SetStart:Py_ssize_t:start::
-PyUnicodeEncodeError_Create:PyObject*::+1:
-PyUnicodeEncodeError_Create:const char*:encoding::
-PyUnicodeEncodeError_Create:const Py_UNICODE*:object::
-PyUnicodeEncodeError_Create:Py_ssize_t:length::
-PyUnicodeEncodeError_Create:Py_ssize_t:start::
-PyUnicodeEncodeError_Create:Py_ssize_t:end::
-PyUnicodeEncodeError_Create:const char*:reason::
-
-PyUnicodeTranslateError_Create:PyObject*::+1:
-PyUnicodeTranslateError_Create:const Py_UNICODE*:object::
-PyUnicodeTranslateError_Create:Py_ssize_t:length::
-PyUnicodeTranslateError_Create:Py_ssize_t:start::
-PyUnicodeTranslateError_Create:Py_ssize_t:end::
-PyUnicodeTranslateError_Create:const char*:reason::
-
PyWeakref_Check:int:::
PyWeakref_Check:PyObject*:ob::
PyObject *filename,
int lineno);
-/* Create a UnicodeEncodeError object.
- *
- * TODO: This API will be removed in Python 3.11.
- */
-Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject *) PyUnicodeEncodeError_Create(
- const char *encoding, /* UTF-8 encoded string */
- const Py_UNICODE *object,
- Py_ssize_t length,
- Py_ssize_t start,
- Py_ssize_t end,
- const char *reason /* UTF-8 encoded string */
- );
-
-/* Create a UnicodeTranslateError object.
- *
- * TODO: This API will be removed in Python 3.11.
- */
-Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject *) PyUnicodeTranslateError_Create(
- const Py_UNICODE *object,
- Py_ssize_t length,
- Py_ssize_t start,
- Py_ssize_t end,
- const char *reason /* UTF-8 encoded string */
- );
PyAPI_FUNC(PyObject *) _PyUnicodeTranslateError_Create(
PyObject *object,
Py_ssize_t start,
#define _PyUnicode_AsString PyUnicode_AsUTF8
-/* --- Generic Codecs ----------------------------------------------------- */
-
-/* Encodes a Py_UNICODE buffer of the given size and returns a
- Python string object. */
-Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject*) PyUnicode_Encode(
- const Py_UNICODE *s, /* Unicode char buffer */
- Py_ssize_t size, /* number of Py_UNICODE chars to encode */
- const char *encoding, /* encoding */
- const char *errors /* error handling */
- );
-
/* --- UTF-7 Codecs ------------------------------------------------------- */
-Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject*) PyUnicode_EncodeUTF7(
- const Py_UNICODE *data, /* Unicode char buffer */
- Py_ssize_t length, /* number of Py_UNICODE chars to encode */
- int base64SetO, /* Encode RFC2152 Set O characters in base64 */
- int base64WhiteSpace, /* Encode whitespace (sp, ht, nl, cr) in base64 */
- const char *errors /* error handling */
- );
-
PyAPI_FUNC(PyObject*) _PyUnicode_EncodeUTF7(
PyObject *unicode, /* Unicode object */
int base64SetO, /* Encode RFC2152 Set O characters in base64 */
PyObject *unicode,
const char *errors);
-Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject*) PyUnicode_EncodeUTF8(
- const Py_UNICODE *data, /* Unicode char buffer */
- Py_ssize_t length, /* number of Py_UNICODE chars to encode */
- const char *errors /* error handling */
- );
-
/* --- UTF-32 Codecs ------------------------------------------------------ */
-Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject*) PyUnicode_EncodeUTF32(
- const Py_UNICODE *data, /* Unicode char buffer */
- Py_ssize_t length, /* number of Py_UNICODE chars to encode */
- const char *errors, /* error handling */
- int byteorder /* byteorder to use 0=BOM+native;-1=LE,1=BE */
- );
-
PyAPI_FUNC(PyObject*) _PyUnicode_EncodeUTF32(
PyObject *object, /* Unicode object */
const char *errors, /* error handling */
If byteorder is 0, the output string will always start with the
Unicode BOM mark (U+FEFF). In the other two modes, no BOM mark is
prepended.
-
- Note that Py_UNICODE data is being interpreted as UTF-16 reduced to
- UCS-2. This trick makes it possible to add full UTF-16 capabilities
- at a later point without compromising the APIs.
-
*/
-Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject*) PyUnicode_EncodeUTF16(
- const Py_UNICODE *data, /* Unicode char buffer */
- Py_ssize_t length, /* number of Py_UNICODE chars to encode */
- const char *errors, /* error handling */
- int byteorder /* byteorder to use 0=BOM+native;-1=LE,1=BE */
- );
-
PyAPI_FUNC(PyObject*) _PyUnicode_EncodeUTF16(
PyObject* unicode, /* Unicode object */
const char *errors, /* error handling */
string. */
);
-Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject*) PyUnicode_EncodeUnicodeEscape(
- const Py_UNICODE *data, /* Unicode char buffer */
- Py_ssize_t length /* Number of Py_UNICODE chars to encode */
- );
-
-/* --- Raw-Unicode-Escape Codecs ------------------------------------------ */
-
-Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject*) PyUnicode_EncodeRawUnicodeEscape(
- const Py_UNICODE *data, /* Unicode char buffer */
- Py_ssize_t length /* Number of Py_UNICODE chars to encode */
- );
-
/* --- Latin-1 Codecs ----------------------------------------------------- */
PyAPI_FUNC(PyObject*) _PyUnicode_AsLatin1String(
PyObject* unicode,
const char* errors);
-Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject*) PyUnicode_EncodeLatin1(
- const Py_UNICODE *data, /* Unicode char buffer */
- Py_ssize_t length, /* Number of Py_UNICODE chars to encode */
- const char *errors /* error handling */
- );
-
/* --- ASCII Codecs ------------------------------------------------------- */
PyAPI_FUNC(PyObject*) _PyUnicode_AsASCIIString(
PyObject* unicode,
const char* errors);
-Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject*) PyUnicode_EncodeASCII(
- const Py_UNICODE *data, /* Unicode char buffer */
- Py_ssize_t length, /* Number of Py_UNICODE chars to encode */
- const char *errors /* error handling */
- );
-
/* --- Character Map Codecs ----------------------------------------------- */
-Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject*) PyUnicode_EncodeCharmap(
- const Py_UNICODE *data, /* Unicode char buffer */
- Py_ssize_t length, /* Number of Py_UNICODE chars to encode */
- PyObject *mapping, /* encoding mapping */
- const char *errors /* error handling */
- );
-
-PyAPI_FUNC(PyObject*) _PyUnicode_EncodeCharmap(
- PyObject *unicode, /* Unicode object */
- PyObject *mapping, /* encoding mapping */
- const char *errors /* error handling */
- );
-
-/* Translate a Py_UNICODE buffer of the given length by applying a
- character mapping table to it and return the resulting Unicode
- object.
+/* Translate a Unicode object by applying a character mapping table to
+ it and return the resulting Unicode object.
The mapping table must map Unicode ordinal integers to Unicode strings,
Unicode ordinal integers or None (causing deletion of the character).
Mapping tables may be dictionaries or sequences. Unmapped character
ordinals (ones which cause a LookupError) are left untouched and
are copied as-is.
-
*/
-Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject *) PyUnicode_TranslateCharmap(
- const Py_UNICODE *data, /* Unicode char buffer */
- Py_ssize_t length, /* Number of Py_UNICODE chars to encode */
- PyObject *table, /* Translate table */
- const char *errors /* error handling */
- );
-
-/* --- MBCS codecs for Windows -------------------------------------------- */
-
-#ifdef MS_WINDOWS
-Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject*) PyUnicode_EncodeMBCS(
- const Py_UNICODE *data, /* Unicode char buffer */
- Py_ssize_t length, /* number of Py_UNICODE chars to encode */
+PyAPI_FUNC(PyObject*) _PyUnicode_EncodeCharmap(
+ PyObject *unicode, /* Unicode object */
+ PyObject *mapping, /* encoding mapping */
const char *errors /* error handling */
);
-#endif
/* --- Decimal Encoder ---------------------------------------------------- */
-/* Takes a Unicode string holding a decimal value and writes it into
- an output buffer using standard ASCII digit codes.
-
- The output buffer has to provide at least length+1 bytes of storage
- area. The output string is 0-terminated.
-
- The encoder converts whitespace to ' ', decimal characters to their
- corresponding ASCII digit and all other Latin-1 characters except
- \0 as-is. Characters outside this range (Unicode ordinals 1-256)
- are treated as errors. This includes embedded NULL bytes.
-
- Error handling is defined by the errors argument:
-
- NULL or "strict": raise a ValueError
- "ignore": ignore the wrong characters (these are not copied to the
- output buffer)
- "replace": replaces illegal characters with '?'
-
- Returns 0 on success, -1 on failure.
-
-*/
-
-Py_DEPRECATED(3.3) PyAPI_FUNC(int) PyUnicode_EncodeDecimal(
- Py_UNICODE *s, /* Unicode buffer */
- Py_ssize_t length, /* Number of Py_UNICODE chars to encode */
- char *output, /* Output buffer; must have size >= length */
- const char *errors /* error handling */
- );
-
-/* Transforms code points that have decimal digit property to the
- corresponding ASCII digit code points.
-
- Returns a new Unicode string on success, NULL on failure.
-*/
-
-Py_DEPRECATED(3.3)
-PyAPI_FUNC(PyObject*) PyUnicode_TransformDecimalToASCII(
- Py_UNICODE *s, /* Unicode buffer */
- Py_ssize_t length /* Number of Py_UNICODE chars to transform */
- );
-
/* Converts a Unicode object holding a decimal value to an ASCII string
for using in int, float and complex parsers.
Transforms code points that have decimal digit property to the
self.assertRaises(SystemError, unicode_copycharacters, s, 0, s, 0, -1)
self.assertRaises(SystemError, unicode_copycharacters, s, 0, b'', 0, 0)
- @support.cpython_only
- @support.requires_legacy_unicode_capi
- def test_encode_decimal(self):
- from _testcapi import unicode_encodedecimal
- with warnings_helper.check_warnings():
- warnings.simplefilter('ignore', DeprecationWarning)
- self.assertEqual(unicode_encodedecimal('123'),
- b'123')
- self.assertEqual(unicode_encodedecimal('\u0663.\u0661\u0664'),
- b'3.14')
- self.assertEqual(unicode_encodedecimal(
- "\N{EM SPACE}3.14\N{EN SPACE}"), b' 3.14 ')
- self.assertRaises(UnicodeEncodeError,
- unicode_encodedecimal, "123\u20ac", "strict")
- self.assertRaisesRegex(
- ValueError,
- "^'decimal' codec can't encode character",
- unicode_encodedecimal, "123\u20ac", "replace")
-
- @support.cpython_only
- @support.requires_legacy_unicode_capi
- def test_transform_decimal(self):
- from _testcapi import unicode_transformdecimaltoascii as transform_decimal
- with warnings_helper.check_warnings():
- warnings.simplefilter('ignore', DeprecationWarning)
- self.assertEqual(transform_decimal('123'),
- '123')
- self.assertEqual(transform_decimal('\u0663.\u0661\u0664'),
- '3.14')
- self.assertEqual(transform_decimal("\N{EM SPACE}3.14\N{EN SPACE}"),
- "\N{EM SPACE}3.14\N{EN SPACE}")
- self.assertEqual(transform_decimal('123\u20ac'),
- '123\u20ac')
-
@support.cpython_only
def test_pep393_utf8_caching_bug(self):
# Issue #25709: Problem with string concatenation and utf-8 cache
--- /dev/null
+Remove deprecated ``Py_UNICODE`` APIs: ``PyUnicode_Encode``,
+``PyUnicode_EncodeUTF7``, ``PyUnicode_EncodeUTF8``,
+``PyUnicode_EncodeUTF16``, ``PyUnicode_EncodeUTF32``,
+``PyUnicode_EncodeLatin1``, ``PyUnicode_EncodeMBCS``,
+``PyUnicode_EncodeDecimal``, ``PyUnicode_EncodeRawUnicodeEscape``,
+``PyUnicode_EncodeCharmap``, ``PyUnicode_EncodeUnicodeEscape``,
+``PyUnicode_TransformDecimalToASCII``, ``PyUnicode_TranslateCharmap``,
+``PyUnicodeEncodeError_Create``, ``PyUnicodeTranslateError_Create``. See
+:pep:`393` and :pep:`624` for reference.
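
Two of the removed functions, ``PyUnicode_TransformDecimalToASCII`` and ``PyUnicode_EncodeDecimal``, have no one-to-one public replacement listed in this patch. As a rough illustration of what they did, the following is a minimal pure-Python sketch of the semantics described by the removed header comments and tests. The function names ``transform_decimal_to_ascii`` and ``encode_decimal`` are hypothetical, and the sketch assumes that ``unicodedata.decimal`` and ``str.isspace`` agree with ``Py_UNICODE_TODECIMAL`` and ``Py_UNICODE_ISSPACE``. It is not the removed C implementation.

```python
import unicodedata


def transform_decimal_to_ascii(text: str) -> str:
    """Replace each code point that has a decimal-digit value with '0'..'9'.

    Hypothetical helper: approximates the removed
    PyUnicode_TransformDecimalToASCII; all other characters are kept as-is.
    """
    out = []
    for ch in text:
        digit = unicodedata.decimal(ch, None)
        out.append(str(digit) if digit is not None else ch)
    return "".join(out)


def encode_decimal(text: str) -> bytes:
    """Approximate the removed PyUnicode_EncodeDecimal in strict mode.

    Whitespace becomes b' ', decimal digits become ASCII digits, other
    non-NUL Latin-1 characters are copied, and anything else is an error.
    """
    out = bytearray()
    for index, ch in enumerate(text):
        digit = unicodedata.decimal(ch, None)
        if ch.isspace():
            out.append(ord(" "))
        elif digit is not None:
            out.append(ord("0") + digit)
        elif 0 < ord(ch) < 256:
            out.append(ord(ch))
        else:
            raise UnicodeEncodeError(
                "decimal", text, index, index + 1,
                "invalid decimal Unicode string")
    return bytes(out)
```

The input values used to check this sketch are taken from the removed ``test_transform_decimal`` and ``test_encode_decimal`` cases, for example ``'\u0663.\u0661\u0664'`` mapping to ``'3.14'``. C extensions that still hold a ``wchar_t`` buffer can follow the pattern visible in the removed wrappers: build a string object with ``PyUnicode_FromWideChar`` and then call the object-based encoder.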
return 0;
}
-/* A copy of PyUnicode_EncodeRawUnicodeEscape() that also translates
+/* A copy of PyUnicode_AsRawUnicodeEscapeString() that also translates
backslash and newline characters to \uXXXX escapes. */
static PyObject *
raw_unicode_escape(PyObject *obj)
_Py_COMP_DIAG_PUSH
_Py_COMP_DIAG_IGNORE_DEPR_DECLS
-static PyObject *
-unicode_encodedecimal(PyObject *self, PyObject *args)
-{
- Py_UNICODE *unicode;
- Py_ssize_t length;
- char *errors = NULL;
- PyObject *decimal;
- Py_ssize_t decimal_length, new_length;
- int res;
-
- if (!PyArg_ParseTuple(args, "u#|s", &unicode, &length, &errors))
- return NULL;
-
- decimal_length = length * 7; /* len('€') */
- decimal = PyBytes_FromStringAndSize(NULL, decimal_length);
- if (decimal == NULL)
- return NULL;
-
- res = PyUnicode_EncodeDecimal(unicode, length,
- PyBytes_AS_STRING(decimal),
- errors);
- if (res < 0) {
- Py_DECREF(decimal);
- return NULL;
- }
-
- new_length = strlen(PyBytes_AS_STRING(decimal));
- assert(new_length <= decimal_length);
- res = _PyBytes_Resize(&decimal, new_length);
- if (res < 0)
- return NULL;
-
- return decimal;
-}
-
-static PyObject *
-unicode_transformdecimaltoascii(PyObject *self, PyObject *args)
-{
- Py_UNICODE *unicode;
- Py_ssize_t length;
- if (!PyArg_ParseTuple(args, "u#|s", &unicode, &length))
- return NULL;
- return PyUnicode_TransformDecimalToASCII(unicode, length);
-}
-
static PyObject *
unicode_legacy_string(PyObject *self, PyObject *args)
{
{"unicode_findchar", unicode_findchar, METH_VARARGS},
{"unicode_copycharacters", unicode_copycharacters, METH_VARARGS},
#if USE_UNICODE_WCHAR_CACHE
- {"unicode_encodedecimal", unicode_encodedecimal, METH_VARARGS},
- {"unicode_transformdecimaltoascii", unicode_transformdecimaltoascii, METH_VARARGS},
{"unicode_legacy_string", unicode_legacy_string, METH_VARARGS},
#endif /* USE_UNICODE_WCHAR_CACHE */
{"_test_thread_state", test_thread_state, METH_VARARGS},
};
PyObject *PyExc_UnicodeEncodeError = (PyObject *)&_PyExc_UnicodeEncodeError;
-PyObject *
-PyUnicodeEncodeError_Create(
- const char *encoding, const Py_UNICODE *object, Py_ssize_t length,
- Py_ssize_t start, Py_ssize_t end, const char *reason)
-{
- return PyObject_CallFunction(PyExc_UnicodeEncodeError, "su#nns",
- encoding, object, length, start, end, reason);
-}
-
/*
* UnicodeDecodeError extends UnicodeError
};
PyObject *PyExc_UnicodeTranslateError = (PyObject *)&_PyExc_UnicodeTranslateError;
-/* Deprecated. */
-PyObject *
-PyUnicodeTranslateError_Create(
- const Py_UNICODE *object, Py_ssize_t length,
- Py_ssize_t start, Py_ssize_t end, const char *reason)
-{
- return PyObject_CallFunction(PyExc_UnicodeTranslateError, "u#nns",
- object, length, start, end, reason);
-}
-
PyObject *
_PyUnicodeTranslateError_Create(
PyObject *object,
return NULL;
}
-PyObject *
-PyUnicode_Encode(const Py_UNICODE *s,
- Py_ssize_t size,
- const char *encoding,
- const char *errors)
-{
- PyObject *v, *unicode;
-
- unicode = PyUnicode_FromWideChar(s, size);
- if (unicode == NULL)
- return NULL;
- v = PyUnicode_AsEncodedString(unicode, encoding, errors);
- Py_DECREF(unicode);
- return v;
-}
-
PyObject *
PyUnicode_AsEncodedObject(PyObject *unicode,
const char *encoding,
return NULL;
return v;
}
-PyObject *
-PyUnicode_EncodeUTF7(const Py_UNICODE *s,
- Py_ssize_t size,
- int base64SetO,
- int base64WhiteSpace,
- const char *errors)
-{
- PyObject *result;
- PyObject *tmp = PyUnicode_FromWideChar(s, size);
- if (tmp == NULL)
- return NULL;
- result = _PyUnicode_EncodeUTF7(tmp, base64SetO,
- base64WhiteSpace, errors);
- Py_DECREF(tmp);
- return result;
-}
#undef IS_BASE64
#undef FROM_BASE64
}
-PyObject *
-PyUnicode_EncodeUTF8(const Py_UNICODE *s,
- Py_ssize_t size,
- const char *errors)
-{
- PyObject *v, *unicode;
-
- unicode = PyUnicode_FromWideChar(s, size);
- if (unicode == NULL)
- return NULL;
- v = _PyUnicode_AsUTF8String(unicode, errors);
- Py_DECREF(unicode);
- return v;
-}
-
PyObject *
PyUnicode_AsUTF8String(PyObject *unicode)
{
return NULL;
}
-PyObject *
-PyUnicode_EncodeUTF32(const Py_UNICODE *s,
- Py_ssize_t size,
- const char *errors,
- int byteorder)
-{
- PyObject *result;
- PyObject *tmp = PyUnicode_FromWideChar(s, size);
- if (tmp == NULL)
- return NULL;
- result = _PyUnicode_EncodeUTF32(tmp, errors, byteorder);
- Py_DECREF(tmp);
- return result;
-}
-
PyObject *
PyUnicode_AsUTF32String(PyObject *unicode)
{
#undef STORECHAR
}
-PyObject *
-PyUnicode_EncodeUTF16(const Py_UNICODE *s,
- Py_ssize_t size,
- const char *errors,
- int byteorder)
-{
- PyObject *result;
- PyObject *tmp = PyUnicode_FromWideChar(s, size);
- if (tmp == NULL)
- return NULL;
- result = _PyUnicode_EncodeUTF16(tmp, errors, byteorder);
- Py_DECREF(tmp);
- return result;
-}
-
PyObject *
PyUnicode_AsUTF16String(PyObject *unicode)
{
return repr;
}
-PyObject *
-PyUnicode_EncodeUnicodeEscape(const Py_UNICODE *s,
- Py_ssize_t size)
-{
- PyObject *result;
- PyObject *tmp = PyUnicode_FromWideChar(s, size);
- if (tmp == NULL) {
- return NULL;
- }
-
- result = PyUnicode_AsUnicodeEscapeString(tmp);
- Py_DECREF(tmp);
- return result;
-}
-
/* --- Raw Unicode Escape Codec ------------------------------------------- */
PyObject *
return repr;
}
-PyObject *
-PyUnicode_EncodeRawUnicodeEscape(const Py_UNICODE *s,
- Py_ssize_t size)
-{
- PyObject *result;
- PyObject *tmp = PyUnicode_FromWideChar(s, size);
- if (tmp == NULL)
- return NULL;
- result = PyUnicode_AsRawUnicodeEscapeString(tmp);
- Py_DECREF(tmp);
- return result;
-}
-
/* --- Latin-1 Codec ------------------------------------------------------ */
PyObject *
return NULL;
}
-/* Deprecated */
-PyObject *
-PyUnicode_EncodeLatin1(const Py_UNICODE *p,
- Py_ssize_t size,
- const char *errors)
-{
- PyObject *result;
- PyObject *unicode = PyUnicode_FromWideChar(p, size);
- if (unicode == NULL)
- return NULL;
- result = unicode_encode_ucs1(unicode, errors, 256);
- Py_DECREF(unicode);
- return result;
-}
-
PyObject *
_PyUnicode_AsLatin1String(PyObject *unicode, const char *errors)
{
return NULL;
}
-/* Deprecated */
-PyObject *
-PyUnicode_EncodeASCII(const Py_UNICODE *p,
- Py_ssize_t size,
- const char *errors)
-{
- PyObject *result;
- PyObject *unicode = PyUnicode_FromWideChar(p, size);
- if (unicode == NULL)
- return NULL;
- result = unicode_encode_ucs1(unicode, errors, 128);
- Py_DECREF(unicode);
- return result;
-}
-
PyObject *
_PyUnicode_AsASCIIString(PyObject *unicode, const char *errors)
{
return outbytes;
}
-PyObject *
-PyUnicode_EncodeMBCS(const Py_UNICODE *p,
- Py_ssize_t size,
- const char *errors)
-{
- PyObject *unicode, *res;
- unicode = PyUnicode_FromWideChar(p, size);
- if (unicode == NULL)
- return NULL;
- res = encode_code_page(CP_ACP, unicode, errors);
- Py_DECREF(unicode);
- return res;
-}
-
PyObject *
PyUnicode_EncodeCodePage(int code_page,
PyObject *unicode,
return NULL;
}
-/* Deprecated */
-PyObject *
-PyUnicode_EncodeCharmap(const Py_UNICODE *p,
- Py_ssize_t size,
- PyObject *mapping,
- const char *errors)
-{
- PyObject *result;
- PyObject *unicode = PyUnicode_FromWideChar(p, size);
- if (unicode == NULL)
- return NULL;
- result = _PyUnicode_EncodeCharmap(unicode, mapping, errors);
- Py_DECREF(unicode);
- return result;
-}
-
PyObject *
PyUnicode_AsCharmapString(PyObject *unicode,
PyObject *mapping)
return NULL;
}
-/* Deprecated. Use PyUnicode_Translate instead. */
-PyObject *
-PyUnicode_TranslateCharmap(const Py_UNICODE *p,
- Py_ssize_t size,
- PyObject *mapping,
- const char *errors)
-{
- PyObject *result;
- PyObject *unicode = PyUnicode_FromWideChar(p, size);
- if (!unicode)
- return NULL;
- result = _PyUnicode_TranslateCharmap(unicode, mapping, errors);
- Py_DECREF(unicode);
- return result;
-}
-
PyObject *
PyUnicode_Translate(PyObject *str,
PyObject *mapping,
return result;
}
-PyObject *
-PyUnicode_TransformDecimalToASCII(Py_UNICODE *s,
- Py_ssize_t length)
-{
- PyObject *decimal;
- Py_ssize_t i;
- Py_UCS4 maxchar;
- enum PyUnicode_Kind kind;
- const void *data;
-
- maxchar = 127;
- for (i = 0; i < length; i++) {
- Py_UCS4 ch = s[i];
- if (ch > 127) {
- int decimal = Py_UNICODE_TODECIMAL(ch);
- if (decimal >= 0)
- ch = '0' + decimal;
- maxchar = Py_MAX(maxchar, ch);
- }
- }
-
- /* Copy to a new string */
- decimal = PyUnicode_New(length, maxchar);
- if (decimal == NULL)
- return decimal;
- kind = PyUnicode_KIND(decimal);
- data = PyUnicode_DATA(decimal);
- /* Iterate over code points */
- for (i = 0; i < length; i++) {
- Py_UCS4 ch = s[i];
- if (ch > 127) {
- int decimal = Py_UNICODE_TODECIMAL(ch);
- if (decimal >= 0)
- ch = '0' + decimal;
- }
- PyUnicode_WRITE(kind, data, i, ch);
- }
- return unicode_result(decimal);
-}
-/* --- Decimal Encoder ---------------------------------------------------- */
-
-int
-PyUnicode_EncodeDecimal(Py_UNICODE *s,
- Py_ssize_t length,
- char *output,
- const char *errors)
-{
- PyObject *unicode;
- Py_ssize_t i;
- enum PyUnicode_Kind kind;
- const void *data;
-
- if (output == NULL) {
- PyErr_BadArgument();
- return -1;
- }
-
- unicode = PyUnicode_FromWideChar(s, length);
- if (unicode == NULL)
- return -1;
-
- kind = PyUnicode_KIND(unicode);
- data = PyUnicode_DATA(unicode);
-
- for (i=0; i < length; ) {
- PyObject *exc;
- Py_UCS4 ch;
- int decimal;
- Py_ssize_t startpos;
-
- ch = PyUnicode_READ(kind, data, i);
-
- if (Py_UNICODE_ISSPACE(ch)) {
- *output++ = ' ';
- i++;
- continue;
- }
- decimal = Py_UNICODE_TODECIMAL(ch);
- if (decimal >= 0) {
- *output++ = '0' + decimal;
- i++;
- continue;
- }
- if (0 < ch && ch < 256) {
- *output++ = (char)ch;
- i++;
- continue;
- }
-
- startpos = i;
- exc = NULL;
- raise_encode_exception(&exc, "decimal", unicode,
- startpos, startpos+1,
- "invalid decimal Unicode string");
- Py_XDECREF(exc);
- Py_DECREF(unicode);
- return -1;
- }
- /* 0-terminate the output string */
- *output++ = '\0';
- Py_DECREF(unicode);
- return 0;
-}
-
/* --- Helpers ------------------------------------------------------------ */
/* helper macro to fixup start/end slice values */