bpo-40334: Always show the caret on SyntaxErrors (GH-20050)
This commit fixes SyntaxError locations when the caret is not displayed,
by doing the following:
- `col_number` always gets set to the location of the offending
node/expr. When no caret is to be displayed, this gets achieved
by setting the object holding the error line to None.
- Introduce a new function `_PyPegen_raise_error_known_location`,
which can be called, when an arbitrary `lineno`/`col_offset`
needs to be passed. This function then gets used in the grammar
(through some new macros and inline functions) so that SyntaxError
locations of the new parser match that of the old.
Victor Stinner [Wed, 13 May 2020 02:40:30 +0000 (04:40 +0200)]
bpo-40609: _Py_hashtable_t values become void* (GH-20065)
_Py_hashtable_t values become regular "void *" pointers.
* Add _Py_hashtable_entry_t.data member
* Remove _Py_hashtable_t.data_size member
* Remove _Py_hashtable_t.get_func member. It is no longer needed
to specialize _Py_hashtable_get() for a specific value size, since
all entries now have the same size (void*).
* Remove the following macros:
* Rename _Py_hashtable_pop() to _Py_hashtable_steal()
* _Py_hashtable_foreach() callback now gets key and value rather than
entry.
* Remove _Py_hashtable_value_destroy_func type. value_destroy_func
callback now only has a single parameter: data (void*).
Rewrite _tracemalloc to store "trace_t*" rather than directly
"trace_t" in traces hash tables. Traces are now allocated on the heap
memory, outside the hash table.
Add tracemalloc_copy_traces() and tracemalloc_copy_domains() helper
functions.
Remove _Py_hashtable_copy() function since there is no API to copy a
key or a value.
Remove also _Py_hashtable_delete() function which was commented.
Victor Stinner [Tue, 12 May 2020 23:36:47 +0000 (01:36 +0200)]
bpo-40609: Rewrite how _tracemalloc handles domains (GH-20059)
Rewrite how the _tracemalloc module stores traces of other domains.
Rather than storing the domain inside the key, it now uses a new hash
table with the domain as the key, and the data is a per-domain traces
hash table.
* Add tracemalloc_domain hash table.
* Remove _Py_tracemalloc_config.use_domain.
* Remove pointer_t and related functions.
Victor Stinner [Tue, 12 May 2020 16:46:20 +0000 (18:46 +0200)]
bpo-40602: Add _Py_HashPointerRaw() function (GH-20056)
Add a new _Py_HashPointerRaw() function which avoids replacing -1
with -2 to micro-optimize hash table using pointer keys: using
_Py_hashtable_hash_ptr() hash function.
* Add get_func and get_entry_func members to _Py_hashtable_t
* Convert _Py_hashtable_get() and _Py_hashtable_get_entry() functions
to static nline functions.
* Add specialized get and get entry for pointer keys.
Victor Stinner [Tue, 12 May 2020 00:42:19 +0000 (02:42 +0200)]
bpo-40602: Rename hashtable.h to pycore_hashtable.h (GH-20044)
* Move Modules/hashtable.h to Include/internal/pycore_hashtable.h
* Move Modules/hashtable.c to Python/hashtable.c
* Python is now linked to hashtable.c. _tracemalloc is no longer
linked to hashtable.c. Previously, marshal.c got hashtable.c via
_tracemalloc.c which is built as a builtin module.
scoder [Mon, 11 May 2020 04:04:31 +0000 (06:04 +0200)]
bpo-40575: Avoid unnecessary overhead in _PyDict_GetItemIdWithError() (GH-20018)
Avoid unnecessary overhead in _PyDict_GetItemIdWithError() by calling
_PyDict_GetItem_KnownHash() instead of the more generic PyDict_GetItemWithError(),
since we already know the hash of interned strings.
Pablo Galindo [Mon, 11 May 2020 00:41:26 +0000 (01:41 +0100)]
bpo-40585: Normalize errors messages in codeop when comparing them (GH-20030)
With the new parser, the error message contains always the trailing
newlines, causing the comparison of the repr of the error messages
in codeop to fail. This commit makes the new parser mirror the old parser's
behaviour regarding trailing newlines.
Victor Stinner [Sun, 10 May 2020 09:05:29 +0000 (11:05 +0200)]
bpo-40549: Convert posixmodule.c to multiphase init (GH-19982)
Convert posixmodule.c ("posix" or "nt" module) to the multiphase
initialization (PEP 489).
* Create the module using PyModuleDef_Init().
* Create ScandirIteratorType and DirEntryType with the new
PyType_FromModuleAndSpec() (PEP 573)
* Get the module state from ScandirIteratorType and DirEntryType with
the new PyType_GetModule() (PEP 573)
* Pass module to functions which access the module state.
* convert_sched_param() gets a new module parameter. It is now called
directly since Argument Clinic doesn't support passing the module
to an argument converter callback.
* Remove _posixstate_global macro.
Pablo Galindo [Sun, 10 May 2020 04:34:50 +0000 (05:34 +0100)]
bpo-40334: Avoid collisions between parser variables and grammar variables (GH-19987)
This is for the C generator:
- Disallow rule and variable names starting with `_`
- Rename most local variable names generated by the parser to start with `_`
Exceptions:
- Renaming `p` to `_p` will be a separate PR
- There are still some names that might clash, e.g.
- anything starting with `Py`
- C reserved words (`if` etc.)
- Macros like `EXTRA` and `CHECK`
Victor Stinner [Thu, 7 May 2020 20:42:14 +0000 (22:42 +0200)]
bpo-40548: Always run GitHub action, even on doc PRs (GH-19981)
Always run GitHub action jobs, even on documentation-only pull
requests. So it will be possible to make a GitHub action job, like
the Windows (64-bit) job, mandatory.
(Note: PEP 554 is not accepted and the implementation in the code base is a private one for use in the test suite.)
If code running in a subinterpreter raises an uncaught exception then the "run" call in the calling interpreter fails. A RunFailedError is raised there that summarizes the original exception as a string. The actual exception type, __cause__, __context__, state, etc. are all discarded. This turned out to be functionally insufficient in practice. There is a more helpful solution (and PEP 554 has been updated appropriately).
This change adds the exception propagation behavior described in PEP 554 to the _xxsubinterpreters module. With this change a copy of the original exception is set to __cause__ on the RunFailedError. For now we are using "pickle", which preserves the exception's state. We also preserve the original __cause__, __context__, and __traceback__ (since "pickle" does not preserve those).
bpo-40334: Error message for invalid default args in function call (GH-19973)
When parsing something like `f(g()=2)`, where the name of a default arg
is not a NAME, but an arbitrary expression, a specialised error message
is emitted.
bpo-40334: Fix error location upon parsing an invalid string literal (GH-19962)
When parsing a string with an invalid escape, the old parser used to
point to the beginning of the invalid string. This commit changes the new
parser to match that behaviour, since it's currently pointing to the
end of the string (or to be more precise, to the beginning of the next
token).
Make the design more object-oriented.
Split _GenericAlias on two almost independent classes: for special
generic aliases like List and for parametrized generic aliases like List[int].
Add specialized subclasses for Callable, Callable[...], Tuple and Union[...].
In the experimental isolated subinterpreters build mode,
_PyThreadState_GET() gets the autoTSSkey variable and
_PyThreadState_Swap() sets the autoTSSkey variable.
* Add _PyThreadState_GetTSS()
* _PyRuntimeState_GetThreadState() and _PyThreadState_GET()
return _PyThreadState_GetTSS()
* PyEval_SaveThread() sets the autoTSSkey variable to current Python
thread state rather than NULL.
* eval_frame_handle_pending() doesn't check that
_PyThreadState_Swap() result is NULL.
* _PyThreadState_Swap() gets the current Python thread state with
_PyThreadState_GetTSS() rather than
_PyRuntimeGILState_GetThreadState().
* PyGILState_Ensure() no longer checks _PyEval_ThreadsInitialized()
since it cannot access the current interpreter.
Victor Stinner [Tue, 5 May 2020 16:50:30 +0000 (18:50 +0200)]
bpo-40521: Disable Unicode caches in isolated subinterpreters (GH-19933)
When Python is built in the experimental isolated subinterpreters
mode, disable Unicode singletons and Unicode interned strings since
they are shared by all interpreters.
Temporary workaround until these caches are made per-interpreter.
Move recursion_limit member from _PyRuntimeState.ceval to
PyInterpreterState.ceval.
* Py_SetRecursionLimit() now only sets _Py_CheckRecursionLimit
of ceval.c if the current Python thread is part of the main
interpreter.
* Inline _Py_MakeEndRecCheck() into _Py_LeaveRecursiveCall().
* Convert _Py_RecursionLimitLowerWaterMark() macro into a static
inline function.