Archives are extracted using a :ref:`filter <tarfile-extraction-filter>`,
which makes it possible to either limit surprising/dangerous features,
or to acknowledge that they are expected and the archive is fully trusted.
- By default, archives are fully trusted, but this default is deprecated
- and slated to change in Python 3.14.
+.. versionchanged:: 3.14
+   Set the default extraction filter to :func:`data <data_filter>`,
+   which disallows some dangerous features such as links to absolute paths
+   or paths outside of the destination. Previously, the filter strategy
+   was equivalent to :func:`fully_trusted <fully_trusted_filter>`.
.. function:: open(name=None, mode='r', fileobj=None, bufsize=10240, **kwargs)
The *filter* argument specifies how ``members`` are modified or rejected
before extraction.
See :ref:`tarfile-extraction-filter` for details.
- It is recommended to set this explicitly depending on which *tar* features
- you need to support.
+ Set this explicitly only if specific *tar* features are required,
+ or pass ``filter='data'`` to get the safer behavior on Python versions
+ with a less secure default (3.13 and lower).
.. warning::
Never extract archives from untrusted sources without prior inspection.
- It is possible that files are created outside of *path*, e.g. members
- that have absolute filenames starting with ``"/"`` or filenames with two
- dots ``".."``.
- Set ``filter='data'`` to prevent the most dangerous security issues,
- and read the :ref:`tarfile-extraction-filter` section for details.
+ Since Python 3.14, the default (:func:`data <data_filter>`) will prevent
+ the most dangerous security issues.
+ However, it will not prevent *all* unintended or insecure behavior.
+ Read the :ref:`tarfile-extraction-filter` section for details.
.. versionchanged:: 3.5
Added the *numeric_owner* parameter.
.. versionchanged:: 3.12
Added the *filter* parameter.
+ .. versionchanged:: 3.14
+ The *filter* parameter now defaults to ``'data'``.
+
.. method:: TarFile.extract(member, path="", set_attrs=True, *, numeric_owner=False, filter=None)
.. warning::
- See the warning for :meth:`extractall`.
-
- Set ``filter='data'`` to prevent the most dangerous security issues,
- and read the :ref:`tarfile-extraction-filter` section for details.
+ Never extract archives from untrusted sources without prior inspection.
+ See the warning on :meth:`extractall` for details.
.. versionchanged:: 3.2
Added the *set_attrs* parameter.
String names are not allowed for this attribute, unlike the *filter*
argument to :meth:`~TarFile.extract`.
- If ``extraction_filter`` is ``None`` (the default),
- calling an extraction method without a *filter* argument will raise a
- ``DeprecationWarning``,
- and fall back to the :func:`fully_trusted <fully_trusted_filter>` filter,
- whose dangerous behavior matches previous versions of Python.
-
- In Python 3.14+, leaving ``extraction_filter=None`` will cause
- extraction methods to use the :func:`data <data_filter>` filter by default.
+ If ``extraction_filter`` is ``None`` (the default), extraction methods
+ will use the :func:`data <data_filter>` filter by default.
The attribute may be set on instances or overridden in subclasses.
It also is possible to set it on the ``TarFile`` class itself to set a
To set a global default this way, a filter function needs to be wrapped in
:func:`staticmethod()` to prevent injection of a ``self`` argument.
+ .. versionchanged:: 3.14
+
+ The default filter is set to :func:`data <data_filter>`,
+ which disallows some dangerous features such as links to absolute paths
+ or paths outside of the destination.
+ Previously, the default was equivalent to
+ :func:`fully_trusted <fully_trusted_filter>`.
+
.. method:: TarFile.add(name, arcname=None, recursive=True, *, filter=None)
Add the file *name* to the archive. *name* may be any type of file
Therefore, *tarfile* supports extraction filters: a mechanism to limit
functionality, and thus mitigate some of the security issues.
+.. warning::
+
+   None of the available filters blocks *all* dangerous archive features.
+   Never extract archives from untrusted sources without prior inspection.
+   See also :ref:`tarfile-further-verification`.
+
.. seealso::
:pep:`706`
* ``None`` (default): Use :attr:`TarFile.extraction_filter`.
- If that is also ``None`` (the default), raise a ``DeprecationWarning``,
- and fall back to the ``'fully_trusted'`` filter, whose dangerous behavior
- matches previous versions of Python.
+ If that is also ``None`` (the default), the ``'data'`` filter will be used.
+
+ .. versionchanged:: 3.14
- In Python 3.14, the ``'data'`` filter will become the default instead.
- It's possible to switch earlier; see :attr:`TarFile.extraction_filter`.
+ The default filter is set to :func:`data <data_filter>`.
+ Previously, the default was equivalent to
+ :func:`fully_trusted <fully_trusted_filter>`.
* A callable which will be called for each extracted member with a
:ref:`TarInfo <tarinfo-objects>` describing the member and the destination
Return the modified ``TarInfo`` member.
+ Note that this filter does not block *all* dangerous archive features.
+ See :ref:`tarfile-further-verification` for details.
+
.. _tarfile-extraction-refuse:
but extraction will continue.
+.. _tarfile-further-verification:
+
Hints for further verification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
disk, memory and CPU usage.
* Check filenames against an allow-list of characters
(to filter out control characters, confusables, foreign path separators,
- etc.).
+ and so on).
* Check that filenames have expected extensions (discouraging files that
- execute when you “click on them”, or extension-less files like Windows special device names).
+ execute when you “click on them”, or extension-less files like Windows
+ special device names).
* Limit the number of extracted files, total size of extracted data,
filename length (including symlink length), and size of individual files.
* Check for files that would be shadowed on case-insensitive filesystems.
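A few of the hints above can be sketched as a pre-extraction check. This is a minimal illustration, not a complete validator: the limits, the allow-list pattern and the names ``inspect_archive`` and ``archive_from_names`` are assumptions made for this example, and sizes are taken from the member headers only.

```python
import io
import re
import tarfile

# Illustrative limits; appropriate values depend on the application.
MAX_MEMBERS = 1_000
MAX_TOTAL_SIZE = 100 * 1024 * 1024
MAX_NAME_LENGTH = 255
ALLOWED_NAME = re.compile(r"[A-Za-z0-9._/-]+")


def inspect_archive(tar: tarfile.TarFile) -> list[str]:
    """Return a list of problems found in the member headers of *tar*."""
    problems = []
    total_size = 0
    seen_folded = set()
    for count, member in enumerate(tar, start=1):
        if count > MAX_MEMBERS:
            problems.append("too many members")
            break
        # Link targets are checked with the same rules as member names.
        names = [member.name]
        if member.issym() or member.islnk():
            names.append(member.linkname)
        for name in names:
            if not ALLOWED_NAME.fullmatch(name):
                problems.append(f"disallowed characters: {name!r}")
            if len(name) > MAX_NAME_LENGTH:
                problems.append(f"name too long: {name[:40]!r}")
        # Names that differ only by case would shadow each other on
        # case-insensitive filesystems.
        folded = member.name.casefold()
        if folded in seen_folded:
            problems.append(f"case-insensitive collision: {member.name!r}")
        seen_folded.add(folded)
        total_size += member.size
    if total_size > MAX_TOTAL_SIZE:
        problems.append("total size too large")
    return problems


def archive_from_names(names: list[str]) -> bytes:
    """Build an in-memory archive of empty files (helper for this sketch)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name in names:
            tar.addfile(tarfile.TarInfo(name))
    return buffer.getvalue()


if __name__ == "__main__":
    data = archive_from_names(["a.txt", "A.TXT", "bad name"])
    with tarfile.open(fileobj=io.BytesIO(data), mode="r") as tar:
        for problem in inspect_archive(tar):
            print(problem)
```

Such a check complements an extraction filter rather than replacing it; the archive would still be extracted with ``filter='data'`` afterwards.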
tar.close()
os_helper.rmtree(DIR)
+    @staticmethod
+    def test_extractall_default_filter():
+        # Test that the default filter is now "data", and the other filter types are not used.
+        DIR = pathlib.Path(TEMPDIR) / "extractall_default_filter"
+        with (
+            os_helper.temp_dir(DIR),
+            tarfile.open(tarname, encoding="iso8859-1") as tar,
+            unittest.mock.patch("tarfile.data_filter", wraps=tarfile.data_filter) as mock_data_filter,
+            unittest.mock.patch("tarfile.tar_filter", wraps=tarfile.tar_filter) as mock_tar_filter,
+            unittest.mock.patch("tarfile.fully_trusted_filter", wraps=tarfile.fully_trusted_filter) as mock_ft_filter
+        ):
+            directories = [t for t in tar if t.isdir()]
+            tar.extractall(DIR, directories)
+
+        mock_data_filter.assert_called()
+        mock_ft_filter.assert_not_called()
+        mock_tar_filter.assert_not_called()
+
@os_helper.skip_unless_working_chmod
def test_extract_directory(self):
dirtype = "ustar/dirtype"
finally:
os_helper.rmtree(DIR)
- def test_deprecation_if_no_filter_passed_to_extractall(self):
- DIR = pathlib.Path(TEMPDIR) / "extractall"
- with (
- os_helper.temp_dir(DIR),
- tarfile.open(tarname, encoding="iso8859-1") as tar
- ):
- directories = [t for t in tar if t.isdir()]
- with self.assertWarnsRegex(DeprecationWarning, "Use the filter argument") as cm:
- tar.extractall(DIR, directories)
- # check that the stacklevel of the deprecation warning is correct:
- self.assertEqual(cm.filename, __file__)
-
- def test_deprecation_if_no_filter_passed_to_extract(self):
- dirtype = "ustar/dirtype"
- DIR = pathlib.Path(TEMPDIR) / "extractall"
- with (
- os_helper.temp_dir(DIR),
- tarfile.open(tarname, encoding="iso8859-1") as tar
- ):
- tarinfo = tar.getmember(dirtype)
- with self.assertWarnsRegex(DeprecationWarning, "Use the filter argument") as cm:
- tar.extract(tarinfo, path=DIR)
- # check that the stacklevel of the deprecation warning is correct:
- self.assertEqual(cm.filename, __file__)
-
def test_extractall_pathlike_dir(self):
DIR = os.path.join(TEMPDIR, "extractall")
with os_helper.temp_dir(DIR), \
self.assertIs(filtered.name, tarinfo.name)
self.assertIs(filtered.type, tarinfo.type)
- def test_default_filter_warns(self):
- """Ensure the default filter warns"""
- with ArchiveMaker() as arc:
- arc.add('foo')
- with warnings_helper.check_warnings(
- ('Python 3.14', DeprecationWarning)):
- with self.check_context(arc.open(), None):
- self.expect_file('foo')
-
def test_change_default_filter_on_instance(self):
tar = tarfile.TarFile(tarname, 'r')
def strict_filter(tarinfo, path):