--- /dev/null
+From 86314751c7945fa0c67f459beeda2e7c610ca429 Mon Sep 17 00:00:00 2001
+From: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
+Date: Thu, 2 Jun 2016 01:57:50 +0200
+Subject: ACPI / processor: Avoid reserving IO regions too early
+
+From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
+
+commit 86314751c7945fa0c67f459beeda2e7c610ca429 upstream.
+
+Roland Dreier reports that one of his systems cannot boot because of
+the changes made by commit ac212b6980d8 (ACPI / processor: Use common
+hotplug infrastructure).
+
+The problematic part of it is the request_region() call in
+acpi_processor_get_info() that used to run at module init time before
+the above commit and now it runs much earlier. Unfortunately, the
+region(s) reserved by it fall into a range the PCI subsystem attempts
+to reserve for AHCI IO BARs. As a result, the PCI reservation fails
+and AHCI doesn't work, while previously the PCI reservation would
+be made before acpi_processor_get_info() and it would succeed.
+
+That request_region() call, however, was overlooked by commit
+ac212b6980d8, as it is not necessary for the enumeration of the
+processors. It only is needed when the ACPI processor driver
+actually attempts to handle them which doesn't happen before
+loading the ACPI processor driver module. Therefore that call
+should have been moved from acpi_processor_get_info() into that
+module.
+
+Address the problem by moving the request_region() call in question
+out of acpi_processor_get_info() and use the observation that the
+region reserved by it is only needed if the FADT-based CPU
+throttling method is going to be used, which means that it should
+be sufficient to invoke it from acpi_processor_get_throttling_fadt().
+
+Fixes: ac212b6980d8 (ACPI / processor: Use common hotplug infrastructure)
+Reported-by: Roland Dreier <roland@purestorage.com>
+Tested-by: Roland Dreier <roland@purestorage.com>
+Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
+Acked-by: Joerg Roedel <jroedel@suse.de>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ drivers/acpi/acpi_processor.c | 9 ---------
+ drivers/acpi/processor_throttling.c | 9 +++++++++
+ 2 files changed, 9 insertions(+), 9 deletions(-)
+
+--- a/drivers/acpi/acpi_processor.c
++++ b/drivers/acpi/acpi_processor.c
+@@ -331,15 +331,6 @@ static int acpi_processor_get_info(struc
+ pr->throttling.duty_width = acpi_gbl_FADT.duty_width;
+
+ pr->pblk = object.processor.pblk_address;
+-
+- /*
+- * We don't care about error returns - we just try to mark
+- * these reserved so that nobody else is confused into thinking
+- * that this region might be unused..
+- *
+- * (In particular, allocating the IO range for Cardbus)
+- */
+- request_region(pr->throttling.address, 6, "ACPI CPU throttle");
+ }
+
+ /*
+--- a/drivers/acpi/processor_throttling.c
++++ b/drivers/acpi/processor_throttling.c
+@@ -676,6 +676,15 @@ static int acpi_processor_get_throttling
+ if (!pr->flags.throttling)
+ return -ENODEV;
+
++ /*
++ * We don't care about error returns - we just try to mark
++ * these reserved so that nobody else is confused into thinking
++ * that this region might be unused..
++ *
++ * (In particular, allocating the IO range for Cardbus)
++ */
++ request_region(pr->throttling.address, 6, "ACPI CPU throttle");
++
+ pr->throttling.state = 0;
+
+ duty_mask = pr->throttling.state_count - 1;
--- /dev/null
+From c2a6bbaf0c5f90463a7011a295bbdb7e33c80b51 Mon Sep 17 00:00:00 2001
+From: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
+Date: Fri, 30 Dec 2016 02:27:31 +0100
+Subject: ACPI / scan: Prefer devices without _HID/_CID for _ADR matching
+
+From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
+
+commit c2a6bbaf0c5f90463a7011a295bbdb7e33c80b51 upstream.
+
+The way acpi_find_child_device() works currently is that, if there
+are two (or more) devices with the same _ADR value in the same
+namespace scope (which is not specifically allowed by the spec and
+the OS behavior in that case is not defined), the first one of them
+found to be present (with the help of _STA) will be returned.
+
+This covers the majority of cases, but is not sufficient if some of
+the devices in question have a _HID (or _CID) returning some valid
+ACPI/PNP device IDs (which is disallowed by the spec) and the
+ASL writers' expectation appears to be that the OS will match
+devices without a valid ACPI/PNP device ID against a given bus
+address first.
+
+To cover this special case as well, modify find_child_checks()
+to prefer devices without ACPI/PNP device IDs over devices that
+have them.
+
+Suggested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
+Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
+Tested-by: Hans de Goede <hdegoede@redhat.com>
+Signed-off-by: Jiri Slaby <jslaby@suse.cz>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ drivers/acpi/glue.c | 12 ++++++------
+ 1 file changed, 6 insertions(+), 6 deletions(-)
+
+--- a/drivers/acpi/glue.c
++++ b/drivers/acpi/glue.c
+@@ -99,13 +99,13 @@ static int find_child_checks(struct acpi
+ return -ENODEV;
+
+ /*
+- * If the device has a _HID (or _CID) returning a valid ACPI/PNP
+- * device ID, it is better to make it look less attractive here, so that
+- * the other device with the same _ADR value (that may not have a valid
+- * device ID) can be matched going forward. [This means a second spec
+- * violation in a row, so whatever we do here is best effort anyway.]
++ * If the device has a _HID returning a valid ACPI/PNP device ID, it is
++ * better to make it look less attractive here, so that the other device
++ * with the same _ADR value (that may not have a valid device ID) can be
++ * matched going forward. [This means a second spec violation in a row,
++ * so whatever we do here is best effort anyway.]
+ */
+- return sta_present && list_empty(&adev->pnp.ids) ?
++ return sta_present && !adev->pnp.type.platform_id ?
+ FIND_CHILD_MAX_SCORE : FIND_CHILD_MIN_SCORE;
+ }
+
--- /dev/null
+From 3b2d69114fefa474fca542e51119036dceb4aa6f Mon Sep 17 00:00:00 2001
+From: Seunghun Han <kkamagui@gmail.com>
+Date: Wed, 26 Apr 2017 16:18:08 +0800
+Subject: ACPICA: Namespace: fix operand cache leak
+
+From: Seunghun Han <kkamagui@gmail.com>
+
+commit 3b2d69114fefa474fca542e51119036dceb4aa6f upstream.
+
+ACPICA commit a23325b2e583556eae88ed3f764e457786bf4df6
+
+I found some ACPI operand cache leaks in ACPI early abort cases.
+
+Boot log of ACPI operand cache leak is as follows:
+>[ 0.174332] ACPI: Added _OSI(Module Device)
+>[ 0.175504] ACPI: Added _OSI(Processor Device)
+>[ 0.176010] ACPI: Added _OSI(3.0 _SCP Extensions)
+>[ 0.177032] ACPI: Added _OSI(Processor Aggregator Device)
+>[ 0.178284] ACPI: SCI (IRQ16705) allocation failed
+>[ 0.179352] ACPI Exception: AE_NOT_ACQUIRED, Unable to install
+System Control Interrupt handler (20160930/evevent-131)
+>[ 0.180008] ACPI: Unable to start the ACPI Interpreter
+>[ 0.181125] ACPI Error: Could not remove SCI handler
+(20160930/evmisc-281)
+>[ 0.184068] kmem_cache_destroy Acpi-Operand: Slab cache still has
+objects
+>[ 0.185358] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.10.0-rc3 #2
+>[ 0.186820] Hardware name: innotek gmb_h virtual_box/virtual_box, BIOS
+virtual_box 12/01/2006
+>[ 0.188000] Call Trace:
+>[ 0.188000] ? dump_stack+0x5c/0x7d
+>[ 0.188000] ? kmem_cache_destroy+0x224/0x230
+>[ 0.188000] ? acpi_sleep_proc_init+0x22/0x22
+>[ 0.188000] ? acpi_os_delete_cache+0xa/0xd
+>[ 0.188000] ? acpi_ut_delete_caches+0x3f/0x7b
+>[ 0.188000] ? acpi_terminate+0x5/0xf
+>[ 0.188000] ? acpi_init+0x288/0x32e
+>[ 0.188000] ? __class_create+0x4c/0x80
+>[ 0.188000] ? video_setup+0x7a/0x7a
+>[ 0.188000] ? do_one_initcall+0x4e/0x1b0
+>[ 0.188000] ? kernel_init_freeable+0x194/0x21a
+>[ 0.188000] ? rest_init+0x80/0x80
+>[ 0.188000] ? kernel_init+0xa/0x100
+>[ 0.188000] ? ret_from_fork+0x25/0x30
+
+When an early abort occurs due to invalid ACPI information, the Linux kernel
+terminates ACPI by calling the acpi_terminate() function. That function calls
+acpi_ns_terminate() to delete namespace data and the ACPI operand cache
+(acpi_gbl_module_code_list).
+
+But the deletion code in acpi_ns_terminate() is wrapped in an
+ACPI_EXEC_APP #ifdef, so it is only compiled in when that macro is
+defined. If it is not defined, the ACPI operand cache
+(acpi_gbl_module_code_list) is leaked, and a stack dump is shown in the kernel log.
+
+This causes a security threat because the old kernel (<= 4.9) shows memory
+locations of kernel functions in stack dump, therefore kernel ASLR can be
+neutralized.
+
+To fix the ACPI operand cache leak and improve security, this patch
+removes the ACPI_EXEC_APP #ifdef in acpi_ns_terminate() so that the
+deletion code is executed unconditionally.
+
+Link: https://github.com/acpica/acpica/commit/a23325b2
+Signed-off-by: Seunghun Han <kkamagui@gmail.com>
+Signed-off-by: Lv Zheng <lv.zheng@intel.com>
+Signed-off-by: Bob Moore <robert.moore@intel.com>
+Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
+Acked-by: Lee, Chun-Yi <jlee@suse.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ drivers/acpi/acpica/nsutils.c | 23 +++++++++--------------
+ 1 file changed, 9 insertions(+), 14 deletions(-)
+
+--- a/drivers/acpi/acpica/nsutils.c
++++ b/drivers/acpi/acpica/nsutils.c
+@@ -593,25 +593,20 @@ struct acpi_namespace_node *acpi_ns_vali
+ void acpi_ns_terminate(void)
+ {
+ acpi_status status;
++ union acpi_operand_object *prev;
++ union acpi_operand_object *next;
+
+ ACPI_FUNCTION_TRACE(ns_terminate);
+
+-#ifdef ACPI_EXEC_APP
+- {
+- union acpi_operand_object *prev;
+- union acpi_operand_object *next;
++ /* Delete any module-level code blocks */
+
+- /* Delete any module-level code blocks */
+-
+- next = acpi_gbl_module_code_list;
+- while (next) {
+- prev = next;
+- next = next->method.mutex;
+- prev->method.mutex = NULL; /* Clear the Mutex (cheated) field */
+- acpi_ut_remove_reference(prev);
+- }
++ next = acpi_gbl_module_code_list;
++ while (next) {
++ prev = next;
++ next = next->method.mutex;
++ prev->method.mutex = NULL; /* Clear the Mutex (cheated) field */
++ acpi_ut_remove_reference(prev);
+ }
+-#endif
+
+ /*
+ * Free the entire namespace -- all nodes and all objects
--- /dev/null
+From e048cb32f69038aa1c8f11e5c1b331be4181659d Mon Sep 17 00:00:00 2001
+From: Doug Berger <opendmb@gmail.com>
+Date: Mon, 10 Jul 2017 15:49:44 -0700
+Subject: cma: fix calculation of aligned offset
+
+From: Doug Berger <opendmb@gmail.com>
+
+commit e048cb32f69038aa1c8f11e5c1b331be4181659d upstream.
+
+The align_offset parameter is used by bitmap_find_next_zero_area_off()
+to represent the offset of map's base from the previous alignment
+boundary; the function ensures that the returned index, plus the
+align_offset, honors the specified align_mask.
+
+The logic introduced by commit b5be83e308f7 ("mm: cma: align to physical
+address, not CMA region position") has the cma driver calculate the
+offset to the *next* alignment boundary. In most cases, the base
+alignment is greater than that specified when making allocations,
+resulting in a zero offset whether we align up or down. In the example
+given with the commit, the base alignment (8MB) was half the requested
+alignment (16MB) so the math also happened to work since the offset is
+8MB in both directions. However, when requesting allocations with an
+alignment greater than twice that of the base, the returned index would
+not be correctly aligned.
+
+Also, the align_order arguments of cma_bitmap_aligned_mask() and
+cma_bitmap_aligned_offset() should not be negative so the argument type
+was made unsigned.
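
The difference between the two offset calculations can be checked with a small user-space sketch. The helper names (align_up, offset_to_next, offset_from_prev) are hypothetical and only mirror the arithmetic before and after the patch, assuming power-of-two alignments:

```c
#include <assert.h>

/* Hypothetical helper: round x up to a multiple of a (a must be a power of two). */
static unsigned long align_up(unsigned long x, unsigned long a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Pre-patch arithmetic: distance from base_pfn up to the *next* boundary. */
static unsigned long offset_to_next(unsigned long base_pfn,
				    unsigned int align_order,
				    unsigned int order_per_bit)
{
	assert(align_order < 8 * sizeof(unsigned long));
	if (align_order <= order_per_bit)
		return 0;
	return (align_up(base_pfn, 1UL << align_order) - base_pfn)
		>> order_per_bit;
}

/* Post-patch arithmetic: distance of base_pfn from the *previous* boundary. */
static unsigned long offset_from_prev(unsigned long base_pfn,
				      unsigned int align_order,
				      unsigned int order_per_bit)
{
	assert(align_order < 8 * sizeof(unsigned long));
	return (base_pfn & ((1UL << align_order) - 1)) >> order_per_bit;
}
```

With 4KB pages, a base at 8MB is PFN 0x800. For a 16MB alignment (order 12) both helpers return 0x800, which is why the original example happened to work. For a 32MB alignment (order 13) the pre-patch value is 0x1800 while the offset from the previous boundary is still 0x800.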
+
+Fixes: b5be83e308f7 ("mm: cma: align to physical address, not CMA region position")
+Link: http://lkml.kernel.org/r/20170628170742.2895-1-opendmb@gmail.com
+Signed-off-by: Angus Clark <angus@angusclark.org>
+Signed-off-by: Doug Berger <opendmb@gmail.com>
+Acked-by: Gregory Fong <gregory.0xf0@gmail.com>
+Cc: Doug Berger <opendmb@gmail.com>
+Cc: Angus Clark <angus@angusclark.org>
+Cc: Laura Abbott <labbott@redhat.com>
+Cc: Vlastimil Babka <vbabka@suse.cz>
+Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+Cc: Lucas Stach <l.stach@pengutronix.de>
+Cc: Catalin Marinas <catalin.marinas@arm.com>
+Cc: Shiraz Hashim <shashim@codeaurora.org>
+Cc: Jaewon Kim <jaewon31.kim@samsung.com>
+Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
+Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
+Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ mm/cma.c | 15 ++++++---------
+ 1 file changed, 6 insertions(+), 9 deletions(-)
+
+--- a/mm/cma.c
++++ b/mm/cma.c
+@@ -54,7 +54,7 @@ unsigned long cma_get_size(const struct
+ }
+
+ static unsigned long cma_bitmap_aligned_mask(const struct cma *cma,
+- int align_order)
++ unsigned int align_order)
+ {
+ if (align_order <= cma->order_per_bit)
+ return 0;
+@@ -62,17 +62,14 @@ static unsigned long cma_bitmap_aligned_
+ }
+
+ /*
+- * Find a PFN aligned to the specified order and return an offset represented in
+- * order_per_bits.
++ * Find the offset of the base PFN from the specified align_order.
++ * The value returned is represented in order_per_bits.
+ */
+ static unsigned long cma_bitmap_aligned_offset(const struct cma *cma,
+- int align_order)
++ unsigned int align_order)
+ {
+- if (align_order <= cma->order_per_bit)
+- return 0;
+-
+- return (ALIGN(cma->base_pfn, (1UL << align_order))
+- - cma->base_pfn) >> cma->order_per_bit;
++ return (cma->base_pfn & ((1UL << align_order) - 1))
++ >> cma->order_per_bit;
+ }
+
+ static unsigned long cma_bitmap_pages_to_bits(const struct cma *cma,
--- /dev/null
+From a992f2d38e4ce17b8c7d1f7f67b2de0eebdea069 Mon Sep 17 00:00:00 2001
+From: Jan Kara <jack@suse.cz>
+Date: Wed, 21 Jun 2017 14:34:15 +0200
+Subject: ext2: Don't clear SGID when inheriting ACLs
+
+From: Jan Kara <jack@suse.cz>
+
+commit a992f2d38e4ce17b8c7d1f7f67b2de0eebdea069 upstream.
+
+When a new directory 'DIR1' is created in a directory 'DIR0' with the SGID
+bit set, DIR1 is expected to have the SGID bit set (and owning group equal
+to the owning group of 'DIR0'). However, when 'DIR0' also has some default
+ACLs that 'DIR1' inherits, setting these ACLs will clear the SGID bit on
+'DIR1' if the user is not a member of the owning group.
+
+Fix the problem by creating __ext2_set_acl() function that does not call
+posix_acl_update_mode() and use it when inheriting ACLs. That prevents
+SGID bit clearing and the mode has been properly set by
+posix_acl_create() anyway.
+
+Fixes: 073931017b49d9458aa351605b43a7e34598caef
+CC: stable@vger.kernel.org
+CC: linux-ext4@vger.kernel.org
+Signed-off-by: Jan Kara <jack@suse.cz>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ fs/ext2/acl.c | 36 ++++++++++++++++++++++--------------
+ 1 file changed, 22 insertions(+), 14 deletions(-)
+
+--- a/fs/ext2/acl.c
++++ b/fs/ext2/acl.c
+@@ -178,11 +178,8 @@ ext2_get_acl(struct inode *inode, int ty
+ return acl;
+ }
+
+-/*
+- * inode->i_mutex: down
+- */
+-int
+-ext2_set_acl(struct inode *inode, struct posix_acl *acl, int type)
++static int
++__ext2_set_acl(struct inode *inode, struct posix_acl *acl, int type)
+ {
+ int name_index;
+ void *value = NULL;
+@@ -192,13 +189,6 @@ ext2_set_acl(struct inode *inode, struct
+ switch(type) {
+ case ACL_TYPE_ACCESS:
+ name_index = EXT2_XATTR_INDEX_POSIX_ACL_ACCESS;
+- if (acl) {
+- error = posix_acl_update_mode(inode, &inode->i_mode, &acl);
+- if (error)
+- return error;
+- inode->i_ctime = CURRENT_TIME_SEC;
+- mark_inode_dirty(inode);
+- }
+ break;
+
+ case ACL_TYPE_DEFAULT:
+@@ -225,6 +215,24 @@ ext2_set_acl(struct inode *inode, struct
+ }
+
+ /*
++ * inode->i_mutex: down
++ */
++int
++ext2_set_acl(struct inode *inode, struct posix_acl *acl, int type)
++{
++ int error;
++
++ if (type == ACL_TYPE_ACCESS && acl) {
++ error = posix_acl_update_mode(inode, &inode->i_mode, &acl);
++ if (error)
++ return error;
++ inode->i_ctime = CURRENT_TIME_SEC;
++ mark_inode_dirty(inode);
++ }
++ return __ext2_set_acl(inode, acl, type);
++}
++
++/*
+ * Initialize the ACLs of a new inode. Called from ext2_new_inode.
+ *
+ * dir->i_mutex: down
+@@ -241,12 +249,12 @@ ext2_init_acl(struct inode *inode, struc
+ return error;
+
+ if (default_acl) {
+- error = ext2_set_acl(inode, default_acl, ACL_TYPE_DEFAULT);
++ error = __ext2_set_acl(inode, default_acl, ACL_TYPE_DEFAULT);
+ posix_acl_release(default_acl);
+ }
+ if (acl) {
+ if (!error)
+- error = ext2_set_acl(inode, acl, ACL_TYPE_ACCESS);
++ error = __ext2_set_acl(inode, acl, ACL_TYPE_ACCESS);
+ posix_acl_release(acl);
+ }
+ return error;
--- /dev/null
+From fc3dc67471461c0efcb1ed22fb7595121d65fad9 Mon Sep 17 00:00:00 2001
+From: Jiri Slaby <jslaby@suse.cz>
+Date: Tue, 13 Jun 2017 13:35:51 +0200
+Subject: fs/fcntl: f_setown, avoid undefined behaviour
+
+From: Jiri Slaby <jslaby@suse.cz>
+
+commit fc3dc67471461c0efcb1ed22fb7595121d65fad9 upstream.
+
+fcntl(0, F_SETOWN, 0x80000000) triggers:
+UBSAN: Undefined behaviour in fs/fcntl.c:118:7
+negation of -2147483648 cannot be represented in type 'int':
+CPU: 1 PID: 18261 Comm: syz-executor Not tainted 4.8.1-0-syzkaller #1
+...
+Call Trace:
+...
+ [<ffffffffad8f0868>] ? f_setown+0x1d8/0x200
+ [<ffffffffad8f19a9>] ? SyS_fcntl+0x999/0xf30
+ [<ffffffffaed1fb00>] ? entry_SYSCALL_64_fastpath+0x23/0xc1
+
+Fix that by checking the arg parameter properly (against INT_MAX) before
+"who = -who". And return immediately with -EINVAL in case it is wrong.
+Note that according to POSIX we can return EINVAL:
+ http://pubs.opengroup.org/onlinepubs/9699919799/functions/fcntl.html
+
+ [EINVAL]
+ The cmd argument is F_SETOWN and the value of the argument
+ is not valid as a process or process group identifier.
+
+[v2] returns an error, v1 used to fail silently
+[v3] implement proper check for the bad value INT_MIN
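
The INT_MIN guard can be sketched in user space as follows. The enum and the classify_owner() helper are hypothetical names used for illustration only. They model the sign handling, not the kernel function itself, and leave it to the caller to decide how the invalid case is reported:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical result type for this sketch. */
enum owner_kind { OWNER_PID, OWNER_PGID, OWNER_INVALID };

/*
 * Split a signed F_SETOWN-style argument into a kind and a positive id.
 * Negative values name a process group. INT_MIN is rejected up front
 * because -INT_MIN cannot be represented in an int.
 */
static enum owner_kind classify_owner(int who, int *id)
{
	assert(id != NULL);

	if (who < 0) {
		if (who == INT_MIN)
			return OWNER_INVALID;	/* avoid overflow below */
		*id = -who;
		return OWNER_PGID;
	}
	*id = who;
	return OWNER_PID;
}
```

Every other negative value, including -INT_MAX, negates safely, so only the single value INT_MIN needs the early return.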
+
+Signed-off-by: Jiri Slaby <jslaby@suse.cz>
+Cc: Jeff Layton <jlayton@poochiereds.net>
+Cc: "J. Bruce Fields" <bfields@fieldses.org>
+Cc: Alexander Viro <viro@zeniv.linux.org.uk>
+Cc: linux-fsdevel@vger.kernel.org
+Signed-off-by: Jeff Layton <jlayton@redhat.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ fs/fcntl.c | 4 ++++
+ 1 file changed, 4 insertions(+)
+
+--- a/fs/fcntl.c
++++ b/fs/fcntl.c
+@@ -113,6 +113,10 @@ void f_setown(struct file *filp, unsigne
+ int who = arg;
+ type = PIDTYPE_PID;
+ if (who < 0) {
++ /* avoid overflow below */
++ if (who == INT_MIN)
++ return;
++
+ type = PIDTYPE_PGID;
+ who = -who;
+ }
--- /dev/null
+From 18365225f0440d09708ad9daade2ec11275c3df9 Mon Sep 17 00:00:00 2001
+From: Michal Hocko <mhocko@suse.com>
+Date: Fri, 12 May 2017 15:46:26 -0700
+Subject: hwpoison, memcg: forcibly uncharge LRU pages
+
+From: Michal Hocko <mhocko@suse.com>
+
+commit 18365225f0440d09708ad9daade2ec11275c3df9 upstream.
+
+Laurent Dufour has noticed that hwpoisoned pages are kept charged. In
+his particular case he has hit a bad_page("page still charged to
+cgroup") when onlining a hwpoison page. While this looks like something
+that shouldn't happen in the first place, because onlining hwpages and
+returning them to the page allocator makes little sense, it shows a
+real problem.
+
+hwpoison pages do not get freed usually so we do not uncharge them (at
+least not since commit 0a31bc97c80c ("mm: memcontrol: rewrite uncharge
+API")). Each charge pins memcg (since e8ea14cc6ead ("mm: memcontrol:
+take a css reference for each charged page")) as well and so the
+mem_cgroup and the associated state will never go away. Fix this leak
+by forcibly uncharging a LRU hwpoisoned page in delete_from_lru_cache().
+We also have to tweak uncharge_list because it cannot rely on zero ref
+count for these pages.
+
+[akpm@linux-foundation.org: coding-style fixes]
+Fixes: 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API")
+Link: http://lkml.kernel.org/r/20170502185507.GB19165@dhcp22.suse.cz
+Signed-off-by: Michal Hocko <mhocko@suse.com>
+Reported-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
+Tested-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
+Reviewed-by: Balbir Singh <bsingharora@gmail.com>
+Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
+Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
+Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ mm/memcontrol.c | 2 +-
+ mm/memory-failure.c | 7 +++++++
+ 2 files changed, 8 insertions(+), 1 deletion(-)
+
+--- a/mm/memcontrol.c
++++ b/mm/memcontrol.c
+@@ -5576,7 +5576,7 @@ static void uncharge_list(struct list_he
+ next = page->lru.next;
+
+ VM_BUG_ON_PAGE(PageLRU(page), page);
+- VM_BUG_ON_PAGE(page_count(page), page);
++ VM_BUG_ON_PAGE(!PageHWPoison(page) && page_count(page), page);
+
+ if (!page->mem_cgroup)
+ continue;
+--- a/mm/memory-failure.c
++++ b/mm/memory-failure.c
+@@ -539,6 +539,13 @@ static int delete_from_lru_cache(struct
+ */
+ ClearPageActive(p);
+ ClearPageUnevictable(p);
++
++ /*
++ * Poisoned page might never drop its ref count to 0 so we have
++ * to uncharge it manually from its memcg.
++ */
++ mem_cgroup_uncharge(p);
++
+ /*
+ * drop the page count elevated by isolate_lru_page()
+ */
--- /dev/null
+From 999898355e08ae3b92dfd0a08db706e0c6703d30 Mon Sep 17 00:00:00 2001
+From: Jiri Slaby <jslaby@suse.cz>
+Date: Wed, 14 Dec 2016 15:06:07 -0800
+Subject: ipc: msg, make msgrcv work with LONG_MIN
+
+From: Jiri Slaby <jslaby@suse.cz>
+
+commit 999898355e08ae3b92dfd0a08db706e0c6703d30 upstream.
+
+When LONG_MIN is passed to msgrcv, one would expect to receive any
+message. But convert_mode does *msgtyp = -*msgtyp and -LONG_MIN is
+undefined. In particular, with my gcc -LONG_MIN produces -LONG_MIN
+again.
+
+So handle this case properly by assigning LONG_MAX to *msgtyp if
+LONG_MIN was specified as msgtyp to msgrcv.
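
A minimal user-space sketch of the saturating negation described above. The enum values and the pick_mode() name are hypothetical, and the sketch deliberately ignores the msgflg handling that the real convert_mode() also performs:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical search modes for this sketch. */
enum mode { MODE_ANY, MODE_EQUAL, MODE_LESSEQUAL };

/*
 * Map a signed message type to a search mode. A negative type means
 * "lowest type less than or equal to |type|", so the value is negated.
 * LONG_MIN is clamped to LONG_MAX because -LONG_MIN is undefined.
 */
static enum mode pick_mode(long *msgtyp)
{
	assert(msgtyp != NULL);

	if (*msgtyp == 0)
		return MODE_ANY;
	if (*msgtyp < 0) {
		if (*msgtyp == LONG_MIN)
			*msgtyp = LONG_MAX;
		else
			*msgtyp = -*msgtyp;
		return MODE_LESSEQUAL;
	}
	return MODE_EQUAL;
}
```

Clamping to LONG_MAX keeps the "less than or equal" search matching every positive message type, which is the behaviour a caller passing LONG_MIN would expect.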
+
+This code:
+ long msg[] = { 100, 200 };
+ int m = msgget(IPC_PRIVATE, IPC_CREAT | 0644);
+ msgsnd(m, &msg, sizeof(msg), 0);
+ msgrcv(m, &msg, sizeof(msg), LONG_MIN, 0);
+
+currently produces nothing:
+
+ msgget(IPC_PRIVATE, IPC_CREAT|0644) = 65538
+ msgsnd(65538, {100, "\310\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16, 0) = 0
+ msgrcv(65538, ...
+
+Except a UBSAN warning:
+
+ UBSAN: Undefined behaviour in ipc/msg.c:745:13
+ negation of -9223372036854775808 cannot be represented in type 'long int':
+
+With the patch, I see what I expect:
+
+ msgget(IPC_PRIVATE, IPC_CREAT|0644) = 0
+ msgsnd(0, {100, "\310\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16, 0) = 0
+ msgrcv(0, {100, "\310\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16, -9223372036854775808, 0) = 16
+
+Link: http://lkml.kernel.org/r/20161024082633.10148-1-jslaby@suse.cz
+Signed-off-by: Jiri Slaby <jslaby@suse.cz>
+Cc: Davidlohr Bueso <dave@stgolabs.net>
+Cc: Manfred Spraul <manfred@colorfullife.com>
+Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
+Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ ipc/msg.c | 5 ++++-
+ 1 file changed, 4 insertions(+), 1 deletion(-)
+
+--- a/ipc/msg.c
++++ b/ipc/msg.c
+@@ -742,7 +742,10 @@ static inline int convert_mode(long *msg
+ if (*msgtyp == 0)
+ return SEARCH_ANY;
+ if (*msgtyp < 0) {
+- *msgtyp = -*msgtyp;
++ if (*msgtyp == LONG_MIN) /* -LONG_MIN is undefined */
++ *msgtyp = LONG_MAX;
++ else
++ *msgtyp = -*msgtyp;
+ return SEARCH_LESSEQUAL;
+ }
+ if (msgflg & MSG_EXCEPT)
--- /dev/null
+From 561b5e0709e4a248c67d024d4d94b6e31e3edf2f Mon Sep 17 00:00:00 2001
+From: Michal Hocko <mhocko@suse.com>
+Date: Mon, 10 Jul 2017 15:49:51 -0700
+Subject: mm/mmap.c: do not blow on PROT_NONE MAP_FIXED holes in the stack
+
+From: Michal Hocko <mhocko@suse.com>
+
+commit 561b5e0709e4a248c67d024d4d94b6e31e3edf2f upstream.
+
+Commit 1be7107fbe18 ("mm: larger stack guard gap, between vmas") has
+introduced a regression in some Rust and Java environments which are
+trying to implement their own stack guard page. They are punching a new
+MAP_FIXED mapping inside the existing stack vma.
+
+This will confuse expand_{downwards,upwards} into thinking that the
+stack expansion would in fact get us too close to an existing non-stack
+vma, which is correct behavior with respect to safety. On the other
+hand, it is a real regression.
+
+Let's work around the problem by considering PROT_NONE mapping as a part
+of the stack. This is a gross hack but overflowing to such a mapping
+would trap anyway and we can only hope that userspace knows what it is
+doing and handles it properly.
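
The extra condition can be modelled in user space as a predicate over a neighbouring mapping. The struct, the ACC_* flag values and the counts_as_neighbour() name are hypothetical stand-ins for the vma fields and VM_READ/VM_WRITE/VM_EXEC, and the sketch covers only the downward-growth case:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical access flags standing in for VM_READ, VM_WRITE and VM_EXEC. */
#define ACC_READ  0x1UL
#define ACC_WRITE 0x2UL
#define ACC_EXEC  0x4UL

/* Hypothetical, minimal stand-in for a mapping. */
struct region {
	unsigned long start;
	unsigned long end;
	unsigned long flags;
};

/*
 * A preceding mapping only restricts downward stack growth when it
 * reaches into the guard gap *and* is accessible in some way. A
 * PROT_NONE mapping (no access flags) is treated as part of the stack.
 */
static bool counts_as_neighbour(const struct region *prev,
				unsigned long gap_addr)
{
	return prev != NULL &&
	       prev->end > gap_addr &&
	       (prev->flags & (ACC_READ | ACC_WRITE | ACC_EXEC)) != 0;
}
```

A mapping with no access flags never counts, regardless of how close it sits to the stack, which matches the intent of letting user-defined PROT_NONE guard pages through.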
+
+Fixes: 1be7107fbe18 ("mm: larger stack guard gap, between vmas")
+Link: http://lkml.kernel.org/r/20170705182849.GA18027@dhcp22.suse.cz
+Signed-off-by: Michal Hocko <mhocko@suse.com>
+Debugged-by: Vlastimil Babka <vbabka@suse.cz>
+Cc: Ben Hutchings <ben@decadent.org.uk>
+Cc: Willy Tarreau <w@1wt.eu>
+Cc: Oleg Nesterov <oleg@redhat.com>
+Cc: Rik van Riel <riel@redhat.com>
+Cc: Hugh Dickins <hughd@google.com>
+Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
+Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ mm/mmap.c | 6 ++++--
+ 1 file changed, 4 insertions(+), 2 deletions(-)
+
+--- a/mm/mmap.c
++++ b/mm/mmap.c
+@@ -2188,7 +2188,8 @@ int expand_upwards(struct vm_area_struct
+ gap_addr = TASK_SIZE;
+
+ next = vma->vm_next;
+- if (next && next->vm_start < gap_addr) {
++ if (next && next->vm_start < gap_addr &&
++ (next->vm_flags & (VM_WRITE|VM_READ|VM_EXEC))) {
+ if (!(next->vm_flags & VM_GROWSUP))
+ return -ENOMEM;
+ /* Check that both stack segments have the same anon_vma? */
+@@ -2273,7 +2274,8 @@ int expand_downwards(struct vm_area_stru
+ if (gap_addr > address)
+ return -ENOMEM;
+ prev = vma->vm_prev;
+- if (prev && prev->vm_end > gap_addr) {
++ if (prev && prev->vm_end > gap_addr &&
++ (prev->vm_flags & (VM_WRITE|VM_READ|VM_EXEC))) {
+ if (!(prev->vm_flags & VM_GROWSDOWN))
+ return -ENOMEM;
+ /* Check that both stack segments have the same anon_vma? */
--- /dev/null
+From b050e3769c6b4013bb937e879fc43bf1847ee819 Mon Sep 17 00:00:00 2001
+From: Vlastimil Babka <vbabka@suse.cz>
+Date: Wed, 15 Nov 2017 17:38:30 -0800
+Subject: mm, page_alloc: fix potential false positive in __zone_watermark_ok
+
+From: Vlastimil Babka <vbabka@suse.cz>
+
+commit b050e3769c6b4013bb937e879fc43bf1847ee819 upstream.
+
+Since commit 97a16fc82a7c ("mm, page_alloc: only enforce watermarks for
+order-0 allocations"), __zone_watermark_ok() check for high-order
+allocations will shortcut per-migratetype free list checks for
+ALLOC_HARDER allocations, and return true as long as there's free page
+of any migratetype. The intention is that ALLOC_HARDER can allocate
+from MIGRATE_HIGHATOMIC free lists, while normal allocations can't.
+
+However, as a side effect, the watermark check will then also return
+true when there are pages only on the MIGRATE_ISOLATE list, or (prior to
+CMA conversion to ZONE_MOVABLE) on the MIGRATE_CMA list. Since the
+allocation cannot actually obtain isolated pages, and might not be able
+to obtain CMA pages, this can result in a false positive.
+
+The condition should be rare and perhaps the outcome is not a fatal one.
+Still, it's better if the watermark check is correct. There also
+shouldn't be a performance tradeoff here.
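
The corrected per-order check can be sketched in user space like this. The enum, the struct and the area_usable() name are hypothetical and only model the shape of the loop body (CMA handling is omitted), with per-type counters standing in for the free lists:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical migratetype indices modelled on the layout described above. */
enum {
	MT_UNMOVABLE,
	MT_MOVABLE,
	MT_RECLAIMABLE,
	MT_PCPTYPES,			/* number of "ordinary" types */
	MT_HIGHATOMIC = MT_PCPTYPES,
	MT_ISOLATE,
	MT_NR
};

/* Hypothetical stand-in for one order's free area. */
struct area {
	unsigned long nr_free;
	unsigned long free[MT_NR];	/* pages free per migratetype */
};

/*
 * An order is usable if an ordinary migratetype has free pages, or if
 * the allocation may dip into the high-atomic reserve and that reserve
 * is non-empty. Isolated pages never satisfy the check.
 */
static bool area_usable(const struct area *a, bool alloc_harder)
{
	int mt;

	if (!a->nr_free)
		return false;

	for (mt = 0; mt < MT_PCPTYPES; mt++) {
		if (a->free[mt])
			return true;
	}

	return alloc_harder && a->free[MT_HIGHATOMIC] != 0;
}
```

With the old shortcut, an area whose only free pages are isolated would have passed for an ALLOC_HARDER request; in this sketch it does not.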
+
+Link: http://lkml.kernel.org/r/20171102125001.23708-1-vbabka@suse.cz
+Fixes: 97a16fc82a7c ("mm, page_alloc: only enforce watermarks for order-0 allocations")
+Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
+Acked-by: Mel Gorman <mgorman@techsingularity.net>
+Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
+Cc: Rik van Riel <riel@redhat.com>
+Cc: David Rientjes <rientjes@google.com>
+Cc: Johannes Weiner <hannes@cmpxchg.org>
+Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
+Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ mm/page_alloc.c | 6 +++---
+ 1 file changed, 3 insertions(+), 3 deletions(-)
+
+--- a/mm/page_alloc.c
++++ b/mm/page_alloc.c
+@@ -2468,9 +2468,6 @@ static bool __zone_watermark_ok(struct z
+ if (!area->nr_free)
+ continue;
+
+- if (alloc_harder)
+- return true;
+-
+ for (mt = 0; mt < MIGRATE_PCPTYPES; mt++) {
+ if (!list_empty(&area->free_list[mt]))
+ return true;
+@@ -2482,6 +2479,9 @@ static bool __zone_watermark_ok(struct z
+ return true;
+ }
+ #endif
++ if (alloc_harder &&
++ !list_empty(&area->free_list[MIGRATE_HIGHATOMIC]))
++ return true;
+ }
+ return false;
+ }
--- /dev/null
+From 17a49cd549d9dc8707dc9262210166455c612dde Mon Sep 17 00:00:00 2001
+From: Hongxu Jia <hongxu.jia@windriver.com>
+Date: Tue, 29 Nov 2016 21:56:26 -0500
+Subject: netfilter: arp_tables: fix invoking 32bit "iptable -P INPUT ACCEPT" failed in 64bit kernel
+
+From: Hongxu Jia <hongxu.jia@windriver.com>
+
+commit 17a49cd549d9dc8707dc9262210166455c612dde upstream.
+
+Since 09d9686047db ("netfilter: x_tables: do compat validation via
+translate_table"), the compatr structure is used to fill in the newinfo
+structure. In translate_compat_table() of ip_tables.c and ip6_tables.c,
+compatr->hook_entry replaced info->hook_entry and compatr->underflow
+replaced info->underflow, but the same replacement was not made in
+arp_tables.c.
+
+As a result, invoking a 32-bit "arptables -P INPUT ACCEPT" fails on a
+64-bit kernel.
+--------------------------------------
+root@qemux86-64:~# arptables -P INPUT ACCEPT
+root@qemux86-64:~# arptables -P INPUT ACCEPT
+ERROR: Policy for `INPUT' offset 448 != underflow 0
+arptables: Incompatible with this kernel
+--------------------------------------
+
+Fixes: 09d9686047db ("netfilter: x_tables: do compat validation via translate_table")
+Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com>
+Acked-by: Florian Westphal <fw@strlen.de>
+Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
+Acked-by: Michal Kubecek <mkubecek@suse.cz>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ net/ipv4/netfilter/arp_tables.c | 4 ++--
+ 1 file changed, 2 insertions(+), 2 deletions(-)
+
+--- a/net/ipv4/netfilter/arp_tables.c
++++ b/net/ipv4/netfilter/arp_tables.c
+@@ -1339,8 +1339,8 @@ static int translate_compat_table(struct
+
+ newinfo->number = compatr->num_entries;
+ for (i = 0; i < NF_ARP_NUMHOOKS; i++) {
+- newinfo->hook_entry[i] = info->hook_entry[i];
+- newinfo->underflow[i] = info->underflow[i];
++ newinfo->hook_entry[i] = compatr->hook_entry[i];
++ newinfo->underflow[i] = compatr->underflow[i];
+ }
+ entry1 = newinfo->entries;
+ pos = entry1;
--- /dev/null
+From 92b4423e3a0bc5d43ecde4bcad871f8b5ba04efd Mon Sep 17 00:00:00 2001
+From: Pablo Neira Ayuso <pablo@netfilter.org>
+Date: Fri, 29 Apr 2016 10:39:34 +0200
+Subject: netfilter: fix IS_ERR_VALUE usage
+
+From: Pablo Neira Ayuso <pablo@netfilter.org>
+
+commit 92b4423e3a0bc5d43ecde4bcad871f8b5ba04efd upstream.
+
+This is a forward-port of the original patch from Andrzej Hajda,
+he said:
+
+"IS_ERR_VALUE should be used only with unsigned long type.
+Otherwise it can work incorrectly. To achieve this function
+xt_percpu_counter_alloc is modified to return unsigned long,
+and its result is assigned to temporary variable to perform
+error checking, before assigning to .pcnt field.
+
+The patch follows conclusion from discussion on LKML [1][2].
+
+[1]: http://permalink.gmane.org/gmane.linux.kernel/2120927
+[2]: http://permalink.gmane.org/gmane.linux.kernel/2150581"
+
+Original patch from Andrzej is here:
+
+http://patchwork.ozlabs.org/patch/582970/
+
+This patch has clashed with input validation fixes for x_tables.
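
The general hazard behind the "unsigned long only" rule can be shown with a user-space model. This is a sketch under stated assumptions, not the kernel macro: is_err_value() is a hypothetical function that performs the range test on a 64-bit word (standing in for unsigned long on a 64-bit kernel), and it illustrates the class of bug rather than this exact call site:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/*
 * Hypothetical model of the error-range test on a 64-bit word: the top
 * MAX_ERRNO values of the word are treated as encoded negative errnos.
 */
static bool is_err_value(uint64_t x)
{
	return x >= (uint64_t)-MAX_ERRNO;
}

/*
 * Encode a negative errno in a type narrower than the word, then widen
 * it. The widening zero-extends, so the value leaves the error range.
 */
static uint64_t widen_narrow_error(int err)
{
	uint32_t narrow = (uint32_t)err;

	return narrow;
}
```

An error encoded at full width is detected, while the same error passed through a narrower unsigned type is silently treated as a valid value. Checking the result in a temporary of the expected type before storing it, as the patch does with pcnt, avoids depending on the type of the destination field.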
+
+Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+Acked-by: Michal Kubecek <mkubecek@suse.cz>
+
+---
+ include/linux/netfilter/x_tables.h | 6 +++---
+ net/ipv4/netfilter/arp_tables.c | 6 ++++--
+ net/ipv4/netfilter/ip_tables.c | 6 ++++--
+ net/ipv6/netfilter/ip6_tables.c | 6 ++++--
+ 4 files changed, 15 insertions(+), 9 deletions(-)
+
+--- a/include/linux/netfilter/x_tables.h
++++ b/include/linux/netfilter/x_tables.h
+@@ -381,16 +381,16 @@ static inline unsigned long ifname_compa
+ * allows us to return 0 for single core systems without forcing
+ * callers to deal with SMP vs. NONSMP issues.
+ */
+-static inline u64 xt_percpu_counter_alloc(void)
++static inline unsigned long xt_percpu_counter_alloc(void)
+ {
+ if (nr_cpu_ids > 1) {
+ void __percpu *res = __alloc_percpu(sizeof(struct xt_counters),
+ sizeof(struct xt_counters));
+
+ if (res == NULL)
+- return (u64) -ENOMEM;
++ return -ENOMEM;
+
+- return (u64) (__force unsigned long) res;
++ return (__force unsigned long) res;
+ }
+
+ return 0;
+--- a/net/ipv4/netfilter/arp_tables.c
++++ b/net/ipv4/netfilter/arp_tables.c
+@@ -511,11 +511,13 @@ find_check_entry(struct arpt_entry *e, c
+ {
+ struct xt_entry_target *t;
+ struct xt_target *target;
++ unsigned long pcnt;
+ int ret;
+
+- e->counters.pcnt = xt_percpu_counter_alloc();
+- if (IS_ERR_VALUE(e->counters.pcnt))
++ pcnt = xt_percpu_counter_alloc();
++ if (IS_ERR_VALUE(pcnt))
+ return -ENOMEM;
++ e->counters.pcnt = pcnt;
+
+ t = arpt_get_target(e);
+ target = xt_request_find_target(NFPROTO_ARP, t->u.user.name,
+--- a/net/ipv4/netfilter/ip_tables.c
++++ b/net/ipv4/netfilter/ip_tables.c
+@@ -653,10 +653,12 @@ find_check_entry(struct ipt_entry *e, st
+ unsigned int j;
+ struct xt_mtchk_param mtpar;
+ struct xt_entry_match *ematch;
++ unsigned long pcnt;
+
+- e->counters.pcnt = xt_percpu_counter_alloc();
+- if (IS_ERR_VALUE(e->counters.pcnt))
++ pcnt = xt_percpu_counter_alloc();
++ if (IS_ERR_VALUE(pcnt))
+ return -ENOMEM;
++ e->counters.pcnt = pcnt;
+
+ j = 0;
+ mtpar.net = net;
+--- a/net/ipv6/netfilter/ip6_tables.c
++++ b/net/ipv6/netfilter/ip6_tables.c
+@@ -666,10 +666,12 @@ find_check_entry(struct ip6t_entry *e, s
+ unsigned int j;
+ struct xt_mtchk_param mtpar;
+ struct xt_entry_match *ematch;
++ unsigned long pcnt;
+
+- e->counters.pcnt = xt_percpu_counter_alloc();
+- if (IS_ERR_VALUE(e->counters.pcnt))
++ pcnt = xt_percpu_counter_alloc();
++ if (IS_ERR_VALUE(pcnt))
+ return -ENOMEM;
++ e->counters.pcnt = pcnt;
+
+ j = 0;
+ mtpar.net = net;
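Note on the IS_ERR_VALUE patch above: the commit message says the macro should only be used with unsigned long. The following userspace sketch illustrates the general hazard when the checked value has a different width. It is an illustration under stated assumptions, not a reproduction of the netfilter code: `is_err_value()` is a simplified stand-in for the kernel macro, `uint64_t` models a 64-bit unsigned long, and both helper names are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified stand-in for IS_ERR_VALUE(); uint64_t models a 64-bit
 * unsigned long. The top MAX_ERRNO values are treated as error codes. */
static bool is_err_value(uint64_t x)
{
	return x >= (uint64_t)-MAX_ERRNO;
}

/* Hypothetical case: the error code is stored in a 32-bit unsigned
 * field before it is checked. */
static bool check_via_narrow_field(int err)
{
	uint32_t field = (uint32_t)err;	/* -ENOMEM becomes 0xfffffff4 */

	return is_err_value(field);	/* zero-extended, so the error is missed */
}

/* The error code is kept in a full-width temporary and checked there,
 * which mirrors the shape of the fix (check first, assign afterwards). */
static bool check_via_wide_temp(int err)
{
	uint64_t tmp = (uint64_t)(int64_t)err;	/* sign-extended */

	return is_err_value(tmp);
}
```

With these definitions, `check_via_narrow_field(-ENOMEM)` reports no error while `check_via_wide_temp(-ENOMEM)` does, which is why the patch checks a local `unsigned long pcnt` before assigning it to `e->counters.pcnt`.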
--- /dev/null
+From 444f901742d054a4cd5ff045871eac5131646cfb Mon Sep 17 00:00:00 2001
+From: Ulrich Weber <ulrich.weber@riverbed.com>
+Date: Mon, 24 Oct 2016 18:07:23 +0200
+Subject: netfilter: nf_conntrack_sip: extend request line validation
+
+From: Ulrich Weber <ulrich.weber@riverbed.com>
+
+commit 444f901742d054a4cd5ff045871eac5131646cfb upstream.
+
+Extend the request line validation on SIP requests, so a fragmented TCP
+SIP packet from an Allow header starting with
+ INVITE,NOTIFY,OPTIONS,REFER,REGISTER,UPDATE,SUBSCRIBE
+ Content-Length: 0
+
+will not be interpreted as an INVITE request. Also, the Request-URI must
+start with an alphabetic character.
+
+This conforms to RFC 3261:
+ Request-Line = Method SP Request-URI SP SIP-Version CRLF
+
+Fixes: 30f33e6dee80 ("[NETFILTER]: nf_conntrack_sip: support method specific request/response handling")
+Signed-off-by: Ulrich Weber <ulrich.weber@riverbed.com>
+Acked-by: Marco Angaroni <marcoangaroni@gmail.com>
+Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
+Acked-by: Michal Kubecek <mkubecek@suse.cz>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ net/netfilter/nf_conntrack_sip.c | 5 ++++-
+ 1 file changed, 4 insertions(+), 1 deletion(-)
+
+--- a/net/netfilter/nf_conntrack_sip.c
++++ b/net/netfilter/nf_conntrack_sip.c
+@@ -1434,9 +1434,12 @@ static int process_sip_request(struct sk
+ handler = &sip_handlers[i];
+ if (handler->request == NULL)
+ continue;
+- if (*datalen < handler->len ||
++ if (*datalen < handler->len + 2 ||
+ strncasecmp(*dptr, handler->method, handler->len))
+ continue;
++ if ((*dptr)[handler->len] != ' ' ||
++ !isalpha((*dptr)[handler->len+1]))
++ continue;
+
+ if (ct_sip_get_header(ct, *dptr, 0, *datalen, SIP_HDR_CSEQ,
+ &matchoff, &matchlen) <= 0) {
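Note on the nf_conntrack_sip patch above: the added check can be sketched in userspace as follows. This is a minimal sketch, assuming a plain byte buffer instead of the kernel's `dptr`/`datalen` pair; the function name `sip_request_line_matches` is hypothetical.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Returns true if data starts with "<method> <alpha>", i.e. the method
 * is followed by a single space and a Request-URI that begins with an
 * alphabetic character. */
static bool sip_request_line_matches(const char *data, size_t datalen,
				     const char *method)
{
	size_t len = strlen(method);

	/* Need room for the method, the space and one URI character. */
	if (datalen < len + 2)
		return false;
	if (strncasecmp(data, method, len) != 0)
		return false;
	return data[len] == ' ' && isalpha((unsigned char)data[len + 1]);
}
```

A fragment that begins with `INVITE,NOTIFY,...` matches the method name but fails the space test, so it is no longer treated as a request line.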
--- /dev/null
+From b173a28f62cf929324a8a6adcc45adadce311d16 Mon Sep 17 00:00:00 2001
+From: Liping Zhang <liping.zhang@spreadtrum.com>
+Date: Mon, 8 Aug 2016 21:57:58 +0800
+Subject: netfilter: nf_ct_expect: remove the redundant slash when policy name is empty
+
+From: Liping Zhang <liping.zhang@spreadtrum.com>
+
+commit b173a28f62cf929324a8a6adcc45adadce311d16 upstream.
+
+The 'name' field in struct nf_conntrack_expect_policy{} is not a
+pointer, so a check for non-NULL on it is always true. Even if the
+name is empty, the slash is always displayed, as follows:
+ # cat /proc/net/nf_conntrack_expect
+ 297 l3proto = 2 proto=6 src=1.1.1.1 dst=2.2.2.2 sport=1 dport=1025 ftp/
+ ^
+
+Fixes: 3a8fc53a45c4 ("netfilter: nf_ct_helper: allocate 16 bytes for the helper and policy names")
+Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
+Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
+Acked-by: Michal Kubecek <mkubecek@suse.cz>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ net/netfilter/nf_conntrack_expect.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/net/netfilter/nf_conntrack_expect.c
++++ b/net/netfilter/nf_conntrack_expect.c
+@@ -560,7 +560,7 @@ static int exp_seq_show(struct seq_file
+ helper = rcu_dereference(nfct_help(expect->master)->helper);
+ if (helper) {
+ seq_printf(s, "%s%s", expect->flags ? " " : "", helper->name);
+- if (helper->expect_policy[expect->class].name)
++ if (helper->expect_policy[expect->class].name[0])
+ seq_printf(s, "/%s",
+ helper->expect_policy[expect->class].name);
+ }
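Note on the nf_ct_expect patch above: the difference between testing an embedded array and testing its first byte can be sketched in userspace as follows. This is a minimal sketch; the struct, the `NAME_LEN` constant and `format_helper` are hypothetical stand-ins, with the 16-byte size taken from the subject of the commit named in the Fixes tag.

```c
#include <stddef.h>
#include <stdio.h>

#define NAME_LEN 16	/* assumed size for this sketch */

struct expect_policy {
	char name[NAME_LEN];	/* embedded array, never a NULL pointer */
};

/* Writes "helper" or "helper/policy" into buf and returns the value
 * returned by snprintf(). */
static int format_helper(char *buf, size_t len, const char *helper,
			 const struct expect_policy *p)
{
	/* Testing p->name itself would always be true, because an array
	 * decays to a non-NULL address. Testing the first byte detects
	 * an empty name. */
	if (p->name[0])
		return snprintf(buf, len, "%s/%s", helper, p->name);
	return snprintf(buf, len, "%s", helper);
}
```

With an empty policy name the output is just the helper name, without the trailing slash shown in the commit message.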
--- /dev/null
+From 83170f3beccccd7ceb4f9a0ac0c4dc736afde90c Mon Sep 17 00:00:00 2001
+From: Paolo Abeni <pabeni@redhat.com>
+Date: Thu, 26 May 2016 19:08:10 +0200
+Subject: netfilter: nf_dup_ipv6: set again FLOWI_FLAG_KNOWN_NH at flowi6_flags
+
+From: Paolo Abeni <pabeni@redhat.com>
+
+commit 83170f3beccccd7ceb4f9a0ac0c4dc736afde90c upstream.
+
+With the commit 48e8aa6e3137 ("ipv6: Set FLOWI_FLAG_KNOWN_NH at
+flowi6_flags") ip6_pol_route() callers were asked to set the
+FLOWI_FLAG_KNOWN_NH properly and xt_TEE was updated accordingly,
+but with the later refactor in commit bbde9fc1824a ("netfilter:
+factor out packet duplication for IPv4/IPv6") the flowi6_flags
+update was lost.
+This commit re-adds it just before the routing decision.
+
+Fixes: bbde9fc1824a ("netfilter: factor out packet duplication for IPv4/IPv6")
+Signed-off-by: Paolo Abeni <pabeni@redhat.com>
+Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
+Acked-by: Michal Kubecek <mkubecek@suse.cz>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ net/ipv6/netfilter/nf_dup_ipv6.c | 1 +
+ 1 file changed, 1 insertion(+)
+
+--- a/net/ipv6/netfilter/nf_dup_ipv6.c
++++ b/net/ipv6/netfilter/nf_dup_ipv6.c
+@@ -33,6 +33,7 @@ static bool nf_dup_ipv6_route(struct net
+ fl6.daddr = *gw;
+ fl6.flowlabel = (__force __be32)(((iph->flow_lbl[0] & 0xF) << 16) |
+ (iph->flow_lbl[1] << 8) | iph->flow_lbl[2]);
++ fl6.flowi6_flags = FLOWI_FLAG_KNOWN_NH;
+ dst = ip6_route_output(net, NULL, &fl6);
+ if (dst->error) {
+ dst_release(dst);
--- /dev/null
+From 4b380c42f7d00a395feede754f0bc2292eebe6e5 Mon Sep 17 00:00:00 2001
+From: Kevin Cernekee <cernekee@chromium.org>
+Date: Sun, 3 Dec 2017 12:12:45 -0800
+Subject: netfilter: nfnetlink_cthelper: Add missing permission checks
+
+From: Kevin Cernekee <cernekee@chromium.org>
+
+commit 4b380c42f7d00a395feede754f0bc2292eebe6e5 upstream.
+
+The capability check in nfnetlink_rcv() verifies that the caller
+has CAP_NET_ADMIN in the namespace that "owns" the netlink socket.
+However, nfnl_cthelper_list is shared by all net namespaces on the
+system. An unprivileged user can create user and net namespaces
+in which he holds CAP_NET_ADMIN to bypass the netlink_net_capable()
+check:
+
+ $ nfct helper list
+ nfct v1.4.4: netlink error: Operation not permitted
+ $ vpnns -- nfct helper list
+ {
+ .name = ftp,
+ .queuenum = 0,
+ .l3protonum = 2,
+ .l4protonum = 6,
+ .priv_data_len = 24,
+ .status = enabled,
+ };
+
+Add capable() checks in nfnetlink_cthelper, as this is cleaner than
+trying to generalize the solution.
+
+Signed-off-by: Kevin Cernekee <cernekee@chromium.org>
+Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
+Acked-by: Michal Kubecek <mkubecek@suse.cz>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ net/netfilter/nfnetlink_cthelper.c | 10 ++++++++++
+ 1 file changed, 10 insertions(+)
+
+--- a/net/netfilter/nfnetlink_cthelper.c
++++ b/net/netfilter/nfnetlink_cthelper.c
+@@ -17,6 +17,7 @@
+ #include <linux/types.h>
+ #include <linux/list.h>
+ #include <linux/errno.h>
++#include <linux/capability.h>
+ #include <net/netlink.h>
+ #include <net/sock.h>
+
+@@ -392,6 +393,9 @@ nfnl_cthelper_new(struct sock *nfnl, str
+ struct nfnl_cthelper *nlcth;
+ int ret = 0;
+
++ if (!capable(CAP_NET_ADMIN))
++ return -EPERM;
++
+ if (!tb[NFCTH_NAME] || !tb[NFCTH_TUPLE])
+ return -EINVAL;
+
+@@ -595,6 +599,9 @@ nfnl_cthelper_get(struct sock *nfnl, str
+ struct nfnl_cthelper *nlcth;
+ bool tuple_set = false;
+
++ if (!capable(CAP_NET_ADMIN))
++ return -EPERM;
++
+ if (nlh->nlmsg_flags & NLM_F_DUMP) {
+ struct netlink_dump_control c = {
+ .dump = nfnl_cthelper_dump_table,
+@@ -661,6 +668,9 @@ nfnl_cthelper_del(struct sock *nfnl, str
+ struct nfnl_cthelper *nlcth, *n;
+ int j = 0, ret;
+
++ if (!capable(CAP_NET_ADMIN))
++ return -EPERM;
++
+ if (tb[NFCTH_NAME])
+ helper_name = nla_data(tb[NFCTH_NAME]);
+
--- /dev/null
+From 00a3101f561816e58de054a470484996f78eb5eb Mon Sep 17 00:00:00 2001
+From: Liping Zhang <liping.zhang@spreadtrum.com>
+Date: Mon, 8 Aug 2016 22:07:27 +0800
+Subject: netfilter: nfnetlink_queue: reject verdict request from different portid
+
+From: Liping Zhang <liping.zhang@spreadtrum.com>
+
+commit 00a3101f561816e58de054a470484996f78eb5eb upstream.
+
+As NFQNL_MSG_VERDICT_BATCH does, we should also reject the verdict
+request when the portid is not the same as the initial portid (it may
+come from another process).
+
+Fixes: 97d32cf9440d ("netfilter: nfnetlink_queue: batch verdict support")
+Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
+Reviewed-by: Florian Westphal <fw@strlen.de>
+Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
+Acked-by: Michal Kubecek <mkubecek@suse.cz>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ net/netfilter/nfnetlink_queue.c | 6 ++----
+ 1 file changed, 2 insertions(+), 4 deletions(-)
+
+--- a/net/netfilter/nfnetlink_queue.c
++++ b/net/netfilter/nfnetlink_queue.c
+@@ -1053,10 +1053,8 @@ nfqnl_recv_verdict(struct sock *ctnl, st
+ struct net *net = sock_net(ctnl);
+ struct nfnl_queue_net *q = nfnl_queue_pernet(net);
+
+- queue = instance_lookup(q, queue_num);
+- if (!queue)
+- queue = verdict_instance_lookup(q, queue_num,
+- NETLINK_CB(skb).portid);
++ queue = verdict_instance_lookup(q, queue_num,
++ NETLINK_CB(skb).portid);
+ if (IS_ERR(queue))
+ return PTR_ERR(queue);
+
--- /dev/null
+From 95a8d19f28e6b29377a880c6264391a62e07fccc Mon Sep 17 00:00:00 2001
+From: Florian Westphal <fw@strlen.de>
+Date: Thu, 25 Aug 2016 15:33:29 +0200
+Subject: netfilter: restart search if moved to other chain
+
+From: Florian Westphal <fw@strlen.de>
+
+commit 95a8d19f28e6b29377a880c6264391a62e07fccc upstream.
+
+In case nf_conntrack_tuple_taken did not find a conflicting entry
+check that all entries in this hash slot were tested and restart
+in case an entry was moved to another chain.
+
+Reported-by: Eric Dumazet <edumazet@google.com>
+Fixes: ea781f197d6a ("netfilter: nf_conntrack: use SLAB_DESTROY_BY_RCU and get rid of call_rcu()")
+Signed-off-by: Florian Westphal <fw@strlen.de>
+Acked-by: Eric Dumazet <edumazet@google.com>
+Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
+Acked-by: Michal Kubecek <mkubecek@suse.cz>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ net/netfilter/nf_conntrack_core.c | 7 +++++++
+ 1 file changed, 7 insertions(+)
+
+--- a/net/netfilter/nf_conntrack_core.c
++++ b/net/netfilter/nf_conntrack_core.c
+@@ -719,6 +719,7 @@ nf_conntrack_tuple_taken(const struct nf
+ * least once for the stats anyway.
+ */
+ rcu_read_lock_bh();
++ begin:
+ hlist_nulls_for_each_entry_rcu(h, n, &net->ct.hash[hash], hnnode) {
+ ct = nf_ct_tuplehash_to_ctrack(h);
+ if (ct != ignored_conntrack &&
+@@ -730,6 +731,12 @@ nf_conntrack_tuple_taken(const struct nf
+ }
+ NF_CT_STAT_INC(net, searched);
+ }
++
++ if (get_nulls_value(n) != hash) {
++ NF_CT_STAT_INC(net, search_restart);
++ goto begin;
++ }
++
+ rcu_read_unlock_bh();
+
+ return 0;
--- /dev/null
+From cc31d43b4154ad5a7d8aa5543255a93b7e89edc2 Mon Sep 17 00:00:00 2001
+From: Pau Espin Pedrol <pau.espin@tessares.net>
+Date: Fri, 6 Jan 2017 20:33:27 +0100
+Subject: netfilter: use fwmark_reflect in nf_send_reset
+
+From: Pau Espin Pedrol <pau.espin@tessares.net>
+
+commit cc31d43b4154ad5a7d8aa5543255a93b7e89edc2 upstream.
+
+Otherwise, RST packets generated by ipt_REJECT always have mark 0 when
+the routing is checked later in the same code path.
+
+Fixes: e110861f8609 ("net: add a sysctl to reflect the fwmark on replies")
+Cc: Lorenzo Colitti <lorenzo@google.com>
+Signed-off-by: Pau Espin Pedrol <pau.espin@tessares.net>
+Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
+Acked-by: Michal Kubecek <mkubecek@suse.cz>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+
+---
+ net/ipv4/netfilter/nf_reject_ipv4.c | 2 ++
+ net/ipv6/netfilter/nf_reject_ipv6.c | 3 +++
+ 2 files changed, 5 insertions(+)
+
+--- a/net/ipv4/netfilter/nf_reject_ipv4.c
++++ b/net/ipv4/netfilter/nf_reject_ipv4.c
+@@ -124,6 +124,8 @@ void nf_send_reset(struct net *net, stru
+ /* ip_route_me_harder expects skb->dst to be set */
+ skb_dst_set_noref(nskb, skb_dst(oldskb));
+
++ nskb->mark = IP4_REPLY_MARK(net, oldskb->mark);
++
+ skb_reserve(nskb, LL_MAX_HEADER);
+ niph = nf_reject_iphdr_put(nskb, oldskb, IPPROTO_TCP,
+ ip4_dst_hoplimit(skb_dst(nskb)));
+--- a/net/ipv6/netfilter/nf_reject_ipv6.c
++++ b/net/ipv6/netfilter/nf_reject_ipv6.c
+@@ -157,6 +157,7 @@ void nf_send_reset6(struct net *net, str
+ fl6.daddr = oip6h->saddr;
+ fl6.fl6_sport = otcph->dest;
+ fl6.fl6_dport = otcph->source;
++ fl6.flowi6_mark = IP6_REPLY_MARK(net, oldskb->mark);
+ security_skb_classify_flow(oldskb, flowi6_to_flowi(&fl6));
+ dst = ip6_route_output(net, NULL, &fl6);
+ if (dst == NULL || dst->error) {
+@@ -180,6 +181,8 @@ void nf_send_reset6(struct net *net, str
+
+ skb_dst_set(nskb, dst);
+
++ nskb->mark = fl6.flowi6_mark;
++
+ skb_reserve(nskb, hh_len + dst->header_len);
+ ip6h = nf_reject_ip6hdr_put(nskb, oldskb, IPPROTO_TCP,
+ ip6_dst_hoplimit(dst));
--- /dev/null
+From f4dc77713f8016d2e8a3295e1c9c53a21f296def Mon Sep 17 00:00:00 2001
+From: Florian Westphal <fw@strlen.de>
+Date: Thu, 14 Jul 2016 17:51:26 +0200
+Subject: netfilter: x_tables: speed up jump target validation
+
+From: Florian Westphal <fw@strlen.de>
+
+commit f4dc77713f8016d2e8a3295e1c9c53a21f296def upstream.
+
+The dummy ruleset I used to test the original validation change was broken,
+most rules were unreachable and were not tested by mark_source_chains().
+
+In some cases rulesets that used to load in a few seconds now require
+several minutes.
+
+sample ruleset that shows the behaviour:
+
+echo "*filter"
+for i in $(seq 0 100000);do
+ printf ":chain_%06x - [0:0]\n" $i
+done
+for i in $(seq 0 100000);do
+ printf -- "-A INPUT -j chain_%06x\n" $i
+ printf -- "-A INPUT -j chain_%06x\n" $i
+ printf -- "-A INPUT -j chain_%06x\n" $i
+done
+echo COMMIT
+
+[ pipe result into iptables-restore ]
+
+This ruleset will be about 74 MB in size, with ~500k searches
+through all 500k[1] rule entries. iptables-restore will take forever
+(I gave up after 10 minutes).
+
+Instead of always searching the entire blob for a match, fill an
+array with the start offsets of every single ipt_entry struct,
+then do a binary search to check if the jump target is present or not.
+
+After this change, ruleset restore times are again close to what one
+gets when reverting 36472341017529e (~3 seconds on my workstation).
+
+[1] every user-defined chain gets an implicit RETURN, so we get
+300k jumps + 100k userchains + 100k returns -> 500k rule entries
+
+Fixes: 36472341017529e ("netfilter: x_tables: validate targets of jumps")
+Reported-by: Jeff Wu <wujiafu@gmail.com>
+Tested-by: Jeff Wu <wujiafu@gmail.com>
+Signed-off-by: Florian Westphal <fw@strlen.de>
+Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
+Acked-by: Michal Kubecek <mkubecek@suse.cz>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ include/linux/netfilter/x_tables.h | 4 ++
+ net/ipv4/netfilter/arp_tables.c | 47 ++++++++++++++++++----------------
+ net/ipv4/netfilter/ip_tables.c | 45 +++++++++++++++++----------------
+ net/ipv6/netfilter/ip6_tables.c | 45 +++++++++++++++++----------------
+ net/netfilter/x_tables.c | 50 +++++++++++++++++++++++++++++++++++++
+ 5 files changed, 127 insertions(+), 64 deletions(-)
+
+--- a/include/linux/netfilter/x_tables.h
++++ b/include/linux/netfilter/x_tables.h
+@@ -243,6 +243,10 @@ int xt_check_entry_offsets(const void *b
+ unsigned int target_offset,
+ unsigned int next_offset);
+
++unsigned int *xt_alloc_entry_offsets(unsigned int size);
++bool xt_find_jump_offset(const unsigned int *offsets,
++ unsigned int target, unsigned int size);
++
+ int xt_check_match(struct xt_mtchk_param *, unsigned int size, u_int8_t proto,
+ bool inv_proto);
+ int xt_check_target(struct xt_tgchk_param *, unsigned int size, u_int8_t proto,
+--- a/net/ipv4/netfilter/arp_tables.c
++++ b/net/ipv4/netfilter/arp_tables.c
+@@ -367,23 +367,12 @@ static inline bool unconditional(const s
+ memcmp(&e->arp, &uncond, sizeof(uncond)) == 0;
+ }
+
+-static bool find_jump_target(const struct xt_table_info *t,
+- const struct arpt_entry *target)
+-{
+- struct arpt_entry *iter;
+-
+- xt_entry_foreach(iter, t->entries, t->size) {
+- if (iter == target)
+- return true;
+- }
+- return false;
+-}
+-
+ /* Figures out from what hook each rule can be called: returns 0 if
+ * there are loops. Puts hook bitmask in comefrom.
+ */
+ static int mark_source_chains(const struct xt_table_info *newinfo,
+- unsigned int valid_hooks, void *entry0)
++ unsigned int valid_hooks, void *entry0,
++ unsigned int *offsets)
+ {
+ unsigned int hook;
+
+@@ -472,10 +461,11 @@ static int mark_source_chains(const stru
+ /* This a jump; chase it. */
+ duprintf("Jump rule %u -> %u\n",
+ pos, newpos);
++ if (!xt_find_jump_offset(offsets, newpos,
++ newinfo->number))
++ return 0;
+ e = (struct arpt_entry *)
+ (entry0 + newpos);
+- if (!find_jump_target(newinfo, e))
+- return 0;
+ } else {
+ /* ... this is a fallthru */
+ newpos = pos + e->next_offset;
+@@ -642,6 +632,7 @@ static int translate_table(struct xt_tab
+ const struct arpt_replace *repl)
+ {
+ struct arpt_entry *iter;
++ unsigned int *offsets;
+ unsigned int i;
+ int ret = 0;
+
+@@ -655,6 +646,9 @@ static int translate_table(struct xt_tab
+ }
+
+ duprintf("translate_table: size %u\n", newinfo->size);
++ offsets = xt_alloc_entry_offsets(newinfo->number);
++ if (!offsets)
++ return -ENOMEM;
+ i = 0;
+
+ /* Walk through entries, checking offsets. */
+@@ -665,7 +659,9 @@ static int translate_table(struct xt_tab
+ repl->underflow,
+ repl->valid_hooks);
+ if (ret != 0)
+- break;
++ goto out_free;
++ if (i < repl->num_entries)
++ offsets[i] = (void *)iter - entry0;
+ ++i;
+ if (strcmp(arpt_get_target(iter)->u.user.name,
+ XT_ERROR_TARGET) == 0)
+@@ -673,12 +669,13 @@ static int translate_table(struct xt_tab
+ }
+ duprintf("translate_table: ARPT_ENTRY_ITERATE gives %d\n", ret);
+ if (ret != 0)
+- return ret;
++ goto out_free;
+
++ ret = -EINVAL;
+ if (i != repl->num_entries) {
+ duprintf("translate_table: %u not %u entries\n",
+ i, repl->num_entries);
+- return -EINVAL;
++ goto out_free;
+ }
+
+ /* Check hooks all assigned */
+@@ -689,17 +686,20 @@ static int translate_table(struct xt_tab
+ if (newinfo->hook_entry[i] == 0xFFFFFFFF) {
+ duprintf("Invalid hook entry %u %u\n",
+ i, repl->hook_entry[i]);
+- return -EINVAL;
++ goto out_free;
+ }
+ if (newinfo->underflow[i] == 0xFFFFFFFF) {
+ duprintf("Invalid underflow %u %u\n",
+ i, repl->underflow[i]);
+- return -EINVAL;
++ goto out_free;
+ }
+ }
+
+- if (!mark_source_chains(newinfo, repl->valid_hooks, entry0))
+- return -ELOOP;
++ if (!mark_source_chains(newinfo, repl->valid_hooks, entry0, offsets)) {
++ ret = -ELOOP;
++ goto out_free;
++ }
++ kvfree(offsets);
+
+ /* Finally, each sanity check must pass */
+ i = 0;
+@@ -720,6 +720,9 @@ static int translate_table(struct xt_tab
+ }
+
+ return ret;
++ out_free:
++ kvfree(offsets);
++ return ret;
+ }
+
+ static void get_counters(const struct xt_table_info *t,
+--- a/net/ipv4/netfilter/ip_tables.c
++++ b/net/ipv4/netfilter/ip_tables.c
+@@ -443,23 +443,12 @@ ipt_do_table(struct sk_buff *skb,
+ #endif
+ }
+
+-static bool find_jump_target(const struct xt_table_info *t,
+- const struct ipt_entry *target)
+-{
+- struct ipt_entry *iter;
+-
+- xt_entry_foreach(iter, t->entries, t->size) {
+- if (iter == target)
+- return true;
+- }
+- return false;
+-}
+-
+ /* Figures out from what hook each rule can be called: returns 0 if
+ there are loops. Puts hook bitmask in comefrom. */
+ static int
+ mark_source_chains(const struct xt_table_info *newinfo,
+- unsigned int valid_hooks, void *entry0)
++ unsigned int valid_hooks, void *entry0,
++ unsigned int *offsets)
+ {
+ unsigned int hook;
+
+@@ -552,10 +541,11 @@ mark_source_chains(const struct xt_table
+ /* This a jump; chase it. */
+ duprintf("Jump rule %u -> %u\n",
+ pos, newpos);
++ if (!xt_find_jump_offset(offsets, newpos,
++ newinfo->number))
++ return 0;
+ e = (struct ipt_entry *)
+ (entry0 + newpos);
+- if (!find_jump_target(newinfo, e))
+- return 0;
+ } else {
+ /* ... this is a fallthru */
+ newpos = pos + e->next_offset;
+@@ -811,6 +801,7 @@ translate_table(struct net *net, struct
+ const struct ipt_replace *repl)
+ {
+ struct ipt_entry *iter;
++ unsigned int *offsets;
+ unsigned int i;
+ int ret = 0;
+
+@@ -824,6 +815,9 @@ translate_table(struct net *net, struct
+ }
+
+ duprintf("translate_table: size %u\n", newinfo->size);
++ offsets = xt_alloc_entry_offsets(newinfo->number);
++ if (!offsets)
++ return -ENOMEM;
+ i = 0;
+ /* Walk through entries, checking offsets. */
+ xt_entry_foreach(iter, entry0, newinfo->size) {
+@@ -833,17 +827,20 @@ translate_table(struct net *net, struct
+ repl->underflow,
+ repl->valid_hooks);
+ if (ret != 0)
+- return ret;
++ goto out_free;
++ if (i < repl->num_entries)
++ offsets[i] = (void *)iter - entry0;
+ ++i;
+ if (strcmp(ipt_get_target(iter)->u.user.name,
+ XT_ERROR_TARGET) == 0)
+ ++newinfo->stacksize;
+ }
+
++ ret = -EINVAL;
+ if (i != repl->num_entries) {
+ duprintf("translate_table: %u not %u entries\n",
+ i, repl->num_entries);
+- return -EINVAL;
++ goto out_free;
+ }
+
+ /* Check hooks all assigned */
+@@ -854,17 +851,20 @@ translate_table(struct net *net, struct
+ if (newinfo->hook_entry[i] == 0xFFFFFFFF) {
+ duprintf("Invalid hook entry %u %u\n",
+ i, repl->hook_entry[i]);
+- return -EINVAL;
++ goto out_free;
+ }
+ if (newinfo->underflow[i] == 0xFFFFFFFF) {
+ duprintf("Invalid underflow %u %u\n",
+ i, repl->underflow[i]);
+- return -EINVAL;
++ goto out_free;
+ }
+ }
+
+- if (!mark_source_chains(newinfo, repl->valid_hooks, entry0))
+- return -ELOOP;
++ if (!mark_source_chains(newinfo, repl->valid_hooks, entry0, offsets)) {
++ ret = -ELOOP;
++ goto out_free;
++ }
++ kvfree(offsets);
+
+ /* Finally, each sanity check must pass */
+ i = 0;
+@@ -885,6 +885,9 @@ translate_table(struct net *net, struct
+ }
+
+ return ret;
++ out_free:
++ kvfree(offsets);
++ return ret;
+ }
+
+ static void
+--- a/net/ipv6/netfilter/ip6_tables.c
++++ b/net/ipv6/netfilter/ip6_tables.c
+@@ -455,23 +455,12 @@ ip6t_do_table(struct sk_buff *skb,
+ #endif
+ }
+
+-static bool find_jump_target(const struct xt_table_info *t,
+- const struct ip6t_entry *target)
+-{
+- struct ip6t_entry *iter;
+-
+- xt_entry_foreach(iter, t->entries, t->size) {
+- if (iter == target)
+- return true;
+- }
+- return false;
+-}
+-
+ /* Figures out from what hook each rule can be called: returns 0 if
+ there are loops. Puts hook bitmask in comefrom. */
+ static int
+ mark_source_chains(const struct xt_table_info *newinfo,
+- unsigned int valid_hooks, void *entry0)
++ unsigned int valid_hooks, void *entry0,
++ unsigned int *offsets)
+ {
+ unsigned int hook;
+
+@@ -564,10 +553,11 @@ mark_source_chains(const struct xt_table
+ /* This a jump; chase it. */
+ duprintf("Jump rule %u -> %u\n",
+ pos, newpos);
++ if (!xt_find_jump_offset(offsets, newpos,
++ newinfo->number))
++ return 0;
+ e = (struct ip6t_entry *)
+ (entry0 + newpos);
+- if (!find_jump_target(newinfo, e))
+- return 0;
+ } else {
+ /* ... this is a fallthru */
+ newpos = pos + e->next_offset;
+@@ -823,6 +813,7 @@ translate_table(struct net *net, struct
+ const struct ip6t_replace *repl)
+ {
+ struct ip6t_entry *iter;
++ unsigned int *offsets;
+ unsigned int i;
+ int ret = 0;
+
+@@ -836,6 +827,9 @@ translate_table(struct net *net, struct
+ }
+
+ duprintf("translate_table: size %u\n", newinfo->size);
++ offsets = xt_alloc_entry_offsets(newinfo->number);
++ if (!offsets)
++ return -ENOMEM;
+ i = 0;
+ /* Walk through entries, checking offsets. */
+ xt_entry_foreach(iter, entry0, newinfo->size) {
+@@ -845,17 +839,20 @@ translate_table(struct net *net, struct
+ repl->underflow,
+ repl->valid_hooks);
+ if (ret != 0)
+- return ret;
++ goto out_free;
++ if (i < repl->num_entries)
++ offsets[i] = (void *)iter - entry0;
+ ++i;
+ if (strcmp(ip6t_get_target(iter)->u.user.name,
+ XT_ERROR_TARGET) == 0)
+ ++newinfo->stacksize;
+ }
+
++ ret = -EINVAL;
+ if (i != repl->num_entries) {
+ duprintf("translate_table: %u not %u entries\n",
+ i, repl->num_entries);
+- return -EINVAL;
++ goto out_free;
+ }
+
+ /* Check hooks all assigned */
+@@ -866,17 +863,20 @@ translate_table(struct net *net, struct
+ if (newinfo->hook_entry[i] == 0xFFFFFFFF) {
+ duprintf("Invalid hook entry %u %u\n",
+ i, repl->hook_entry[i]);
+- return -EINVAL;
++ goto out_free;
+ }
+ if (newinfo->underflow[i] == 0xFFFFFFFF) {
+ duprintf("Invalid underflow %u %u\n",
+ i, repl->underflow[i]);
+- return -EINVAL;
++ goto out_free;
+ }
+ }
+
+- if (!mark_source_chains(newinfo, repl->valid_hooks, entry0))
+- return -ELOOP;
++ if (!mark_source_chains(newinfo, repl->valid_hooks, entry0, offsets)) {
++ ret = -ELOOP;
++ goto out_free;
++ }
++ kvfree(offsets);
+
+ /* Finally, each sanity check must pass */
+ i = 0;
+@@ -897,6 +897,9 @@ translate_table(struct net *net, struct
+ }
+
+ return ret;
++ out_free:
++ kvfree(offsets);
++ return ret;
+ }
+
+ static void
+--- a/net/netfilter/x_tables.c
++++ b/net/netfilter/x_tables.c
+@@ -701,6 +701,56 @@ int xt_check_entry_offsets(const void *b
+ }
+ EXPORT_SYMBOL(xt_check_entry_offsets);
+
++/**
++ * xt_alloc_entry_offsets - allocate array to store rule head offsets
++ *
++ * @size: number of entries
++ *
++ * Return: NULL or kmalloc'd or vmalloc'd array
++ */
++unsigned int *xt_alloc_entry_offsets(unsigned int size)
++{
++ unsigned int *off;
++
++ off = kcalloc(size, sizeof(unsigned int), GFP_KERNEL | __GFP_NOWARN);
++
++ if (off)
++ return off;
++
++ if (size < (SIZE_MAX / sizeof(unsigned int)))
++ off = vmalloc(size * sizeof(unsigned int));
++
++ return off;
++}
++EXPORT_SYMBOL(xt_alloc_entry_offsets);
++
++/**
++ * xt_find_jump_offset - check if target is a valid jump offset
++ *
++ * @offsets: array containing all valid rule start offsets of a rule blob
++ * @target: the jump target to search for
++ * @size: number of entries in @offsets
++ */
++bool xt_find_jump_offset(const unsigned int *offsets,
++ unsigned int target, unsigned int size)
++{
++ int m, low = 0, hi = size;
++
++ while (hi > low) {
++ m = (low + hi) / 2u;
++
++ if (offsets[m] > target)
++ hi = m;
++ else if (offsets[m] < target)
++ low = m + 1;
++ else
++ return true;
++ }
++
++ return false;
++}
++EXPORT_SYMBOL(xt_find_jump_offset);
++
+ int xt_check_target(struct xt_tgchk_param *par,
+ unsigned int size, u_int8_t proto, bool inv_proto)
+ {
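Note on the x_tables patch above: the lookup that replaces the linear `find_jump_target()` scan can be sketched in userspace as follows. This is a minimal sketch, assuming the offsets array is sorted in increasing order (which holds when it is filled while walking the rule blob from start to end); `find_jump_offset` is a hypothetical name for a userspace analogue of `xt_find_jump_offset()`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Binary search over a sorted array of rule start offsets. Returns
 * true only if target is exactly the start offset of some rule. */
static bool find_jump_offset(const unsigned int *offsets,
			     unsigned int target, size_t size)
{
	size_t low = 0, hi = size;

	while (hi > low) {
		size_t m = low + (hi - low) / 2;

		if (offsets[m] > target)
			hi = m;		/* target, if present, is left of m */
		else if (offsets[m] < target)
			low = m + 1;	/* target, if present, is right of m */
		else
			return true;
	}

	return false;
}
```

Each lookup costs O(log n) comparisons instead of a walk over the whole blob, which is where the restore-time improvement described in the commit message comes from. A jump that lands in the middle of a rule is rejected because its offset is not in the array.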
--- /dev/null
+From 916a27901de01446bcf57ecca4783f6cff493309 Mon Sep 17 00:00:00 2001
+From: Kevin Cernekee <cernekee@chromium.org>
+Date: Tue, 5 Dec 2017 15:42:41 -0800
+Subject: netfilter: xt_osf: Add missing permission checks
+
+From: Kevin Cernekee <cernekee@chromium.org>
+
+commit 916a27901de01446bcf57ecca4783f6cff493309 upstream.
+
+The capability check in nfnetlink_rcv() verifies that the caller
+has CAP_NET_ADMIN in the namespace that "owns" the netlink socket.
+However, xt_osf_fingers is shared by all net namespaces on the
+system. An unprivileged user can create user and net namespaces
+in which he holds CAP_NET_ADMIN to bypass the netlink_net_capable()
+check:
+
+ vpnns -- nfnl_osf -f /tmp/pf.os
+
+ vpnns -- nfnl_osf -f /tmp/pf.os -d
+
+These non-root operations successfully modify the systemwide OS
+fingerprint list. Add new capable() checks so that they can't.
+
+Signed-off-by: Kevin Cernekee <cernekee@chromium.org>
+Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
+Acked-by: Michal Kubecek <mkubecek@suse.cz>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ net/netfilter/xt_osf.c | 7 +++++++
+ 1 file changed, 7 insertions(+)
+
+--- a/net/netfilter/xt_osf.c
++++ b/net/netfilter/xt_osf.c
+@@ -19,6 +19,7 @@
+ #include <linux/module.h>
+ #include <linux/kernel.h>
+
++#include <linux/capability.h>
+ #include <linux/if.h>
+ #include <linux/inetdevice.h>
+ #include <linux/ip.h>
+@@ -69,6 +70,9 @@ static int xt_osf_add_callback(struct so
+ struct xt_osf_finger *kf = NULL, *sf;
+ int err = 0;
+
++ if (!capable(CAP_NET_ADMIN))
++ return -EPERM;
++
+ if (!osf_attrs[OSF_ATTR_FINGER])
+ return -EINVAL;
+
+@@ -112,6 +116,9 @@ static int xt_osf_remove_callback(struct
+ struct xt_osf_finger *sf;
+ int err = -ENOENT;
+
++ if (!capable(CAP_NET_ADMIN))
++ return -EPERM;
++
+ if (!osf_attrs[OSF_ATTR_FINGER])
+ return -EINVAL;
+
--- /dev/null
+From 6883cd7f68245e43e91e5ee583b7550abf14523f Mon Sep 17 00:00:00 2001
+From: Jan Kara <jack@suse.cz>
+Date: Thu, 22 Jun 2017 09:32:49 +0200
+Subject: reiserfs: Don't clear SGID when inheriting ACLs
+
+From: Jan Kara <jack@suse.cz>
+
+commit 6883cd7f68245e43e91e5ee583b7550abf14523f upstream.
+
+When a new directory 'DIR1' is created in a directory 'DIR0' with the SGID
+bit set, DIR1 is expected to have the SGID bit set (and its owning group
+equal to the owning group of 'DIR0'). However, when 'DIR0' also has some
+default ACLs that 'DIR1' inherits, setting these ACLs will result in the
+SGID bit on 'DIR1' getting cleared if the user is not a member of the
+owning group.
+
+Fix the problem by moving posix_acl_update_mode() out of
+__reiserfs_set_acl() into reiserfs_set_acl(). That way the function will
+not be called when inheriting ACLs which is what we want as it prevents
+SGID bit clearing and the mode has been properly set by
+posix_acl_create() anyway.
+
+Fixes: 073931017b49d9458aa351605b43a7e34598caef
+CC: reiserfs-devel@vger.kernel.org
+Signed-off-by: Jan Kara <jack@suse.cz>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+
+---
+ fs/reiserfs/xattr_acl.c | 12 +++++++-----
+ 1 file changed, 7 insertions(+), 5 deletions(-)
+
+--- a/fs/reiserfs/xattr_acl.c
++++ b/fs/reiserfs/xattr_acl.c
+@@ -37,7 +37,14 @@ reiserfs_set_acl(struct inode *inode, st
+ error = journal_begin(&th, inode->i_sb, jcreate_blocks);
+ reiserfs_write_unlock(inode->i_sb);
+ if (error == 0) {
++ if (type == ACL_TYPE_ACCESS && acl) {
++ error = posix_acl_update_mode(inode, &inode->i_mode,
++ &acl);
++ if (error)
++ goto unlock;
++ }
+ error = __reiserfs_set_acl(&th, inode, type, acl);
++unlock:
+ reiserfs_write_lock(inode->i_sb);
+ error2 = journal_end(&th);
+ reiserfs_write_unlock(inode->i_sb);
+@@ -245,11 +252,6 @@ __reiserfs_set_acl(struct reiserfs_trans
+ switch (type) {
+ case ACL_TYPE_ACCESS:
+ name = POSIX_ACL_XATTR_ACCESS;
+- if (acl) {
+- error = posix_acl_update_mode(inode, &inode->i_mode, &acl);
+- if (error)
+- return error;
+- }
+ break;
+ case ACL_TYPE_DEFAULT:
+ name = POSIX_ACL_XATTR_DEFAULT;
--- /dev/null
+From 54930dfeb46e978b447af0fb8ab4e181c1bf9d7a Mon Sep 17 00:00:00 2001
+From: Jeff Mahoney <jeffm@suse.com>
+Date: Thu, 22 Jun 2017 16:35:04 -0400
+Subject: reiserfs: don't preallocate blocks for extended attributes
+
+From: Jeff Mahoney <jeffm@suse.com>
+
+commit 54930dfeb46e978b447af0fb8ab4e181c1bf9d7a upstream.
+
+Most extended attributes will fit in a single block. More importantly,
+we drop the reference to the inode while holding the transaction open
+so the preallocated blocks aren't released. As a result, the inode
+may be evicted before it's removed from the transaction's prealloc list
+which can cause memory corruption.
+
+Signed-off-by: Jeff Mahoney <jeffm@suse.com>
+Signed-off-by: Jan Kara <jack@suse.cz>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ fs/reiserfs/bitmap.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/fs/reiserfs/bitmap.c
++++ b/fs/reiserfs/bitmap.c
+@@ -1136,7 +1136,7 @@ static int determine_prealloc_size(reise
+ hint->prealloc_size = 0;
+
+ if (!hint->formatted_node && hint->preallocate) {
+- if (S_ISREG(hint->inode->i_mode)
++ if (S_ISREG(hint->inode->i_mode) && !IS_PRIVATE(hint->inode)
+ && hint->inode->i_size >=
+ REISERFS_SB(hint->th->t_super)->s_alloc_options.
+ preallocmin * hint->inode->i_sb->s_blocksize)
--- /dev/null
+From 08db141b5313ac2f64b844fb5725b8d81744b417 Mon Sep 17 00:00:00 2001
+From: Jeff Mahoney <jeffm@suse.com>
+Date: Thu, 22 Jun 2017 16:47:34 -0400
+Subject: reiserfs: fix race in prealloc discard
+
+From: Jeff Mahoney <jeffm@suse.com>
+
+commit 08db141b5313ac2f64b844fb5725b8d81744b417 upstream.
+
+The main loop in __discard_prealloc is protected by the reiserfs write lock
+which is dropped across schedules like the BKL it replaced. The problem is
+that it checks the value, calls a routine that schedules, and then adjusts
+the state. As a result, two threads that are calling
+reiserfs_prealloc_discard at the same time can race when one calls
+reiserfs_free_prealloc_block, the lock is dropped, and the other calls
+reiserfs_free_prealloc_block with the same block number. In the right
+circumstances, it can cause the prealloc count to go negative.
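
The race described above can be reproduced in a single-threaded sketch. This is not the reiserfs code: `struct prealloc`, `free_block()`, `discard_buggy()`, `discard_fixed()` and `run()` are hypothetical names, and the "dropped lock" is simulated by letting the free routine run a second caller once.

```c
#include <stddef.h>

#define MAX_BLOCKS 8

/* Hypothetical stand-in for the per-inode prealloc state. */
struct prealloc {
	unsigned int block;	/* next preallocated block to free */
	int count;		/* number of preallocated blocks left */
};

static int times_freed[MAX_BLOCKS];

/* Second caller to run while the lock is "dropped" (simulation only). */
static void (*other_thread)(struct prealloc *);

/* Frees one block; on the first call it lets the other caller run. */
static void free_block(struct prealloc *p, unsigned int blk)
{
	void (*racer)(struct prealloc *) = other_thread;

	times_freed[blk]++;
	if (racer) {
		other_thread = NULL;	/* race only once */
		racer(p);
	}
}

/* Old ordering: the state is adjusted after the call that may schedule. */
static void discard_buggy(struct prealloc *p)
{
	while (p->count > 0) {
		free_block(p, p->block);
		p->block++;
		p->count--;
	}
}

/* New ordering: the state is adjusted before the call that may schedule. */
static void discard_fixed(struct prealloc *p)
{
	while (p->count > 0) {
		unsigned int blk = p->block++;

		p->count--;
		free_block(p, blk);
	}
}

/* Runs two racing callers; returns the final count, reports max frees. */
static int run(void (*discard)(struct prealloc *), int nblocks, int *max_freed)
{
	struct prealloc p = { .block = 0, .count = nblocks };
	int i;

	for (i = 0; i < MAX_BLOCKS; i++)
		times_freed[i] = 0;
	other_thread = discard;
	discard(&p);

	*max_freed = 0;
	for (i = 0; i < MAX_BLOCKS; i++)
		if (times_freed[i] > *max_freed)
			*max_freed = times_freed[i];
	return p.count;
}
```

With two blocks, the old ordering frees block 0 twice and leaves the count negative, while the new ordering frees each block once and leaves the count at zero.
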
+
+Signed-off-by: Jeff Mahoney <jeffm@suse.com>
+Signed-off-by: Jan Kara <jack@suse.cz>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ fs/reiserfs/bitmap.c | 12 ++++++++++--
+ 1 file changed, 10 insertions(+), 2 deletions(-)
+
+--- a/fs/reiserfs/bitmap.c
++++ b/fs/reiserfs/bitmap.c
+@@ -513,9 +513,17 @@ static void __discard_prealloc(struct re
+ "inode has negative prealloc blocks count.");
+ #endif
+ while (ei->i_prealloc_count > 0) {
+- reiserfs_free_prealloc_block(th, inode, ei->i_prealloc_block);
+- ei->i_prealloc_block++;
++ b_blocknr_t block_to_free;
++
++ /*
++ * reiserfs_free_prealloc_block can drop the write lock,
++ * which could allow another caller to free the same block.
++ * We can protect against it by modifying the prealloc
++ * state before calling it.
++ */
++ block_to_free = ei->i_prealloc_block++;
+ ei->i_prealloc_count--;
++ reiserfs_free_prealloc_block(th, inode, block_to_free);
+ dirty = 1;
+ }
+ if (dirty)
--- /dev/null
+From eef9ffdf9cd39b2986367bc8395e2772bc1284ba Mon Sep 17 00:00:00 2001
+From: Johannes Thumshirn <jthumshirn@suse.de>
+Date: Mon, 9 Oct 2017 13:33:19 +0200
+Subject: scsi: libiscsi: fix shifting of DID_REQUEUE host byte
+
+From: Johannes Thumshirn <jthumshirn@suse.de>
+
+commit eef9ffdf9cd39b2986367bc8395e2772bc1284ba upstream.
+
+The SCSI host byte should be shifted left by 16 in order to have
+scsi_decide_disposition() do the right thing (i.e. requeue the
+command).
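
A minimal sketch of why the shift matters, assuming the usual layout of the SCSI result word (status byte in bits 0..7, host byte in bits 16..23). The constants are copied here for illustration; `host_byte()` is written as a function rather than the kernel macro, and `low_byte()` and `set_host_byte()` are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirrored from the SCSI headers, for illustration only. */
#define DID_OK		0x00
#define DID_REQUEUE	0x0d

/* Extract the host byte (bits 16..23) from a SCSI result word. */
static inline unsigned int host_byte(uint32_t result)
{
	return (result >> 16) & 0xff;
}

/* Hypothetical helper: the raw low byte, where the SCSI status lives. */
static inline unsigned int low_byte(uint32_t result)
{
	return result & 0xff;
}

/* Hypothetical helper: build a result word carrying only a host code. */
static inline uint32_t set_host_byte(unsigned int host)
{
	assert(host <= 0xff);
	return (uint32_t)host << 16;
}
```

Without the shift, `DID_REQUEUE` lands in the low byte and the host byte reads as `DID_OK`, so code that inspects the host byte never sees the requeue request.
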
+
+Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
+Fixes: 661134ad3765 ("[SCSI] libiscsi, bnx2i: make bound ep check common")
+Cc: Lee Duncan <lduncan@suse.com>
+Cc: Hannes Reinecke <hare@suse.de>
+Cc: Bart Van Assche <Bart.VanAssche@sandisk.com>
+Cc: Chris Leech <cleech@redhat.com>
+Acked-by: Lee Duncan <lduncan@suse.com>
+Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ drivers/scsi/libiscsi.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/drivers/scsi/libiscsi.c
++++ b/drivers/scsi/libiscsi.c
+@@ -1727,7 +1727,7 @@ int iscsi_queuecommand(struct Scsi_Host
+
+ if (test_bit(ISCSI_SUSPEND_BIT, &conn->suspend_tx)) {
+ reason = FAILURE_SESSION_IN_RECOVERY;
+- sc->result = DID_REQUEUE;
++ sc->result = DID_REQUEUE << 16;
+ goto fault;
+ }
+
pci-layerscape-fix-msg-tlp-drop-setting.patch
mmc-sdhci-of-esdhc-add-remove-some-quirks-according-to-vendor-version.patch
fs-select-add-vmalloc-fallback-for-select-2.patch
+mm-mmap.c-do-not-blow-on-prot_none-map_fixed-holes-in-the-stack.patch
+hwpoison-memcg-forcibly-uncharge-lru-pages.patch
+cma-fix-calculation-of-aligned-offset.patch
+mm-page_alloc-fix-potential-false-positive-in-__zone_watermark_ok.patch
+ipc-msg-make-msgrcv-work-with-long_min.patch
+x86-ioapic-fix-incorrect-pointers-in-ioapic_setup_resources.patch
+acpi-processor-avoid-reserving-io-regions-too-early.patch
+acpi-scan-prefer-devices-without-_hid-_cid-for-_adr-matching.patch
+acpica-namespace-fix-operand-cache-leak.patch
+netfilter-x_tables-speed-up-jump-target-validation.patch
+netfilter-arp_tables-fix-invoking-32bit-iptable-p-input-accept-failed-in-64bit-kernel.patch
+netfilter-nf_dup_ipv6-set-again-flowi_flag_known_nh-at-flowi6_flags.patch
+netfilter-nf_ct_expect-remove-the-redundant-slash-when-policy-name-is-empty.patch
+netfilter-nfnetlink_queue-reject-verdict-request-from-different-portid.patch
+netfilter-restart-search-if-moved-to-other-chain.patch
+netfilter-nf_conntrack_sip-extend-request-line-validation.patch
+netfilter-use-fwmark_reflect-in-nf_send_reset.patch
+netfilter-fix-is_err_value-usage.patch
+netfilter-nfnetlink_cthelper-add-missing-permission-checks.patch
+netfilter-xt_osf-add-missing-permission-checks.patch
+ext2-don-t-clear-sgid-when-inheriting-acls.patch
+reiserfs-fix-race-in-prealloc-discard.patch
+reiserfs-don-t-preallocate-blocks-for-extended-attributes.patch
+reiserfs-don-t-clear-sgid-when-inheriting-acls.patch
+fs-fcntl-f_setown-avoid-undefined-behaviour.patch
+scsi-libiscsi-fix-shifting-of-did_requeue-host-byte.patch
--- /dev/null
+From 9d98bcec731756b8688b59ec998707924d716d7b Mon Sep 17 00:00:00 2001
+From: Rui Wang <rui.y.wang@intel.com>
+Date: Wed, 8 Jun 2016 14:59:52 +0800
+Subject: x86/ioapic: Fix incorrect pointers in ioapic_setup_resources()
+
+From: Rui Wang <rui.y.wang@intel.com>
+
+commit 9d98bcec731756b8688b59ec998707924d716d7b upstream.
+
+On a 4-socket Brickland system, hot-removing one ioapic is fine.
+Hot-removing the 2nd one causes panic in mp_unregister_ioapic()
+while calling release_resource().
+
+It is because the iomem_res pointer has already been released
+when removing the first ioapic.
+
+To explain the use of &res[num] here: res is assigned to ioapic_resources,
+and later in ioapic_insert_resources() we do:
+
+ struct resource *r = ioapic_resources;
+
+ for_each_ioapic(i) {
+ insert_resource(&iomem_resource, r);
+ r++;
+ }
+
+Here 'r' is treated as an array of 'struct resource', and the r++ ensures
+that each element of the array is inserted separately. Thus we should call
+release_resource() on each element at &res[num].
+
+Fix it by assigning the correct pointers to ioapics[i].iomem_res in
+ioapic_setup_resources().
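
The pointer mix-up can be shown with simplified stand-ins for the kernel structures. This is a sketch under assumptions: `struct ioapic_entry`, `setup_buggy()` and `setup_fixed()` are hypothetical, and a single index is used where the real function tracks `i` and `num` separately.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (illustrative only). */
struct resource {
	unsigned long start, end;
};

struct ioapic_entry {
	struct resource *iomem_res;
};

/* Old behaviour: every entry points at the start of the array. */
static void setup_buggy(struct ioapic_entry *io, struct resource *res, size_t n)
{
	for (size_t i = 0; i < n; i++)
		io[i].iomem_res = res;
}

/* Fixed behaviour: entry i points at its own array element. */
static void setup_fixed(struct ioapic_entry *io, struct resource *res, size_t n)
{
	for (size_t i = 0; i < n; i++)
		io[i].iomem_res = &res[i];
}
```

With the old assignment, releasing the resource of a second entry would operate on the same element that the first removal already released.
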
+
+Signed-off-by: Rui Wang <rui.y.wang@intel.com>
+Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
+Cc: tony.luck@intel.com
+Cc: linux-pci@vger.kernel.org
+Cc: rjw@rjwysocki.net
+Cc: linux-acpi@vger.kernel.org
+Cc: bhelgaas@google.com
+Link: http://lkml.kernel.org/r/1465369193-4816-3-git-send-email-rui.y.wang@intel.com
+Signed-off-by: Ingo Molnar <mingo@kernel.org>
+Acked-by: Joerg Roedel <jroedel@suse.de>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ arch/x86/kernel/apic/io_apic.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/arch/x86/kernel/apic/io_apic.c
++++ b/arch/x86/kernel/apic/io_apic.c
+@@ -2592,8 +2592,8 @@ static struct resource * __init ioapic_s
+ res[num].flags = IORESOURCE_MEM | IORESOURCE_BUSY;
+ snprintf(mem, IOAPIC_RESOURCE_NAME_SIZE, "IOAPIC %u", i);
+ mem += IOAPIC_RESOURCE_NAME_SIZE;
++ ioapics[i].iomem_res = &res[num];
+ num++;
+- ioapics[i].iomem_res = res;
+ }
+
+ ioapic_resources = res;