From: Greg Kroah-Hartman
Date: Fri, 31 Jul 2015 00:42:31 +0000 (-0700)
Subject: 4.1-stable patches
X-Git-Tag: v4.1.4~20
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=0e71cf1ac1e67ec1e2cced65c8f5741fd64902de;p=thirdparty%2Fkernel%2Fstable-queue.git

4.1-stable patches

added patches:
	acpi-lpss-fix-up-acpi_lpss_create_device.patch
	acpi-pci-fix-regressions-caused-by-resource_size_t-overflow-with-32-bit-kernel.patch
	acpi-pnp-reserve-acpi-resources-at-the-fs_initcall_sync-stage.patch
	acpi-resources-free-memory-on-error-in-add_region_before.patch
	acpica-tables-enable-both-32-bit-and-64-bit-facs.patch
	acpica-tables-enable-default-64-bit-fadt-addresses-favor.patch
	acpica-tables-fix-an-issue-that-facs-initialization-is-performed-twice.patch
	btrfs-fix-file-corruption-after-cloning-inline-extents.patch
	btrfs-fix-fsync-data-loss-after-append-write.patch
	btrfs-fix-list-transaction-pending_ordered-corruption.patch
	btrfs-fix-memory-leak-in-the-extent_same-ioctl.patch
	btrfs-fix-race-between-caching-kthread-and-returning-inode-to-inode-cache.patch
	btrfs-use-kmem_cache_free-when-freeing-entry-in-inode-cache.patch
	crush-fix-a-bug-in-tree-bucket-decode.patch
	fuse-initialize-fc-release-before-calling-it.patch
	selinux-don-t-waste-ebitmap-space-when-importing-netlabel-categories.patch
	selinux-fix-mprotect-prot_exec-regression-caused-by-mm-change.patch
---

diff --git a/queue-4.1/acpi-lpss-fix-up-acpi_lpss_create_device.patch b/queue-4.1/acpi-lpss-fix-up-acpi_lpss_create_device.patch
new file mode 100644
index 00000000000..7d6e6145dbe
--- /dev/null
+++ b/queue-4.1/acpi-lpss-fix-up-acpi_lpss_create_device.patch
@@ -0,0 +1,45 @@
+From d3e13ff3c1aa2403d9a5f371baac088daeb8f56d Mon Sep 17 00:00:00 2001
+From: "Rafael J. Wysocki"
+Date: Tue, 7 Jul 2015 00:31:47 +0200
+Subject: ACPI / LPSS: Fix up acpi_lpss_create_device()
+
+From: "Rafael J. Wysocki"
+
+commit d3e13ff3c1aa2403d9a5f371baac088daeb8f56d upstream.
+ +Fix a return value (which should be a negative error code) and a +memory leak (the list allocated by acpi_dev_get_resources() needs +to be freed on ioremap() errors too) in acpi_lpss_create_device() +introduced by commit 4483d59e29fe 'ACPI / LPSS: check the result +of ioremap()'. + +Fixes: 4483d59e29fe 'ACPI / LPSS: check the result of ioremap()' +Reported-by: Dan Carpenter +Signed-off-by: Rafael J. Wysocki +Signed-off-by: Greg Kroah-Hartman + +--- + drivers/acpi/acpi_lpss.c | 7 +++++-- + 1 file changed, 5 insertions(+), 2 deletions(-) + +--- a/drivers/acpi/acpi_lpss.c ++++ b/drivers/acpi/acpi_lpss.c +@@ -352,13 +352,16 @@ static int acpi_lpss_create_device(struc + pdata->mmio_size = resource_size(rentry->res); + pdata->mmio_base = ioremap(rentry->res->start, + pdata->mmio_size); +- if (!pdata->mmio_base) +- goto err_out; + break; + } + + acpi_dev_free_resource_list(&resource_list); + ++ if (!pdata->mmio_base) { ++ ret = -ENOMEM; ++ goto err_out; ++ } ++ + pdata->dev_desc = dev_desc; + + if (dev_desc->setup) diff --git a/queue-4.1/acpi-pci-fix-regressions-caused-by-resource_size_t-overflow-with-32-bit-kernel.patch b/queue-4.1/acpi-pci-fix-regressions-caused-by-resource_size_t-overflow-with-32-bit-kernel.patch new file mode 100644 index 00000000000..5257f676cf0 --- /dev/null +++ b/queue-4.1/acpi-pci-fix-regressions-caused-by-resource_size_t-overflow-with-32-bit-kernel.patch @@ -0,0 +1,116 @@ +From 1fb01ca93a1348a1469b8777326cd7632483de77 Mon Sep 17 00:00:00 2001 +From: Jiang Liu +Date: Wed, 8 Jul 2015 15:26:39 +0800 +Subject: ACPI / PCI: Fix regressions caused by resource_size_t overflow with 32-bit kernel + +From: Jiang Liu + +commit 1fb01ca93a1348a1469b8777326cd7632483de77 upstream. + +Zoltan Boszormenyi reported this regression: + "There's a Realtek RTL8111/8168/8411 (PCI ID 10ec:8168, Subsystem ID + 1565:230e) network chip on the mainboard. After the r8169 driver loaded + the IRQs in the machine went berserk. 
Keyboard keypressed arrived with + considerable latency and duplicated, so no real work was possible. + The machine responded to the power button but didn't actually power + down. It just stuck at the powering down message. I had to press the + power button for 4 seconds to power it down. + + The computer is a POS machine with a big battery inside. Because of this, + either ACPI or the Realtek chip kept the bad state and after rebooting, + the network chip didn't even show up in lspci. Not even the PXE ROM + announced itself during boot. I had to disconnect the battery to beat + some sense back to the computer. + + The regression happens with 4.0.5, 4.1.0-rc8 and 4.1.0-final. 3.18.16 was + good." + +The regression is caused by commit 593669c2ac0f (x86/PCI/ACPI: Use common +ACPI resource interfaces to simplify implementation). Since commit +593669c2ac0f, x86 PCI ACPI host bridge driver validates ACPI resources by +first converting an ACPI resource to a 'struct resource' structure and +then applying checks against the converted resource structure. The 'start' +and 'end' fields in 'struct resource' are defined to be type of +resource_size_t, which may be 32 bits or 64 bits depending on +CONFIG_PHYS_ADDR_T_64BIT. + +This may cause incorrect resource validation results with 32-bit kernels +because 64-bit ACPI resource descriptors may get truncated when converting +to 32-bit 'start' and 'end' fields in 'struct resource'. It eventually +affects PCI resource allocation subsystem and makes some PCI devices and +the system behave abnormally due to incorrect resource assignment. + +So enhance the ACPI resource parsing interfaces to ignore ACPI resource +descriptors with address/offset above 4G when running in 32-bit mode. + +With the fix applied, the behavior of the machine was restored to how +3.18.16 worked, i.e. the memory range that is over 4GB is ignored again, +and lspci -vvxxx shows that everything is at the same memory window as +they were with 3.18.16. 
+ +Reported-and-tested-by: Boszormenyi Zoltan +Fixes: 593669c2ac0f (x86/PCI/ACPI: Use common ACPI resource interfaces to simplify implementation) +Signed-off-by: Jiang Liu +Signed-off-by: Rafael J. Wysocki +Signed-off-by: Greg Kroah-Hartman + +--- + drivers/acpi/resource.c | 24 +++++++++++++++--------- + 1 file changed, 15 insertions(+), 9 deletions(-) + +--- a/drivers/acpi/resource.c ++++ b/drivers/acpi/resource.c +@@ -193,6 +193,7 @@ static bool acpi_decode_space(struct res + u8 iodec = attr->granularity == 0xfff ? ACPI_DECODE_10 : ACPI_DECODE_16; + bool wp = addr->info.mem.write_protect; + u64 len = attr->address_length; ++ u64 start, end, offset = 0; + struct resource *res = &win->res; + + /* +@@ -204,9 +205,6 @@ static bool acpi_decode_space(struct res + pr_debug("ACPI: Invalid address space min_addr_fix %d, max_addr_fix %d, len %llx\n", + addr->min_address_fixed, addr->max_address_fixed, len); + +- res->start = attr->minimum; +- res->end = attr->maximum; +- + /* + * For bridges that translate addresses across the bridge, + * translation_offset is the offset that must be added to the +@@ -214,12 +212,22 @@ static bool acpi_decode_space(struct res + * primary side. Non-bridge devices must list 0 for all Address + * Translation offset bits. 
+ */ +- if (addr->producer_consumer == ACPI_PRODUCER) { +- res->start += attr->translation_offset; +- res->end += attr->translation_offset; +- } else if (attr->translation_offset) { ++ if (addr->producer_consumer == ACPI_PRODUCER) ++ offset = attr->translation_offset; ++ else if (attr->translation_offset) + pr_debug("ACPI: translation_offset(%lld) is invalid for non-bridge device.\n", + attr->translation_offset); ++ start = attr->minimum + offset; ++ end = attr->maximum + offset; ++ ++ win->offset = offset; ++ res->start = start; ++ res->end = end; ++ if (sizeof(resource_size_t) < sizeof(u64) && ++ (offset != win->offset || start != res->start || end != res->end)) { ++ pr_warn("acpi resource window ([%#llx-%#llx] ignored, not CPU addressable)\n", ++ attr->minimum, attr->maximum); ++ return false; + } + + switch (addr->resource_type) { +@@ -236,8 +244,6 @@ static bool acpi_decode_space(struct res + return false; + } + +- win->offset = attr->translation_offset; +- + if (addr->producer_consumer == ACPI_PRODUCER) + res->flags |= IORESOURCE_WINDOW; + diff --git a/queue-4.1/acpi-pnp-reserve-acpi-resources-at-the-fs_initcall_sync-stage.patch b/queue-4.1/acpi-pnp-reserve-acpi-resources-at-the-fs_initcall_sync-stage.patch new file mode 100644 index 00000000000..558121208b4 --- /dev/null +++ b/queue-4.1/acpi-pnp-reserve-acpi-resources-at-the-fs_initcall_sync-stage.patch @@ -0,0 +1,379 @@ +From 0294112ee3135fbd15eaa70015af8283642dd970 Mon Sep 17 00:00:00 2001 +From: "Rafael J. Wysocki" +Date: Sat, 4 Jul 2015 03:09:03 +0200 +Subject: ACPI / PNP: Reserve ACPI resources at the fs_initcall_sync stage + +From: "Rafael J. Wysocki" + +commit 0294112ee3135fbd15eaa70015af8283642dd970 upstream. 
+ +This effectively reverts the following three commits: + + 7bc10388ccdd ACPI / resources: free memory on error in add_region_before() + 0f1b414d1907 ACPI / PNP: Avoid conflicting resource reservations + b9a5e5e18fbf ACPI / init: Fix the ordering of acpi_reserve_resources() + +(commit b9a5e5e18fbf introduced regressions some of which, but not +all, were addressed by commit 0f1b414d1907 and commit 7bc10388ccdd +was a fixup on top of the latter) and causes ACPI fixed hardware +resources to be reserved at the fs_initcall_sync stage of system +initialization. + +The story is as follows. First, a boot regression was reported due +to an apparent resource reservation ordering change after a commit +that shouldn't lead to such changes. Investigation led to the +conclusion that the problem happened because acpi_reserve_resources() +was executed at the device_initcall() stage of system initialization +which wasn't strictly ordered with respect to driver initialization +(and with respect to the initialization of the pcieport driver in +particular), so a random change causing the device initcalls to be +run in a different order might break things. + +The response to that was to attempt to run acpi_reserve_resources() +as soon as we knew that ACPI would be in use (commit b9a5e5e18fbf). +However, that turned out to be too early, because it caused resource +reservations made by the PNP system driver to fail on at least one +system and that failure was addressed by commit 0f1b414d1907. + +That fix still turned out to be insufficient, though, because +calling acpi_reserve_resources() before the fs_initcall stage of +system initialization caused a boot regression to happen on the +eCAFE EC-800-H20G/S netbook. That meant that we only could call +acpi_reserve_resources() at the fs_initcall initialization stage +or later, but then we might just as well call it after the PNP +initalization in which case commit 0f1b414d1907 wouldn't be +necessary any more. 
+ +For this reason, the changes made by commit 0f1b414d1907 are reverted +(along with a memory leak fixup on top of that commit), the changes +made by commit b9a5e5e18fbf that went too far are reverted too and +acpi_reserve_resources() is changed into fs_initcall_sync, which +will cause it to be executed after the PNP subsystem initialization +(which is an fs_initcall) and before device initcalls (including +the pcieport driver initialization) which should avoid the initial +issue. + +Link: https://bugzilla.kernel.org/show_bug.cgi?id=100581 +Link: http://marc.info/?t=143092384600002&r=1&w=2 +Link: https://bugzilla.kernel.org/show_bug.cgi?id=99831 +Link: http://marc.info/?t=143389402600001&r=1&w=2 +Fixes: b9a5e5e18fbf "ACPI / init: Fix the ordering of acpi_reserve_resources()" +Reported-by: Roland Dreier +Signed-off-by: Rafael J. Wysocki +Signed-off-by: Greg Kroah-Hartman + +--- + drivers/acpi/osl.c | 12 ++- + drivers/acpi/resource.c | 162 ------------------------------------------------ + drivers/pnp/system.c | 35 ++-------- + include/linux/acpi.h | 10 -- + 4 files changed, 18 insertions(+), 201 deletions(-) + +--- a/drivers/acpi/osl.c ++++ b/drivers/acpi/osl.c +@@ -175,10 +175,14 @@ static void __init acpi_request_region ( + if (!addr || !length) + return; + +- acpi_reserve_region(addr, length, gas->space_id, 0, desc); ++ /* Resources are never freed */ ++ if (gas->space_id == ACPI_ADR_SPACE_SYSTEM_IO) ++ request_region(addr, length, desc); ++ else if (gas->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY) ++ request_mem_region(addr, length, desc); + } + +-static void __init acpi_reserve_resources(void) ++static int __init acpi_reserve_resources(void) + { + acpi_request_region(&acpi_gbl_FADT.xpm1a_event_block, acpi_gbl_FADT.pm1_event_length, + "ACPI PM1a_EVT_BLK"); +@@ -207,7 +211,10 @@ static void __init acpi_reserve_resource + if (!(acpi_gbl_FADT.gpe1_block_length & 0x1)) + acpi_request_region(&acpi_gbl_FADT.xgpe1_block, + acpi_gbl_FADT.gpe1_block_length, "ACPI 
GPE1_BLK");
++
++	return 0;
+ }
++fs_initcall_sync(acpi_reserve_resources);
+
+ void acpi_os_printf(const char *fmt, ...)
+ {
+@@ -1838,7 +1845,6 @@ acpi_status __init acpi_os_initialize(vo
+
+ acpi_status __init acpi_os_initialize1(void)
+ {
+-	acpi_reserve_resources();
+ 	kacpid_wq = alloc_workqueue("kacpid", 0, 1);
+ 	kacpi_notify_wq = alloc_workqueue("kacpi_notify", 0, 1);
+ 	kacpi_hotplug_wq = alloc_ordered_workqueue("kacpi_hotplug", 0);
+--- a/drivers/acpi/resource.c
++++ b/drivers/acpi/resource.c
+@@ -26,7 +26,6 @@
+ #include
+ #include
+ #include
+-#include
+ #include
+
+ #ifdef CONFIG_X86
+@@ -622,164 +621,3 @@ int acpi_dev_filter_resource_type(struct
+ 	return (type & types) ? 0 : 1;
+ }
+ EXPORT_SYMBOL_GPL(acpi_dev_filter_resource_type);
+-
+-struct reserved_region {
+-	struct list_head node;
+-	u64 start;
+-	u64 end;
+-};
+-
+-static LIST_HEAD(reserved_io_regions);
+-static LIST_HEAD(reserved_mem_regions);
+-
+-static int request_range(u64 start, u64 end, u8 space_id, unsigned long flags,
+-			 char *desc)
+-{
+-	unsigned int length = end - start + 1;
+-	struct resource *res;
+-
+-	res = space_id == ACPI_ADR_SPACE_SYSTEM_IO ?
+-		request_region(start, length, desc) :
+-		request_mem_region(start, length, desc);
+-	if (!res)
+-		return -EIO;
+-
+-	res->flags &= ~flags;
+-	return 0;
+-}
+-
+-static int add_region_before(u64 start, u64 end, u8 space_id,
+-			     unsigned long flags, char *desc,
+-			     struct list_head *head)
+-{
+-	struct reserved_region *reg;
+-	int error;
+-
+-	reg = kmalloc(sizeof(*reg), GFP_KERNEL);
+-	if (!reg)
+-		return -ENOMEM;
+-
+-	error = request_range(start, end, space_id, flags, desc);
+-	if (error) {
+-		kfree(reg);
+-		return error;
+-	}
+-
+-	reg->start = start;
+-	reg->end = end;
+-	list_add_tail(&reg->node, head);
+-	return 0;
+-}
+-
+-/**
+- * acpi_reserve_region - Reserve an I/O or memory region as a system resource.
+- * @start: Starting address of the region.
+- * @length: Length of the region.
+- * @space_id: Identifier of address space to reserve the region from.
+- * @flags: Resource flags to clear for the region after requesting it.
+- * @desc: Region description (for messages).
+- *
+- * Reserve an I/O or memory region as a system resource to prevent others from
+- * using it. If the new region overlaps with one of the regions (in the given
+- * address space) already reserved by this routine, only the non-overlapping
+- * parts of it will be reserved.
+- *
+- * Returned is either 0 (success) or a negative error code indicating a resource
+- * reservation problem. It is the code of the first encountered error, but the
+- * routine doesn't abort until it has attempted to request all of the parts of
+- * the new region that don't overlap with other regions reserved previously.
+- *
+- * The resources requested by this routine are never released.
+- */
+-int acpi_reserve_region(u64 start, unsigned int length, u8 space_id,
+-			unsigned long flags, char *desc)
+-{
+-	struct list_head *regions;
+-	struct reserved_region *reg;
+-	u64 end = start + length - 1;
+-	int ret = 0, error = 0;
+-
+-	if (space_id == ACPI_ADR_SPACE_SYSTEM_IO)
+-		regions = &reserved_io_regions;
+-	else if (space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY)
+-		regions = &reserved_mem_regions;
+-	else
+-		return -EINVAL;
+-
+-	if (list_empty(regions))
+-		return add_region_before(start, end, space_id, flags, desc, regions);
+-
+-	list_for_each_entry(reg, regions, node)
+-		if (reg->start == end + 1) {
+-			/* The new region can be prepended to this one. */
+-			ret = request_range(start, end, space_id, flags, desc);
+-			if (!ret)
+-				reg->start = start;
+-
+-			return ret;
+-		} else if (reg->start > end) {
+-			/* No overlap. Add the new region here and get out. */
+-			return add_region_before(start, end, space_id, flags,
+-						 desc, &reg->node);
+-		} else if (reg->end == start - 1) {
+-			goto combine;
+-		} else if (reg->end >= start) {
+-			goto overlap;
+-		}
+-
+-	/* The new region goes after the last existing one. */
+-	return add_region_before(start, end, space_id, flags, desc, regions);
+-
+- overlap:
+-	/*
+-	 * The new region overlaps an existing one.
+-	 *
+-	 * The head part of the new region immediately preceding the existing
+-	 * overlapping one can be combined with it right away.
+-	 */
+-	if (reg->start > start) {
+-		error = request_range(start, reg->start - 1, space_id, flags, desc);
+-		if (error)
+-			ret = error;
+-		else
+-			reg->start = start;
+-	}
+-
+- combine:
+-	/*
+-	 * The new region is adjacent to an existing one. If it extends beyond
+-	 * that region all the way to the next one, it is possible to combine
+-	 * all three of them.
+-	 */
+-	while (reg->end < end) {
+-		struct reserved_region *next = NULL;
+-		u64 a = reg->end + 1, b = end;
+-
+-		if (!list_is_last(&reg->node, regions)) {
+-			next = list_next_entry(reg, node);
+-			if (next->start <= end)
+-				b = next->start - 1;
+-		}
+-		error = request_range(a, b, space_id, flags, desc);
+-		if (!error) {
+-			if (next && next->start == b + 1) {
+-				reg->end = next->end;
+-				list_del(&next->node);
+-				kfree(next);
+-			} else {
+-				reg->end = end;
+-				break;
+-			}
+-		} else if (next) {
+-			if (!ret)
+-				ret = error;
+-
+-			reg = next;
+-		} else {
+-			break;
+-		}
+-	}
+-
+-	return ret ? ret : error;
+-}
+-EXPORT_SYMBOL_GPL(acpi_reserve_region);
+--- a/drivers/pnp/system.c
++++ b/drivers/pnp/system.c
+@@ -7,7 +7,6 @@
+  * Bjorn Helgaas
+  */
+
+-#include
+ #include
+ #include
+ #include
+@@ -23,41 +22,25 @@ static const struct pnp_device_id pnp_de
+ 	{"", 0}
+ };
+
+-#ifdef CONFIG_ACPI
+-static bool __reserve_range(u64 start, unsigned int length, bool io, char *desc)
+-{
+-	u8 space_id = io ? ACPI_ADR_SPACE_SYSTEM_IO : ACPI_ADR_SPACE_SYSTEM_MEMORY;
+-	return !acpi_reserve_region(start, length, space_id, IORESOURCE_BUSY, desc);
+-}
+-#else
+-static bool __reserve_range(u64 start, unsigned int length, bool io, char *desc)
+-{
+-	struct resource *res;
+-
+-	res = io ?
request_region(start, length, desc) : +- request_mem_region(start, length, desc); +- if (res) { +- res->flags &= ~IORESOURCE_BUSY; +- return true; +- } +- return false; +-} +-#endif +- + static void reserve_range(struct pnp_dev *dev, struct resource *r, int port) + { + char *regionid; + const char *pnpid = dev_name(&dev->dev); + resource_size_t start = r->start, end = r->end; +- bool reserved; ++ struct resource *res; + + regionid = kmalloc(16, GFP_KERNEL); + if (!regionid) + return; + + snprintf(regionid, 16, "pnp %s", pnpid); +- reserved = __reserve_range(start, end - start + 1, !!port, regionid); +- if (!reserved) ++ if (port) ++ res = request_region(start, end - start + 1, regionid); ++ else ++ res = request_mem_region(start, end - start + 1, regionid); ++ if (res) ++ res->flags &= ~IORESOURCE_BUSY; ++ else + kfree(regionid); + + /* +@@ -66,7 +49,7 @@ static void reserve_range(struct pnp_dev + * have double reservations. + */ + dev_info(&dev->dev, "%pR %s reserved\n", r, +- reserved ? "has been" : "could not be"); ++ res ? 
"has been" : "could not be"); + } + + static void reserve_resources_of_dev(struct pnp_dev *dev) +--- a/include/linux/acpi.h ++++ b/include/linux/acpi.h +@@ -332,9 +332,6 @@ int acpi_check_region(resource_size_t st + + int acpi_resources_are_enforced(void); + +-int acpi_reserve_region(u64 start, unsigned int length, u8 space_id, +- unsigned long flags, char *desc); +- + #ifdef CONFIG_HIBERNATION + void __init acpi_no_s4_hw_signature(void); + #endif +@@ -530,13 +527,6 @@ static inline int acpi_check_region(reso + return 0; + } + +-static inline int acpi_reserve_region(u64 start, unsigned int length, +- u8 space_id, unsigned long flags, +- char *desc) +-{ +- return -ENXIO; +-} +- + struct acpi_table_header; + static inline int acpi_table_parse(char *id, + int (*handler)(struct acpi_table_header *)) diff --git a/queue-4.1/acpi-resources-free-memory-on-error-in-add_region_before.patch b/queue-4.1/acpi-resources-free-memory-on-error-in-add_region_before.patch new file mode 100644 index 00000000000..52eafa10fcb --- /dev/null +++ b/queue-4.1/acpi-resources-free-memory-on-error-in-add_region_before.patch @@ -0,0 +1,34 @@ +From 7bc10388ccdd79b3d20463151a1f8e7a590a775b Mon Sep 17 00:00:00 2001 +From: Dan Carpenter +Date: Wed, 24 Jun 2015 17:30:15 +0300 +Subject: ACPI / resources: free memory on error in add_region_before() + +From: Dan Carpenter + +commit 7bc10388ccdd79b3d20463151a1f8e7a590a775b upstream. + +There is a small memory leak on error. + +Fixes: 0f1b414d1907 (ACPI / PNP: Avoid conflicting resource reservations) +Signed-off-by: Dan Carpenter +Signed-off-by: Rafael J. 
Wysocki +Signed-off-by: Greg Kroah-Hartman + +--- + drivers/acpi/resource.c | 4 +++- + 1 file changed, 3 insertions(+), 1 deletion(-) + +--- a/drivers/acpi/resource.c ++++ b/drivers/acpi/resource.c +@@ -660,8 +660,10 @@ static int add_region_before(u64 start, + return -ENOMEM; + + error = request_range(start, end, space_id, flags, desc); +- if (error) ++ if (error) { ++ kfree(reg); + return error; ++ } + + reg->start = start; + reg->end = end; diff --git a/queue-4.1/acpica-tables-enable-both-32-bit-and-64-bit-facs.patch b/queue-4.1/acpica-tables-enable-both-32-bit-and-64-bit-facs.patch new file mode 100644 index 00000000000..459bdfa4a4e --- /dev/null +++ b/queue-4.1/acpica-tables-enable-both-32-bit-and-64-bit-facs.patch @@ -0,0 +1,205 @@ +From c04e1fb4396d27f18296db0f914760fa7fe8223a Mon Sep 17 00:00:00 2001 +From: Lv Zheng +Date: Wed, 1 Jul 2015 14:43:11 +0800 +Subject: ACPICA: Tables: Enable both 32-bit and 64-bit FACS + +From: Lv Zheng + +commit c04e1fb4396d27f18296db0f914760fa7fe8223a upstream. + +ACPICA commit f7b86f35416e3d1f71c3d816ff5075ddd33ed486 + +The following commit is reported to have broken s2ram on some platforms: + Commit: 0249ed2444d65d65fc3f3f64f398f1ad0b7e54cd + ACPICA: Add option to favor 32-bit FADT addresses. +The platform reports 2 FACS tables (which is not allowed by ACPI +specification) and the new 32-bit address favor rule forces OSPMs to use +the FACS table reported via FADT's X_FIRMWARE_CTRL field. + +The root cause of the reported bug might be one of the followings: +1. BIOS may favor the 64-bit firmware waking vector address when the + version of the FACS is greater than 0 and Linux currently only supports + resuming from the real mode, so the 64-bit firmware waking vector has + never been set and might be invalid to BIOS while the commit enables + higher version FACS. +2. 
BIOS may favor the FACS reported via the "FIRMWARE_CTRL" field in the + FADT while the commit doesn't set the firmware waking vector address of + the FACS reported by "FIRMWARE_CTRL", it only sets the firware waking + vector address of the FACS reported by "X_FIRMWARE_CTRL". + +This patch excludes the cases that can trigger the bugs caused by the root +cause 2. + +There is no handshaking mechanism can be used by OSPM to tell BIOS which +FACS is currently used. Thus the FACS reported by "FIRMWARE_CTRL" may still +be used by BIOS and the 0 value of the 32-bit firmware waking vector might +trigger such failure. + +This patch tries to favor 32bit FACS address in another way where both the +FACS reported by "FIRMWARE_CTRL" and the FACS reported by "X_FIRMWARE_CTRL" +are loaded so that further commit can set firmware waking vector in the +both tables to ensure we can exclude the cases that trigger the bugs caused +by the root cause 2. The exclusion is split into 2 commits as this commit +is also useful for dumping more ACPI tables, it won't get reverted when +such exclusion is no longer necessary. Lv Zheng. + +Link: https://bugzilla.kernel.org/show_bug.cgi?id=74021 +Link: https://github.com/acpica/acpica/commit/f7b86f35 +Reported-and-tested-by: Oswald Buddenhagen +Signed-off-by: Lv Zheng +Signed-off-by: Bob Moore +Signed-off-by: Rafael J. 
Wysocki +Signed-off-by: Greg Kroah-Hartman + +--- + drivers/acpi/acpica/aclocal.h | 1 + + drivers/acpi/acpica/tbfadt.c | 21 +++++++++++++-------- + drivers/acpi/acpica/tbutils.c | 34 +++++++++++++++++++++++----------- + drivers/acpi/acpica/tbxfload.c | 3 ++- + include/acpi/acpixf.h | 9 +++++++++ + 5 files changed, 48 insertions(+), 20 deletions(-) + +--- a/drivers/acpi/acpica/aclocal.h ++++ b/drivers/acpi/acpica/aclocal.h +@@ -213,6 +213,7 @@ struct acpi_table_list { + + #define ACPI_TABLE_INDEX_DSDT (0) + #define ACPI_TABLE_INDEX_FACS (1) ++#define ACPI_TABLE_INDEX_X_FACS (2) + + struct acpi_find_context { + char *search_for; +--- a/drivers/acpi/acpica/tbfadt.c ++++ b/drivers/acpi/acpica/tbfadt.c +@@ -350,9 +350,18 @@ void acpi_tb_parse_fadt(u32 table_index) + /* If Hardware Reduced flag is set, there is no FACS */ + + if (!acpi_gbl_reduced_hardware) { +- acpi_tb_install_fixed_table((acpi_physical_address) +- acpi_gbl_FADT.Xfacs, ACPI_SIG_FACS, +- ACPI_TABLE_INDEX_FACS); ++ if (acpi_gbl_FADT.facs) { ++ acpi_tb_install_fixed_table((acpi_physical_address) ++ acpi_gbl_FADT.facs, ++ ACPI_SIG_FACS, ++ ACPI_TABLE_INDEX_FACS); ++ } ++ if (acpi_gbl_FADT.Xfacs) { ++ acpi_tb_install_fixed_table((acpi_physical_address) ++ acpi_gbl_FADT.Xfacs, ++ ACPI_SIG_FACS, ++ ACPI_TABLE_INDEX_X_FACS); ++ } + } + } + +@@ -491,13 +500,9 @@ static void acpi_tb_convert_fadt(void) + acpi_gbl_FADT.header.length = sizeof(struct acpi_table_fadt); + + /* +- * Expand the 32-bit FACS and DSDT addresses to 64-bit as necessary. ++ * Expand the 32-bit DSDT addresses to 64-bit as necessary. + * Later ACPICA code will always use the X 64-bit field. 
+ */ +- acpi_gbl_FADT.Xfacs = acpi_tb_select_address("FACS", +- acpi_gbl_FADT.facs, +- acpi_gbl_FADT.Xfacs); +- + acpi_gbl_FADT.Xdsdt = acpi_tb_select_address("DSDT", + acpi_gbl_FADT.dsdt, + acpi_gbl_FADT.Xdsdt); +--- a/drivers/acpi/acpica/tbutils.c ++++ b/drivers/acpi/acpica/tbutils.c +@@ -68,7 +68,8 @@ acpi_tb_get_root_table_entry(u8 *table_e + + acpi_status acpi_tb_initialize_facs(void) + { +- acpi_status status; ++ struct acpi_table_facs *facs32; ++ struct acpi_table_facs *facs64; + + /* If Hardware Reduced flag is set, there is no FACS */ + +@@ -77,11 +78,22 @@ acpi_status acpi_tb_initialize_facs(void + return (AE_OK); + } + +- status = acpi_get_table_by_index(ACPI_TABLE_INDEX_FACS, +- ACPI_CAST_INDIRECT_PTR(struct +- acpi_table_header, +- &acpi_gbl_FACS)); +- return (status); ++ (void)acpi_get_table_by_index(ACPI_TABLE_INDEX_FACS, ++ ACPI_CAST_INDIRECT_PTR(struct ++ acpi_table_header, ++ &facs32)); ++ (void)acpi_get_table_by_index(ACPI_TABLE_INDEX_X_FACS, ++ ACPI_CAST_INDIRECT_PTR(struct ++ acpi_table_header, ++ &facs64)); ++ ++ if (acpi_gbl_use32_bit_facs_addresses) { ++ acpi_gbl_FACS = facs32 ? facs32 : facs64; ++ } else { ++ acpi_gbl_FACS = facs64 ? 
facs64 : facs32; ++ } ++ ++ return (AE_OK); + } + #endif /* !ACPI_REDUCED_HARDWARE */ + +@@ -101,7 +113,7 @@ acpi_status acpi_tb_initialize_facs(void + u8 acpi_tb_tables_loaded(void) + { + +- if (acpi_gbl_root_table_list.current_table_count >= 3) { ++ if (acpi_gbl_root_table_list.current_table_count >= 4) { + return (TRUE); + } + +@@ -357,11 +369,11 @@ acpi_status __init acpi_tb_parse_root_ta + table_entry = ACPI_ADD_PTR(u8, table, sizeof(struct acpi_table_header)); + + /* +- * First two entries in the table array are reserved for the DSDT +- * and FACS, which are not actually present in the RSDT/XSDT - they +- * come from the FADT ++ * First three entries in the table array are reserved for the DSDT ++ * and 32bit/64bit FACS, which are not actually present in the ++ * RSDT/XSDT - they come from the FADT + */ +- acpi_gbl_root_table_list.current_table_count = 2; ++ acpi_gbl_root_table_list.current_table_count = 3; + + /* Initialize the root table array from the RSDT/XSDT */ + +--- a/drivers/acpi/acpica/tbxfload.c ++++ b/drivers/acpi/acpica/tbxfload.c +@@ -166,7 +166,8 @@ static acpi_status acpi_tb_load_namespac + + (void)acpi_ut_acquire_mutex(ACPI_MTX_TABLES); + for (i = 0; i < acpi_gbl_root_table_list.current_table_count; ++i) { +- if ((!ACPI_COMPARE_NAME ++ if (!acpi_gbl_root_table_list.tables[i].address || ++ (!ACPI_COMPARE_NAME + (&(acpi_gbl_root_table_list.tables[i].signature), + ACPI_SIG_SSDT) + && +--- a/include/acpi/acpixf.h ++++ b/include/acpi/acpixf.h +@@ -200,6 +200,15 @@ ACPI_INIT_GLOBAL(u8, acpi_gbl_do_not_use + ACPI_INIT_GLOBAL(u8, acpi_gbl_use32_bit_fadt_addresses, TRUE); + + /* ++ * Optionally use 32-bit FACS table addresses. ++ * It is reported that some platforms fail to resume from system suspending ++ * if 64-bit FACS table address is selected: ++ * https://bugzilla.kernel.org/show_bug.cgi?id=74021 ++ * Default is TRUE, favor the 32-bit addresses. 
++ */ ++ACPI_INIT_GLOBAL(u8, acpi_gbl_use32_bit_facs_addresses, TRUE); ++ ++/* + * Optionally truncate I/O addresses to 16 bits. Provides compatibility + * with other ACPI implementations. NOTE: During ACPICA initialization, + * this value is set to TRUE if any Windows OSI strings have been diff --git a/queue-4.1/acpica-tables-enable-default-64-bit-fadt-addresses-favor.patch b/queue-4.1/acpica-tables-enable-default-64-bit-fadt-addresses-favor.patch new file mode 100644 index 00000000000..9fba0512937 --- /dev/null +++ b/queue-4.1/acpica-tables-enable-default-64-bit-fadt-addresses-favor.patch @@ -0,0 +1,48 @@ +From 0ea61381788a37d864f9841b0fe97d40f7058f3b Mon Sep 17 00:00:00 2001 +From: Lv Zheng +Date: Wed, 1 Jul 2015 14:43:34 +0800 +Subject: ACPICA: Tables: Enable default 64-bit FADT addresses favor + +From: Lv Zheng + +commit 0ea61381788a37d864f9841b0fe97d40f7058f3b upstream. + +ACPICA commit 4da56eeae0749dfe8491285c1e1fad48f6efafd8 + +The following commit temporarily disables correct 64-bit FADT addresses +favor during the period the root cause of the bug is not fixed: + Commit: 85dbd5801f62b66e2aa7826aaefcaebead44c8a6 + ACPICA: Tables: Restore old behavor to favor 32-bit FADT addresses. + +With enough protections, this patch re-enables 64-bit FADT addresses by +default. If regressions are reported against such change, this patch should +be bisected and reverted. +Note that 64-bit FACS favor and 64-bit firmware waking vector favor are +excluded by this commit in order not to break OSPMs. Lv Zheng. + +Link: https://bugzilla.kernel.org/show_bug.cgi?id=74021 +Link: https://github.com/acpica/acpica/commit/4da56eea +Reported-and-tested-by: Oswald Buddenhagen +Signed-off-by: Lv Zheng +Signed-off-by: Bob Moore +Signed-off-by: Rafael J. 
Wysocki +Signed-off-by: Greg Kroah-Hartman + +--- + include/acpi/acpixf.h | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +--- a/include/acpi/acpixf.h ++++ b/include/acpi/acpixf.h +@@ -195,9 +195,9 @@ ACPI_INIT_GLOBAL(u8, acpi_gbl_do_not_use + * address. Although ACPICA adheres to the ACPI specification which + * requires the use of the corresponding 64-bit address if it is non-zero, + * some machines have been found to have a corrupted non-zero 64-bit +- * address. Default is TRUE, favor the 32-bit addresses. ++ * address. Default is FALSE, do not favor the 32-bit addresses. + */ +-ACPI_INIT_GLOBAL(u8, acpi_gbl_use32_bit_fadt_addresses, TRUE); ++ACPI_INIT_GLOBAL(u8, acpi_gbl_use32_bit_fadt_addresses, FALSE); + + /* + * Optionally use 32-bit FACS table addresses. diff --git a/queue-4.1/acpica-tables-fix-an-issue-that-facs-initialization-is-performed-twice.patch b/queue-4.1/acpica-tables-fix-an-issue-that-facs-initialization-is-performed-twice.patch new file mode 100644 index 00000000000..72e2a52b6f6 --- /dev/null +++ b/queue-4.1/acpica-tables-fix-an-issue-that-facs-initialization-is-performed-twice.patch @@ -0,0 +1,55 @@ +From c04be18448355441a0c424362df65b6422e27bda Mon Sep 17 00:00:00 2001 +From: Lv Zheng +Date: Wed, 1 Jul 2015 14:43:26 +0800 +Subject: ACPICA: Tables: Fix an issue that FACS initialization is performed twice + +From: Lv Zheng + +commit c04be18448355441a0c424362df65b6422e27bda upstream. + +ACPICA commit 90f5332a15e9d9ba83831ca700b2b9f708274658 + +This patch adds a new FACS initialization flag for acpi_tb_initialize(). +acpi_enable_subsystem() might be invoked several times in OS bootup process, +and we don't want FACS initialization to be invoked twice. Lv Zheng. + +Link: https://github.com/acpica/acpica/commit/90f5332a +Signed-off-by: Lv Zheng +Signed-off-by: Bob Moore +Signed-off-by: Rafael J. 
Wysocki +Signed-off-by: Greg Kroah-Hartman + +--- + drivers/acpi/acpica/utxfinit.c | 10 ++++++---- + include/acpi/actypes.h | 1 + + 2 files changed, 7 insertions(+), 4 deletions(-) + +--- a/drivers/acpi/acpica/utxfinit.c ++++ b/drivers/acpi/acpica/utxfinit.c +@@ -179,10 +179,12 @@ acpi_status __init acpi_enable_subsystem + * Obtain a permanent mapping for the FACS. This is required for the + * Global Lock and the Firmware Waking Vector + */ +- status = acpi_tb_initialize_facs(); +- if (ACPI_FAILURE(status)) { +- ACPI_WARNING((AE_INFO, "Could not map the FACS table")); +- return_ACPI_STATUS(status); ++ if (!(flags & ACPI_NO_FACS_INIT)) { ++ status = acpi_tb_initialize_facs(); ++ if (ACPI_FAILURE(status)) { ++ ACPI_WARNING((AE_INFO, "Could not map the FACS table")); ++ return_ACPI_STATUS(status); ++ } + } + #endif /* !ACPI_REDUCED_HARDWARE */ + +--- a/include/acpi/actypes.h ++++ b/include/acpi/actypes.h +@@ -572,6 +572,7 @@ typedef u64 acpi_integer; + #define ACPI_NO_ACPI_ENABLE 0x10 + #define ACPI_NO_DEVICE_INIT 0x20 + #define ACPI_NO_OBJECT_INIT 0x40 ++#define ACPI_NO_FACS_INIT 0x80 + + /* + * Initialization state diff --git a/queue-4.1/btrfs-fix-file-corruption-after-cloning-inline-extents.patch b/queue-4.1/btrfs-fix-file-corruption-after-cloning-inline-extents.patch new file mode 100644 index 00000000000..e0e84988aa4 --- /dev/null +++ b/queue-4.1/btrfs-fix-file-corruption-after-cloning-inline-extents.patch @@ -0,0 +1,126 @@ +From ed958762644b404654a6f5d23e869f496fe127c6 Mon Sep 17 00:00:00 2001 +From: Filipe Manana +Date: Tue, 14 Jul 2015 16:09:39 +0100 +Subject: Btrfs: fix file corruption after cloning inline extents + +From: Filipe Manana + +commit ed958762644b404654a6f5d23e869f496fe127c6 upstream. + +Using the clone ioctl (or extent_same ioctl, which calls the same extent +cloning function as well) we end up allowing the copy of an inline extent from +the source file into a non-zero offset of the destination file. 
This is +something not expected and that the btrfs code is not prepared to deal +with - all inline extents must be at a file offset equal to 0. + +For example, the following excerpt of a test case for fstests triggers +a crash/BUG_ON() on a write operation after an inline extent is cloned +into a non-zero offset: + + _scratch_mkfs >>$seqres.full 2>&1 + _scratch_mount + + # Create our test files. File foo has the same 2K of data at offset 4K + # as file bar has at its offset 0. + $XFS_IO_PROG -f -s -c "pwrite -S 0xaa 0 4K" \ + -c "pwrite -S 0xbb 4k 2K" \ + -c "pwrite -S 0xcc 8K 4K" \ + $SCRATCH_MNT/foo | _filter_xfs_io + + # File bar consists of a single inline extent (2K size). + $XFS_IO_PROG -f -s -c "pwrite -S 0xbb 0 2K" \ + $SCRATCH_MNT/bar | _filter_xfs_io + + # Now call the clone ioctl to clone the extent of file bar into file + # foo at its offset 4K. This made file foo have an inline extent at + # offset 4K, something which the btrfs code can not deal with in future + # IO operations because all inline extents are supposed to start at an + # offset of 0, resulting in all sorts of chaos. + # So here we validate that clone ioctl returns an EOPNOTSUPP, which is + # what it returns for other cases dealing with inlined extents. + $CLONER_PROG -s 0 -d $((4 * 1024)) -l $((2 * 1024)) \ + $SCRATCH_MNT/bar $SCRATCH_MNT/foo + + # Because of the inline extent at offset 4K, the following write made + # the kernel crash with a BUG_ON(). + $XFS_IO_PROG -c "pwrite -S 0xdd 6K 2K" $SCRATCH_MNT/foo | _filter_xfs_io + + status=0 + exit + +The stack trace of the BUG_ON() triggered by the last write is: + + [152154.035903] ------------[ cut here ]------------ + [152154.036424] kernel BUG at mm/page-writeback.c:2286! 
+ [152154.036424] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC + [152154.036424] Modules linked in: btrfs dm_flakey dm_mod crc32c_generic xor raid6_pq nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc loop fuse parport_pc acpi_cpu$ + [152154.036424] CPU: 2 PID: 17873 Comm: xfs_io Tainted: G W 4.1.0-rc6-btrfs-next-11+ #2 + [152154.036424] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014 + [152154.036424] task: ffff880429f70990 ti: ffff880429efc000 task.ti: ffff880429efc000 + [152154.036424] RIP: 0010:[] [] clear_page_dirty_for_io+0x1e/0x90 + [152154.036424] RSP: 0018:ffff880429effc68 EFLAGS: 00010246 + [152154.036424] RAX: 0200000000000806 RBX: ffffea0006a6d8f0 RCX: 0000000000000001 + [152154.036424] RDX: 0000000000000000 RSI: ffffffff81155d1b RDI: ffffea0006a6d8f0 + [152154.036424] RBP: ffff880429effc78 R08: ffff8801ce389fe0 R09: 0000000000000001 + [152154.036424] R10: 0000000000002000 R11: ffffffffffffffff R12: ffff8800200dce68 + [152154.036424] R13: 0000000000000000 R14: ffff8800200dcc88 R15: ffff8803d5736d80 + [152154.036424] FS: 00007fbf119f6700(0000) GS:ffff88043d280000(0000) knlGS:0000000000000000 + [152154.036424] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 + [152154.036424] CR2: 0000000001bdc000 CR3: 00000003aa555000 CR4: 00000000000006e0 + [152154.036424] Stack: + [152154.036424] ffff8803d5736d80 0000000000000001 ffff880429effcd8 ffffffffa04e97c1 + [152154.036424] ffff880429effd68 ffff880429effd60 0000000000000001 ffff8800200dc9c8 + [152154.036424] 0000000000000001 ffff8800200dcc88 0000000000000000 0000000000001000 + [152154.036424] Call Trace: + [152154.036424] [] lock_and_cleanup_extent_if_need+0x147/0x18d [btrfs] + [152154.036424] [] __btrfs_buffered_write+0x245/0x4c8 [btrfs] + [152154.036424] [] ? btrfs_file_write_iter+0x150/0x3e0 [btrfs] + [152154.036424] [] ? 
btrfs_file_write_iter+0x15f/0x3e0 [btrfs] + [152154.036424] [] btrfs_file_write_iter+0x2cc/0x3e0 [btrfs] + [152154.036424] [] __vfs_write+0x7c/0xa5 + [152154.036424] [] vfs_write+0xa0/0xe4 + [152154.036424] [] SyS_pwrite64+0x64/0x82 + [152154.036424] [] system_call_fastpath+0x12/0x6f + [152154.036424] Code: 48 89 c7 e8 0f ff ff ff 5b 41 5c 5d c3 0f 1f 44 00 00 55 48 89 e5 41 54 53 48 89 fb e8 ae ef 00 00 49 89 c4 48 8b 03 a8 01 75 02 <0f> 0b 4d 85 e4 74 59 49 8b 3c 2$ + [152154.036424] RIP [] clear_page_dirty_for_io+0x1e/0x90 + [152154.036424] RSP + [152154.242621] ---[ end trace e3d3376b23a57041 ]--- + +Fix this by returning the error EOPNOTSUPP if an attempt to copy an +inline extent into a non-zero offset happens, just like what is done for +other scenarios that would require copying/splitting inline extents, +which were introduced by the following commits: + + 00fdf13a2e9f ("Btrfs: fix a crash of clone with inline extents's split") + 3f9e3df8da3c ("btrfs: replace error code from btrfs_drop_extents") + +Signed-off-by: Filipe Manana +Signed-off-by: Greg Kroah-Hartman + +--- + fs/btrfs/ioctl.c | 14 ++++++++++++++ + 1 file changed, 14 insertions(+) + +--- a/fs/btrfs/ioctl.c ++++ b/fs/btrfs/ioctl.c +@@ -3434,6 +3434,20 @@ process_slot: + u64 trim = 0; + u64 aligned_end = 0; + ++ /* ++ * Don't copy an inline extent into an offset ++ * greater than zero. Having an inline extent ++ * at such an offset results in chaos as btrfs ++ * isn't prepared for such cases. Just skip ++ * this case for the same reasons as commented ++ * at btrfs_ioctl_clone(). 
++ */ ++ if (last_dest_end > 0) { ++ ret = -EOPNOTSUPP; ++ btrfs_end_transaction(trans, root); ++ goto out; ++ } ++ + if (off > key.offset) { + skip = off - key.offset; + new_key.offset += skip; diff --git a/queue-4.1/btrfs-fix-fsync-data-loss-after-append-write.patch b/queue-4.1/btrfs-fix-fsync-data-loss-after-append-write.patch new file mode 100644 index 00000000000..b8349f7b3a6 --- /dev/null +++ b/queue-4.1/btrfs-fix-fsync-data-loss-after-append-write.patch @@ -0,0 +1,175 @@ +From e4545de5b035c7debb73d260c78377dbb69cbfb5 Mon Sep 17 00:00:00 2001 +From: Filipe Manana +Date: Wed, 17 Jun 2015 12:49:23 +0100 +Subject: Btrfs: fix fsync data loss after append write + +From: Filipe Manana + +commit e4545de5b035c7debb73d260c78377dbb69cbfb5 upstream. + +If we do an append write to a file (which increases its inode's i_size) +that does not have the flag BTRFS_INODE_NEEDS_FULL_SYNC set in its inode, +and the previous transaction added a new hard link to the file, which sets +the flag BTRFS_INODE_COPY_EVERYTHING in the file's inode, and then fsync +the file, the inode's new i_size isn't logged. This has the consequence +that after the fsync log is replayed, the file size remains what it was +before the append write operation, which means users/applications will +not be able to read the data that was successfully fsync'ed before. + +This happens because neither the inode item nor the delayed inode get +their i_size updated when the append write is made - doing so would +require starting a transaction in the buffered write path, something that +we do not do intentionally for performance reasons. + +Fix this by making sure that when the flag BTRFS_INODE_COPY_EVERYTHING is +set the inode is logged with its current i_size (log the in-memory inode +into the log tree). 
+ +This issue is not a recent regression and is easy to reproduce with the +following test case for fstests: + + seq=`basename $0` + seqres=$RESULT_DIR/$seq + echo "QA output created by $seq" + + here=`pwd` + tmp=/tmp/$$ + status=1 # failure is the default! + + _cleanup() + { + _cleanup_flakey + rm -f $tmp.* + } + trap "_cleanup; exit \$status" 0 1 2 3 15 + + # get standard environment, filters and checks + . ./common/rc + . ./common/filter + . ./common/dmflakey + + # real QA test starts here + _supported_fs generic + _supported_os Linux + _need_to_be_root + _require_scratch + _require_dm_flakey + _require_metadata_journaling $SCRATCH_DEV + + _crash_and_mount() + { + # Simulate a crash/power loss. + _load_flakey_table $FLAKEY_DROP_WRITES + _unmount_flakey + # Allow writes again and mount. This makes the fs replay its fsync log. + _load_flakey_table $FLAKEY_ALLOW_WRITES + _mount_flakey + } + + rm -f $seqres.full + + _scratch_mkfs >> $seqres.full 2>&1 + _init_flakey + _mount_flakey + + # Create the test file with some initial data and then fsync it. + # The fsync here is only needed to trigger the issue in btrfs, as it causes + # the flag BTRFS_INODE_NEEDS_FULL_SYNC to be removed from the btrfs inode. + $XFS_IO_PROG -f -c "pwrite -S 0xaa 0 32k" \ + -c "fsync" \ + $SCRATCH_MNT/foo | _filter_xfs_io + sync + + # Add a hard link to our file. + # On btrfs this sets the flag BTRFS_INODE_COPY_EVERYTHING on the btrfs inode, + # which is a necessary condition to trigger the issue. + ln $SCRATCH_MNT/foo $SCRATCH_MNT/bar + + # Sync the filesystem to force a commit of the current btrfs transaction, this + # is a necessary condition to trigger the bug on btrfs. + sync + + # Now append more data to our file, increasing its size, and fsync the file. 
+ # In btrfs because the inode flag BTRFS_INODE_COPY_EVERYTHING was set and the + # write path did not update the inode item in the btree nor the delayed inode + # item (in memory structure) in the current transaction (created by the fsync + # handler), the fsync did not record the inode's new i_size in the fsync + # log/journal. This made the data unavailable after the fsync log/journal is + # replayed. + $XFS_IO_PROG -c "pwrite -S 0xbb 32K 32K" \ + -c "fsync" \ + $SCRATCH_MNT/foo | _filter_xfs_io + + echo "File content after fsync and before crash:" + od -t x1 $SCRATCH_MNT/foo + + _crash_and_mount + + echo "File content after crash and log replay:" + od -t x1 $SCRATCH_MNT/foo + + status=0 + exit + +The expected file output before and after the crash/power failure expects the +appended data to be available, which is: + + 0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa + * + 0100000 bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb + * + 0200000 + +Signed-off-by: Filipe Manana +Reviewed-by: Liu Bo +Signed-off-by: Chris Mason +Signed-off-by: Greg Kroah-Hartman + +--- + fs/btrfs/tree-log.c | 14 +++++++++----- + 1 file changed, 9 insertions(+), 5 deletions(-) + +--- a/fs/btrfs/tree-log.c ++++ b/fs/btrfs/tree-log.c +@@ -4161,6 +4161,7 @@ static int btrfs_log_inode(struct btrfs_ + u64 ino = btrfs_ino(inode); + struct extent_map_tree *em_tree = &BTRFS_I(inode)->extent_tree; + u64 logged_isize = 0; ++ bool need_log_inode_item = true; + + path = btrfs_alloc_path(); + if (!path) +@@ -4269,11 +4270,6 @@ static int btrfs_log_inode(struct btrfs_ + } else { + if (inode_only == LOG_INODE_ALL) + fast_search = true; +- ret = log_inode_item(trans, log, dst_path, inode); +- if (ret) { +- err = ret; +- goto out_unlock; +- } + goto log_extents; + } + +@@ -4296,6 +4292,9 @@ again: + if (min_key.type > max_key.type) + break; + ++ if (min_key.type == BTRFS_INODE_ITEM_KEY) ++ need_log_inode_item = false; ++ + src = path->nodes[0]; + if (ins_nr && ins_start_slot + ins_nr == 
path->slots[0]) { + ins_nr++; +@@ -4366,6 +4365,11 @@ next_slot: + log_extents: + btrfs_release_path(path); + btrfs_release_path(dst_path); ++ if (need_log_inode_item) { ++ err = log_inode_item(trans, log, dst_path, inode); ++ if (err) ++ goto out_unlock; ++ } + if (fast_search) { + /* + * Some ordered extents started by fsync might have completed diff --git a/queue-4.1/btrfs-fix-list-transaction-pending_ordered-corruption.patch b/queue-4.1/btrfs-fix-list-transaction-pending_ordered-corruption.patch new file mode 100644 index 00000000000..49ac6d2b4ab --- /dev/null +++ b/queue-4.1/btrfs-fix-list-transaction-pending_ordered-corruption.patch @@ -0,0 +1,86 @@ +From d3efe08400317888f559bbedf0e42cd31575d0ef Mon Sep 17 00:00:00 2001 +From: Filipe Manana +Date: Fri, 3 Jul 2015 20:30:34 +0100 +Subject: Btrfs: fix list transaction->pending_ordered corruption + +From: Filipe Manana + +commit d3efe08400317888f559bbedf0e42cd31575d0ef upstream. + +When we call btrfs_commit_transaction(), we splice the list "ordered" +of our transaction handle into the transaction's "pending_ordered" +list, but we don't re-initialize the "ordered" list of our transaction +handle, this means it still points to the same elements it used to +before the splice. Then we check if the current transaction's state is +>= TRANS_STATE_COMMIT_START and if it is we end up calling +btrfs_end_transaction() which simply splices again the "ordered" list +of our handle into the transaction's "pending_ordered" list, leaving +multiple pointers to the same ordered extents which results in list +corruption when we are iterating, removing and freeing ordered extents +at btrfs_wait_pending_ordered(), resulting in access to dangling +pointers / use-after-free issues. 
+Similarly, btrfs_end_transaction() can end up in some cases calling +btrfs_commit_transaction(), and both did a list splice of the transaction +handle's "ordered" list into the transaction's "pending_ordered" without +re-initializing the handle's "ordered" list, resulting in exactly the +same problem. + +This produces the following warning on a kernel with linked list +debugging enabled: + +[109749.265416] ------------[ cut here ]------------ +[109749.266410] WARNING: CPU: 7 PID: 324 at lib/list_debug.c:59 __list_del_entry+0x5a/0x98() +[109749.267969] list_del corruption. prev->next should be ffff8800ba087e20, but was fffffff8c1f7c35d +(...) +[109749.287505] Call Trace: +[109749.288135] [] dump_stack+0x4f/0x7b +[109749.298080] [] ? console_unlock+0x356/0x3a2 +[109749.331605] [] warn_slowpath_common+0xa1/0xbb +[109749.334849] [] ? __list_del_entry+0x5a/0x98 +[109749.337093] [] warn_slowpath_fmt+0x46/0x48 +[109749.337847] [] __list_del_entry+0x5a/0x98 +[109749.338678] [] btrfs_wait_pending_ordered+0x46/0xdb [btrfs] +[109749.340145] [] ? __btrfs_run_delayed_items+0x149/0x163 [btrfs] +[109749.348313] [] btrfs_commit_transaction+0x36b/0xa10 [btrfs] +[109749.349745] [] ? trace_hardirqs_on+0xd/0xf +[109749.350819] [] btrfs_sync_file+0x36f/0x3fc [btrfs] +[109749.351976] [] vfs_fsync_range+0x8f/0x9e +[109749.360341] [] vfs_fsync+0x1c/0x1e +[109749.368828] [] do_fsync+0x34/0x4e +[109749.369790] [] SyS_fsync+0x10/0x14 +[109749.370925] [] system_call_fastpath+0x12/0x6f +[109749.382274] ---[ end trace 48e0d07f7c03d95a ]--- + +On a non-debug kernel this leads to invalid memory accesses, causing a +crash. Fix this by using list_splice_init() instead of list_splice() in +btrfs_commit_transaction() and btrfs_end_transaction(). 
+ +Fixes: 50d9aa99bd35 ("Btrfs: make sure logged extents complete in the current transaction V3" +Signed-off-by: Filipe Manana +Reviewed-by: David Sterba +Signed-off-by: Greg Kroah-Hartman + +--- + fs/btrfs/transaction.c | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +--- a/fs/btrfs/transaction.c ++++ b/fs/btrfs/transaction.c +@@ -758,7 +758,7 @@ static int __btrfs_end_transaction(struc + + if (!list_empty(&trans->ordered)) { + spin_lock(&info->trans_lock); +- list_splice(&trans->ordered, &cur_trans->pending_ordered); ++ list_splice_init(&trans->ordered, &cur_trans->pending_ordered); + spin_unlock(&info->trans_lock); + } + +@@ -1848,7 +1848,7 @@ int btrfs_commit_transaction(struct btrf + } + + spin_lock(&root->fs_info->trans_lock); +- list_splice(&trans->ordered, &cur_trans->pending_ordered); ++ list_splice_init(&trans->ordered, &cur_trans->pending_ordered); + if (cur_trans->state >= TRANS_STATE_COMMIT_START) { + spin_unlock(&root->fs_info->trans_lock); + atomic_inc(&cur_trans->use_count); diff --git a/queue-4.1/btrfs-fix-memory-leak-in-the-extent_same-ioctl.patch b/queue-4.1/btrfs-fix-memory-leak-in-the-extent_same-ioctl.patch new file mode 100644 index 00000000000..bd47a2221d4 --- /dev/null +++ b/queue-4.1/btrfs-fix-memory-leak-in-the-extent_same-ioctl.patch @@ -0,0 +1,53 @@ +From 497b4050e0eacd4c746dd396d14916b1e669849d Mon Sep 17 00:00:00 2001 +From: Filipe Manana +Date: Fri, 3 Jul 2015 08:36:11 +0100 +Subject: Btrfs: fix memory leak in the extent_same ioctl + +From: Filipe Manana + +commit 497b4050e0eacd4c746dd396d14916b1e669849d upstream. + +We were allocating memory with memdup_user() but we were never releasing +that memory. This affected pretty much every call to the ioctl, whether +it deduplicated extents or not. + +This issue was reported on IRC by Julian Taylor and on the mailing list +by Marcel Ritter, credit goes to them for finding the issue. 
+ +Reported-by: Julian Taylor +Reported-by: Marcel Ritter +Signed-off-by: Filipe Manana +Reviewed-by: Mark Fasheh +Signed-off-by: Greg Kroah-Hartman + +--- + fs/btrfs/ioctl.c | 4 +++- + 1 file changed, 3 insertions(+), 1 deletion(-) + +--- a/fs/btrfs/ioctl.c ++++ b/fs/btrfs/ioctl.c +@@ -2938,7 +2938,7 @@ out_unlock: + static long btrfs_ioctl_file_extent_same(struct file *file, + struct btrfs_ioctl_same_args __user *argp) + { +- struct btrfs_ioctl_same_args *same; ++ struct btrfs_ioctl_same_args *same = NULL; + struct btrfs_ioctl_same_extent_info *info; + struct inode *src = file_inode(file); + u64 off; +@@ -2968,6 +2968,7 @@ static long btrfs_ioctl_file_extent_same + + if (IS_ERR(same)) { + ret = PTR_ERR(same); ++ same = NULL; + goto out; + } + +@@ -3038,6 +3039,7 @@ static long btrfs_ioctl_file_extent_same + + out: + mnt_drop_write_file(file); ++ kfree(same); + return ret; + } + diff --git a/queue-4.1/btrfs-fix-race-between-caching-kthread-and-returning-inode-to-inode-cache.patch b/queue-4.1/btrfs-fix-race-between-caching-kthread-and-returning-inode-to-inode-cache.patch new file mode 100644 index 00000000000..f6afad2eb48 --- /dev/null +++ b/queue-4.1/btrfs-fix-race-between-caching-kthread-and-returning-inode-to-inode-cache.patch @@ -0,0 +1,117 @@ +From ae9d8f17118551bedd797406a6768b87c2146234 Mon Sep 17 00:00:00 2001 +From: Filipe Manana +Date: Sat, 13 Jun 2015 06:52:57 +0100 +Subject: Btrfs: fix race between caching kthread and returning inode to inode cache + +From: Filipe Manana + +commit ae9d8f17118551bedd797406a6768b87c2146234 upstream. + +While the inode cache caching kthread is calling btrfs_unpin_free_ino(), +we could have a concurrent call to btrfs_return_ino() that adds a new +entry to the root's free space cache of pinned inodes. 
This concurrent +call does not acquire the fs_info->commit_root_sem before adding a new +entry if the caching state is BTRFS_CACHE_FINISHED, which is a problem +because the caching kthread calls btrfs_unpin_free_ino() after setting +the caching state to BTRFS_CACHE_FINISHED and therefore races with +the task calling btrfs_return_ino(), which is adding a new entry, while +the former (caching kthread) is navigating the cache's rbtree, removing +and freeing nodes from the cache's rbtree without acquiring the spinlock +that protects the rbtree. + +This race resulted in memory corruption due to double free of struct +btrfs_free_space objects because both tasks can end up freeing the +same objects. Note that adding a new entry can result in merging it with +other entries in the cache, in which case those entries are freed. +This is particularly important as btrfs_free_space structures are also +used for the block group free space caches. + +This memory corruption can be detected by a debugging kernel, which +reports it with the following trace: + +[132408.501148] slab error in verify_redzone_free(): cache `btrfs_free_space': double free detected +[132408.505075] CPU: 15 PID: 12248 Comm: btrfs-ino-cache Tainted: G W 4.1.0-rc5-btrfs-next-10+ #1 +[132408.505075] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014 +[132408.505075] ffff880023e7d320 ffff880163d73cd8 ffffffff8145eec7 ffffffff81095dce +[132408.505075] ffff880009735d40 ffff880163d73ce8 ffffffff81154e1e ffff880163d73d68 +[132408.505075] ffffffff81155733 ffffffffa054a95a ffff8801b6099f00 ffffffffa0505b5f +[132408.505075] Call Trace: +[132408.505075] [] dump_stack+0x4f/0x7b +[132408.505075] [] ? console_unlock+0x356/0x3a2 +[132408.505075] [] __slab_error.isra.28+0x25/0x36 +[132408.505075] [] __cache_free+0xe2/0x4b6 +[132408.505075] [] ? __btrfs_add_free_space+0x2f0/0x343 [btrfs] +[132408.505075] [] ? 
btrfs_unpin_free_ino+0x8e/0x99 [btrfs] +[132408.505075] [] ? time_hardirqs_off+0x15/0x28 +[132408.505075] [] ? trace_hardirqs_off+0xd/0xf +[132408.505075] [] ? kfree+0xb6/0x14e +[132408.505075] [] kfree+0xe5/0x14e +[132408.505075] [] btrfs_unpin_free_ino+0x8e/0x99 [btrfs] +[132408.505075] [] caching_kthread+0x29e/0x2d9 [btrfs] +[132408.505075] [] ? btrfs_unpin_free_ino+0x99/0x99 [btrfs] +[132408.505075] [] kthread+0xef/0xf7 +[132408.505075] [] ? time_hardirqs_on+0x15/0x28 +[132408.505075] [] ? __kthread_parkme+0xad/0xad +[132408.505075] [] ret_from_fork+0x42/0x70 +[132408.505075] [] ? __kthread_parkme+0xad/0xad +[132408.505075] ffff880023e7d320: redzone 1:0x9f911029d74e35b, redzone 2:0x9f911029d74e35b. +[132409.501654] slab: double free detected in cache 'btrfs_free_space', objp ffff880023e7d320 +[132409.503355] ------------[ cut here ]------------ +[132409.504241] kernel BUG at mm/slab.c:2571! + +Therefore fix this by having btrfs_unpin_free_ino() acquire the lock +that protects the rbtree while doing the searches and removing entries. 
+ +Fixes: 1c70d8fb4dfa ("Btrfs: fix inode caching vs tree log") +Signed-off-by: Filipe Manana +Signed-off-by: Chris Mason +Signed-off-by: Greg Kroah-Hartman + +--- + fs/btrfs/inode-map.c | 15 +++++++++++---- + 1 file changed, 11 insertions(+), 4 deletions(-) + +--- a/fs/btrfs/inode-map.c ++++ b/fs/btrfs/inode-map.c +@@ -246,6 +246,7 @@ void btrfs_unpin_free_ino(struct btrfs_r + { + struct btrfs_free_space_ctl *ctl = root->free_ino_ctl; + struct rb_root *rbroot = &root->free_ino_pinned->free_space_offset; ++ spinlock_t *rbroot_lock = &root->free_ino_pinned->tree_lock; + struct btrfs_free_space *info; + struct rb_node *n; + u64 count; +@@ -254,23 +255,29 @@ void btrfs_unpin_free_ino(struct btrfs_r + return; + + while (1) { ++ bool add_to_ctl = true; ++ ++ spin_lock(rbroot_lock); + n = rb_first(rbroot); +- if (!n) ++ if (!n) { ++ spin_unlock(rbroot_lock); + break; ++ } + + info = rb_entry(n, struct btrfs_free_space, offset_index); + BUG_ON(info->bitmap); /* Logic error */ + + if (info->offset > root->ino_cache_progress) +- goto free; ++ add_to_ctl = false; + else if (info->offset + info->bytes > root->ino_cache_progress) + count = root->ino_cache_progress - info->offset + 1; + else + count = info->bytes; + +- __btrfs_add_free_space(ctl, info->offset, count); +-free: + rb_erase(&info->offset_index, rbroot); ++ spin_unlock(rbroot_lock); ++ if (add_to_ctl) ++ __btrfs_add_free_space(ctl, info->offset, count); + kmem_cache_free(btrfs_free_space_cachep, info); + } + } diff --git a/queue-4.1/btrfs-use-kmem_cache_free-when-freeing-entry-in-inode-cache.patch b/queue-4.1/btrfs-use-kmem_cache_free-when-freeing-entry-in-inode-cache.patch new file mode 100644 index 00000000000..d90d1c89744 --- /dev/null +++ b/queue-4.1/btrfs-use-kmem_cache_free-when-freeing-entry-in-inode-cache.patch @@ -0,0 +1,44 @@ +From c3f4a1685bb87e59c886ee68f7967eae07d4dffa Mon Sep 17 00:00:00 2001 +From: Filipe Manana +Date: Sat, 13 Jun 2015 06:52:56 +0100 +Subject: Btrfs: use kmem_cache_free when freeing 
entry in inode cache + +From: Filipe Manana + +commit c3f4a1685bb87e59c886ee68f7967eae07d4dffa upstream. + +The free space entries are allocated using kmem_cache_zalloc(), +through __btrfs_add_free_space(), therefore we should use +kmem_cache_free() and not kfree() to avoid any confusion and +any potential problem. Looking at the kfree() definition at +mm/slab.c it has the following comment: + + /* + * (...) + * + * Don't free memory not originally allocated by kmalloc() + * or you will run into trouble. + */ + +So better be safe and use kmem_cache_free(). + +Signed-off-by: Filipe Manana +Reviewed-by: David Sterba +Signed-off-by: Chris Mason +Signed-off-by: Greg Kroah-Hartman + +--- + fs/btrfs/inode-map.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/fs/btrfs/inode-map.c ++++ b/fs/btrfs/inode-map.c +@@ -271,7 +271,7 @@ void btrfs_unpin_free_ino(struct btrfs_r + __btrfs_add_free_space(ctl, info->offset, count); + free: + rb_erase(&info->offset_index, rbroot); +- kfree(info); ++ kmem_cache_free(btrfs_free_space_cachep, info); + } + } + diff --git a/queue-4.1/crush-fix-a-bug-in-tree-bucket-decode.patch b/queue-4.1/crush-fix-a-bug-in-tree-bucket-decode.patch new file mode 100644 index 00000000000..550cdea3a0d --- /dev/null +++ b/queue-4.1/crush-fix-a-bug-in-tree-bucket-decode.patch @@ -0,0 +1,36 @@ +From 82cd003a77173c91b9acad8033fb7931dac8d751 Mon Sep 17 00:00:00 2001 +From: Ilya Dryomov +Date: Mon, 29 Jun 2015 19:30:23 +0300 +Subject: crush: fix a bug in tree bucket decode + +From: Ilya Dryomov + +commit 82cd003a77173c91b9acad8033fb7931dac8d751 upstream. + +struct crush_bucket_tree::num_nodes is u8, so ceph_decode_8_safe() +should be used. -Wconversion catches this, but I guess it went +unnoticed in all the noise it spews. The actual problem (at least for +common crushmaps) isn't the u32 -> u8 truncation though - it's the +advancement by 4 bytes instead of 1 in the crushmap buffer. 
+ +Fixes: http://tracker.ceph.com/issues/2759 + +Signed-off-by: Ilya Dryomov +Reviewed-by: Josh Durgin +Signed-off-by: Greg Kroah-Hartman + +--- + net/ceph/osdmap.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/net/ceph/osdmap.c ++++ b/net/ceph/osdmap.c +@@ -89,7 +89,7 @@ static int crush_decode_tree_bucket(void + { + int j; + dout("crush_decode_tree_bucket %p to %p\n", *p, end); +- ceph_decode_32_safe(p, end, b->num_nodes, bad); ++ ceph_decode_8_safe(p, end, b->num_nodes, bad); + b->node_weights = kcalloc(b->num_nodes, sizeof(u32), GFP_NOFS); + if (b->node_weights == NULL) + return -ENOMEM; diff --git a/queue-4.1/fuse-initialize-fc-release-before-calling-it.patch b/queue-4.1/fuse-initialize-fc-release-before-calling-it.patch new file mode 100644 index 00000000000..8b2f77421d5 --- /dev/null +++ b/queue-4.1/fuse-initialize-fc-release-before-calling-it.patch @@ -0,0 +1,41 @@ +From 0ad0b3255a08020eaf50e34ef0d6df5bdf5e09ed Mon Sep 17 00:00:00 2001 +From: Miklos Szeredi +Date: Wed, 1 Jul 2015 16:25:55 +0200 +Subject: fuse: initialize fc->release before calling it + +From: Miklos Szeredi + +commit 0ad0b3255a08020eaf50e34ef0d6df5bdf5e09ed upstream. + +fc->release is called from fuse_conn_put() which was used in the error +cleanup before fc->release was initialized. + +[Jeremiah Mahler : assign fc->release after calling +fuse_conn_init(fc) instead of before.] 
+ +Signed-off-by: Miklos Szeredi +Fixes: a325f9b92273 ("fuse: update fuse_conn_init() and separate out fuse_conn_kill()") +Signed-off-by: Greg Kroah-Hartman + +--- + fs/fuse/inode.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/fs/fuse/inode.c ++++ b/fs/fuse/inode.c +@@ -1026,6 +1026,7 @@ static int fuse_fill_super(struct super_ + goto err_fput; + + fuse_conn_init(fc); ++ fc->release = fuse_free_conn; + + fc->dev = sb->s_dev; + fc->sb = sb; +@@ -1040,7 +1041,6 @@ static int fuse_fill_super(struct super_ + fc->dont_mask = 1; + sb->s_flags |= MS_POSIXACL; + +- fc->release = fuse_free_conn; + fc->flags = d.flags; + fc->user_id = d.user_id; + fc->group_id = d.group_id; diff --git a/queue-4.1/selinux-don-t-waste-ebitmap-space-when-importing-netlabel-categories.patch b/queue-4.1/selinux-don-t-waste-ebitmap-space-when-importing-netlabel-categories.patch new file mode 100644 index 00000000000..e0ad4a17fb2 --- /dev/null +++ b/queue-4.1/selinux-don-t-waste-ebitmap-space-when-importing-netlabel-categories.patch @@ -0,0 +1,44 @@ +From 3324603524925c7727207027d1c15e597412d15e Mon Sep 17 00:00:00 2001 +From: Paul Moore +Date: Thu, 9 Jul 2015 14:20:36 -0400 +Subject: selinux: don't waste ebitmap space when importing NetLabel categories + +From: Paul Moore + +commit 3324603524925c7727207027d1c15e597412d15e upstream. + +At present we don't create efficient ebitmaps when importing NetLabel +category bitmaps. This can present a problem when comparing ebitmaps +since ebitmap_cmp() is very strict about these things and considers +these wasteful ebitmaps not equal when compared to their more +efficient counterparts, even if their values are the same. This isn't +likely to cause problems on 64-bit systems due to a bit of luck on +how NetLabel/CIPSO works and the default ebitmap size, but it can be +a problem on 32-bit systems. 
+
+This patch fixes this problem by being a bit more intelligent when
+importing NetLabel category bitmaps by skipping over empty sections
+which should result in a nice, efficient ebitmap.
+
+Signed-off-by: Paul Moore
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ security/selinux/ss/ebitmap.c | 6 ++++++
+ 1 file changed, 6 insertions(+)
+
+--- a/security/selinux/ss/ebitmap.c
++++ b/security/selinux/ss/ebitmap.c
+@@ -153,6 +153,12 @@ int ebitmap_netlbl_import(struct ebitmap
+ 		if (offset == (u32)-1)
+ 			return 0;
+ 
++		/* don't waste ebitmap space if the netlabel bitmap is empty */
++		if (bitmap == 0) {
++			offset += EBITMAP_UNIT_SIZE;
++			continue;
++		}
++
+ 		if (e_iter == NULL ||
+ 		    offset >= e_iter->startbit + EBITMAP_SIZE) {
+ 			e_prev = e_iter;
diff --git a/queue-4.1/selinux-fix-mprotect-prot_exec-regression-caused-by-mm-change.patch b/queue-4.1/selinux-fix-mprotect-prot_exec-regression-caused-by-mm-change.patch
new file mode 100644
index 00000000000..99e7dae9348
--- /dev/null
+++ b/queue-4.1/selinux-fix-mprotect-prot_exec-regression-caused-by-mm-change.patch
@@ -0,0 +1,48 @@
+From 892e8cac99a71f6254f84fc662068d912e1943bf Mon Sep 17 00:00:00 2001
+From: Stephen Smalley
+Date: Fri, 10 Jul 2015 09:40:59 -0400
+Subject: selinux: fix mprotect PROT_EXEC regression caused by mm change
+
+From: Stephen Smalley
+
+commit 892e8cac99a71f6254f84fc662068d912e1943bf upstream.
+
+commit 66fc13039422ba7df2d01a8ee0873e4ef965b50b ("mm: shmem_zero_setup
+skip security check and lockdep conflict with XFS") caused a regression
+for SELinux by disabling any SELinux checking of mprotect PROT_EXEC on
+shared anonymous mappings. However, even before that regression, the
+checking on such mprotect PROT_EXEC calls was inconsistent with the
+checking on a mmap PROT_EXEC call for a shared anonymous mapping. On a
+mmap, the security hook is passed a NULL file and knows it is dealing
+with an anonymous mapping and therefore applies an execmem check and no
+file checks. On a mprotect, the security hook is passed a vma with a
+non-NULL vm_file (as this was set from the internally-created shmem
+file during mmap) and therefore applies the file-based execute check
+and no execmem check. Since the aforementioned commit now marks the
+shmem zero inode with the S_PRIVATE flag, the file checks are disabled
+and we have no checking at all on mprotect PROT_EXEC. Add a test to
+the mprotect hook logic for such private inodes, and apply an execmem
+check in that case. This makes the mmap and mprotect checking
+consistent for shared anonymous mappings, as well as for /dev/zero and
+ashmem.
+
+Signed-off-by: Stephen Smalley
+Signed-off-by: Paul Moore
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ security/selinux/hooks.c | 3 ++-
+ 1 file changed, 2 insertions(+), 1 deletion(-)
+
+--- a/security/selinux/hooks.c
++++ b/security/selinux/hooks.c
+@@ -3288,7 +3288,8 @@ static int file_map_prot_check(struct fi
+ 	int rc = 0;
+ 
+ 	if (default_noexec &&
+-	    (prot & PROT_EXEC) && (!file || (!shared && (prot & PROT_WRITE)))) {
++	    (prot & PROT_EXEC) && (!file || IS_PRIVATE(file_inode(file)) ||
++				   (!shared && (prot & PROT_WRITE)))) {
+ 		/*
+ 		 * We are making executable an anonymous mapping or a
+ 		 * private file mapping that will also be writable.
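Note on the file_map_prot_check() hunk above: the following is a minimal userspace sketch of how the condition changes, not the kernel code itself. It assumes default_noexec is true, and the names sk_file, inode_private, SK_PROT_* and needs_execmem_old/new are hypothetical stand-ins for struct file, IS_PRIVATE(file_inode(file)) and the PROT_* bits.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PROT_* bits tested by the hook. */
enum { SK_PROT_WRITE = 0x2, SK_PROT_EXEC = 0x4 };

/* Toy model of a mapped file; only the S_PRIVATE-style flag matters here. */
struct sk_file {
	bool inode_private;	/* models IS_PRIVATE(file_inode(file)) */
};

/*
 * Condition before the patch: an execmem check is applied only to
 * anonymous mappings (no file) or private writable file mappings.
 */
static bool needs_execmem_old(const struct sk_file *file,
			      unsigned long prot, bool shared)
{
	return (prot & SK_PROT_EXEC) &&
	       (!file || (!shared && (prot & SK_PROT_WRITE)));
}

/*
 * Condition after the patch: a file backed by a private inode (such as
 * the internally created shmem file) is treated like an anonymous mapping.
 */
static bool needs_execmem_new(const struct sk_file *file,
			      unsigned long prot, bool shared)
{
	return (prot & SK_PROT_EXEC) &&
	       (!file || file->inode_private ||
		(!shared && (prot & SK_PROT_WRITE)));
}
```

With a shared mapping whose file has inode_private set, the old predicate is false for PROT_EXEC while the new one is true, which matches the mmap case where the hook sees a NULL file. Ordinary shared file mappings are unaffected in this model.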
diff --git a/queue-4.1/series b/queue-4.1/series
index c3ace8d5a81..d8238a0618c 100644
--- a/queue-4.1/series
+++ b/queue-4.1/series
@@ -193,3 +193,20 @@ hwmon-nct7802-fix-visibility-of-temp3.patch
 arm-dts-mx23-fix-iio-hwmon-support.patch
 btrfs-don-t-invalidate-root-dentry-when-subvolume-deletion-fails.patch
 md-fix-a-build-warning.patch
+btrfs-use-kmem_cache_free-when-freeing-entry-in-inode-cache.patch
+btrfs-fix-race-between-caching-kthread-and-returning-inode-to-inode-cache.patch
+btrfs-fix-fsync-data-loss-after-append-write.patch
+btrfs-fix-memory-leak-in-the-extent_same-ioctl.patch
+btrfs-fix-list-transaction-pending_ordered-corruption.patch
+btrfs-fix-file-corruption-after-cloning-inline-extents.patch
+selinux-don-t-waste-ebitmap-space-when-importing-netlabel-categories.patch
+selinux-fix-mprotect-prot_exec-regression-caused-by-mm-change.patch
+fuse-initialize-fc-release-before-calling-it.patch
+crush-fix-a-bug-in-tree-bucket-decode.patch
+acpi-resources-free-memory-on-error-in-add_region_before.patch
+acpi-pnp-reserve-acpi-resources-at-the-fs_initcall_sync-stage.patch
+acpi-lpss-fix-up-acpi_lpss_create_device.patch
+acpica-tables-enable-both-32-bit-and-64-bit-facs.patch
+acpica-tables-fix-an-issue-that-facs-initialization-is-performed-twice.patch
+acpica-tables-enable-default-64-bit-fadt-addresses-favor.patch
+acpi-pci-fix-regressions-caused-by-resource_size_t-overflow-with-32-bit-kernel.patch
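Note on the queued selinux ebitmap patch (ebitmap_netlbl_import()): the sketch below is a simplified userspace model, not the kernel implementation, of why skipping empty bitmap words yields a more compact result. The names import_node_count, SK_UNIT_BITS and SK_NODE_BITS are hypothetical, and the sizes are chosen only for illustration. It counts how many nodes an import would allocate for a run of words, with and without the added "bitmap == 0" skip.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative sizes only: one word per unit, two units per node. */
enum { SK_UNIT_BITS = 64, SK_NODE_BITS = 2 * SK_UNIT_BITS };

/*
 * Count the nodes an import loop would allocate for `nwords` bitmap
 * words.  When skip_empty is true, all-zero words are skipped before
 * the "do we need a new node?" test, mirroring the added check.
 */
static size_t import_node_count(const uint64_t *words, size_t nwords,
				bool skip_empty)
{
	size_t nodes = 0;
	uint64_t node_start = 0;
	bool have_node = false;

	for (size_t i = 0; i < nwords; i++) {
		uint64_t offset = (uint64_t)i * SK_UNIT_BITS;

		if (skip_empty && words[i] == 0)
			continue;	/* no node for an empty section */

		if (!have_node || offset >= node_start + SK_NODE_BITS) {
			/* start a new node aligned to the node size */
			node_start = offset - (offset % SK_NODE_BITS);
			have_node = true;
			nodes++;
		}
	}
	return nodes;
}
```

For the words {0x1, 0, 0, 0, 0x8}, this model allocates three nodes without the skip and two with it. The extra, all-zero node in the middle is the kind of wasteful entry that, per the commit message, a strict node-by-node comparison such as ebitmap_cmp() treats as a difference even though the set bits are identical.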