From: Greg Kroah-Hartman
Date: Fri, 29 Jan 2021 10:49:57 +0000 (+0100)
Subject: 4.14-stable patches
X-Git-Tag: v4.4.254~2
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=1c0fa75937ceb3f7d5a6ee0a8e85298506a01c34;p=thirdparty%2Fkernel%2Fstable-queue.git

4.14-stable patches

added patches:
	fs-fix-lazytime-expiration-handling-in-__writeback_single_inode.patch
	fs-move-i_dirty_inode-to-fs.h.patch
	writeback-drop-i_dirty_time_expire.patch
	x86-boot-compressed-disable-relocation-relaxation.patch
---

diff --git a/queue-4.14/fs-fix-lazytime-expiration-handling-in-__writeback_single_inode.patch b/queue-4.14/fs-fix-lazytime-expiration-handling-in-__writeback_single_inode.patch
new file mode 100644
index 00000000000..13dc13f36b4
--- /dev/null
+++ b/queue-4.14/fs-fix-lazytime-expiration-handling-in-__writeback_single_inode.patch
@@ -0,0 +1,115 @@
+From foo@baz Fri Jan 29 11:42:15 AM CET 2021
+From: Eric Biggers
+Date: Mon, 25 Jan 2021 12:37:44 -0800
+Subject: fs: fix lazytime expiration handling in __writeback_single_inode()
+To: stable@vger.kernel.org
+Cc: linux-fsdevel@vger.kernel.org, Jan Kara, Christoph Hellwig
+Message-ID: <20210125203744.325479-4-ebiggers@kernel.org>
+
+From: Eric Biggers
+
+commit 1e249cb5b7fc09ff216aa5a12f6c302e434e88f9 upstream.
+
+When lazytime is enabled and an inode is being written due to its
+in-memory updated timestamps having expired, either due to a sync() or
+syncfs() system call or due to dirtytime_expire_interval having elapsed,
+the VFS needs to inform the filesystem so that the filesystem can copy
+the inode's timestamps out to the on-disk data structures.
+
+This is done by __writeback_single_inode() calling
+mark_inode_dirty_sync(), which then calls ->dirty_inode(I_DIRTY_SYNC).
+
+However, this occurs after __writeback_single_inode() has already
+cleared the dirty flags from ->i_state. This causes two bugs:
+
+- mark_inode_dirty_sync() redirties the inode, causing it to remain
+ dirty. This wastefully causes the inode to be written twice. But
+ more importantly, it breaks cases where sync_filesystem() is expected
+ to clean dirty inodes. This includes the FS_IOC_REMOVE_ENCRYPTION_KEY
+ ioctl (as reported at
+ https://lore.kernel.org/r/20200306004555.GB225345@gmail.com), as well
+ as possibly filesystem freezing (freeze_super()).
+
+- Since ->i_state doesn't contain I_DIRTY_TIME when ->dirty_inode() is
+ called from __writeback_single_inode() for lazytime expiration,
+ xfs_fs_dirty_inode() ignores the notification. (XFS only cares about
+ lazytime expirations, and it assumes that i_state will contain
+ I_DIRTY_TIME during those.) Therefore, lazy timestamps aren't
+ persisted by sync(), syncfs(), or dirtytime_expire_interval on XFS.
+
+Fix this by moving the call to mark_inode_dirty_sync() to earlier in
+__writeback_single_inode(), before the dirty flags are cleared from
+i_state. This makes filesystems be properly notified of the timestamp
+expiration, and it avoids incorrectly redirtying the inode.
+
+This fixes xfstest generic/580 (which tests
+FS_IOC_REMOVE_ENCRYPTION_KEY) when run on ext4 or f2fs with lazytime
+enabled. It also fixes the new lazytime xfstest I've proposed, which
+reproduces the above-mentioned XFS bug
+(https://lore.kernel.org/r/20210105005818.92978-1-ebiggers@kernel.org).
+
+Alternatively, we could call ->dirty_inode(I_DIRTY_SYNC) directly. But
+due to the introduction of I_SYNC_QUEUED, mark_inode_dirty_sync() is the
+right thing to do because mark_inode_dirty_sync() now knows not to move
+the inode to a writeback list if it is currently queued for sync.
+
+Fixes: 0ae45f63d4ef ("vfs: add support for a lazytime mount option")
+Cc: stable@vger.kernel.org
+Depends-on: 5afced3bf281 ("writeback: Avoid skipping inode writeback")
+Link: https://lore.kernel.org/r/20210112190253.64307-2-ebiggers@kernel.org
+Suggested-by: Jan Kara
+Reviewed-by: Christoph Hellwig
+Reviewed-by: Jan Kara
+Signed-off-by: Eric Biggers
+Signed-off-by: Jan Kara
+Signed-off-by: Greg Kroah-Hartman
+---
+ fs/fs-writeback.c | 24 +++++++++++++-----------
+ 1 file changed, 13 insertions(+), 11 deletions(-)
+
+--- a/fs/fs-writeback.c
++++ b/fs/fs-writeback.c
+@@ -1390,21 +1390,25 @@ __writeback_single_inode(struct inode *i
+ }
+
+ /*
+- * Some filesystems may redirty the inode during the writeback
+- * due to delalloc, clear dirty metadata flags right before
+- * write_inode()
++ * If the inode has dirty timestamps and we need to write them, call
++ * mark_inode_dirty_sync() to notify the filesystem about it and to
++ * change I_DIRTY_TIME into I_DIRTY_SYNC.
+ */
+- spin_lock(&inode->i_lock);
+-
+- dirty = inode->i_state & I_DIRTY;
+ if ((inode->i_state & I_DIRTY_TIME) &&
+- ((dirty & I_DIRTY_INODE) ||
+- wbc->sync_mode == WB_SYNC_ALL || wbc->for_sync ||
++ (wbc->sync_mode == WB_SYNC_ALL || wbc->for_sync ||
+ time_after(jiffies, inode->dirtied_time_when +
+ dirtytime_expire_interval * HZ))) {
+- dirty |= I_DIRTY_TIME;
+ trace_writeback_lazytime(inode);
++ mark_inode_dirty_sync(inode);
+ }
++
++ /*
++ * Some filesystems may redirty the inode during the writeback
++ * due to delalloc, clear dirty metadata flags right before
++ * write_inode()
++ */
++ spin_lock(&inode->i_lock);
++ dirty = inode->i_state & I_DIRTY;
+ inode->i_state &= ~dirty;
+
+ /*
+@@ -1425,8 +1429,6 @@ __writeback_single_inode(struct inode *i
+
+ spin_unlock(&inode->i_lock);
+
+- if (dirty & I_DIRTY_TIME)
+- mark_inode_dirty_sync(inode);
+ /* Don't write the inode if only I_DIRTY_PAGES was set */
+ if (dirty & ~I_DIRTY_PAGES) {
+ int err = write_inode(inode, wbc);
diff --git a/queue-4.14/fs-move-i_dirty_inode-to-fs.h.patch b/queue-4.14/fs-move-i_dirty_inode-to-fs.h.patch
new file mode 100644
index 00000000000..321bd7d6201
--- /dev/null
+++ b/queue-4.14/fs-move-i_dirty_inode-to-fs.h.patch
@@ -0,0 +1,111 @@
+From foo@baz Fri Jan 29 11:42:15 AM CET 2021
+From: Eric Biggers
+Date: Mon, 25 Jan 2021 12:37:42 -0800
+Subject: fs: move I_DIRTY_INODE to fs.h
+To: stable@vger.kernel.org
+Cc: linux-fsdevel@vger.kernel.org, Jan Kara, Christoph Hellwig, Al Viro
+Message-ID: <20210125203744.325479-2-ebiggers@kernel.org>
+
+From: Christoph Hellwig
+
+commit 0e11f6443f522f89509495b13ef1f3745640144d upstream.
+
+And use it in a few more places rather than opencoding the values.
+
+Signed-off-by: Christoph Hellwig
+Signed-off-by: Al Viro
+Signed-off-by: Eric Biggers
+Signed-off-by: Greg Kroah-Hartman
+---
+ fs/ext4/inode.c | 4 ++--
+ fs/fs-writeback.c | 9 +++------
+ fs/gfs2/super.c | 2 +-
+ include/linux/fs.h | 3 ++-
+ 4 files changed, 8 insertions(+), 10 deletions(-)
+
+--- a/fs/ext4/inode.c
++++ b/fs/ext4/inode.c
+@@ -5064,12 +5064,12 @@ static int other_inode_match(struct inod
+
+ if ((inode->i_ino != ino) ||
+ (inode->i_state & (I_FREEING | I_WILL_FREE | I_NEW |
+- I_DIRTY_SYNC | I_DIRTY_DATASYNC)) ||
++ I_DIRTY_INODE)) ||
+ ((inode->i_state & I_DIRTY_TIME) == 0))
+ return 0;
+ spin_lock(&inode->i_lock);
+ if (((inode->i_state & (I_FREEING | I_WILL_FREE | I_NEW |
+- I_DIRTY_SYNC | I_DIRTY_DATASYNC)) == 0) &&
++ I_DIRTY_INODE)) == 0) &&
+ (inode->i_state & I_DIRTY_TIME)) {
+ struct ext4_inode_info *ei = EXT4_I(inode);
+
+--- a/fs/fs-writeback.c
++++ b/fs/fs-writeback.c
+@@ -1400,7 +1400,7 @@ __writeback_single_inode(struct inode *i
+
+ dirty = inode->i_state & I_DIRTY;
+ if (inode->i_state & I_DIRTY_TIME) {
+- if ((dirty & (I_DIRTY_SYNC | I_DIRTY_DATASYNC)) ||
++ if ((dirty & I_DIRTY_INODE) ||
+ wbc->sync_mode == WB_SYNC_ALL ||
+ unlikely(inode->i_state & I_DIRTY_TIME_EXPIRED) ||
+ unlikely(time_after(jiffies,
+@@ -2136,7 +2136,6 @@ static noinline void block_dump___mark_i
+ */
+ void __mark_inode_dirty(struct inode *inode, int flags)
+ {
+-#define I_DIRTY_INODE (I_DIRTY_SYNC | I_DIRTY_DATASYNC)
+ struct super_block *sb = inode->i_sb;
+ int dirtytime;
+
+@@ -2146,7 +2145,7 @@ void __mark_inode_dirty(struct inode *in
+ * Don't do this for I_DIRTY_PAGES - that doesn't actually
+ * dirty the inode itself
+ */
+- if (flags & (I_DIRTY_SYNC | I_DIRTY_DATASYNC | I_DIRTY_TIME)) {
++ if (flags & (I_DIRTY_INODE | I_DIRTY_TIME)) {
+ trace_writeback_dirty_inode_start(inode, flags);
+
+ if (sb->s_op->dirty_inode)
+@@ -2222,7 +2221,7 @@ void __mark_inode_dirty(struct inode *in
+ if (dirtytime)
+ inode->dirtied_time_when = jiffies;
+
+- if (inode->i_state & (I_DIRTY_INODE | I_DIRTY_PAGES))
++ if (inode->i_state & I_DIRTY)
+ dirty_list = &wb->b_dirty;
+ else
+ dirty_list = &wb->b_dirty_time;
+@@ -2246,8 +2245,6 @@ void __mark_inode_dirty(struct inode *in
+ }
+ out_unlock_inode:
+ spin_unlock(&inode->i_lock);
+-
+-#undef I_DIRTY_INODE
+ }
+ EXPORT_SYMBOL(__mark_inode_dirty);
+
+--- a/fs/gfs2/super.c
++++ b/fs/gfs2/super.c
+@@ -791,7 +791,7 @@ static void gfs2_dirty_inode(struct inod
+ int need_endtrans = 0;
+ int ret;
+
+- if (!(flags & (I_DIRTY_DATASYNC|I_DIRTY_SYNC)))
++ if (!(flags & I_DIRTY_INODE))
+ return;
+ if (unlikely(test_bit(SDF_SHUTDOWN, &sdp->sd_flags)))
+ return;
+--- a/include/linux/fs.h
++++ b/include/linux/fs.h
+@@ -2015,7 +2015,8 @@ static inline void init_sync_kiocb(struc
+ #define I_OVL_INUSE (1 << 14)
+ #define I_SYNC_QUEUED (1 << 17)
+
+-#define I_DIRTY (I_DIRTY_SYNC | I_DIRTY_DATASYNC | I_DIRTY_PAGES)
++#define I_DIRTY_INODE (I_DIRTY_SYNC | I_DIRTY_DATASYNC)
++#define I_DIRTY (I_DIRTY_INODE | I_DIRTY_PAGES)
+ #define I_DIRTY_ALL (I_DIRTY | I_DIRTY_TIME)
+
+ extern void __mark_inode_dirty(struct inode *, int);
diff --git a/queue-4.14/series b/queue-4.14/series
index 7f516dc2a1b..28cdf8f0384 100644
--- a/queue-4.14/series
+++ b/queue-4.14/series
@@ -44,3 +44,7 @@ futex_Use_pi_state_update_owner__in_put_pi_state_.patch
 futex_Simplify_fixup_pi_state_owner_.patch
 futex_Handle_faults_correctly_for_PI_futexes.patch
 tracing-fix-race-in-trace_open-and-buffer-resize-call.patch
+x86-boot-compressed-disable-relocation-relaxation.patch
+fs-move-i_dirty_inode-to-fs.h.patch
+writeback-drop-i_dirty_time_expire.patch
+fs-fix-lazytime-expiration-handling-in-__writeback_single_inode.patch
diff --git a/queue-4.14/writeback-drop-i_dirty_time_expire.patch b/queue-4.14/writeback-drop-i_dirty_time_expire.patch
new file mode 100644
index 00000000000..adee283bfc6
--- /dev/null
+++ b/queue-4.14/writeback-drop-i_dirty_time_expire.patch
@@ -0,0 +1,122 @@
+From foo@baz Fri Jan 29 11:42:15 AM CET 2021
+From: Eric Biggers
+Date: Mon, 25 Jan 2021 12:37:43 -0800
+Subject: writeback: Drop I_DIRTY_TIME_EXPIRE
+To: stable@vger.kernel.org
+Cc: linux-fsdevel@vger.kernel.org, Jan Kara, Christoph Hellwig
+Message-ID: <20210125203744.325479-3-ebiggers@kernel.org>
+
+From: Jan Kara
+
+commit 5fcd57505c002efc5823a7355e21f48dd02d5a51 upstream.
+
+The only use of I_DIRTY_TIME_EXPIRE is to detect in
+__writeback_single_inode() that inode got there because flush worker
+decided it's time to writeback the dirty inode time stamps (either
+because we are syncing or because of age). However we can detect this
+directly in __writeback_single_inode() and there's no need for the
+strange propagation with I_DIRTY_TIME_EXPIRE flag.
+
+Reviewed-by: Christoph Hellwig
+Signed-off-by: Jan Kara
+Signed-off-by: Eric Biggers
+Signed-off-by: Greg Kroah-Hartman
+---
+ fs/ext4/inode.c | 2 +-
+ fs/fs-writeback.c | 28 +++++++++++-----------------
+ include/linux/fs.h | 1 -
+ include/trace/events/writeback.h | 1 -
+ 4 files changed, 12 insertions(+), 20 deletions(-)
+
+--- a/fs/ext4/inode.c
++++ b/fs/ext4/inode.c
+@@ -5073,7 +5073,7 @@ static int other_inode_match(struct inod
+ (inode->i_state & I_DIRTY_TIME)) {
+ struct ext4_inode_info *ei = EXT4_I(inode);
+
+- inode->i_state &= ~(I_DIRTY_TIME | I_DIRTY_TIME_EXPIRED);
++ inode->i_state &= ~I_DIRTY_TIME;
+ spin_unlock(&inode->i_lock);
+
+ spin_lock(&ei->i_raw_lock);
+--- a/fs/fs-writeback.c
++++ b/fs/fs-writeback.c
+@@ -1154,7 +1154,7 @@ static bool inode_dirtied_after(struct i
+ */
+ static int move_expired_inodes(struct list_head *delaying_queue,
+ struct list_head *dispatch_queue,
+- int flags, unsigned long dirtied_before)
++ unsigned long dirtied_before)
+ {
+ LIST_HEAD(tmp);
+ struct list_head *pos, *node;
+@@ -1170,8 +1170,6 @@ static int move_expired_inodes(struct li
+ list_move(&inode->i_io_list, &tmp);
+ moved++;
+ spin_lock(&inode->i_lock);
+- if (flags & EXPIRE_DIRTY_ATIME)
+- inode->i_state |= I_DIRTY_TIME_EXPIRED;
+ inode->i_state |= I_SYNC_QUEUED;
+ spin_unlock(&inode->i_lock);
+ if (sb_is_blkdev_sb(inode->i_sb))
+@@ -1219,11 +1217,11 @@ static void queue_io(struct bdi_writebac
+
+ assert_spin_locked(&wb->list_lock);
+ list_splice_init(&wb->b_more_io, &wb->b_io);
+- moved = move_expired_inodes(&wb->b_dirty, &wb->b_io, 0, dirtied_before);
++ moved = move_expired_inodes(&wb->b_dirty, &wb->b_io, dirtied_before);
+ if (!work->for_sync)
+ time_expire_jif = jiffies - dirtytime_expire_interval * HZ;
+ moved += move_expired_inodes(&wb->b_dirty_time, &wb->b_io,
+- EXPIRE_DIRTY_ATIME, time_expire_jif);
++ time_expire_jif);
+ if (moved)
+ wb_io_lists_populated(wb);
+ trace_writeback_queue_io(wb, work, dirtied_before, moved);
+@@ -1399,18 +1397,14 @@ __writeback_single_inode(struct inode *i
+ spin_lock(&inode->i_lock);
+
+ dirty = inode->i_state & I_DIRTY;
+- if (inode->i_state & I_DIRTY_TIME) {
+- if ((dirty & I_DIRTY_INODE) ||
+- wbc->sync_mode == WB_SYNC_ALL ||
+- unlikely(inode->i_state & I_DIRTY_TIME_EXPIRED) ||
+- unlikely(time_after(jiffies,
+- (inode->dirtied_time_when +
+- dirtytime_expire_interval * HZ)))) {
+- dirty |= I_DIRTY_TIME | I_DIRTY_TIME_EXPIRED;
+- trace_writeback_lazytime(inode);
+- }
+- } else
+- inode->i_state &= ~I_DIRTY_TIME_EXPIRED;
++ if ((inode->i_state & I_DIRTY_TIME) &&
++ ((dirty & I_DIRTY_INODE) ||
++ wbc->sync_mode == WB_SYNC_ALL || wbc->for_sync ||
++ time_after(jiffies, inode->dirtied_time_when +
++ dirtytime_expire_interval * HZ))) {
++ dirty |= I_DIRTY_TIME;
++ trace_writeback_lazytime(inode);
++ }
+ inode->i_state &= ~dirty;
+
+ /*
+--- a/include/linux/fs.h
++++ b/include/linux/fs.h
+@@ -2010,7 +2010,6 @@ static inline void init_sync_kiocb(struc
+ #define I_DIO_WAKEUP (1 << __I_DIO_WAKEUP)
+ #define I_LINKABLE (1 << 10)
+ #define I_DIRTY_TIME (1 << 11)
+-#define I_DIRTY_TIME_EXPIRED (1 << 12)
+ #define I_WB_SWITCH (1 << 13)
+ #define I_OVL_INUSE (1 << 14)
+ #define I_SYNC_QUEUED (1 << 17)
+--- a/include/trace/events/writeback.h
++++ b/include/trace/events/writeback.h
+@@ -20,7 +20,6 @@
+ {I_CLEAR, "I_CLEAR"}, \
+ {I_SYNC, "I_SYNC"}, \
+ {I_DIRTY_TIME, "I_DIRTY_TIME"}, \
+- {I_DIRTY_TIME_EXPIRED, "I_DIRTY_TIME_EXPIRED"}, \
+ {I_REFERENCED, "I_REFERENCED"} \
+ )
+
diff --git a/queue-4.14/x86-boot-compressed-disable-relocation-relaxation.patch b/queue-4.14/x86-boot-compressed-disable-relocation-relaxation.patch
new file mode 100644
index 00000000000..22266d2e614
--- /dev/null
+++ b/queue-4.14/x86-boot-compressed-disable-relocation-relaxation.patch
@@ -0,0 +1,90 @@
+From foo@baz Fri Jan 29 11:33:22 AM CET 2021
+From: Arvind Sankar
+Date: Tue, 11 Aug 2020 20:43:08 -0400
+Subject: x86/boot/compressed: Disable relocation relaxation
+
+From: Arvind Sankar
+
+commit 09e43968db40c33a73e9ddbfd937f46d5c334924 upstream.
+
+The x86-64 psABI [0] specifies special relocation types
+(R_X86_64_[REX_]GOTPCRELX) for indirection through the Global Offset
+Table, semantically equivalent to R_X86_64_GOTPCREL, which the linker
+can take advantage of for optimization (relaxation) at link time. This
+is supported by LLD and binutils versions 2.26 onwards.
+
+The compressed kernel is position-independent code, however, when using
+LLD or binutils versions before 2.27, it must be linked without the -pie
+option. In this case, the linker may optimize certain instructions into
+a non-position-independent form, by converting foo@GOTPCREL(%rip) to $foo.
+
+This potential issue has been present with LLD and binutils-2.26 for a
+long time, but it has never manifested itself before now:
+
+- LLD and binutils-2.26 only relax
+ movq foo@GOTPCREL(%rip), %reg
+ to
+ leaq foo(%rip), %reg
+ which is still position-independent, rather than
+ mov $foo, %reg
+ which is permitted by the psABI when -pie is not enabled.
+
+- GCC happens to only generate GOTPCREL relocations on mov instructions.
+
+- CLang does generate GOTPCREL relocations on non-mov instructions, but
+ when building the compressed kernel, it uses its integrated assembler
+ (due to the redefinition of KBUILD_CFLAGS dropping -no-integrated-as),
+ which has so far defaulted to not generating the GOTPCRELX
+ relocations.
+
+Nick Desaulniers reports [1,2]:
+
+ "A recent change [3] to a default value of configuration variable
+ (ENABLE_X86_RELAX_RELOCATIONS OFF -> ON) in LLVM now causes Clang's
+ integrated assembler to emit R_X86_64_GOTPCRELX/R_X86_64_REX_GOTPCRELX
+ relocations. LLD will relax instructions with these relocations based
+ on whether the image is being linked as position independent or not.
+ When not, then LLD will relax these instructions to use absolute
+ addressing mode (R_RELAX_GOT_PC_NOPIC). This causes kernels built with
+ Clang and linked with LLD to fail to boot."
+
+Patch series [4] is a solution to allow the compressed kernel to be
+linked with -pie unconditionally, but even if merged is unlikely to be
+backported. As a simple solution that can be applied to stable as well,
+prevent the assembler from generating the relaxed relocation types using
+the -mrelax-relocations=no option. For ease of backporting, do this
+unconditionally.
+
+[0] https://gitlab.com/x86-psABIs/x86-64-ABI/-/blob/master/x86-64-ABI/linker-optimization.tex#L65
+[1] https://lore.kernel.org/lkml/20200807194100.3570838-1-ndesaulniers@google.com/
+[2] https://github.com/ClangBuiltLinux/linux/issues/1121
+[3] https://reviews.llvm.org/rGc41a18cf61790fc898dcda1055c3efbf442c14c0
+[4] https://lore.kernel.org/lkml/20200731202738.2577854-1-nivedita@alum.mit.edu/
+
+Reported-by: Nick Desaulniers
+Signed-off-by: Arvind Sankar
+Signed-off-by: Ingo Molnar
+Tested-by: Nick Desaulniers
+Tested-by: Sedat Dilek
+Acked-by: Ard Biesheuvel
+Reviewed-by: Nick Desaulniers
+Cc: stable@vger.kernel.org
+Link: https://lore.kernel.org/r/20200812004308.1448603-1-nivedita@alum.mit.edu
+[nc: Backport to 4.14]
+Signed-off-by: Nathan Chancellor
+Signed-off-by: Greg Kroah-Hartman
+---
+ arch/x86/boot/compressed/Makefile | 2 ++
+ 1 file changed, 2 insertions(+)
+
+--- a/arch/x86/boot/compressed/Makefile
++++ b/arch/x86/boot/compressed/Makefile
+@@ -36,6 +36,8 @@ KBUILD_CFLAGS += -mno-mmx -mno-sse
+ KBUILD_CFLAGS += $(call cc-option,-ffreestanding)
+ KBUILD_CFLAGS += $(call cc-option,-fno-stack-protector)
+ KBUILD_CFLAGS += $(call cc-disable-warning, address-of-packed-member)
++# Disable relocation relaxation in case the link is not PIE.
++KBUILD_CFLAGS += $(call as-option,-Wa$(comma)-mrelax-relocations=no)
+
+ KBUILD_AFLAGS := $(KBUILD_CFLAGS) -D__ASSEMBLY__
+ GCOV_PROFILE := n