--- /dev/null
+From 4751dc99627e4d1465c5bfa8cb7ab31ed418eff5 Mon Sep 17 00:00:00 2001
+From: Filipe Manana <fdmanana@suse.com>
+Date: Mon, 28 Feb 2022 16:29:28 +0000
+Subject: btrfs: add missing run of delayed items after unlink during log replay
+
+From: Filipe Manana <fdmanana@suse.com>
+
+commit 4751dc99627e4d1465c5bfa8cb7ab31ed418eff5 upstream.
+
+During log replay, whenever we need to check if a name (dentry) exists in
+a directory we do searches on the subvolume tree for inode references
+or directory entries (BTRFS_DIR_INDEX_KEY keys, and BTRFS_DIR_ITEM_KEY
+keys as well, before kernel 5.17). However when during log replay we
+unlink a name, through btrfs_unlink_inode(), we may not delete inode
+references and dir index keys from a subvolume tree and instead just add
+the deletions to the delayed inode's delayed items, which will only be
+run when we commit the transaction used for log replay. This means that
+after an unlink operation during log replay, a subsequent search for the
+same name will not see that the name was already deleted, since the
+deletion is recorded only in the delayed items.
+
+We run delayed items after every unlink operation during log replay,
+except at unlink_old_inode_refs() and at add_inode_ref(). This was an
+oversight, as delayed items should be run after every unlink, for the
+reasons stated above.
+
+So fix those two cases.
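+
+The fix applies the same pattern already used after the other unlink
+calls during log replay. As a simplified sketch of the two hunks below
+(surrounding error handling trimmed):
+
+  ret = btrfs_unlink_inode(trans, root, dir, inode, name, namelen);
+  /* Flush the delayed deletion so later name searches see it is gone. */
+  if (!ret)
+          ret = btrfs_run_delayed_items(trans);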
+
+Fixes: 0d836392cadd5 ("Btrfs: fix mount failure after fsync due to hard link recreation")
+Fixes: 1f250e929a9c9 ("Btrfs: fix log replay failure after unlink and link combination")
+CC: stable@vger.kernel.org # 4.19+
+Signed-off-by: Filipe Manana <fdmanana@suse.com>
+Signed-off-by: David Sterba <dsterba@suse.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/btrfs/tree-log.c | 18 ++++++++++++++++++
+ 1 file changed, 18 insertions(+)
+
+--- a/fs/btrfs/tree-log.c
++++ b/fs/btrfs/tree-log.c
+@@ -1308,6 +1308,15 @@ again:
+ inode, name, namelen);
+ kfree(name);
+ iput(dir);
++ /*
++ * Whenever we need to check if a name exists or not, we
++ * check the subvolume tree. So after an unlink we must
++ * run delayed items, so that future checks for a name
++ * during log replay see that the name does not exist
++ * anymore.
++ */
++ if (!ret)
++ ret = btrfs_run_delayed_items(trans);
+ if (ret)
+ goto out;
+ goto again;
+@@ -1559,6 +1568,15 @@ static noinline int add_inode_ref(struct
+ */
+ if (!ret && inode->i_nlink == 0)
+ inc_nlink(inode);
++ /*
++ * Whenever we need to check if a name exists or
++ * not, we check the subvolume tree. So after an
++ * unlink we must run delayed items, so that future
++ * checks for a name during log replay see that the
++ * name does not exist anymore.
++ */
++ if (!ret)
++ ret = btrfs_run_delayed_items(trans);
+ }
+ if (ret < 0)
+ goto out;
--- /dev/null
+From d99478874355d3a7b9d86dfb5d7590d5b1754b1f Mon Sep 17 00:00:00 2001
+From: Filipe Manana <fdmanana@suse.com>
+Date: Thu, 17 Feb 2022 12:12:02 +0000
+Subject: btrfs: fix lost prealloc extents beyond eof after full fsync
+
+From: Filipe Manana <fdmanana@suse.com>
+
+commit d99478874355d3a7b9d86dfb5d7590d5b1754b1f upstream.
+
+When doing a full fsync, if we have prealloc extents beyond (or at) eof,
+and the leaves that contain them were not modified in the current
+transaction, we end up not logging them. This results in losing those
+extents when we replay the log after a power failure, since the inode is
+truncated to the current value of the logged i_size.
+
+Just like for the fast fsync path, we need to always log all prealloc
+extents starting at or beyond i_size. The fast fsync case was fixed in
+commit 471d557afed155 ("Btrfs: fix loss of prealloc extents past i_size
+after fsync log replay") but it missed the full fsync path. The problem
+has existed since the very early days, when the log tree was added by
+commit e02119d5a7b439 ("Btrfs: Add a write ahead tree log to optimize
+synchronous operations").
+
+Example reproducer:
+
+ $ mkfs.btrfs -f /dev/sdc
+ $ mount /dev/sdc /mnt
+
+ # Create our test file with many file extent items, so that they span
+ # several leaves of metadata, even if the node/page size is 64K. Use
+ # direct IO and not fsync/O_SYNC because it's both faster and it avoids
+ # clearing the full sync flag from the inode - we want the fsync below
+ # to trigger the slow full sync code path.
+ $ xfs_io -f -d -c "pwrite -b 4K 0 16M" /mnt/foo
+
+ # Now add two preallocated extents to our file without extending the
+ # file's size. One right at i_size, and another further beyond, leaving
+ # a gap between the two prealloc extents.
+ $ xfs_io -c "falloc -k 16M 1M" /mnt/foo
+ $ xfs_io -c "falloc -k 20M 1M" /mnt/foo
+
+ # Make sure everything is durably persisted and the transaction is
+ # committed. This makes all created extents have a generation lower
+ # than the generation of the transaction used by the next write and
+ # fsync.
+ $ sync
+
+ # Now overwrite only the first extent, which will result in modifying
+ # only the first leaf of metadata for our inode. Then fsync it. This
+ # fsync will use the slow code path (inode full sync bit is set) because
+ # it's the first fsync since the inode was created/loaded.
+ $ xfs_io -c "pwrite 0 4K" -c "fsync" /mnt/foo
+
+ # Extent list before power failure.
+ $ xfs_io -c "fiemap -v" /mnt/foo
+ /mnt/foo:
+ EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
+ 0: [0..7]: 2178048..2178055 8 0x0
+ 1: [8..16383]: 26632..43007 16376 0x0
+ 2: [16384..32767]: 2156544..2172927 16384 0x0
+ 3: [32768..34815]: 2172928..2174975 2048 0x800
+ 4: [34816..40959]: hole 6144
+ 5: [40960..43007]: 2174976..2177023 2048 0x801
+
+ <power fail>
+
+ # Mount fs again, trigger log replay.
+ $ mount /dev/sdc /mnt
+
+ # Extent list after power failure and log replay.
+ $ xfs_io -c "fiemap -v" /mnt/foo
+ /mnt/foo:
+ EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
+ 0: [0..7]: 2178048..2178055 8 0x0
+ 1: [8..16383]: 26632..43007 16376 0x0
+ 2: [16384..32767]: 2156544..2172927 16384 0x1
+
+ # The prealloc extents at file offsets 16M and 20M are missing.
+
+So fix this by calling btrfs_log_prealloc_extents() when we are doing a
+full fsync, so that we always log all prealloc extents beyond eof.
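+
+Condensed from the diff below, the tail of copy_inode_items_to_log()
+becomes (sketch):
+
+  if (inode_only == LOG_INODE_ALL && S_ISREG(inode->vfs_inode.i_mode)) {
+          /* Release the path to avoid double locking the same leaf. */
+          btrfs_release_path(path);
+          ret = btrfs_log_prealloc_extents(trans, inode, dst_path);
+  }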
+
+A test case for fstests will follow soon.
+
+CC: stable@vger.kernel.org # 4.19+
+Signed-off-by: Filipe Manana <fdmanana@suse.com>
+Signed-off-by: David Sterba <dsterba@suse.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/btrfs/tree-log.c | 43 +++++++++++++++++++++++++++++++------------
+ 1 file changed, 31 insertions(+), 12 deletions(-)
+
+--- a/fs/btrfs/tree-log.c
++++ b/fs/btrfs/tree-log.c
+@@ -4249,7 +4249,7 @@ static int log_one_extent(struct btrfs_t
+
+ /*
+ * Log all prealloc extents beyond the inode's i_size to make sure we do not
+- * lose them after doing a fast fsync and replaying the log. We scan the
++ * lose them after doing a full/fast fsync and replaying the log. We scan the
+ * subvolume's root instead of iterating the inode's extent map tree because
+ * otherwise we can log incorrect extent items based on extent map conversion.
+ * That can happen due to the fact that extent maps are merged when they
+@@ -5042,6 +5042,7 @@ static int copy_inode_items_to_log(struc
+ struct btrfs_log_ctx *ctx,
+ bool *need_log_inode_item)
+ {
++ const u64 i_size = i_size_read(&inode->vfs_inode);
+ struct btrfs_root *root = inode->root;
+ int ins_start_slot = 0;
+ int ins_nr = 0;
+@@ -5062,13 +5063,21 @@ again:
+ if (min_key->type > max_key->type)
+ break;
+
+- if (min_key->type == BTRFS_INODE_ITEM_KEY)
++ if (min_key->type == BTRFS_INODE_ITEM_KEY) {
+ *need_log_inode_item = false;
+-
+- if ((min_key->type == BTRFS_INODE_REF_KEY ||
+- min_key->type == BTRFS_INODE_EXTREF_KEY) &&
+- inode->generation == trans->transid &&
+- !recursive_logging) {
++ } else if (min_key->type == BTRFS_EXTENT_DATA_KEY &&
++ min_key->offset >= i_size) {
++ /*
++ * Extents at and beyond eof are logged with
++ * btrfs_log_prealloc_extents().
++ * Only regular files have BTRFS_EXTENT_DATA_KEY keys,
++ * and no keys greater than that, so bail out.
++ */
++ break;
++ } else if ((min_key->type == BTRFS_INODE_REF_KEY ||
++ min_key->type == BTRFS_INODE_EXTREF_KEY) &&
++ inode->generation == trans->transid &&
++ !recursive_logging) {
+ u64 other_ino = 0;
+ u64 other_parent = 0;
+
+@@ -5099,10 +5108,8 @@ again:
+ btrfs_release_path(path);
+ goto next_key;
+ }
+- }
+-
+- /* Skip xattrs, we log them later with btrfs_log_all_xattrs() */
+- if (min_key->type == BTRFS_XATTR_ITEM_KEY) {
++ } else if (min_key->type == BTRFS_XATTR_ITEM_KEY) {
++ /* Skip xattrs, logged later with btrfs_log_all_xattrs() */
+ if (ins_nr == 0)
+ goto next_slot;
+ ret = copy_items(trans, inode, dst_path, path,
+@@ -5155,9 +5162,21 @@ next_key:
+ break;
+ }
+ }
+- if (ins_nr)
++ if (ins_nr) {
+ ret = copy_items(trans, inode, dst_path, path, ins_start_slot,
+ ins_nr, inode_only, logged_isize);
++ if (ret)
++ return ret;
++ }
++
++ if (inode_only == LOG_INODE_ALL && S_ISREG(inode->vfs_inode.i_mode)) {
++ /*
++ * Release the path because otherwise we might attempt to double
++ * lock the same leaf with btrfs_log_prealloc_extents() below.
++ */
++ btrfs_release_path(path);
++ ret = btrfs_log_prealloc_extents(trans, inode, dst_path);
++ }
+
+ return ret;
+ }
--- /dev/null
+From d4aef1e122d8bbdc15ce3bd0bc813d6b44a7d63a Mon Sep 17 00:00:00 2001
+From: Sidong Yang <realwakka@gmail.com>
+Date: Mon, 28 Feb 2022 01:43:40 +0000
+Subject: btrfs: qgroup: fix deadlock between rescan worker and remove qgroup
+
+From: Sidong Yang <realwakka@gmail.com>
+
+commit d4aef1e122d8bbdc15ce3bd0bc813d6b44a7d63a upstream.
+
+Commit e804861bd4e6 ("btrfs: fix deadlock between quota disable and
+qgroup rescan worker") by Kawasaki resolved a deadlock between quota
+disable and the qgroup rescan worker. But there is another similar
+deadlock case, involving enabling or disabling quotas while creating or
+removing qgroups. It can be reproduced with the simple script below.
+
+for i in {1..100}
+do
+ btrfs quota enable /mnt &
+ btrfs qgroup create 1/0 /mnt &
+ btrfs qgroup destroy 1/0 /mnt &
+ btrfs quota disable /mnt &
+done
+
+Here's why the deadlock happens:
+
+1) The quota rescan task is running.
+
+2) Task A calls btrfs_quota_disable(), locks the qgroup_ioctl_lock
+ mutex, and then calls btrfs_qgroup_wait_for_completion(), to wait for
+ the quota rescan task to complete.
+
+3) Task B calls btrfs_remove_qgroup() and it blocks when trying to lock
+ the qgroup_ioctl_lock mutex, because it's being held by task A. At that
+ point task B is holding a transaction handle for the current transaction.
+
+4) The quota rescan task calls btrfs_commit_transaction(). This results
+ in it waiting for all other tasks to release their handles on the
+ transaction, but task B is blocked on the qgroup_ioctl_lock mutex
+ while holding a handle on the transaction, and that mutex is being held
+ by task A, which is waiting for the quota rescan task to complete,
+ resulting in a deadlock between these 3 tasks.
+
+To resolve this issue, the thread disabling quota should unlock
+qgroup_ioctl_lock before waiting for rescan completion. Move
+btrfs_qgroup_wait_for_completion() after the unlock of qgroup_ioctl_lock.
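+
+With the fix, the relevant part of btrfs_quota_disable() ends up ordered
+like this (condensed from the diff below):
+
+  /*
+   * Drop the mutex first: btrfs_remove_qgroup() may be blocked on it
+   * while holding a transaction handle, and the rescan worker cannot
+   * commit a transaction until that handle is released.
+   */
+  mutex_unlock(&fs_info->qgroup_ioctl_lock);
+  clear_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
+  btrfs_qgroup_wait_for_completion(fs_info, false);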
+
+Fixes: e804861bd4e6 ("btrfs: fix deadlock between quota disable and qgroup rescan worker")
+CC: stable@vger.kernel.org # 5.4+
+Reviewed-by: Filipe Manana <fdmanana@suse.com>
+Reviewed-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
+Signed-off-by: Sidong Yang <realwakka@gmail.com>
+Reviewed-by: David Sterba <dsterba@suse.com>
+Signed-off-by: David Sterba <dsterba@suse.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/btrfs/qgroup.c | 9 ++++++++-
+ 1 file changed, 8 insertions(+), 1 deletion(-)
+
+--- a/fs/btrfs/qgroup.c
++++ b/fs/btrfs/qgroup.c
+@@ -1117,13 +1117,20 @@ int btrfs_quota_disable(struct btrfs_fs_
+ goto out;
+
+ /*
++ * Unlock the qgroup_ioctl_lock mutex before waiting for the rescan worker to
++ * complete. Otherwise we can deadlock because btrfs_remove_qgroup() needs
++ * to lock that mutex while holding a transaction handle and the rescan
++ * worker needs to commit a transaction.
++ */
++ mutex_unlock(&fs_info->qgroup_ioctl_lock);
++
++ /*
+ * Request qgroup rescan worker to complete and wait for it. This wait
+ * must be done before transaction start for quota disable since it may
+ * deadlock with transaction by the qgroup rescan worker.
+ */
+ clear_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
+ btrfs_qgroup_wait_for_completion(fs_info, false);
+- mutex_unlock(&fs_info->qgroup_ioctl_lock);
+
+ /*
+ * 1 For the root item
input-elan_i2c-fix-regulator-enable-count-imbalance-after-suspend-resume.patch
hid-add-mapping-for-key_dictate.patch
hid-add-mapping-for-key_all_applications.patch
+tracing-histogram-fix-sorting-on-old-cpu-value.patch
+tracing-fix-return-value-of-__setup-handlers.patch
+btrfs-fix-lost-prealloc-extents-beyond-eof-after-full-fsync.patch
+btrfs-qgroup-fix-deadlock-between-rescan-worker-and-remove-qgroup.patch
+btrfs-add-missing-run-of-delayed-items-after-unlink-during-log-replay.patch
--- /dev/null
+From 1d02b444b8d1345ea4708db3bab4db89a7784b55 Mon Sep 17 00:00:00 2001
+From: Randy Dunlap <rdunlap@infradead.org>
+Date: Wed, 2 Mar 2022 19:17:44 -0800
+Subject: tracing: Fix return value of __setup handlers
+
+From: Randy Dunlap <rdunlap@infradead.org>
+
+commit 1d02b444b8d1345ea4708db3bab4db89a7784b55 upstream.
+
+__setup() handlers should generally return 1 to indicate that the
+boot option has been handled.
+
+Using invalid option values causes the entire kernel boot option
+string to be reported as Unknown and added to init's environment
+strings, polluting it.
+
+ Unknown kernel command line parameters "BOOT_IMAGE=/boot/bzImage-517rc6
+ kprobe_event=p,syscall_any,$arg1 trace_options=quiet
+ trace_clock=jiffies", will be passed to user space.
+
+ Run /sbin/init as init process
+ with arguments:
+ /sbin/init
+ with environment:
+ HOME=/
+ TERM=linux
+ BOOT_IMAGE=/boot/bzImage-517rc6
+ kprobe_event=p,syscall_any,$arg1
+ trace_options=quiet
+ trace_clock=jiffies
+
+Return 1 from the __setup() handlers so that init's environment is not
+polluted with kernel boot options.
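+
+For reference, the convention for a handler that consumes its option
+looks like this ("my_opt" and my_opt_setup() are made-up names for
+illustration, not one of the handlers touched here):
+
+  static int __init my_opt_setup(char *str)
+  {
+          /* parse str here */
+          return 1;  /* handled; returning 0 passes the option to init */
+  }
+  __setup("my_opt=", my_opt_setup);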
+
+Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru
+Link: https://lkml.kernel.org/r/20220303031744.32356-1-rdunlap@infradead.org
+
+Cc: stable@vger.kernel.org
+Fixes: 7bcfaf54f591 ("tracing: Add trace_options kernel command line parameter")
+Fixes: e1e232ca6b8f ("tracing: Add trace_clock=<clock> kernel parameter")
+Fixes: 970988e19eb0 ("tracing/kprobe: Add kprobe_event= boot parameter")
+Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
+Reported-by: Igor Zhbanov <i.zhbanov@omprussia.ru>
+Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
+Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ kernel/trace/trace.c | 4 ++--
+ kernel/trace/trace_kprobe.c | 2 +-
+ 2 files changed, 3 insertions(+), 3 deletions(-)
+
+--- a/kernel/trace/trace.c
++++ b/kernel/trace/trace.c
+@@ -219,7 +219,7 @@ static char trace_boot_options_buf[MAX_T
+ static int __init set_trace_boot_options(char *str)
+ {
+ strlcpy(trace_boot_options_buf, str, MAX_TRACER_SIZE);
+- return 0;
++ return 1;
+ }
+ __setup("trace_options=", set_trace_boot_options);
+
+@@ -230,7 +230,7 @@ static int __init set_trace_boot_clock(c
+ {
+ strlcpy(trace_boot_clock_buf, str, MAX_TRACER_SIZE);
+ trace_boot_clock = trace_boot_clock_buf;
+- return 0;
++ return 1;
+ }
+ __setup("trace_clock=", set_trace_boot_clock);
+
+--- a/kernel/trace/trace_kprobe.c
++++ b/kernel/trace/trace_kprobe.c
+@@ -29,7 +29,7 @@ static bool kprobe_boot_events_enabled
+ static int __init set_kprobe_boot_events(char *str)
+ {
+ strlcpy(kprobe_boot_events_buf, str, COMMAND_LINE_SIZE);
+- return 0;
++ return 1;
+ }
+ __setup("kprobe_event=", set_kprobe_boot_events);
+
--- /dev/null
+From 1d1898f65616c4601208963c3376c1d828cbf2c7 Mon Sep 17 00:00:00 2001
+From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
+Date: Tue, 1 Mar 2022 22:29:04 -0500
+Subject: tracing/histogram: Fix sorting on old "cpu" value
+
+From: Steven Rostedt (Google) <rostedt@goodmis.org>
+
+commit 1d1898f65616c4601208963c3376c1d828cbf2c7 upstream.
+
+When trying to add a histogram against an event with the "cpu" field, it
+was impossible due to "cpu" being a keyword to key off of the running CPU.
+So to fix this, it was changed to "common_cpu" to match the other generic
+fields (like "common_pid"). But since some scripts used "cpu" for keying
+off of the CPU (for events that did not have "cpu" as a field, which is
+most of them), a backward compatibility trick was added such that if "cpu"
+was used as a key, and the event did not have "cpu" as a field name, then
+it would fallback and switch over to "common_cpu".
+
+This fix has a couple of subtle bugs. One was that when switching over to
+"common_cpu", it did not change the field name, it just set a flag. But
+the code still found a "cpu" field. The "cpu" field is used for filtering
+and is returned when the event does not have a "cpu" field.
+
+This was found by:
+
+ # cd /sys/kernel/tracing
+ # echo hist:key=cpu,pid:sort=cpu > events/sched/sched_wakeup/trigger
+ # cat events/sched/sched_wakeup/hist
+
+Which showed the histogram unsorted:
+
+{ cpu: 19, pid: 1175 } hitcount: 1
+{ cpu: 6, pid: 239 } hitcount: 2
+{ cpu: 23, pid: 1186 } hitcount: 14
+{ cpu: 12, pid: 249 } hitcount: 2
+{ cpu: 3, pid: 994 } hitcount: 5
+
+Instead of hard coding the "cpu" checks, take advantage of the fact that
+trace_find_event_field() returns a special field for "cpu" and "CPU" if
+the event does not have "cpu" as a field. This special field has the
+"filter_type" of "FILTER_CPU". Check that to test if the returned field is
+of the CPU type instead of doing the string compare.
+
+Also, fix the sorting bug by testing for the hist_field flag of
+HIST_FIELD_FL_CPU when setting up the sort routine. Otherwise it will use
+the special CPU field to know what compare routine to use, and since that
+special field does not have a size, it returns tracing_map_cmp_none.
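+
+Both checks are visible in the diff below. In parse_field(), the string
+compare becomes (sketch):
+
+  if (field && field->filter_type == FILTER_CPU)
+          *flags |= HIST_FIELD_FL_CPU;
+
+and create_tracing_map_fields() now treats HIST_FIELD_FL_CPU fields as
+numeric when picking a compare function, instead of relying on the
+special field's zero size.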
+
+Cc: stable@vger.kernel.org
+Fixes: 1e3bac71c505 ("tracing/histogram: Rename "cpu" to "common_cpu"")
+Reported-by: Daniel Bristot de Oliveira <bristot@kernel.org>
+Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ kernel/trace/trace_events_hist.c | 6 +++---
+ 1 file changed, 3 insertions(+), 3 deletions(-)
+
+--- a/kernel/trace/trace_events_hist.c
++++ b/kernel/trace/trace_events_hist.c
+@@ -2891,9 +2891,9 @@ parse_field(struct hist_trigger_data *hi
+ /*
+ * For backward compatibility, if field_name
+ * was "cpu", then we treat this the same as
+- * common_cpu.
++ * common_cpu. This also works for "CPU".
+ */
+- if (strcmp(field_name, "cpu") == 0) {
++ if (field && field->filter_type == FILTER_CPU) {
+ *flags |= HIST_FIELD_FL_CPU;
+ } else {
+ hist_err(tr, HIST_ERR_FIELD_NOT_FOUND,
+@@ -5247,7 +5247,7 @@ static int create_tracing_map_fields(str
+
+ if (hist_field->flags & HIST_FIELD_FL_STACKTRACE)
+ cmp_fn = tracing_map_cmp_none;
+- else if (!field)
++ else if (!field || hist_field->flags & HIST_FIELD_FL_CPU)
+ cmp_fn = tracing_map_cmp_num(hist_field->size,
+ hist_field->is_signed);
+ else if (is_string_field(field))