git.ipfire.org Git - people/ms/linux.git/log
3 years ago mips: rename mt_init to mips_mt_init
Liam R. Howlett [Sat, 30 Jul 2022 01:07:13 +0000 (18:07 -0700)] 
mips: rename mt_init to mips_mt_init

Move mt_init out of the way for the maple tree.  Use mips_mt prefix to
match the rest of the functions in the file.

Link: https://lkml.kernel.org/r/20220504002554.654642-2-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
3 years ago mm: shrinkers: fix double kfree on shrinker name
Tetsuo Handa [Wed, 20 Jul 2022 14:47:55 +0000 (23:47 +0900)] 
mm: shrinkers: fix double kfree on shrinker name

syzbot is reporting double kfree() at free_prealloced_shrinker() [1], for
destroy_unused_super() calls free_prealloced_shrinker() even if
prealloc_shrinker() returned an error.  Explicitly clear shrinker name
when prealloc_shrinker() called kfree().

[roman.gushchin@linux.dev: zero shrinker->name in all cases where shrinker->name is freed]
Link: https://lkml.kernel.org/r/YtgteTnQTgyuKUSY@castle
Link: https://syzkaller.appspot.com/bug?extid=8b481578352d4637f510
Link: https://lkml.kernel.org/r/ffa62ece-6a42-2644-16cf-0d33ef32c676@I-love.SAKURA.ne.jp
Fixes: e33c267ab70de424 ("mm: shrinkers: provide shrinkers with names")
Reported-by: syzbot <syzbot+8b481578352d4637f510@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
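
The double-free pattern described in the shrinker commit above can be sketched in
userspace C. This is only an illustration under stated assumptions: struct
demo_shrinker, demo_prealloc() and demo_free_prealloced() are hypothetical
stand-ins, not the kernel's functions, and malloc()/free() stand in for the
kernel allocator. The point is that the error path clears the pointer it frees,
so a later unconditional cleanup call is harmless.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's shrinker; only the name matters here. */
struct demo_shrinker {
    char *name;
};

/* Allocate the name, then optionally simulate a later failure in setup. */
static int demo_prealloc(struct demo_shrinker *s, const char *name, int fail)
{
    assert(s != NULL && name != NULL);
    s->name = malloc(strlen(name) + 1);
    if (!s->name)
        return -1;
    strcpy(s->name, name);
    if (fail) {
        free(s->name);
        s->name = NULL; /* the fix: never leave a dangling pointer behind */
        return -1;
    }
    return 0;
}

/* Cleanup that may run even when demo_prealloc() already failed. */
static void demo_free_prealloced(struct demo_shrinker *s)
{
    assert(s != NULL);
    free(s->name); /* free(NULL) is a no-op, so a cleared name is safe */
    s->name = NULL;
}
```

Without the assignment to NULL in the error path, the second free() would act
on memory that was already released.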
3 years ago NFSD: add security label to struct nfsd_attrs
NeilBrown [Tue, 26 Jul 2022 06:45:30 +0000 (16:45 +1000)] 
NFSD: add security label to struct nfsd_attrs

nfsd_setattr() now sets a security label if provided, and nfsv4 provides
it in the 'open' and 'create' paths and the 'setattr' path.
If setting the label failed (including because the kernel doesn't
support labels), an error field in 'struct nfsd_attrs' is set, and the
caller can respond.  The open/create callers clear
FATTR4_WORD2_SECURITY_LABEL in the returned attr set in this case.
The setattr caller returns the error.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: set attributes when creating symlinks
NeilBrown [Tue, 26 Jul 2022 06:45:30 +0000 (16:45 +1000)] 
NFSD: set attributes when creating symlinks

The NFS protocol includes attributes when creating symlinks.
Linux does store attributes for symlinks and allows them to be set,
though they are not used for permission checking.

NFSD currently doesn't set standard (struct iattr) attributes when
creating symlinks, but for NFSv4 it does set ACLs and security labels.
This is inconsistent.

To improve consistency, pass the provided attributes into nfsd_symlink()
and call nfsd_create_setattr() to set them.

NOTE: this results in a behaviour change for all NFS versions when the
client sends non-default attributes with a SYMLINK request. With the
Linux client, the only attributes are:
attr.ia_mode = S_IFLNK | S_IRWXUGO;
attr.ia_valid = ATTR_MODE;
so the final outcome will be unchanged. Other clients might send
different attributes, and if they did they probably expect them to be
honoured.

We ignore any error from nfsd_create_setattr().  It isn't really clear
what should be done if a file is successfully created, but the
attributes cannot be set.  NFS doesn't allow partial success to be
reported.  Reporting failure is probably more misleading than reporting
success, so the status is ignored.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: introduce struct nfsd_attrs
NeilBrown [Tue, 26 Jul 2022 06:45:30 +0000 (16:45 +1000)] 
NFSD: introduce struct nfsd_attrs

The attributes that nfsd might want to set on a file include 'struct
iattr' as well as an ACL and security label.
The latter two are passed around quite separately from the first, in
part because they are only needed for NFSv4.  This leads to some
clumsiness in the code, such as the attributes NOT being set in
nfsd_create_setattr().

We need to keep the directory locked until all attributes are set to
ensure the file is never visible without all its attributes.  This need
combined with the inconsistent handling of attributes leads to more
clumsiness.

As a first step towards tidying this up, introduce 'struct nfsd_attrs'.
This is passed (by reference) to vfs.c functions that work with
attributes, and is assembled by the various nfs*proc functions which
call them.  As yet only iattr is included, but future patches will
expand this.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: verify the opened dentry after setting a delegation
Jeff Layton [Tue, 26 Jul 2022 06:45:30 +0000 (16:45 +1000)] 
NFSD: verify the opened dentry after setting a delegation

Between opening a file and setting a delegation on it, someone could
rename or unlink the dentry. If this happens, we do not want to grant a
delegation on the open.

On a CLAIM_NULL open, we're opening by filename, and we may (in the
non-create case) or may not (in the create case) be holding i_rwsem
when attempting to set a delegation.  The latter case allows a
race.

After getting a lease, redo the lookup of the file being opened and
validate that the resulting dentry matches the one in the open file
description.

To properly redo the lookup we need an rqst pointer to pass to
nfsd_lookup_dentry(), so make sure that is available.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
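
The re-validation step in the delegation commit above can be sketched as
follows. This is a userspace model under loud assumptions: struct demo_dentry,
struct demo_dir, demo_lookup() and demo_open_still_valid() are hypothetical
names, and a one-entry directory stands in for the real lookup done through
nfsd_lookup_dentry(). It shows only the idea of redoing the lookup after the
lease is obtained and comparing the result with the object that was opened.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins: a dentry and a directory holding a single name. */
struct demo_dentry {
    int ino;
};

struct demo_dir {
    const char *name;           /* NULL once the entry is unlinked */
    struct demo_dentry *dentry; /* what the name currently points to */
};

/* Look the name up again; returns NULL if it no longer exists. */
static struct demo_dentry *demo_lookup(const struct demo_dir *dir,
                                       const char *name)
{
    assert(dir != NULL && name != NULL);
    if (dir->name && strcmp(dir->name, name) == 0)
        return dir->dentry;
    return NULL;
}

/* After getting the lease: keep the delegation only if the name still
 * resolves to the very object that was opened. */
static bool demo_open_still_valid(const struct demo_dir *dir, const char *name,
                                  const struct demo_dentry *opened)
{
    return demo_lookup(dir, name) == opened;
}
```

A rename over the name or an unlink between open and lease both make the
comparison fail, in which case no delegation should be handed out.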
3 years ago NFSD: drop fh argument from alloc_init_deleg
Jeff Layton [Tue, 26 Jul 2022 06:45:30 +0000 (16:45 +1000)] 
NFSD: drop fh argument from alloc_init_deleg

Currently, we pass the fh of the opened file down through several
functions so that alloc_init_deleg can pass it to delegation_blocked.
The filehandle of the open file is available in the nfs4_file however,
so there's no need to pass it in a separate argument.

Drop the argument from alloc_init_deleg, nfs4_open_delegation and
nfs4_set_delegation.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Move copy offload callback arguments into a separate structure
Chuck Lever [Wed, 27 Jul 2022 18:41:18 +0000 (14:41 -0400)] 
NFSD: Move copy offload callback arguments into a separate structure

Refactor so that CB_OFFLOAD arguments can be passed without
allocating a whole struct nfsd4_copy object. On my system (x86_64)
this removes another 96 bytes from struct nfsd4_copy.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Add nfsd4_send_cb_offload()
Chuck Lever [Wed, 27 Jul 2022 18:41:12 +0000 (14:41 -0400)] 
NFSD: Add nfsd4_send_cb_offload()

Refactor for legibility.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Remove kmalloc from nfsd4_do_async_copy()
Chuck Lever [Wed, 27 Jul 2022 18:41:06 +0000 (14:41 -0400)] 
NFSD: Remove kmalloc from nfsd4_do_async_copy()

Instead of manufacturing a phony struct nfsd_file, pass the
struct file returned by nfs42_ssc_open() directly to
nfsd4_do_copy().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Refactor nfsd4_do_copy()
Chuck Lever [Wed, 27 Jul 2022 18:40:59 +0000 (14:40 -0400)] 
NFSD: Refactor nfsd4_do_copy()

Refactor: Now that nfsd4_do_copy() no longer calls the cleanup
helpers, plumb the use of struct file pointers all the way down to
_nfsd_copy_file_range().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Refactor nfsd4_cleanup_inter_ssc() (2/2)
Chuck Lever [Wed, 27 Jul 2022 18:40:53 +0000 (14:40 -0400)] 
NFSD: Refactor nfsd4_cleanup_inter_ssc() (2/2)

Move the nfsd4_cleanup_*() call sites out of nfsd4_do_copy(). A
subsequent patch will modify one of the new call sites to avoid
the need to manufacture the phony struct nfsd_file.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Refactor nfsd4_cleanup_inter_ssc() (1/2)
Chuck Lever [Wed, 27 Jul 2022 18:40:47 +0000 (14:40 -0400)] 
NFSD: Refactor nfsd4_cleanup_inter_ssc() (1/2)

The @src parameter is sometimes a pointer to a struct nfsd_file and
sometimes a pointer to struct file hiding in a phony struct
nfsd_file. Refactor nfsd4_cleanup_inter_ssc() so the @src parameter
is always an explicit struct file.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Replace boolean fields in struct nfsd4_copy
Chuck Lever [Wed, 27 Jul 2022 18:40:41 +0000 (14:40 -0400)] 
NFSD: Replace boolean fields in struct nfsd4_copy

Clean up: saves 8 bytes, and we can replace check_and_set_stop_copy()
with an atomic bitop.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Make nfs4_put_copy() static
Chuck Lever [Wed, 27 Jul 2022 18:40:35 +0000 (14:40 -0400)] 
NFSD: Make nfs4_put_copy() static

Clean up: All call sites are in fs/nfsd/nfs4proc.c.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Reorder the fields in struct nfsd4_op
Chuck Lever [Wed, 27 Jul 2022 18:40:28 +0000 (14:40 -0400)] 
NFSD: Reorder the fields in struct nfsd4_op

Pack the fields to reduce the size of struct nfsd4_op, which is used
in an array in struct nfsd4_compoundargs.

sizeof(struct nfsd4_op):
Before: /* size: 672, cachelines: 11, members: 5 */
After:  /* size: 640, cachelines: 10, members: 5 */

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
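
How field order changes structure size can be shown with a small, made-up
example. The two structs below are hypothetical and are not the layout of
struct nfsd4_op; they only demonstrate that grouping members by alignment
removes padding. Exact sizes depend on the ABI; on a typical x86_64 ABI the
first struct occupies 32 bytes and the second 24.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small members interleaved with 8-byte members: padding after each one. */
struct demo_op_padded {
    uint8_t  opnum;
    uint64_t status;
    uint8_t  replay;
    uint64_t arg;
};

/* Same members, largest alignment first: the small ones share one slot. */
struct demo_op_packed {
    uint64_t status;
    uint64_t arg;
    uint8_t  opnum;
    uint8_t  replay;
};

/* Compile-time check that the reordering never makes the struct larger. */
static_assert(sizeof(struct demo_op_packed) <= sizeof(struct demo_op_padded),
              "reordering should not grow the struct");

/* Bytes saved per element; multiply by the array length for the total. */
static size_t demo_bytes_saved(void)
{
    return sizeof(struct demo_op_padded) - sizeof(struct demo_op_packed);
}
```

Because the structure lives in an array, any per-element saving is multiplied
by the number of elements.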
3 years ago NFSD: Shrink size of struct nfsd4_copy
Chuck Lever [Wed, 27 Jul 2022 18:40:22 +0000 (14:40 -0400)] 
NFSD: Shrink size of struct nfsd4_copy

struct nfsd4_copy is part of struct nfsd4_op, which resides in an
8-element array.

sizeof(struct nfsd4_op):
Before: /* size: 1696, cachelines: 27, members: 5 */
After:  /* size: 672, cachelines: 11, members: 5 */

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Shrink size of struct nfsd4_copy_notify
Chuck Lever [Wed, 27 Jul 2022 18:40:16 +0000 (14:40 -0400)] 
NFSD: Shrink size of struct nfsd4_copy_notify

struct nfsd4_copy_notify is part of struct nfsd4_op, which resides
in an 8-element array.

sizeof(struct nfsd4_op):
Before: /* size: 2208, cachelines: 35, members: 5 */
After:  /* size: 1696, cachelines: 27, members: 5 */

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: nfserrno(-ENOMEM) is nfserr_jukebox
Chuck Lever [Wed, 27 Jul 2022 18:40:09 +0000 (14:40 -0400)] 
NFSD: nfserrno(-ENOMEM) is nfserr_jukebox

Suggested-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Fix strncpy() fortify warning
Chuck Lever [Wed, 27 Jul 2022 18:40:03 +0000 (14:40 -0400)] 
NFSD: Fix strncpy() fortify warning

In function ‘strncpy’,
    inlined from ‘nfsd4_ssc_setup_dul’ at /home/cel/src/linux/manet/fs/nfsd/nfs4proc.c:1392:3,
    inlined from ‘nfsd4_interssc_connect’ at /home/cel/src/linux/manet/fs/nfsd/nfs4proc.c:1489:11:
/home/cel/src/linux/manet/include/linux/fortify-string.h:52:33: warning: ‘__builtin_strncpy’ specified bound 63 equals destination size [-Wstringop-truncation]
   52 | #define __underlying_strncpy    __builtin_strncpy
      |                                 ^
/home/cel/src/linux/manet/include/linux/fortify-string.h:89:16: note: in expansion of macro ‘__underlying_strncpy’
   89 |         return __underlying_strncpy(p, q, size);
      |                ^~~~~~~~~~~~~~~~~~~~

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
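
The commit message above quotes only the compiler warning, not the change that
resolved it. As background, the warning fires because a strncpy() whose bound
equals the destination size can leave the destination without a terminating
NUL. One common way to avoid that class of problem is a copy helper that
always terminates; the sketch below is an assumption about the general
technique, not the code from the patch, and demo_copy_string() is a
hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Copy at most dst_size - 1 bytes and always NUL-terminate.
 * Returns true if the whole source string fit, false if it was truncated. */
static bool demo_copy_string(char *dst, size_t dst_size, const char *src)
{
    size_t i;

    assert(dst != NULL && src != NULL && dst_size > 0);
    for (i = 0; i + 1 < dst_size && src[i] != '\0'; i++)
        dst[i] = src[i];
    dst[i] = '\0';
    return src[i] == '\0';
}
```

The return value lets the caller distinguish a complete copy from a truncated
one instead of silently losing the tail of the string.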
3 years ago NFSD: Clean up nfsd4_encode_readlink()
Chuck Lever [Fri, 22 Jul 2022 20:09:23 +0000 (16:09 -0400)] 
NFSD: Clean up nfsd4_encode_readlink()

Similar changes to nfsd4_encode_readv(), all bundled into a single
patch.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Use xdr_pad_size()
Chuck Lever [Fri, 22 Jul 2022 20:09:16 +0000 (16:09 -0400)] 
NFSD: Use xdr_pad_size()

Clean up: Use a helper instead of open-coding the calculation of
the XDR pad size.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
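
For readers unfamiliar with XDR padding: XDR encodes data in 4-byte units, so
a variable-length item is followed by zero to three pad bytes. The helper pair
below is a userspace sketch of that calculation; the demo_ names are
hypothetical and are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Round a byte count up to the next multiple of the 4-byte XDR unit. */
static inline size_t demo_xdr_align_size(size_t n)
{
    return (n + 3) & ~(size_t)3;
}

/* Number of pad bytes (0 to 3) needed after n bytes of data. */
static inline size_t demo_xdr_pad_size(size_t n)
{
    size_t pad = demo_xdr_align_size(n) - n;

    assert(pad < 4);
    return pad;
}
```

For example, 5 bytes of data need 3 pad bytes, while 8 bytes need none. Using
one named helper avoids repeating the mask arithmetic at every call site.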
3 years ago NFSD: Simplify starting_len
Chuck Lever [Fri, 22 Jul 2022 20:09:10 +0000 (16:09 -0400)] 
NFSD: Simplify starting_len

Clean-up: Now that nfsd4_encode_readv() does not have to encode the
EOF or rd_length values, it no longer needs to subtract 8 from
@starting_len.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Optimize nfsd4_encode_readv()
Chuck Lever [Fri, 22 Jul 2022 20:09:04 +0000 (16:09 -0400)] 
NFSD: Optimize nfsd4_encode_readv()

write_bytes_to_xdr_buf() is pretty expensive to use for inserting
an XDR data item that is always 1 XDR_UNIT at an address that is
always XDR word-aligned.

Since both the readv and splice read paths encode EOF and maxcount
values, move both to a common code path.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Add an nfsd4_read::rd_eof field
Chuck Lever [Fri, 22 Jul 2022 20:08:57 +0000 (16:08 -0400)] 
NFSD: Add an nfsd4_read::rd_eof field

Refactor: Make the EOF result available in the entire NFSv4 READ
path.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Clean up SPLICE_OK in nfsd4_encode_read()
Chuck Lever [Fri, 22 Jul 2022 20:08:51 +0000 (16:08 -0400)] 
NFSD: Clean up SPLICE_OK in nfsd4_encode_read()

Do the test_bit() once -- this reduces the number of locked-bus
operations and makes the function a little easier to read.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Optimize nfsd4_encode_fattr()
Chuck Lever [Fri, 22 Jul 2022 20:08:45 +0000 (16:08 -0400)] 
NFSD: Optimize nfsd4_encode_fattr()

write_bytes_to_xdr_buf() is a generic way to place a variable-length
data item in an already-reserved spot in the encoding buffer.

However, it is costly. In nfsd4_encode_fattr(), it is unnecessary
because the data item is fixed in size and the buffer destination
address is always word-aligned.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Optimize nfsd4_encode_operation()
Chuck Lever [Fri, 22 Jul 2022 20:08:38 +0000 (16:08 -0400)] 
NFSD: Optimize nfsd4_encode_operation()

write_bytes_to_xdr_buf() is a generic way to place a variable-length
data item in an already-reserved spot in the encoding buffer.
However, it is costly, and here, it is unnecessary because the
data item is fixed in size, the buffer destination address is
always word-aligned, and the destination location is already in
@p.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
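
The difference between the generic path and the direct store described in the
two "Optimize" commits above can be sketched in userspace C. All names here
are hypothetical: demo_write_bytes() models a general "copy len bytes to an
offset" routine, and demo_store_word() models writing one fixed-size,
word-aligned item through a pointer that was saved when the slot was reserved.
The real write_bytes_to_xdr_buf() does more work than a single memcpy(); this
sketch shows only why the direct store is sufficient for the fixed-size case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Convert a host-order value to big-endian (network) byte order. */
static uint32_t demo_cpu_to_be32(uint32_t v)
{
    uint8_t b[4] = {
        (uint8_t)(v >> 24), (uint8_t)(v >> 16),
        (uint8_t)(v >> 8),  (uint8_t)v,
    };
    uint32_t out;

    memcpy(&out, b, sizeof(out));
    return out;
}

/* Generic path: place an item of any length at any byte offset. */
static void demo_write_bytes(uint8_t *buf, size_t offset,
                             const void *data, size_t len)
{
    assert(buf != NULL && data != NULL);
    memcpy(buf + offset, data, len);
}

/* Fast path: the slot is one aligned word and p already points at it. */
static void demo_store_word(uint32_t *p, uint32_t value)
{
    assert(p != NULL);
    *p = demo_cpu_to_be32(value);
}
```

Both paths leave the same bytes in the buffer; the second one skips the
offset and length handling because neither can vary.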
3 years ago nfsd: silence extraneous printk on nfsd.ko insertion
Jeff Layton [Wed, 20 Jul 2022 12:39:23 +0000 (08:39 -0400)] 
nfsd: silence extraneous printk on nfsd.ko insertion

This printk pops every time nfsd.ko gets plugged in. Most kmods don't do
that and this one is not very informative. Olaf's email address seems to
be defunct at this point anyway. Just drop it.

Cc: Olaf Kirch <okir@suse.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: limit the number of v4 clients to 1024 per 1GB of system memory
Dai Ngo [Fri, 15 Jul 2022 23:54:53 +0000 (16:54 -0700)] 
NFSD: limit the number of v4 clients to 1024 per 1GB of system memory

Currently there is no limit on how many v4 clients are supported
by the system. On systems with a small memory configuration, a very
large number of clients can create memory shortage conditions that
prevent the system from functioning properly.

This patch enforces a limit of 1024 NFSv4 clients, including courtesy
clients, per 1GB of system memory.  When the number of clients
reaches the limit, requests that create new clients are returned
with NFS4ERR_DELAY and the laundromat is kick-started to trim old
clients. Due to the overhead of the upcall to remove the client
record, the maximum number of clients the laundromat removes on
each run is limited to 128. This is done to ensure the laundromat
can still process its other tasks in a timely manner.

Since there is now a limit on the number of clients, the 24-hr
idle time limit for courtesy clients is no longer needed and was
removed.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
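
The arithmetic behind the limit can be sketched as follows. This is a
userspace illustration with hypothetical names (demo_max_clients(),
demo_may_create_client(), demo_reap_batch()). The figures 1024 and 128 come
from the commit message; the rounding to whole gigabytes and the floor of one
gigabyte's worth of clients on very small systems are assumptions of this
sketch, not something the message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_CLIENTS_PER_GB   1024u /* from the commit message */
#define DEMO_LAUNDROMAT_BATCH 128u  /* from the commit message */

/* Client limit for a given amount of RAM (assumption: whole GB, minimum 1). */
static uint64_t demo_max_clients(uint64_t total_ram_bytes)
{
    uint64_t gb = total_ram_bytes >> 30;

    if (gb == 0)
        gb = 1;
    return gb * DEMO_CLIENTS_PER_GB;
}

/* False means the request would be answered with NFS4ERR_DELAY. */
static bool demo_may_create_client(uint64_t current, uint64_t max)
{
    return current < max;
}

/* Number of reapable clients the laundromat removes in a single run. */
static unsigned int demo_reap_batch(unsigned int reapable)
{
    assert(DEMO_LAUNDROMAT_BATCH > 0);
    return reapable < DEMO_LAUNDROMAT_BATCH ? reapable : DEMO_LAUNDROMAT_BATCH;
}
```

With 4GB of memory this yields a limit of 4096 clients, and a backlog of 1000
reapable clients is trimmed over several laundromat runs of at most 128 each.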
3 years ago NFSD: keep track of the number of v4 clients in the system
Dai Ngo [Fri, 15 Jul 2022 23:54:52 +0000 (16:54 -0700)] 
NFSD: keep track of the number of v4 clients in the system

Add counter nfs4_client_count to keep track of the total number
of v4 clients, including courtesy clients, in the system.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: refactoring v4 specific code to a helper in nfs4state.c
Dai Ngo [Fri, 15 Jul 2022 23:54:51 +0000 (16:54 -0700)] 
NFSD: refactoring v4 specific code to a helper in nfs4state.c

This patch moves the v4 specific code from nfsd_init_net() to
nfsd4_init_leases_net() helper in nfs4state.c

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Ensure nf_inode is never dereferenced
Chuck Lever [Fri, 8 Jul 2022 18:27:09 +0000 (14:27 -0400)] 
NFSD: Ensure nf_inode is never dereferenced

The documenting comment for struct nfsd_file states:

/*
 * A representation of a file that has been opened by knfsd. These are hashed
 * in the hashtable by inode pointer value. Note that this object doesn't
 * hold a reference to the inode by itself, so the nf_inode pointer should
 * never be dereferenced, only used for comparison.
 */

Replace the two existing dereferences to make the comment always
true.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: NFSv4 CLOSE should release an nfsd_file immediately
Chuck Lever [Fri, 8 Jul 2022 18:27:02 +0000 (14:27 -0400)] 
NFSD: NFSv4 CLOSE should release an nfsd_file immediately

The last close of a file should enable other accessors to open and
use that file immediately. Leaving the file open in the filecache
prevents other users from accessing that file until the filecache
garbage-collects the file -- sometimes that takes several seconds.

Reported-by: Wang Yugui <wangyugui@e16-tech.com>
Link: https://bugzilla.linux-nfs.org/show_bug.cgi?387
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Move nfsd_file_trace_alloc() tracepoint
Chuck Lever [Fri, 8 Jul 2022 18:26:49 +0000 (14:26 -0400)] 
NFSD: Move nfsd_file_trace_alloc() tracepoint

Avoid recording the allocation of an nfsd_file item that is
immediately released because a matching item was already
inserted in the hash.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Separate tracepoints for acquire and create
Chuck Lever [Fri, 8 Jul 2022 18:26:43 +0000 (14:26 -0400)] 
NFSD: Separate tracepoints for acquire and create

These tracepoints collect different information: the create case does
not open a file, so there's no nf_file available.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Clean up unused code after rhashtable conversion
Chuck Lever [Fri, 8 Jul 2022 18:26:36 +0000 (14:26 -0400)] 
NFSD: Clean up unused code after rhashtable conversion

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Convert the filecache to use rhashtable
Chuck Lever [Fri, 8 Jul 2022 18:26:30 +0000 (14:26 -0400)] 
NFSD: Convert the filecache to use rhashtable

Enable the filecache hash table to start small, then grow with the
workload. Smaller server deployments benefit because there should
be lower memory utilization. Larger server deployments should see
improved scaling with the number of open files.

Suggested-by: Jeff Layton <jlayton@kernel.org>
Suggested-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Set up an rhashtable for the filecache
Chuck Lever [Fri, 8 Jul 2022 18:26:23 +0000 (14:26 -0400)] 
NFSD: Set up an rhashtable for the filecache

Add code to initialize and tear down an rhashtable. The rhashtable
is not used yet.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Replace the "init once" mechanism
Chuck Lever [Fri, 8 Jul 2022 18:26:16 +0000 (14:26 -0400)] 
NFSD: Replace the "init once" mechanism

In a moment, the nfsd_file_hashtbl global will be replaced with an
rhashtable. Replace the one or two spots that need to check if the
hash table is available. We can easily reuse the SHUTDOWN flag for
this purpose.

Document that this mechanism relies on callers to hold the
nfsd_mutex to prevent init, shutdown, and purging from running
concurrently.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Remove nfsd_file::nf_hashval
Chuck Lever [Fri, 8 Jul 2022 18:26:10 +0000 (14:26 -0400)] 
NFSD: Remove nfsd_file::nf_hashval

The value in this field can always be computed from nf_inode, thus
it is no longer used.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: nfsd_file_hash_remove can compute hashval
Chuck Lever [Fri, 8 Jul 2022 18:26:03 +0000 (14:26 -0400)] 
NFSD: nfsd_file_hash_remove can compute hashval

Remove an unnecessary use of nf_hashval.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Refactor __nfsd_file_close_inode()
Chuck Lever [Fri, 8 Jul 2022 18:25:57 +0000 (14:25 -0400)] 
NFSD: Refactor __nfsd_file_close_inode()

The code that computes the hashval is the same in both callers.

To prevent them from going stale, reframe the documenting comments
to remove descriptions of the underlying hash table structure, which
is about to be replaced.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: nfsd_file_unhash can compute hashval from nf->nf_inode
Chuck Lever [Fri, 8 Jul 2022 18:25:50 +0000 (14:25 -0400)] 
NFSD: nfsd_file_unhash can compute hashval from nf->nf_inode

Remove an unnecessary usage of nf_hashval.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Remove lockdep assertion from unhash_and_release_locked()
Chuck Lever [Fri, 8 Jul 2022 18:25:44 +0000 (14:25 -0400)] 
NFSD: Remove lockdep assertion from unhash_and_release_locked()

IIUC, holding the hash bucket lock is needed only in
nfsd_file_unhash, and there is already a lockdep assertion there.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: No longer record nf_hashval in the trace log
Chuck Lever [Fri, 8 Jul 2022 18:25:37 +0000 (14:25 -0400)] 
NFSD: No longer record nf_hashval in the trace log

I'm about to replace nfsd_file_hashtbl with an rhashtable. The
individual hash values will no longer be visible or relevant, so
remove them from the tracepoints.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Never call nfsd_file_gc() in foreground paths
Chuck Lever [Fri, 8 Jul 2022 18:25:30 +0000 (14:25 -0400)] 
NFSD: Never call nfsd_file_gc() in foreground paths

The checks in nfsd_file_acquire() and nfsd_file_put() that directly
invoke filecache garbage collection are intended to keep cache
occupancy between a low- and high-watermark. The reason to limit the
capacity of the filecache is to keep filecache lookups reasonably
fast.

However, invoking garbage collection at those points has some
undesirable impacts. Files that are held open by NFSv4
clients often push the occupancy of the filecache over these
watermarks. At that point:

- Every call to nfsd_file_acquire() and nfsd_file_put() results in
  an LRU walk. This has the same effect on lookup latency as long
  chains in the hash table.
- Garbage collection will then run on every nfsd thread, causing a
  lot of unnecessary lock contention.
- Limiting cache capacity pushes out files used only by NFSv3
  clients, which are the type of files the filecache is supposed to
  help.

To address those negative impacts, remove the direct calls to the
garbage collector. Subsequent patches will address maintaining
lookup efficiency as cache capacity increases.

Suggested-by: Wang Yugui <wangyugui@e16-tech.com>
Suggested-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Fix the filecache LRU shrinker
Chuck Lever [Fri, 8 Jul 2022 18:25:24 +0000 (14:25 -0400)] 
NFSD: Fix the filecache LRU shrinker

Without LRU item rotation, the shrinker visits only a few items on
the end of the LRU list, and those would always be long-term OPEN
files for NFSv4 workloads. That makes the filecache shrinker
completely ineffective.

Adopt the same strategy as the inode LRU by using LRU_ROTATE.

Suggested-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
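
Why rotation matters for a bounded scan can be shown with a toy model. This is
a userspace sketch under stated assumptions: struct demo_lru is a small
array-backed queue and demo_lru_scan() is a hypothetical function, not the
kernel's list_lru walker. A scan looks at a limited number of items from the
head; idle items are evicted, and in-use items are moved to the tail (the
rotate step) so the next scan reaches different items.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_LRU_MAX 8

struct demo_item {
    int  id;
    bool in_use; /* models a file held open long-term */
};

struct demo_lru {
    struct demo_item items[DEMO_LRU_MAX]; /* items[0] is the scan head */
    size_t count;
};

/* Visit up to nr_to_scan items from the head. Returns how many were evicted. */
static size_t demo_lru_scan(struct demo_lru *lru, size_t nr_to_scan)
{
    size_t evicted = 0;

    assert(lru != NULL && lru->count <= DEMO_LRU_MAX);
    if (nr_to_scan > lru->count)
        nr_to_scan = lru->count;

    for (size_t i = 0; i < nr_to_scan; i++) {
        struct demo_item head = lru->items[0];

        /* Remove the head by shifting the remaining items forward. */
        memmove(&lru->items[0], &lru->items[1],
                (lru->count - 1) * sizeof(head));
        if (head.in_use) {
            lru->items[lru->count - 1] = head; /* rotate to the tail */
        } else {
            lru->count--;                      /* evict the idle item */
            evicted++;
        }
    }
    return evicted;
}
```

Without the rotate step, every scan of two items would revisit the same two
in-use entries at the head and never evict anything, which is the
ineffectiveness the commit message describes.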
3 years ago NFSD: Leave open files out of the filecache LRU
Chuck Lever [Fri, 8 Jul 2022 18:25:17 +0000 (14:25 -0400)] 
NFSD: Leave open files out of the filecache LRU

There have been reports of problems when running fstests generic/531
against Linux NFS servers with NFSv4. The NFS server that hosts the
test's SCRATCH_DEV suffers from CPU soft lock-ups during the test.
Analysis shows that:

fs/nfsd/filecache.c
                ret = list_lru_walk(&nfsd_file_lru,
                                nfsd_file_lru_cb,
                                &head, LONG_MAX);

causes nfsd_file_gc() to walk the entire length of the filecache LRU
list every time it is called (which is quite frequently). The walk
holds a spinlock the entire time that prevents other nfsd threads
from accessing the filecache.

What's more, for NFSv4 workloads, none of the items that are visited
during this walk may be evicted, since they are all files that are
held OPEN by NFS clients.

Address this by ensuring that open files are not kept on the LRU
list.

Reported-by: Frank van der Linden <fllinden@amazon.com>
Reported-by: Wang Yugui <wangyugui@e16-tech.com>
Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=386
Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Trace filecache LRU activity
Chuck Lever [Fri, 8 Jul 2022 18:25:11 +0000 (14:25 -0400)] 
NFSD: Trace filecache LRU activity

Observe the operation of garbage collection and the lifetime of
filecache items.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: WARN when freeing an item still linked via nf_lru
Chuck Lever [Fri, 8 Jul 2022 18:25:04 +0000 (14:25 -0400)] 
NFSD: WARN when freeing an item still linked via nf_lru

Add a guardrail to prevent freeing memory that is still on a list.
This includes either a dispose list or the LRU list.

This is the sign of a bug, but this class of bugs can be detected
so that they don't endanger system stability, especially while
debugging.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
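
A guardrail of this kind can be sketched in userspace C. Everything below is
hypothetical (struct demo_file, the demo_list_* helpers, demo_file_free()); it
models a list node that points to itself when unlinked, and a free routine
that warns and declines to free an item that is still on a list. Declining to
free (leaking the item) is an assumption of this sketch about how such a check
can preserve stability.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

struct demo_list_head {
    struct demo_list_head *next, *prev;
};

static void demo_list_init(struct demo_list_head *h)
{
    h->next = h;
    h->prev = h;
}

static bool demo_list_empty(const struct demo_list_head *h)
{
    return h->next == h;
}

static void demo_list_add(struct demo_list_head *n, struct demo_list_head *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Unlink and re-initialise so the node reads as "not on any list". */
static void demo_list_del_init(struct demo_list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    demo_list_init(n);
}

struct demo_file {
    struct demo_list_head lru;
};

static struct demo_file *demo_file_alloc(void)
{
    struct demo_file *f = malloc(sizeof(*f));

    if (f)
        demo_list_init(&f->lru);
    return f;
}

/* Returns false (and leaks) instead of freeing memory a list still links. */
static bool demo_file_free(struct demo_file *f)
{
    assert(f != NULL);
    if (!demo_list_empty(&f->lru)) {
        fprintf(stderr, "warning: freeing item still on a list\n");
        return false;
    }
    free(f);
    return true;
}
```

Leaking one item is a bug, but it is far less harmful than leaving a list
pointing at freed memory.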
3 years ago NFSD: Hook up the filecache stat file
Chuck Lever [Fri, 8 Jul 2022 18:24:58 +0000 (14:24 -0400)] 
NFSD: Hook up the filecache stat file

There has always been the capability of exporting filecache metrics
via /proc, but it was never hooked up. Let's surface these metrics
to enable better observability of the filecache.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Zero counters when the filecache is re-initialized
Chuck Lever [Fri, 8 Jul 2022 18:24:51 +0000 (14:24 -0400)] 
NFSD: Zero counters when the filecache is re-initialized

If nfsd_file_cache_init() is called after a shutdown, be sure the
stat counters are reset.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Record number of flush calls
Chuck Lever [Fri, 8 Jul 2022 18:24:45 +0000 (14:24 -0400)] 
NFSD: Record number of flush calls

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Report the number of items evicted by the LRU walk
Chuck Lever [Fri, 8 Jul 2022 18:24:38 +0000 (14:24 -0400)] 
NFSD: Report the number of items evicted by the LRU walk

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Refactor nfsd_file_lru_scan()
Chuck Lever [Fri, 8 Jul 2022 18:24:31 +0000 (14:24 -0400)] 
NFSD: Refactor nfsd_file_lru_scan()

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Refactor nfsd_file_gc()
Chuck Lever [Fri, 8 Jul 2022 18:24:25 +0000 (14:24 -0400)] 
NFSD: Refactor nfsd_file_gc()

Refactor nfsd_file_gc() to use the new list_lru helper.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Add nfsd_file_lru_dispose_list() helper
Chuck Lever [Fri, 8 Jul 2022 18:24:18 +0000 (14:24 -0400)] 
NFSD: Add nfsd_file_lru_dispose_list() helper

Refactor the invariant part of nfsd_file_lru_walk_list() into a
separate helper function.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Report average age of filecache items
Chuck Lever [Fri, 8 Jul 2022 18:24:12 +0000 (14:24 -0400)] 
NFSD: Report average age of filecache items

This is a measure of how long items stay in the filecache, to help
assess how efficient the cache is.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Report count of freed filecache items
Chuck Lever [Fri, 8 Jul 2022 18:24:05 +0000 (14:24 -0400)] 
NFSD: Report count of freed filecache items

Surface the count of freed nfsd_file items.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Report count of calls to nfsd_file_acquire()
Chuck Lever [Fri, 8 Jul 2022 18:23:59 +0000 (14:23 -0400)] 
NFSD: Report count of calls to nfsd_file_acquire()

Count the number of successful acquisitions that did not create a
file (i.e., acquisitions that do not result in a compulsory cache
miss). This count can be compared directly with the reported hit
count to compute a hit ratio.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
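
The comparison described above amounts to a simple ratio. A minimal sketch,
assuming two counters are read from the stat output; demo_hit_ratio() and its
parameter names are hypothetical.

```c
#include <assert.h>

/* Fraction of non-compulsory acquisitions that were served from the cache.
 * Returns 0.0 when there have been no acquisitions yet. */
static double demo_hit_ratio(unsigned long hits, unsigned long acquisitions)
{
    assert(hits <= acquisitions);
    if (acquisitions == 0)
        return 0.0;
    return (double)hits / (double)acquisitions;
}
```

For example, 75 hits out of 100 counted acquisitions gives a ratio of 0.75.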
3 years ago NFSD: Report filecache LRU size
Chuck Lever [Fri, 8 Jul 2022 18:23:52 +0000 (14:23 -0400)] 
NFSD: Report filecache LRU size

Surface the NFSD filecache's LRU list length to help field
troubleshooters monitor filecache issues.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years ago NFSD: Demote a WARN to a pr_warn()
Chuck Lever [Fri, 8 Jul 2022 18:23:45 +0000 (14:23 -0400)] 
NFSD: Demote a WARN to a pr_warn()

The call trace doesn't add much value, but it sure is noisy.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoSUNRPC: Fix server-side fault injection documentation
Chuck Lever [Fri, 1 Jul 2022 14:37:15 +0000 (10:37 -0400)] 
SUNRPC: Fix server-side fault injection documentation

Fixes: 37324e6bb120 ("SUNRPC: Cache deferral injection")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agonfsd: remove redundant assignment to variable len
Colin Ian King [Tue, 28 Jun 2022 21:25:25 +0000 (22:25 +0100)] 
nfsd: remove redundant assignment to variable len

Variable len is assigned the value zero, but that value is never
read because len is re-assigned later. The assignment is redundant
and can be removed.

Cleans up clang scan-build warning:
fs/nfsd/nfsctl.c:636:2: warning: Value stored to 'len' is never read

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Fix space and spelling mistake
Zhang Jiaming [Thu, 23 Jun 2022 08:20:05 +0000 (16:20 +0800)] 
NFSD: Fix space and spelling mistake

Add a blank space after ','.
Change 'succesful' to 'successful'.

Signed-off-by: Zhang Jiaming <jiaming@nfschina.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Instrument fh_verify()
Chuck Lever [Tue, 21 Jun 2022 14:06:23 +0000 (10:06 -0400)] 
NFSD: Instrument fh_verify()

Capture file handles and how they map to local inodes. In particular,
NFSv4 PUTFH uses fh_verify() so we can now observe which file handles
are the target of OPEN, LOOKUP, RENAME, and so on.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoSUNRPC: Expand the svc_alloc_arg_err tracepoint
Chuck Lever [Tue, 21 Jun 2022 14:06:16 +0000 (10:06 -0400)] 
SUNRPC: Expand the svc_alloc_arg_err tracepoint

Record not only the number of pages requested, but the number of
pages that were actually allocated, to get a measure of progress
(or lack thereof).

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNLM: Defend against file_lock changes after vfs_test_lock()
Benjamin Coddington [Mon, 13 Jun 2022 13:40:06 +0000 (09:40 -0400)] 
NLM: Defend against file_lock changes after vfs_test_lock()

Instead of trusting that struct file_lock returns completely unchanged
after vfs_test_lock() when there's no conflicting lock, stash away our
nlm_lockowner reference so we can properly release it for all cases.

This defends against another file_lock implementation overwriting fl_owner
when the return type is F_UNLCK.

Reported-by: Roberto Bergantinos Corpas <rbergant@redhat.com>
Tested-by: Roberto Bergantinos Corpas <rbergant@redhat.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoSUNRPC: Fix xdr_encode_bool()
Chuck Lever [Tue, 19 Jul 2022 13:18:35 +0000 (09:18 -0400)] 
SUNRPC: Fix xdr_encode_bool()

I discovered that xdr_encode_bool() was returning the same address
that was passed in the @p parameter. The documenting comment states
that the intent is to return the address of the next buffer
location, just like the other "xdr_encode_*" helpers.

The result was the encoded results of NFSv3 PATHCONF operations were
not formed correctly.
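
A userspace sketch of the difference, assuming nothing about the actual
kernel source: sketch_encode_bool() and sketch_encode_bool_buggy() are
hypothetical stand-ins, and the post-increment in the second one is just
one way such a bug can arise.

```c
#include <stdint.h>
#include <arpa/inet.h>	/* htonl() */

/* Writes one XDR boolean and returns the next buffer location. */
static uint32_t *sketch_encode_bool(uint32_t *p, int value)
{
	*p++ = htonl(value ? 1 : 0);
	return p;
}

/* Writes the boolean but returns the address that was passed in. */
static uint32_t *sketch_encode_bool_buggy(uint32_t *p, int value)
{
	*p = htonl(value ? 1 : 0);
	return p++;	/* post-increment yields the original pointer */
}
```

An encoder that chains such helpers would overwrite the boolean with
the next field if the returned pointer does not advance.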

Fixes: ded04a587f6c ("NFSD: Update the NFSv3 PATHCONF3res encoder to use struct xdr_stream")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
3 years agonfsd: eliminate the NFSD_FILE_BREAK_* flags
Jeff Layton [Fri, 29 Jul 2022 21:01:07 +0000 (17:01 -0400)] 
nfsd: eliminate the NFSD_FILE_BREAK_* flags

We had a report from the spring Bake-a-thon of data corruption in some
nfstest_interop tests. Looking at the traces showed the NFS server
allowing a v3 WRITE to proceed while a read delegation was still
outstanding.

Currently, we only set NFSD_FILE_BREAK_* flags if
NFSD_MAY_NOT_BREAK_LEASE was set when we call nfsd_file_alloc.
NFSD_MAY_NOT_BREAK_LEASE was intended to be set when finding files for
COMMIT ops, where we need a writeable filehandle but don't need to
break read leases.

It doesn't make any sense to consult that flag when allocating a file
since the file may be used on subsequent calls where we do want to break
the lease (and the usage of it here seems to be the reverse of what it
should be anyway).

Also, after calling nfsd_open_break_lease, we don't want to clear the
BREAK_* bits. A lease could end up being set on it later (more than
once) and we need to be able to break those leases as well.

This means that the NFSD_FILE_BREAK_* flags now just mirror
NFSD_MAY_{READ,WRITE} flags, so there's no need for them at all. Just
drop those flags and unconditionally call nfsd_open_break_lease every
time.

Reported-by: Olga Kornieskaia <kolga@netapp.com>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2107360
Fixes: 65294c1f2c5e ("nfsd: add a new struct file caching facility to nfsd")
Cc: <stable@vger.kernel.org> # 5.4.x : bb283ca18d1e NFSD: Clean up the show_nf_flags() macro
Cc: <stable@vger.kernel.org> # 5.4.x
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoclk: fixed-factor: Introduce *clk_hw_register_fixed_factor_parent_hw()
Marijn Suijten [Wed, 29 Jun 2022 22:53:23 +0000 (00:53 +0200)] 
clk: fixed-factor: Introduce *clk_hw_register_fixed_factor_parent_hw()

Add the devres and non-devres variants of
clk_hw_register_fixed_factor_parent_hw() for registering a fixed factor
clock with a clk_hw parent pointer instead of a parent name.

Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
Link: https://lore.kernel.org/r/20220629225331.357308-4-marijn.suijten@somainline.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
3 years agoclk: mux: Introduce devm_clk_hw_register_mux_parent_hws()
Marijn Suijten [Wed, 29 Jun 2022 22:53:22 +0000 (00:53 +0200)] 
clk: mux: Introduce devm_clk_hw_register_mux_parent_hws()

Add the devres variant of clk_hw_register_mux_hws() for registering a
mux clock with clk_hw parent pointers instead of parent names.

Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20220629225331.357308-3-marijn.suijten@somainline.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
3 years agoclk: divider: Introduce devm_clk_hw_register_divider_parent_hw()
Marijn Suijten [Wed, 29 Jun 2022 22:53:21 +0000 (00:53 +0200)] 
clk: divider: Introduce devm_clk_hw_register_divider_parent_hw()

Add the devres variant of clk_hw_register_divider_parent_hw() for
registering a divider clock with a clk_hw parent pointer instead of a
parent name.

Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20220629225331.357308-2-marijn.suijten@somainline.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
3 years agodt-bindings: eeprom: microchip,93lc46b: move to eeprom directory
Krzysztof Kozlowski [Wed, 27 Jul 2022 16:44:24 +0000 (18:44 +0200)] 
dt-bindings: eeprom: microchip,93lc46b: move to eeprom directory

Move the Atmel/Microchip 93xx46 SPI compatible EEPROM family bindings
from the misc to the eeprom directory to properly match the subsystem.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220727164424.386499-2-krzysztof.kozlowski@linaro.org
3 years agodt-bindings: eeprom: at25: use spi-peripheral-props.yaml
Krzysztof Kozlowski [Wed, 27 Jul 2022 16:44:23 +0000 (18:44 +0200)] 
dt-bindings: eeprom: at25: use spi-peripheral-props.yaml

Instead of directly listing properties typical for SPI peripherals,
reference the spi-peripheral-props.yaml schema.  This allows using all
properties typical for SPI-connected devices, even those which the
device bindings author has not tried yet.

Remove the spi-* properties which now come via spi-peripheral-props.yaml
schema, except for the cases when device schema adds some constraints
like maximum frequency.

While changing additionalProperties->unevaluatedProperties, put it in
the typical place, just before the example DTS.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220727164424.386499-1-krzysztof.kozlowski@linaro.org
3 years agodt-bindings: display: use spi-peripheral-props.yaml
Krzysztof Kozlowski [Wed, 27 Jul 2022 16:43:12 +0000 (18:43 +0200)] 
dt-bindings: display: use spi-peripheral-props.yaml

Instead of directly listing properties typical for SPI peripherals,
reference the spi-peripheral-props.yaml schema.  This allows using all
properties typical for SPI-connected devices, even those which the
device bindings author has not tried yet.

Remove the spi-* properties which now come via spi-peripheral-props.yaml
schema, except for the cases when device schema adds some constraints
like maximum frequency.

While changing additionalProperties->unevaluatedProperties, put it in
the typical place, just before the example DTS.

The sitronix,st7735r binding also references panel-common.yaml and
explicitly lists the allowed properties, so here reference only
spi-peripheral-props.yaml for the purpose of documenting the SPI slave
device and bringing in spi-max-frequency type validation.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220727164312.385836-1-krzysztof.kozlowski@linaro.org
3 years agorandom: correct spelling of "overwrites"
Jason A. Donenfeld [Fri, 29 Jul 2022 23:12:25 +0000 (01:12 +0200)] 
random: correct spelling of "overwrites"

It was missing an 'r'.

Fixes: 186873c549df ("random: use simpler fast key erasure flow on per-cpu keys")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
3 years agoMerge tag 'block-5.19-2022-07-29' of git://git.kernel.dk/linux-block
Linus Torvalds [Fri, 29 Jul 2022 23:07:35 +0000 (16:07 -0700)] 
Merge tag 'block-5.19-2022-07-29' of git://git.kernel.dk/linux-block

Pull block fix from Jens Axboe:
 "Just a single fix for NVMe, yet another quirk addition"

* tag 'block-5.19-2022-07-29' of git://git.kernel.dk/linux-block:
  nvme-pci: Crucial P2 has bogus namespace ids

3 years agobpf: Remove unneeded semicolon
Yang Li [Mon, 25 Jul 2022 22:27:33 +0000 (06:27 +0800)] 
bpf: Remove unneeded semicolon

Eliminate the following coccicheck warning:
/kernel/bpf/trampoline.c:101:2-3: Unneeded semicolon

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220725222733.55613-1-yang.lee@linux.alibaba.com
3 years agolibbpf: Add bpf_obj_get_opts()
Joe Burton [Fri, 29 Jul 2022 20:27:27 +0000 (20:27 +0000)] 
libbpf: Add bpf_obj_get_opts()

Add an extensible variant of bpf_obj_get() capable of setting the
`file_flags` parameter.

This parameter is needed to enable unprivileged access to BPF maps.
Without a method like this, users must manually make the syscall.

Signed-off-by: Joe Burton <jevburton@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220729202727.3311806-1-jevburton.kernel@gmail.com
3 years agonetdevsim: Avoid allocation warnings triggered from user space
Jakub Kicinski [Tue, 26 Jul 2022 21:36:05 +0000 (14:36 -0700)] 
netdevsim: Avoid allocation warnings triggered from user space

We need to suppress warnings from silly map sizes. Also switch
from GFP_USER to GFP_KERNEL_ACCOUNT, I'm pretty sure I misunderstood
the flags when writing this code.

Fixes: 395cacb5f1a0 ("netdevsim: bpf: support fake map offload")
Reported-by: syzbot+ad24705d3fd6463b18c6@syzkaller.appspotmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220726213605.154204-1-kuba@kernel.org
3 years agobpf: Fix NULL pointer dereference when registering bpf trampoline
Xu Kuohai [Thu, 28 Jul 2022 11:40:48 +0000 (07:40 -0400)] 
bpf: Fix NULL pointer dereference when registering bpf trampoline

A panic was reported on arm64:

[   44.517109] audit: type=1334 audit(1658859870.268:59): prog-id=19 op=LOAD
[   44.622031] Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000010
[   44.624321] Mem abort info:
[   44.625049]   ESR = 0x0000000096000004
[   44.625935]   EC = 0x25: DABT (current EL), IL = 32 bits
[   44.627182]   SET = 0, FnV = 0
[   44.627930]   EA = 0, S1PTW = 0
[   44.628684]   FSC = 0x04: level 0 translation fault
[   44.629788] Data abort info:
[   44.630474]   ISV = 0, ISS = 0x00000004
[   44.631362]   CM = 0, WnR = 0
[   44.632041] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000100ab5000
[   44.633494] [0000000000000010] pgd=0000000000000000, p4d=0000000000000000
[   44.635202] Internal error: Oops: 96000004 [#1] SMP
[   44.636452] Modules linked in: xfs crct10dif_ce ghash_ce virtio_blk
virtio_console virtio_mmio qemu_fw_cfg
[   44.638713] CPU: 2 PID: 1 Comm: systemd Not tainted 5.19.0-rc7 #1
[   44.640164] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[   44.641799] pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   44.643404] pc : ftrace_set_filter_ip+0x24/0xa0
[   44.644659] lr : bpf_trampoline_update.constprop.0+0x428/0x4a0
[   44.646118] sp : ffff80000803b9f0
[   44.646950] x29: ffff80000803b9f0 x28: ffff0b5d80364400 x27: ffff80000803bb48
[   44.648721] x26: ffff8000085ad000 x25: ffff0b5d809d2400 x24: 0000000000000000
[   44.650493] x23: 00000000ffffffed x22: ffff0b5dd7ea0900 x21: 0000000000000000
[   44.652279] x20: 0000000000000000 x19: 0000000000000000 x18: ffffffffffffffff
[   44.654067] x17: 0000000000000000 x16: 0000000000000000 x15: ffffffffffffffff
[   44.655787] x14: ffff0b5d809d2498 x13: ffff0b5d809d2432 x12: 0000000005f5e100
[   44.657535] x11: abcc77118461cefd x10: 000000000000005f x9 : ffffa7219cb5b190
[   44.659254] x8 : ffffa7219c8e0000 x7 : 0000000000000000 x6 : ffffa7219db075e0
[   44.661066] x5 : ffffa7219d3130e0 x4 : ffffa7219cab9da0 x3 : 0000000000000000
[   44.662837] x2 : 0000000000000000 x1 : ffffa7219cb7a5c0 x0 : 0000000000000000
[   44.664675] Call trace:
[   44.665274]  ftrace_set_filter_ip+0x24/0xa0
[   44.666327]  bpf_trampoline_update.constprop.0+0x428/0x4a0
[   44.667696]  __bpf_trampoline_link_prog+0xcc/0x1c0
[   44.668834]  bpf_trampoline_link_prog+0x40/0x64
[   44.669919]  bpf_tracing_prog_attach+0x120/0x490
[   44.671011]  link_create+0xe0/0x2b0
[   44.671869]  __sys_bpf+0x484/0xd30
[   44.672706]  __arm64_sys_bpf+0x30/0x40
[   44.673678]  invoke_syscall+0x78/0x100
[   44.674623]  el0_svc_common.constprop.0+0x4c/0xf4
[   44.675783]  do_el0_svc+0x38/0x4c
[   44.676624]  el0_svc+0x34/0x100
[   44.677429]  el0t_64_sync_handler+0x11c/0x150
[   44.678532]  el0t_64_sync+0x190/0x194
[   44.679439] Code: 2a0203f4 f90013f5 2a0303f5 f9001fe1 (f9400800)
[   44.680959] ---[ end trace 0000000000000000 ]---
[   44.682111] Kernel panic - not syncing: Oops: Fatal exception
[   44.683488] SMP: stopping secondary CPUs
[   44.684551] Kernel Offset: 0x2721948e0000 from 0xffff800008000000
[   44.686095] PHYS_OFFSET: 0xfffff4a380000000
[   44.687144] CPU features: 0x010,00022811,19001080
[   44.688308] Memory Limit: none
[   44.689082] ---[ end Kernel panic - not syncing: Oops: Fatal exception ]---

It's caused by a NULL tr->fops passed to ftrace_set_filter_ip(). tr->fops
is initialized to NULL and is assigned to an allocated memory address if
CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS is enabled. Since there is no
direct call on arm64 yet, the config can't be enabled.

To fix it, call ftrace_set_filter_ip() only if tr->fops is not NULL.
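
The shape of such a guard can be sketched in userspace as below. All
names here (sketch_ops, sketch_trampoline, sketch_set_filter_ip,
sketch_register) are hypothetical stand-ins, not the kernel's types or
functions, and the fallback path taken when fops is NULL is not modeled.

```c
#include <stddef.h>

/* Hypothetical stand-in for the ops structure hanging off a trampoline. */
struct sketch_ops {
	unsigned long ip;
	int has_ip;
};

struct sketch_trampoline {
	struct sketch_ops *fops;	/* NULL when direct calls are unavailable */
};

/* Dereferences ops, so passing NULL here would fault. */
static int sketch_set_filter_ip(struct sketch_ops *ops, unsigned long ip)
{
	ops->ip = ip;
	ops->has_ip = 1;
	return 0;
}

/* Returns 1 if a filter was installed, 0 if the step was skipped. */
static int sketch_register(struct sketch_trampoline *tr, unsigned long ip)
{
	if (!tr->fops)
		return 0;
	sketch_set_filter_ip(tr->fops, ip);
	return 1;
}
```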

Fixes: 00963a2e75a8 ("bpf: Support bpf_trampoline on functions with IPMODIFY (e.g. livepatch)")
Reported-by: Bruno Goncalves <bgoncalv@redhat.com>
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Bruno Goncalves <bgoncalv@redhat.com>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20220728114048.3540461-1-xukuohai@huaweicloud.com
3 years agobpf: Fix test_progs -j error with fentry/fexit tests
Song Liu [Fri, 29 Jul 2022 19:41:06 +0000 (12:41 -0700)] 
bpf: Fix test_progs -j error with fentry/fexit tests

When multiple threads are attaching/detaching fentry/fexit programs to
the same trampoline, we may call register_fentry on the same trampoline
twice: register_fentry(), unregister_fentry(), then register_fentry() again.
This causes ftrace_set_filter_ip() for the same ip on tr->fops twice,
which leaves duplicated ip in tr->fops. The extra ip is not cleaned up
properly on unregister and thus causes failures with further register in
register_ftrace_direct_multi():

register_ftrace_direct_multi()
{
        ...
        for (i = 0; i < size; i++) {
                hlist_for_each_entry(entry, &hash->buckets[i], hlist) {
                        if (ftrace_find_rec_direct(entry->ip))
                                goto out_unlock;
                }
        }
        ...
}

This can be triggered with parallel fentry/fexit tests with test_progs:

  ./test_progs -t fentry,fexit -j

Fix this by resetting tr->fops in ftrace_set_filter_ip(), so that there
will never be duplicated entries in tr->fops.
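
Why a reset avoids the duplicate can be illustrated with a toy filter
list. This is a userspace sketch only: sketch_filter and
sketch_filter_add() are hypothetical and do not mirror the ftrace hash
implementation.

```c
#define SKETCH_MAX_IPS 8

/* Toy filter: a flat list of instruction addresses. */
struct sketch_filter {
	unsigned long ips[SKETCH_MAX_IPS];
	int count;
};

/*
 * Adds ip to the filter. With reset != 0 the old contents are dropped
 * first, so adding the same ip repeatedly never leaves duplicates.
 */
static int sketch_filter_add(struct sketch_filter *f, unsigned long ip,
			     int reset)
{
	if (reset)
		f->count = 0;
	if (f->count >= SKETCH_MAX_IPS)
		return -1;
	f->ips[f->count++] = ip;
	return 0;
}
```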

Fixes: 00963a2e75a8 ("bpf: Support bpf_trampoline on functions with IPMODIFY (e.g. livepatch)")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220729194106.1207472-1-song@kernel.org
3 years agovideo: fbdev: imxfb: fix return value check in imxfb_probe()
Yang Yingliang [Fri, 29 Jul 2022 02:41:34 +0000 (10:41 +0800)] 
video: fbdev: imxfb: fix return value check in imxfb_probe()

devm_ioremap_resource() never returns NULL on failure; it returns an
ERR_PTR()-encoded error. Replace the NULL test with IS_ERR().
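
A userspace sketch of why a NULL test misses this kind of failure. The
sketch_* helpers below are simplified stand-ins written for this
illustration, not the kernel's ERR_PTR()/IS_ERR() definitions, and
sketch_map() is a hypothetical function that reports failure the same
way.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static void *sketch_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True if ptr lies in the top SKETCH_MAX_ERRNO addresses. */
static int sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Recover the errno value from an encoded pointer. */
static long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Hypothetical mapper: fails with an encoded errno, never with NULL. */
static void *sketch_map(int fail)
{
	static char region[16];

	return fail ? sketch_err_ptr(-ENOMEM) : (void *)region;
}
```

With these definitions, a failed sketch_map() call compares unequal to
NULL, so only the sketch_is_err() test detects the error.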

Fixes: b083c22d5114 ("video: fbdev: imxfb: Convert request_mem_region + ioremap to devm_ioremap_resource")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Helge Deller <deller@gmx.de>
3 years agoopenrisc: io: Define iounmap argument as volatile
Stafford Horne [Fri, 29 Jul 2022 10:54:08 +0000 (19:54 +0900)] 
openrisc: io: Define iounmap argument as volatile

When OpenRISC enables PCI, more drivers can be compiled, which exposes
the following errors with -Werror.

    drivers/video/fbdev/riva/fbdev.c: In function 'rivafb_probe':
    drivers/video/fbdev/riva/fbdev.c:2062:42: error:
    passing argument 1 of 'iounmap' discards 'volatile' qualifier from pointer target type

    drivers/video/fbdev/nvidia/nvidia.c: In function 'nvidiafb_probe':
    drivers/video/fbdev/nvidia/nvidia.c:1414:20: error:
    passing argument 1 of 'iounmap' discards 'volatile' qualifier from pointer target type

    drivers/scsi/aic7xxx/aic7xxx_osm.c: In function 'ahc_platform_free':
    drivers/scsi/aic7xxx/aic7xxx_osm.c:1231:41: error:
    passing argument 1 of 'iounmap' discards 'volatile' qualifier from pointer target type

Most architectures define the iounmap argument to be volatile.  To fix this
issue we do the same for OpenRISC.  This patch must go before PCI is enabled on
OpenRISC to avoid any compile failures.

Link: https://lore.kernel.org/lkml/20220729033728.GA2195022@roeck-us.net/
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Stafford Horne <shorne@gmail.com>
3 years agoMAINTAINERS: Update Richard Henderson's address
Stafford Horne [Fri, 22 Jul 2022 21:00:41 +0000 (06:00 +0900)] 
MAINTAINERS: Update Richard Henderson's address

Richard's address at twiddle.net no longer works and we are getting
bounces.

This patch updates to his Linaro address.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Stafford Horne <shorne@gmail.com>
3 years agoopenrisc: Add virt defconfig
Stafford Horne [Sun, 5 Jun 2022 02:03:31 +0000 (11:03 +0900)] 
openrisc: Add virt defconfig

I have been developing a new qemu virt platform to help with more
efficient toolchain and kernel testing [1].

This patch adds the defconfig which is needed to support booting
linux on the platform.

[1] https://lore.kernel.org/qemu-devel/YpwNtowUTxRbh2Uq@antec/T/#m6db180b0d682785fb320e4a05345c12a063e0c47

Signed-off-by: Stafford Horne <shorne@gmail.com>
3 years agoopenrisc: Add pci bus support
Stafford Horne [Sat, 11 Jun 2022 23:42:33 +0000 (08:42 +0900)] 
openrisc: Add pci bus support

This patch adds required definitions to allow for PCI buses on OpenRISC.
This is being tested on the OpenRISC QEMU virt platform which is in
development.

OpenRISC does not have IO ports so we keep the definition of
IO_SPACE_LIMIT and PIO_RESERVED to be 0.

Note, since commit 66bcd06099bb ("parport_pc: Also enable driver for PCI
systems") all platforms that support PCI also need to support parallel
port.  We add a generic header to support compiling parallel port
drivers, though they generally will not work as they require IO ports.

Signed-off-by: Stafford Horne <shorne@gmail.com>
3 years agoMerge branch 'pci/header-cleanup-immutable' of git://git.kernel.org/pub/scm/linux...
Stafford Horne [Fri, 29 Jul 2022 20:47:13 +0000 (05:47 +0900)] 
Merge branch 'pci/header-cleanup-immutable' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git

The OpenRISC PCI support depends on the fixups done in the
pci/header-cleanup-immutable branch.  Also, there are OpenRISC
irqchip fixups in v5.19-rc6 that are needed to test the virt platform.

This merge creates a base for the OpenRISC PCI changes.

3 years agoMerge tag 'drm-fixes-2022-07-30' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 29 Jul 2022 20:25:31 +0000 (13:25 -0700)] 
Merge tag 'drm-fixes-2022-07-30' of git://anongit.freedesktop.org/drm/drm

Pull more drm fixes from Dave Airlie:
 "Maxime had the dog^Wmailing list server eat his homework^Wmisc pull
  request.

  Two more small fixes, one in nouveau svm code and the other in
  simpledrm.

  nouveau:
   - page migration fix

  simpledrm:
   - fix mode_valid return value"

* tag 'drm-fixes-2022-07-30' of git://anongit.freedesktop.org/drm/drm:
  nouveau/svm: Fix to migrate all requested pages
  drm/simpledrm: Fix return type of simpledrm_simple_display_pipe_mode_valid()

3 years agoMerge tag 'drm-misc-fixes-2022-07-29' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Fri, 29 Jul 2022 20:09:48 +0000 (06:09 +1000)] 
Merge tag 'drm-misc-fixes-2022-07-29' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

One fix for the simpledrm mode_valid return value, and one for page
migration in nouveau

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220729094514.sfzhc3gqjgwgal62@penduick
3 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Fri, 29 Jul 2022 20:07:03 +0000 (13:07 -0700)] 
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Four fixes, three in drivers.

  The two biggest fixes are ufs and the remaining driver and core fix
  are small and obvious (and the core fix is low risk)"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: core: Fix a race condition related to device management
  scsi: core: Fix warning in scsi_alloc_sgtables()
  scsi: ufs: host: Hold reference returned by of_parse_phandle()
  scsi: mpt3sas: Stop fw fault watchdog work item during system shutdown

3 years agoRDMA/srpt: Fix a use-after-free
Bart Van Assche [Wed, 27 Jul 2022 19:34:15 +0000 (12:34 -0700)] 
RDMA/srpt: Fix a use-after-free

Change the LIO port members inside struct srpt_port from regular members
into pointers. Allocate the LIO port data structures from inside
srpt_make_tport() and free these from inside srpt_make_tport(). Keep
struct srpt_device as long as either an RDMA port or a LIO target port is
associated with it. This patch decouples the lifetime of struct srpt_port
(controlled by the RDMA core) and struct srpt_port_id (controlled by LIO).
This patch fixes the following KASAN complaint:

  BUG: KASAN: use-after-free in srpt_enable_tpg+0x31/0x70 [ib_srpt]
  Read of size 8 at addr ffff888141cc34b8 by task check/5093

  Call Trace:
   <TASK>
   show_stack+0x4e/0x53
   dump_stack_lvl+0x51/0x66
   print_address_description.constprop.0.cold+0xea/0x41e
   print_report.cold+0x90/0x205
   kasan_report+0xb9/0xf0
   __asan_load8+0x69/0x90
   srpt_enable_tpg+0x31/0x70 [ib_srpt]
   target_fabric_tpg_base_enable_store+0xe2/0x140 [target_core_mod]
   configfs_write_iter+0x18b/0x210
   new_sync_write+0x1f2/0x2f0
   vfs_write+0x3e3/0x540
   ksys_write+0xbb/0x140
   __x64_sys_write+0x42/0x50
   do_syscall_64+0x34/0x80
   entry_SYSCALL_64_after_hwframe+0x46/0xb0
   </TASK>

Link: https://lore.kernel.org/r/20220727193415.1583860-4-bvanassche@acm.org
Reported-by: Li Zhijian <lizhijian@fujitsu.com>
Tested-by: Li Zhijian <lizhijian@fujitsu.com>
Fixes: a42d985bd5b2 ("ib_srpt: Initial SRP Target merge for v3.3-rc1")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
3 years agoRDMA/srpt: Introduce a reference count in struct srpt_device
Bart Van Assche [Wed, 27 Jul 2022 19:34:14 +0000 (12:34 -0700)] 
RDMA/srpt: Introduce a reference count in struct srpt_device

This will be used to keep struct srpt_device around as long as either the
RDMA port exists or a LIO target port is associated with the struct
srpt_device.

Link: https://lore.kernel.org/r/20220727193415.1583860-3-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
3 years agoRDMA/srpt: Duplicate port name members
Bart Van Assche [Wed, 27 Jul 2022 19:34:13 +0000 (12:34 -0700)] 
RDMA/srpt: Duplicate port name members

Prepare for decoupling the lifetimes of struct srpt_port and struct
srpt_port_id by duplicating the port name into struct srpt_port.

Link: https://lore.kernel.org/r/20220727193415.1583860-2-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
3 years agodrm/amd/display: Fix a compilation failure on PowerPC caused by FPU code
Rodrigo Siqueira [Thu, 28 Jul 2022 20:33:47 +0000 (16:33 -0400)] 
drm/amd/display: Fix a compilation failure on PowerPC caused by FPU code

We got a report from Stephen/Michael that the PowerPC build was failing
with the following error:

ld: drivers/gpu/drm/amd/display/dc/dml/display_mode_lib.o uses hard float, drivers/gpu/drm/amd/display/dc/dcn30/dcn30_optc.o uses soft float
ld: failed to merge target specific data of file drivers/gpu/drm/amd/display/dc/dcn30/dcn30_optc.o

This error happened because of the function optc3_set_vrr_m_const. This
function takes a double as a parameter in code that is not allowed
to perform FPU operations. After further investigation, it became clear
that optc3_set_vrr_m_const was never invoked, so we can safely drop this
function and fix the ld issue.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Melissa Wen <mwen@igalia.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: enable support for psp 13.0.4 block
Xiaojian Du [Wed, 27 Jul 2022 07:52:33 +0000 (15:52 +0800)] 
drm/amdgpu: enable support for psp 13.0.4 block

This patch will enable support for psp 13.0.4 block.

Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: add files for PSP 13.0.4
Xiaojian Du [Thu, 28 Jul 2022 05:25:26 +0000 (13:25 +0800)] 
drm/amdgpu: add files for PSP 13.0.4

This patch will add files for PSP 13.0.4.

Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: add header files for MP 13.0.4
Xiaojian Du [Thu, 28 Jul 2022 05:23:38 +0000 (13:23 +0800)] 
drm/amdgpu: add header files for MP 13.0.4

This patch will add header files for MP 13.0.4.

Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>