Calling e.g. blk_queue_max_hw_sectors() after calls to
disk_stack_limits() discards the settings determined by
disk_stack_limits().
So we need to make those calls first.
Fixes: 199dc6ed5179 ("md/raid0: update queue parameter in a safer location.") Reported-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When a (e.g.) RAID5 array is reshaped to RAID0, the updating
of queue parameters (e.g. max number of sectors per bio) is
done in the wrong place.
It should be part of ->run, but it is actually part of ->takeover.
This means it happens before level_store() calls:
blk_set_stacking_limits(&mddev->queue->limits);
and so it ineffective. This can lead to errors from underlying
devices.
So move all the relevant settings out of create_stripe_zones()
and into raid0_run().
As this can lead to a bug-on it is suitable for any -stable
kernel which supports reshape to RAID0. So 2.6.35 or later.
As the bug has been present for five years there is no urgency,
so no need to rush into -stable.
Fixes: 9af204cf720c ("md: Add support for Raid5->Raid0 and Raid10->Raid0 takeover") Reported-by: Yi Zhang <yizhan@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com>
[ luis: backported to 3.16:
- raid0 isn't accessed from dm-raid so no conditional mddev->queue accesses
(done with commit 753f2856cda2 "md raid0: access mddev->queue (request
queue member) conditionally because it is not set when accessed from
dm-raid")
- adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Reference to the 'np' node is dropped before dereferencing the 'sizep' and
'basep' pointers, which could by then point to junk if the node has been
freed.
Refactor code to call 'of_node_put' later.
Fixes: c5df39262dd5 ("drivers/char/tpm: Add securityfs support for event log") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Acked-by: Peter Huewe <PeterHuewe@gmx.de>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
mcp794xx alarm registers must be written in BCD format. However, the
alarm programming logic neglected this by adding one to the value
after bin2bcd conversion has been already done, writing bad values
to month register in case the alarm being set is in October. In this
case, the alarm month value becomes 0x0a instead of the expected 0x10.
Fix by moving the +1 addition within the bin2bcd call also.
Fixes: 1d1945d261a2 ("drivers/rtc/rtc-ds1307.c: add alarm support for mcp7941x chips") Signed-off-by: Tero Kristo <t-kristo@ti.com> Acked-by: Nishanth Menon <nm@ti.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Since commit 7d5cd2ce529b, when bond_enslave fails on devices that
are not ARPHRD_ETHER, if needed, it resets the bonding device back to
ARPHRD_ETHER by calling ether_setup.
Unfortunately, ether_setup clobbers dev->flags, clearing IFF_UP
if the bond device is up, leaving it in a quasi-down state without
having actually gone through dev_close. For bonding, if any periodic
work queue items are active (miimon, arp_interval, etc), those will
remain running, as they are stopped by bond_close. At this point, if
the bonding module is unloaded or the bond is deleted, the system will
panic when the work function is called.
This panic is resolved by calling dev_close on the bond itself
prior to calling ether_setup.
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com> Fixes: 7d5cd2ce5292 ("bonding: correctly handle bonding type change on enslave failure") Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The -i flag was incorrectly listed as a short flag for --no-inherit. It
should have only been listed as a short flag for --input.
This documentation error has existed since the --input flag was
introduced in 6810fc915f7a89d8134edb3996dbbf8eac386c26 (perf trace: Add
option to analyze events in a file versus live).
Signed-off-by: Peter Feiner <pfeiner@google.com> Cc: David Ahern <dsahern@gmail.com> Link: http://lkml.kernel.org/r/1446657706-14518-1-git-send-email-pfeiner@google.com Fixes: 6810fc915f7a ("perf trace: Add option to analyze events in a file versus live") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Both tunnel6_protocol and tunnel46_protocol share the same error
handler, tunnel6_err(), which traverses through tunnel6_handlers list.
For ipip6 tunnels, we need to traverse tunnel46_handlers as we do e.g.
in tunnel46_rcv(). Current code can generate an ICMPv6 error message
with an IPv4 packet embedded in it.
Fixes: 73d605d1abbd ("[IPSEC]: changing API of xfrm6_tunnel_register") Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
create_request_message() computes the maximum length of a message,
but uses the wrong type for the time stamp: sizeof(struct timespec)
may be 8 or 16 depending on the architecture, while sizeof(struct
ceph_timespec) is always 8, and that is what gets put into the
message.
Found while auditing the uses of timespec for y2038 problems.
Fixes: b8e69066d8af ("ceph: include time stamp in every MDS request") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Yan, Zheng <zyan@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
ib_req_notify_cq(IB_CQ_REPORT_MISSED_EVENTS) returns a positive
value if WCs were added to a CQ after the last completion upcall
but before the CQ has been re-armed.
Commit 7f23f6f6e388 ("xprtrmda: Reduce lock contention in
completion handlers") assumed that when ib_req_notify_cq() returned
a positive RC, the CQ had also been successfully re-armed, making
it safe to return control to the provider without losing any
completion signals. That is an invalid assumption.
Change both completion handlers to continue polling while
ib_req_notify_cq() returns a positive value.
Fixes: 7f23f6f6e388 ("xprtrmda: Reduce lock contention in ...") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Devesh Sharma <devesh.sharma@avagotech.com> Tested-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Commit 68bab8662f49 ("mfd: twl6040: Optional clk32k clock handling")
added clock handling for the 32k clock from palmas-clk. However, that
patch did not consider a typical situation where twl6040 is built-in,
and palmas-clk is a loadable module like we have in omap2plus_defconfig.
If palmas-clk is not loaded before twl6040 probes, we will get a
"clk32k is not handled" warning during booting. This means that any
drivers relying on this clock will mysteriously fail, including
omap5-uevm WLAN and audio.
Note that for WLAN, we probably should also eventually get
the clk32kgaudio for MMC3 directly as that's shared between
audio and WLAN SDIO at least for omap5-uevm. It seems the
WLAN chip cannot get it as otherwise MMC3 won't get properly
probed.
Fixes: 68bab8662f49 ("mfd: twl6040: Optional clk32k clock handling") Signed-off-by: Tony Lindgren <tony@atomide.com> Reviewed-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Currently ca_seq_rtt_us does not use Kern's check. Fix that by
checking if any packet acked is a retransmit, for both RTT used
for RTT estimation and congestion control.
Fixes: 5b08e47ca ("tcp: prefer packet timing to TS-ECR for RTT") Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
In former commit, metering is supported for BeBoB based models
customized by M-Audio. The data in transaction is aligned to
big-endianness, while in the driver code u16 typed variable is assigned
to the data. This causes sparse warnings.
bebob_maudio.c:651:31: warning: cast to restricted __be16
bebob_maudio.c:651:31: warning: cast to restricted __be16
bebob_maudio.c:651:31: warning: cast to restricted __be16
bebob_maudio.c:651:31: warning: cast to restricted __be16
This commit fixes this bug by using __be16 variable for the data.
Fixes: 3149ac489ff8('ALSA: bebob: Add support for M-Audio special Firewire series') Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
In former commit, snd_efw_command_get_phys_meters() was added to handle
metering data. The given buffer is used to save transaction result and to
convert between endianness. But this causes sparse warnings.
fireworks_command.c:269:25: warning: incorrect type in argument 1 (different base types)
fireworks_command.c:269:25: expected unsigned int [usertype] *p
fireworks_command.c:269:25: got restricted __be32 [usertype] *
This commit fixes this bug.
Fixes: bde8a8f23bbe('ALSA: fireworks: Add transaction and some commands') Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
In its original version, drm_framebuffer_init() returned a negative int
if drm_mode_object_get() failed (f453ba046074, "DRM: add mode setting
support").
This was accidentally disabled by commit 4b096ac10da0 ("drm: revamp
locking around fb creation/destruction"). Thus, drm_framebuffer_init()
pretends success if drm_mode_object_get() failed.
Reinstate the original behaviour. Also fix erroneous kernel-doc of
drm_mode_object_get().
Fixes: 4b096ac10da0 ("drm: revamp locking around fb creation/
destruction") Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When committed to upstream, these four modules had wrong entries for
Makefile. This forces them to be loadable modules even if they're set
as built-in.
This commit fixes this bug.
Fixes: b5b04336015e('ALSA: fireworks: Add skelton for Fireworks based devices') Fixes: fd6f4b0dc167('ALSA: bebob: Add skelton for BeBoB based devices') Fixes: 1a4e39c2e5ca('ALSA: oxfw: Move to its own directory') Fixes: 14ff6a094815('ALSA: dice: Move file to its own directory') Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Signed-off-by: Takashi Iwai <tiwai@suse.de>
[ luis: backported to 3.16:
- dropped changes to oxfw and dice modules as they do not exist in
the 3.16 kernel ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The z2 machine calls pxa27x_set_pwrmode() in order to power off
the machine, but this function gets discarded early at boot because
it is marked __init, as pointed out by kbuild:
WARNING: vmlinux.o(.text+0x145c4): Section mismatch in reference from the function z2_power_off() to the function .init.text:pxa27x_set_pwrmode()
The function z2_power_off() references
the function __init pxa27x_set_pwrmode().
This is often because z2_power_off lacks a __init
annotation or the annotation of pxa27x_set_pwrmode is wrong.
This removes the __init section modifier to fix rebooting and the
build error.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: ba4a90a6d86a ("ARM: pxa/z2: fix building error of pxa27x_cpu_suspend() no longer available") Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The error handling path is broken as cawake_gpio was defined as
unsigned integer causing the following warnings on boards that don't
use SSI port and so don't have cawake_gpio defined. e.g. beagleboard C4.
[ 30.094635] WARNING: CPU: 0 PID: 322 at drivers/gpio/gpiolib.c:86 gpio_to_desc+0xa4/0xb8()
[ 30.103363] invalid GPIO -2
[ 30.106292] Modules linked in: omap_ssi_port(+) cpufreq_dt cfbfillrect cfbimgblt leds_gpio cfbcopyarea thermal_sys led_class hwmon gpio_keys encoder_tfp410 connector_analog_tv connector_dvi omap_hdq snd phy_i
[ 30.145477] CPU: 0 PID: 322 Comm: modprobe Not tainted 4.3.0-rc4-00030-gca978c0-dirty #335
[ 30.154174] Hardware name: Generic OMAP3-GP (Flattened Device Tree)
[ 30.160827] [<c0016ef4>] (unwind_backtrace) from [<c00131f4>] (show_stack+0x10/0x14)
[ 30.168975] [<c00131f4>] (show_stack) from [<c033cf08>] (dump_stack+0x80/0x9c)
[ 30.176635] [<c033cf08>] (dump_stack) from [<c003e920>] (warn_slowpath_common+0x7c/0xb8)
[ 30.185180] [<c003e920>] (warn_slowpath_common) from [<c003e9f0>] (warn_slowpath_fmt+0x30/0x40)
[ 30.194366] [<c003e9f0>] (warn_slowpath_fmt) from [<c0376314>] (gpio_to_desc+0xa4/0xb8)
[ 30.202819] [<c0376314>] (gpio_to_desc) from [<c0376ac8>] (gpio_request_one+0x14/0x11c)
[ 30.211273] [<c0376ac8>] (gpio_request_one) from [<c037370c>] (devm_gpio_request_one+0x3c/0x78)
[ 30.220458] [<c037370c>] (devm_gpio_request_one) from [<bf184210>] (ssi_port_probe+0x118/0x504 [omap_ssi_port])
[ 30.231170] [<bf184210>] (ssi_port_probe [omap_ssi_port]) from [<c03d4cfc>] (platform_drv_probe+0x48/0xa4)
[ 30.241424] [<c03d4cfc>] (platform_drv_probe) from [<c03d3678>] (driver_probe_device+0x1dc/0x2a0)
[ 30.250793] [<c03d3678>] (driver_probe_device) from [<c03d37d0>] (__driver_attach+0x94/0x98)
[ 30.259643] [<c03d37d0>] (__driver_attach) from [<c03d1d60>] (bus_for_each_dev+0x54/0x88)
[ 30.268249] [<c03d1d60>] (bus_for_each_dev) from [<c03d2d50>] (bus_add_driver+0xe8/0x1f8)
[ 30.276916] [<c03d2d50>] (bus_add_driver) from [<c03d4118>] (driver_register+0x78/0xf4)
[ 30.285369] [<c03d4118>] (driver_register) from [<c03d5380>] (__platform_driver_probe+0x34/0xd8)
[ 30.294647] [<c03d5380>] (__platform_driver_probe) from [<c00097e4>] (do_one_initcall+0x80/0x1d8)
[ 30.303985] [<c00097e4>] (do_one_initcall) from [<c011617c>] (do_init_module+0x5c/0x1cc)
[ 30.312561] [<c011617c>] (do_init_module) from [<c00c7a68>] (load_module+0x18c8/0x1f0c)
[ 30.320983] [<c00c7a68>] (load_module) from [<c00c8188>] (SyS_init_module+0xdc/0x150)
[ 30.329223] [<c00c8188>] (SyS_init_module) from [<c000f7e0>] (ret_fast_syscall+0x0/0x1c)
Fixes: b209e047bc743 ("HSI: Introduce OMAP SSI driver") Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Sebastian Reichel <sre@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Recent TCP listener patches exposed a prior af_packet bug :
match_fanout_group() blindly assumes it is always safe
to cast sk to a packet socket to compare fanout with af_packet_priv
But SYNACK packets can be sent while attached to request_sock, which
are smaller than a "struct sock".
We can read non existent memory and crash.
Fixes: c0de08d04215 ("af_packet: don't emit packet on orig fanout group") Fixes: ca6fb0651883 ("tcp: attach SYNACK messages to request sockets instead of listener") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Eric Leblond <eric@regit.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
After a packet has been encapsulated by a tunnel we should use the
tunnel sockets local multicast loopback flag to control if the
encapsulated packet should be locally loopback back.
Pass sk into ip_local_out_sk so that in the rare case we are dealing
with a tunneled packet whose tunnel destination address is a multicast
address the kernel properly decides to loopback this packet.
In practice I don't think this matters as ip_queue_xmit is used by
tcp, l2tp and sctp none of which I am aware of uses ip level
multicasting as they are all point to point communications protocols.
Let's fix this before someone uses ip_queue_xmit for a tunnel protocol
that does use multicast.
Fixes: aad88724c9d5 ("ipv4: add a sock pointer to dst->output() path.") Fixes: b0270e91014d ("ipv4: add a sock pointer to ip_queue_xmit()") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The iomap[] array has PCIM_IOMAP_MAX (6) elements and not
DEVICE_COUNT_RESOURCE (16). This bug was found using a static checker.
It may be that the "if (!(mask & (1 << i)))" check means we never
actually go past the end of the array in real life.
Commit d445913ce0ab7f ("usb: ehci-orion: add optional PHY support")
added support for optional phys, but devm_phy_optional_get returns
-ENOSYS if GENERIC_PHY is not enabled.
This causes probe failures, even when there are no phys specified:
[ 1.443365] orion-ehci f1058000.usb: init f1058000.usb fail, -38
[ 1.449403] orion-ehci: probe of f1058000.usb failed with error -38
Similar to dwc3, treat -ENOSYS as no phy.
Fixes: d445913ce0ab7f ("usb: ehci-orion: add optional PHY support") Signed-off-by: Jonas Gorski <jogo@openwrt.org> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
1) The done label was in the wrong place so we didn't copy any
information out when there was no command given.
2) We were using PAGE_SIZE as the size of the buffer instead of
"PAGE_SIZE - pos".
3) snprintf() returns the number of characters that would have been
printed if there were enough space. If there was not enough space
(and we had fixed the memory corruption bug #2) then it would result
in an information leak when we do simple_read_from_buffer(). I've
changed it to use scnprintf() instead.
I also removed the initialization at the start of the function, because
I thought it made the code a little more clear.
Fixes: 5e6e3a92b9a4 ('wireless: mwifiex: initial commit for Marvell mwifiex driver') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Amitkumar Karwar <akarwar@marvell.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Since commit 1c6c69525b40 ("genirq: Reject bogus threaded irq requests")
threaded IRQs without a primary handler need to be requested with
IRQF_ONESHOT, otherwise the request will fail.
scripts/coccinelle/misc/irqf_oneshot.cocci detected this issue.
Fixes: b5874f33bbaf ("wm831x_power: Use genirq") Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com> Signed-off-by: Sebastian Reichel <sre@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The ifmgd->ave_beacon_signal value cannot be taken as is for
comparisons, it must be divided by since it's represented
like that for better accuracy of the EWMA calculations. This
would lead to invalid driver RSSI events. Fix the used value.
Fixes: 615f7b9bb1f8 ("mac80211: add driver RSSI threshold events") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Using sendfile with below small program to get MD5 sums of some files,
it appear that big files (over 64kbytes with 4k pages system) get a
wrong MD5 sum while small files get the correct sum.
This program uses sendfile() to send a file to an AF_ALG socket
for hashing.
/* md5sum2.c */
int main(int argc, char **argv)
{
int sk = socket(AF_ALG, SOCK_SEQPACKET, 0);
struct stat st;
struct sockaddr_alg sa = {
.salg_family = AF_ALG,
.salg_type = "hash",
.salg_name = "md5",
};
int n;
bind(sk, (struct sockaddr*)&sa, sizeof(sa));
for (n = 1; n < argc; n++) {
int size;
int offset = 0;
char buf[4096];
int fd;
int sko;
int i;
After investivation, it appears that sendfile() sends the files by blocks
of 64kbytes (16 times PAGE_SIZE). The problem is that at the end of each
block, the SPLICE_F_MORE flag is missing, therefore the hashing operation
is reset as if it was the end of the file.
This patch adds SPLICE_F_MORE to the flags when more data is pending.
pipe_write() would return 0 if it failed to merge the beginning of the
data to write with the last, partially filled pipe buffer. It should
return an error code instead. Userspace programs could be confused by
write() returning 0 when called with a nonzero 'count'.
The EFAULT error case was a regression from f0d1bec9d5 ("new helper:
copy_page_from_iter()"), while the ops->confirm() error case was a much
older bug.
/* prior to this patch, write() returned 0 here */
assert(-1 == write(fd[1], NULL, 1));
assert(errno == EFAULT);
}
Signed-off-by: Eric Biggers <ebiggers3@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Do not clobber the buffer space passed from `search_binary_handler' and
originally preloaded by `prepare_binprm' with the executable's file
header by overwriting it with its interpreter's file header. Instead
keep the buffer space intact and directly use the data structure locally
allocated for the interpreter's file header, fixing a bug introduced in
2.1.14 with loadable module support (linux-mips.org commit beb11695
[Import of Linux/MIPS 2.1.14], predating kernel.org repo's history).
Adjust the amount of data read from the interpreter's file accordingly.
This was not an issue before loadable module support, because back then
`load_elf_binary' was executed only once for a given ELF executable,
whether the function succeeded or failed.
With loadable module support supported and enabled, upon a failure of
`load_elf_binary' -- which may for example be caused by architecture
code rejecting an executable due to a missing hardware feature requested
in the file header -- a module load is attempted and then the function
reexecuted by `search_binary_handler'. With the executable's file
header replaced with its interpreter's file header the executable can
then be erroneously accepted in this subsequent attempt.
Signed-off-by: Maciej W. Rozycki <macro@imgtec.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Only override netfs->primary_index when registering success.
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Due to a missing initialization there was no way to map fbdev memory.
Thus for example using the Xserver with the fbdev driver failed.
This fix adds initialization for fix.smem_start and fix.smem_len
in the fb_info structure, which fixes this problem.
Requested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Egbert Eich <eich@suse.de>
[pulled from SuSE tree by me - airlied] Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
There is an alignment mismatch issue between the of_reserved_mem and
the CMA setup requirement. The of_reserved_mem will try to get the
alignment value from the DTS and pass it to __memblock_alloc_base to
do the memory block base allocation, but the alignment value specified
in the DTS may not satisfy the CAM setup requirement since CMA setup
required the alignment as the following in the code:
The sanity check in the function of rmem_cma_setup will fail if the
alignment does not setup correctly and thus CMA will fail to setup.
This patch is to fixup the alignment to meet the CMA setup required.
Mailing-list-thread: https://lkml.org/lkml/2015/11/9/138 Signed-off-by: Jason Liu <r64343@freescale.com> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Grant Likely <grant.likely@linaro.org> Cc: Rob Herring <robh+dt@kernel.org> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This is needed to avoid the possibility that the guest triggers
an infinite stream of #DB exceptions (CVE-2015-8104).
VMX is not affected: because it does not save DR6 in the VMCS,
it already intercepts #DB unconditionally.
Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
It was found that a guest can DoS a host by triggering an infinite
stream of "alignment check" (#AC) exceptions. This causes the
microcode to enter an infinite loop where the core never receives
another interrupt. The host kernel panics pretty quickly due to the
effects (CVE-2015-5307).
Signed-off-by: Eric Northup <digitaleric@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Don't set the SRB_FLAGS_QUEUE_ACTION_ENABLE flag since we are not specifying
tags. Without this, the qlogic driver doesn't work properly with storvsc.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: James Bottomley <JBottomley@Odin.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Like some of the other Yoga models the Lenovo Yoga 900 does not have a
hw rfkill switch, and trying to read the hw rfkill switch through the
ideapad module causes it to always reported blocking breaking wifi.
This commit adds the Lenovo Yoga 900 to the no_hw_rfkill dmi list, fixing
the wifi breakage.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1275490 Reported-and-tested-by: Kevin Fenzi <kevin@scrye.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Darren Hart <dvhart@linux.intel.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When listing a inode's xattrs we have a time window where we race against
a concurrent operation for adding a new hard link for our inode that makes
us not return any xattr to user space. In order for this to happen, the
first xattr of our inode needs to be at slot 0 of a leaf and the previous
leaf must still have room for an inode ref (or extref) item, and this can
happen because an inode's listxattrs callback does not lock the inode's
i_mutex (nor does the VFS does it for us), but adding a hard link to an
inode makes the VFS lock the inode's i_mutex before calling the inode's
link callback.
The race illustrated by the following sequence diagram is possible:
CPU 1 CPU 2
btrfs_listxattr()
searches for key (257 XATTR_ITEM 0)
gets path with path->nodes[0] == leaf X
and path->slots[0] == N
because path->slots[0] is >=
btrfs_header_nritems(leaf X), it calls
btrfs_next_leaf()
btrfs_next_leaf()
releases the path
adds key (257 INODE_REF 666)
to the end of leaf X (slot N),
and leaf X now has N + 1 items
searches for the key (257 INODE_REF 256),
with path->keep_locks == 1, because that
is the last key it saw in leaf X before
releasing the path
ends up at leaf X again and it verifies
that the key (257 INODE_REF 256) is no
longer the last key in leaf X, so it
returns with path->nodes[0] == leaf X
and path->slots[0] == N, pointing to
the new item with key (257 INODE_REF 666)
btrfs_listxattr's loop iteration sees that
the type of the key pointed by the path is
different from the type BTRFS_XATTR_ITEM_KEY
and so it breaks the loop and stops looking
for more xattr items
--> the application doesn't get any xattr
listed for our inode
So fix this by breaking the loop only if the key's type is greater than
BTRFS_XATTR_ITEM_KEY and skip the current key if its type is smaller.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
[ luis: backported to 3.16:
- drop btrfs_key_type(), which was dropped upstream by 962a298f3511 ("btrfs: kill the key type accessor helpers") ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Writing a number to /sys/bus/scsi/devices/<sdev>/queue_ramp_up_period
returns the value of that number instead of the number of bytes written.
This behavior can confuse programs expecting POSIX write() semantics.
Fix this by returning the number of bytes written instead.
Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Arnaldo reported that tracepoint filters seem to misbehave (ie. not
apply) on inherited events.
The fix is obvious; filters are only set on the actual (parent)
event, use the normal pattern of using this parent event for filters.
This is safe because each child event has a reference to it.
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20151102095051.GN17308@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
If we are using the NO_HOLES feature, we have a tiny time window when
running delalloc for a nodatacow inode where we can race with a concurrent
link or xattr add operation leading to a BUG_ON.
This happens because at run_delalloc_nocow() we end up casting a leaf item
of type BTRFS_INODE_[REF|EXTREF]_KEY or of type BTRFS_XATTR_ITEM_KEY to a
file extent item (struct btrfs_file_extent_item) and then analyse its
extent type field, which won't match any of the expected extent types
(values BTRFS_FILE_EXTENT_[REG|PREALLOC|INLINE]) and therefore trigger an
explicit BUG_ON(1).
The following sequence diagram shows how the race happens when running a
no-cow dellaloc range [4K, 8K[ for inode 257 and we have the following
neighbour leafs:
(Note the implicit hole for inode 257 regarding the [0, 8K[ range)
CPU 1 CPU 2
run_dealloc_nocow()
btrfs_lookup_file_extent()
--> searches for a key with value
(257 EXTENT_DATA 4096) in the
fs/subvol tree
--> returns us a path with
path->nodes[0] == leaf X and
path->slots[0] == N
because path->slots[0] is >=
btrfs_header_nritems(leaf X), it
calls btrfs_next_leaf()
btrfs_next_leaf()
--> releases the path
hard link added to our inode,
with key (257 INODE_REF 500)
added to the end of leaf X,
so leaf X now has N + 1 keys
--> searches for the key
(257 INODE_REF 256), because
it was the last key in leaf X
before it released the path,
with path->keep_locks set to 1
--> ends up at leaf X again and
it verifies that the key
(257 INODE_REF 256) is no longer
the last key in the leaf, so it
returns with path->nodes[0] ==
leaf X and path->slots[0] == N,
pointing to the new item with
key (257 INODE_REF 500)
the loop iteration of run_dealloc_nocow()
does not break out the loop and continues
because the key referenced in the path
at path->nodes[0] and path->slots[0] is
for inode 257, its type is < BTRFS_EXTENT_DATA_KEY
and its offset (500) is less then our delalloc
range's end (8192)
the item pointed by the path, an inode reference item,
is (incorrectly) interpreted as a file extent item and
we get an invalid extent type, leading to the BUG_ON(1):
This happened because the item we were processing did not match a file
extent item (its key type != BTRFS_EXTENT_DATA_KEY), and even on this
case we cast the item to a struct btrfs_file_extent_item pointer and
then find a type field value that does not match any of the expected
values (BTRFS_FILE_EXTENT_[REG|PREALLOC|INLINE]). This scenario happens
due to a tiny time window where a race can happen as exemplified below.
For example, consider the following scenario where we're using the
NO_HOLES feature and we have the following two neighbour leafs:
Our inode 257 has an implicit hole in the range [0, 8K[ (implicit rather
than explicit because NO_HOLES is enabled). Now if our inode has an
ordered extent for the range [4K, 8K[ that is finishing, the following
can happen:
CPU 1 CPU 2
btrfs_finish_ordered_io()
insert_reserved_file_extent()
__btrfs_drop_extents()
Searches for the key
(257 EXTENT_DATA 4096) through
btrfs_lookup_file_extent()
Key not found and we get a path where
path->nodes[0] == leaf X and
path->slots[0] == N
Because path->slots[0] is >=
btrfs_header_nritems(leaf X), we call
btrfs_next_leaf()
btrfs_next_leaf() releases the path
inserts key
(257 INODE_REF 4096)
at the end of leaf X,
leaf X now has N + 1 keys,
and the new key is at
slot N
btrfs_next_leaf() searches for
key (257 INODE_REF 256), with
path->keep_locks set to 1,
because it was the last key it
saw in leaf X
finds it in leaf X again and
notices it's no longer the last
key of the leaf, so it returns 0
with path->nodes[0] == leaf X and
path->slots[0] == N (which is now
< btrfs_header_nritems(leaf X)),
pointing to the new key
(257 INODE_REF 4096)
__btrfs_drop_extents() casts the
item at path->nodes[0], slot
path->slots[0], to a struct
btrfs_file_extent_item - it does
not skip keys for the target
inode with a type less than
BTRFS_EXTENT_DATA_KEY
(BTRFS_INODE_REF_KEY < BTRFS_EXTENT_DATA_KEY)
sees a bogus value for the type
field triggering the WARN_ON in
the trace shown above, and sets
extent_end = search_start (4096)
does the if-then-else logic to
fixup 0 length extent items created
by a past bug from hole punching:
if (extent_end == key.offset &&
extent_end >= search_start)
goto delete_extent_item;
that evaluates to true and it ends
up deleting the key pointed to by
path->slots[0], (257 INODE_REF 4096),
from leaf X
The same could happen for example for a xattr that ends up having a key
with an offset value that matches search_start (very unlikely but not
impossible).
So fix this by ensuring that keys smaller than BTRFS_EXTENT_DATA_KEY are
skipped, never casted to struct btrfs_file_extent_item and never deleted
by accident. Also protect against the unexpected case of getting a key
for a lower inode number by skipping that key and issuing a warning.
Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When we get loaded by a 64-bit bootloader, kernel entry point is
startup_64 in head_64.S. We don't trust any and all bootloaders because
some will fiddle with CPU configuration so we go ahead and massage each
CPU into sanity again.
For example, some dell BIOSes have this XD disable feature which set
IA32_MISC_ENABLE[34] and disable NX. This might be some dumb workaround
for other OSes but Linux sure doesn't need it.
A similar thing is present in the Surface 3 firmware - see
https://bugzilla.kernel.org/show_bug.cgi?id=106051 - which sets this bit
only on the BSP:
The commit f5f3497cad8c extended the low identity mapping. However, if
the kernel uses more than 2 GB (VMSPLIT_2G_OPT or VMSPLIT_1G memory
split), the normal memory mapping is overwritten by the low identity
mapping causing a crash. To avoid overwritting, limit the low identity
map to cover only memory before kernel range (PAGE_OFFSET).
Fixes: f5f3497cad8c "x86/setup: Extend low identity map to cover whole kernel range Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Matt Fleming <matt.fleming@intel.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: http://lkml.kernel.org/r/1446815916-22105-1-git-send-email-krzysiek@podlesie.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
On 32-bit systems, the initial_page_table is reused by
efi_call_phys_prolog as an identity map to call
SetVirtualAddressMap. efi_call_phys_prolog takes care of
converting the current CPU's GDT to a physical address too.
For PAE kernels the identity mapping is achieved by aliasing the
first PDPE for the kernel memory mapping into the first PDPE
of initial_page_table. This makes the EFI stub's trick "just work".
However, for non-PAE kernels there is no guarantee that the identity
mapping in the initial_page_table extends as far as the GDT; in this
case, accesses to the GDT will cause a page fault (which quickly becomes
a triple fault). Fix this by copying the kernel mappings from
swapper_pg_dir to initial_page_table twice, both at PAGE_OFFSET and at
identity mapping.
For some reason, this is only reproducible with QEMU's dynamic translation
mode, and not for example with KVM. However, even under KVM one can clearly
see that the page table is bogus:
[ It appears that you can skate past this issue if you don't receive
any interrupts while the bogus GDT pointer is loaded, or if you avoid
reloading the segment registers in general.
Andy Lutomirski provides some additional insight:
"AFAICT it's entirely permissible for the GDTR and/or LDT
descriptor to point to unmapped memory. Any attempt to use them
(segment loads, interrupts, IRET, etc) will try to access that memory
as if the access came from CPL 0 and, if the access fails, will
generate a valid page fault with CR2 pointing into the GDT or
LDT."
Up until commit 23a0d4e8fa6d ("efi: Disable interrupts around EFI
calls, not in the epilog/prolog calls") interrupts were disabled
around the prolog and epilog calls, and the functional GDT was
re-installed before interrupts were re-enabled.
Which explains why no one has hit this issue until now. ]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reported-by: Laszlo Ersek <lersek@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
[ Updated changelog. ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The commit 96d0df79f264 ("proc: make proc_fd_permission() thread-friendly")
fixed the access to /proc/self/fd from sub-threads, but introduced another
problem: a sub-thread can't access /proc/<tid>/fd/ or /proc/thread-self/fd
if generic_permission() fails.
Change proc_fd_permission() to check same_thread_group(pid_task(), current).
Fixes: 96d0df79f264 ("proc: make proc_fd_permission() thread-friendly") Reported-by: "Jin, Yihua" <yihua.jin@intel.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
On systems with a KMALLOC_MIN_SIZE of 128 (arm64, some mips and powerpc
configurations defining ARCH_DMA_MINALIGN to 128), the first
kmalloc_caches[] entry to be initialised after slab_early_init = 0 is
"kmalloc-128" with index 7. Depending on the debug kernel configuration,
sizeof(struct kmem_cache) can be larger than 128 resulting in an
INDEX_NODE of 8.
Commit 8fc9cf420b36 ("slab: make more slab management structure off the
slab") enables off-slab management objects for sizes starting with
PAGE_SIZE >> 5 (128 bytes for a 4KB page configuration) and the creation
of the "kmalloc-128" cache would try to place the management objects
off-slab. However, since KMALLOC_MIN_SIZE is already 128 and
freelist_size == 32 in __kmem_cache_create(), kmalloc_slab(freelist_size)
returns NULL (kmalloc_caches[7] not populated yet). This triggers the
following bug on arm64:
kernel BUG at /work/Linux/linux-2.6-aarch64/mm/slab.c:2283!
Internal error: Oops - BUG: 0 [#1] SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 4.3.0-rc4+ #540
Hardware name: Juno (DT)
PC is at __kmem_cache_create+0x21c/0x280
LR is at __kmem_cache_create+0x210/0x280
[...]
Call trace:
__kmem_cache_create+0x21c/0x280
create_boot_cache+0x48/0x80
create_kmalloc_cache+0x50/0x88
create_kmalloc_caches+0x4c/0xf4
kmem_cache_init+0x100/0x118
start_kernel+0x214/0x33c
This patch introduces an OFF_SLAB_MIN_SIZE definition to avoid off-slab
management objects for sizes equal to or smaller than KMALLOC_MIN_SIZE.
Fixes: 8fc9cf420b36 ("slab: make more slab management structure off the slab") Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When dropping a lock while iterating a list we must restart the search
as other threads could have manipulated the list under us. Without this
we can get stuck in an endless loop. This bug was introduced by
Thanks go to Dan Williams <dan.j.williams@intel.com> for tracking all this
prior history down.
Reported-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Fixes: bc3f02a795d3b4faa99d37390174be2a75d091bd Signed-off-by: James Bottomley <JBottomley@Odin.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Reported by Clifford and Craig for JMicron OHCI-1394 + SDHCI combo
controllers: Often or even most of the time, the controller is
initialized with the message "added OHCI v1.10 device as card 0, 4 IR +
0 IT contexts, quirks 0x10". With 0 isochronous transmit DMA contexts
(IT contexts), applications like audio output are impossible.
However, OHCI-1394 demands that at least 4 IT contexts are implemented
by the link layer controller, and indeed JMicron JMB38x do implement
four of them. Only their IsoXmitIntMask register is unreliable at early
access.
With my own JMB381 single function controller I found:
- I can reproduce the problem with a lower probability than Craig's.
- If I put a loop around the section which clears and reads
IsoXmitIntMask, then either the first or the second attempt will
return the correct initial mask of 0x0000000f. I never encountered
a case of needing more than a second attempt.
- Consequently, if I put a dummy reg_read(...IsoXmitIntMaskSet)
before the first write, the subsequent read will return the correct
result.
- If I merely ignore a wrong read result and force the known real
result, later isochronous transmit DMA usage works just fine.
So let's just fix this chip bug up by the latter method. Tested with
JMB381 on kernel 3.13 and 4.3.
Since OHCI-1394 generally requires 4 IT contexts at a minium, this
workaround is simply applied whenever the initial read of IsoXmitIntMask
returns 0, regardless whether it's a JMicron chip or not. I never heard
of this issue together with any other chip though.
I am not 100% sure that this fix works on the OHCI-1394 part of JMB380
and JMB388 combo controllers exactly the same as on the JMB381 single-
function controller, but so far I haven't had a chance to let an owner
of a combo chip run a patched kernel.
Strangely enough, IsoRecvIntMask is always reported correctly, even
though it is probed right before IsoXmitIntMask.
Reported-by: Clifford Dunn Reported-by: Craig Moore <craig.moore@qenos.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
HP ProBook 6550b needs the same pin fixup applied to other HP B-series
laptops with docks for making its headphone and dock headphone jacks
working properly. We just need to add the codec SSID to the list.
Bugzilla: https://bugzilla.kernel.org/attachment.cgi?id=191971 Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
During probe if the regulator could not be enabled, the error exit path
would still disable it. This could lead to unbalanced counter of
regulator enable/disable.
The patch moves code for getting and enabling the regulator from
exynos_map_dt_data() to probe function because it is really not a part
of getting Device Tree properties.
Acked-by: Lukasz Majewski <l.majewski@samsung.com> Tested-by: Lukasz Majewski <l.majewski@samsung.com> Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Fixes: 5f09a5cbd14a ("thermal: exynos: Disable the regulator on probe failure") Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
In nop_mcount, shdr->sh_offset and welp->r_offset should handle
endianness properly, otherwise it will trigger Segmentation fault
if the recordmcount main and file.o have different endianness.
Link: http://lkml.kernel.org/r/563806C7.7070606@huawei.com Signed-off-by: Li Bin <huawei.libin@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
There are multiple factors adding to the issue in different
configurations:
- commit 17290231df16eeee ("xtensa: add fixup for double exception raised
in window overflow") added function window_overflow_restore_a0_fixup to
double exception vector overlapping reset vector location of secondary
processor cores.
- on MMUv2 cores RESET_VECTOR1_VADDR may point to uncached kernel memory
making code overlapping depend on cache type and size, so that without
cache or with WT cache reset vector code overwrites double exception
code, making issue even harder to detect.
- on MMUv3 cores RESET_VECTOR1_VADDR may point to unmapped area, as
MMUv3 cores change virtual address map to match MMUv2 layout, but
reset vector virtual address is given for the original MMUv3 mapping.
- physical memory region of the secondary reset vector is not reserved
in the physical memory map, and thus may be allocated and overwritten
at arbitrary moment.
Fix it as follows:
- move window_overflow_restore_a0_fixup code to .text section.
- define RESET_VECTOR1_VADDR so that it points to reset vector in the
cacheable MMUv2 map for cores with MMU.
- reserve reset vector region in the physical memory map. Drop separate
literal section and build mxhead.S with text section literals.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
In TDLS channel-switch operations the chandef can sometimes be NULL.
Avoid an oops in the trace code for these cases and just print a
chandef full of zeros.
Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This is an issue on SMAP enabled CPUs and 32 bit apps running on 64 bit
OS. Do not access user memory from kernel code. The SMAP bit restricts
accessing user memory from kernel code.
Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Build-time fixes:
- make lbeg/lend/lcount save/restore conditional on kernel entry;
- don't clear lcount in platform_restart functions unconditionally.
Run-time fixes:
- use correct end of range register in __endla paired with __loopt, not
the unused temporary register. This fixes .bss zero-initialization.
Update comments in asmmacro.h;
- don't clobber a10 in the usercopy that leads to access to unmapped
memory.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Reported-by: Keith Webb <khwebb@gmail.com> Suggested-by: Keith Webb <khwebb@gmail.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=106671 Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1446209424-28801-1-git-send-email-jani.nikula@intel.com Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
That commit introduced a regression at least for the case of the SG_IO ioctl()
running without CAP_SYS_RAWIO capability (e.g., unprivileged users) when there
are no active paths: the ioctl() fails with the ENOTTY errno immediately rather
than blocking due to queue_if_no_path until a path becomes active, for example.
That case happens to be exercised by QEMU KVM guests with 'scsi-block' devices
(qemu "-device scsi-block" [1], libvirt "<disk type='block' device='lun'>" [2])
from multipath devices; which leads to SCSI/filesystem errors in such a guest.
More general scenarios can hit that regression too. The following demonstration
employs a SG_IO ioctl() with a standard SCSI INQUIRY command for this objective
(some output & user changes omitted for brevity and comments added for clarity).
Reverting that commit restores normal operation (queueing) in failing scenarios;
tested on linux-next (next-20151022).
1) Test-case is based on sg_simple0 [3] (just SG_IO; remove SG_GET_VERSION_NUM)
$ cat sg_simple0.c
... see [3] ...
$ sed '/SG_GET_VERSION_NUM/,/}/d' sg_simple0.c > sgio_inquiry.c
$ gcc sgio_inquiry.c -o sgio_inquiry
2) The ioctl() works fine with active paths present.
5) The ioctl() only works if the SYS_CAP_RAWIO capability is present
(then queueing happens -- in this example, queue_if_no_path is set);
this is due to a conditional check in scsi_verify_blk_ioctl().
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Commit 073db4a51ee4 ("mtd: fix: avoid race condition when accessing
mtd->usecount") fixed a race condition but due to poor ordering of the
mutex acquisition, introduced a potential deadlock.
The deadlock can occur, for example, when rmmod'ing the m25p80 module, which
will delete one or more MTDs, along with any corresponding mtdblock
devices. This could potentially race with an acquisition of the block
device as follows.
This is a classic (potential) ABBA deadlock, which can be fixed by
making the A->B ordering consistent everywhere. There was no real
purpose to the ordering in the original patch, AFAIR, so this shouldn't
be a problem. This ordering was actually already present in
del_mtd_blktrans_dev(), for one, where the function tried to ensure that
its caller already held mtd_table_mutex before it acquired &dev->lock:
if (mutex_trylock(&mtd_table_mutex)) {
mutex_unlock(&mtd_table_mutex);
BUG();
}
So, reverse the ordering of acquisition of &dev->lock and &mtd_table_mutex so
we always acquire mtd_table_mutex first.
The sizeof() is invoked on an incorrect variable, likely due to some
copy-paste error, and this might result in memory corruption. Fix this.
Signed-off-by: Marek Vasut <marex@denx.de> Cc: Wolfgang Grandegger <wg@grandegger.com> Cc: netdev@vger.kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
For reasons not entirely apparent, but now enshrined in history, the
architectural mapping of AArch32 banked registers to AArch64 registers
actually orders SP_<mode> and LR_<mode> backwards compared to the
intuitive r13/r14 order, for all modes except FIQ.
Fix the compat_<reg>_<mode> macros accordingly, in the hope of avoiding
subtle bugs with KVM and AArch32 guests.
Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
We seemed to have missed a few corner cases in commit f6c137ff00a4
("KVM: s390: randomize sca address").
The SCA has a maximum size of 2112 bytes. By setting the sca_offset to
some unlucky numbers, we exceed the page.
0x7c0 (1984) -> Fits exactly
0x7d0 (2000) -> 16 bytes out
0x7e0 (2016) -> 32 bytes out
0x7f0 (2032) -> 48 bytes out
One VCPU entry is 32 bytes long.
For the last two cases, we actually write data to the other page.
1. The address of the VCPU.
2. Injection/delivery/clearing of SIGP externall calls via SIGP IF.
Especially the 2. happens regularly. So this could produce two problems:
1. The guest losing/getting external calls.
2. Random memory overwrites in the host.
So this problem happens on every 127 + 128 created VM with 64 VCPUs.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Do not use PAGE_SIZE marco to calculate max_sectors per I/O
request. Driver code assumes PAGE_SIZE will be always 4096 which can
lead to wrongly calculated value if PAGE_SIZE is not 4096. This issue
was reported in Ubuntu Bugzilla Bug #1475166.
Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Currently when the system is trying to uninstall the ACPI interrupt
handler, it uses acpi_gbl_FADT.sci_interrupt as the IRQ number.
However, the IRQ number that the ACPI interrupt handled is installed
for comes from acpi_gsi_to_irq() and that is the number that should
be used for the handler removal.
Fix this problem by using the mapped IRQ returned from acpi_gsi_to_irq()
as appropriate.
Acked-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This adds the USB ID for the Sitecom WLA2100. The Windows 10 inf file
was checked to verify that the addition is correct.
Reported-by: Frans van de Wiel <fvdw@fvdw.eu> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Frans van de Wiel <fvdw@fvdw.eu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The DMA-slave configuration depends on the whether <= 8 or > 8 bits
are transferred per word, so we need to call
atmel_spi_dma_slave_config() with the correct value.
Signed-off-by: David Mosberger <davidm@egauge.net> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The HIDP specs define an idle-timeout which automatically disconnects a
device. This has always been implemented in the HIDP layer and forced a
synchronous shutdown of the hidp-scheduler. This works just fine, but
lacks a forced disconnect on the underlying l2cap channels. This has been
broken since:
The old session-management always forced an l2cap error on the ctrl/intr
channels when shutting down. The new session-management skips this, as we
don't want to enforce channel policy on the caller. In other words, if
user-space removes an HIDP device, the underlying channels (which are
*owned* and *referenced* by user-space) are still left active. User-space
needs to call shutdown(2) or close(2) to release them.
Unfortunately, this does not work with idle-timeouts. There is no way to
signal user-space that the HIDP layer has been stopped. The API simply
does not support any event-passing except for poll(2). Hence, we restore
old behavior and force EUNATCH on the sockets if the HIDP layer is
disconnected due to idle-timeouts (behavior of explicit disconnects
remains unmodified). User-space can still call
getsockopt(..., SO_ERROR, ...)
..to retrieve the EUNATCH error and clear sk_err. Hence, the channels can
still be re-used (which nobody does so far, though). Therefore, the API
still supports the new behavior, but with this patch it's also compatible
to the old implicit channel shutdown.
Reported-by: Mark Haun <haunma@keteu.org> Reported-by: Luiz Augusto von Dentz <luiz.dentz@gmail.com> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
In videobuf2 dma-contig memory type the prepare and finish ops, instead of
passing the number of entries in the original scatterlist as the "nents"
parameter to dma_sync_sg_for_device() and dma_sync_sg_for_cpu(), the value
returned by dma_map_sg() was used. Albeit this has been suggested in
comments of some implementations (which have since been corrected), this
is wrong.
Fixes: 199d101efdba ("v4l: vb2-dma-contig: add prepare/finish to dma-contig allocator") Signed-off-by: Tiffany Lin <tiffany.lin@mediatek.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The following warning occurs when DW SPI is compiled as a module and it's a PCI
device. On the removal stage pcibios_free_irq() is called earlier than
free_irq() due to the latter is called at managed resources free strage.
Explicitly call free_irq() at removal stage of the DW SPI driver.
Fixes: 04f421e7b0b1 (spi: dw: use managed resources) Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
At ibm vtpm initialzation, tpm_ibmvtpm_probe() registers its interrupt
handler, ibmvtpm_interrupt, which calls ibmvtpm_crq_process to allocate
memory for rtce buffer. The current code uses 'GFP_KERNEL' as the
type of kernel memory allocation, which resulted a warning at
kernel/lockdep.c. This patch uses 'GFP_ATOMIC' instead so that the
allocation is high-priority and does not sleep.
Signed-off-by: Hon Ching(Vicky) Lo <honclo@linux.vnet.ibm.com> Signed-off-by: Peter Huewe <peterhuewe@gmx.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
If a EXT4 filesystem utilizes JBD2 journaling and an error occurs, the
journaling will be aborted first and the error number will be recorded
into JBD2 superblock and, finally, the system will enter into the
panic state in "errors=panic" option. But, in the rare case, this
sequence is little twisted like the below figure and it will happen
that the system enters into panic state, which means the system reset
in mobile environment, before completion of recording an error in the
journal superblock. In this case, e2fsck cannot recognize that the
filesystem failure occurred in the previous run and the corruption
wouldn't be fixed.
"group" is the group where the backup will be placed, and is
initialized to zero in the declaration. This meant that backups for
meta_bg descriptors were erroneously written to the backup block group
descriptors in groups 1 and (desc_per_block-1).
There is a use-after-free possibility in __ext4_journal_stop() in the
case that we free the handle in the first jbd2_journal_stop() because
we're referencing handle->h_err afterwards. This was introduced in 9705acd63b125dee8b15c705216d7186daea4625 and it is wrong. Fix it by
storing the handle->h_err value beforehand and avoid referencing
potentially freed handle.
Fixes: 9705acd63b125dee8b15c705216d7186daea4625 Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When truncating a file to a smaller size which consists of an inline
extent that is compressed, we did not discard (or made unusable) the
data between the new file size and the old file size, wasting metadata
space and allowing for the truncated data to be leaked and the data
corruption/loss mentioned below.
We were also not correctly decrementing the number of bytes used by the
inode, we were setting it to zero, giving a wrong report for callers of
the stat(2) syscall. The fsck tool also reported an error about a mismatch
between the nbytes of the file versus the real space used by the file.
Now because we weren't discarding the truncated region of the file, it
was possible for a caller of the clone ioctl to actually read the data
that was truncated, allowing for a security breach without requiring root
access to the system, using only standard filesystem operations. The
scenario is the following:
1) User A creates a file which consists of an inline and compressed
extent with a size of 2000 bytes - the file is not accessible to
any other users (no read, write or execution permission for anyone
else);
2) The user truncates the file to a size of 1000 bytes;
3) User A makes the file world readable;
4) User B creates a file consisting of an inline extent of 2000 bytes;
5) User B issues a clone operation from user A's file into its own
file (using a length argument of 0, clone the whole range);
6) User B now gets to see the 1000 bytes that user A truncated from
its file before it made its file world readbale. User B also lost
the bytes in the range [1000, 2000[ bytes from its own file, but
that might be ok if his/her intention was reading stale data from
user A that was never supposed to be public.
Note that this contrasts with the case where we truncate a file from 2000
bytes to 1000 bytes and then truncate it back from 1000 to 2000 bytes. In
this case reading any byte from the range [1000, 2000[ will return a value
of 0x00, instead of the original data.
This problem exists since the clone ioctl was added and happens both with
and without my recent data loss and file corruption fixes for the clone
ioctl (patch "Btrfs: fix file corruption and data loss after cloning
inline extents").
So fix this by truncating the compressed inline extents as we do for the
non-compressed case, which involves decompressing, if the data isn't already
in the page cache, compressing the truncated version of the extent, writing
the compressed content into the inline extent and then truncate it.
The following test case for fstests reproduces the problem. In order for
the test to pass both this fix and my previous fix for the clone ioctl
that forbids cloning a smaller inline extent into a larger one,
which is titled "Btrfs: fix file corruption and data loss after cloning
inline extents", are needed. Without that other fix the test fails in a
different way that does not leak the truncated data, instead part of
destination file gets replaced with zeroes (because the destination file
has a larger inline extent than the source).
seq=`basename $0`
seqres=$RESULT_DIR/$seq
echo "QA output created by $seq"
tmp=/tmp/$$
status=1 # failure is the default!
trap "_cleanup; exit \$status" 0 1 2 3 15
_cleanup()
{
rm -f $tmp.*
}
# get standard environment, filters and checks
. ./common/rc
. ./common/filter
# real QA test starts here
_need_to_be_root
_supported_fs btrfs
_supported_os Linux
_require_scratch
_require_cloner
# Create our test files. File foo is going to be the source of a clone operation
# and consists of a single inline extent with an uncompressed size of 512 bytes,
# while file bar consists of a single inline extent with an uncompressed size of
# 256 bytes. For our test's purpose, it's important that file bar has an inline
# extent with a size smaller than foo's inline extent.
$XFS_IO_PROG -f -c "pwrite -S 0xa1 0 128" \
-c "pwrite -S 0x2a 128 384" \
$SCRATCH_MNT/foo | _filter_xfs_io
$XFS_IO_PROG -f -c "pwrite -S 0xbb 0 256" $SCRATCH_MNT/bar | _filter_xfs_io
# Now durably persist all metadata and data. We do this to make sure that we get
# on disk an inline extent with a size of 512 bytes for file foo.
sync
# Now truncate our file foo to a smaller size. Because it consists of a
# compressed and inline extent, btrfs did not shrink the inline extent to the
# new size (if the extent was not compressed, btrfs would shrink it to 128
# bytes), it only updates the inode's i_size to 128 bytes.
$XFS_IO_PROG -c "truncate 128" $SCRATCH_MNT/foo
# Now clone foo's inline extent into bar.
# This clone operation should fail with errno EOPNOTSUPP because the source
# file consists only of an inline extent and the file's size is smaller than
# the inline extent of the destination (128 bytes < 256 bytes). However the
# clone ioctl was not prepared to deal with a file that has a size smaller
# than the size of its inline extent (something that happens only for compressed
# inline extents), resulting in copying the full inline extent from the source
# file into the destination file.
#
# Note that btrfs' clone operation for inline extents consists of removing the
# inline extent from the destination inode and copy the inline extent from the
# source inode into the destination inode, meaning that if the destination
# inode's inline extent is larger (N bytes) than the source inode's inline
# extent (M bytes), some bytes (N - M bytes) will be lost from the destination
# file. Btrfs could copy the source inline extent's data into the destination's
# inline extent so that we would not lose any data, but that's currently not
# done due to the complexity that would be needed to deal with such cases
# (specially when one or both extents are compressed), returning EOPNOTSUPP, as
# it's normally not a very common case to clone very small files (only case
# where we get inline extents) and copying inline extents does not save any
# space (unlike for normal, non-inlined extents).
$CLONER_PROG -s 0 -d 0 -l 0 $SCRATCH_MNT/foo $SCRATCH_MNT/bar
# Now because the above clone operation used to succeed, and due to foo's inline
# extent not being shinked by the truncate operation, our file bar got the whole
# inline extent copied from foo, making us lose the last 128 bytes from bar
# which got replaced by the bytes in range [128, 256[ from foo before foo was
# truncated - in other words, data loss from bar and being able to read old and
# stale data from foo that should not be possible to read anymore through normal
# filesystem operations. Contrast with the case where we truncate a file from a
# size N to a smaller size M, truncate it back to size N and then read the range
# [M, N[, we should always get the value 0x00 for all the bytes in that range.
# We expected the clone operation to fail with errno EOPNOTSUPP and therefore
# not modify our file's bar data/metadata. So its content should be 256 bytes
# long with all bytes having the value 0xbb.
#
# Without the btrfs bug fix, the clone operation succeeded and resulted in
# leaking truncated data from foo, the bytes that belonged to its range
# [128, 256[, and losing data from bar in that same range. So reading the
# file gave us the following content:
#
# 0000000 a1 a1 a1 a1 a1 a1 a1 a1 a1 a1 a1 a1 a1 a1 a1 a1
# *
# 0000200 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a
# *
# 0000400
echo "File bar's content after the clone operation:"
od -t x1 $SCRATCH_MNT/bar
# Also because the foo's inline extent was not shrunk by the truncate
# operation, btrfs' fsck, which is run by the fstests framework everytime a
# test completes, failed reporting the following error:
#
# root 5 inode 257 errors 400, nbytes wrong
status=0
exit
Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The VT-d specification says that "Software must enable ATS on endpoint
devices behind a Root Port only if the Root Port is reported as
supporting ATS transactions."
We walk up the tree to find a Root Port, but for integrated devices we
don't find one — we get to the host bridge. In that case we *should*
allow ATS. Currently we don't, which means that we are incorrectly
failing to use ATS for the integrated graphics. Fix that.
We should never break out of this loop "naturally" with bus==NULL,
since we'll always find bridge==NULL in that case (and now return 1).
So remove the check for (!bridge) after the loop, since it can never
happen. If it did, it would be worthy of a BUG_ON(!bridge). But since
it'll oops anyway in that case, that'll do just as well.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Currently the clone ioctl allows to clone an inline extent from one file
to another that already has other (non-inlined) extents. This is a problem
because btrfs is not designed to deal with files having inline and regular
extents, if a file has an inline extent then it must be the only extent
in the file and must start at file offset 0. Having a file with an inline
extent followed by regular extents results in EIO errors when doing reads
or writes against the first 4K of the file.
Also, the clone ioctl allows one to lose data if the source file consists
of a single inline extent, with a size of N bytes, and the destination
file consists of a single inline extent with a size of M bytes, where we
have M > N. In this case the clone operation removes the inline extent
from the destination file and then copies the inline extent from the
source file into the destination file - we lose the M - N bytes from the
destination file, a read operation will get the value 0x00 for any bytes
in the the range [N, M] (the destination inode's i_size remained as M,
that's why we can read past N bytes).
So fix this by not allowing such destructive operations to happen and
return errno EOPNOTSUPP to user space.
Currently the fstest btrfs/035 tests the data loss case but it totally
ignores this - i.e. expects the operation to succeed and does not check
the we got data loss.
The following test case for fstests exercises all these cases that result
in file corruption and data loss:
seq=`basename $0`
seqres=$RESULT_DIR/$seq
echo "QA output created by $seq"
tmp=/tmp/$$
status=1 # failure is the default!
trap "_cleanup; exit \$status" 0 1 2 3 15
_cleanup()
{
rm -f $tmp.*
}
# get standard environment, filters and checks
. ./common/rc
. ./common/filter
# real QA test starts here
_need_to_be_root
_supported_fs btrfs
_supported_os Linux
_require_scratch
_require_cloner
_require_btrfs_fs_feature "no_holes"
_require_btrfs_mkfs_feature "no-holes"
rm -f $seqres.full
test_cloning_inline_extents()
{
local mkfs_opts=$1
local mount_opts=$2
# File bar, the source for all the following clone operations, consists
# of a single inline extent (50 bytes).
$XFS_IO_PROG -f -c "pwrite -S 0xbb 0 50" $SCRATCH_MNT/bar \
| _filter_xfs_io
# Test cloning into a file with an extent (non-inlined) where the
# destination offset overlaps that extent. It should not be possible to
# clone the inline extent from file bar into this file.
$XFS_IO_PROG -f -c "pwrite -S 0xaa 0K 16K" $SCRATCH_MNT/foo \
| _filter_xfs_io
$CLONER_PROG -s 0 -d 0 -l 0 $SCRATCH_MNT/bar $SCRATCH_MNT/foo
# Doing IO against any range in the first 4K of the file should work.
# Due to a past clone ioctl bug which allowed cloning the inline extent,
# these operations resulted in EIO errors.
echo "File foo data after clone operation:"
# All bytes should have the value 0xaa (clone operation failed and did
# not modify our file).
od -t x1 $SCRATCH_MNT/foo
$XFS_IO_PROG -c "pwrite -S 0xcc 0 100" $SCRATCH_MNT/foo | _filter_xfs_io
# Test cloning the inline extent against a file which has a hole in its
# first 4K followed by a non-inlined extent. It should not be possible
# as well to clone the inline extent from file bar into this file.
$XFS_IO_PROG -f -c "pwrite -S 0xdd 4K 12K" $SCRATCH_MNT/foo2 \
| _filter_xfs_io
$CLONER_PROG -s 0 -d 0 -l 0 $SCRATCH_MNT/bar $SCRATCH_MNT/foo2
# Doing IO against any range in the first 4K of the file should work.
# Due to a past clone ioctl bug which allowed cloning the inline extent,
# these operations resulted in EIO errors.
echo "File foo2 data after clone operation:"
# All bytes should have the value 0x00 (clone operation failed and did
# not modify our file).
od -t x1 $SCRATCH_MNT/foo2
$XFS_IO_PROG -c "pwrite -S 0xee 0 90" $SCRATCH_MNT/foo2 | _filter_xfs_io
# Test cloning the inline extent against a file which has a size of zero
# but has a prealloc extent. It should not be possible as well to clone
# the inline extent from file bar into this file.
$XFS_IO_PROG -f -c "falloc -k 0 1M" $SCRATCH_MNT/foo3 | _filter_xfs_io
$CLONER_PROG -s 0 -d 0 -l 0 $SCRATCH_MNT/bar $SCRATCH_MNT/foo3
# Doing IO against any range in the first 4K of the file should work.
# Due to a past clone ioctl bug which allowed cloning the inline extent,
# these operations resulted in EIO errors.
echo "First 50 bytes of foo3 after clone operation:"
# Should not be able to read any bytes, file has 0 bytes i_size (the
# clone operation failed and did not modify our file).
od -t x1 $SCRATCH_MNT/foo3
$XFS_IO_PROG -c "pwrite -S 0xff 0 90" $SCRATCH_MNT/foo3 | _filter_xfs_io
# Test cloning the inline extent against a file which consists of a
# single inline extent that has a size not greater than the size of
# bar's inline extent (40 < 50).
# It should be possible to do the extent cloning from bar to this file.
$XFS_IO_PROG -f -c "pwrite -S 0x01 0 40" $SCRATCH_MNT/foo4 \
| _filter_xfs_io
$CLONER_PROG -s 0 -d 0 -l 0 $SCRATCH_MNT/bar $SCRATCH_MNT/foo4
# Doing IO against any range in the first 4K of the file should work.
echo "File foo4 data after clone operation:"
# Must match file bar's content.
od -t x1 $SCRATCH_MNT/foo4
$XFS_IO_PROG -c "pwrite -S 0x02 0 90" $SCRATCH_MNT/foo4 | _filter_xfs_io
# Test cloning the inline extent against a file which consists of a
# single inline extent that has a size greater than the size of bar's
# inline extent (60 > 50).
# It should not be possible to clone the inline extent from file bar
# into this file.
$XFS_IO_PROG -f -c "pwrite -S 0x03 0 60" $SCRATCH_MNT/foo5 \
| _filter_xfs_io
$CLONER_PROG -s 0 -d 0 -l 0 $SCRATCH_MNT/bar $SCRATCH_MNT/foo5
# Reading the file should not fail.
echo "File foo5 data after clone operation:"
# Must have a size of 60 bytes, with all bytes having a value of 0x03
# (the clone operation failed and did not modify our file).
od -t x1 $SCRATCH_MNT/foo5
# Test cloning the inline extent against a file which has no extents but
# has a size greater than bar's inline extent (16K > 50).
# It should not be possible to clone the inline extent from file bar
# into this file.
$XFS_IO_PROG -f -c "truncate 16K" $SCRATCH_MNT/foo6 | _filter_xfs_io
$CLONER_PROG -s 0 -d 0 -l 0 $SCRATCH_MNT/bar $SCRATCH_MNT/foo6
# Reading the file should not fail.
echo "File foo6 data after clone operation:"
# Must have a size of 16K, with all bytes having a value of 0x00 (the
# clone operation failed and did not modify our file).
od -t x1 $SCRATCH_MNT/foo6
# Test cloning the inline extent against a file which has no extents but
# has a size not greater than bar's inline extent (30 < 50).
# It should be possible to clone the inline extent from file bar into
# this file.
$XFS_IO_PROG -f -c "truncate 30" $SCRATCH_MNT/foo7 | _filter_xfs_io
$CLONER_PROG -s 0 -d 0 -l 0 $SCRATCH_MNT/bar $SCRATCH_MNT/foo7
# Reading the file should not fail.
echo "File foo7 data after clone operation:"
# Must have a size of 50 bytes, with all bytes having a value of 0xbb.
od -t x1 $SCRATCH_MNT/foo7
# Test cloning the inline extent against a file which has a size not
# greater than the size of bar's inline extent (20 < 50) but has
# a prealloc extent that goes beyond the file's size. It should not be
# possible to clone the inline extent from bar into this file.
$XFS_IO_PROG -f -c "falloc -k 0 1M" \
-c "pwrite -S 0x88 0 20" \
$SCRATCH_MNT/foo8 | _filter_xfs_io
$CLONER_PROG -s 0 -d 0 -l 0 $SCRATCH_MNT/bar $SCRATCH_MNT/foo8
echo "File foo8 data after clone operation:"
# Must have a size of 20 bytes, with all bytes having a value of 0x88
# (the clone operation did not modify our file).
od -t x1 $SCRATCH_MNT/foo8
_scratch_unmount
}
echo -e "\nTesting without compression and without the no-holes feature...\n"
test_cloning_inline_extents
echo -e "\nTesting with compression and without the no-holes feature...\n"
test_cloning_inline_extents "" "-o compress"
echo -e "\nTesting without compression and with the no-holes feature...\n"
test_cloning_inline_extents "-O no-holes" ""
echo -e "\nTesting with compression and with the no-holes feature...\n"
test_cloning_inline_extents "-O no-holes" "-o compress"
status=0
exit
Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Commit cb7323fffa85 ("lockd: create and use per-net NSM
RPC clients on MON/UNMON requests") introduced per-net
NSM RPC clients. Unfortunately this doesn't make any sense
without per-net nsm_handle.
E.g. the following scenario could happen
Two hosts (X and Y) in different namespaces (A and B) share
the same nsm struct.
1. nsm_monitor(host_X) called => NSM rpc client created,
nsm->sm_monitored bit set.
2. nsm_mointor(host-Y) called => nsm->sm_monitored already set,
we just exit. Thus in namespace B ln->nsm_clnt == NULL.
3. host X destroyed => nsm->sm_count decremented to 1
4. host Y destroyed => nsm_unmonitor() => nsm_mon_unmon() => NULL-ptr
dereference of *ln->nsm_clnt
So this could be fixed by making per-net nsm_handles list,
instead of global. Thus different net namespaces will not be able
share the same nsm_handle.
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Writing invalid command to QSPI_SPI_CMD_REG will terminate current
transfer and de-assert the chip select. This has to be done before
calling spi_finalize_current_message(). Because
spi_finalize_current_message() will mark the end of current message
transfer and schedule the next transfer. If the chipselect is not
de-asserted before calling spi_finalize_current_message() then the next
transfer will overlap with the previous transfer leading to data
corruption.
__spi_pump_message() can be called either from kthread worker context or
directly from the calling process's context. It is possible that these
two calls can race against each other. But race is serialized by
checking whether master->cur_msg == NULL (pointer to msg being handled
by transfer_one() at present). The master->cur_msg is set to NULL when
spi_finalize_current_message() is called on that message, which means
calling spi_finalize_current_message() allows __spi_sync() to pump next
message in calling process context.
Now if spi-ti-qspi calls spi_finalize_current_message() before we
terminate transfer at hardware side, if __spi_pump_message() is called
from process context then the successive transactions can overlap.
Fix this by moving writing invalid command to QSPI_SPI_CMD_REG to
before calling spi_finalize_current_message() call.
Signed-off-by: Vignesh R <vigneshr@ti.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
IOMMU-based dma_mmap() implementation lacked proper support for offset
parameter used in mmap call (it always assumed that mapping starts from
offset zero). This patch adds support for offset parameter to IOMMU-based
implementation.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
dma_mmap() function in IOMMU-based dma-mapping implementation lacked
a check for valid range of mmap parameters (offset and buffer size), what
might have caused access beyond the allocated buffer. This patch fixes
this issue.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
If we fail to allocate a partition structure in the middle of the partition
creation process, the already allocated partitions are never removed, which
means they are still present in the partition list and their resources are
never freed.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>