git.ipfire.org Git - thirdparty/kernel/stable.git/log
Soren Brinkmann [Wed, 11 Dec 2013 00:07:19 +0000 (16:07 -0800)] 
net: macb: Migrate to dev_pm_ops

Migrate the suspend/resume functions to use the dev_pm_ops PM interface.

Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Tue, 10 Dec 2013 12:55:33 +0000 (20:55 +0800)] 
net_sched: sfq: put sfq_unlink in a do - while loop

Macros with multiple statements should be enclosed in a do-while loop.
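
As an illustration of the rule above, here is a minimal sketch. The node type, the NODE_UNLINK macro and unlink_if() are hypothetical names made up for this example; they are not the sfq code touched by the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ring node; not the structures sfq actually uses. */
struct node {
    struct node *next;
    struct node *prev;
};

/*
 * Two assignments wrapped in do { } while (0), so the macro expands to
 * exactly one statement and stays correct under an unbraced if/else.
 */
#define NODE_UNLINK(n) do {              \
    (n)->prev->next = (n)->next;         \
    (n)->next->prev = (n)->prev;         \
} while (0)

/* Unlink n from the ring only when cond is non-zero; return the ring length. */
int unlink_if(struct node *head, struct node *n, int cond)
{
    struct node *p;
    int len = 0;

    assert(head != NULL && n != NULL && n != head);

    if (cond)
        NODE_UNLINK(n);
    else
        len = 0; /* this else only compiles because the macro is one statement */

    for (p = head->next; p != head; p = p->next)
        len++;
    return len;
}
```

Written as two bare statements, the same macro would leave the second assignment outside the if and turn the else into a syntax error.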

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Tue, 10 Dec 2013 12:55:32 +0000 (20:55 +0800)] 
net_sched: add space around '>' and before '('

Spaces required around that '>' (ctx:VxV) and
before the open parenthesis '('.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Tue, 10 Dec 2013 12:55:31 +0000 (20:55 +0800)] 
net_sched: change "foo* bar" to "foo *bar"

"foo* bar" or "foo * bar" should be "foo *bar".

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Tue, 10 Dec 2013 12:55:30 +0000 (20:55 +0800)] 
net_sched: cls_bpf: use tabs to do indent

Code indent should use tabs where possible.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Tue, 10 Dec 2013 12:55:29 +0000 (20:55 +0800)] 
net_sched: remove unnecessary parentheses while return

return is not a function, parentheses are not required.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jingoo Han [Tue, 10 Dec 2013 03:52:14 +0000 (12:52 +0900)] 
atm: solos-pci: remove unnecessary pci_set_drvdata()

The driver core clears the driver data to NULL after device_release
or on probe failure, so there is no need to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jingoo Han [Tue, 10 Dec 2013 03:51:46 +0000 (12:51 +0900)] 
atm: he: remove unnecessary pci_set_drvdata()

The driver core clears the driver data to NULL after device_release
or on probe failure, so there is no need to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jingoo Han [Tue, 10 Dec 2013 03:51:09 +0000 (12:51 +0900)] 
net: ieee802154: remove unnecessary spi_set_drvdata()

The driver core clears the driver data to NULL after device_release
or on probe failure, so there is no need to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jingoo Han [Tue, 10 Dec 2013 03:50:42 +0000 (12:50 +0900)] 
net: phy: spi_ks8995: remove unnecessary spi_set_drvdata()

The driver core clears the driver data to NULL after device_release
or on probe failure, so there is no need to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jingoo Han [Tue, 10 Dec 2013 03:50:09 +0000 (12:50 +0900)] 
net: vmxnet3: remove unnecessary pci_set_drvdata()

The driver core clears the driver data to NULL after device_release
or on probe failure, so there is no need to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Shreyas N Bhatewara <sbhatewara@vmware.com>
Acked-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jingoo Han [Tue, 10 Dec 2013 03:49:34 +0000 (12:49 +0900)] 
net: fddi: remove unnecessary pci_set_drvdata()

The driver core clears the driver data to NULL after device_release
or on probe failure, so there is no need to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jingoo Han [Tue, 10 Dec 2013 03:48:38 +0000 (12:48 +0900)] 
net: hippi: remove unnecessary pci_set_drvdata()

The driver core clears the driver data to NULL after device_release
or on probe failure, so there is no need to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Tue, 10 Dec 2013 00:18:45 +0000 (16:18 -0800)] 
virtio_net: spelling fixes

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Tue, 10 Dec 2013 00:17:40 +0000 (16:17 -0800)] 
virtio_net: remove unused parameter to send_command

All the code passes NULL for the last sg list (in).
Simplify by just removing it.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yann Droneaud [Mon, 9 Dec 2013 21:42:20 +0000 (22:42 +0100)] 
net: handle error more gracefully in socketpair()

This patch makes socketpair() use error paths which do not
rely on a heavy-weight call to sys_close(): it's better to try
to push the file descriptor to userspace before installing
the socket file to the file descriptor, so that errors are
caught earlier and are easier to handle.

Using sys_close() seems to be the exception, while writing the
file descriptor before installing it looks like it's more or less
the norm: e.g. except for code used in init/, error handling
involves fput() and put_unused_fd(), but not sys_close().

This makes socketpair() usage of sys_close() quite unusual.
So it deserves to be replaced by the common pattern relying on
fput() and put_unused_fd() just like, for example, the one used
in pipe(2) or recvmsg(2).

Three distinct error paths are still needed since calling
fput() on file structure returned by sock_alloc_file() will
implicitly call sock_release() on the associated socket
structure.

Cc: David S. Miller <davem@davemloft.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://marc.info/?i=1385979146-13825-1-git-send-email-ydroneaud@opteya.com
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 11 Dec 2013 03:06:18 +0000 (22:06 -0500)] 
Revert "macvtap: remove useless codes in macvtap_aio_read() and macvtap_recvmsg()"

This reverts commit 41e4af69a5984a3193ba3108fb4e067b0e34dc73.

MSG_TRUNC handling was broken and is going to be fixed in the
'net' tree, so revert this.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 11 Dec 2013 03:05:45 +0000 (22:05 -0500)] 
Revert "tun: remove useless codes in tun_chr_aio_read() and tun_recvmsg()"

This reverts commit 73713357ab58aacda1af715bb5a623528dbbfd79.

MSG_TRUNC handling was broken and is going to be fixed in
the 'net' tree, so revert this.

Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Sun, 8 Dec 2013 20:15:44 +0000 (12:15 -0800)] 
net: more spelling fixes

Various spelling fixes in networking stack

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 11 Dec 2013 02:50:06 +0000 (21:50 -0500)] 
Merge branch 'ifla_flags'

Jiri Pirko says:

====================
add support for IFA_FLAGS nl attribute

As this was recently added for ipv6, add it for the rest of the occurrences
as requested by DaveM.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Sun, 8 Dec 2013 11:16:10 +0000 (12:16 +0100)] 
ipv4: add support for IFA_FLAGS nl attribute

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Sun, 8 Dec 2013 11:16:09 +0000 (12:16 +0100)] 
dn_dev: add support for IFA_FLAGS nl attribute

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Sat, 7 Dec 2013 23:59:18 +0000 (02:59 +0300)] 
sh_eth: add R8A7791 support

Add support for yet another ARM member of the R-Car family, R-Car M2, also known
as R8A7791 -- it will share the code and data with the previously added R8A7790.
Although the Ether devices in these SoCs are indistinguishable, at least from the
driver's point of view, we do introduce a new platform device ID "r8a7791-ether",
unlike the wildcard ID used for R8A7778/9 SoCs, due to the newly established policy
for the Renesas SoCs.

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 11 Dec 2013 02:30:16 +0000 (21:30 -0500)] 
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates

This series contains updates to i40e, igb, ixgbe and ixgbevf.

Shannon provides a couple of i40e patches. The first restricts the ethtool
diag test messages, using the netif_info() macro, to when the hardware
bit is enabled in the netdev message level mask.  The second
provides a fix for when there is an out-of-range descriptor request.

Kamil provides a fix for i40e by updating the loopback enum types and
adding information about the current loopback mode to the data returned from
get_link_info().

Jesse provides a fix for i40e define name that was being mis-used.
I40E_ITR_NONE was being used as an ITRN register index by accident
because it was easily associated with the i40e Rx ITR and friends
defines, when it should be associated with the DYN_CTL register sets.

Jacob provides an update for ixgbevf Kconfig description since the VF
driver supports more than just the 82599 device.

Don and Alex provide a cleanup patch for ixgbe so that head,
tail, next to clean and next to use are all reset in a single function
for both the Tx and Rx paths.  Before, the code for this was spread out over
several areas, which made it difficult to track what each of these
values was.

Carolyn provides two igb patches to add a media switching feature for
i354 PHYs and new Media Auto Sense for 82580 devices only.

Aaron Sierra provides a fix for igb to resolve an issue with the 64-bit
PCI addresses being truncated because the return values of
pci_resource_start() and pci_resource_end() were being cast to unsigned
long.

Guenter Roeck provides two igb patches. The first simplifies the code by
attaching the hwmon sysfs attributes to the hwmon device instead of the
PCI device.  The second fixes the temperature sensor attribute index by
setting it to 1 instead of 0 (per hwmon ABI).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Rafał Miłecki [Tue, 10 Dec 2013 16:19:39 +0000 (17:19 +0100)] 
bgmac: start/stop PHY on netdev open/stop

I've realized that I need to run an ethtool command to get Ethernet
working after booting. Example call: ethtool -s eth0 autoneg on
This fixed Ethernet even if auto-negotiation was already on.

Adding calls to phy_start and phy_stop looks like a real solution.

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Tue, 10 Dec 2013 22:55:07 +0000 (23:55 +0100)] 
neigh: use neigh_parms_net() to get struct neigh_parms->net pointer

This fixes a compile error when CONFIG_NET_NS is not set.

Introduced by:
commit 1d4c8c29841b9991cdf3c7cc4ba7f96a94f104ca
    "neigh: restore old behaviour of default parms values"

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Moore [Tue, 10 Dec 2013 20:00:50 +0000 (15:00 -0500)] 
cipso: cleanup cipso_v4_translate() when !CONFIG_NETLABEL

Don't needlessly recompute 'opt[opt_iter + 1]' as we already have it
stored in 'tag_len'.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Tue, 10 Dec 2013 12:56:29 +0000 (13:56 +0100)] 
ipv6 addrconf: revert /proc/net/if_inet6 ifa_flag format

It turned out that applications like ifconfig do not handle the change.
So revert ifa_flag format back to 2-letter hex value.
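
A minimal sketch of what a fixed-width, 2-digit hex field could look like once the flags are 32 bits wide. Printing only the low 8 bits is an assumption made for this illustration, and demo_format_ifa_flags() is a hypothetical helper, not the kernel's seq_file code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 32-bit flags word as exactly two hex digits by keeping only
 * the low 8 bits (assumption: legacy parsers expect a 2-character field).
 * Returns the number of characters written, as snprintf() does.
 */
int demo_format_ifa_flags(char *buf, size_t len, uint32_t flags)
{
    assert(buf != NULL && len >= 3);
    return snprintf(buf, len, "%02x", (unsigned int)(flags & 0xff));
}
```

With this approach any flag bit above bit 7 is simply not visible in the text output, which is the price of keeping old parsers working.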

Introduced by:
commit 479840ffdbe4242e8a25349218c8e0859223aa35
    "ipv6 addrconf: extend ifa_flags to u32"

Reported-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Tested-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guenter Roeck [Tue, 26 Nov 2013 07:15:34 +0000 (07:15 +0000)] 
igb: Start temperature sensor attribute index with 1

Per hwmon ABI, temperature sensor attribute index starts with 1, not 0.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Jean Delvare <khali@linux-fr.org>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Guenter Roeck [Tue, 26 Nov 2013 07:15:23 +0000 (07:15 +0000)] 
igb: Convert to use devm_hwmon_device_register_with_groups

Simplify the code. Attach hwmon sysfs attributes to hwmon device
instead of pci device. Avoid race conditions caused by attributes
being created after registration and provide mandatory 'name'
attribute by using new hwmon API.

Other cleanup:

Instead of allocating memory for hwmon attributes, move attributes
and all other hwmon related data into struct hwmon_buff and allocate
the entire structure using devm_kzalloc.

Check return value from calls to igb_add_hwmon_attr() one by one instead
of logically combining them all together.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Carolyn Wyborny [Thu, 17 Oct 2013 05:36:26 +0000 (05:36 +0000)] 
igb: Add new feature Media Auto Sense for 82580 devices only

This patch adds support for the hardware feature Media Auto Sense.  This
feature requires a custom EEPROM image provided by our customer support
team.  The feature allows hardware designed with dual PHYs, fiber and
copper, to be used with either media without additional EEPROM changes.
Fiber is preferred, and the driver will swap and configure for fiber media if
sensed by the device at any time. The device will swap back to copper if it
is the only media detected.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Aaron Sierra [Thu, 31 Oct 2013 00:32:34 +0000 (00:32 +0000)] 
igb: Support ports mapped in 64-bit PCI space

This patch resolves an issue with 64-bit PCI addresses being truncated
because the return values of pci_resource_start() and pci_resource_end()
were being cast to unsigned long.
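
The truncation can be reproduced in plain C. This is only a sketch: struct demo_bar and the helpers are hypothetical, and uint32_t stands in for unsigned long on a platform where that type is 32 bits wide (an assumption; the commit message does not name the platform).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a PCI BAR with 64-bit bus addresses. */
struct demo_bar {
    uint64_t start;
    uint64_t end;
};

/* Models the bug: the address passes through a 32-bit type and loses its top half. */
uint32_t demo_bar_start_narrow(const struct demo_bar *bar)
{
    return (uint32_t)bar->start;
}

/* Models the fix: keep the value in a type wide enough for the address. */
uint64_t demo_bar_start_wide(const struct demo_bar *bar)
{
    return bar->start;
}

/* Length of the region, computed in 64 bits. */
uint64_t demo_bar_len(const struct demo_bar *bar)
{
    assert(bar->end >= bar->start);
    return bar->end - bar->start + 1;
}
```

For a BAR placed above 4 GiB, the narrow variant returns an address that points somewhere else entirely, while the length still looks plausible, which is part of why this kind of bug is easy to miss.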

Signed-off-by: Aaron Sierra <asierra@xes-inc.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Carolyn Wyborny [Thu, 17 Oct 2013 05:23:01 +0000 (05:23 +0000)] 
igb: Add media switching feature for i354 PHY's

This patch adds a new feature which is supported in some PHYs on some i354
devices.  This feature is Auto Media Detect and allows whichever media is
detected first by the PHY to be the media used and configured by the
device.  This is a media swapping feature that is wholly contained in the
Marvell PHY.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don Skidmore [Wed, 30 Oct 2013 07:45:39 +0000 (07:45 +0000)] 
ixgbe: Focus config of head, tail ntc, and ntu all into a single function

This patch makes it so that head, tail, next to clean, and next to use are
all reset in a single function for the Tx or Rx path. Previously the code
for this was spread out over several areas which could make it difficult to
track what the values for these were.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Fri, 22 Nov 2013 05:58:15 +0000 (05:58 +0000)] 
ixgbevf: update Kconfig description

This patch updates the ixgbevf Kconfig description, as the VF driver supports
more than just the 82599 device. This patch renames the config menu item, as
well as updates the help description to make it more obvious that the driver
supports more than just a single device group.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Catherine Sullivan [Wed, 20 Nov 2013 10:03:10 +0000 (10:03 +0000)] 
i40e: Bump version number

Version updated to 0.3.13-k

Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jesse Brandeburg [Wed, 20 Nov 2013 10:03:09 +0000 (10:03 +0000)] 
i40e: remove and fix confusing define name

I40E_ITR_NONE was being used as an ITRN register index by
accident because it was easily associated with the I40E_RX_ITR
and friends defines.

Change the name slightly in order to make it clear that
I40E_ITR_NONE is really associated with the DYN_CTL register
sets.

Change-Id: I04702c027c7495b90a8bf2db85d3e085a2c7d02a
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Wed, 20 Nov 2013 10:03:08 +0000 (10:03 +0000)] 
i40e: complain about out-of-range descriptor request

Instead of silently clamping the descriptor change request into
the proper range, fail the request and complain in the log file.
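
The difference between the two behaviours, sketched with hypothetical limits and function names (DEMO_MIN_DESC, DEMO_MAX_DESC, demo_clamp_desc() and demo_check_desc() are invented for this example and are not the i40e ethtool code):

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical limits, chosen only for the example. */
#define DEMO_MIN_DESC 64
#define DEMO_MAX_DESC 4096

/* Old behaviour: silently force the request into range. */
int demo_clamp_desc(int requested)
{
    if (requested < DEMO_MIN_DESC)
        return DEMO_MIN_DESC;
    if (requested > DEMO_MAX_DESC)
        return DEMO_MAX_DESC;
    return requested;
}

/* New behaviour: reject an out-of-range request and say why. */
int demo_check_desc(int requested, int *count)
{
    assert(count != NULL);

    if (requested < DEMO_MIN_DESC || requested > DEMO_MAX_DESC) {
        fprintf(stderr, "descriptor count %d out of range [%d, %d]\n",
                requested, DEMO_MIN_DESC, DEMO_MAX_DESC);
        return -EINVAL;
    }
    *count = requested;
    return 0;
}
```

Failing the request leaves the previous setting untouched, so the caller learns that the value was not applied instead of getting a different value than the one asked for.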

Change-Id: Id55ef59255d93c04bedffa8e25fe7ea796c90f32
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Kamil Krawczyk [Wed, 20 Nov 2013 10:03:07 +0000 (10:03 +0000)] 
i40e: loopback info and set loopback fix

Add information about current loopback mode to data returned from
get_link_info function. Minor fix in set_loopback function and
update in loopback types enum.

Change-Id: I9d1c540a84ab18eef5ea6429be6331f33fc06aca
Signed-off-by: Kamil Krawczyk <kamil.krawczyk@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Wed, 20 Nov 2013 10:03:06 +0000 (10:03 +0000)] 
i40e: restrict diag test messages

Use the netif_info() macro to restrict messaging to when the HW
bit is enabled in the msglvl netdev message mask.

Change-Id: I83030d4402991cfb7da100da00f05ce502ada4ae
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jack Morgenstein [Sun, 8 Dec 2013 14:50:17 +0000 (16:50 +0200)] 
mlx4_core: Roll back round robin bitmap allocation commit for CQs, SRQs, and MPTs

Commit f4ec9e9 "mlx4_core: Change bitmap allocator to work in round-robin fashion"
introduced round-robin allocation (via bitmap) for all resources which allocate
via a bitmap.

Round robin allocation is desirable for mcgs, counters, pd's, UARs, and xrcds.
These are simply numbers, with no involvement of ICM memory mapping.

Round robin is required for QPs, since we had a problem with immediate
reuse of a 24-bit QP number (commit f4ec9e9).

However, for other resources which use the bitmap allocator and involve
mapping ICM memory -- MPTs, CQs, SRQs -- round-robin is not desirable.

What happens in these cases is the following:

ICM memory is allocated and mapped in chunks of 256K.

Since the resource allocation index goes up monotonically, the allocator
will eventually require mapping a new chunk. Now, chunks are also unmapped
when their reference count goes back to zero.  Thus, if a single app is
running and starts/exits frequently we will have the following situation:

When the app starts, a new chunk must be allocated and mapped.

When the app exits, the chunk reference count goes back to zero, and the
chunk is unmapped and freed. Therefore, the app must pay the cost of allocation
and mapping of ICM memory each time it runs (although the price is paid only when
allocating the initial entry in the new chunk).

For apps which allocate MPTs/SRQs/CQs and which operate as described above,
this presented a performance problem.

We therefore roll back the round-robin allocator modification for MPTs, CQs, SRQs.
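
The two allocation policies can be compared with a small sketch. struct demo_bitmap and its helpers are hypothetical and limited to 64 entries; they are not the mlx4 bitmap allocator.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical allocator over at most 64 indexes. */
struct demo_bitmap {
    uint64_t bits;      /* bit i set means index i is in use */
    unsigned int size;  /* valid indexes are 0 .. size - 1 */
    unsigned int last;  /* where the next round-robin search starts */
    int round_robin;    /* non-zero: continue after the last allocation */
};

void demo_bitmap_init(struct demo_bitmap *bm, unsigned int size, int round_robin)
{
    assert(size > 0 && size <= 64);
    bm->bits = 0;
    bm->size = size;
    bm->last = 0;
    bm->round_robin = round_robin;
}

/* Returns the allocated index, or -1 when every index is in use. */
int demo_bitmap_alloc(struct demo_bitmap *bm)
{
    unsigned int start = bm->round_robin ? bm->last : 0;
    unsigned int i;

    for (i = 0; i < bm->size; i++) {
        unsigned int idx = (start + i) % bm->size;

        if (!(bm->bits & (UINT64_C(1) << idx))) {
            bm->bits |= UINT64_C(1) << idx;
            bm->last = (idx + 1) % bm->size;
            return (int)idx;
        }
    }
    return -1;
}

void demo_bitmap_free(struct demo_bitmap *bm, unsigned int idx)
{
    assert(idx < bm->size);
    bm->bits &= ~(UINT64_C(1) << idx);
}
```

With round_robin set, an alloc/free/alloc sequence keeps handing out fresh indexes (0, 1, 2, ...), so an app that restarts often keeps walking into new ICM chunks. With it cleared, the same sequence returns index 0 every time and stays inside a chunk that is already likely to be mapped.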

Reported-by: Matthew Finlay <matt@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florent Fourcot [Sun, 8 Dec 2013 14:47:01 +0000 (15:47 +0100)] 
ipv6: use ip6_flowinfo helper

Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florent Fourcot [Sun, 8 Dec 2013 14:47:00 +0000 (15:47 +0100)] 
ipv6: add ip6_flowlabel helper

And use it if possible.

Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florent Fourcot [Sun, 8 Dec 2013 14:46:59 +0000 (15:46 +0100)] 
ipv6: remove rcv_tclass of ipv6_pinfo

tclass information is now already stored in rcv_flowinfo.
We do not need to store the same information twice.
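
Why the traffic class is redundant once the flow info is kept can be seen from the layout of the first 32 bits of the IPv6 header: 4 bits of version, 8 bits of traffic class and 20 bits of flow label. The sketch below works on a host-order word for simplicity (the kernel masks operate on big-endian values), and all DEMO_ and demo_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Host-order masks over the first 32-bit word of an IPv6 header. */
#define DEMO_FLOWINFO_MASK  UINT32_C(0x0FFFFFFF)  /* traffic class + flow label */
#define DEMO_FLOWLABEL_MASK UINT32_C(0x000FFFFF)  /* flow label only */
#define DEMO_TCLASS_MASK    (DEMO_FLOWINFO_MASK & ~DEMO_FLOWLABEL_MASK)

/* Build the first header word: version 6, then traffic class, then flow label. */
uint32_t demo_first_word(uint8_t tclass, uint32_t label)
{
    assert(label <= DEMO_FLOWLABEL_MASK);
    return (UINT32_C(6) << 28) | ((uint32_t)tclass << 20) | label;
}

uint32_t demo_flowinfo(uint32_t word)
{
    return word & DEMO_FLOWINFO_MASK;
}

uint32_t demo_flowlabel(uint32_t word)
{
    return word & DEMO_FLOWLABEL_MASK;
}

/* The traffic class can always be recovered from the stored flow info. */
uint8_t demo_tclass(uint32_t word)
{
    return (uint8_t)((word & DEMO_TCLASS_MASK) >> 20);
}
```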

Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florent Fourcot [Sun, 8 Dec 2013 14:46:58 +0000 (15:46 +0100)] 
ipv6: move IPV6_TCLASS_MASK definition in ipv6.h

Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florent Fourcot [Sun, 8 Dec 2013 14:46:57 +0000 (15:46 +0100)] 
ipv6: add flowinfo for tcp6 pkt_options for all cases

The current implementation of IPV6_FLOWINFO only gives a
result if pktoptions is available (thanks to the
ip6_datagram_recv_ctl function).
This gives inconsistent results to user space: sometimes
there is a result for getsockopt(IPV6_FLOWINFO), sometimes
not.

This patch adds rcv_flowinfo to store it, and returns it to
userspace in the same way as other pkt_options.

Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rafał Miłecki [Fri, 6 Dec 2013 23:53:55 +0000 (00:53 +0100)] 
bgmac: connect to PHY and make use of PHY device

We were already registering MDIO bus, but we were not connecting bgmac
to the PHY. Add proper call and implement adjust link function to switch
MAC into requested state.
At the same time it's possible to drop our internal PHY management.
This is a "standard" PHY, so the "Generic PHY" driver works perfectly
fine with this. Don't duplicate the code.
Finally, make use of phy_ethtool_[gs]set functions instead of implementing
them from scratch.

This change was successfully tested on BCM5357. I was able to
autonegotiate 1000Mb/s full duplex, as well as force any of the
10/100/1000 half/full modes.

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Fri, 6 Dec 2013 23:44:21 +0000 (15:44 -0800)] 
etherdevice: Optimize a few is_<foo>_ether_addr functions

If CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is set,
several is_<foo>_ether_addr functions can be slightly
improved by using u32 dereferences.

I believe all current uses of is_zero_ether_addr and
is_broadcast_ether_addr are u16 aligned, so always use
u16 references to improve the performance of those functions.

Document the u16 alignment requirements.
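
A portable sketch of the 16-bit-word idea. The demo_ functions are hypothetical; they load the words through memcpy() so the example is valid C for any alignment, whereas the kernel helpers rely on the documented u16 alignment and dereference directly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Non-zero when all six address bytes are 0x00: OR the three words together. */
int demo_is_zero_ether_addr(const uint8_t *addr)
{
    uint16_t w[3];

    assert(addr != NULL);
    memcpy(w, addr, sizeof(w));
    return (w[0] | w[1] | w[2]) == 0;
}

/* Non-zero when all six address bytes are 0xff: AND the three words together. */
int demo_is_broadcast_ether_addr(const uint8_t *addr)
{
    uint16_t w[3];

    assert(addr != NULL);
    memcpy(w, addr, sizeof(w));
    return (w[0] & w[1] & w[2]) == 0xffff;
}
```

Three word operations replace six byte comparisons, and the result does not depend on byte order because an all-zero or all-ones word looks the same either way.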

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Fri, 6 Dec 2013 22:39:46 +0000 (14:39 -0800)] 
batadv: Slight optimization of batadv_compare_eth

Use the newly added generic routine ether_addr_equal_unaligned
to test whether Ethernet addresses that may not be u16-aligned are equal.

This slightly improves comparison time for systems with
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Fri, 6 Dec 2013 22:21:01 +0000 (14:21 -0800)] 
etherdevice: Add ether_addr_equal_unaligned

Add a generic routine to test whether Ethernet addresses
that may not be u16-aligned are equal.

If CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is set,
this uses the slightly faster generic routine
ether_addr_equal, otherwise this uses memcmp.
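
The two strategies named above, sketched side by side. Both demo_ functions are hypothetical and written so that they are safe for any alignment; the point is only that the word-wise XOR test and the memcmp() fallback give the same answer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Word-wise comparison: XOR matching 16-bit words and OR the differences. */
int demo_ether_addr_equal(const uint8_t *a, const uint8_t *b)
{
    uint16_t x[3], y[3];

    assert(a != NULL && b != NULL);
    memcpy(x, a, sizeof(x));
    memcpy(y, b, sizeof(y));
    return ((x[0] ^ y[0]) | (x[1] ^ y[1]) | (x[2] ^ y[2])) == 0;
}

/* Byte-wise fallback for addresses with no alignment guarantee. */
int demo_ether_addr_equal_unaligned(const uint8_t *a, const uint8_t *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, 6) == 0;
}
```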

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 10 Dec 2013 01:56:27 +0000 (20:56 -0500)] 
Merge branch 'neigh'

Jiri Pirko says:

====================
neigh: respect default parms values

This is a long-standing regression. But since the patchset is bigger and
the regression happened in 2007, I'm proposing this to net-next instead.

Basically the problem is that if a user wants to use /etc/sysctl.conf to specify
default values of neigh related params, they are not able to do that.

The reason is that the default values are copied to the dev instance right after
the netdev is registered. And that is way too early. The original behaviour
for ipv4 was that this happened after the first address was assigned to the device.
For ipv6 this was apparently the case from the very beginning.

So this patchset basically reverts the behaviour back to what it was in 2007 for
ipv4 and changes the behaviour for ipv6 so they are both the same.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Sat, 7 Dec 2013 18:26:57 +0000 (19:26 +0100)] 
neigh: ipv6: respect default values set before an address is assigned to device

Make the behaviour similar to ipv4. This will allow a user to set sysctl
default neigh param values, and these values will be respected even by
devices registered earlier (those that do not have an address set yet).

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Sat, 7 Dec 2013 18:26:56 +0000 (19:26 +0100)] 
neigh: restore old behaviour of default parms values

Previously inet devices were only constructed when addresses were added.
Therefore the default neigh parms values they get are the ones at the
time of these operations.

Now that we're creating inet devices earlier, this changes the behaviour
of default neigh parms values in an incompatible way (see bug #8519).

This patch creates a compromise by setting the default values at the
same point as before but only for those that have not been explicitly
set by the user since the inet device's creation.

Introduced by:
commit 8030f54499925d073a88c09f30d5d844fb1b3190
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Thu Feb 22 01:53:47 2007 +0900

    [IPV4] devinet: Register inetdev earlier.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Sat, 7 Dec 2013 18:26:55 +0000 (19:26 +0100)] 
neigh: use tbl->family to distinguish ipv4 from ipv6

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Sat, 7 Dec 2013 18:26:54 +0000 (19:26 +0100)] 
neigh: wrap proc dointvec functions

This will be needed later on to provide better management of default values.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Sat, 7 Dec 2013 18:26:53 +0000 (19:26 +0100)] 
neigh: convert parms to an array

This patch converts the neigh param members to an array. This allows easier
manipulation which will be needed later on to provide better management of
default values.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoMerge branch 'phy_reset'
David S. Miller [Tue, 10 Dec 2013 01:39:05 +0000 (20:39 -0500)] 
Merge branch 'phy_reset'

Florian Fainelli says:

====================
net: phy: consolidate PHY reset

This patchset consolidates the PHY reset through the MII BMCR
register by using a central place where this is done.

This patchset resumes the work Kyle Moffett started here:
https://lkml.org/lkml/2011/10/20/301

Note that at this point, drivers doing funky things after issuing
a PHY reset using phy_init_hw() will still suffer from PHY state
machine problems; this will be taken care of later on.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: sh_eth: do not issue a wild PHY reset through BMCR
Florian Fainelli [Fri, 6 Dec 2013 21:01:38 +0000 (13:01 -0800)] 
net: sh_eth: do not issue a wild PHY reset through BMCR

The sh_eth driver issues an uncontrolled PHY reset through the MII
register BMCR but fails to wait for the reset to complete, and will also
implicitly wipe out all possible PHY fixups applied. Use phy_init_hw()
which remedies both problems.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: tc35815: use phy_init_hw for PHY reset
Florian Fainelli [Fri, 6 Dec 2013 21:01:37 +0000 (13:01 -0800)] 
net: tc35815: use phy_init_hw for PHY reset

Instead of open-coding the PHY reset through MII BMCR, use phy_init_hw()
which does that for us and also makes sure that any PHY specific fixups
are applied.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: pxa168_eth: use phy_init_hw for PHY reset
Florian Fainelli [Fri, 6 Dec 2013 21:01:36 +0000 (13:01 -0800)] 
net: pxa168_eth: use phy_init_hw for PHY reset

Instead of open-coding a PHY reset through the MII BMCR register, use
phy_init_hw() which does this for us and ensures that PHY device fixups
are also applied. We also remove a call to ethernet_phy_reset() which is
now unnecessary since phy_attach() calls phy_attach_direct() which in
turn calls phy_init_hw().

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: mv643xx_eth: use phy_init_hw to reset PHY
Florian Fainelli [Fri, 6 Dec 2013 21:01:35 +0000 (13:01 -0800)] 
net: mv643xx_eth: use phy_init_hw to reset PHY

Instead of open-coding a PHY reset through the MII BMCR register, use
phy_init_hw() which does that for us and will also make sure that PHY
fixups are applied if required. We also remove a call to phy_reset()
due to the following sequence of calls in the driver:

phy_scan()
-> phy_connect()
-> phy_connect_direct()
-> phy_attach_direct()
-> phy_init_hw()

and we only have a call to phy_init() after phy_scan().

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: phy: consolidate PHY reset in phy_init_hw()
Florian Fainelli [Fri, 6 Dec 2013 21:01:34 +0000 (13:01 -0800)] 
net: phy: consolidate PHY reset in phy_init_hw()

There are quite a lot of drivers touching a PHY device MII_BMCR
register to reset the PHY without taking care of:

1) ensuring that BMCR_RESET is cleared after a given timeout
2) the PHY state machine resuming to the proper state and re-applying
potentially changed settings such as auto-negotiation

Introduce phy_poll_reset() which will take care of polling the MII_BMCR
for the BMCR_RESET bit to be cleared after a given timeout or return a
timeout error code.

In order to make sure the PHY is in a correct state, phy_init_hw() first
issues a software reset through MII_BMCR and then applies any fixups.
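
The polling step described above can be sketched in userspace as follows.
This is an illustration only, not the kernel's phy_poll_reset(): the
callback type bmcr_read_fn, the poll budget max_polls, and the fake PHY
are assumptions made for the example. BMCR_RESET uses the bit value
0x8000 from the MII register definitions.

```c
#include <errno.h>

#define BMCR_RESET 0x8000  /* MII BMCR software-reset bit */

/* Hypothetical accessor: returns the BMCR value or a negative errno. */
typedef int (*bmcr_read_fn)(void *ctx);

/* Poll until BMCR_RESET clears; give up after max_polls reads. */
static int poll_reset_sketch(bmcr_read_fn read_bmcr, void *ctx, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		int val = read_bmcr(ctx);

		if (val < 0)
			return val;        /* propagate a bus error */
		if (!(val & BMCR_RESET))
			return 0;          /* reset has completed */
		/* a real driver would sleep between polls here */
	}
	return -ETIMEDOUT;
}

/* Fake PHY for demonstration: reports "reset busy" for *ctx more reads. */
static int fake_bmcr_read(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return BMCR_RESET;
	}
	return 0;
}
```

Returning a distinct timeout code lets the caller skip the fixups when
the PHY never leaves reset, instead of configuring a device in an
unknown state.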

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: bfin_mac: do not reset PHY after phy_start()
Florian Fainelli [Fri, 6 Dec 2013 21:01:33 +0000 (13:01 -0800)] 
net: bfin_mac: do not reset PHY after phy_start()

The PHY is already reset during driver probing, and this manual reset
after calling phy_start() will wipe out board-specific PHY fixups and
driver specific configuration initialization. Remove that explicit PHY
reset.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: greth: use phy_read_status()
Florian Fainelli [Fri, 6 Dec 2013 21:01:32 +0000 (13:01 -0800)] 
net: greth: use phy_read_status()

In case the greth driver is bound to anything but the Generic PHY
driver or the PHY has a special read_status callback implemented,
unexpected things will happen. Make sure that we use
phy_read_status() which does the proper abstraction of calling the
driver specific read_status() callback for a given PHY.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: phy: use phy_init_hw instead of open-coding it
Florian Fainelli [Fri, 6 Dec 2013 21:01:31 +0000 (13:01 -0800)] 
net: phy: use phy_init_hw instead of open-coding it

Use phy_init_hw() instead of open-coding it in phy_mii_ioctl(). This
improves consistency and makes sure that we will not duplicate the same
routine somewhere else.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: phy: report link partner features through ethtool
Florian Fainelli [Fri, 6 Dec 2013 21:01:30 +0000 (13:01 -0800)] 
net: phy: report link partner features through ethtool

The PHY library already reads the MII_STAT1000 and MII_LPA registers in
genphy_read_status(), so extend it to also populate the PHY device link
partner advertised features such that we can feed this back into ethtool
when asked for it in phy_ethtool_gset().
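
As a rough illustration of the translation involved, the sketch below
maps the 10/100 ability bits of an MII_LPA value onto ethtool-style
advertising bits. The SK_-prefixed macros and lpa_to_ethtool_sketch()
are hypothetical names; their numeric values are intended to mirror the
LPA_* and ADVERTISED_* constants in the kernel's MII and ethtool
headers. The gigabit bits from MII_STAT1000 are left out for brevity.

```c
#include <stdint.h>

/* Link partner ability bits as found in the MII_LPA register. */
#define SK_LPA_10HALF   0x0020
#define SK_LPA_10FULL   0x0040
#define SK_LPA_100HALF  0x0080
#define SK_LPA_100FULL  0x0100

/* ethtool-style advertising bits. */
#define SK_ADV_10_HALF   (1u << 0)
#define SK_ADV_10_FULL   (1u << 1)
#define SK_ADV_100_HALF  (1u << 2)
#define SK_ADV_100_FULL  (1u << 3)

/* Translate the 10/100 bits of an LPA register value to ethtool bits. */
static uint32_t lpa_to_ethtool_sketch(uint16_t lpa)
{
	uint32_t result = 0;

	if (lpa & SK_LPA_10HALF)
		result |= SK_ADV_10_HALF;
	if (lpa & SK_LPA_10FULL)
		result |= SK_ADV_10_FULL;
	if (lpa & SK_LPA_100HALF)
		result |= SK_ADV_100_HALF;
	if (lpa & SK_LPA_100FULL)
		result |= SK_ADV_100_FULL;
	return result;
}
```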

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agotun: remove useless codes in tun_chr_aio_read() and tun_recvmsg()
Zhi Yong Wu [Fri, 6 Dec 2013 20:55:00 +0000 (04:55 +0800)] 
tun: remove useless codes in tun_chr_aio_read() and tun_recvmsg()

Inspection of the related code shows that ret can never exceed len or
total_len, so remove the useless checks in both of the above functions.

Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agomacvtap: remove useless codes in macvtap_aio_read() and macvtap_recvmsg()
Zhi Yong Wu [Fri, 6 Dec 2013 20:54:59 +0000 (04:54 +0800)] 
macvtap: remove useless codes in macvtap_aio_read() and macvtap_recvmsg()

Inspection of the related code shows that ret can never exceed len or
total_len, so remove the useless checks in both of the above functions.

Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoxen-netback: improve guest-receive-side flow control
Paul Durrant [Fri, 6 Dec 2013 16:36:07 +0000 (16:36 +0000)] 
xen-netback: improve guest-receive-side flow control

The way that flow control works without this patch is that, in start_xmit()
the code uses xenvif_count_skb_slots() to predict how many slots
xenvif_gop_skb() will consume and then adds this to a 'req_cons_peek'
counter which it then uses to determine if the shared ring has that amount
of space available by checking whether 'req_prod' has passed that value.
If the ring doesn't have space the tx queue is stopped.
xenvif_gop_skb() will then consume slots and update 'req_cons' and issue
responses, updating 'rsp_prod' as it goes. The frontend will consume those
responses and post new requests, by updating req_prod. So, req_prod chases
req_cons which chases rsp_prod, and can never exceed that value. Thus if
xenvif_count_skb_slots() ever returns a number of slots greater than
xenvif_gop_skb() uses, req_cons_peek will get to a value that req_prod cannot
possibly achieve (since it's limited by the 'real' req_cons) and, if this
happens enough times, req_cons_peek gets more than a ring size ahead of
req_cons and the tx queue then remains stopped forever waiting for an
unachievable amount of space to become available in the ring.
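
The drift can be reproduced with a deliberately simplified model. The
sketch assumes a ring of RING_SIZE slots and that req_prod can never get
more than one ring size ahead of req_cons. The struct, the helper names
and the ring size of 256 are assumptions for illustration and do not
match the driver's data structures.

```c
#include <stdbool.h>

#define RING_SIZE 256u  /* assumed ring size, for illustration only */

struct ring_sketch {
	unsigned int req_cons;       /* slots actually consumed */
	unsigned int req_cons_peek;  /* slots predicted to be consumed */
};

/* Account one skb: the prediction feeds peek, the real usage feeds req_cons. */
static void account_skb(struct ring_sketch *r, unsigned int predicted,
			unsigned int used)
{
	r->req_cons_peek += predicted;
	r->req_cons += used;
}

/*
 * req_prod is bounded by req_cons + RING_SIZE, so the space check for
 * 'needed' more slots can only ever succeed while peek + needed stays
 * within one ring size of req_cons. Unsigned subtraction keeps this
 * correct across counter wrap-around.
 */
static bool space_reachable(const struct ring_sketch *r, unsigned int needed)
{
	return (r->req_cons_peek + needed) - r->req_cons <= RING_SIZE;
}
```

With accurate predictions the two counters move together. Once every skb
is over-predicted by a single slot, a few hundred packets are enough to
make space_reachable() return false permanently, which corresponds to
the stalled queue.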

Having two routines trying to calculate the same value is always going to be
fragile, so this patch does away with that. All we essentially need to do is
make sure that we have 'enough stuff' on our internal queue without letting
it build up uncontrollably. So start_xmit() makes a cheap optimistic check
of how much space is needed for an skb and only turns the queue off if that
is unachievable. net_rx_action() is the place where we could do with an
accurate prediction but, since that has proven tricky to calculate, a cheap
worst-case (but not too bad) estimate is all we really need since the only
thing we *must* prevent is xenvif_gop_skb() consuming more slots than are
available.

Without this patch I can trivially stall netback permanently by just doing
a large guest to guest file copy between two Windows Server 2008R2 VMs on a
single host.

Patch tested with frontends in:
- Windows Server 2008R2
- CentOS 6.0
- Debian Squeeze
- Debian Wheezy
- SLES11

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Annie Li <annie.li@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agotipc: remove interface state mirroring in bearer
Erik Hugne [Fri, 6 Dec 2013 15:08:00 +0000 (10:08 -0500)] 
tipc: remove interface state mirroring in bearer

struct 'tipc_bearer' is a generic representation of the underlying
media type, and exists in a one-to-one relationship to each interface
TIPC is using. The struct contains a 'blocked' flag that mirrors the
operational and execution state of the represented interface, and is
updated through notification calls from the latter. The users of
tipc_bearer are checking this flag before each attempt to send a
packet via the interface.

This state mirroring serves no purpose in the current code base. TIPC
links will not discover a media failure any faster through this
mechanism, and in reality the flag only adds overhead at packet
sending and reception.

Furthermore, the fact that the flag needs to be protected by a spinlock
aggregated into tipc_bearer has turned out to cause a serious and
completely unnecessary deadlock problem.

CPU0                                    CPU1
----                                    ----
Time 0: bearer_disable()                link_timeout()
Time 1:   spin_lock_bh(&b_ptr->lock)      tipc_link_push_queue()
Time 2:   tipc_link_delete()                tipc_bearer_blocked(b_ptr)
Time 3:     k_cancel_timer(&req->timer)       spin_lock_bh(&b_ptr->lock)
Time 4:       del_timer_sync(&req->timer)

I.e., del_timer_sync() on CPU0 never returns, because the timer handler
on CPU1 is waiting for the bearer lock.

We eliminate the 'blocked' flag from struct tipc_bearer, along with all
tests on this flag. This not only resolves the deadlock, but also
simplifies and speeds up the data path execution of TIPC. It also fits
well into our ongoing effort to make the locking policy simpler and
more manageable.

An effect of this change is that we can get rid of functions such as
tipc_bearer_blocked(), tipc_continue() and tipc_block_bearer().
We replace the latter with a new function, tipc_reset_bearer(), which
resets all links associated to the bearer immediately after an
interface goes down.

A user might notice one slight change in link behaviour after this
change. When an interface goes down, (e.g. through a NETDEV_DOWN
event) all attached links will be reset immediately, instead of
leaving it to each link to detect the failure through a timer-driven
mechanism. We consider this an improvement, and see no obvious risks
with the new behavior.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Paul Gortmaker <Paul.Gortmaker@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agox25: convert printks to pr_<level>
wangweidong [Fri, 6 Dec 2013 11:24:33 +0000 (19:24 +0800)] 
x25: convert printks to pr_<level>

use pr_<level> instead of printk(LEVEL)

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agopacket: introduce PACKET_QDISC_BYPASS socket option
Daniel Borkmann [Fri, 6 Dec 2013 10:36:17 +0000 (11:36 +0100)] 
packet: introduce PACKET_QDISC_BYPASS socket option

This patch introduces a PACKET_QDISC_BYPASS socket option, that
allows for using a similar xmit() function as in pktgen instead
of taking the dev_queue_xmit() path. This can be very useful when
PF_PACKET applications are required to be used in a similar
scenario as pktgen, but with full, flexible packet payload that
needs to be provided, for example.

By default, nothing changes in behaviour for normal PF_PACKET
TX users, so everything stays as is for applications. New users,
however, can now set PACKET_QDISC_BYPASS if needed to prevent
own packets from i) reentering packet_rcv() and ii) to directly
push the frame to the driver.
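
A minimal sketch of how an application might opt in, assuming a Linux
system whose headers provide SOL_PACKET. The helper name
open_bypass_socket() is hypothetical. The fallback define uses the
value 20 from linux/if_packet.h for older userspace headers that
predate the option. Creating the socket normally requires CAP_NET_RAW,
so the helper reports failure rather than assuming it succeeds.

```c
#include <errno.h>
#include <sys/socket.h>
#include <linux/if_packet.h>
#include <unistd.h>

#ifndef PACKET_QDISC_BYPASS
#define PACKET_QDISC_BYPASS 20  /* fallback for older userspace headers */
#endif

/*
 * Open a packet socket with qdisc bypass enabled.
 * Returns the descriptor, or -1 with errno set on failure.
 */
static int open_bypass_socket(void)
{
	int one = 1;
	int fd = socket(PF_PACKET, SOCK_RAW, 0);  /* protocol 0: no receive hook */

	if (fd < 0)
		return -1;  /* typically EPERM without CAP_NET_RAW */

	if (setsockopt(fd, SOL_PACKET, PACKET_QDISC_BYPASS,
		       &one, sizeof(one)) < 0) {
		int saved = errno;  /* close() may overwrite errno */

		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}
```

The socket still has to be bound to an interface through sockaddr_ll
before frames can be sent; that part is unchanged by the option.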

In doing so we can increase pps (here 64 byte packets) for
PF_PACKET a bit:

  # CPUs -- QDISC_BYPASS   -- qdisc path -- qdisc path[**]
  1 CPU  ==  1,509,628 pps --  1,208,708 --  1,247,436
  2 CPUs ==  3,198,659 pps --  2,536,012 --  1,605,779
  3 CPUs ==  4,787,992 pps --  3,788,740 --  1,735,610
  4 CPUs ==  6,173,956 pps --  4,907,799 --  1,909,114
  5 CPUs ==  7,495,676 pps --  5,956,499 --  2,014,422
  6 CPUs ==  9,001,496 pps --  7,145,064 --  2,155,261
  7 CPUs == 10,229,776 pps --  8,190,596 --  2,220,619
  8 CPUs == 11,040,732 pps --  9,188,544 --  2,241,879
  9 CPUs == 12,009,076 pps -- 10,275,936 --  2,068,447
 10 CPUs == 11,380,052 pps -- 11,265,337 --  1,578,689
 11 CPUs == 11,672,676 pps -- 11,845,344 --  1,297,412
 [...]
 20 CPUs == 11,363,192 pps -- 11,014,933 --  1,245,081

 [**]: qdisc path with packet_rcv(), how probably most people
       seem to use it (hopefully not anymore if not needed)

The test was done using a modified trafgen, sending a simple
static 64 bytes packet, on all CPUs.  The trick in the fast
"qdisc path" case, is to avoid reentering packet_rcv() by
setting the RAW socket protocol to zero, like:
socket(PF_PACKET, SOCK_RAW, 0);

Tradeoffs are documented as well in this patch, clearly, if
queues are busy, we will drop more packets, tc disciplines are
ignored, and these packets are not visible to taps anymore. For
a pktgen like scenario, we argue that this is acceptable.

The pointer to the xmit function has been placed in packet
socket structure hole between cached_dev and prot_hook that
is hot anyway as we're working on cached_dev in each send path.

Done in joint work together with Jesper Dangaard Brouer.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: dev: move inline skb_needs_linearize helper to header
Daniel Borkmann [Fri, 6 Dec 2013 10:36:16 +0000 (11:36 +0100)] 
net: dev: move inline skb_needs_linearize helper to header

As we need it elsewhere, move the inline helper function of
skb_needs_linearize() over to skbuff.h include file. While
at it, also convert the return to 'bool' instead of 'int'
and add a proper kernel doc.
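
For readers unfamiliar with the helper, the decision it makes can be
approximated with mock types. The sketch below is not the kernel
definition: struct skb_sketch, the SK_F_* flags and
needs_linearize_sketch() are stand-ins for struct sk_buff, the netdev
feature bits and the real helper.

```c
#include <stdbool.h>

#define SK_F_SG        (1u << 0)  /* device handles scatter-gather */
#define SK_F_FRAGLIST  (1u << 1)  /* device handles fragment lists */

struct skb_sketch {
	unsigned int nr_frags;  /* number of paged fragments */
	bool has_frag_list;     /* true if further skbs are chained */
};

/*
 * A buffer must be linearized when it carries data in a form the
 * device has not advertised support for.
 */
static inline bool needs_linearize_sketch(const struct skb_sketch *skb,
					  unsigned int features)
{
	return (skb->has_frag_list && !(features & SK_F_FRAGLIST)) ||
	       (skb->nr_frags && !(features & SK_F_SG));
}
```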

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Tue, 10 Dec 2013 01:20:14 +0000 (20:20 -0500)] 
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Merge 'net' into 'net-next' to get the AF_PACKET bug fix that
Daniel's direct transmit changes depend upon.

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agopacket: fix send path when running with proto == 0
Daniel Borkmann [Fri, 6 Dec 2013 10:36:15 +0000 (11:36 +0100)] 
packet: fix send path when running with proto == 0

Commit e40526cb20b5 introduced a cached dev pointer, that gets
hooked into register_prot_hook(), __unregister_prot_hook() to
update the device used for the send path.

We need to fix this up, as otherwise this will not work with
sockets created with protocol = 0, plus with sll_protocol = 0
passed via sockaddr_ll when doing the bind.

So instead, assign the pointer directly. The compiler can inline
these helper functions automagically.

While at it, also assume the cached dev fast-path as likely(),
and document this variant of socket creation as it seems it is
not widely used (seems not even the author of TX_RING was aware
of that in his reference example [1]). Tested with reproducer
from e40526cb20b5.

 [1] http://wiki.ipxwarzone.com/index.php5?title=Linux_packet_mmap#Example

Fixes: e40526cb20b5 ("packet: fix use after free race in send path when dev is released")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Tested-by: Salam Noureddine <noureddine@aristanetworks.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agopkt_sched: give visibility to mq slave qdiscs
Eric Dumazet [Thu, 5 Dec 2013 19:12:02 +0000 (11:12 -0800)] 
pkt_sched: give visibility to mq slave qdiscs

Commit 6da7c8fcbcbd ("qdisc: allow setting default queuing discipline")
added the ability to change the default qdisc from pfifo_fast to, say, fq.

But as most modern ethernet devices are multiqueue, we can't really
see all the statistics from "tc -s qdisc show", as the default root
qdisc is mq.

This patch adds the calls to qdisc_list_add() to mq and mqprio.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net...
David S. Miller [Tue, 10 Dec 2013 00:21:31 +0000 (19:21 -0500)] 
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates

This series contains updates to i40e only.

Jacob provides an i40e patch to get 1588 working correctly by separating
TSYNVALID and TSYNINDX fields in the receive descriptor.

Jesse provides several i40e patches, first to correct the checking
of the multi-bit state.  The hash is reported correctly in the RSS
field if and only if the filter status is 3.  Other values of the
filter status mean different things and we should not depend on a
bitwise result.  Then provides a patch to enable a couple of
workarounds based on revision ID that allow the driver to work
more fully on early hardware.

Shannon provides several i40e patches as well.  First sets the media
type in the hardware structure based on the external connection type.
Then provides a patch to only setup the rings that will be used.  Lastly
provides a fix where the TESTING state was still set when exiting the
ethtool diagnostics.

Kevin Scott provides one i40e patch to add a new flag to the i40e_add_veb()
which allows the driver to request the hardware to filter on layer 2
parameters.

Anjali provides four i40e patches, first refactors the reset code in
order to re-size queues and vectors while the interface is still up.
Then provides a patch to enable all PCTYPEs except FCoE for RSS.  Adds
a message to notify the user of how many VFs are initialized on each
port.  Lastly adds a new variable to track the number of PF instances,
this is a global counter on purpose so that each PF loaded has a
unique ID.

Catherine bumps the driver version.
====================
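
The multi-bit state check mentioned in the cover letter above is a
common pitfall, sketched here with a made-up descriptor layout. The
field position (bits 12..13) and all names are assumptions; only the
rule that the hash is valid if and only if the filter status equals 3
comes from the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: a 2-bit filter status field at bits 12..13. */
#define FLTSTAT_SHIFT     12
#define FLTSTAT_MASK      (0x3u << FLTSTAT_SHIFT)
#define FLTSTAT_RSS_HASH  0x3u

/* Buggy variant: any non-zero status is treated as "hash present". */
static bool rss_hash_valid_buggy(uint32_t desc)
{
	return (desc & FLTSTAT_MASK) != 0;
}

/* Corrected variant: extract the field and compare it for equality. */
static bool rss_hash_valid(uint32_t desc)
{
	return ((desc & FLTSTAT_MASK) >> FLTSTAT_SHIFT) == FLTSTAT_RSS_HASH;
}
```

The two variants agree for status 0 and 3 and differ for 1 and 2, which
are exactly the values that mean something else.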

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agowan: wanxl: remove unnecessary pci_set_drvdata()
Jingoo Han [Mon, 9 Dec 2013 03:32:10 +0000 (12:32 +0900)] 
wan: wanxl: remove unnecessary pci_set_drvdata()

The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agowan: pci200syn: remove unnecessary pci_set_drvdata()
Jingoo Han [Mon, 9 Dec 2013 03:31:46 +0000 (12:31 +0900)] 
wan: pci200syn: remove unnecessary pci_set_drvdata()

The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agowan: pc300too: remove unnecessary pci_set_drvdata()
Jingoo Han [Mon, 9 Dec 2013 03:31:12 +0000 (12:31 +0900)] 
wan: pc300too: remove unnecessary pci_set_drvdata()

The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agowan: lmc: remove unnecessary pci_set_drvdata()
Jingoo Han [Mon, 9 Dec 2013 03:30:41 +0000 (12:30 +0900)] 
wan: lmc: remove unnecessary pci_set_drvdata()

The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agowan: dscc4: remove unnecessary pci_set_drvdata()
Jingoo Han [Mon, 9 Dec 2013 03:29:46 +0000 (12:29 +0900)] 
wan: dscc4: remove unnecessary pci_set_drvdata()

The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoirda: vlsi_ir: remove unnecessary pci_set_drvdata()
Jingoo Han [Mon, 9 Dec 2013 03:29:08 +0000 (12:29 +0900)] 
irda: vlsi_ir: remove unnecessary pci_set_drvdata()

The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoirda: via-ircc: remove unnecessary pci_set_drvdata()
Jingoo Han [Mon, 9 Dec 2013 03:28:28 +0000 (12:28 +0900)] 
irda: via-ircc: remove unnecessary pci_set_drvdata()

The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: forcedeth: remove unnecessary pci_set_drvdata()
Jingoo Han [Mon, 9 Dec 2013 03:27:50 +0000 (12:27 +0900)] 
net: forcedeth: remove unnecessary pci_set_drvdata()

The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: ns83820: remove unnecessary pci_set_drvdata()
Jingoo Han [Mon, 9 Dec 2013 03:27:27 +0000 (12:27 +0900)] 
net: ns83820: remove unnecessary pci_set_drvdata()

The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: bna: remove unnecessary pci_set_drvdata()
Jingoo Han [Mon, 9 Dec 2013 03:26:44 +0000 (12:26 +0900)] 
net: bna: remove unnecessary pci_set_drvdata()

The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: sis900: remove unnecessary pci_set_drvdata()
Jingoo Han [Mon, 9 Dec 2013 03:25:44 +0000 (12:25 +0900)] 
net: sis900: remove unnecessary pci_set_drvdata()

The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: sfc: remove unnecessary pci_set_drvdata()
Jingoo Han [Mon, 9 Dec 2013 03:24:07 +0000 (12:24 +0900)] 
net: sfc: remove unnecessary pci_set_drvdata()

The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoi40e: Add a new variable to track number of pf instances
Anjali Singhai Jain [Wed, 20 Nov 2013 10:03:01 +0000 (10:03 +0000)] 
i40e: Add a new variable to track number of pf instances

Track the number of physical functions (PFs) found. This is a global counter
on purpose so that each PF loaded has a unique ID.

Change-Id: I74d618520afbce4a774d0235449e3b5f97ff6d4a
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
11 years agoi40e: add num_VFs message
Anjali Singhai Jain [Wed, 20 Nov 2013 10:03:00 +0000 (10:03 +0000)] 
i40e: add num_VFs message

Print a message to notify the user of how many VFs are initialized on each
port.

Change-Id: I29ac2acc478ee4e588fd6ffcc35133d4c6607ca9
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
11 years agoi40e: refactor ethtool tests
Shannon Nelson [Wed, 20 Nov 2013 10:02:59 +0000 (10:02 +0000)] 
i40e: refactor ethtool tests

Put the print and reset statements in the actual test functions to make
them more self-contained, and only run the reset for tests that need it.

Change-Id: Ic70f49b11bf8bae82e59d8fd25b46215c90c4510
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
11 years agoi40e: clear test state bit after all ethtool tests
Shannon Nelson [Wed, 20 Nov 2013 10:02:58 +0000 (10:02 +0000)] 
i40e: clear test state bit after all ethtool tests

Fix a bug where the TESTING state was still set when
exiting the ethtool diagnostics.

Change-Id: Ic47950d2e86a67167d1d282256d477cecd86d820
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
11 years agoi40e: only set up the rings to be used
Shannon Nelson [Wed, 20 Nov 2013 10:02:57 +0000 (10:02 +0000)] 
i40e: only set up the rings to be used

The VSI may be allocated more queues (alloc_queue_pairs) than actually
are to be used (num_queue_pairs), so only allocate rings for the queues
to be used.  The numbers will likely be the same for most VSIs, but can
be different based on how TCs are assigned and enabled.

Change-Id: Ie40f7ad0affbc4b45d6f049bcf02ee2fa24edc74
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
11 years agoi40e: Enable all PCTYPEs except FCOE for RSS.
Anjali Singhai Jain [Wed, 20 Nov 2013 10:02:56 +0000 (10:02 +0000)] 
i40e: Enable all PCTYPEs except FCOE for RSS.

RSS can steer packets based on recognition of all
sorts of different headers.  Enable some more of them.

Change-Id: I2264dedae66fb0bceca6fb6e772e050e3ca8efc8
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
11 years agoi40e: refactor reset code
Anjali Singhai Jain [Wed, 20 Nov 2013 10:02:55 +0000 (10:02 +0000)] 
i40e: refactor reset code

In order to re-size queues and vectors while the interface is
still up, we need to be able to call functions to free and
re-allocate without bringing down the VSI.

We also need to reset the existing setup, update the
configuration and then rebuild again. This requires us to have
the reset flow broken down into two parts.

Change-Id: I374dd25aabf769decda69b676491c7b7730a4635
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
11 years agoi40e: Bump version
Catherine Sullivan [Wed, 20 Nov 2013 10:02:54 +0000 (10:02 +0000)] 
i40e: Bump version

Update the driver version to 0.3.12-k

Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
11 years agoi40e: whitespace
Jeff Kirsher [Wed, 20 Nov 2013 10:02:53 +0000 (10:02 +0000)] 
i40e: whitespace

Whitespace fixes

Change-Id: I95f4d02e4a2a92d6b6fca3ae2b7865c4b916a9bb
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
11 years agoi40e: enable early hardware support
Jesse Brandeburg [Tue, 26 Nov 2013 08:56:05 +0000 (08:56 +0000)] 
i40e: enable early hardware support

Enable a couple of workarounds based on revision ID that allow the
driver to work more fully on early hardware.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
11 years agoi40e: Add flag for L2 VEB filtering
Kevin Scott [Wed, 20 Nov 2013 10:02:51 +0000 (10:02 +0000)] 
i40e: Add flag for L2 VEB filtering

Add a new flag to the add VEB command which allows the
driver to request the hardware to filter on L2 parameters.

This is an implementation of the driver access to a new firmware
feature.

Change-Id: Id61d3cad4125bdc68b8fd9d555c448a10c344b6b
Signed-off-by: Kevin Scott <kevin.c.scott@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>