git.samba.org - sfrench/cifs-2.6.git/log

git.samba.org / sfrench / cifs-2.6.git / log

summary | shortlog | log | commit | commitdiff | tree
first ⋅ prev ⋅ next

commit | commitdiff | tree

Huazhong Tan [Mon, 2 Jul 2018 07:50:25 +0000 (15:50 +0800)]

net: hns3: use dma_zalloc_coherent instead of kzalloc/dma_map_single

Reference to Documentation/DMA-API-HOWTO.txt,
Streaming DMA mappings which are usually mapped for one DMA transfer,
Network card DMA ring descriptors should use Consistent DMA mappings.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Huazhong Tan [Mon, 2 Jul 2018 07:50:24 +0000 (15:50 +0800)]

net: hns3: give default option while dependency HNS3 set

Give default option for HNS3_HCLGE and HNS3_ENET will be helpful,
while dependency HNS3 is set. Meanwhile, use "if HNS3" section
instead of all the "depends on HNS3".

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Huazhong Tan [Mon, 2 Jul 2018 07:50:23 +0000 (15:50 +0800)]

net: hns3: remove some unused members of some structures

Some members in struct hns3_enet_tqp_vector, struct hnae3_client
and struct hnae3_ae_algo are unused.
This patch removes them.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Huazhong Tan [Mon, 2 Jul 2018 07:50:22 +0000 (15:50 +0800)]

net: hns3: remove a redundant hclge_cmd_csq_done

Set complete in the first hclge_cmd_csq_done of hclge_cmd_send,
and check if complete later, unnecessary to do hclge_cmd_csq_done
again.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Huazhong Tan [Mon, 2 Jul 2018 07:50:21 +0000 (15:50 +0800)]

net: hns3: simplify hclge_cmd_csq_clean

csq is used as a ring buffer, the value of the desc will be replaced
in next use. This patch removes the unnecessary memset, and just
updates the next_to_clean.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Huazhong Tan [Mon, 2 Jul 2018 07:50:20 +0000 (15:50 +0800)]

net: hns3: remove some redundant assignments

Remove some redundant assignments.
desc->flag = cpu_to_le16(HCLGE_CMD_FLAG_NO_INTR | HCLGE_CMD_FLAG_IN)
has set bit HCLGE_CMD_FLAG_WR to zero, so does others.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Huazhong Tan [Mon, 2 Jul 2018 07:50:19 +0000 (15:50 +0800)]

net: hns3: remove useless code in hclge_cmd_send

There are some useless type cast, print in hclge_cmd_send.
This patch removes them.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Huazhong Tan [Mon, 2 Jul 2018 07:50:18 +0000 (15:50 +0800)]

net: hns3: remove unused hclge_ring_to_dma_dir

hclge_ring_to_dma_dir is unused anywhere.
This patch removes it.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Colin Ian King [Mon, 2 Jul 2018 07:37:21 +0000 (08:37 +0100)]

atm: zatm: remove redundant pointer zatm_dev

Pointer zatm_dev is being assigned but is never used hence it is redundant
and can be removed.

Cleans up clang warning:
warning: variable 'zatm_dev' set but not used [-Wunused-but-set-variable]

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Heiner Kallweit [Mon, 2 Jul 2018 06:08:13 +0000 (08:08 +0200)]

net: phy: realtek: add support for RTL8211C

RTL8211C has an issue when operating in Gigabit slave mode, therefore
genphy driver can't be used. See also this U-boot change.
https://lists.denx.de/pipermail/u-boot/2016-March/249712.html

Add a PHY driver for this chip with the quirk to force Gigabit master
mode. As a note: This will make it impossible to connect two network
ports directly which both are driven by a RTl8211C.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Roman Mashak [Mon, 2 Jul 2018 04:02:02 +0000 (00:02 -0400)]

net sched actions: add extack messages in pedit action

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Guenter Roeck [Sun, 1 Jul 2018 20:57:38 +0000 (13:57 -0700)]

TTY: isdn: Replace strncpy with memcpy

gcc 8.1.0 complains:

drivers/isdn/i4l/isdn_tty.c: In function 'isdn_tty_suspend.isra.1':
drivers/isdn/i4l/isdn_tty.c:790:3: warning:
'strncpy' output truncated before terminating nul copying
as many bytes from a string as its length
drivers/isdn/i4l/isdn_tty.c:778:6: note: length computed here

drivers/isdn/i4l/isdn_tty.c: In function 'isdn_tty_resume':
drivers/isdn/i4l/isdn_tty.c:880:3: warning:
'strncpy' output truncated before terminating nul copying
as many bytes from a string as its length
drivers/isdn/i4l/isdn_tty.c:817:6: note: length computed here

Using strncpy() is indeed less than perfect since the length of data to
be copied has already been determined with strlen(). Replace strncpy()
with memcpy() to address the warning and optimize the code a little.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Heiner Kallweit [Sun, 1 Jul 2018 17:14:58 +0000 (19:14 +0200)]

net: phy: realtek: add missing entry for RTL8211 to mdio_device_id table

When adding support for RTL8211 I forgot to update the mdio_device_id
table.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Fixes: d241d4aac93f ("net: phy: realtek: add support for RTL8211")
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Yafang Shao [Sun, 1 Jul 2018 15:31:30 +0000 (23:31 +0800)]

net: expose sk wmem in sock_exceed_buf_limit tracepoint

Currently trace_sock_exceed_buf_limit() only show rmem info,
but wmem limit may also be hit.
So expose wmem info in this tracepoint as well.

Regarding memcg, I think it is better to introduce a new tracepoint(if
that is needed), i.e. trace_memcg_limit_hit other than show memcg info in
trace_sock_exceed_buf_limit.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Heiner Kallweit [Sat, 30 Jun 2018 22:25:19 +0000 (00:25 +0200)]

r8169: remove old PHY reset hack

This hack (affecting the non-PCIe models only) was introduced in 2004
to deal with link negotiation failures in 1GBit mode. Based on a
comment in the r8169 vendor driver I assume the issue affects RTL8169sb
in combination with particular 1GBit switch models.

Resetting the PHY every 10s and hoping that one fine day we will make
it to establish the link seems to be very hacky to me. I'd say:
If 1GBit doesn't work reliably in a users environment then the user
should remove 1GBit from the advertised modes, e.g. by using
ethtool -s <if> advertise <10/100 modes>

If the issue affects one chip version only and that with most link
partners, then we could also think of removing 1GBit from the
advertised modes for this chip version in the driver.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Colin Ian King [Sat, 30 Jun 2018 20:39:24 +0000 (21:39 +0100)]

netdevsim: fix sa_idx out of bounds check

Currently if sa_idx is equal to NSIM_IPSEC_MAX_SA_COUNT then
an out-of-bounds read on ipsec->sa will occur. Fix the
incorrect bounds check by using >= rather than >.

Detected by CoverityScan, CID#1470226 ("Out-of-bounds-read")

Fixes: 7699353da875 ("netdevsim: add ipsec offload testing")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Mon, 2 Jul 2018 00:06:24 +0000 (09:06 +0900)]

Merge branch 'xps-symmretric-queue-selection'

Amritha Nambiar says:

====================
Symmetric queue selection using XPS for Rx queues

This patch series implements support for Tx queue selection based on
Rx queue(s) map. This is done by configuring Rx queue(s) map per Tx-queue
using sysfs attribute. If the user configuration for Rx queues does
not apply, then the Tx queue selection falls back to XPS using CPUs and
finally to hashing.

XPS is refactored to support Tx queue selection based on either the
CPUs map or the Rx-queues map. The config option CONFIG_XPS needs to be
enabled. By default no receive queues are configured for the Tx queue.

- /sys/class/net/<dev>/queues/tx-*/xps_rxqs

A set of receive queues can be mapped to a set of transmit queues (many:many),
although the common use case is a 1:1 mapping. This will enable sending
packets on the same Tx-Rx queue association as this is useful for busy polling
multi-threaded workloads where it is not possible to pin the threads to
a CPU. This is a rework of Sridhar's patch for symmetric queueing via
socket option:
https://www.spinics.net/lists/netdev/msg453106.html

Testing Hints:
Kernel:  Linux 4.17.0-rc7+
Interface:
driver: ixgbe
version: 5.1.0-k
firmware-version: 0x00015e0b

Configuration:
ethtool -L $iface combined 16
ethtool -C $iface rx-usecs 1000
sysctl net.core.busy_poll=1000
ATR disabled:
ethtool -K $iface ntuple on

Workload:
Modified memcached that changes the thread selection policy to be based
on the incoming rx-queue of a connection using SO_INCOMING_NAPI_ID socket
option. The default is round-robin.

Default: No rxqs_map configured
Symmetric queues: Enable rxqs_map for all queues 1:1 mapped to Tx queue

System:
Architecture:          x86_64
CPU(s):                72
Model name:            Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz

16 threads  400K requests/sec
=============================
-------------------------------------------------------------------------------
                                Default                 Symmetric queues
-------------------------------------------------------------------------------
RTT min/avg/max                 4/51/2215               2/30/5163
(usec)

intr/sec                        26655                   18606

contextswitch/sec               5145                    4044

insn per cycle                  0.43                    0.72

cache-misses                    6.919                   4.310
(% of all cache refs)

L1-dcache-load-                 4.49                    3.29
-misses
(% of all L1-dcache hits)

LLC-load-misses                 13.26                   8.96
(% of all LL-cache hits)

-------------------------------------------------------------------------------

32 threads  400K requests/sec
=============================
-------------------------------------------------------------------------------
                                Default                 Symmetric queues
-------------------------------------------------------------------------------
RTT min/avg/max                 10/112/5562             9/46/4637
(usec)

intr/sec                        30456                   27666

contextswitch/sec               7552                    5133

insn per cycle                  0.41                    0.49

cache-misses                    9.357                   2.769
(% of all cache refs)

L1-dcache-load-                 4.09                    3.98
-misses
(% of all L1-dcache hits)

LLC-load-misses                 12.96                   3.96
(% of all LL-cache hits)

-------------------------------------------------------------------------------

16 threads  800K requests/sec
=============================
-------------------------------------------------------------------------------
                                Default                 Symmetric queues
-------------------------------------------------------------------------------
RTT min/avg/max                  5/151/4989             9/69/2611
(usec)

intr/sec                        35686                   22907

contextswitch/sec               25522                   12281

insn per cycle                  0.67                    0.74

cache-misses                    8.652                   6.38
(% of all cache refs)

L1-dcache-load-                 3.19                    2.86
-misses
(% of all L1-dcache hits)

LLC-load-misses                 16.53                   11.99
(% of all LL-cache hits)

-------------------------------------------------------------------------------
32 threads  800K requests/sec
=============================
-------------------------------------------------------------------------------
                                Default                 Symmetric queues
-------------------------------------------------------------------------------
RTT min/avg/max                  6/163/6152             8/88/4209
(usec)

intr/sec                        47079                   26548

contextswitch/sec               42190                   39168

insn per cycle                  0.45                    0.54

cache-misses                    8.798                   4.668
(% of all cache refs)

L1-dcache-load-                 6.55                    6.29
-misses
(% of all L1-dcache hits)

LLC-load-misses                 13.91                   10.44
(% of all LL-cache hits)

-------------------------------------------------------------------------------

v6:
- Changed the names of some functions to begin with net_if.
- Cleaned up sk_tx_queue_set/sk_rx_queue_set functions.
- Added sk_rx_queue_clear to make it consistent with tx_queue_mapping
  initialization.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Amritha Nambiar [Sat, 30 Jun 2018 04:27:12 +0000 (21:27 -0700)]

Documentation: Add explanation for XPS using Rx-queue(s) map

Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Amritha Nambiar [Sat, 30 Jun 2018 04:27:07 +0000 (21:27 -0700)]

net-sysfs: Add interface for Rx queue(s) map per Tx queue

Extend transmit queue sysfs attribute to configure Rx queue(s) map
per Tx queue. By default no receive queues are configured for the
Tx queue.

- /sys/class/net/eth0/queues/tx-*/xps_rxqs

Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Amritha Nambiar [Sat, 30 Jun 2018 04:27:02 +0000 (21:27 -0700)]

net: Enable Tx queue selection based on Rx queues

This patch adds support to pick Tx queue based on the Rx queue(s) map
configuration set by the admin through the sysfs attribute
for each Tx queue. If the user configuration for receive queue(s) map
does not apply, then the Tx queue selection falls back to CPU(s) map
based selection and finally to hashing.

Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Amritha Nambiar [Sat, 30 Jun 2018 04:26:57 +0000 (21:26 -0700)]

net: Record receive queue number for a connection

This patch adds a new field to sock_common 'skc_rx_queue_mapping'
which holds the receive queue number for the connection. The Rx queue
is marked in tcp_finish_connect() to allow a client app to do
SO_INCOMING_NAPI_ID after a connect() call to get the right queue
association for a socket. Rx queue is also marked in tcp_conn_request()
to allow syn-ack to go on the right tx-queue associated with
the queue on which syn is received.

Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Amritha Nambiar [Sat, 30 Jun 2018 04:26:51 +0000 (21:26 -0700)]

net: sock: Change tx_queue_mapping in sock_common to unsigned short

Change 'skc_tx_queue_mapping' field in sock_common structure from
'int' to 'unsigned short' type with ~0 indicating unset and
other positive queue values being set. This will accommodate adding
a new 'unsigned short' field in sock_common in the next patch for
rx_queue_mapping.

Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Amritha Nambiar [Sat, 30 Jun 2018 04:26:46 +0000 (21:26 -0700)]

net: Use static_key for XPS maps

Use static_key for XPS maps to reduce the cost of extra map checks,
similar to how it is used for RPS and RFS. This includes static_key
'xps_needed' for XPS and another for 'xps_rxqs_needed' for XPS using
Rx queues map.

Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Amritha Nambiar [Sat, 30 Jun 2018 04:26:41 +0000 (21:26 -0700)]

net: Refactor XPS for CPUs and Rx queues

Refactor XPS code to support Tx queue selection based on
CPU(s) map or Rx queue(s) map.

Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Sat, 30 Jun 2018 13:06:16 +0000 (22:06 +0900)]

Merge branch 'mlxsw-Add-resource-scale-tests'

Petr Machata says:

====================
mlxsw: Add resource scale tests

There are a number of tests that check features of the Linux networking
stack. By running them on suitable interfaces, one can exercise the
mlxsw offloading code. However none of these tests attempts to push
mlxsw to the limits supported by the ASIC.

As an additional wrinkle, the "limits supported by the ASIC" themselves
may not be a set of fixed numbers, but rather depend on a profile that
determines how the ASIC resources are allocated for different purposes.

This patchset introduces several tests that verify capability of mlxsw
to offload amounts of routes, flower rules, and mirroring sessions that
match predicted ASIC capacity, at different configuration profiles.
Additionally they verify that amounts exceeding the predicted capacity
can *not* be offloaded.

These are not generic tests, but ones that are tailored for mlxsw
specifically. For that reason they are not added to net/forwarding
selftests subdirectory, but rather to a newly-added drivers/net/mlxsw.

Patches #1, #2 and #3 tweak the generic forwarding/lib.sh to support the
new additions.

In patches #4 and #5, new libraries for interfacing with devlink are
introduced, first a generic one, then a Spectrum-specific one.

In patch #6, a devlink resource test is introduced.

Patches #7 and #8, #9 and #10, and #11 and #12 introduce three scale
tests: router, flower and mirror-to-gretap. The first of each pair of
patches introduces a generic portion of the test (mlxsw-specific), the
second introduces a Spectrum-specific wrapper.

Patch #13 then introduces a scale test driver that runs (possibly a
subset of) the tests introduced by patches from previous paragraph.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Yuval Mintz [Sat, 30 Jun 2018 00:53:52 +0000 (02:53 +0200)]

selftests: mlxsw: Add scale test for resources

Add a scale test capable of validating that offloaded network
functionality is indeed functional at scale when configured to
the different KVD profiles available.

Start by testing offloaded routes are functional at scale by
passing traffic on each one of them in turn.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Petr Machata [Sat, 30 Jun 2018 00:53:14 +0000 (02:53 +0200)]

selftests: mlxsw: Add target for mirror-to-gretap test on spectrum

Add a wrapper around mlxsw/mirror_gre_scale.sh that parameterized number
of offloadable mirrors on Spectrum machines.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Petr Machata [Sat, 30 Jun 2018 00:52:34 +0000 (02:52 +0200)]

selftests: mlxsw: Add scale test for mirror-to-gretap

Test that it's possible to offload a given number of mirrors.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Petr Machata [Sat, 30 Jun 2018 00:51:55 +0000 (02:51 +0200)]

selftests: mlxsw: Add target for tc flower test on spectrum

Add a wrapper around mlxsw/tc_flower_scale.sh that parameterizes the
generic tc flower scale test template with Spectrum-specific target
values.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Petr Machata [Sat, 30 Jun 2018 00:51:05 +0000 (02:51 +0200)]

selftests: mlxsw: Add tc flower scale test

Add test of capacity to offload flower.

This is a generic portion of the test that is meant to be called from a
driver that supplies a particular number of rules to be tested with.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Yuval Mintz [Sat, 30 Jun 2018 00:50:28 +0000 (02:50 +0200)]

selftests: mlxsw: Add target for router test on spectrum

IPv4 routes in Spectrum are based on the kvd single-hash, but as it's
a hash we need to assume we cannot reach 100% of its capacity.

Add a wrapper that provides us with good/bad target numbers for the
Spectrum ASIC.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
[petrm@mellanox.com: Drop shebang.]
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Arkadi Sharshevsky [Sat, 30 Jun 2018 00:49:32 +0000 (02:49 +0200)]

selftests: mlxsw: Add router test

This test aims for both stand alone and internal usage by the resource
infra. The test receives the number routes to offload and checks:
- The routes were offloaded correctly
- Traffic for each route.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Yuval Mintz [Sat, 30 Jun 2018 00:48:27 +0000 (02:48 +0200)]

selftests: mlxsw: Add devlink KVD resource test

Add a selftest that can be used to perform basic sanity of the devlink
resource API as well as test the behavior of KVD manipulation in the
driver.

This is the first case of a HW-only test - in order to test the devlink
resource a driver capable of exposing resources has to be provided
first.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
[petrm@mellanox.com: Extracted two patches out of this patch. Tweaked
commit message.]
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Petr Machata [Sat, 30 Jun 2018 00:48:12 +0000 (02:48 +0200)]

selftests: mlxsw: Add devlink_lib_spectrum.sh

This library builds on top of devlink_lib.sh and contains functionality
specific to Spectrum ASICs, e.g., re-partitioning the various KVD
sub-parts.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
[petrm@mellanox.com: Split this out from another patch. Fix line length
in devlink_sp_read_kvd_defaults().]
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Petr Machata [Sat, 30 Jun 2018 00:47:08 +0000 (02:47 +0200)]

selftests: forwarding: Add devlink_lib.sh

This helper library contains wrappers to devlink functionality agnostic
to the underlying device.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
[petrm@mellanox.com: Split this out from another patch.]
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Petr Machata [Sat, 30 Jun 2018 00:46:05 +0000 (02:46 +0200)]

selftests: forwarding: lib: Parameterize NUM_NETIFS in two functions

setup_wait() and tc_offload_check() both assume that all NUM_NETIFS
interfaces are relevant for a given test. However, the scale test script
acts as an umbrella for a number of sub-tests, some of which may not
require all the interfaces.

Thus it's suboptimal for tc_offload_check() to query all the interfaces.
In case of setup_wait() it's incorrect, because the sub-test in question
of course doesn't configure any interfaces beyond what it needs, and
setup_wait() then ends up waiting indefinitely for the extraneous
interfaces to come up.

For that reason, give setup_wait() and tc_offload_check() an optional
parameter with a number of interfaces to probe. Fall back to global
NUM_NETIFS if the parameter is not given.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Petr Machata [Sat, 30 Jun 2018 00:45:47 +0000 (02:45 +0200)]

selftests: forwarding: lib: Add check_err_fail()

In the scale testing scenarios, one usually has a condition that is
expected to either fail, or pass, depending on which side of the scale
is being tested.

To capture this logic, add a function check_err_fail(), which dispatches
either to check_err() or check_fail(), depending on the value of the
first argument, should_fail.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Yuval Mintz [Sat, 30 Jun 2018 00:45:19 +0000 (02:45 +0200)]

selftests: forwarding: Allow lib.sh sourcing from other directories

The devlink related scripts are mlxsw-specific. As a result, they'll
reside in a different directory - but would still need the common logic
implemented in lib.sh.
So as a preliminary step, allow lib.sh to be sourced from other
directories as well.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Sat, 30 Jun 2018 12:31:57 +0000 (21:31 +0900)]

Merge branch 'nfp-flower-updates-and-netconsole'

Jakub Kicinski says:

====================
nfp: flower updates and netconsole

This set contains assorted updates to driver base and flower.
First patch is a follow up to a fix to calculating counters which
went into net.  For ethtool counters we should also make sure
they are visible even after ring reconfiguration.  Next patch
is a safety measure in case we are running on a machine with a
broken BIOS we should fail the probe when risk of incorrect
operation is too high.  The next two patches add netpoll support
and make use of napi_consume_skb().  Last we populate bus info
on all representors.

Pieter adds support for offload of the checksum action in flower.

John follows up to another fix he's done in net, we set TTL
values on tunnels to stack's default, now Johns does a full
route lookup to get a more precise information, he populates
ToS field as well.  Last but not least he follows up on Jiri's
request to enable LAG offload in case the team driver is used
and then hash function is unknown.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

John Hurley [Sat, 30 Jun 2018 00:04:42 +0000 (17:04 -0700)]

nfp: flower: enabled offloading of Team LAG

Currently the NFP fw only supports L3/L4 hashing so rejects the offload of
filters that output to LAG ports implementing other hash algorithms. Team,
however, uses a BPF function for the hash that is not defined. To support
Team offload, accept hashes that are defined as 'unknown' (only Team
defines such hash types). In this case, use the NFP default of L3/L4
hashing for egress port selection.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

John Hurley [Sat, 30 Jun 2018 00:04:41 +0000 (17:04 -0700)]

nfp: flower: offload tos and tunnel flags for ipv4 udp tunnels

Extract the tos and the tunnel flags from the tunnel key and offload these
action fields. Only the checksum and tunnel key flags are implemented in
fw so reject offloads of other flags. The tunnel key flag is always
considered set in the fw so enforce that it is set in the rule. Note that
the compulsory setting of the tunnel key flag and optional setting of
checksum is inline with how tc currently generates ipv4 udp tunnel
actions.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

John Hurley [Sat, 30 Jun 2018 00:04:40 +0000 (17:04 -0700)]

nfp: flower: extract ipv4 udp tunnel ttl from route

Previously the ttl for ipv4 udp tunnels was set to the namespace default.
Modify this to attempt to extract the ttl from a full route lookup on the
tunnel destination. If this is not possible then resort to the default.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Pieter Jansen van Vuuren [Sat, 30 Jun 2018 00:04:39 +0000 (17:04 -0700)]

nfp: flower: ignore checksum actions when performing pedit actions

Hardware will automatically update csum in headers when a set action has
been performed. This means we could in the driver ignore the explicit
checksum action when performing a set action.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Jakub Kicinski [Sat, 30 Jun 2018 00:04:38 +0000 (17:04 -0700)]

nfp: populate bus-info on representors

We used to leave bus-info in ethtool driver info empty for
representors in case multi-PCIe-to-single-host cards make
the association between PCIe device and NFP many to one.
It seems these attempts are futile, we need to link the
representors to one PCIe device in sysfs to get consistent
naming, plus devlink uses one PCIe as a handle, anyway.
The multi-PCIe-to-single-system support won't be clean,
if it ever comes.

Turns out some user space (RHEL tests) likes to read bus-info
so just populate it.

While at it remove unnecessary app NULL-check, representors
are spawned by an app, so it must exist.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Jakub Kicinski [Sat, 30 Jun 2018 00:04:37 +0000 (17:04 -0700)]

nfp: make use of napi_consume_skb()

Use napi_consume_skb() in nfp_net_tx_complete() to get bulk free.
Pass 0 as budget for ctrl queue completion since it runs out of
a tasklet.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Jakub Kicinski [Sat, 30 Jun 2018 00:04:36 +0000 (17:04 -0700)]

nfp: implement netpoll ndo (thus enabling netconsole)

NFP NAPI handling will only complete the TXed packets when called
with budget of 0, implement ndo_poll_controller by scheduling NAPI
on all TX queues.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Jakub Kicinski [Sat, 30 Jun 2018 00:04:35 +0000 (17:04 -0700)]

nfp: fail probe if serial or interface id is missing

On some platforms with broken ACPI tables we may not have access
to the Serial Number PCIe capability. This capability is crucial
for us for switchdev operation as we use serial number as switch ID,
and for communication with management FW where interface ID is used.

If we can't determine the Serial Number we have to fail device probe.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Jakub Kicinski [Sat, 30 Jun 2018 00:04:34 +0000 (17:04 -0700)]

nfp: expose ring stats of inactive rings via ethtool

After user changes the ring count statistics for deactivated
rings disappear from ethtool -S output.  This causes loss of
information to the user and means that ethtool stats may not
add up to interface stats.  Always expose counters from all
the rings.  Note that we allocate at most num_possible_cpus()
rings so number of rings should be reasonable.

The alternative of only listing stats for rings which were
ever in use could be confusing.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Vakul Garg [Fri, 29 Jun 2018 19:15:55 +0000 (00:45 +0530)]

strparser: Call skb_unclone conditionally

Calling skb_unclone() is expensive as it triggers a memcpy operation.
Instead of calling skb_unclone() unconditionally, call it only when skb
has a shared frag_list. This improves tls rx throughout significantly.

Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Suggested-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Keara Leibovitz [Fri, 29 Jun 2018 14:47:31 +0000 (10:47 -0400)]

tc-testing: initial version of tunnel_key unit tests

Create unittests for the tc tunnel_key action.

v2:
For the tests expecting failures, added non-zero exit codes in the
teardowns. This prevents those tests from failing if the act_tunnel_key
module is unloaded.

Signed-off-by: Keara Leibovitz <kleib@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Sat, 30 Jun 2018 12:08:12 +0000 (21:08 +0900)]

Merge tag 'mac80211-next-for-davem-2018-06-29' of git://git./linux/kernel/git/jberg/mac80211-next

Small merge conflict in net/mac80211/scan.c, I preserved
the kcalloc() conversion. -DaveM

Johannes Berg says:

====================
This round's updates:
* finally some of the promised HE code, but it turns
   out to be small - but everything kept changing, so
   one part I did in the driver was >30 patches for
   what was ultimately <200 lines of code ... similar
   here for this code.
* improved scan privacy support - can now specify scan
   flags for randomizing the sequence number as well as
   reducing the probe request element content
* rfkill cleanups
* a timekeeping cleanup from Arnd
* various other cleanups
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

GhantaKrishnamurthy MohanKrishna [Fri, 29 Jun 2018 11:26:18 +0000 (13:26 +0200)]

tipc: extend sock diag for group communication

This commit extends the existing TIPC socket diagnostics framework
for information related to TIPC group communication.

Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: GhantaKrishnamurthy MohanKrishna <mohan.krishna.ghanta.krishnamurthy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

GhantaKrishnamurthy MohanKrishna [Fri, 29 Jun 2018 11:23:41 +0000 (13:23 +0200)]

tipc: Auto removal of peer down node instance

A peer node is considered down if there are no
active links (or) lost contact to the node. In current implementation,
a peer node instance is deleted either if

a) TIPC module is removed (or)
b) Application can use a netlink/iproute2 interface to delete a
specific down node.

Thus, a down node instance lives in the system forever, unless the
application explicitly removes it.

We fix this by deleting the nodes which are down for
a specified amount of time (5 minutes).
Existing node supervision timer is used to achieve this.

Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: GhantaKrishnamurthy MohanKrishna <mohan.krishna.ghanta.krishnamurthy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Heiner Kallweit [Fri, 29 Jun 2018 06:07:04 +0000 (08:07 +0200)]

r8169: remove TBI 1000BaseX support

The very first version of RTL8169 from 2002 (and only this one) has
support for a TBI 1000BaseX fiber interface. The TBI support in the
driver makes switching to phylib tricky, so best would be to get
rid of it. I found no report from anybody using a device with RTL8169
and fiber interface, also the vendor driver doesn't support this mode
(any longer).
So remove TBI support and bail out with a message if a card with
activated TBI is detected. If there really should be any user of it
out there, we could add a stripped-down version of the driver
supporting chip version 01 and TBI only (and maybe move it to
staging).

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Tung Nguyen [Thu, 28 Jun 2018 20:39:25 +0000 (22:39 +0200)]

tipc: optimize function tipc_node_timeout()

In single-link usage, the function tipc_node_timeout() still iterates
over the whole link array to handle each link. Given that the maximum
number of bearers are 3, there are 2 redundant iterations with lock
grab/release. Since this function is executing very frequently it makes
sense to optimize it.

This commit adds conditional checking to exit from the loop if the
known number of configured links has already been accessed.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Tung Nguyen [Thu, 28 Jun 2018 20:25:04 +0000 (22:25 +0200)]

tipc: eliminate buffer cloning in function tipc_msg_extract()

The function tipc_msg_extract() is using skb_clone() to clone inner
messages from a message bundle buffer. Although this method is safe,
it has an undesired effect that each buffer clone inherits the
true-size of the bundling buffer. As a result, the buffer clone
almost always ends up with being copied anyway by the message
validation function. This makes the cloning into a sub-optimization.

In this commit we take the consequence of this realization, and copy
each inner message to a separately allocated buffer up front in the
extraction function.

As a bonus we can now eliminate the two cases where we had to copy
re-routed packets that may potentially go out on the wire again.

Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Gustavo A. R. Silva [Thu, 28 Jun 2018 18:50:48 +0000 (13:50 -0500)]

net: usb: Mark expected switch fall-throughs

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Heiner Kallweit [Thu, 28 Jun 2018 18:46:45 +0000 (20:46 +0200)]

net: phy: realtek: add support for RTL8211

In preparation of adding phylib support to the r8169 driver we need
PHY drivers for all chip-internal PHY types. Fortunately almost all
of them are either supported by the Realtek PHY driver already or work
with the genphy driver.
Still missing is support for the PHY of RTL8169s, it requires a quirk
to properly support 100Mbit-fixed mode. The quirk was copied from
r8169 driver which copied it from the vendor driver.
Based on the PHYID the internal PHY seems to be a RTL8211.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Heiner Kallweit [Thu, 28 Jun 2018 18:36:15 +0000 (20:36 +0200)]

r8169: use standard debug output functions

I see no need to define a private debug output symbol, let's use the
standard debug output functions instead. In this context also remove
the deprecated PFX define.

The one assertion is wrong IMO anyway, this code path is used also
by chip version 01.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Sat, 30 Jun 2018 11:42:26 +0000 (20:42 +0900)]

Merge branch 'smc-pnetid-and-SMC-D-support'

Ursula Braun says:

====================
smc: pnetid and SMC-D support

SMC requires a configured pnet table to map Ethernet interfaces to
RoCE adapter ports. For s390 there exists hardware support to group
such devices. The first three patches cover the s390 pnetid support,
enabling SMC-R usage on s390 without configuring an extra pnet table.

SMC currently requires RoCE adapters, and uses RDMA-techniques
implemented with IB-verbs. But s390 offers another method for
intra-CEC Shared Memory communication. The following seven patches
implement a solution to run SMC traffic based on intra-CEC DMA,
called SMC-D.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Sebastian Ott [Thu, 28 Jun 2018 17:05:13 +0000 (19:05 +0200)]

s390/ism: add device driver for internal shared memory

Add support for the Internal Shared Memory vPCI Adapter.
This driver implements the interfaces of the SMC-D protocol.

Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Hans Wippel [Thu, 28 Jun 2018 17:05:12 +0000 (19:05 +0200)]

net/smc: add SMC-D diag support

This patch adds diag support for SMC-D.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Suggested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Hans Wippel [Thu, 28 Jun 2018 17:05:11 +0000 (19:05 +0200)]

net/smc: add SMC-D support in af_smc

This patch ties together the previous SMC-D patches. It adds support for
SMC-D to the listen and connect functions and, thus, enables SMC-D
support in the SMC code. If a connection supports both SMC-R and SMC-D,
SMC-D is preferred.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Suggested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Hans Wippel [Thu, 28 Jun 2018 17:05:10 +0000 (19:05 +0200)]

net/smc: add SMC-D support in data transfer

The data transfer and CDC message headers differ in SMC-R and SMC-D.
This patch adds support for the SMC-D data transfer to the existing SMC
code. It consists of the following:

* SMC-D CDC support
* SMC-D tx support
* SMC-D rx support

The CDC header is stored at the beginning of the receive buffer. Thus, a
rx_offset variable is added for the CDC header offset within the buffer
(0 for SMC-R).

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Suggested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Hans Wippel [Thu, 28 Jun 2018 17:05:09 +0000 (19:05 +0200)]

net/smc: add SMC-D support in CLC messages

There are two types of SMC: SMC-R and SMC-D. These types are signaled
within the CLC messages during the CLC handshake. This patch adds
support for and checks of the SMC type.

Also, SMC-R and SMC-D need to exchange different information during the
CLC handshake. So, this patch extends the current message formats to
support the SMC-D header fields. The Proposal message can contain both
SMC-R and SMC-D information. The Accept and Confirm messages contain
either SMC-R or SMC-D information.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Suggested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Hans Wippel [Thu, 28 Jun 2018 17:05:08 +0000 (19:05 +0200)]

net/smc: add pnetid support for SMC-D and ISM

SMC-D relies on PNETIDs to find usable SMC-D/ISM devices for a SMC
connection. This patch adds SMC-D/ISM support to the current PNETID
implementation.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Suggested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Hans Wippel [Thu, 28 Jun 2018 17:05:07 +0000 (19:05 +0200)]

net/smc: add base infrastructure for SMC-D and ISM

SMC supports two variants: SMC-R and SMC-D. For data transport, SMC-R
uses RDMA devices, SMC-D uses so-called Internal Shared Memory (ISM)
devices. An ISM device only allows shared memory communication between
SMC instances on the same machine. For example, this allows virtual
machines on the same host to communicate via SMC without RDMA devices.

This patch adds the base infrastructure for SMC-D and ISM devices to
the existing SMC code. It contains the following:

* ISM driver interface:
  This interface allows an ISM driver to register ISM devices in SMC. In
  the process, the driver provides a set of device ops for each device.
  SMC uses these ops to execute SMC specific operations on or transfer
  data over the device.

* Core SMC-D link group, connection, and buffer support:
  Link groups, SMC connections and SMC buffers (in smc_core) are
  extended to support SMC-D.

* SMC type checks:
  Some type checks are added to prevent using SMC-R specific code for
  SMC-D and vice versa.

To actually use SMC-D, additional changes to pnetid, CLC, CDC, etc. are
required. These are added in follow-up patches.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Suggested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Ursula Braun [Thu, 28 Jun 2018 17:05:06 +0000 (19:05 +0200)]

net/smc: optimize consumer cursor updates

The SMC protocol requires to send a separate consumer cursor update,
if it cannot be piggybacked to updates of the producer cursor.
Currently the decision to send a separate consumer cursor update
just considers the amount of data already received by the socket
program. It does not consider the amount of data already arrived, but
not yet consumed by the receiver. Basing the decision on the
difference between already confirmed and already arrived data
(instead of difference between already confirmed and already consumed
data), may lead to a somewhat earlier consumer cursor update send in
fast unidirectional traffic scenarios, and thus to better throughput.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Suggested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Ursula Braun [Thu, 28 Jun 2018 17:05:05 +0000 (19:05 +0200)]

net/smc: add pnetid support

s390 hardware supports the definition of a so-call Physical NETwork
IDentifier (short PNETID) per network device port. These PNETIDS
can be used to identify network devices that are attached to the same
physical network (broadcast domain).

On s390 try to use the PNETID of the ethernet device port used for
initial connecting, and derive the IB device port used for SMC RDMA
traffic.

On platforms without PNETID support fall back to the existing
solution of a configured pnet table.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Ursula Braun [Thu, 28 Jun 2018 17:05:04 +0000 (19:05 +0200)]

net/smc: determine port attributes independent from pnet table

For SMC it is important to know the current port state of RoCE devices.
Monitoring port states has been triggered, when a RoCE device was added
to the pnet table. To support future alternatives to the pnet table the
monitoring of ports is made independent of the existence of a pnet table.
It starts once the smc_ib_device is established.

Due to this change smc_ib_remember_port_attr() is now a local function
and shuffling its location and the location of its used functions
makes any forward references obsolete.

And the duplicate SMC_MAX_PORTS definition is removed.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Sat, 30 Jun 2018 11:34:09 +0000 (20:34 +0900)]

Merge branch 'Fixes-for-running-mirror-to-gretap-tests-on-veth'

Petr Machata says:

====================
Fixes for running mirror-to-gretap tests on veth

The forwarding selftests infrastructure makes it possible to run the
individual tests on a purely software netdevices. Names of interfaces to
run the test with can be passed as command line arguments to a test.
lib.sh then creates veth pairs backing the interfaces if none exist in
the system.

However, the tests need to recognize that they might be run on a soft
device. Many mirror-to-gretap tests are buggy in this regard. This patch
set aims to fix the problems in running mirror-to-gretap tests on veth
devices.

In patch #1, a service function is split out of setup_wait().
In patch #2, installing a trap is made optional.
In patch #3, tc filters in several tests are tweaked to work with veth.
In patch #4, the logic for waiting for neighbor is fixed for veth.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Petr Machata [Thu, 28 Jun 2018 16:56:39 +0000 (18:56 +0200)]

selftests: forwarding: mirror_gre_changes: Fix waiting for neighbor

When running the test on soft devices, there's no mechanism to
gratuitously start resolving the neighbor for remote tunnel endpoint.
So instead of passively waiting, wait for the device to be up, and then
probe the neighbor with a ping.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Petr Machata [Thu, 28 Jun 2018 16:56:33 +0000 (18:56 +0200)]

selftests: forwarding: Tweak tc filters for mirror-to-gretap tests

When running mirror_gre_bridge_1d_vlan tests on veth, several issues
cause spurious failures:

- vlan_ethtype should be ip, not ipv6 even in mirror-to-ip6gretap case,
  because the overlay packet is still IPv4.
- Similarly ip_proto matches the innermost IP protocol, so can't be used
  to filter out GRE packet. Drop the corresponding condition.
- Because the above fixes the filters to match in slow path as well,
  they need to be made skip_hw so as not to double-count packets.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Petr Machata [Thu, 28 Jun 2018 16:56:28 +0000 (18:56 +0200)]

selftests: forwarding: lib: Avoid trapping soft devices

There are several cases where traffic that would normally be forwarded
in silicon needs to be observed in slow path. That's achieved by
trapping such traffic, and the functions trap_install() and
trap_uninstall() realize that. However, such treatment is obviously
wrong if the device in question is actually a soft device not backed by
an ASIC.

Therefore try to trap if possible, but fall back to inserting a continue
if not.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Petr Machata [Thu, 28 Jun 2018 16:56:20 +0000 (18:56 +0200)]

selftests: forwarding: lib: Split out setup_wait_dev()

Split out of setup_wait() a function setup_wait_dev() that waits for a
single device. This gives tests the opportunity to wait for a selected
device after they tinkered with its upness.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Sat, 30 Jun 2018 11:15:45 +0000 (20:15 +0900)]

Merge branch 'xilinx_emaclite-coding-style'

Radhey Shyam Pandey says:

====================
Fixes coding style in xilinx_emaclite.c

This patchset fixes checkpatch and kernel-doc warnings in
xilinx emaclite driver. No functional change.

Changes from v2:
-In 2/5 patch refactor if-else to make failure path return early.
-In 2/5 patch coalesce the format onto a single line and add the
missing space after the comma.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Radhey Shyam Pandey [Thu, 28 Jun 2018 13:11:50 +0000 (18:41 +0530)]

net: emaclite: Remove unnecessary spaces

This patch fixes below checkpatch checks-

CHECK: spaces preferred around that '*' (ctx:VxV)
CHECK: No space is necessary after a cast

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Radhey Shyam Pandey [Thu, 28 Jun 2018 13:11:49 +0000 (18:41 +0530)]

net: emaclite: Fix block comments style

This patch fixes below checkpatch warnings-

WARNING: Block comments use a trailing */ on a separate line
WARNING: Block comments use * on subsequent lines
WARNING: networking block comments don't use an empty /* line,
use /* Comment

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Radhey Shyam Pandey [Thu, 28 Jun 2018 13:11:48 +0000 (18:41 +0530)]

net: emaclite: update kernel-doc comments

This patch fixes below kernel-doc warnings:

Function parameter or member 'maxlen' not described in 'xemaclite_recv_data'
Function parameter or member 'address'not described in 'xemaclite_set_mac_address'
Excess function parameter 'addr' description in 'xemaclite_set_mac_address'
No description found for return value of 'xemaclite_interrupt'
No description found for return value of 'xemaclite_mdio_write'
Function parameter or member 'dev' not described in 'xemaclite_mdio_setup'
Excess function parameter 'ofdev' description in 'xemaclite_mdio_setup'
No description found for return value of 'xemaclite_open'
No description found for return value of 'xemaclite_close'
Excess function parameter 'match' description in 'xemaclite_of_probe'

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Radhey Shyam Pandey [Thu, 28 Jun 2018 13:11:47 +0000 (18:41 +0530)]

net: emaclite: Simplify if-else statements

Remove else as it is not required with if doing a return.
It also coalesce the format onto a single line and add the
missing space after the comma. Fixes below checkpatch warning-

WARNING: else is not generally useful after a break or return

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Radhey Shyam Pandey [Thu, 28 Jun 2018 13:11:46 +0000 (18:41 +0530)]

net: emaclite: Use __func__ instead of hardcoded name

Switch hardcoded function name with a reference to __func__ making
the code more maintainable. Address below checkpatch warning:

WARNING: Prefer using '"%s...", __func__' to using 'xemaclite_mdio_read',
this function's name, in a string
+ "xemaclite_mdio_read(phy_id=%i, reg=%x) == %x\n",

WARNING: Prefer using '"%s...", __func__' to using 'xemaclite_mdio_write',
this function's name, in a string
+ "xemaclite_mdio_write(phy_id=%i, reg=%x, val=%x)\n",

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Sat, 30 Jun 2018 09:54:09 +0000 (18:54 +0900)]

Merge branch 'mvpp2-Add-big-endian-support'

Maxime Chevallier says:

====================
net: mvpp2: Add big-endian support

This series allows to use PPv2 on system built as big endian.

The first patch fixes the way we represent TX and RX descriptors, so that
they used fixed little endianness as expected by the PPv2 controller.

The second reworks the way we handle the software representation of the
Header Parser entries, so that we don't use a union of arrays.

The last two patches fixes some incorrect byte swapping logic, that wen't
un-noticed on little-endian.

This whole series doesn't fix any existing bug for little-endian systems, and
since big-endian never worked for this driver, I didn't include 'fixes' tags.

This was tested on MacchiatoBin (Armada 8040).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Maxime Chevallier [Thu, 28 Jun 2018 12:42:07 +0000 (14:42 +0200)]

net: mvpp2: Use htons when checking protocol info

When checking the skb->protocol field, we have to make sure we use the
proper endianness using htons, and not swab16.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Maxime Chevallier [Thu, 28 Jun 2018 12:42:06 +0000 (14:42 +0200)]

net: mvpp2: prs: Drop unnecessary swab16 in vlan detection

Vlan IDs must not be swapped when creating Header Parser entries. This
has no effect on little-endian systems, but is wrong for big-endian.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Maxime Chevallier [Thu, 28 Jun 2018 12:42:05 +0000 (14:42 +0200)]

net: mvpp2: prs: Drop unions representing TCAM and SRAM entries

PPv2's Header Parser use some large TCAM and SRAM entries, that are
duplicated in software so that we can write them to hardware only when
we are done modifying them.

Currently, PPv2 uses a union containing arrays of u32 and u8 to represent
these entries, to facilitate byte per byte access. This representation is
broken when we want to support big endian, and this makes the code
confusing to read.

This patch drops the union, and simply stores the TCAM and SRAM entries
as u32 arrays, each entry corresponding to a 32-bit register.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Maxime Chevallier [Thu, 28 Jun 2018 12:42:04 +0000 (14:42 +0200)]

net: mvpp2: Make TX / RX descriptors little-endian

The PPv2 controller always expect descriptors to be in little endian. We
must therefore force descriptors to use that format, and convert to the
host endianness when necessary.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Yafang Shao [Thu, 28 Jun 2018 04:22:56 +0000 (00:22 -0400)]

tcp: add new SNMP counter for drops when try to queue in rcv queue

When sk_rmem_alloc is larger than the receive buffer and we can't
schedule more memory for it, the skb will be dropped.

In above situation, if this skb is put into the ofo queue,
LINUX_MIB_TCPOFODROP is incremented to track it.

While if this skb is put into the receive queue, there's no record.
So a new SNMP counter is introduced to track this behavior.

LINUX_MIB_TCPRCVQDROP: Number of packets meant to be queued in rcv queue
but dropped because socket rcvbuf limit hit.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Gustavo A. R. Silva [Thu, 28 Jun 2018 01:32:23 +0000 (20:32 -0500)]

bnx2x: Mark expected switch fall-throughs

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Jose Abreu [Wed, 27 Jun 2018 14:57:02 +0000 (15:57 +0100)]

net: stmmac: Add support for CBS QDISC

This adds support for CBS reconfiguration using the TC application.

A new callback was added to TC ops struct and another one to DMA ops to
reconfigure the channel mode.

Tested in GMAC5.10.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Fri, 29 Jun 2018 14:54:31 +0000 (23:54 +0900)]

Merge tag 'mlx5e-updates-2018-06-28' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5e-updates-2018-06-28

mlx5e netdevice driver updates:

- Boris Pismenny added the support for UDP GSO in the first two patches.
  Impressive performance numbers are included in the commit message,
  @Line rate with ~half of the cpu utilization compared to non offload
  or no GSO at all.

- From Tariq Toukan:
  - Convert large order kzalloc allocations to kvzalloc.
  - Added performance diagnostic statistics to several places in data path.

From Saeed and Eran,
  - Update NIC HW stats on demand only, this is to eliminate the background
    thread needed to update some HW statistics in the driver cache in
    order to report error and drop counters from HW in ndo_get_stats.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Fri, 29 Jun 2018 14:50:27 +0000 (23:50 +0900)]

Merge branch 'net-Geneve-options-support-for-TC-act_tunnel_key'

Jakub Kicinski says:

====================
net: Geneve options support for TC act_tunnel_key

Simon & Pieter say:

This set adds Geneve Options support to the TC tunnel key action.
It provides the plumbing required to configure Geneve variable length
options. The options can be configured in the form CLASS:TYPE:DATA,
where CLASS is represented as a 16bit hexadecimal value, TYPE as an 8bit
hexadecimal value and DATA as a variable length hexadecimal value.
Additionally multiple options may be listed using a comma delimiter.

v2:
- fix sparse warnings in patches 3 and 4 (first one reported by
build bot).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Simon Horman [Wed, 27 Jun 2018 04:39:37 +0000 (21:39 -0700)]

net/sched: add tunnel option support to act_tunnel_key

Allow setting tunnel options using the act_tunnel_key action.

Options are expressed as class:type:data and multiple options
may be listed using a comma delimiter.

# ip link add name geneve0 type geneve dstport 0 external
# tc qdisc add dev eth0 ingress
# tc filter add dev eth0 protocol ip parent ffff: \
     flower indev eth0 \
        ip_proto udp \
        action tunnel_key \
            set src_ip 10.0.99.192 \
            dst_ip 10.0.99.193 \
            dst_port 6081 \
            id 11 \
            geneve_opts 0102:80:00800022,0102:80:00800022 \
    action mirred egress redirect dev geneve0

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Pieter Jansen van Vuuren [Wed, 27 Jun 2018 04:39:36 +0000 (21:39 -0700)]

net: check tunnel option type in tunnel flags

Check the tunnel option type stored in tunnel flags when creating options
for tunnels. Thereby ensuring we do not set geneve, vxlan or erspan tunnel
options on interfaces that are not associated with them.

Make sure all users of the infrastructure set correct flags, for the BPF
helper we have to set all bits to keep backward compatibility.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Simon Horman [Wed, 27 Jun 2018 04:39:35 +0000 (21:39 -0700)]

net/sched: act_tunnel_key: add extended ack support

Add extended ack support for the tunnel key action by using NL_SET_ERR_MSG
during validation of user input.

Cc: Alexander Aring <aring@mojatatu.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Simon Horman [Wed, 27 Jun 2018 04:39:34 +0000 (21:39 -0700)]

net/sched: act_tunnel_key: disambiguate metadata dst error cases

Metadata may be NULL for one of two reasons:
* Missing user input
* Failure to allocate the metadata dst

Disambiguate these case by returning -EINVAL for the former and -ENOMEM
for the latter rather than -EINVAL for both cases.

This is in preparation for using extended ack to provide more information
to users when parsing their input.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Arjun Vynipadath [Tue, 26 Jun 2018 11:40:50 +0000 (17:10 +0530)]

cxgb4: Support ethtool private flags

This is used to change TX workrequests, which helps in
host->vf communication.

Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Arjun Vynipadath [Tue, 26 Jun 2018 11:40:25 +0000 (17:10 +0530)]

cxgb4: Add support for FW_ETH_TX_PKT_VM_WR

The present TX workrequest(FW_ETH_TX_PKT_WR) cant be used for
host->vf communication, since it doesn't loopback the outgoing
packets to virtual interfaces on the same port. This can be done
using FW_ETH_TX_PKT_VM_WR.
This fix depends on ethtool_flags to determine what WR to use for
TX path. Support for setting this flags by user is added in next
commit.

Based on the original work by : Casey Leedom <leedom@chelsio.com>

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Xin Long [Thu, 28 Jun 2018 07:31:00 +0000 (15:31 +0800)]

sctp: add support for SCTP_REUSE_PORT sockopt

This feature is actually already supported by sk->sk_reuse which can be
set by socket level opt SO_REUSEADDR. But it's not working exactly as
RFC6458 demands in section 8.1.27, like:

  - This option only supports one-to-one style SCTP sockets
  - This socket option must not be used after calling bind()
    or sctp_bindx().

Besides, SCTP_REUSE_PORT sockopt should be provided for user's programs.
Otherwise, the programs with SCTP_REUSE_PORT from other systems will not
work in linux.

To separate it from the socket level version, this patch adds 'reuse' in
sctp_sock and it works pretty much as sk->sk_reuse, but with some extra
setup limitations that are needed when it is being enabled.

"It should be noted that the behavior of the socket-level socket option
to reuse ports and/or addresses for SCTP sockets is unspecified", so it
leaves SO_REUSEADDR as is for the compatibility.

Note that the name SCTP_REUSE_PORT is somewhat confusing, as its
functionality is nearly identical to SO_REUSEADDR, but with some
extra restrictions. Here it uses 'reuse' in sctp_sock instead of
'reuseport'. As for sk->sk_reuseport support for SCTP, it will be
added in another patch.

Thanks to Neil to make this clear.

v1->v2:
  - add sctp_sk->reuse to separate it from the socket level version.
v2->v3:
  - improve changelog according to Marcelo's suggestion.

Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David Wu [Thu, 28 Jun 2018 01:33:21 +0000 (09:33 +0800)]

net: ethernet: stmmac: dwmac-rk: Add GMAC support for px30

Add constants and callback functions for the dwmac on px30 Soc.
The base structure is the same, but registers and the bits in
them are moved slightly, and add the clk_mac_speed for selecting
mac speed.

Signed-off-by: David Wu <david.wu@rock-chips.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Gustavo A. R. Silva [Thu, 28 Jun 2018 01:45:24 +0000 (20:45 -0500)]

tg3: Mark expected switch fall-throughs

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Unnamed repository; edit this file 'description' to name the repository.