sfrench/cifs-2.6.git
7 months agoMerge tag 'wireless-next-2023-10-06' of git://git.kernel.org/pub/scm/linux/kernel...
Jakub Kicinski [Fri, 6 Oct 2023 23:07:28 +0000 (16:07 -0700)]
Merge tag 'wireless-next-2023-10-06' of git://git./linux/kernel/git/wireless/wireless-next

Kalle Valo says:

====================
wireless-next patches for v6.7

The first pull request for v6.7, with both stack and driver changes.
We have a big change how locking is handled in cfg80211 and mac80211
which removes several locks and hopefully simplifies the locking
overall. In drivers rtw89 got MCC support and smaller features to
other active drivers but nothing out of ordinary.

Major changes:

cfg80211
 - remove wdev mutex, use the wiphy mutex instead
 - annotate iftype_data pointer with sparse
 - first kunit tests, for element defrag
 - remove unused scan_width support

mac80211
 - major locking rework, remove several locks like sta_mtx, key_mtx
   etc. and use the wiphy mutex instead
 - remove unused shifted rate support
 - support antenna control in frame injection (requires driver support)
 - convert RX_DROP_UNUSABLE to more detailed reason codes

rtw89
 - TDMA-based multi-channel concurrency (MCC) support

iwlwifi
 - support set_antenna() operation
 - support frame injection antenna control

ath12k
 - WCN7850: enable 320 MHz channels in 6 GHz band
 - WCN7850: hardware rfkill support
 - WCN7850: enable IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS to make scan faster

ath11k
 - add chip id board name while searching board-2.bin

* tag 'wireless-next-2023-10-06' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (272 commits)
  wifi: rtlwifi: remove unreachable code in rtl92d_dm_check_edca_turbo()
  wifi: rtw89: debug: txpwr table supports Wi-Fi 7 chips
  wifi: rtw89: debug: show txpwr table according to chip gen
  wifi: rtw89: phy: set TX power RU limit according to chip gen
  wifi: rtw89: phy: set TX power limit according to chip gen
  wifi: rtw89: phy: set TX power offset according to chip gen
  wifi: rtw89: phy: set TX power by rate according to chip gen
  wifi: rtw89: mac: get TX power control register according to chip gen
  wifi: rtlwifi: use unsigned long for rtl_bssid_entry timestamp
  wifi: rtlwifi: fix EDCA limit set by BT coexistence
  wifi: rt2x00: fix MT7620 low RSSI issue
  wifi: rtw89: refine bandwidth 160MHz uplink OFDMA performance
  wifi: rtw89: refine uplink trigger based control mechanism
  wifi: rtw89: 8851b: update TX power tables to R34
  wifi: rtw89: 8852b: update TX power tables to R35
  wifi: rtw89: 8852c: update TX power tables to R67
  wifi: rtw89: regd: configure Thailand in regulation type
  wifi: mac80211: add back SPDX identifier
  wifi: mac80211: fix ieee80211_drop_unencrypted_mgmt return type/value
  wifi: rtlwifi: cleanup few rtlxxxx_set_hw_reg() routines
  ...

====================

Link: https://lore.kernel.org/r/87jzrz6bvw.fsf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agonet: phy: broadcom: add support for BCM5221 phy
Giulio Benetti [Thu, 5 Oct 2023 18:29:15 +0000 (20:29 +0200)]
net: phy: broadcom: add support for BCM5221 phy

This patch adds the BCM5221 PHY support by reusing brcm_fet_*()
callbacks and adding quirks for BCM5221 when needed.

Cc: Jim Reinhart <jimr@tekvox.com>
Cc: James Autry <jautry@tekvox.com>
Cc: Matthew Maron <matthewm@tekvox.com>
Signed-off-by: Giulio Benetti <giulio.benetti+tekvox@benettiengineering.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20231005182915.153815-1-giulio.benetti@benettiengineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agoMerge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next...
Jakub Kicinski [Fri, 6 Oct 2023 22:59:26 +0000 (15:59 -0700)]
Merge branch '40GbE' of git://git./linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
i40e: House-keeping and clean-up

Ivan Vecera says:

The series makes some house-keeping tasks on i40e driver:

Patch 1: Removes unnecessary back pointer from i40e_hw
Patch 2: Moves I40E_MASK macro to i40e_register.h where is used
Patch 3: Refactors I40E_MDIO_CLAUSE* to use the common macro
Patch 4: Add header dependencies to <linux/avf/virtchnl.h>
Patch 5: Simplifies memory alloction functions
Patch 6: Moves mem alloc structures to i40e_alloc.h
Patch 7: Splits i40e_osdep.h to i40e_debug.h and i40e_io.h
Patch 8: Removes circular header deps, fixes and cleans headers
Patch 9: Moves DDP specific macros and structs to i40e_ddp.c

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  i40e: Move DDP specific macros and structures to i40e_ddp.c
  i40e: Remove circular header dependencies and fix headers
  i40e: Split i40e_osdep.h
  i40e: Move memory allocation structures to i40e_alloc.h
  i40e: Simplify memory allocation functions
  virtchnl: Add header dependencies
  i40e: Refactor I40E_MDIO_CLAUSE* macros
  i40e: Move I40E_MASK macro to i40e_register.h
  i40e: Remove back pointer from i40e_hw structure
====================

Link: https://lore.kernel.org/r/20231005162850.3218594-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agonet: atheros: replace deprecated strncpy with strscpy
Justin Stitt [Thu, 5 Oct 2023 01:29:45 +0000 (01:29 +0000)]
net: atheros: replace deprecated strncpy with strscpy

`strncpy` is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.

We expect netdev->name to be NUL-terminated based on its use with format
strings and dev_info():
|     dev_info(&adapter->pdev->dev,
|             "%s link is up %d Mbps %s\n",
|             netdev->name, adapter->link_speed,
|             adapter->link_duplex == FULL_DUPLEX ?
|             "full duplex" : "half duplex");

Furthermore, NUL-padding is not required as netdev is already
zero-initialized through alloc_etherdev().

Considering the above, a suitable replacement is `strscpy` [2] due to
the fact that it guarantees NUL-termination on the destination buffer
without unnecessarily NUL-padding.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html
Link: https://github.com/KSPP/linux/issues/90
Signed-off-by: Justin Stitt <justinstitt@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20231005-strncpy-drivers-net-ethernet-atheros-atlx-atl2-c-v1-1-493f113ebfc7@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agonet: ax88796c: replace deprecated strncpy with strscpy
Justin Stitt [Thu, 5 Oct 2023 01:06:26 +0000 (01:06 +0000)]
net: ax88796c: replace deprecated strncpy with strscpy

`strncpy` is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.

A suitable replacement is `strscpy` [2] due to the fact that it
guarantees NUL-termination on the destination buffer without
unnecessarily NUL-padding.

It should be noted that there doesn't currently exist a bug here as
DRV_NAME is a small string literal which means no overread bugs are
present.

Also to note, other ethernet drivers are using strscpy in a similar
pattern:
|       dec/tulip/tulip_core.c
|       861:    strscpy(info->driver, DRV_NAME, sizeof(info->driver));
|
|       8390/ax88796.c
|       582:    strscpy(info->driver, DRV_NAME, sizeof(info->driver));
|
|       dec/tulip/dmfe.c
|       1077:   strscpy(info->driver, DRV_NAME, sizeof(info->driver));
|
|       8390/etherh.c
|       558:    strscpy(info->driver, DRV_NAME, sizeof(info->driver));

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html
Link: https://github.com/KSPP/linux/issues/90
Signed-off-by: Justin Stitt <justinstitt@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Lukasz Stelmach <l.stelmach@samsung.com>
Link: https://lore.kernel.org/r/20231005-strncpy-drivers-net-ethernet-asix-ax88796c_ioctl-c-v1-1-6fafdc38b170@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agonet: ixp4xx_eth: Support changing the MTU
Linus Walleij [Wed, 4 Oct 2023 22:43:53 +0000 (00:43 +0200)]
net: ixp4xx_eth: Support changing the MTU

As we don't specify the MTU in the driver, the framework
will fall back to 1500 bytes and this doesn't work very
well when we try to attach a DSA switch:

  eth1: mtu greater than device maximum
  ixp4xx_eth c800a000.ethernet eth1: error -22 setting
  MTU to 1504 to include DSA overhead

After locating an out-of-tree patch in OpenWrt I found
suitable code to set the MTU on the interface and ported
it and updated it. Now the MTU gets set properly.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20231005-ixp4xx-eth-mtu-v4-1-08c66ed0bc69@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agoMAINTAINERS: Update LL TEMAC entry to Orphan
Harini Katakam [Thu, 5 Oct 2023 13:10:39 +0000 (18:40 +0530)]
MAINTAINERS: Update LL TEMAC entry to Orphan

Since there's no alternate driver, change this entry from obsolete
to orphan.

Signed-off-by: Harini Katakam <harini.katakam@amd.com>
Link: https://lore.kernel.org/r/20231005131039.25881-1-harini.katakam@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agoMerge tag 'linux-can-next-for-6.7-20231005' of git://git.kernel.org/pub/scm/linux...
Jakub Kicinski [Fri, 6 Oct 2023 22:42:12 +0000 (15:42 -0700)]
Merge tag 'linux-can-next-for-6.7-20231005' of git://git./linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2023-10-05

The first patch is by Miquel Raynal and fixes a comment in the sja1000
driver.

Vincent Mailhol contributes 2 patches that fix W=1 compiler warnings
in the etas_es58x driver.

Jiapeng Chong's patch removes an unneeded NULL pointer check before
dev_put() in the CAN raw protocol.

A patch by Justin Stittreplaces a strncpy() by strscpy() in the
peak_pci sja1000 driver.

The next 5 patches are by me and fix the can_restart() handler and
replace BUG_ON()s in the CAN dev helpers with proper error handling.

The last 27 patches are also by me and target the at91_can driver.
First a new helper function is introduced, the at91_can driver is
cleaned up and updated to use the rx-offload helper.

* tag 'linux-can-next-for-6.7-20231005' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next: (37 commits)
  can: at91_can: switch to rx-offload implementation
  can: at91_can: at91_alloc_can_err_skb() introduce new function
  can: at91_can: at91_irq_err_line(): send error counters with state change
  can: at91_can: at91_irq_err_line(): make use of can_change_state() and can_bus_off()
  can: at91_can: at91_irq_err_line(): take reg_sr into account for bus off
  can: at91_can: at91_irq_err_line(): make use of can_state_get_by_berr_counter()
  can: at91_can: at91_irq_err(): rename to at91_irq_err_line()
  can: at91_can: at91_irq_err_frame(): move next to at91_irq_err()
  can: at91_can: at91_irq_err_frame(): call directly from IRQ handler
  can: at91_can: at91_poll_err(): increase stats even if no quota left or OOM
  can: at91_can: at91_poll_err(): fold in at91_poll_err_frame()
  can: at91_can: add CAN transceiver support
  can: at91_can: at91_open(): forward request_irq()'s return value in case or an error
  can: at91_can: at91_chip_start(): don't disable IRQs twice
  can: at91_can: at91_set_bittiming(): demote register output to debug level
  can: at91_can: rename struct at91_priv::{tx_next,tx_echo} to {tx_head,tx_tail}
  can: at91_can: at91_setup_mailboxes(): update comments
  can: at91_can: add more register definitions
  can: at91_can: MCR Register: convert to FIELD_PREP()
  can: at91_can: MSR Register: convert to FIELD_PREP()
  ...
====================

Link: https://lore.kernel.org/r/20231005195812.549776-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agoMerge wireless into wireless-next
Johannes Berg [Thu, 5 Oct 2023 20:57:34 +0000 (22:57 +0200)]
Merge wireless into wireless-next

Resolve several conflicts, mostly between changes/fixes in
wireless and the locking rework in wireless-next. One of
the conflicts actually shows a bug in wireless that we'll
want to fix separately.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
7 months agonet: phy: dp83867: Add support for hardware blinking LEDs
Sascha Hauer [Wed, 4 Oct 2023 08:40:26 +0000 (10:40 +0200)]
net: phy: dp83867: Add support for hardware blinking LEDs

This implements the led_hw_* hooks to support hardware blinking LEDs on
the DP83867 phy. The driver supports all LED modes that have a
corresponding TRIGGER_NETDEV_* define. Error and collision do not have
a TRIGGER_NETDEV_* define, so these modes are currently not supported.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com> #TQMa8MxML/MBa8Mx
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 months agoflow_offload: Annotate struct flow_action_entry with __counted_by
Kees Cook [Tue, 3 Oct 2023 23:18:33 +0000 (16:18 -0700)]
flow_offload: Annotate struct flow_action_entry with __counted_by

Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for
array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct flow_action_entry.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: netdev@vger.kernel.org
Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 months agonet/packet: Annotate struct packet_fanout with __counted_by
Kees Cook [Tue, 3 Oct 2023 23:17:41 +0000 (16:17 -0700)]
net/packet: Annotate struct packet_fanout with __counted_by

Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for
array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct packet_fanout.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Anqi Shen <amy.saq@antgroup.com>
Cc: netdev@vger.kernel.org
Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 months agotools: ynl-gen: use uapi header name for the header guard
Jakub Kicinski [Tue, 3 Oct 2023 22:57:35 +0000 (15:57 -0700)]
tools: ynl-gen: use uapi header name for the header guard

Chuck points out that we should use the uapi-header property
when generating the guard. Otherwise we may generate the same
guard as another file in the tree.

Tested-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 months agoMerge branch 'mlxsw-ACL-region'
David S. Miller [Fri, 6 Oct 2023 10:08:07 +0000 (11:08 +0100)]
Merge branch 'mlxsw-ACL-region'

Petr Machata says:

====================
mlxsw: Control the order of blocks in ACL region

Amit Cohen writes:

For 12 key blocks in the A-TCAM, rules are split into two records, which
constitute two lookups. The two records are linked using a
"large entry key ID".

Due to a Spectrum-4 hardware issue, KVD entries that correspond to key
blocks 0 to 5 of 12 key blocks will be placed in the same KVD pipe if they
only differ in their "large entry key ID", as it is ignored. This results
in a reduced scale, we can insert less than 20k filters and get an error:

    $ tc -b flower.batch
    RTNETLINK answers: Input/output error
    We have an error talking to the kernel

To reduce the probability of this issue, we can place key blocks with
high entropy in blocks 0 to 5. The idea is to place blocks that are often
changed in blocks 0 to 5, for example, key blocks that match on IPv4
addresses or the LSBs of IPv6 addresses. Such placement will reduce the
probability of these blocks to be same.

Mark several blocks with 'high_entropy' flag and place them in blocks 0
to 5. Note that the list of the blocks is just a suggestion, I will verify
it with architects.

Currently, there is a one loop that chooses which blocks should be used
for a given list of elements and fills the blocks - when a block is
chosen, it fills it in the region. To be able to control the order of
the blocks, separate between searching blocks and filling them. Several
pre-changes are required.

Patch set overview:
Patch #1 marks several blocks with 'high_entropy' flag.
Patches #2-#4 prepare the code for filling blocks at the end of the search.
Patch #5 changes the loop to just choose the blocks and fill the blocks at
the end.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 months agomlxsw: core_acl_flex_keys: Fill blocks with high entropy first
Amit Cohen [Tue, 3 Oct 2023 11:25:30 +0000 (13:25 +0200)]
mlxsw: core_acl_flex_keys: Fill blocks with high entropy first

The previous patches prepared the code to allow separating between
choosing blocks and filling blocks.

Do not add blocks as part of the loop that chooses them. When all the
required blocks are set in the bitmap 'chosen_blocks_bm', start filling
blocks. Iterate over the bitmap twice - first add only blocks that are
marked with 'high_entropy' flag. Then, fill the rest of the blocks.

The idea is to place key blocks with high entropy in blocks 0 to 5. See
more details in previous patches.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 months agomlxsw: core_acl_flex_keys: Save chosen elements in all blocks per search
Amit Cohen [Tue, 3 Oct 2023 11:25:29 +0000 (13:25 +0200)]
mlxsw: core_acl_flex_keys: Save chosen elements in all blocks per search

Currently, mlxsw_afk_picker() chooses which blocks will be used for a
given list of elements, and fills the blocks during the searching - when a
key block is found with most hits, it adds it and removes the elements from
the count of hits. This should be changed as we want to be able to choose
which blocks will be placed in blocks 0 to 5.

To separate between choosing blocks and filling blocks, several pre-changes
are required. Currently, the indication of whether all elements were
found in the chosen blocks is by the structure 'key_info->elusage'. This
structure is updated when block is filled as part of
mlxsw_afk_picker_key_info_add(). A following patch will call this
function only after choosing all the blocks. Add a bitmap called
'elusage_chosen' to store which elements were chosen in the chosen blocks.
Change the condition in the loop to check elements that were chosen, not
elements that were already filled in the blocks.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 months agomlxsw: core_acl_flex_keys: Save chosen elements per block
Amit Cohen [Tue, 3 Oct 2023 11:25:28 +0000 (13:25 +0200)]
mlxsw: core_acl_flex_keys: Save chosen elements per block

Currently, mlxsw_afk_picker() chooses which blocks will be used for a
given list of elements, and fills the blocks during the searching - when a
key block is found with most hits, it adds it and removes the elements from
the count of hits. This should be changed as we want to be able to choose
which blocks will be placed in blocks 0 to 5.

To separate between choosing blocks and filling blocks, several pre-changes
are required. During the search, the structure 'mlxsw_afk_picker' is
used per block, it contains how many elements from the required list appear
in the block. When a block is chosen and filled, this bitmap of elements is
cleaned. To be able to fill the blocks at the end, add a bitmap called
'chosen_element' as part of picker. When a block is chosen, copy the
'element' bitmap to it. Use the new bitmap as part of
mlxsw_afk_picker_key_info_add(). So later, when filling the block will
be done at the end of the searching, we will use the copied bitmap that
contains the elements that should be used in the block.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 months agomlxsw: core_acl_flex_keys: Add a bitmap to save which blocks are chosen
Amit Cohen [Tue, 3 Oct 2023 11:25:27 +0000 (13:25 +0200)]
mlxsw: core_acl_flex_keys: Add a bitmap to save which blocks are chosen

Currently, mlxsw_afk_picker() chooses which blocks will be used for a
given list of elements, and fills the blocks during the searching - when a
key block is found with most hits, it adds it and removes the elements from
the count of hits. This should be changed as we want to be able to choose
which blocks will be placed in blocks 0 to 5.

To separate between choosing blocks and filling blocks, several pre-changes
are required. The indexes of the chosen blocks should be saved, so then
the relevant blocks will be filled at the end of search.

Allocate a bitmap for chosen blocks, when a block is found with most
hits, set the relevant bit in the bitmap. This bitmap will be used in a
following patch.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 months agomlxsw: Mark high entropy key blocks
Amit Cohen [Tue, 3 Oct 2023 11:25:26 +0000 (13:25 +0200)]
mlxsw: Mark high entropy key blocks

For 12 key blocks in the A-TCAM, rules are split into two records, which
constitute two lookups. The two records are linked using a
"large entry key ID".

Due to a Spectrum-4 hardware issue, KVD entries that correspond to key
blocks 0 to 5 of 12 key blocks A-TCAM entries will be placed in the same
KVD pipe if they only differ in their "large entry key ID", as it is
ignored. This results in a reduced scale. To reduce the probability of this
issue, we can place key blocks with high entropy in blocks 0 to 5. The idea
is to place blocks that are changed often in blocks 0 to 5, for
example, key blocks that match on IPv4 addresses or the LSBs of IPv6
addresses. Such placement will reduce the probability of these blocks to be
same.

Mark several blocks with 'high_entropy' flag, so later we will take into
account this flag and place them in blocks 0 to 5.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 months agoMerge branch 'sfc-conntrack-offloads'
David S. Miller [Fri, 6 Oct 2023 10:05:45 +0000 (11:05 +0100)]
Merge branch 'sfc-conntrack-offloads'

Edward Cree says:

====================
sfc: conntrack offload for tunnels

This series adds support for offloading TC flower rules which require
both connection tracking and tunnel decapsulation.  Depending on the
match keys required, the left-hand-side rule may go in either the
Outer Rule table or the Action Rule table.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 months agosfc: support TC rules which require OR-AR-CT-AR flow
Edward Cree [Mon, 2 Oct 2023 15:44:44 +0000 (16:44 +0100)]
sfc: support TC rules which require OR-AR-CT-AR flow

When a foreign LHS rule (TC rule from a tunnel netdev which requests
 conntrack lookup) matches on inner headers or enc_key_id, these matches
 cannot be performed by the Outer Rule table, as the keys are only
 available after the tunnel type has been identified (by the OR lookup)
 and the rest of the headers parsed accordingly.
Offload such rules with an Action Rule, using the LOOKUP_CONTROL section
 of the AR response to specify the conntrack and/or recirculation actions,
 combined with an Outer Rule which performs only the usual Encap Match
 duties.
This processing flow, as it requires two AR lookups per packet, is less
 performant than OR-CT-AR, so only use it where necessary.

Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 months agosfc: ensure an extack msg from efx_tc_flower_replace_foreign EOPNOTSUPPs
Edward Cree [Mon, 2 Oct 2023 15:44:43 +0000 (16:44 +0100)]
sfc: ensure an extack msg from efx_tc_flower_replace_foreign EOPNOTSUPPs

There were a few places where no extack error message was set, or the
 extack was not forwarded to callees, potentially resulting in a return
 of -EOPNOTSUPP with no additional information.
Make sure to populate the error message in these cases.  In practice
 this does us no good as TC indirect block callbacks don't come with an
 extack to fill in; but maybe they will someday and when debugging it's
 possible to provide a fake extack and emit its message to the console.

Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 months agosfc: offload foreign RHS rules without an encap match
Edward Cree [Mon, 2 Oct 2023 15:44:42 +0000 (16:44 +0100)]
sfc: offload foreign RHS rules without an encap match

Normally, if a TC filter on a tunnel netdev does not match on any
 encap fields, we decline to offload it, as it cannot meet our
 requirement for a <sip,dip,dport> tuple for the encap match.
However, if the rule has a nonzero chain_index, then for a packet to
 reach the rule, it must already have matched a LHS rule which will
 have included an encap match and determined the tunnel type, so in
 that case we can offload the right-hand-side rule.

Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 months agosfc: support TC left-hand-side rules on foreign netdevs
Edward Cree [Mon, 2 Oct 2023 15:44:41 +0000 (16:44 +0100)]
sfc: support TC left-hand-side rules on foreign netdevs

Allow a tunnel netdevice (such as a vxlan) to offload conntrack lookups,
 in much the same way as efx netdevs.
To ensure this rule does not overlap with other tunnel rules on the same
 sip,dip,dport tuple, register a pseudo encap match of a new type
 (EFX_TC_EM_PSEUDO_OR), which unlike PSEUDO_MASK may only be referenced
 once (because an actual Outer Rule in hardware exists, although its
 fw_id is not recorded in the encap match entry).

Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 months agonexthop: Annotate struct nh_group with __counted_by
Kees Cook [Wed, 4 Oct 2023 01:44:49 +0000 (18:44 -0700)]
nexthop: Annotate struct nh_group with __counted_by

Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for
array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct nh_group.

Cc: David Ahern <dsahern@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: netdev@vger.kernel.org
Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 months agonexthop: Annotate struct nh_notifier_grp_info with __counted_by
Kees Cook [Tue, 3 Oct 2023 23:21:47 +0000 (16:21 -0700)]
nexthop: Annotate struct nh_notifier_grp_info with __counted_by

Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for
array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct nh_notifier_grp_info.

Cc: David Ahern <dsahern@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Tom Rix <trix@redhat.com>
Cc: netdev@vger.kernel.org
Cc: llvm@lists.linux.dev
Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 months agonetlink: Annotate struct netlink_policy_dump_state with __counted_by
Kees Cook [Tue, 3 Oct 2023 23:21:02 +0000 (16:21 -0700)]
netlink: Annotate struct netlink_policy_dump_state with __counted_by

Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for
array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct netlink_policy_dump_state.

Additionally update the size of the usage array length before accessing
it. This requires remembering the old size for the memset() and later
assignments.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Johannes Berg <johannes.berg@intel.com>
Cc: netdev@vger.kernel.org
Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 months agonfp: nsp: Annotate struct nfp_eth_table with __counted_by
Kees Cook [Tue, 3 Oct 2023 23:18:51 +0000 (16:18 -0700)]
nfp: nsp: Annotate struct nfp_eth_table with __counted_by

Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for
array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct nfp_eth_table.

Cc: Simon Horman <simon.horman@corigine.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Yinjun Zhang <yinjun.zhang@corigine.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Yu Xiao <yu.xiao@corigine.com>
Cc: Sixiang Chen <sixiang.chen@corigine.com>
Cc: oss-drivers@corigine.com
Cc: netdev@vger.kernel.org
Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Acked-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 months agonfp: Annotate struct nfp_reprs with __counted_by
Kees Cook [Tue, 3 Oct 2023 23:18:43 +0000 (16:18 -0700)]
nfp: Annotate struct nfp_reprs with __counted_by

Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for
array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct nfp_reprs.

Cc: Simon Horman <simon.horman@corigine.com>
Cc: oss-drivers@corigine.com
Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Acked-by: Louis Peens <louis.peens@corigine.com>
Link: https://lore.kernel.org/r/20231003231843.work.811-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agonetem: Annotate struct disttable with __counted_by
Kees Cook [Tue, 3 Oct 2023 23:18:23 +0000 (16:18 -0700)]
netem: Annotate struct disttable with __counted_by

Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for
array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct disttable.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Link: https://lore.kernel.org/r/20231003231823.work.684-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agonexthop: Annotate struct nh_notifier_res_table_info with __counted_by
Kees Cook [Tue, 3 Oct 2023 23:18:18 +0000 (16:18 -0700)]
nexthop: Annotate struct nh_notifier_res_table_info with __counted_by

Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for
array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct
nh_notifier_res_table_info.

Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Tom Rix <trix@redhat.com>
Cc: llvm@lists.linux.dev
Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20231003231818.work.883-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agonexthop: Annotate struct nh_res_table with __counted_by
Kees Cook [Tue, 3 Oct 2023 23:18:13 +0000 (16:18 -0700)]
nexthop: Annotate struct nh_res_table with __counted_by

Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for
array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct nh_res_table.

Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20231003231813.work.042-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agoMerge branch 'rework-tx-fault-fixups'
Jakub Kicinski [Fri, 6 Oct 2023 01:05:07 +0000 (18:05 -0700)]
Merge branch 'rework-tx-fault-fixups'

Russell King says:

====================
Rework tx fault fixups

This series reworks the tx-fault fixup and then improves the Nokia GPON
workaround to also ignore the RX LOS signal as well. We do this by
introducing a mask of hardware pin states that should be ignored,
converting the tx-fault fixup to use that, and then augmenting it for
RX LOS.
====================

Link: https://lore.kernel.org/r/ZRwYJXRizvkhm83M@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agonet: sfp: improve Nokia GPON sfp fixup
Russell King (Oracle) [Tue, 3 Oct 2023 13:34:29 +0000 (14:34 +0100)]
net: sfp: improve Nokia GPON sfp fixup

Improve the Nokia GPON fixup - we need to ignore not only the hardware
LOS signal, but also the software implementation as well. Do this by
using the new state_ignore_mask to indicate that we should ignore not
only the hardware RX_LOS signal, and also clear the LOS bits in the
option field.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Tested-by: Christian Marangi <ansuelsmth@gmail.com>
Link: https://lore.kernel.org/r/E1qnfXh-008UDe-F9@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agonet: sfp: re-implement ignoring the hardware TX_FAULT signal
Russell King (Oracle) [Tue, 3 Oct 2023 13:34:24 +0000 (14:34 +0100)]
net: sfp: re-implement ignoring the hardware TX_FAULT signal

Re-implement how we ignore the hardware TX_FAULT signal. Rather than
having a separate boolean for this, use a bitmask of the hardware
signals that we wish to ignore. This gives more flexibility in the
future to ignore other signals such as RX_LOS.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Tested-by: Christian Marangi <ansuelsmth@gmail.com>
Link: https://lore.kernel.org/r/E1qnfXc-008UDY-91@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agonet: cpmac: remove driver to prepare for platform removal
Wolfram Sang [Fri, 22 Sep 2023 06:15:26 +0000 (08:15 +0200)]
net: cpmac: remove driver to prepare for platform removal

AR7 is going to be removed from the Kernel, so remove its networking
support in form of the cpmac driver. This allows us to remove the
platform because this driver includes a platform specific header.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/all/20230922061530.3121-6-wsa+renesas@sang-engineering.com/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Thu, 5 Oct 2023 20:16:31 +0000 (13:16 -0700)]
Merge git://git./linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR.

No conflicts (or adjacent changes of note).

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agoMerge patch series "can: at91: add can_state_get_by_berr_counter() helper, cleanup...
Marc Kleine-Budde [Thu, 5 Oct 2023 19:48:09 +0000 (21:48 +0200)]
Merge patch series "can: at91: add can_state_get_by_berr_counter() helper, cleanup and convert to rx_offload"

Marc Kleine-Budde <mkl@pengutronix.de> says:

This series first introduces the can_state_get_by_berr_counter()
helper function. It returns the current TX and RX state depending on
the provided CAN bit error counters. It will be later used by the
at91_can driver.

The remaining patches of this series first clean up the at91_can
driver, clean up the bus- and line error (including bus-off) handling,
and then convert it use the rx_offload helper. The driver works better
under high system load and the order of received CAN frames is better
maintained.

Due to a hardware limitation the converted driver could trigger a race
condition in the can_restart() CAN bus-off handler. The patch series
[1] fixes the issue.

[1] https://lore.kernel.org/all/20231005-can-dev-fix-can-restart-v2-0-91b5c1fd922c@pengutronix.de

Changes in v2:
- 1/27: can_state_err_to_state(): use symbolic error values instead of
  plain numbers (Thanks Vincent)
- 27/27: fix patch description and typos (Thanks Vincent)
- Link to v1: https://lore.kernel.org/all/20231004-at91_can-rx_offload-v1-0-c32bf99097db@pengutronix.de

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-0-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: switch to rx-offload implementation
Marc Kleine-Budde [Sun, 10 May 2015 15:25:14 +0000 (17:25 +0200)]
can: at91_can: switch to rx-offload implementation

The current at91_can driver uses NAPI to handle RX'ed CAN frames, the
RX IRQ is disabled and a NAPI poll is scheduled. Then in
at91_poll_rx() the RX'ed CAN frames are tried to read in order from
the device.

This approach has 2 drawbacks:

- Under high system load it might take too long from the initial RX
  IRQ to the NAPI poll function to run. This causes RX buffer
  overflows.
- The algorithm to read the CAN frames in order is not bullet proof
  and may fail under certain use cases/system loads.

The rx-offload helper fixes these problems by reading the RX'ed CAN
frames in the interrupt handler and adding it to a list sorted by RX
timestamp. This list of RX'ed SKBs is then passed to the networking
stack via NAPI.

Convert the RX path to rx-offload, pass all CAN error frames with
can_rx_offload_queue_timestamp().

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-27-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: at91_alloc_can_err_skb() introduce new function
Marc Kleine-Budde [Thu, 28 Sep 2023 09:15:15 +0000 (11:15 +0200)]
can: at91_can: at91_alloc_can_err_skb() introduce new function

This is a preparation patch to convert the driver to make use of the
rx-offload helper. With rx-offload the received CAN frames are sorted
by their timestamp. Regular CAN RX'ed and TX'ed CAN frames are
timestamped by the hardware. Error events are not.

Introduce a new function at91_alloc_can_err_skb() the allocates an
error SKB and reads the current timestamp from the controller.

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-26-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: at91_irq_err_line(): send error counters with state change
Marc Kleine-Budde [Thu, 28 Sep 2023 08:05:17 +0000 (10:05 +0200)]
can: at91_can: at91_irq_err_line(): send error counters with state change

Since 3e5c291c7942 ("can: add CAN_ERR_CNT flag to notify availability
of error counter") there is a dedicated flag to inform the user space,
that there are CAN error counters in the CAN error frame.

In case the device is not in bus off mode, send the error counters to
user space and set CAN_ERR_CNT.

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-25-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: at91_irq_err_line(): make use of can_change_state() and can_bus_off()
Marc Kleine-Budde [Mon, 1 May 2023 16:14:41 +0000 (18:14 +0200)]
can: at91_can: at91_irq_err_line(): make use of can_change_state() and can_bus_off()

The driver implements a hand crafted CAN state handling. Update the
driver to make use of can_change_state(), introduced in ("can: dev:
Consolidate and unify state change handling")

Also switch from hand crafted CAN bus off handling to can_bus_off():
In case of a bus off, abort all pending TX requests, switch off the
device and let can_bus_off() handle the device restart.

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-24-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: at91_irq_err_line(): take reg_sr into account for bus off
Marc Kleine-Budde [Mon, 1 May 2023 16:14:41 +0000 (18:14 +0200)]
can: at91_can: at91_irq_err_line(): take reg_sr into account for bus off

The at91 CAN controller automatically recovers from bus-off after 128
occurrences of 11 consecutive recessive bits.

After an auto-recovered bus-off, the error counters no longer reflect
this fact. On the sam9263 the state bits in the SR register show the
current state (based on the current error counters), while on sam9x5
and newer SoCs these bits are latched.

Take any latched bus-off information from the SR register into account
when calculating the CAN new state, to start the standard CAN bus off
handling.

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-23-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: at91_irq_err_line(): make use of can_state_get_by_berr_counter()
Marc Kleine-Budde [Mon, 1 May 2023 16:14:41 +0000 (18:14 +0200)]
can: at91_can: at91_irq_err_line(): make use of can_state_get_by_berr_counter()

On the sam9263 the SR bits for bus off, error passive, warning limit,
and error active are not latched and reflect the current status of the
controller. On the sam9x5 and newer SoCs these bits are latched.

To simplify the code, use can_state_get_by_berr_counter() to get the
state of the controller regardless of the SoC version.

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-22-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: at91_irq_err(): rename to at91_irq_err_line()
Marc Kleine-Budde [Mon, 1 May 2023 16:14:41 +0000 (18:14 +0200)]
can: at91_can: at91_irq_err(): rename to at91_irq_err_line()

This is a cleanup patch, no functional change intended.

The function at91_irq_err() only handles the CAN line errors, so
rename it accordingly to at91_irq_err_line().

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-21-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: at91_irq_err_frame(): move next to at91_irq_err()
Marc Kleine-Budde [Thu, 28 Sep 2023 09:15:15 +0000 (11:15 +0200)]
can: at91_can: at91_irq_err_frame(): move next to at91_irq_err()

This is a cleanup patch, no functional change intended. As
at91_irq_err_frame() is called from the IRQ handler move it in front
of the IRQ handler next to at91_irq_err().

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-20-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: at91_irq_err_frame(): call directly from IRQ handler
Marc Kleine-Budde [Mon, 1 May 2023 16:14:41 +0000 (18:14 +0200)]
can: at91_can: at91_irq_err_frame(): call directly from IRQ handler

This is a preparation patch to convert the driver to the rx-offload
helper. In rx-offload RX, TX-done and CAN error handling are done in
the IRQ handler, SKB are pushed to the network stack in the NAPI poll
function.

Move the CAN frame error handling from the NAPI function at91_poll()
to the IRQ handler at91_poll(). To reflect this change, rename
at91_poll_err() to at91_irq_err_frame().

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-19-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: at91_poll_err(): increase stats even if no quota left or OOM
Marc Kleine-Budde [Mon, 1 May 2023 16:14:41 +0000 (18:14 +0200)]
can: at91_can: at91_poll_err(): increase stats even if no quota left or OOM

at91_poll_err() allocates a can error SKB, to inform the user space
about the CAN error. Then it fills the SKB with information the error
information and increases the net device error stats.

In case no SBK can be allocated (e.g. due to an OOM) or the NAPI quota
is 0 the function is left early and no stats are updated. This is not
helpful to the user, as there is no information about the faulty CAN
bus.

Increase the error stats even if no quota is left or no SKB can be
allocated.

While there treat No-Acknowledgment as a bus error, too.

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-18-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: at91_poll_err(): fold in at91_poll_err_frame()
Marc Kleine-Budde [Mon, 1 May 2023 16:14:41 +0000 (18:14 +0200)]
can: at91_can: at91_poll_err(): fold in at91_poll_err_frame()

This is a preparation patch for the cleanup of at91_poll_err(). Fold
at91_poll_err_frame() into at91_poll_err() so that it can be easier
modified.

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-17-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: add CAN transceiver support
Marc Kleine-Budde [Mon, 1 May 2023 16:14:41 +0000 (18:14 +0200)]
can: at91_can: add CAN transceiver support

Add support for Linux-PHY based CAN transceivers.

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-16-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: at91_open(): forward request_irq()'s return value in case or an error
Marc Kleine-Budde [Mon, 1 May 2023 16:14:41 +0000 (18:14 +0200)]
can: at91_can: at91_open(): forward request_irq()'s return value in case or an error

If request_irq() fails, forward the return value.

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-15-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: at91_chip_start(): don't disable IRQs twice
Marc Kleine-Budde [Mon, 1 May 2023 16:14:41 +0000 (18:14 +0200)]
can: at91_can: at91_chip_start(): don't disable IRQs twice

In at91_chip_start() first all IRQs are disabled, they do not have to
be disabled again at the end of the function before the requested IRQs
are enabled.

Remove the 2nd disable of all IRQs at the end of the function.

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-14-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: at91_set_bittiming(): demote register output to debug level
Marc Kleine-Budde [Tue, 18 Apr 2023 14:36:30 +0000 (16:36 +0200)]
can: at91_can: at91_set_bittiming(): demote register output to debug level

This message isn't really helpful for the general reader of the kernel
logs, so should not be printed with info level.

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-13-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: rename struct at91_priv::{tx_next,tx_echo} to {tx_head,tx_tail}
Marc Kleine-Budde [Fri, 21 Apr 2023 16:15:15 +0000 (18:15 +0200)]
can: at91_can: rename struct at91_priv::{tx_next,tx_echo} to {tx_head,tx_tail}

To increase code readability, use the same naming of the counters for
the TX FIFO as in the other drivers implementing the same algorithm.

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-12-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: at91_setup_mailboxes(): update comments
Marc Kleine-Budde [Thu, 28 Sep 2023 20:02:16 +0000 (22:02 +0200)]
can: at91_can: at91_setup_mailboxes(): update comments

Since 6388b3961420 ("can: at91_can: add support for the AT91SAM9X5
SOCs") the number of mailboxes used for RX and TX is no longer
constant, but depends on the IP core used.

Remove the fixed number of mailboxes from the comment.

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-11-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: add more register definitions
Marc Kleine-Budde [Sun, 10 May 2015 15:25:14 +0000 (17:25 +0200)]
can: at91_can: add more register definitions

Add more register definitions found in the data sheet.

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-10-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: MCR Register: convert to FIELD_PREP()
Marc Kleine-Budde [Tue, 18 Apr 2023 14:35:54 +0000 (16:35 +0200)]
can: at91_can: MCR Register: convert to FIELD_PREP()

Use FIELD_PREP() to access the individual fields of the MCR register.

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-9-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: MSR Register: convert to FIELD_PREP()
Marc Kleine-Budde [Tue, 18 Apr 2023 14:35:54 +0000 (16:35 +0200)]
can: at91_can: MSR Register: convert to FIELD_PREP()

Use FIELD_PREP() to access the individual fields of the MSR register.

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-8-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: MID registers: convert access to FIELD_PREP(), FIELD_GET()
Marc Kleine-Budde [Tue, 18 Apr 2023 14:35:54 +0000 (16:35 +0200)]
can: at91_can: MID registers: convert access to FIELD_PREP(), FIELD_GET()

Use FIELD_PREP() and FIELD_GET() to access the individual fields of
the MID register.

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-7-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: MMR registers: convert to FIELD_PREP()
Marc Kleine-Budde [Tue, 18 Apr 2023 14:35:54 +0000 (16:35 +0200)]
can: at91_can: MMR registers: convert to FIELD_PREP()

Use FIELD_PREP() to access the individual fields of the MMR register.

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-6-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: ECR register: convert to FIELD_GET()
Marc Kleine-Budde [Tue, 18 Apr 2023 14:35:54 +0000 (16:35 +0200)]
can: at91_can: ECR register: convert to FIELD_GET()

Use FIELD_GET() to access the individual fields of the ECR register.

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-5-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: BR register: convert to FIELD_PREP()
Marc Kleine-Budde [Tue, 18 Apr 2023 14:35:54 +0000 (16:35 +0200)]
can: at91_can: BR register: convert to FIELD_PREP()

Use FIELD_PREP() to access the individual fields of the BR register.

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-4-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: at91_irq_tx(): remove one level of indention
Marc Kleine-Budde [Sun, 23 Apr 2023 11:47:40 +0000 (13:47 +0200)]
can: at91_can: at91_irq_tx(): remove one level of indention

Improve code readability by removing one level of indention.

If a mailbox is not ready, continue the loop early.

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-3-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: at91_can: use a consistent indention
Marc Kleine-Budde [Tue, 18 Apr 2023 14:26:52 +0000 (16:26 +0200)]
can: at91_can: use a consistent indention

Convert the driver to use a consistent indention of one space after
defines and in enums. That makes it easier to add new defines, which
will be done in the coming patches.

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-2-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: dev: add can_state_get_by_berr_counter() to return the CAN state based on the...
Marc Kleine-Budde [Thu, 28 Sep 2023 07:24:28 +0000 (09:24 +0200)]
can: dev: add can_state_get_by_berr_counter() to return the CAN state based on the current error counters

Some CAN controllers do not have a register that contains the current
CAN state, but only a register that contains the error counters.

Introduce a new function can_state_get_by_berr_counter() that returns
the current TX and RX state depending on the provided CAN bit error
counters.

Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-1-9987d53600e0@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agoMerge patch series "can: dev: fix can_restart() and replace BUG_ON() by error handling"
Marc Kleine-Budde [Thu, 5 Oct 2023 19:34:37 +0000 (21:34 +0200)]
Merge patch series "can: dev: fix can_restart() and replace BUG_ON() by error handling"

Marc Kleine-Budde <mkl@pengutronix.de> says:

There are 2 BUG_ON() in the CAN dev helpers. During the update/test of
the at91_can driver to rx-offload the one in can_restart() was
triggered, due to a race condition in can_restart() and a hardware
limitation of the at91_can IP core.

This series fixes the race condition, replaces BUG_ON() with an error
message, and does some cleanup. Finally, the BUG_ON() in
can_put_echo_skb() is also replaced with error handling.

Changes in v2:
- 4/5: move "Restarted" debug message and stats after successful restart (Thanks Vincent)
- Link to v1: https://lore.kernel.org/all/20231004-can-dev-fix-can-restart-v1-0-2e52899eaaf5@pengutronix.de

Link: https://lore.kernel.org/all/20231005-can-dev-fix-can-restart-v2-0-91b5c1fd922c@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: dev: can_put_echo_skb(): don't crash kernel if can_priv::echo_skb is accessed...
Marc Kleine-Budde [Fri, 29 Sep 2023 08:23:47 +0000 (10:23 +0200)]
can: dev: can_put_echo_skb(): don't crash kernel if can_priv::echo_skb is accessed out of bounds

If the "struct can_priv::echoo_skb" is accessed out of bounds, this
would cause a kernel crash. Instead, issue a meaningful warning
message and return with an error.

Fixes: a6e4bc530403 ("can: make the number of echo skb's configurable")
Link: https://lore.kernel.org/all/20231005-can-dev-fix-can-restart-v2-5-91b5c1fd922c@pengutronix.de
Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: dev: can_restart(): move debug message and stats after successful restart
Marc Kleine-Budde [Fri, 29 Sep 2023 08:18:02 +0000 (10:18 +0200)]
can: dev: can_restart(): move debug message and stats after successful restart

Move the debug message "restarted" and the CAN restart stats_after_
the successful restart of the CAN device, because the restart may
fail.

While there update the error message from printing the error number to
printing symbolic error names.

Link: https://lore.kernel.org/all/20231005-can-dev-fix-can-restart-v2-4-91b5c1fd922c@pengutronix.de
Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
[mkl: mention stats in subject and description, too]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: dev: can_restart(): reverse logic to remove need for goto
Marc Kleine-Budde [Fri, 29 Sep 2023 07:47:38 +0000 (09:47 +0200)]
can: dev: can_restart(): reverse logic to remove need for goto

Reverse the logic in the if statement and eliminate the need for a
goto to simplify code readability.

Link: https://lore.kernel.org/all/20231005-can-dev-fix-can-restart-v2-3-91b5c1fd922c@pengutronix.de
Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: dev: can_restart(): fix race condition between controller restart and netif_carr...
Marc Kleine-Budde [Fri, 29 Sep 2023 08:25:11 +0000 (10:25 +0200)]
can: dev: can_restart(): fix race condition between controller restart and netif_carrier_on()

This race condition was discovered while updating the at91_can driver
to use can_bus_off(). The following scenario describes how the
converted at91_can driver would behave.

When a CAN device goes into BUS-OFF state, the driver usually
stops/resets the CAN device and calls can_bus_off().

This function sets the netif carrier to off, and (if configured by
user space) schedules a delayed work that calls can_restart() to
restart the CAN device.

The can_restart() function first checks if the carrier is off and
triggers an error message if the carrier is OK.

Then it calls the driver's do_set_mode() function to restart the
device, then it sets the netif carrier to on. There is a race window
between these two calls.

The at91 CAN controller (observed on the sama5d3, a single core 32 bit
ARM CPU) has a hardware limitation. If the device goes into bus-off
while sending a CAN frame, there is no way to abort the sending of
this frame. After the controller is enabled again, another attempt is
made to send it.

If the bus is still faulty, the device immediately goes back to the
bus-off state. The driver calls can_bus_off(), the netif carrier is
switched off and another can_restart is scheduled. This occurs within
the race window before the original can_restart() handler marks the
netif carrier as OK. This would cause the 2nd can_restart() to be
called with an OK netif carrier, resulting in an error message.

The flow of the 1st can_restart() looks like this:

can_restart()
    // bail out if netif_carrier is OK

    netif_carrier_ok(dev)
    priv->do_set_mode(dev, CAN_MODE_START)
        // enable CAN controller
        // sama5d3 restarts sending old message

        // CAN devices goes into BUS_OFF, triggers IRQ

// IRQ handler start
    at91_irq()
        at91_irq_err_line()
            can_bus_off()
                netif_carrier_off()
                schedule_delayed_work()
// IRQ handler end

    netif_carrier_on()

The 2nd can_restart() will be called with an OK netif carrier and the
error message will be printed.

To close the race window, first set the netif carrier to on, then
restart the controller. In case the restart fails with an error code,
roll back the netif carrier to off.

Fixes: 39549eef3587 ("can: CAN Network device driver and Netlink interface")
Link: https://lore.kernel.org/all/20231005-can-dev-fix-can-restart-v2-2-91b5c1fd922c@pengutronix.de
Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agocan: dev: can_restart(): don't crash kernel if carrier is OK
Marc Kleine-Budde [Thu, 28 Sep 2023 19:58:23 +0000 (21:58 +0200)]
can: dev: can_restart(): don't crash kernel if carrier is OK

During testing, I triggered a can_restart() with the netif carrier
being OK [1]. The BUG_ON, which checks if the carrier is OK, results
in a fatal kernel crash. This is neither helpful for debugging nor for
a production system.

[1] The root cause is a race condition in can_restart() which will be
fixed in the next patch.

Do not crash the kernel, issue an error message instead, and continue
restarting the CAN device anyway.

Fixes: 39549eef3587 ("can: CAN Network device driver and Netlink interface")
Link: https://lore.kernel.org/all/20231005-can-dev-fix-can-restart-v2-1-91b5c1fd922c@pengutronix.de
Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
7 months agoMerge tag 'net-6.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 5 Oct 2023 18:29:21 +0000 (11:29 -0700)]
Merge tag 'net-6.6-rc5' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from Bluetooth, netfilter, BPF and WiFi.

  I didn't collect precise data but feels like we've got a lot of 6.5
  fixes here. WiFi fixes are most user-awaited.

  Current release - regressions:

   - Bluetooth: fix hci_link_tx_to RCU lock usage

  Current release - new code bugs:

   - bpf: mprog: fix maximum program check on mprog attachment

   - eth: ti: icssg-prueth: fix signedness bug in prueth_init_tx_chns()

  Previous releases - regressions:

   - ipv6: tcp: add a missing nf_reset_ct() in 3WHS handling

   - vringh: don't use vringh_kiov_advance() in vringh_iov_xfer(), it
     doesn't handle zero length like we expected

   - wifi:
      - cfg80211: fix cqm_config access race, fix crashes with brcmfmac
      - iwlwifi: mvm: handle PS changes in vif_cfg_changed
      - mac80211: fix mesh id corruption on 32 bit systems
      - mt76: mt76x02: fix MT76x0 external LNA gain handling

   - Bluetooth: fix handling of HCI_QUIRK_STRICT_DUPLICATE_FILTER

   - l2tp: fix handling of transhdrlen in __ip{,6}_append_data()

   - dsa: mv88e6xxx: avoid EEPROM timeout when EEPROM is absent

   - eth: stmmac: fix the incorrect parameter after refactoring

  Previous releases - always broken:

   - net: replace calls to sock->ops->connect() with kernel_connect(),
     prevent address rewrite in kernel_bind(); otherwise BPF hooks may
     modify arguments, unexpectedly to the caller

   - tcp: fix delayed ACKs when reads and writes align with MSS

   - bpf:
      - verifier: unconditionally reset backtrack_state masks on global
        func exit
      - s390: let arch_prepare_bpf_trampoline return program size, fix
        struct_ops offsets
      - sockmap: fix accounting of available bytes in presence of PEEKs
      - sockmap: reject sk_msg egress redirects to non-TCP sockets

   - ipv4/fib: send netlink notify when delete source address routes

   - ethtool: plca: fix width of reads when parsing netlink commands

   - netfilter: nft_payload: rebuild vlan header on h_proto access

   - Bluetooth: hci_codec: fix leaking memory of local_codecs

   - eth: intel: ice: always add legacy 32byte RXDID in supported_rxdids

   - eth: stmmac:
     - dwmac-stm32: fix resume on STM32 MCU
     - remove buggy and unneeded stmmac_poll_controller, depend on NAPI

   - ibmveth: always recompute TCP pseudo-header checksum, fix use of
     the driver with Open vSwitch

   - wifi:
      - rtw88: rtw8723d: fix MAC address offset in EEPROM
      - mt76: fix lock dependency problem for wed_lock
      - mwifiex: sanity check data reported by the device
      - iwlwifi: ensure ack flag is properly cleared
      - iwlwifi: mvm: fix a memory corruption due to bad pointer arithm
      - iwlwifi: mvm: fix incorrect usage of scan API

  Misc:

   - wifi: mac80211: work around Cisco AP 9115 VHT MPDU length"

* tag 'net-6.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (99 commits)
  MAINTAINERS: update Matthieu's email address
  mptcp: userspace pm allow creating id 0 subflow
  mptcp: fix delegated action races
  net: stmmac: remove unneeded stmmac_poll_controller
  net: lan743x: also select PHYLIB
  net: ethernet: mediatek: disable irq before schedule napi
  net: mana: Fix oversized sge0 for GSO packets
  net: mana: Fix the tso_bytes calculation
  net: mana: Fix TX CQE error handling
  netlink: annotate data-races around sk->sk_err
  sctp: update hb timer immediately after users change hb_interval
  sctp: update transport state when processing a dupcook packet
  tcp: fix delayed ACKs for MSS boundary condition
  tcp: fix quick-ack counting to count actual ACKs of new data
  page_pool: fix documentation typos
  tipc: fix a potential deadlock on &tx->lock
  net: stmmac: dwmac-stm32: fix resume on STM32 MCU
  ipv4: Set offload_failed flag in fibmatch results
  netfilter: nf_tables: nft_set_rbtree: fix spurious insertion failure
  netfilter: nf_tables: Deduplicate nft_register_obj audit logs
  ...

7 months agoMerge tag 'integrity-v6.6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar...
Linus Torvalds [Thu, 5 Oct 2023 18:12:33 +0000 (11:12 -0700)]
Merge tag 'integrity-v6.6-fix' of git://git./linux/kernel/git/zohar/linux-integrity

Pull integrity fixes from Mimi Zohar:
 "Two additional patches to fix the removal of the deprecated
  IMA_TRUSTED_KEYRING Kconfig"

* tag 'integrity-v6.6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  ima: rework CONFIG_IMA dependency block
  ima: Finish deprecation of IMA_TRUSTED_KEYRING Kconfig

7 months agoMerge tag 'leds-fixes-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds
Linus Torvalds [Thu, 5 Oct 2023 18:07:03 +0000 (11:07 -0700)]
Merge tag 'leds-fixes-6.6' of git://git./linux/kernel/git/lee/leds

Pull LED fix from Lee Jones:
 "Just the one bug-fix:

   - Fix regression affecting LED_COLOR_ID_MULTI users"

* tag 'leds-fixes-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds:
  leds: Drop BUG_ON check for LED_COLOR_ID_MULTI

7 months agoMerge tag 'mfd-fixes-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Linus Torvalds [Thu, 5 Oct 2023 18:03:20 +0000 (11:03 -0700)]
Merge tag 'mfd-fixes-6.6' of git://git./linux/kernel/git/lee/mfd

Pull MFD fixes from Lee Jones:
 "A couple of small fixes:

   - Potential build failure in CS42L43

   - Device Tree bindings clean-up for a superseded patch"

* tag 'mfd-fixes-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
  dt-bindings: mfd: Revert "dt-bindings: mfd: maxim,max77693: Add USB connector"
  mfd: cs42l43: Fix MFD_CS42L43 dependency on REGMAP_IRQ

7 months agoMerge tag 'ovl-fixes-6.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/overla...
Linus Torvalds [Thu, 5 Oct 2023 17:56:18 +0000 (10:56 -0700)]
Merge tag 'ovl-fixes-6.6-rc5' of git://git./linux/kernel/git/overlayfs/vfs

Pull overlayfs fixes from Amir Goldstein:

 - Fix for file reference leak regression

 - Fix for NULL pointer deref regression

 - Fixes for RCU-walk race regressions:

   Two of the fixes were taken from Al's RCU pathwalk race fixes series
   with his consent [1].

   Note that unlike most of Al's series, these two patches are not about
   racing with ->kill_sb() and they are also very recent regressions
   from v6.5, so I think it's worth getting them into v6.5.y.

   There is also a fix for an RCU pathwalk race with ->kill_sb(), which
   may have been solved in vfs generic code as you suggested, but it
   also rids overlayfs from a nasty hack, so I think it's worth anyway.

Link: https://lore.kernel.org/linux-fsdevel/20231003204749.GA800259@ZenIV/
* tag 'ovl-fixes-6.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs:
  ovl: fix NULL pointer defer when encoding non-decodable lower fid
  ovl: make use of ->layers safe in rcu pathwalk
  ovl: fetch inode once in ovl_dentry_revalidate_common()
  ovl: move freeing ovl_entry past rcu delay
  ovl: fix file reference leak when submitting aio

7 months agoMerge branch 'mptcp-fixes-and-maintainer-email-update-for-v6-6'
Jakub Kicinski [Thu, 5 Oct 2023 16:34:34 +0000 (09:34 -0700)]
Merge branch 'mptcp-fixes-and-maintainer-email-update-for-v6-6'

Mat Martineau says:

====================
mptcp: Fixes and maintainer email update for v6.6

Patch 1 addresses a race condition in MPTCP "delegated actions"
infrastructure. Affects v5.19 and later.

Patch 2 removes an unnecessary restriction that did not allow additional
outgoing subflows using the local address of the initial MPTCP subflow.
v5.16 and later.

Patch 3 updates Matthieu's email address.
====================

Link: https://lore.kernel.org/r/20231004-send-net-20231004-v1-0-28de4ac663ae@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agoMAINTAINERS: update Matthieu's email address
Matthieu Baerts [Wed, 4 Oct 2023 20:38:13 +0000 (13:38 -0700)]
MAINTAINERS: update Matthieu's email address

Use my kernel.org account instead.

The other one will bounce by the end of the year.

Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231004-send-net-20231004-v1-3-28de4ac663ae@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agomptcp: userspace pm allow creating id 0 subflow
Geliang Tang [Wed, 4 Oct 2023 20:38:12 +0000 (13:38 -0700)]
mptcp: userspace pm allow creating id 0 subflow

This patch drops id 0 limitation in mptcp_nl_cmd_sf_create() to allow
creating additional subflows with the local addr ID 0.

There is no reason not to allow additional subflows from this local
address: we should be able to create new subflows from the initial
endpoint. This limitation was breaking fullmesh support from userspace.

Fixes: 702c2f646d42 ("mptcp: netlink: allow userspace-driven subflow establishment")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/391
Cc: stable@vger.kernel.org
Suggested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231004-send-net-20231004-v1-2-28de4ac663ae@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agomptcp: fix delegated action races
Paolo Abeni [Wed, 4 Oct 2023 20:38:11 +0000 (13:38 -0700)]
mptcp: fix delegated action races

The delegated action infrastructure is prone to the following
race: different CPUs can try to schedule different delegated
actions on the same subflow at the same time.

Each of them will check different bits via mptcp_subflow_delegate(),
and will try to schedule the action on the related per-cpu napi
instance.

Depending on the timing, both can observe an empty delegated list
node, causing the same entry to be added simultaneously on two different
lists.

The root cause is that the delegated actions infra does not provide
a single synchronization point. Address the issue reserving an additional
bit to mark the subflow as scheduled for delegation. Acquiring such bit
guarantee the caller to own the delegated list node, and being able to
safely schedule the subflow.

Clear such bit only when the subflow scheduling is completed, ensuring
proper barrier in place.

Additionally swap the meaning of the delegated_action bitmask, to allow
the usage of the existing helper to set multiple bit at once.

Fixes: bcd97734318d ("mptcp: use delegate action to schedule 3rd ack retrans")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231004-send-net-20231004-v1-1-28de4ac663ae@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agoi40e: Move DDP specific macros and structures to i40e_ddp.c
Ivan Vecera [Wed, 27 Sep 2023 08:31:35 +0000 (10:31 +0200)]
i40e: Move DDP specific macros and structures to i40e_ddp.c

Move several DDP related macros and structures from i40e.h header
to i40e_ddp.c where are privately used. Make static i40e_ddp_load()
function that is also used only in i40e_ddp and move declaration of
i40e_ddp_flash() used by i40e_ethtool.c to i40e_prototype.h

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
7 months agoi40e: Remove circular header dependencies and fix headers
Ivan Vecera [Wed, 27 Sep 2023 08:31:34 +0000 (10:31 +0200)]
i40e: Remove circular header dependencies and fix headers

Similarly as for ice driver [1] there are also circular header
dependencies in i40e driver:
i40e.h -> i40e_virtchnl_pf.h -> i40e.h

Another issue is that i40e header files does not contain their own
dependencies on other header files (both private and standard) so their
inclusion in .c file require to add these deps in certain order to
that .c file to make it compilable.

Fix both issues by removal the mentioned circular dependency, by filling
i40e headers with their dependencies so they can be placed anywhere in
a source code. Additionally remove bunch of includes from i40e.h super
header file that are not necessary and include i40e.h only in .c files
that really require it.

[1] 649c87c6ff52 ("ice: remove circular header dependencies on ice.h")

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
7 months agonet: stmmac: remove unneeded stmmac_poll_controller
Remi Pommarel [Wed, 4 Oct 2023 14:33:56 +0000 (16:33 +0200)]
net: stmmac: remove unneeded stmmac_poll_controller

Using netconsole netpoll_poll_dev could be called from interrupt
context, thus using disable_irq() would cause the following kernel
warning with CONFIG_DEBUG_ATOMIC_SLEEP enabled:

  BUG: sleeping function called from invalid context at kernel/irq/manage.c:137
  in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 10, name: ksoftirqd/0
  CPU: 0 PID: 10 Comm: ksoftirqd/0 Tainted: G        W         5.15.42-00075-g816b502b2298-dirty #117
  Hardware name: aml (r1) (DT)
  Call trace:
   dump_backtrace+0x0/0x270
   show_stack+0x14/0x20
   dump_stack_lvl+0x8c/0xac
   dump_stack+0x18/0x30
   ___might_sleep+0x150/0x194
   __might_sleep+0x64/0xbc
   synchronize_irq+0x8c/0x150
   disable_irq+0x2c/0x40
   stmmac_poll_controller+0x140/0x1a0
   netpoll_poll_dev+0x6c/0x220
   netpoll_send_skb+0x308/0x390
   netpoll_send_udp+0x418/0x760
   write_msg+0x118/0x140 [netconsole]
   console_unlock+0x404/0x500
   vprintk_emit+0x118/0x250
   dev_vprintk_emit+0x19c/0x1cc
   dev_printk_emit+0x90/0xa8
   __dev_printk+0x78/0x9c
   _dev_warn+0xa4/0xbc
   ath10k_warn+0xe8/0xf0 [ath10k_core]
   ath10k_htt_txrx_compl_task+0x790/0x7fc [ath10k_core]
   ath10k_pci_napi_poll+0x98/0x1f4 [ath10k_pci]
   __napi_poll+0x58/0x1f4
   net_rx_action+0x504/0x590
   _stext+0x1b8/0x418
   run_ksoftirqd+0x74/0xa4
   smpboot_thread_fn+0x210/0x3c0
   kthread+0x1fc/0x210
   ret_from_fork+0x10/0x20

Since [0] .ndo_poll_controller is only needed if driver doesn't or
partially use NAPI. Because stmmac does so, stmmac_poll_controller
can be removed fixing the above warning.

[0] commit ac3d9dd034e5 ("netpoll: make ndo_poll_controller() optional")

Cc: <stable@vger.kernel.org> # 5.15.x
Fixes: 47dd7a540b8a ("net: add support for STMicroelectronics Ethernet controllers")
Signed-off-by: Remi Pommarel <repk@triplefau.lt>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/1c156a6d8c9170bd6a17825f2277115525b4d50f.1696429960.git.repk@triplefau.lt
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agoi40e: Split i40e_osdep.h
Ivan Vecera [Wed, 27 Sep 2023 08:31:33 +0000 (10:31 +0200)]
i40e: Split i40e_osdep.h

Header i40e_osdep.h contains only IO primitives and couple of debug
printing macros. Split this header file to i40e_io.h and i40e_debug.h
and move i40e_debug_mask enum to i40e_debug.h

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
7 months agoi40e: Move memory allocation structures to i40e_alloc.h
Ivan Vecera [Wed, 27 Sep 2023 08:31:32 +0000 (10:31 +0200)]
i40e: Move memory allocation structures to i40e_alloc.h

Structures i40e_dma_mem & i40e_virt_mem are defined i40e_osdep.h while
memory allocation functions that use them are declared in i40e_alloc.h
Move them there.

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
7 months agoi40e: Simplify memory allocation functions
Ivan Vecera [Wed, 27 Sep 2023 08:31:31 +0000 (10:31 +0200)]
i40e: Simplify memory allocation functions

Enum i40e_memory_type enum is unused in i40e_allocate_dma_mem() thus
can be safely removed. Useless macros in i40e_alloc.h can be removed
as well.

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
7 months agovirtchnl: Add header dependencies
Ivan Vecera [Wed, 27 Sep 2023 08:31:30 +0000 (10:31 +0200)]
virtchnl: Add header dependencies

The <linux/avf/virtchnl.h> uses BIT, struct_size and ETH_ALEN macros
but does not include appropriate header files that defines them.
Add these dependencies so this header file can be included anywhere.

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
7 months agoi40e: Refactor I40E_MDIO_CLAUSE* macros
Ivan Vecera [Wed, 27 Sep 2023 08:31:29 +0000 (10:31 +0200)]
i40e: Refactor I40E_MDIO_CLAUSE* macros

The macros I40E_MDIO_CLAUSE22* and I40E_MDIO_CLAUSE45* are using I40E_MASK
together with the same values I40E_GLGEN_MSCA_STCODE_SHIFT and
I40E_GLGEN_MSCA_OPCODE_SHIFT to define masks.
Introduce I40E_GLGEN_MSCA_OPCODE_MASK and I40E_GLGEN_MSCA_STCODE_MASK
for both shifts in i40e_register.h and use them to refactor the macros
mentioned above.

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
7 months agoi40e: Move I40E_MASK macro to i40e_register.h
Ivan Vecera [Wed, 27 Sep 2023 08:31:28 +0000 (10:31 +0200)]
i40e: Move I40E_MASK macro to i40e_register.h

The macro is practically used only in i40e_register.h header file except
few I40E_MDIO_CLAUSE* macros that are defined in i40e_type.h
Move I40E_MASK macro to i40e_register.h header, I40E_MDIO_CLAUSE* macros
are refactored in subsequent patch.

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
7 months agoi40e: Remove back pointer from i40e_hw structure
Ivan Vecera [Wed, 27 Sep 2023 08:31:27 +0000 (10:31 +0200)]
i40e: Remove back pointer from i40e_hw structure

The .back field placed in i40e_hw is used to get pointer to i40e_pf
instance but it is not necessary as the i40e_hw is a part of i40e_pf
and containerof macro can be used to obtain the pointer to i40e_pf.
Remove .back field from i40e_hw structure, introduce i40e_hw_to_pf()
and i40e_hw_to_dev() helpers and use them.

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
7 months agonet: lan743x: also select PHYLIB
Randy Dunlap [Mon, 2 Oct 2023 19:35:44 +0000 (12:35 -0700)]
net: lan743x: also select PHYLIB

Since FIXED_PHY depends on PHYLIB, PHYLIB needs to be set to avoid
a kconfig warning:

WARNING: unmet direct dependencies detected for FIXED_PHY
  Depends on [n]: NETDEVICES [=y] && PHYLIB [=n]
  Selected by [y]:
  - LAN743X [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_MICROCHIP [=y] && PCI [=y] && PTP_1588_CLOCK_OPTIONAL [=y]

Fixes: 73c4d1b307ae ("net: lan743x: select FIXED_PHY")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Closes: lore.kernel.org/r/202309261802.JPbRHwti-lkp@intel.com
Cc: Bryan Whitehead <bryan.whitehead@microchip.com>
Cc: UNGLinuxDriver@microchip.com
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Simon Horman <horms@kernel.org> # build-tested
Link: https://lore.kernel.org/r/20231002193544.14529-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
7 months agonet: ethernet: mediatek: disable irq before schedule napi
Christian Marangi [Mon, 2 Oct 2023 14:08:05 +0000 (16:08 +0200)]
net: ethernet: mediatek: disable irq before schedule napi

While searching for possible refactor of napi_schedule_prep and
__napi_schedule it was notice that the mtk eth driver disable the
interrupt for rx and tx AFTER napi is scheduled.

While this is a very hard to repro case it might happen to have
situation where the interrupt is disabled and never enabled again as the
napi completes and the interrupt is enabled before.

This is caused by the fact that a napi driven by interrupt expect a
logic with:
1. interrupt received. napi prepared -> interrupt disabled -> napi
   scheduled
2. napi triggered. ring cleared -> interrupt enabled -> wait for new
   interrupt

To prevent this case, disable the interrupt BEFORE the napi is
scheduled.

Fixes: 656e705243fd ("net-next: mediatek: add support for MT7623 ethernet")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Link: https://lore.kernel.org/r/20231002140805.568-1-ansuelsmth@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
7 months agonet_sched: sch_fq: add TCA_FQ_WEIGHTS attribute
Eric Dumazet [Mon, 2 Oct 2023 13:17:38 +0000 (13:17 +0000)]
net_sched: sch_fq: add TCA_FQ_WEIGHTS attribute

This attribute can be used to tune the per band weight
and report them in "tc qdisc show" output:

qdisc fq 802f: parent 1:9 limit 100000p flow_limit 500p buckets 1024 orphan_mask 1023
 quantum 8364b initial_quantum 41820b low_rate_threshold 550Kbit
 refill_delay 40ms timer_slack 10us horizon 10s horizon_drop
 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 weights 589824 196608 65536
 Sent 236460814 bytes 792991 pkt (dropped 0, overlimits 0 requeues 0)
 rate 25816bit 10pps backlog 0b 0p requeues 0
  flows 4 (inactive 4 throttled 0)
  gc 0 throttled 19 latency 17.6us fastpath 773882

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
7 months agonet_sched: sch_fq: add 3 bands and WRR scheduling
Eric Dumazet [Mon, 2 Oct 2023 13:17:37 +0000 (13:17 +0000)]
net_sched: sch_fq: add 3 bands and WRR scheduling

Before Google adopted FQ for its production servers,
we had to ensure AF4 packets would get a higher share
than BE1 ones.

As discussed this week in Netconf 2023 in Paris, it is time
to upstream this for public use.

After this patch FQ can replace pfifo_fast, with the following
differences :

- FQ uses WRR instead of strict prio, to avoid starvation of
  low priority packets.

- We make sure each band/prio tracks its own usage against sch->limit.
  This was done to make sure flood of low priority packets would not
  prevent AF4 packets to be queued. Contributed by Willem.

- priomap can be changed, if needed (default value are the ones
  coming from pfifo_fast).

In this patch, we set default band weights so that :

- high prio (band=0) packets get 90% of the bandwidth
  if they compete with low prio (band=2) packets.

- high prio packets get 75% of the bandwidth
  if they compete with medium prio (band=1) packets.

Following patch in this series adds the possibility to tune
the per-band weights.

As we added many fields in 'struct fq_sched_data', we had
to make sure to have the first cache line read-mostly, and
avoid wasting precious cache lines.

More optimizations are possible but will be sent separately.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
7 months agonet_sched: export pfifo_fast prio2band[]
Eric Dumazet [Mon, 2 Oct 2023 13:17:36 +0000 (13:17 +0000)]
net_sched: export pfifo_fast prio2band[]

pfifo_fast prio2band[] is renamed to sch_default_prio2band[]
and exported because we want to share it in FQ.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
7 months agonet_sched: sch_fq: remove q->ktime_cache
Eric Dumazet [Mon, 2 Oct 2023 13:17:35 +0000 (13:17 +0000)]
net_sched: sch_fq: remove q->ktime_cache

Now that both enqueue() and dequeue() need to use ktime_get_ns(),
there is no point wasting 8 bytes in struct fq_sched_data.

This makes room for future fields. ;)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
7 months agoMerge branch 'net-mana-fix-some-tx-processing-bugs'
Paolo Abeni [Thu, 5 Oct 2023 09:45:08 +0000 (11:45 +0200)]
Merge branch 'net-mana-fix-some-tx-processing-bugs'

Haiyang Zhang says:

====================
net: mana: Fix some TX processing bugs

Fix TX processing bugs on error handling, tso_bytes calculation,
and sge0 size.
====================

Link: https://lore.kernel.org/r/1696020147-14989-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
7 months agonet: mana: Fix oversized sge0 for GSO packets
Haiyang Zhang [Fri, 29 Sep 2023 20:42:27 +0000 (13:42 -0700)]
net: mana: Fix oversized sge0 for GSO packets

Handle the case when GSO SKB linear length is too large.

MANA NIC requires GSO packets to put only the header part to SGE0,
otherwise the TX queue may stop at the HW level.

So, use 2 SGEs for the skb linear part which contains more than the
packet header.

Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
7 months agonet: mana: Fix the tso_bytes calculation
Haiyang Zhang [Fri, 29 Sep 2023 20:42:26 +0000 (13:42 -0700)]
net: mana: Fix the tso_bytes calculation

sizeof(struct hop_jumbo_hdr) is not part of tso_bytes, so remove
the subtraction from header size.

Cc: stable@vger.kernel.org
Fixes: bd7fc6e1957c ("net: mana: Add new MANA VF performance counters for easier troubleshooting")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
7 months agonet: mana: Fix TX CQE error handling
Haiyang Zhang [Fri, 29 Sep 2023 20:42:25 +0000 (13:42 -0700)]
net: mana: Fix TX CQE error handling

For an unknown TX CQE error type (probably from a newer hardware),
still free the SKB, update the queue tail, etc., otherwise the
accounting will be wrong.

Also, TX errors can be triggered by injecting corrupted packets, so
replace the WARN_ONCE to ratelimited error logging.

Cc: stable@vger.kernel.org
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>