sfrench/cifs-2.6.git
5 months agoHID: nintendo: add support for nso controllers
Ryan McClelland [Mon, 4 Dec 2023 08:17:21 +0000 (00:17 -0800)]
HID: nintendo: add support for nso controllers

This adds support for the nintendo switch online controllers which
include the SNES, Genesis, and N64 Controllers.

As each nso controller only implements a subset of what a pro
controller can do. Each of these 'features' were broken up in to
seperate functions which include right stick, left stick, imu, and
dpad and depending on the controller type that it is, it will call
the supported functions appropriately.

Each controller now has a struct which maps the bit within the hid
in report to a button.

The name given to the device now comes directly from the hid
device name rather than looking up a predefined string.

Signed-off-by: Ryan McClelland <rymcclel@gmail.com>
Reviewed-by: Daniel J. Ogorchock <djogorchock@gmail.com>
Tested-by: Daniel J. Ogorchock <djogorchock@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
5 months agoMerge tag 'for-linus-2023112301' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 24 Nov 2023 01:31:53 +0000 (17:31 -0800)]
Merge tag 'for-linus-2023112301' of git://git./linux/kernel/git/hid/hid

Pull HID fixes from Jiri Kosina:

 - revert of commit that caused regression to many Logitech unifying
   receiver users (Jiri Kosina)

 - power management fix for hid-mcp2221 (Hamish Martin)

 - fix for race condition between HID core and HID debug (Charles Yi)

 - a couple of assorted device-ID-specific quirks

* tag 'for-linus-2023112301' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: multitouch: Add quirk for HONOR GLO-GXXX touchpad
  HID: hid-asus: reset the backlight brightness level on resume
  HID: hid-asus: add const to read-only outgoing usb buffer
  Revert "HID: logitech-dj: Add support for a new lightspeed receiver iteration"
  HID: add ALWAYS_POLL quirk for Apple kb
  HID: glorious: fix Glorious Model I HID report
  HID: fix HID device resource race between HID core and debugging support
  HID: apple: add Jamesdonkey and A3R to non-apple keyboards list
  HID: mcp2221: Allow IO to start during probe
  HID: mcp2221: Set driver data before I2C adapter add

5 months agoMerge tag 'net-6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 23 Nov 2023 18:40:13 +0000 (10:40 -0800)]
Merge tag 'net-6.7-rc3' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf.

  Current release - regressions:

   - Revert "net: r8169: Disable multicast filter for RTL8168H and
     RTL8107E"

   - kselftest: rtnetlink: fix ip route command typo

  Current release - new code bugs:

   - s390/ism: make sure ism driver implies smc protocol in kconfig

   - two build fixes for tools/net

  Previous releases - regressions:

   - rxrpc: couple of ACK/PING/RTT handling fixes

  Previous releases - always broken:

   - bpf: verify bpf_loop() callbacks as if they are called unknown
     number of times

   - improve stability of auto-bonding with Hyper-V

   - account BPF-neigh-redirected traffic in interface statistics

  Misc:

   - net: fill in some more MODULE_DESCRIPTION()s"

* tag 'net-6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (58 commits)
  tools: ynl: fix duplicate op name in devlink
  tools: ynl: fix header path for nfsd
  net: ipa: fix one GSI register field width
  tls: fix NULL deref on tls_sw_splice_eof() with empty record
  net: axienet: Fix check for partial TX checksum
  vsock/test: fix SEQPACKET message bounds test
  i40e: Fix adding unsupported cloud filters
  ice: restore timestamp configuration after device reset
  ice: unify logic for programming PFINT_TSYN_MSK
  ice: remove ptp_tx ring parameter flag
  amd-xgbe: propagate the correct speed and duplex status
  amd-xgbe: handle the corner-case during tx completion
  amd-xgbe: handle corner-case during sfp hotplug
  net: veth: fix ethtool stats reporting
  octeontx2-pf: Fix ntuple rule creation to direct packet to VF with higher Rx queue than its PF
  net: usb: qmi_wwan: claim interface 4 for ZTE MF290
  Revert "net: r8169: Disable multicast filter for RTL8168H and RTL8107E"
  net/smc: avoid data corruption caused by decline
  nfc: virtual_ncidev: Add variable to check if ndev is running
  dpll: Fix potential msg memleak when genlmsg_put_reply failed
  ...

5 months agotools: ynl: fix duplicate op name in devlink
Jakub Kicinski [Thu, 23 Nov 2023 03:05:58 +0000 (19:05 -0800)]
tools: ynl: fix duplicate op name in devlink

We don't support CRUD-inspired message types in YNL too well.
One aspect that currently trips us up is the fact that single
message ID can be used in multiple commands (as the response).
This leads to duplicate entries in the id-to-string tables:

devlink-user.c:19:34: warning: initialized field overwritten [-Woverride-init]
   19 |         [DEVLINK_CMD_PORT_NEW] = "port-new",
      |                                  ^~~~~~~~~~
devlink-user.c:19:34: note: (near initialization for ‘devlink_op_strmap[7]’)

Fixes tag points at where the code was generated, the "real" problem
is that the code generator does not support CRUD.

Fixes: f2f9dd164db0 ("netlink: specs: devlink: add the remaining command to generate complete split_ops")
Link: https://lore.kernel.org/r/20231123030558.1611831-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 months agotools: ynl: fix header path for nfsd
Jakub Kicinski [Thu, 23 Nov 2023 03:06:24 +0000 (19:06 -0800)]
tools: ynl: fix header path for nfsd

The makefile dependency is trying to include the wrong header:

<command-line>: fatal error: ../../../../include/uapi//linux/nfsd.h: No such file or directory

The guard also looks wrong.

Fixes: f14122b2c2ac ("tools: ynl: Add source files for nfsd netlink protocol")
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/20231123030624.1611925-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 months agonet: ipa: fix one GSI register field width
Alex Elder [Wed, 22 Nov 2023 23:17:08 +0000 (17:17 -0600)]
net: ipa: fix one GSI register field width

The width of the R_LENGTH field of the EV_CH_E_CNTXT_1 GSI register
is 24 bits (not 20 bits) starting with IPA v5.0.  Fix this.

Fixes: faf0678ec8a0 ("net: ipa: add IPA v5.0 GSI register definitions")
Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20231122231708.896632-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 months agotls: fix NULL deref on tls_sw_splice_eof() with empty record
Jann Horn [Wed, 22 Nov 2023 21:44:47 +0000 (22:44 +0100)]
tls: fix NULL deref on tls_sw_splice_eof() with empty record

syzkaller discovered that if tls_sw_splice_eof() is executed as part of
sendfile() when the plaintext/ciphertext sk_msg are empty, the send path
gets confused because the empty ciphertext buffer does not have enough
space for the encryption overhead. This causes tls_push_record() to go on
the `split = true` path (which is only supposed to be used when interacting
with an attached BPF program), and then get further confused and hit the
tls_merge_open_record() path, which then assumes that there must be at
least one populated buffer element, leading to a NULL deref.

It is possible to have empty plaintext/ciphertext buffers if we previously
bailed from tls_sw_sendmsg_locked() via the tls_trim_both_msgs() path.
tls_sw_push_pending_record() already handles this case correctly; let's do
the same check in tls_sw_splice_eof().

Fixes: df720d288dbb ("tls/sw: Use splice_eof() to flush")
Cc: stable@vger.kernel.org
Reported-by: syzbot+40d43509a099ea756317@syzkaller.appspotmail.com
Signed-off-by: Jann Horn <jannh@google.com>
Link: https://lore.kernel.org/r/20231122214447.675768-1-jannh@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 months agonet: axienet: Fix check for partial TX checksum
Samuel Holland [Wed, 22 Nov 2023 00:42:17 +0000 (16:42 -0800)]
net: axienet: Fix check for partial TX checksum

Due to a typo, the code checked the RX checksum feature in the TX path.

Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Link: https://lore.kernel.org/r/20231122004219.3504219-1-samuel.holland@sifive.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 months agovsock/test: fix SEQPACKET message bounds test
Arseniy Krasnov [Tue, 21 Nov 2023 21:16:42 +0000 (00:16 +0300)]
vsock/test: fix SEQPACKET message bounds test

Tune message length calculation to make this test work on machines
where 'getpagesize()' returns >32KB. Now maximum message length is not
hardcoded (on machines above it was smaller than 'getpagesize()' return
value, thus we get negative value and test fails), but calculated at
runtime and always bigger than 'getpagesize()' result. Reproduced on
aarch64 with 64KB page size.

Fixes: 5c338112e48a ("test/vsock: rework message bounds test")
Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com>
Reported-by: Bogdan Marcynkov <bmarcynk@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20231121211642.163474-1-avkrasnov@salutedevices.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 months agoi40e: Fix adding unsupported cloud filters
Ivan Vecera [Tue, 21 Nov 2023 21:13:36 +0000 (13:13 -0800)]
i40e: Fix adding unsupported cloud filters

If a VF tries to add unsupported cloud filter through virtchnl
then i40e_add_del_cloud_filter(_big_buf) returns -ENOTSUPP but
this error code is stored in 'ret' instead of 'aq_ret' that
is used as error code sent back to VF. In this scenario where
one of the mentioned functions fails the value of 'aq_ret'
is zero so the VF will incorrectly receive a 'success'.

Use 'aq_ret' to store return value and remove 'ret' local
variable. Additionally fix the issue when filter allocation
fails, in this case no notification is sent back to the VF.

Fixes: e284fc280473 ("i40e: Add and delete cloud filter")
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20231121211338.3348677-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 months agoMerge branch 'ice-restore-timestamp-config-after-reset'
Paolo Abeni [Thu, 23 Nov 2023 14:27:35 +0000 (15:27 +0100)]
Merge branch 'ice-restore-timestamp-config-after-reset'

Tony Nguyen says:

====================
ice: restore timestamp config after reset

Jake Keller says:

We recently discovered during internal validation that the ice driver has
not been properly restoring Tx timestamp configuration after a device reset,
which resulted in application failures after a device reset.

After some digging, it turned out this problem is two-fold. Since the
introduction of the PTP support the driver has been clobbering the storage
of the current timestamp configuration during reset. Thus after a reset, the
driver will no longer perform Tx or Rx timestamps, and will report
timestamp configuration as disabled if SIOCGHWTSTAMP ioctl is issued.

In addition, the recently merged auxiliary bus support code missed that
PFINT_TSYN_MSK must be reprogrammed on the clock owner for E822 devices.
Failure to restore this register configuration results in the driver no
longer responding to interrupts from other ports. Depending on the traffic
pattern, this can either result in increased latency responding to
timestamps on the non-owner ports, or it can result in the driver never
reporting any timestamps. The configuration of PFINT_TSYN_MSK was only done
during initialization. Due to this, the Tx timestamp issue persists even if
userspace reconfigures timestamping.

This series fixes both issues, as well as removes a redundant Tx ring field
since we can rely on the skb flag as the primary detector for a Tx timestamp
request.

Note that I don't think this series will directly apply to older stable
releases (even v6.6) as we recently refactored a lot of the PTP code to
support auxiliary bus. Patch 2/3 only matters for the post-auxiliary bus
implementation. The principle of patch 1/3 and 3/3 could apply as far back
as the initial PTP support, but I don't think it will apply cleanly as-is.
====================

Link: https://lore.kernel.org/r/20231121211259.3348630-1-anthony.l.nguyen@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
5 months agoice: restore timestamp configuration after device reset
Jacob Keller [Tue, 21 Nov 2023 21:12:57 +0000 (13:12 -0800)]
ice: restore timestamp configuration after device reset

The driver calls ice_ptp_cfg_timestamp() during ice_ptp_prepare_for_reset()
to disable timestamping while the device is resetting. This operation
destroys the user requested configuration. While the driver does call
ice_ptp_cfg_timestamp in ice_rebuild() to restore some hardware settings
after a reset, it unconditionally passes true or false, resulting in
failure to restore previous user space configuration.

This results in a device reset forcibly disabling timestamp configuration
regardless of current user settings.

This was not detected previously due to a quirk of the LinuxPTP ptp4l
application. If ptp4l detects a missing timestamp, it enters a fault state
and performs recovery logic which includes executing SIOCSHWTSTAMP again,
restoring the now accidentally cleared configuration.

Not every application does this, and for these applications, timestamps
will mysteriously stop after a PF reset, without being restored until an
application restart.

Fix this by replacing ice_ptp_cfg_timestamp() with two new functions:

1) ice_ptp_disable_timestamp_mode() which unconditionally disables the
   timestamping logic in ice_ptp_prepare_for_reset() and ice_ptp_release()

2) ice_ptp_restore_timestamp_mode() which calls
   ice_ptp_restore_tx_interrupt() to restore Tx timestamping configuration,
   calls ice_set_rx_tstamp() to restore Rx timestamping configuration, and
   issues an immediate TSYN_TX interrupt to ensure that timestamps which
   may have occurred during the device reset get processed.

Modify the ice_ptp_set_timestamp_mode to directly save the user
configuration and then call ice_ptp_restore_timestamp_mode. This way, reset
no longer destroys the saved user configuration.

This obsoletes the ice_set_tx_tstamp() function which can now be safely
removed.

With this change, all devices should now restore Tx and Rx timestamping
functionality correctly after a PF reset without application intervention.

Fixes: 77a781155a65 ("ice: enable receive hardware timestamping")
Fixes: ea9b847cda64 ("ice: enable transmit timestamps for E810 devices")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
5 months agoice: unify logic for programming PFINT_TSYN_MSK
Jacob Keller [Tue, 21 Nov 2023 21:12:56 +0000 (13:12 -0800)]
ice: unify logic for programming PFINT_TSYN_MSK

Commit d938a8cca88a ("ice: Auxbus devices & driver for E822 TS") modified
how Tx timestamps are handled for E822 devices. On these devices, only the
clock owner handles reading the Tx timestamp data from firmware. To do
this, the PFINT_TSYN_MSK register is modified from the default value to one
which enables reacting to a Tx timestamp on all PHY ports.

The driver currently programs PFINT_TSYN_MSK in different places depending
on whether the port is the clock owner or not. For the clock owner, the
PFINT_TSYN_MSK value is programmed during ice_ptp_init_owner just before
calling ice_ptp_tx_ena_intr to program the PHY ports.

For the non-clock owner ports, the PFINT_TSYN_MSK is programmed during
ice_ptp_init_port.

If a large enough device reset occurs, the PFINT_TSYN_MSK register will be
reset to the default value in which only the PHY associated directly with
the PF will cause the Tx timestamp interrupt to trigger.

The driver lacks logic to reprogram the PFINT_TSYN_MSK register after a
device reset. For the E822 device, this results in the PF no longer
responding to interrupts for other ports. This results in failure to
deliver Tx timestamps to user space applications.

Rename ice_ptp_configure_tx_tstamp to ice_ptp_cfg_tx_interrupt, and unify
the logic for programming PFINT_TSYN_MSK and PFINT_OICR_ENA into one place.
This function will program both registers according to the combination of
user configuration and device requirements.

This ensures that PFINT_TSYN_MSK is always restored when we configure the
Tx timestamp interrupt.

Fixes: d938a8cca88a ("ice: Auxbus devices & driver for E822 TS")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
5 months agoice: remove ptp_tx ring parameter flag
Jacob Keller [Tue, 21 Nov 2023 21:12:55 +0000 (13:12 -0800)]
ice: remove ptp_tx ring parameter flag

Before performing a Tx timestamp in ice_stamp(), the driver checks a ptp_tx
ring variable to see if timestamping is enabled on that ring. This value is
set for all rings whenever userspace configures Tx timestamping.

Ostensibly this was done to avoid wasting cycles checking other fields when
timestamping has not been enabled. However, for Tx timestamps we already
get an individual per-SKB flag indicating whether userspace wants to
request a timestamp on that packet. We do not gain much by also having
a separate flag to check for whether timestamping was enabled.

In fact, the driver currently fails to restore the field after a PF reset.
Because of this, if a PF reset occurs, timestamps will be disabled.

Since this flag doesn't add value in the hotpath, remove it and always
provide a timestamp if the SKB flag has been set.

A following change will fix the reset path to properly restore user
timestamping configuration completely.

This went unnoticed for some time because one of the most common
applications using Tx timestamps, ptp4l, will reconfigure the socket as
part of its fault recovery logic.

Fixes: ea9b847cda64 ("ice: enable transmit timestamps for E810 devices")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
5 months agoMerge branch 'amd-xgbe-fixes-to-handle-corner-cases'
Paolo Abeni [Thu, 23 Nov 2023 12:47:25 +0000 (13:47 +0100)]
Merge branch 'amd-xgbe-fixes-to-handle-corner-cases'

Raju Rangoju says:

====================
amd-xgbe: fixes to handle corner-cases

This series include bug fixes to amd-xgbe driver.
====================

Link: https://lore.kernel.org/r/20231121191435.4049995-1-Raju.Rangoju@amd.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
5 months agoamd-xgbe: propagate the correct speed and duplex status
Raju Rangoju [Tue, 21 Nov 2023 19:14:35 +0000 (00:44 +0530)]
amd-xgbe: propagate the correct speed and duplex status

xgbe_get_link_ksettings() does not propagate correct speed and duplex
information to ethtool during cable unplug. Due to which ethtool reports
incorrect values for speed and duplex.

Address this by propagating correct information.

Fixes: 7c12aa08779c ("amd-xgbe: Move the PHY support into amd-xgbe")
Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
5 months agoamd-xgbe: handle the corner-case during tx completion
Raju Rangoju [Tue, 21 Nov 2023 19:14:34 +0000 (00:44 +0530)]
amd-xgbe: handle the corner-case during tx completion

The existing implementation uses software logic to accumulate tx
completions until the specified time (1ms) is met and then poll them.
However, there exists a tiny gap which leads to a race between
resetting and checking the tx_activate flag. Due to this the tx
completions are not reported to upper layer and tx queue timeout
kicks-in restarting the device.

To address this, introduce a tx cleanup mechanism as part of the
periodic maintenance process.

Fixes: c5aa9e3b8156 ("amd-xgbe: Initial AMD 10GbE platform driver")
Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
5 months agoamd-xgbe: handle corner-case during sfp hotplug
Raju Rangoju [Tue, 21 Nov 2023 19:14:33 +0000 (00:44 +0530)]
amd-xgbe: handle corner-case during sfp hotplug

Force the mode change for SFI in Fixed PHY configurations. Fixed PHY
configurations needs PLL to be enabled while doing mode set. When the
SFP module isn't connected during boot, driver assumes AN is ON and
attempts auto-negotiation. However, if the connected SFP comes up in
Fixed PHY configuration the link will not come up as PLL isn't enabled
while the initial mode set command is issued. So, force the mode change
for SFI in Fixed PHY configuration to fix link issues.

Fixes: e57f7a3feaef ("amd-xgbe: Prepare for working with more than one type of phy")
Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
5 months agonet: veth: fix ethtool stats reporting
Lorenzo Bianconi [Tue, 21 Nov 2023 19:08:44 +0000 (20:08 +0100)]
net: veth: fix ethtool stats reporting

Fix a possible misalignment between page_pool stats and tx xdp_stats
reported in veth_get_ethtool_stats routine.
The issue can be reproduced configuring the veth pair with the
following tx/rx queues:

$ip link add v0 numtxqueues 2 numrxqueues 4 type veth peer name v1 \
 numtxqueues 1 numrxqueues 1

and loading a simple XDP program on v0 that just returns XDP_PASS.
In this case on v0 the page_pool stats overwrites tx xdp_stats for queue 1.
Fix the issue incrementing pp_idx of dev->real_num_tx_queues * VETH_TQ_STATS_LEN
since we always report xdp_stats for all tx queues in ethtool.

Fixes: 4fc418053ec7 ("net: veth: add page_pool stats")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/c5b5d0485016836448453f12846c7c4ab75b094a.1700593593.git.lorenzo@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
5 months agoocteontx2-pf: Fix ntuple rule creation to direct packet to VF with higher Rx queue...
Suman Ghosh [Tue, 21 Nov 2023 16:56:24 +0000 (22:26 +0530)]
octeontx2-pf: Fix ntuple rule creation to direct packet to VF with higher Rx queue than its PF

It is possible to add a ntuple rule which would like to direct packet to
a VF whose number of queues are greater/less than its PF's queue numbers.
For example a PF can have 2 Rx queues but a VF created on that PF can have
8 Rx queues. As of today, ntuple rule will reject rule because it is
checking the requested queue number against PF's number of Rx queues.
As a part of this fix if the action of a ntuple rule is to move a packet
to a VF's queue then the check is removed. Also, a debug information is
printed to aware user that it is user's responsibility to cross check if
the requested queue number on that VF is a valid one.

Fixes: f0a1913f8a6f ("octeontx2-pf: Add support for ethtool ntuple filters")
Signed-off-by: Suman Ghosh <sumang@marvell.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231121165624.3664182-1-sumang@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
5 months agonet: usb: qmi_wwan: claim interface 4 for ZTE MF290
Lech Perczak [Fri, 17 Nov 2023 23:19:18 +0000 (00:19 +0100)]
net: usb: qmi_wwan: claim interface 4 for ZTE MF290

Interface 4 is used by for QMI interface in stock firmware of MF28D, the
router which uses MF290 modem. Rebind it to qmi_wwan after freeing it up
from option driver.
The proper configuration is:

Interface mapping is:
0: QCDM, 1: (unknown), 2: AT (PCUI), 2: AT (Modem), 4: QMI

T:  Bus=01 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#=  4 Spd=480  MxCh= 0
D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
P:  Vendor=19d2 ProdID=0189 Rev= 0.00
S:  Manufacturer=ZTE, Incorporated
S:  Product=ZTE LTE Technologies MSM
C:* #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA
I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=4ms
I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=4ms
I:* If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=4ms
I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
E:  Ad=84(I) Atr=03(Int.) MxPS=  64 Ivl=2ms
E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=4ms
I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
E:  Ad=86(I) Atr=03(Int.) MxPS=  64 Ivl=2ms
E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=4ms

Cc: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Lech Perczak <lech.perczak@gmail.com>
Link: https://lore.kernel.org/r/20231117231918.100278-3-lech.perczak@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
5 months agoMerge tag 'loongarch-fixes-6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 22 Nov 2023 18:20:17 +0000 (10:20 -0800)]
Merge tag 'loongarch-fixes-6.7-1' of git://git./linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch fixes from Huacai Chen:
 "Fix several build errors, a potential kernel panic, a cpu hotplug
  issue and update links in documentations"

* tag 'loongarch-fixes-6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  Docs/zh_CN/LoongArch: Update links in LoongArch introduction.rst
  Docs/LoongArch: Update links in LoongArch introduction.rst
  LoongArch: Implement constant timer shutdown interface
  LoongArch: Mark {dmw,tlb}_virt_to_page() exports as non-GPL
  LoongArch: Silence the boot warning about 'nokaslr'
  LoongArch: Add __percpu annotation for __percpu_read()/__percpu_write()
  LoongArch: Record pc instead of offset in la_abs relocation
  LoongArch: Explicitly set -fdirect-access-external-data for vmlinux
  LoongArch: Add dependency between vmlinuz.efi and vmlinux.efi

5 months agoMerge tag 'hyperv-fixes-signed-20231121' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 22 Nov 2023 17:56:26 +0000 (09:56 -0800)]
Merge tag 'hyperv-fixes-signed-20231121' of git://git./linux/kernel/git/hyperv/linux

Pull hyperv fixes from Wei Liu:

 - One fix for the KVP daemon (Ani Sinha)

 - Fix for the detection of E820_TYPE_PRAM in a Gen2 VM (Saurabh Sengar)

 - Micro-optimization for hv_nmi_unknown() (Uros Bizjak)

* tag 'hyperv-fixes-signed-20231121' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  x86/hyperv: Use atomic_try_cmpxchg() to micro-optimize hv_nmi_unknown()
  x86/hyperv: Fix the detection of E820_TYPE_PRAM in a Gen2 VM
  hv/hv_kvp_daemon: Some small fixes for handling NM keyfiles

5 months agoasm-generic: qspinlock: fix queued_spin_value_unlocked() implementation
Linus Torvalds [Fri, 10 Nov 2023 06:22:13 +0000 (22:22 -0800)]
asm-generic: qspinlock: fix queued_spin_value_unlocked() implementation

We really don't want to do atomic_read() or anything like that, since we
already have the value, not the lock.  The whole point of this is that
we've loaded the lock from memory, and we want to check whether the
value we loaded was a locked one or not.

The main use of this is the lockref code, which loads both the lock and
the reference count in one atomic operation, and then works on that
combined value.  With the atomic_read(), the compiler would pointlessly
spill the value to the stack, in order to then be able to read it back
"atomically".

This is the qspinlock version of commit c6f4a9002252 ("asm-generic:
ticket-lock: Optimize arch_spin_value_unlocked()") which fixed this same
bug for ticket locks.

Cc: Guo Ren <guoren@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Link: https://lore.kernel.org/all/CAHk-=whNRv0v6kQiV5QO6DJhjH4KEL36vWQ6Re8Csrnh4zbRkQ@mail.gmail.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 months agoRevert "net: r8169: Disable multicast filter for RTL8168H and RTL8107E"
Heiner Kallweit [Tue, 21 Nov 2023 08:09:33 +0000 (09:09 +0100)]
Revert "net: r8169: Disable multicast filter for RTL8168H and RTL8107E"

This reverts commit efa5f1311c4998e9e6317c52bc5ee93b3a0f36df.

I couldn't reproduce the reported issue. What I did, based on a pcap
packet log provided by the reporter:
- Used same chip version (RTL8168h)
- Set MAC address to the one used on the reporters system
- Replayed the EAPOL unicast packet that, according to the reporter,
  was filtered out by the mc filter.
The packet was properly received.

Therefore the root cause of the reported issue seems to be somewhere
else. Disabling mc filtering completely for the most common chip
version is a quite big hammer. Therefore revert the change and wait
for further analysis results from the reporter.

Cc: stable@vger.kernel.org
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 months agonet/smc: avoid data corruption caused by decline
D. Wythe [Wed, 22 Nov 2023 02:37:05 +0000 (10:37 +0800)]
net/smc: avoid data corruption caused by decline

We found a data corruption issue during testing of SMC-R on Redis
applications.

The benchmark has a low probability of reporting a strange error as
shown below.

"Error: Protocol error, got "\xe2" as reply type byte"

Finally, we found that the retrieved error data was as follows:

0xE2 0xD4 0xC3 0xD9 0x04 0x00 0x2C 0x20 0xA6 0x56 0x00 0x16 0x3E 0x0C
0xCB 0x04 0x02 0x01 0x00 0x00 0x20 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0xE2

It is quite obvious that this is a SMC DECLINE message, which means that
the applications received SMC protocol message.
We found that this was caused by the following situations:

client                  server
        ¦  clc proposal
        ------------->
        ¦  clc accept
        <-------------
        ¦  clc confirm
        ------------->
wait llc confirm
send llc confirm
        ¦failed llc confirm
        ¦   x------
(after 2s)timeout
                        wait llc confirm rsp

wait decline

(after 1s) timeout
                        (after 2s) timeout
        ¦   decline
        -------------->
        ¦   decline
        <--------------

As a result, a decline message was sent in the implementation, and this
message was read from TCP by the already-fallback connection.

This patch double the client timeout as 2x of the server value,
With this simple change, the Decline messages should never cross or
collide (during Confirm link timeout).

This issue requires an immediate solution, since the protocol updates
involve a more long-term solution.

Fixes: 0fb0b02bd6fd ("net/smc: adapt SMC client code to use the LLC flow")
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 months agonfc: virtual_ncidev: Add variable to check if ndev is running
Nguyen Dinh Phi [Tue, 21 Nov 2023 07:53:57 +0000 (15:53 +0800)]
nfc: virtual_ncidev: Add variable to check if ndev is running

syzbot reported an memory leak that happens when an skb is add to
send_buff after virtual nci closed.
This patch adds a variable to track if the ndev is running before
handling new skb in send function.

Signed-off-by: Nguyen Dinh Phi <phind.uet@gmail.com>
Reported-by: syzbot+6eb09d75211863f15e3e@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/lkml/00000000000075472b06007df4fb@google.com
Reviewed-by: Bongsu Jeon
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 months agoHID: multitouch: Add quirk for HONOR GLO-GXXX touchpad
Aoba K [Tue, 21 Nov 2023 12:23:11 +0000 (20:23 +0800)]
HID: multitouch: Add quirk for HONOR GLO-GXXX touchpad

Honor MagicBook 13 2023 has a touchpad which do not switch to the multitouch
mode until the input mode feature is written by the host.  The touchpad do
report the input mode at touchpad(3), while itself working under mouse mode. As
a workaround, it is possible to call MT_QUIRE_FORCE_GET_FEATURE to force set
feature in mt_set_input_mode for such device.

The touchpad reports as BLTP7853, which cannot retrive any useful manufacture
information on the internel by this string at present.  As the serial number of
the laptop is GLO-G52, while DMI info reports the laptop serial number as
GLO-GXXX, this workaround should applied to all models which has the GLO-GXXX.

Signed-off-by: Aoba K <nexp_0x17@outlook.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
5 months agoHID: hid-asus: reset the backlight brightness level on resume
Denis Benato [Fri, 17 Nov 2023 01:15:56 +0000 (14:15 +1300)]
HID: hid-asus: reset the backlight brightness level on resume

Some devices managed by this driver automatically set brightness to 0
before entering a suspended state and reset it back to a default
brightness level after the resume:
this has the effect of having the kernel report wrong brightness
status after a sleep, and on some devices (like the Asus RC71L) that
brightness is the intensity of LEDs directly facing the user.

Fix the above issue by setting back brightness to the level it had
before entering a sleep state.

Signed-off-by: Denis Benato <benato.denis96@gmail.com>
Signed-off-by: Luke D. Jones <luke@ljones.dev>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
5 months agoHID: hid-asus: add const to read-only outgoing usb buffer
Denis Benato [Fri, 17 Nov 2023 01:15:55 +0000 (14:15 +1300)]
HID: hid-asus: add const to read-only outgoing usb buffer

In the function asus_kbd_set_report the parameter buf is read-only
as it gets copied in a memory portion suitable for USB transfer,
but the parameter is not marked as const: add the missing const and mark
const immutable buffers passed to that function.

Signed-off-by: Denis Benato <benato.denis96@gmail.com>
Signed-off-by: Luke D. Jones <luke@ljones.dev>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
5 months agox86/hyperv: Use atomic_try_cmpxchg() to micro-optimize hv_nmi_unknown()
Uros Bizjak [Tue, 14 Nov 2023 16:59:28 +0000 (17:59 +0100)]
x86/hyperv: Use atomic_try_cmpxchg() to micro-optimize hv_nmi_unknown()

Use atomic_try_cmpxchg() instead of atomic_cmpxchg(*ptr, old, new) == old
in hv_nmi_unknown(). On x86 the CMPXCHG instruction returns success in
the ZF flag, so this change saves a compare after CMPXCHG. The generated
asm code improves from:

  3e: 65 8b 15 00 00 00 00  mov    %gs:0x0(%rip),%edx
  45: b8 ff ff ff ff        mov    $0xffffffff,%eax
  4a: f0 0f b1 15 00 00 00  lock cmpxchg %edx,0x0(%rip)
  51: 00
  52: 83 f8 ff              cmp    $0xffffffff,%eax
  55: 0f 95 c0              setne  %al

to:

  3e: 65 8b 15 00 00 00 00  mov    %gs:0x0(%rip),%edx
  45: b8 ff ff ff ff        mov    $0xffffffff,%eax
  4a: f0 0f b1 15 00 00 00  lock cmpxchg %edx,0x0(%rip)
  51: 00
  52: 0f 95 c0              setne  %al

No functional change intended.

Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Link: https://lore.kernel.org/r/20231114170038.381634-1-ubizjak@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20231114170038.381634-1-ubizjak@gmail.com>

5 months agodpll: Fix potential msg memleak when genlmsg_put_reply failed
Hao Ge [Tue, 21 Nov 2023 01:37:09 +0000 (09:37 +0800)]
dpll: Fix potential msg memleak when genlmsg_put_reply failed

We should clean the skb resource if genlmsg_put_reply failed.

Fixes: 9d71b54b65b1 ("dpll: netlink: Add DPLL framework base functions")
Signed-off-by: Hao Ge <gehao@kylinos.cn>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://lore.kernel.org/r/20231121013709.73323-1-gehao@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 months agoMerge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Jakub Kicinski [Tue, 21 Nov 2023 23:49:30 +0000 (15:49 -0800)]
Merge tag 'for-netdev' of https://git./linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2023-11-21

We've added 19 non-merge commits during the last 4 day(s) which contain
a total of 18 files changed, 1043 insertions(+), 416 deletions(-).

The main changes are:

1) Fix BPF verifier to validate callbacks as if they are called an unknown
   number of times in order to fix not detecting some unsafe programs,
   from Eduard Zingerman.

2) Fix bpf_redirect_peer() handling which missed proper stats accounting
   for veth and netkit and also generally fix missing stats for the latter,
   from Peilin Ye, Daniel Borkmann et al.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: check if max number of bpf_loop iterations is tracked
  bpf: keep track of max number of bpf_loop callback iterations
  selftests/bpf: test widening for iterating callbacks
  bpf: widening for callback iterators
  selftests/bpf: tests for iterating callbacks
  bpf: verify callbacks as if they are called unknown number of times
  bpf: extract setup_func_entry() utility function
  bpf: extract __check_reg_arg() utility function
  selftests/bpf: fix bpf_loop_bench for new callback verification scheme
  selftests/bpf: track string payload offset as scalar in strobemeta
  selftests/bpf: track tcp payload offset as scalar in xdp_synproxy
  selftests/bpf: Add netkit to tc_redirect selftest
  selftests/bpf: De-veth-ize the tc_redirect test case
  bpf, netkit: Add indirect call wrapper for fetching peer dev
  bpf: Fix dev's rx stats for bpf_redirect_peer traffic
  veth: Use tstats per-CPU traffic counters
  netkit: Add tstats per-CPU traffic counters
  net: Move {l,t,d}stats allocation to core and convert veth & vrf
  net, vrf: Move dstats structure to core
====================

Link: https://lore.kernel.org/r/20231121193113.11796-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 months agodocs: netdev: try to guide people on dealing with silence
Jakub Kicinski [Mon, 20 Nov 2023 20:01:09 +0000 (12:01 -0800)]
docs: netdev: try to guide people on dealing with silence

There has been more than a few threads which went idle before
the merge window and now people came back to them and started
asking about next steps.

We currently tell people to be patient and not to repost too
often. Our "not too often", however, is still a few orders of
magnitude faster than other subsystems. Or so I feel after
hearing people talk about review rates at LPC.

Clarify in the doc that if the discussion went idle for a week
on netdev, 95% of the time there's no point waiting longer.

Link: https://lore.kernel.org/r/20231120200109.620392-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 months agonet: usb: ax88179_178a: fix failed operations during ax88179_reset
Jose Ignacio Tornos Martinez [Mon, 20 Nov 2023 12:06:29 +0000 (13:06 +0100)]
net: usb: ax88179_178a: fix failed operations during ax88179_reset

Using generic ASIX Electronics Corp. AX88179 Gigabit Ethernet device,
the following test cycle has been implemented:
    - power on
    - check logs
    - shutdown
    - after detecting the system shutdown, disconnect power
    - after approximately 60 seconds of sleep, power is restored
Running some cycles, sometimes error logs like this appear:
    kernel: ax88179_178a 2-9:1.0 (unnamed net_device) (uninitialized): Failed to write reg index 0x0001: -19
    kernel: ax88179_178a 2-9:1.0 (unnamed net_device) (uninitialized): Failed to read reg index 0x0001: -19
    ...
These failed operation are happening during ax88179_reset execution, so
the initialization could not be correct.

In order to avoid this, we need to increase the delay after reset and
clock initial operations. By using these larger values, many cycles
have been run and no failed operations appear.

It would be better to check some status register to verify when the
operation has finished, but I do not have found any available information
(neither in the public datasheets nor in the manufacturer's driver). The
only available information for the necessary delays is the maufacturer's
driver (original values) but the proposed values are not enough for the
tested devices.

Fixes: e2ca90c276e1f ("ax88179_178a: ASIX AX88179_178A USB 3.0/2.0 to gigabit ethernet adapter driver")
Reported-by: Herb Wei <weihao.bj@ieisystem.com>
Tested-by: Herb Wei <weihao.bj@ieisystem.com>
Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
Link: https://lore.kernel.org/r/20231120120642.54334-1-jtornosm@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 months agoMerge tag 'platform-drivers-x86-v6.7-2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 21 Nov 2023 19:56:57 +0000 (11:56 -0800)]
Merge tag 'platform-drivers-x86-v6.7-2' of git://git./linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform drivers fixes from Ilpo Järvinen:
 "Just a few fixes (one with two non-fix deps) plus tidying up
  MAINTAINERS"

* tag 'platform-drivers-x86-v6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: intel_telemetry: Fix kernel doc descriptions
  MAINTAINERS: Drop Mark Gross as maintainer for x86 platform drivers
  platform/x86/amd/pmc: adjust getting DRAM size behavior
  platform/x86: hp-bioscfg: Remove unused obj in hp_add_other_attributes()
  platform/x86: hp-bioscfg: Fix error handling in hp_add_other_attributes()
  platform/x86: hp-bioscfg: move mutex_lock() down in hp_add_other_attributes()
  platform/x86: hp-bioscfg: Simplify return check in hp_add_other_attributes()
  platform/x86: ideapad-laptop: Set max_brightness before using it
  MAINTAINERS: Remove stale entry for SBL platform driver

5 months agoMerge tag 'erofs-for-6.7-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 21 Nov 2023 19:45:48 +0000 (11:45 -0800)]
Merge tag 'erofs-for-6.7-rc3-fixes' of git://git./linux/kernel/git/xiang/erofs

Pull erofs fixes from Gao Xiang:

 - Tidy up erofs_read_inode() for simplicity

 - Fix broken fscache mode due to NULL dereference of dif->bdev_handle

 - Add the EROFS webpage to MAINTAINERS, documentation, and Kconfig

* tag 'erofs-for-6.7-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  MAINTAINERS: erofs: add EROFS webpage
  erofs: fix NULL dereference of dif->bdev_handle in fscache mode
  erofs: simplify erofs_read_inode()

5 months agoMAINTAINERS: Add indirect_call_wrapper.h to NETWORKING [GENERAL]
Simon Horman [Mon, 20 Nov 2023 08:28:40 +0000 (08:28 +0000)]
MAINTAINERS: Add indirect_call_wrapper.h to NETWORKING [GENERAL]

indirect_call_wrapper.h  is not, strictly speaking, networking specific.
However, it's git history indicates that in practice changes go through
netdev and thus the netdev maintainers have effectively been taking
responsibility for it.

Formalise this by adding it to the NETWORKING [GENERAL] section in the
MAINTAINERS file.

It is not clear how many other files under include/linux fall into this
category and it would be interesting, as a follow-up, to audit that and
propose further updates to the MAINTAINERS file as appropriate.

Link: https://lore.kernel.org/netdev/20231116010310.4664dd38@kernel.org/
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231120-indirect_call_wrapper-maintainer-v1-1-0a6bb1f7363e@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
5 months agoRevert "HID: logitech-dj: Add support for a new lightspeed receiver iteration"
Jiri Kosina [Tue, 21 Nov 2023 13:38:11 +0000 (14:38 +0100)]
Revert "HID: logitech-dj: Add support for a new lightspeed receiver iteration"

This reverts commit 9d1bd9346241cd6963b58da7ffb7ed303285f684.

Multiple people reported misbehaving devices and reverting this commit fixes
the problem for them. As soon as the original commit author starts reacting
again, we can try to figure out why he hasn't seen the issues (mismatching
report descriptors?), but for the time being, fix for 6.7 by reverting.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=218172
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218094
Cc: <stable@vger.kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
5 months agoMerge branch 'hv_netvsc-fix-race-of-netvsc-vf-register-and-slave-bit'
Paolo Abeni [Tue, 21 Nov 2023 12:15:04 +0000 (13:15 +0100)]
Merge branch 'hv_netvsc-fix-race-of-netvsc-vf-register-and-slave-bit'

Haiyang Zhang says:

====================
hv_netvsc: fix race of netvsc, VF register, and slave bit

There are some races between netvsc probe, set notifier, VF register,
and slave bit setting.
This patch set fixes them.
====================

Link: https://lore.kernel.org/r/1700411023-14317-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
5 months agohv_netvsc: Mark VF as slave before exposing it to user-mode
Long Li [Sun, 19 Nov 2023 16:23:43 +0000 (08:23 -0800)]
hv_netvsc: Mark VF as slave before exposing it to user-mode

When a VF is being exposed form the kernel, it should be marked as "slave"
before exposing to the user-mode. The VF is not usable without netvsc
running as master. The user-mode should never see a VF without the "slave"
flag.

This commit moves the code of setting the slave flag to the time before
VF is exposed to user-mode.

Cc: stable@vger.kernel.org
Fixes: 0c195567a8f6 ("netvsc: transparent VF management")
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
5 months agohv_netvsc: Fix race of register_netdevice_notifier and VF register
Haiyang Zhang [Sun, 19 Nov 2023 16:23:42 +0000 (08:23 -0800)]
hv_netvsc: Fix race of register_netdevice_notifier and VF register

If VF NIC is registered earlier, NETDEV_REGISTER event is replayed,
but NETDEV_POST_INIT is not.

Move register_netdevice_notifier() earlier, so the call back
function is set before probing.

Cc: stable@vger.kernel.org
Fixes: e04e7a7bbd4b ("hv_netvsc: Fix a deadlock by getting rtnl lock earlier in netvsc_probe()")
Reported-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
5 months agohv_netvsc: fix race of netvsc and VF register_netdevice
Haiyang Zhang [Sun, 19 Nov 2023 16:23:41 +0000 (08:23 -0800)]
hv_netvsc: fix race of netvsc and VF register_netdevice

The rtnl lock also needs to be held before rndis_filter_device_add()
which advertises nvsp_2_vsc_capability / sriov bit, and triggers
VF NIC offering and registering. If VF NIC finished register_netdev()
earlier it may cause name based config failure.

To fix this issue, move the call to rtnl_lock() before
rndis_filter_device_add(), so VF will be registered later than netvsc
/ synthetic NIC, and gets a name numbered (ethX) after netvsc.

Cc: stable@vger.kernel.org
Fixes: e04e7a7bbd4b ("hv_netvsc: Fix a deadlock by getting rtnl lock earlier in netvsc_probe()")
Reported-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
5 months agoipv4: Correct/silence an endian warning in __ip_do_redirect
Kunwu Chan [Sun, 19 Nov 2023 14:17:59 +0000 (22:17 +0800)]
ipv4: Correct/silence an endian warning in __ip_do_redirect

net/ipv4/route.c:783:46: warning: incorrect type in argument 2 (different base types)
net/ipv4/route.c:783:46:    expected unsigned int [usertype] key
net/ipv4/route.c:783:46:    got restricted __be32 [usertype] new_gw

Fixes: 969447f226b4 ("ipv4: use new_gw for redirect neigh lookup")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
Link: https://lore.kernel.org/r/20231119141759.420477-1-chentao@kylinos.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
5 months agoHID: add ALWAYS_POLL quirk for Apple kb
Oliver Neukum [Tue, 14 Nov 2023 14:54:30 +0000 (15:54 +0100)]
HID: add ALWAYS_POLL quirk for Apple kb

These devices disconnect if suspended without remote wakeup. They can operate
with the standard driver.

Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
5 months agoHID: glorious: fix Glorious Model I HID report
Brett Raye [Fri, 3 Nov 2023 01:10:38 +0000 (18:10 -0700)]
HID: glorious: fix Glorious Model I HID report

The Glorious Model I mouse has a buggy HID report descriptor for its
keyboard endpoint (used for programmable buttons). For report ID 2, there
is a mismatch between Logical Minimum and Usage Minimum in the array that
reports keycodes.

The offending portion of the descriptor: (from hid-decode)

0x95, 0x05,                    //  Report Count (5)                   30
0x75, 0x08,                    //  Report Size (8)                    32
0x15, 0x00,                    //  Logical Minimum (0)                34
0x25, 0x65,                    //  Logical Maximum (101)              36
0x05, 0x07,                    //  Usage Page (Keyboard)              38
0x19, 0x01,                    //  Usage Minimum (1)                  40
0x29, 0x65,                    //  Usage Maximum (101)                42
0x81, 0x00,                    //  Input (Data,Arr,Abs)               44

This bug shifts all programmed keycodes up by 1. Importantly, this causes
"empty" array indexes of 0x00 to be interpreted as 0x01, ErrorRollOver.
The presence of ErrorRollOver causes the system to ignore all keypresses
from the endpoint and breaks the ability to use the programmable buttons.

Setting byte 41 to 0x00 fixes this, and causes keycodes to be interpreted
correctly.

Also, USB_VENDOR_ID_GLORIOUS is changed to USB_VENDOR_ID_SINOWEALTH,
and a new ID for Laview Technology is added. Glorious seems to be
white-labeling controller boards or mice from these vendors. There isn't a
single canonical vendor ID for Glorious products.

Signed-off-by: Brett Raye <braye@fastmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
5 months agoHID: fix HID device resource race between HID core and debugging support
Charles Yi [Tue, 31 Oct 2023 04:32:39 +0000 (12:32 +0800)]
HID: fix HID device resource race between HID core and debugging support

hid_debug_events_release releases resources bound to the HID device instance.
hid_device_release releases the underlying HID device instance potentially
before hid_debug_events_release has completed releasing debug resources bound
to the same HID device instance.

Reference count to prevent the HID device instance from being torn down
preemptively when HID debugging support is used. When count reaches zero,
release core resources of HID device instance using hiddev_free.

The crash:

[  120.728477][ T4396] kernel BUG at lib/list_debug.c:53!
[  120.728505][ T4396] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[  120.739806][ T4396] Modules linked in: bcmdhd dhd_static_buf 8822cu pcie_mhi r8168
[  120.747386][ T4396] CPU: 1 PID: 4396 Comm: hidt_bridge Not tainted 5.10.110 #257
[  120.754771][ T4396] Hardware name: Rockchip RK3588 EVB4 LP4 V10 Board (DT)
[  120.761643][ T4396] pstate: 60400089 (nZCv daIf +PAN -UAO -TCO BTYPE=--)
[  120.768338][ T4396] pc : __list_del_entry_valid+0x98/0xac
[  120.773730][ T4396] lr : __list_del_entry_valid+0x98/0xac
[  120.779120][ T4396] sp : ffffffc01e62bb60
[  120.783126][ T4396] x29: ffffffc01e62bb60 x28: ffffff818ce3a200
[  120.789126][ T4396] x27: 0000000000000009 x26: 0000000000980000
[  120.795126][ T4396] x25: ffffffc012431000 x24: ffffff802c6d4e00
[  120.801125][ T4396] x23: ffffff8005c66f00 x22: ffffffc01183b5b8
[  120.807125][ T4396] x21: ffffff819df2f100 x20: 0000000000000000
[  120.813124][ T4396] x19: ffffff802c3f0700 x18: ffffffc01d2cd058
[  120.819124][ T4396] x17: 0000000000000000 x16: 0000000000000000
[  120.825124][ T4396] x15: 0000000000000004 x14: 0000000000003fff
[  120.831123][ T4396] x13: ffffffc012085588 x12: 0000000000000003
[  120.837123][ T4396] x11: 00000000ffffbfff x10: 0000000000000003
[  120.843123][ T4396] x9 : 455103d46b329300 x8 : 455103d46b329300
[  120.849124][ T4396] x7 : 74707572726f6320 x6 : ffffffc0124b8cb5
[  120.855124][ T4396] x5 : ffffffffffffffff x4 : 0000000000000000
[  120.861123][ T4396] x3 : ffffffc011cf4f90 x2 : ffffff81fee7b948
[  120.867122][ T4396] x1 : ffffffc011cf4f90 x0 : 0000000000000054
[  120.873122][ T4396] Call trace:
[  120.876259][ T4396]  __list_del_entry_valid+0x98/0xac
[  120.881304][ T4396]  hid_debug_events_release+0x48/0x12c
[  120.886617][ T4396]  full_proxy_release+0x50/0xbc
[  120.891323][ T4396]  __fput+0xdc/0x238
[  120.895075][ T4396]  ____fput+0x14/0x24
[  120.898911][ T4396]  task_work_run+0x90/0x148
[  120.903268][ T4396]  do_exit+0x1bc/0x8a4
[  120.907193][ T4396]  do_group_exit+0x8c/0xa4
[  120.911458][ T4396]  get_signal+0x468/0x744
[  120.915643][ T4396]  do_signal+0x84/0x280
[  120.919650][ T4396]  do_notify_resume+0xd0/0x218
[  120.924262][ T4396]  work_pending+0xc/0x3f0

[ Rahul Rameshbabu <sergeantsagara@protonmail.com>: rework changelog ]
Fixes: cd667ce24796 ("HID: use debugfs for events/reports dumping")
Signed-off-by: Charles Yi <be286@163.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
5 months agoHID: apple: add Jamesdonkey and A3R to non-apple keyboards list
Yihong Cao [Sun, 29 Oct 2023 17:05:38 +0000 (01:05 +0800)]
HID: apple: add Jamesdonkey and A3R to non-apple keyboards list

Jamesdonkey A3R keyboard is identified as "Jamesdonkey A3R" in wired
mode, "A3R-U" in wireless mode and "A3R" in bluetooth mode. Adding them
to non-apple keyboards fixes function key.

Signed-off-by: Yihong Cao <caoyihong4@outlook.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
5 months agoHID: mcp2221: Allow IO to start during probe
Hamish Martin [Wed, 25 Oct 2023 03:55:11 +0000 (16:55 +1300)]
HID: mcp2221: Allow IO to start during probe

During the probe we add an I2C adapter and as soon as we add that adapter
it may be used for a transfer (e.g via the code in i2cdetect()).
Those transfers are not able to complete and time out. This is because the
HID raw_event callback (mcp2221_raw_event) will not be invoked until the
HID device's 'driver_input_lock' is marked up at the completion of the
probe in hid_device_probe(). This starves the driver of the responses it
is waiting for.
In order to allow the I2C transfers to complete while we are still in the
probe, start the IO once we have completed init of the HID device.

This issue seems to have been seen before and a patch was submitted but
it seems it was never accepted. See:
https://lore.kernel.org/all/20221103222714.21566-3-Enrik.Berkhan@inka.de/

Signed-off-by: Hamish Martin <hamish.martin@alliedtelesis.co.nz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
5 months agoHID: mcp2221: Set driver data before I2C adapter add
Hamish Martin [Wed, 25 Oct 2023 03:55:10 +0000 (16:55 +1300)]
HID: mcp2221: Set driver data before I2C adapter add

The process of adding an I2C adapter can invoke I2C accesses on that new
adapter (see i2c_detect()).

Ensure we have set the adapter's driver data to avoid null pointer
dereferences in the xfer functions during the adapter add.

This has been noted in the past and the same fix proposed but not
completed. See:
https://lore.kernel.org/lkml/ef597e73-ed71-168e-52af-0d19b03734ac@vigem.de/

Signed-off-by: Hamish Martin <hamish.martin@alliedtelesis.co.nz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
5 months agoplatform/x86: intel_telemetry: Fix kernel doc descriptions
Andy Shevchenko [Mon, 20 Nov 2023 15:07:56 +0000 (17:07 +0200)]
platform/x86: intel_telemetry: Fix kernel doc descriptions

LKP found issues with a kernel doc in the driver:

core.c:116: warning: Function parameter or member 'ioss_evtconfig' not described in 'telemetry_update_events'
core.c:188: warning: Function parameter or member 'ioss_evtconfig' not described in 'telemetry_get_eventconfig'

It looks like it were copy'n'paste typos when these descriptions
had been introduced. Fix the typos.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202310070743.WALmRGSY-lkp@intel.com/
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20231120150756.1661425-1-andriy.shevchenko@linux.intel.com
Reviewed-by: Rajneesh Bhardwaj <irenic.rajneesh@gmail.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
5 months agoMAINTAINERS: Drop Mark Gross as maintainer for x86 platform drivers
Hans de Goede [Mon, 20 Nov 2023 15:45:45 +0000 (16:45 +0100)]
MAINTAINERS: Drop Mark Gross as maintainer for x86 platform drivers

Mark has not really been active as maintainer for x86 platform drivers
lately, drop Mark from the MAINTAINERS entries for drivers/platform/x86,
drivers/platform/mellanox and drivers/platform/surface.

Cc: Mark Gross <markgross@kernel.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20231120154548.611041-1-hdegoede@redhat.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
5 months agoDocs/zh_CN/LoongArch: Update links in LoongArch introduction.rst
Yanteng Si [Tue, 21 Nov 2023 07:03:26 +0000 (15:03 +0800)]
Docs/zh_CN/LoongArch: Update links in LoongArch introduction.rst

LoongArch-Vol1 has been updated to v1.10, the links in the documentation
are out of date, let's update it.

Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
5 months agoDocs/LoongArch: Update links in LoongArch introduction.rst
Yanteng Si [Tue, 21 Nov 2023 07:03:25 +0000 (15:03 +0800)]
Docs/LoongArch: Update links in LoongArch introduction.rst

LoongArch-Vol1 has been updated to v1.10, the links in the documentation
are out of date, let's update it.

Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
5 months agoLoongArch: Implement constant timer shutdown interface
Bibo Mao [Tue, 21 Nov 2023 07:03:25 +0000 (15:03 +0800)]
LoongArch: Implement constant timer shutdown interface

When a cpu is hot-unplugged, it is put in idle state and the function
arch_cpu_idle_dead() is called. The timer interrupt for this processor
should be disabled, otherwise there will be pending timer interrupt for
the unplugged cpu, so that vcpu is prevented from giving up scheduling
when system is running in vm mode.

This patch implements the timer shutdown interface so that the constant
timer will be properly disabled when a CPU is hot-unplugged.

Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
5 months agoLoongArch: Mark {dmw,tlb}_virt_to_page() exports as non-GPL
Huacai Chen [Tue, 21 Nov 2023 07:03:25 +0000 (15:03 +0800)]
LoongArch: Mark {dmw,tlb}_virt_to_page() exports as non-GPL

Mark {dmw,tlb}_virt_to_page() exports as non-GPL, in order to let
out-of-tree modules (e.g. OpenZFS) be built without errors. Otherwise
we get:

ERROR: modpost: GPL-incompatible module zfs.ko uses GPL-only symbol 'dmw_virt_to_page'
ERROR: modpost: GPL-incompatible module zfs.ko uses GPL-only symbol 'tlb_virt_to_page'

Reported-by: Haowu Ge <gehaowu@bitmoe.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
5 months agoLoongArch: Silence the boot warning about 'nokaslr'
Huacai Chen [Tue, 21 Nov 2023 07:03:25 +0000 (15:03 +0800)]
LoongArch: Silence the boot warning about 'nokaslr'

The kernel parameter 'nokaslr' is handled before start_kernel(), so we
don't need early_param() to mark it technically. But it can cause a boot
warning as follows:

Unknown kernel command line parameters "nokaslr", will be passed to user space.

When we use 'init=/bin/bash', 'nokaslr' which passed to user space will
even cause a kernel panic. So we use early_param() to mark 'nokaslr',
simply print a notice and silence the boot warning (also fix a potential
panic). This logic is similar to RISC-V.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
5 months agoLoongArch: Add __percpu annotation for __percpu_read()/__percpu_write()
Huacai Chen [Tue, 21 Nov 2023 07:03:25 +0000 (15:03 +0800)]
LoongArch: Add __percpu annotation for __percpu_read()/__percpu_write()

When build kernel with C=1, we get:

arch/loongarch/kernel/process.c:234:46: warning: incorrect type in argument 1 (different address spaces)
arch/loongarch/kernel/process.c:234:46:    expected void *ptr
arch/loongarch/kernel/process.c:234:46:    got unsigned long [noderef] __percpu *
arch/loongarch/kernel/process.c:234:46: warning: incorrect type in argument 1 (different address spaces)
arch/loongarch/kernel/process.c:234:46:    expected void *ptr
arch/loongarch/kernel/process.c:234:46:    got unsigned long [noderef] __percpu *
arch/loongarch/kernel/process.c:234:46: warning: incorrect type in argument 1 (different address spaces)
arch/loongarch/kernel/process.c:234:46:    expected void *ptr
arch/loongarch/kernel/process.c:234:46:    got unsigned long [noderef] __percpu *
arch/loongarch/kernel/process.c:234:46: warning: incorrect type in argument 1 (different address spaces)
arch/loongarch/kernel/process.c:234:46:    expected void *ptr
arch/loongarch/kernel/process.c:234:46:    got unsigned long [noderef] __percpu *

Add __percpu annotation for __percpu_read()/__percpu_write() can avoid
such warnings. __percpu_xchg() and other functions don't need annotation
because their wrapper, i.e. _pcp_protect(), already suppresses warnings.

Also adjust the indentations in this file.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202311080409.LlOfTR3m-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202311080840.Vc2kXhfp-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202311081340.3k72KKdg-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202311120926.cjYHyoYw-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202311152142.g6UyNx1R-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202311160339.DbhaH8LX-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202311181454.CTPrSYmQ-lkp@intel.com/
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
5 months agoLoongArch: Record pc instead of offset in la_abs relocation
WANG Rui [Tue, 21 Nov 2023 07:03:25 +0000 (15:03 +0800)]
LoongArch: Record pc instead of offset in la_abs relocation

To clarify, the previous version functioned flawlessly. However, it's
worth noting that the LLVM's LoongArch backend currently lacks support
for cross-section label calculations. With this patch, we enable the use
of clang to compile relocatable kernels.

Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
5 months agoLoongArch: Explicitly set -fdirect-access-external-data for vmlinux
WANG Rui [Tue, 21 Nov 2023 07:03:25 +0000 (15:03 +0800)]
LoongArch: Explicitly set -fdirect-access-external-data for vmlinux

After this llvm commit [1], The -fno-pic does not imply direct access
external data. Explicitly set -fdirect-access-external-data for vmlinux
that can avoids GOT entries.

Link: https://github.com/llvm/llvm-project/commit/47eeee297775347cbdb7624d6a766c2a3eec4a59
Suggested-by: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
5 months agoLoongArch: Add dependency between vmlinuz.efi and vmlinux.efi
Masahiro Yamada [Tue, 21 Nov 2023 07:03:25 +0000 (15:03 +0800)]
LoongArch: Add dependency between vmlinuz.efi and vmlinux.efi

A common issue in Makefile is a race in parallel building.

You need to be careful to prevent multiple threads from writing to the
same file simultaneously.

Commit 3939f3345050 ("ARM: 8418/1: add boot image dependencies to not
generate invalid images") addressed such a bad scenario.

A similar symptom occurs with the following command:

  $ make -j$(nproc) ARCH=loongarch vmlinux.efi vmlinuz.efi
    [ snip ]
    SORTTAB vmlinux
    OBJCOPY arch/loongarch/boot/vmlinux.efi
    OBJCOPY arch/loongarch/boot/vmlinux.efi
    PAD     arch/loongarch/boot/vmlinux.bin
    GZIP    arch/loongarch/boot/vmlinuz
    OBJCOPY arch/loongarch/boot/vmlinuz.o
    LD      arch/loongarch/boot/vmlinuz.efi.elf
    OBJCOPY arch/loongarch/boot/vmlinuz.efi

The log "OBJCOPY arch/loongarch/boot/vmlinux.efi" is displayed twice.

It indicates that two threads simultaneously enter arch/loongarch/boot/
and write to arch/loongarch/boot/vmlinux.efi.

It occasionally leads to a build failure:

  $ make -j$(nproc) ARCH=loongarch vmlinux.efi vmlinuz.efi
    [ snip ]
    SORTTAB vmlinux
    OBJCOPY arch/loongarch/boot/vmlinux.efi
    PAD     arch/loongarch/boot/vmlinux.bin
  truncate: Invalid number: ‘arch/loongarch/boot/vmlinux.bin’
  make[2]: *** [drivers/firmware/efi/libstub/Makefile.zboot:13:
  arch/loongarch/boot/vmlinux.bin] Error 1
  make[2]: *** Deleting file 'arch/loongarch/boot/vmlinux.bin'
  make[1]: *** [arch/loongarch/Makefile:146: vmlinuz.efi] Error 2
  make[1]: *** Waiting for unfinished jobs....
  make: *** [Makefile:234: __sub-make] Error 2

vmlinuz.efi depends on vmlinux.efi, but such a dependency is not
specified in arch/loongarch/Makefile.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
5 months agoMerge branch 'verify-callbacks-as-if-they-are-called-unknown-number-of-times'
Alexei Starovoitov [Tue, 21 Nov 2023 02:33:36 +0000 (18:33 -0800)]
Merge branch 'verify-callbacks-as-if-they-are-called-unknown-number-of-times'

Eduard Zingerman says:

====================
verify callbacks as if they are called unknown number of times

This series updates verifier logic for callback functions handling.
Current master simulates callback body execution exactly once,
which leads to verifier not detecting unsafe programs like below:

    static int unsafe_on_zero_iter_cb(__u32 idx, struct num_context *ctx)
    {
        ctx->i = 0;
        return 0;
    }

    SEC("?raw_tp")
    int unsafe_on_zero_iter(void *unused)
    {
        struct num_context loop_ctx = { .i = 32 };
        __u8 choice_arr[2] = { 0, 1 };

        bpf_loop(100, unsafe_on_zero_iter_cb, &loop_ctx, 0);
        return choice_arr[loop_ctx.i];
    }

This was reported previously in [0].
The basic idea of the fix is to schedule callback entry state for
verification in env->head until some identical, previously visited
state in current DFS state traversal is found. Same logic as with open
coded iterators, and builds on top recent fixes [1] for those.

The series is structured as follows:
- patches #1,2,3 update strobemeta, xdp_synproxy selftests and
  bpf_loop_bench benchmark to allow convergence of the bpf_loop
  callback states;
- patches #4,5 just shuffle the code a bit;
- patch #6 is the main part of the series;
- patch #7 adds test cases for #6;
- patch #8 extend patch #6 with same speculative scalar widening
  logic, as used for open coded iterators;
- patch #9 adds test cases for #8;
- patch #10 extends patch #6 to track maximal number of callback
  executions specifically for bpf_loop();
- patch #11 adds test cases for #10.

Veristat results comparing this series to master+patches #1,2,3 using selftests
show the following difference:

File                       Program        States (A)  States (B)  States (DIFF)
-------------------------  -------------  ----------  ----------  -------------
bpf_loop_bench.bpf.o       benchmark               1           2  +1 (+100.00%)
pyperf600_bpf_loop.bpf.o   on_event              322         407  +85 (+26.40%)
strobemeta_bpf_loop.bpf.o  on_event              113         151  +38 (+33.63%)
xdp_synproxy_kern.bpf.o    syncookie_tc          341         291  -50 (-14.66%)
xdp_synproxy_kern.bpf.o    syncookie_xdp         344         301  -43 (-12.50%)

Veristat results comparing this series to master using Tetragon BPF
files [2] also show some differences.
States diff varies from +2% to +15% on 23 programs out of 186,
no new failures.

Changelog:
- V3 [5] -> V4, changes suggested by Andrii:
  - validate mark_chain_precision() result in patch #10;
  - renaming s/cumulative_callback_depth/callback_unroll_depth/.
- V2 [4] -> V3:
  - fixes in expected log messages for test cases:
    - callback_result_precise;
    - parent_callee_saved_reg_precise_with_callback;
    - parent_stack_slot_precise_with_callback;
  - renamings (suggested by Alexei):
    - s/callback_iter_depth/cumulative_callback_depth/
    - s/is_callback_iter_next/calls_callback/
    - s/mark_callback_iter_next/mark_calls_callback/
  - prepare_func_exit() updated to exit with -EFAULT when
    callee->in_callback_fn is true but calls_callback() is not true
    for callsite;
  - test case 'bpf_loop_iter_limit_nested' rewritten to use return
    value check instead of verifier log message checks
    (suggested by Alexei).
- V1 [3] -> V2, changes suggested by Andrii:
  - small changes for error handling code in __check_func_call();
  - callback body processing log is now matched in relevant
    verifier_subprog_precision.c tests;
  - R1 passed to bpf_loop() is now always marked as precise;
  - log level 2 message for bpf_loop() iteration termination instead of
    iteration depth messages;
  - __no_msg macro removed;
  - bpf_loop_iter_limit_nested updated to avoid using __no_msg;
  - commit message for patch #3 updated according to Alexei's request.

[0] https://lore.kernel.org/bpf/CA+vRuzPChFNXmouzGG+wsy=6eMcfr1mFG0F3g7rbg-sedGKW3w@mail.gmail.com/
[1] https://lore.kernel.org/bpf/20231024000917.12153-1-eddyz87@gmail.com/
[2] git@github.com:cilium/tetragon.git
[3] https://lore.kernel.org/bpf/20231116021803.9982-1-eddyz87@gmail.com/T/#t
[4] https://lore.kernel.org/bpf/20231118013355.7943-1-eddyz87@gmail.com/T/#t
[5] https://lore.kernel.org/bpf/20231120225945.11741-1-eddyz87@gmail.com/T/#t
====================

Link: https://lore.kernel.org/r/20231121020701.26440-1-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 months agoselftests/bpf: check if max number of bpf_loop iterations is tracked
Eduard Zingerman [Tue, 21 Nov 2023 02:07:01 +0000 (04:07 +0200)]
selftests/bpf: check if max number of bpf_loop iterations is tracked

Check that even if bpf_loop() callback simulation does not converge to
a specific state, verification could proceed via "brute force"
simulation of maximal number of callback calls.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20231121020701.26440-12-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 months agobpf: keep track of max number of bpf_loop callback iterations
Eduard Zingerman [Tue, 21 Nov 2023 02:07:00 +0000 (04:07 +0200)]
bpf: keep track of max number of bpf_loop callback iterations

In some cases verifier can't infer convergence of the bpf_loop()
iteration. E.g. for the following program:

    static int cb(__u32 idx, struct num_context* ctx)
    {
        ctx->i++;
        return 0;
    }

    SEC("?raw_tp")
    int prog(void *_)
    {
        struct num_context ctx = { .i = 0 };
        __u8 choice_arr[2] = { 0, 1 };

        bpf_loop(2, cb, &ctx, 0);
        return choice_arr[ctx.i];
    }

Each 'cb' simulation would eventually return to 'prog' and reach
'return choice_arr[ctx.i]' statement. At which point ctx.i would be
marked precise, thus forcing verifier to track multitude of separate
states with {.i=0}, {.i=1}, ... at bpf_loop() callback entry.

This commit allows "brute force" handling for such cases by limiting
number of callback body simulations using 'umax' value of the first
bpf_loop() parameter.

For this, extend bpf_func_state with 'callback_depth' field.
Increment this field when callback visiting state is pushed to states
traversal stack. For frame #N it's 'callback_depth' field counts how
many times callback with frame depth N+1 had been executed.
Use bpf_func_state specifically to allow independent tracking of
callback depths when multiple nested bpf_loop() calls are present.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20231121020701.26440-11-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 months agoselftests/bpf: test widening for iterating callbacks
Eduard Zingerman [Tue, 21 Nov 2023 02:06:59 +0000 (04:06 +0200)]
selftests/bpf: test widening for iterating callbacks

A test case to verify that imprecise scalars widening is applied to
callback entering state, when callback call is simulated repeatedly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20231121020701.26440-10-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 months agobpf: widening for callback iterators
Eduard Zingerman [Tue, 21 Nov 2023 02:06:58 +0000 (04:06 +0200)]
bpf: widening for callback iterators

Callbacks are similar to open coded iterators, so add imprecise
widening logic for callback body processing. This makes callback based
loops behave identically to open coded iterators, e.g. allowing to
verify programs like below:

  struct ctx { u32 i; };
  int cb(u32 idx, struct ctx* ctx)
  {
          ++ctx->i;
          return 0;
  }
  ...
  struct ctx ctx = { .i = 0 };
  bpf_loop(100, cb, &ctx, 0);
  ...

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20231121020701.26440-9-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 months agoselftests/bpf: tests for iterating callbacks
Eduard Zingerman [Tue, 21 Nov 2023 02:06:57 +0000 (04:06 +0200)]
selftests/bpf: tests for iterating callbacks

A set of test cases to check behavior of callback handling logic,
check if verifier catches the following situations:
- program not safe on second callback iteration;
- program not safe on zero callback iterations;
- infinite loop inside a callback.

Verify that callback logic works for bpf_loop, bpf_for_each_map_elem,
bpf_user_ringbuf_drain, bpf_find_vma.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20231121020701.26440-8-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 months agobpf: verify callbacks as if they are called unknown number of times
Eduard Zingerman [Tue, 21 Nov 2023 02:06:56 +0000 (04:06 +0200)]
bpf: verify callbacks as if they are called unknown number of times

Prior to this patch callbacks were handled as regular function calls,
execution of callback body was modeled exactly once.
This patch updates callbacks handling logic as follows:
- introduces a function push_callback_call() that schedules callback
  body verification in env->head stack;
- updates prepare_func_exit() to reschedule callback body verification
  upon BPF_EXIT;
- as calls to bpf_*_iter_next(), calls to callback invoking functions
  are marked as checkpoints;
- is_state_visited() is updated to stop callback based iteration when
  some identical parent state is found.

Paths with callback function invoked zero times are now verified first,
which leads to necessity to modify some selftests:
- the following negative tests required adding release/unlock/drop
  calls to avoid previously masked unrelated error reports:
  - cb_refs.c:underflow_prog
  - exceptions_fail.c:reject_rbtree_add_throw
  - exceptions_fail.c:reject_with_cp_reference
- the following precision tracking selftests needed change in expected
  log trace:
  - verifier_subprog_precision.c:callback_result_precise
    (note: r0 precision is no longer propagated inside callback and
           I think this is a correct behavior)
  - verifier_subprog_precision.c:parent_callee_saved_reg_precise_with_callback
  - verifier_subprog_precision.c:parent_stack_slot_precise_with_callback

Reported-by: Andrew Werner <awerner32@gmail.com>
Closes: https://lore.kernel.org/bpf/CA+vRuzPChFNXmouzGG+wsy=6eMcfr1mFG0F3g7rbg-sedGKW3w@mail.gmail.com/
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20231121020701.26440-7-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 months agobpf: extract setup_func_entry() utility function
Eduard Zingerman [Tue, 21 Nov 2023 02:06:55 +0000 (04:06 +0200)]
bpf: extract setup_func_entry() utility function

Move code for simulated stack frame creation to a separate utility
function. This function would be used in the follow-up change for
callbacks handling.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20231121020701.26440-6-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 months agobpf: extract __check_reg_arg() utility function
Eduard Zingerman [Tue, 21 Nov 2023 02:06:54 +0000 (04:06 +0200)]
bpf: extract __check_reg_arg() utility function

Split check_reg_arg() into two utility functions:
- check_reg_arg() operating on registers from current verifier state;
- __check_reg_arg() operating on a specific set of registers passed as
  a parameter;

The __check_reg_arg() function would be used by a follow-up change for
callbacks handling.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20231121020701.26440-5-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 months agoselftests/bpf: fix bpf_loop_bench for new callback verification scheme
Eduard Zingerman [Tue, 21 Nov 2023 02:06:53 +0000 (04:06 +0200)]
selftests/bpf: fix bpf_loop_bench for new callback verification scheme

This is a preparatory change. A follow-up patch "bpf: verify callbacks
as if they are called unknown number of times" changes logic for
callbacks handling. While previously callbacks were verified as a
single function call, new scheme takes into account that callbacks
could be executed unknown number of times.

This has dire implications for bpf_loop_bench:

    SEC("fentry/" SYS_PREFIX "sys_getpgid")
    int benchmark(void *ctx)
    {
            for (int i = 0; i < 1000; i++) {
                    bpf_loop(nr_loops, empty_callback, NULL, 0);
                    __sync_add_and_fetch(&hits, nr_loops);
            }
            return 0;
    }

W/o callbacks change verifier sees it as a 1000 calls to
empty_callback(). However, with callbacks change things become
exponential:
- i=0: state exploring empty_callback is scheduled with i=0 (a);
- i=1: state exploring empty_callback is scheduled with i=1;
  ...
- i=999: state exploring empty_callback is scheduled with i=999;
- state (a) is popped from stack;
- i=1: state exploring empty_callback is scheduled with i=1;
  ...

Avoid this issue by rewriting outer loop as bpf_loop().
Unfortunately, this adds a function call to a loop at runtime, which
negatively affects performance:

            throughput               latency
   before:  149.919 ± 0.168 M ops/s, 6.670 ns/op
   after :  137.040 ± 0.187 M ops/s, 7.297 ns/op

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20231121020701.26440-4-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 months agoselftests/bpf: track string payload offset as scalar in strobemeta
Eduard Zingerman [Tue, 21 Nov 2023 02:06:52 +0000 (04:06 +0200)]
selftests/bpf: track string payload offset as scalar in strobemeta

This change prepares strobemeta for update in callbacks verification
logic. To allow bpf_loop() verification converge when multiple
callback iterations are considered:
- track offset inside strobemeta_payload->payload directly as scalar
  value;
- at each iteration make sure that remaining
  strobemeta_payload->payload capacity is sufficient for execution of
  read_{map,str}_var functions;
- make sure that offset is tracked as unbound scalar between
  iterations, otherwise verifier won't be able infer that bpf_loop
  callback reaches identical states.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20231121020701.26440-3-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 months agoselftests/bpf: track tcp payload offset as scalar in xdp_synproxy
Eduard Zingerman [Tue, 21 Nov 2023 02:06:51 +0000 (04:06 +0200)]
selftests/bpf: track tcp payload offset as scalar in xdp_synproxy

This change prepares syncookie_{tc,xdp} for update in callbakcs
verification logic. To allow bpf_loop() verification converge when
multiple callback itreations are considered:
- track offset inside TCP payload explicitly, not as a part of the
  pointer;
- make sure that offset does not exceed MAX_PACKET_OFF enforced by
  verifier;
- make sure that offset is tracked as unbound scalar between
  iterations, otherwise verifier won't be able infer that bpf_loop
  callback reaches identical states.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20231121020701.26440-2-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 months agoMerge branch 'bpf_redirect_peer fixes'
Martin KaFai Lau [Mon, 20 Nov 2023 18:15:17 +0000 (10:15 -0800)]
Merge branch 'bpf_redirect_peer fixes'

Daniel Borkmann says:

====================
This fixes bpf_redirect_peer stats accounting for veth and netkit,
and adds tstats in the first place for the latter. Utilise indirect
call wrapper for bpf_redirect_peer, and improve test coverage of the
latter also for netkit devices. Details in the patches, thanks!

The series was targeted at bpf originally, and is done here as well,
so it can trigger BPF CI. Jakub, if you think directly going via net
is better since the majority of the diff touches net anyway, that is
fine, too.

Thanks!

v2 -> v3:
  - Add kdoc for pcpu_stat_type (Simon)
  - Reject invalid type value in netdev_do_alloc_pcpu_stats (Simon)
  - Add Reviewed-by tags from list
v1 -> v2:
  - Move stats allocation/freeing into net core (Jakub)
  - As prepwork for the above, move vrf's dstats over into the core
  - Add a check into stats alloc to enforce tstats upon
    implementing ndo_get_peer_dev
  - Add Acked-by tags from list

Daniel Borkmann (6):
  net, vrf: Move dstats structure to core
  net: Move {l,t,d}stats allocation to core and convert veth & vrf
  netkit: Add tstats per-CPU traffic counters
  bpf, netkit: Add indirect call wrapper for fetching peer dev
  selftests/bpf: De-veth-ize the tc_redirect test case
  selftests/bpf: Add netkit to tc_redirect selftest
====================

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
6 months agoselftests/bpf: Add netkit to tc_redirect selftest
Daniel Borkmann [Tue, 14 Nov 2023 00:42:20 +0000 (01:42 +0100)]
selftests/bpf: Add netkit to tc_redirect selftest

Extend the existing tc_redirect selftest to also cover netkit devices
for exercising the bpf_redirect_peer() code paths, so that we have both
veth as well as netkit covered, all tests still pass after this change.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20231114004220.6495-9-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
6 months agoselftests/bpf: De-veth-ize the tc_redirect test case
Daniel Borkmann [Tue, 14 Nov 2023 00:42:19 +0000 (01:42 +0100)]
selftests/bpf: De-veth-ize the tc_redirect test case

No functional changes to the test case, but just renaming various functions,
variables, etc, to remove veth part of their name for making it more generic
and reusable later on (e.g. for netkit).

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20231114004220.6495-8-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
6 months agobpf, netkit: Add indirect call wrapper for fetching peer dev
Daniel Borkmann [Tue, 14 Nov 2023 00:42:18 +0000 (01:42 +0100)]
bpf, netkit: Add indirect call wrapper for fetching peer dev

ndo_get_peer_dev is used in tcx BPF fast path, therefore make use of
indirect call wrapper and therefore optimize the bpf_redirect_peer()
internal handling a bit. Add a small skb_get_peer_dev() wrapper which
utilizes the INDIRECT_CALL_1() macro instead of open coding.

Future work could potentially add a peer pointer directly into struct
net_device in future and convert veth and netkit over to use it so
that eventually ndo_get_peer_dev can be removed.

Co-developed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20231114004220.6495-7-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
6 months agobpf: Fix dev's rx stats for bpf_redirect_peer traffic
Peilin Ye [Tue, 14 Nov 2023 00:42:17 +0000 (01:42 +0100)]
bpf: Fix dev's rx stats for bpf_redirect_peer traffic

Traffic redirected by bpf_redirect_peer() (used by recent CNIs like Cilium)
is not accounted for in the RX stats of supported devices (that is, veth
and netkit), confusing user space metrics collectors such as cAdvisor [0],
as reported by Youlun.

Fix it by calling dev_sw_netstats_rx_add() in skb_do_redirect(), to update
RX traffic counters. Devices that support ndo_get_peer_dev _must_ use the
@tstats per-CPU counters (instead of @lstats, or @dstats).

To make this more fool-proof, error out when ndo_get_peer_dev is set but
@tstats are not selected.

  [0] Specifically, the "container_network_receive_{byte,packet}s_total"
      counters are affected.

Fixes: 9aa1206e8f48 ("bpf: Add redirect_peer helper")
Reported-by: Youlun Zhang <zhangyoulun@bytedance.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20231114004220.6495-6-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
6 months agoveth: Use tstats per-CPU traffic counters
Peilin Ye [Tue, 14 Nov 2023 00:42:16 +0000 (01:42 +0100)]
veth: Use tstats per-CPU traffic counters

Currently veth devices use the lstats per-CPU traffic counters, which only
cover TX traffic. veth_get_stats64() actually populates RX stats of a veth
device from its peer's TX counters, based on the assumption that a veth
device can _only_ receive packets from its peer, which is no longer true:

For example, recent CNIs (like Cilium) can use the bpf_redirect_peer() BPF
helper to redirect traffic from NIC's tc ingress to veth's tc ingress (in
a different netns), skipping veth's peer device. Unfortunately, this kind
of traffic isn't currently accounted for in veth's RX stats.

In preparation for the fix, use tstats (instead of lstats) to maintain
both RX and TX counters for each veth device. We'll use RX counters for
bpf_redirect_peer() traffic, and keep using TX counters for the usual
"peer-to-peer" traffic. In veth_get_stats64(), calculate RX stats by
_adding_ RX count to peer's TX count, in order to cover both kinds of
traffic.

veth_stats_rx() might need a name change (perhaps to "veth_stats_xdp()")
for less confusion, but let's leave it to another patch to keep the fix
minimal.

Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20231114004220.6495-5-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
6 months agonetkit: Add tstats per-CPU traffic counters
Daniel Borkmann [Tue, 14 Nov 2023 00:42:15 +0000 (01:42 +0100)]
netkit: Add tstats per-CPU traffic counters

Add dev->tstats traffic accounting to netkit. The latter contains per-CPU
RX and TX counters.

The dev's TX counters are bumped upon pass/unspec as well as redirect
verdicts, in other words, on everything except for drops.

The dev's RX counters are bumped upon successful __netif_rx(), as well
as from skb_do_redirect() (not part of this commit here).

Using dev->lstats with having just a single packets/bytes counter and
inferring one another's RX counters from the peer dev's lstats is not
possible given skb_do_redirect() can also bump the device's stats.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20231114004220.6495-4-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
6 months agonet: Move {l,t,d}stats allocation to core and convert veth & vrf
Daniel Borkmann [Tue, 14 Nov 2023 00:42:14 +0000 (01:42 +0100)]
net: Move {l,t,d}stats allocation to core and convert veth & vrf

Move {l,t,d}stats allocation to the core and let netdevs pick the stats
type they need. That way the driver doesn't have to bother with error
handling (allocation failure checking, making sure free happens in the
right spot, etc) - all happening in the core.

Co-developed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Cc: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20231114004220.6495-3-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
6 months agonet, vrf: Move dstats structure to core
Daniel Borkmann [Tue, 14 Nov 2023 00:42:13 +0000 (01:42 +0100)]
net, vrf: Move dstats structure to core

Just move struct pcpu_dstats out of the vrf into the core, and streamline
the field names slightly, so they better align with the {t,l}stats ones.

No functional change otherwise. A conversion of the u64s to u64_stats_t
could be done at a separate point in future. This move is needed as we are
moving the {t,l,d}stats allocation/freeing to the core.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20231114004220.6495-2-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
6 months agoLinux 6.7-rc2 v6.7-rc2
Linus Torvalds [Sun, 19 Nov 2023 23:02:14 +0000 (15:02 -0800)]
Linux 6.7-rc2

6 months agoMerge tag 'kbuild-fixes-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/masahi...
Linus Torvalds [Sun, 19 Nov 2023 21:54:28 +0000 (13:54 -0800)]
Merge tag 'kbuild-fixes-v6.7' of git://git./linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - Fix section mismatch warning messages for riscv and loongarch

 - Remove CONFIG_IA64 left-over from linux/export-internal.h

 - Fix the location of the quotes for UIMAGE_NAME

 - Fix a memory leak bug in Kconfig

* tag 'kbuild-fixes-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kconfig: fix memory leak from range properties
  kbuild: Move the single quotes for image name
  linux/export: clean up the IA-64 KSYM_FUNC macro
  modpost: fix section mismatch message for RELA

6 months agoMerge tag 'irq_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 19 Nov 2023 21:49:32 +0000 (13:49 -0800)]
Merge tag 'irq_urgent_for_v6.7_rc2' of git://git./linux/kernel/git/tip/tip

Pull irq fix from Borislav Petkov:

 - Flush the translation service tables to prevent unpredictable
   behavior on non-coherent GIC devices

* tag 'irq_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic-v3-its: Flush ITS tables correctly in non-coherent GIC designs

6 months agoMerge tag 'x86_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 19 Nov 2023 21:46:17 +0000 (13:46 -0800)]
Merge tag 'x86_urgent_for_v6.7_rc2' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Ignore invalid x2APIC entries in order to not waste per-CPU data

 - Fix a back-to-back signals handling scenario when shadow stack is in
   use

 - A documentation fix

 - Add Kirill as TDX maintainer

* tag 'x86_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/acpi: Ignore invalid x2APIC entries
  x86/shstk: Delay signal entry SSP write until after user accesses
  x86/Documentation: Indent 'note::' directive for protocol version number note
  MAINTAINERS: Add Intel TDX entry

6 months agoMerge tag 'timers_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 19 Nov 2023 21:35:07 +0000 (13:35 -0800)]
Merge tag 'timers_urgent_for_v6.7_rc2' of git://git./linux/kernel/git/tip/tip

Pull timer fix from Borislav Petkov:

 - Do the push of pending hrtimers away from a CPU which is being
   offlined earlier in the offlining process in order to prevent a
   deadlock

* tag 'timers_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  hrtimers: Push pending hrtimers away from outgoing CPU earlier

6 months agoMerge tag 'sched_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 19 Nov 2023 21:32:00 +0000 (13:32 -0800)]
Merge tag 'sched_urgent_for_v6.7_rc2' of git://git./linux/kernel/git/tip/tip

Pull scheduler fixes from Borislav Petkov:

 - Fix virtual runtime calculation when recomputing a sched entity's
   weights

 - Fix wrongly rejected unprivileged poll requests to the cgroup psi
   pressure files

 - Make sure the load balancing is done by only one CPU

* tag 'sched_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Fix the decision for load balance
  sched: psi: fix unprivileged polling against cgroups
  sched/eevdf: Fix vruntime adjustment on reweight

6 months agoMerge tag 'locking_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 19 Nov 2023 21:30:21 +0000 (13:30 -0800)]
Merge tag 'locking_urgent_for_v6.7_rc2' of git://git./linux/kernel/git/tip/tip

Pull locking fix from Borislav Petkov:

 - Fix a hardcoded futex flags case which lead to one robust futex test
   failure

* tag 'locking_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  futex: Fix hardcoded flags

6 months agoMerge tag 'perf_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 19 Nov 2023 21:26:42 +0000 (13:26 -0800)]
Merge tag 'perf_urgent_for_v6.7_rc2' of git://git./linux/kernel/git/tip/tip

Pull perf fix from Borislav Petkov:

 - Make sure the context refcount is transferred too when migrating perf
   events

* tag 'perf_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix cpuctx refcounting

6 months agonet: fill in MODULE_DESCRIPTION()s for SOCK_DIAG modules
Jakub Kicinski [Sun, 19 Nov 2023 03:30:06 +0000 (19:30 -0800)]
net: fill in MODULE_DESCRIPTION()s for SOCK_DIAG modules

W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to all the sock diag modules in one fell swoop.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 months agoocteontx2-pf: Fix memory leak during interface down
Suman Ghosh [Fri, 17 Nov 2023 10:40:18 +0000 (16:10 +0530)]
octeontx2-pf: Fix memory leak during interface down

During 'ifconfig <netdev> down' one RSS memory was not getting freed.
This patch fixes the same.

Fixes: 81a4362016e7 ("octeontx2-pf: Add RSS multi group support")
Signed-off-by: Suman Ghosh <sumang@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 months agowireguard: use DEV_STATS_INC()
Eric Dumazet [Fri, 17 Nov 2023 14:17:33 +0000 (14:17 +0000)]
wireguard: use DEV_STATS_INC()

wg_xmit() can be called concurrently, KCSAN reported [1]
some device stats updates can be lost.

Use DEV_STATS_INC() for this unlikely case.

[1]
BUG: KCSAN: data-race in wg_xmit / wg_xmit

read-write to 0xffff888104239160 of 8 bytes by task 1375 on cpu 0:
wg_xmit+0x60f/0x680 drivers/net/wireguard/device.c:231
__netdev_start_xmit include/linux/netdevice.h:4918 [inline]
netdev_start_xmit include/linux/netdevice.h:4932 [inline]
xmit_one net/core/dev.c:3543 [inline]
dev_hard_start_xmit+0x11b/0x3f0 net/core/dev.c:3559
...

read-write to 0xffff888104239160 of 8 bytes by task 1378 on cpu 1:
wg_xmit+0x60f/0x680 drivers/net/wireguard/device.c:231
__netdev_start_xmit include/linux/netdevice.h:4918 [inline]
netdev_start_xmit include/linux/netdevice.h:4932 [inline]
xmit_one net/core/dev.c:3543 [inline]
dev_hard_start_xmit+0x11b/0x3f0 net/core/dev.c:3559
...

v2: also change wg_packet_consume_data_done() (Hangbin Liu)
    and wg_packet_purge_staged_packets()

Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 months agonet: wangxun: fix kernel panic due to null pointer
Jiawen Wu [Fri, 17 Nov 2023 10:11:08 +0000 (18:11 +0800)]
net: wangxun: fix kernel panic due to null pointer

When the device uses a custom subsystem vendor ID, the function
wx_sw_init() returns before the memory of 'wx->mac_table' is allocated.
The null pointer will causes the kernel panic.

Fixes: 79625f45ca73 ("net: wangxun: Move MAC address handling to libwx")
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 months agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sat, 18 Nov 2023 23:20:58 +0000 (15:20 -0800)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Seven small fixes, six in drivers and one in sd.

  The sd fix is so large because it changes a struct pointer to a struct
  but otherwise is fairly simple"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: qcom-ufs: dt-bindings: Document the SM8650 UFS Controller
  scsi: sd: Fix sshdr use in sd_suspend_common()
  scsi: scsi_debug: Delete some bogus error checking
  scsi: scsi_debug: Fix some bugs in sdebug_error_write()
  scsi: ufs: core: Fix racing issue between ufshcd_mcq_abort() and ISR
  scsi: ufs: core: Expand MCQ queue slot to DeviceQueueDepth + 1
  scsi: qla2xxx: Fix system crash due to bad pointer access

6 months agoMerge tag 'parisc-for-6.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/delle...
Linus Torvalds [Sat, 18 Nov 2023 23:13:10 +0000 (15:13 -0800)]
Merge tag 'parisc-for-6.7-rc2' of git://git./linux/kernel/git/deller/parisc-linux

Pull parisc fixes from Helge Deller:
 "On parisc we still sometimes need writeable stacks, e.g. if programs
  aren't compiled with gcc-14. To avoid issues with the upcoming
  systemd-254 we therefore have to disable prctl(PR_SET_MDWE) for now
  (for parisc only).

  The other two patches are minor: a bugfix for the soft power-off on
  qemu with 64-bit kernel and prefer strscpy() over strlcpy():

   - Fix power soft-off on qemu

   - Disable prctl(PR_SET_MDWE) since parisc sometimes still needs
     writeable stacks

   - Use strscpy instead of strlcpy in show_cpuinfo()"

* tag 'parisc-for-6.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  prctl: Disable prctl(PR_SET_MDWE) on parisc
  parisc/power: Fix power soft-off when running on qemu
  parisc: Replace strlcpy() with strscpy()

6 months agoMerge tag 'xfs-6.7-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Linus Torvalds [Sat, 18 Nov 2023 19:28:28 +0000 (11:28 -0800)]
Merge tag 'xfs-6.7-fixes-1' of git://git./fs/xfs/xfs-linux

Pull xfs fixes from Chandan Babu:

 - Fix deadlock arising due to intent items in AIL not being cleared
   when log recovery fails

 - Fix stale data exposure bug when remapping COW fork extents to data
   fork

 - Fix deadlock when data device flush fails

 - Fix AGFL minimum size calculation

 - Select DEBUG_FS instead of XFS_DEBUG when XFS_ONLINE_SCRUB_STATS is
   selected

 - Fix corruption of log inode's extent count field when NREXT64 feature
   is enabled

* tag 'xfs-6.7-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: recovery should not clear di_flushiter unconditionally
  xfs: inode recovery does not validate the recovered inode
  xfs: fix again select in kconfig XFS_ONLINE_SCRUB_STATS
  xfs: fix internal error from AGFL exhaustion
  xfs: up(ic_sema) if flushing data device fails
  xfs: only remap the written blocks in xfs_reflink_end_cow_extent
  XFS: Update MAINTAINERS to catch all XFS documentation
  xfs: abort intent items when recovery intents fail
  xfs: factor out xfs_defer_pending_abort

6 months agoMerge tag 'nfsd-6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Linus Torvalds [Sat, 18 Nov 2023 19:23:32 +0000 (11:23 -0800)]
Merge tag 'nfsd-6.7-1' of git://git./linux/kernel/git/cel/linux

Pull nfsd fixes from Chuck Lever:

 - Fix several long-standing bugs in the duplicate reply cache

 - Fix a memory leak

* tag 'nfsd-6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  NFSD: Fix checksum mismatches in the duplicate reply cache
  NFSD: Fix "start of NFS reply" pointer passed to nfsd_cache_update()
  NFSD: Update nfsd_cache_append() to use xdr_stream
  nfsd: fix file memleak on client_opens_release

6 months agoMerge tag '6.7-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sat, 18 Nov 2023 19:18:46 +0000 (11:18 -0800)]
Merge tag '6.7-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:

 - multichannel fixes (including a lock ordering fix and an important
   refcounting fix)

 - spnego fix

* tag '6.7-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: fix lock ordering while disabling multichannel
  cifs: fix leak of iface for primary channel
  cifs: fix check of rc in function generate_smb3signingkey
  cifs: spnego: add ';' in HOST_KEY_LEN

6 months agoprctl: Disable prctl(PR_SET_MDWE) on parisc
Helge Deller [Sat, 18 Nov 2023 18:33:35 +0000 (19:33 +0100)]
prctl: Disable prctl(PR_SET_MDWE) on parisc

systemd-254 tries to use prctl(PR_SET_MDWE) for it's MemoryDenyWriteExecute
functionality, but fails on parisc which still needs executable stacks in
certain combinations of gcc/glibc/kernel.

Disable prctl(PR_SET_MDWE) by returning -EINVAL for now on parisc, until
userspace has catched up.

Signed-off-by: Helge Deller <deller@gmx.de>
Co-developed-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Sam James <sam@gentoo.org>
Closes: https://github.com/systemd/systemd/issues/29775
Tested-by: Sam James <sam@gentoo.org>
Link: https://lore.kernel.org/all/875y2jro9a.fsf@gentoo.org/
Cc: <stable@vger.kernel.org> # v6.3+