net/mlx5e: Use synchronize_rcu to sync with NAPI
authorMaxim Mikityanskiy <maximmi@mellanox.com>
Thu, 11 Jun 2020 11:25:19 +0000 (14:25 +0300)
committerSaeed Mahameed <saeedm@nvidia.com>
Tue, 22 Sep 2020 00:22:21 +0000 (17:22 -0700)
commit9c25a22dfb00270372224721fed646965420323a
tree7aa5cd4729b74d75ce128a227504ddffbbd7752e
parentfe45386a208277cae4648106133c08246eecd012
net/mlx5e: Use synchronize_rcu to sync with NAPI

As described in the previous commit, napi_synchronize doesn't quite fit
the purpose when we just need to wait until the currently running NAPI
quits. Its implementation waits until NAPI is not running by polling and
waiting for 1ms in between. In cases where we need to deactivate one
queue (e.g., recovery flows) or where we deactivate them one-by-one
(deactivate channel flow), we may get stuck in napi_synchronize forever
if other queues keep NAPI active, causing a soft lockup. Depending on
kernel configuration (CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC), it may result
in a kernel panic.

To fix the issue, use synchronize_rcu to wait for NAPI to quit, and wrap
the whole NAPI in rcu_read_lock.

Fixes: acc6c5953af1 ("net/mlx5e: Split open/close channels to stages")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c
drivers/net/ethernet/mellanox/mlx5/core/en_main.c
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c