sfrench/cifs-2.6.git
19 months agoparisc: fix compilation errrors
Qian Cai [Sun, 25 Aug 2019 00:54:43 +0000 (17:54 -0700)]
parisc: fix compilation errrors

Commit 0cfaee2af3a0 ("include/asm-generic/5level-fixup.h: fix variable
'p4d' set but not used") converted a few functions from macros to static
inline, which causes parisc to complain,

  In file included from include/asm-generic/4level-fixup.h:38:0,
                   from arch/parisc/include/asm/pgtable.h:5,
                   from arch/parisc/include/asm/io.h:6,
                   from include/linux/io.h:13,
                   from sound/core/memory.c:9:
  include/asm-generic/5level-fixup.h:14:18: error: unknown type name 'pgd_t'; did you mean 'pid_t'?
   #define p4d_t    pgd_t
                    ^
  include/asm-generic/5level-fixup.h:24:28: note: in expansion of macro 'p4d_t'
   static inline int p4d_none(p4d_t p4d)
                              ^~~~~

It is because "4level-fixup.h" is included before "asm/page.h" where
"pgd_t" is defined.

Link: http://lkml.kernel.org/r/20190815205305.1382-1-cai@lca.pw
Fixes: 0cfaee2af3a0 ("include/asm-generic/5level-fixup.h: fix variable 'p4d' set but not used")
Signed-off-by: Qian Cai <cai@lca.pw>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
19 months agomm, page_alloc: move_freepages should not examine struct page of reserved memory
David Rientjes [Sun, 25 Aug 2019 00:54:40 +0000 (17:54 -0700)]
mm, page_alloc: move_freepages should not examine struct page of reserved memory

After commit 907ec5fca3dc ("mm: zero remaining unavailable struct
pages"), struct page of reserved memory is zeroed.  This causes
page->flags to be 0 and fixes issues related to reading
/proc/kpageflags, for example, of reserved memory.

The VM_BUG_ON() in move_freepages_block(), however, assumes that
page_zone() is meaningful even for reserved memory.  That assumption is
no longer true after the aforementioned commit.

There's no reason why move_freepages_block() should be testing the
legitimacy of page_zone() for reserved memory; its scope is limited only
to pages on the zone's freelist.

Note that pfn_valid() can be true for reserved memory: there is a
backing struct page.  The check for page_to_nid(page) is also buggy but
reserved memory normally only appears on node 0 so the zeroing doesn't
affect this.

Move the debug checks to after verifying PageBuddy is true.  This
isolates the scope of the checks to only be for buddy pages which are on
the zone's freelist which move_freepages_block() is operating on.  In
this case, an incorrect node or zone is a bug worthy of being warned
about (and the examination of struct page is acceptable bcause this
memory is not reserved).

Why does move_freepages_block() gets called on reserved memory? It's
simply math after finding a valid free page from the per-zone free area
to use as fallback.  We find the beginning and end of the pageblock of
the valid page and that can bring us into memory that was reserved per
the e820.  pfn_valid() is still true (it's backed by a struct page), but
since it's zero'd we shouldn't make any inferences here about comparing
its node or zone.  The current node check just happens to succeed most
of the time by luck because reserved memory typically appears on node 0.

The fix here is to validate that we actually have buddy pages before
testing if there's any type of zone or node strangeness going on.

We noticed it almost immediately after bringing 907ec5fca3dc in on
CONFIG_DEBUG_VM builds.  It depends on finding specific free pages in
the per-zone free area where the math in move_freepages() will bring the
start or end pfn into reserved memory and wanting to claim that entire
pageblock as a new migratetype.  So the path will be rare, require
CONFIG_DEBUG_VM, and require fallback to a different migratetype.

Some struct pages were already zeroed from reserve pages before
907ec5fca3c so it theoretically could trigger before this commit.  I
think it's rare enough under a config option that most people don't run
that others may not have noticed.  I wouldn't argue against a stable tag
and the backport should be easy enough, but probably wouldn't single out
a commit that this is fixing.

Mel said:

: The overhead of the debugging check is higher with this patch although
: it'll only affect debug builds and the path is not particularly hot.
: If this was a concern, I think it would be reasonable to simply remove
: the debugging check as the zone boundaries are checked in
: move_freepages_block and we never expect a zone/node to be smaller than
: a pageblock and stuck in the middle of another zone.

Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1908122036560.10779@chino.kir.corp.google.com
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
19 months agomm/z3fold.c: fix race between migration and destruction
Henry Burns [Sun, 25 Aug 2019 00:54:37 +0000 (17:54 -0700)]
mm/z3fold.c: fix race between migration and destruction

In z3fold_destroy_pool() we call destroy_workqueue(&pool->compact_wq).
However, we have no guarantee that migration isn't happening in the
background at that time.

Migration directly calls queue_work_on(pool->compact_wq), if destruction
wins that race we are using a destroyed workqueue.

Link: http://lkml.kernel.org/r/20190809213828.202833-1-henryburns@google.com
Signed-off-by: Henry Burns <henryburns@google.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Jonathan Adams <jwadams@google.com>
Cc: Henry Burns <henrywolfeburns@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
19 months agoMerge tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 24 Aug 2019 18:42:06 +0000 (11:42 -0700)]
Merge tag 'hyperv-fixes-signed' of git://git./linux/kernel/git/hyperv/linux

Pull Hyper-V fixes from Sasha Levin:

 - Fix for panics and network failures on PAE guests by Dexuan Cui.

 - Fix of a memory leak (and related cleanups) in the hyper-v keyboard
   driver by Dexuan Cui.

 - Code cleanups for hyper-v clocksource driver during the merge window
   by Dexuan Cui.

 - Fix for a false positive warning in the userspace hyper-v KVP store
   by Vitaly Kuznetsov.

* tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  Drivers: hv: vmbus: Fix virt_to_hvpfn() for X86_PAE
  Tools: hv: kvp: eliminate 'may be used uninitialized' warning
  Input: hyperv-keyboard: Use in-place iterator API in the channel callback
  Drivers: hv: vmbus: Remove the unused "tsc_page" from struct hv_context

19 months agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Sat, 24 Aug 2019 18:35:25 +0000 (11:35 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "Two KVM/arm fixes for MMIO emulation and UBSAN.

  Unusually, we're routing them via the arm64 tree as per Paolo's
  request on the list:

    https://lore.kernel.org/kvm/21ae69a2-2546-29d0-bff6-2ea825e3d968@redhat.com/

  We don't actually have any other arm64 fixes pending at the moment
  (touch wood), so I've pulled from Marc, written a merge commit, tagged
  the result and run it through my build/boot/bisect scripts"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  KVM: arm/arm64: VGIC: Properly initialise private IRQ affinity
  KVM: arm/arm64: Only skip MMIO insn once

19 months agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sat, 24 Aug 2019 18:26:51 +0000 (11:26 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Four fixes, three for edge conditions which don't occur very often.
  The lpfc fix mitigates memory exhaustion for some high CPU systems"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: lpfc: Mitigate high memory pre-allocation by SCSI-MQ
  scsi: ufs: Fix NULL pointer dereference in ufshcd_config_vreg_hpm()
  scsi: target: tcmu: avoid use-after-free after command timeout
  scsi: qla2xxx: Fix gnl.l memory leak on adapter init failure

19 months agoMerge tag 'xfs-5.3-fixes-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Linus Torvalds [Sat, 24 Aug 2019 18:21:26 +0000 (11:21 -0700)]
Merge tag 'xfs-5.3-fixes-6' of git://git./fs/xfs/xfs-linux

Pull xfs fix from Darrick Wong:
 "A single patch that fixes a xfs lockup problem when a chown/chgrp
  operation fails due to running out of quota. It has survived the usual
  xfstests runs and merges cleanly with this morning's master:

   - Fix a forgotten inode unlock when chown/chgrp fail due to quota"

* tag 'xfs-5.3-fixes-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: fix missing ILOCK unlock when xfs_setattr_nonsize fails due to EDQUOT

19 months agoMerge tag 'drm-fixes-2019-08-24' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Sat, 24 Aug 2019 18:16:04 +0000 (11:16 -0700)]
Merge tag 'drm-fixes-2019-08-24' of git://anongit.freedesktop.org/drm/drm

Pull more drm fixes from Dave Airlie:
 "Although the tree built for me fine on arm here, it appears either
  header cleanups in next or some kconfig combo it breaks, so this
  contains a fix to mediatek to include dma-mapping.h explicitly.

  There was also one nouveau fix that came in late that I was going to
  leave until next week, but since I was sending this I thought it may
  as well be in here:

  mediatek:
   - fix build in some cases

  nouveau:
   - fix hang with i2c and mst docks"

* tag 'drm-fixes-2019-08-24' of git://anongit.freedesktop.org/drm/drm:
  drm/mediatek: include dma-mapping header
  drm/nouveau: Don't retry infinitely when receiving no data on i2c over AUX

19 months agoMerge tag 'kvmarm-fixes-for-5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git...
Will Deacon [Sat, 24 Aug 2019 11:45:20 +0000 (12:45 +0100)]
Merge tag 'kvmarm-fixes-for-5.3-3' of git://git./linux/kernel/git/kvmarm/kvmarm into kvm/fixes

Pull KVM/arm fixes from Marc Zyngier as per Paulo's request at:

  https://lkml.kernel.org/r/21ae69a2-2546-29d0-bff6-2ea825e3d968@redhat.com

  "One (hopefully last) set of fixes for KVM/arm for 5.3: an embarassing
   MMIO emulation regression, and a UBSAN splat. Oh well...

   - Don't overskip instructions on MMIO emulation

   - Fix UBSAN splat when initializing PPI priorities"

* tag 'kvmarm-fixes-for-5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm:
  KVM: arm/arm64: VGIC: Properly initialise private IRQ affinity
  KVM: arm/arm64: Only skip MMIO insn once

19 months agodrm/mediatek: include dma-mapping header
Dave Airlie [Sat, 24 Aug 2019 05:07:07 +0000 (15:07 +1000)]
drm/mediatek: include dma-mapping header

Although it builds fine here in my arm cross compile, it seems
either via some other patches in -next or some Kconfig combination,
this fails to build for everyone.

Include linux/dma-mapping.h should fix it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
19 months agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Linus Torvalds [Fri, 23 Aug 2019 21:53:09 +0000 (14:53 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma

Pull rdma fixes from Doug Ledford:
 "No beating around the bush: this is a monster pull request for an -rc5
  kernel. Intel hit me with a series of fixes for TID processing.
  Mellanox hit me with a series for their UMR memory support.

  And we had one fix for siw that fixes the 32bit build warnings and
  because of the number of casts that had to be changed to properly
  silence the warnings, that one patch alone is a full 40% of the LOC of
  this entire pull request. Given that this is the initial release
  kernel for siw, I'm trying to fix anything in it that we can, so that
  adds to the impetus to take fixes for it like this one.

  I had to do a rebase early in the week. Jason had thought he put a
  patch on the rc queue that he needed to be there so he could base some
  work off of it, and it had actually not been placed there. So he asked
  me (on Tuesday) to fix that up before pushing my wip branch to the
  official rc branch. I did, and that's why the early patches look like
  they were all committed at the same time on Tuesday. That bunch had
  been in my queue prior.

  The various patches all pass my test for being legitimate fixes and
  not attempts to slide new features or development into a late rc.
  Well, they were all fixes with the exception of a couple clean up
  patches people wrote for making the fixes they also wrote better (like
  a cleanup patch to move UMR checking into a function so that the
  remaining UMR fix patches can reference that function), so I left
  those in place too.

  My apologies for the LOC count and the number of patches here, it's
  just how the cards fell this cycle.

  Summary:

   - Fix siw buffer mapping issue

   - Fix siw 32/64 casting issues

   - Fix a KASAN access issue in bnxt_re

   - Fix several memory leaks (hfi1, mlx4)

   - Fix a NULL deref in cma_cleanup

   - Fixes for UMR memory support in mlx5 (4 patch series)

   - Fix namespace check for restrack

   - Fixes for counter support

   - Fixes for hfi1 TID processing (5 patch series)

   - Fix potential NULL deref in siw

   - Fix memory page calculations in mlx5"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (21 commits)
  RDMA/siw: Fix 64/32bit pointer inconsistency
  RDMA/siw: Fix SGL mapping issues
  RDMA/bnxt_re: Fix stack-out-of-bounds in bnxt_qplib_rcfw_send_message
  infiniband: hfi1: fix memory leaks
  infiniband: hfi1: fix a memory leak bug
  IB/mlx4: Fix memory leaks
  RDMA/cma: fix null-ptr-deref Read in cma_cleanup
  IB/mlx5: Block MR WR if UMR is not possible
  IB/mlx5: Fix MR re-registration flow to use UMR properly
  IB/mlx5: Report and handle ODP support properly
  IB/mlx5: Consolidate use_umr checks into single function
  RDMA/restrack: Rewrite PID namespace check to be reliable
  RDMA/counters: Properly implement PID checks
  IB/core: Fix NULL pointer dereference when bind QP to counter
  IB/hfi1: Drop stale TID RDMA packets that cause TIDErr
  IB/hfi1: Add additional checks when handling TID RDMA WRITE DATA packet
  IB/hfi1: Add additional checks when handling TID RDMA READ RESP packet
  IB/hfi1: Unsafe PSN checking for TID RDMA READ Resp packet
  IB/hfi1: Drop stale TID RDMA packets
  RDMA/siw: Fix potential NULL de-ref
  ...

19 months agoMerge tag 'for-linus-20190823' of git://git.kernel.dk/linux-block
Linus Torvalds [Fri, 23 Aug 2019 21:45:45 +0000 (14:45 -0700)]
Merge tag 'for-linus-20190823' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "Here's a set of fixes that should go into this release. This contains:

   - Three minor fixes for NVMe.

   - Three minor tweaks for the io_uring polling logic.

   - Officially mark Song as the MD maintainer, after he's been filling
     that role sucessfully for the last 6 months or so"

* tag 'for-linus-20190823' of git://git.kernel.dk/linux-block:
  io_uring: add need_resched() check in inner poll loop
  md: update MAINTAINERS info
  io_uring: don't enter poll loop if we have CQEs pending
  nvme: Add quirk for LiteON CL1 devices running FW 22301111
  nvme: Fix cntlid validation when not using NVMEoF
  nvme-multipath: fix possible I/O hang when paths are updated
  io_uring: fix potential hang with polled IO

19 months agoMerge tag 'for-5.3/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/devic...
Linus Torvalds [Fri, 23 Aug 2019 17:53:34 +0000 (10:53 -0700)]
Merge tag 'for-5.3/dm-fixes-2' of git://git./linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

 - Revert a DM bufio change from during the 5.3 merge window now that a
   proper fix has been made to the block loopback driver.

 - Fix DM kcopyd to wakeup so failed subjobs get completed.

 - Various fixes to DM zoned target to address error handling, and other
   small tweaks (SPDX license identifiers and fix typos).

 - Fix DM integrity range locking race by tracking whether journal has
   changed.

 - Fix DM dust target to detect reads of badblocks beyond the first 512b
   sector (applicable if blocksize is larger than 512b).

 - Fix DM persistent-data issue in both the DM btree and DM
   space-map-metadata interfaces.

 - Fix out of bounds memory access with certain DM table configurations.

* tag 'for-5.3/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm table: fix invalid memory accesses with too high sector number
  dm space map metadata: fix missing store of apply_bops() return value
  dm btree: fix order of block initialization in btree_split_beneath
  dm raid: add missing cleanup in raid_ctr()
  dm zoned: fix potential NULL dereference in dmz_do_reclaim()
  dm dust: use dust block size for badblocklist index
  dm integrity: fix a crash due to BUG_ON in __journal_read_write()
  dm zoned: fix a few typos
  dm zoned: add SPDX license identifiers
  dm zoned: properly handle backing device failure
  dm zoned: improve error handling in i/o map code
  dm zoned: improve error handling in reclaim
  dm kcopyd: always complete failed jobs
  Revert "dm bufio: fix deadlock with loop device"

19 months agoMerge tag 'xfs-5.3-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Linus Torvalds [Fri, 23 Aug 2019 17:49:44 +0000 (10:49 -0700)]
Merge tag 'xfs-5.3-fixes-4' of git://git./fs/xfs/xfs-linux

Pull xfs fixes from Darrick Wong:
 "Here are a few more bug fixes that trickled in since the last pull.
  They've survived the usual xfstests runs and merge cleanly with this
  morning's master.

  I expect there to be one more pull request tomorrow for the fix to
  that quota related inode unlock bug that we were reviewing last night,
  but it will continue to soak in the testing machine for several more
  hours.

   - Fix missing compat ioctl handling for get/setlabel

   - Fix missing ioctl pointer sanitization on s390

   - Fix a page locking deadlock in the dedupe comparison code

   - Fix inadequate locking in reflink code w.r.t. concurrent directio

   - Fix broken error detection when breaking layouts"

* tag 'xfs-5.3-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  fs/xfs: Fix return code of xfs_break_leased_layouts()
  xfs: fix reflink source file racing with directio writes
  vfs: fix page locking deadlocks when deduping files
  xfs: compat_ioctl: use compat_ptr()
  xfs: fall back to native ioctls for unhandled compat ones

19 months agoKVM: arm/arm64: VGIC: Properly initialise private IRQ affinity
Andre Przywara [Fri, 23 Aug 2019 10:34:16 +0000 (11:34 +0100)]
KVM: arm/arm64: VGIC: Properly initialise private IRQ affinity

At the moment we initialise the target *mask* of a virtual IRQ to the
VCPU it belongs to, even though this mask is only defined for GICv2 and
quickly runs out of bits for many GICv3 guests.
This behaviour triggers an UBSAN complaint for more than 32 VCPUs:
------
[ 5659.462377] UBSAN: Undefined behaviour in virt/kvm/arm/vgic/vgic-init.c:223:21
[ 5659.471689] shift exponent 32 is too large for 32-bit type 'unsigned int'
------
Also for GICv3 guests the reporting of TARGET in the "vgic-state" debugfs
dump is wrong, due to this very same problem.

Because there is no requirement to create the VGIC device before the
VCPUs (and QEMU actually does it the other way round), we can't safely
initialise mpidr or targets in kvm_vgic_vcpu_init(). But since we touch
every private IRQ for each VCPU anyway later (in vgic_init()), we can
just move the initialisation of those fields into there, where we
definitely know the VGIC type.

On the way make sure we really have either a VGICv2 or a VGICv3 device,
since the existing code is just checking for "VGICv3 or not", silently
ignoring the uninitialised case.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reported-by: Dave Martin <dave.martin@arm.com>
Tested-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
19 months agoMerge tag 'modules-for-v5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 23 Aug 2019 16:22:00 +0000 (09:22 -0700)]
Merge tag 'modules-for-v5.3-rc6' of git://git./linux/kernel/git/jeyu/linux

Pull modules fixes from Jessica Yu:
 "Fix BUG_ON() being triggered in frob_text() due to non-page-aligned
  module sections"

* tag 'modules-for-v5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
  modules: page-align module section allocations only for arches supporting strict module rwx
  modules: always page-align module section allocations

19 months agoMerge tag 'ceph-for-5.3-rc6' of git://github.com/ceph/ceph-client
Linus Torvalds [Fri, 23 Aug 2019 16:19:38 +0000 (09:19 -0700)]
Merge tag 'ceph-for-5.3-rc6' of git://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "Three important fixes tagged for stable (an indefinite hang, a crash
  on an assert and a NULL pointer dereference) plus a small series from
  Luis fixing instances of vfree() under spinlock"

* tag 'ceph-for-5.3-rc6' of git://github.com/ceph/ceph-client:
  libceph: fix PG split vs OSD (re)connect race
  ceph: don't try fill file_lock on unsuccessful GETFILELOCK reply
  ceph: clear page dirty before invalidate page
  ceph: fix buffer free while holding i_ceph_lock in fill_inode()
  ceph: fix buffer free while holding i_ceph_lock in __ceph_build_xattrs_blob()
  ceph: fix buffer free while holding i_ceph_lock in __ceph_setxattr()
  libceph: allow ceph_buffer_put() to receive a NULL ceph_buffer

19 months agoRDMA/siw: Fix 64/32bit pointer inconsistency
Bernard Metzler [Thu, 22 Aug 2019 17:37:38 +0000 (19:37 +0200)]
RDMA/siw: Fix 64/32bit pointer inconsistency

Fixes improper casting between addresses and unsigned types.
Changes siw_pbl_get_buffer() function to return appropriate
dma_addr_t, and not u64.

Also fixes debug prints. Now any potentially kernel private
pointers are printed formatted as '%pK', to allow keeping that
information secret.

Fixes: d941bfe500be ("RDMA/siw: Change CQ flags from 64->32 bits")
Fixes: b0fff7317bb4 ("rdma/siw: completion queue methods")
Fixes: 8b6a361b8c48 ("rdma/siw: receive path")
Fixes: b9be6f18cf9e ("rdma/siw: transmit path")
Fixes: f29dd55b0236 ("rdma/siw: queue pair methods")
Fixes: 2251334dcac9 ("rdma/siw: application buffer management")
Fixes: 303ae1cdfdf7 ("rdma/siw: application interface")
Fixes: 6c52fdc244b5 ("rdma/siw: connection management")
Fixes: a531975279f3 ("rdma/siw: main include file")

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reported-by: Jason Gunthorpe <jgg@ziepe.ca>
Reported-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Link: https://lore.kernel.org/r/20190822173738.26817-1-bmt@zurich.ibm.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
19 months agoMerge tag 'drm-fixes-2019-08-23' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 23 Aug 2019 16:03:06 +0000 (09:03 -0700)]
Merge tag 'drm-fixes-2019-08-23' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Live from the laundromat after my washing machine broke down, we have
  the 5.3-rc6 fixes. Changelog is in the tag below, but nothing too
  noteworthy in here:

  rcar-du:
   - LVDS dual-link mode fix

  mediatek:
   - of node refcount fix
   - prime buffer import fix
   - dma max seg fix

  komeda:
   - output polling fix
   - abfc format fix
   - memory-region DT fix

  amdgpu:
   - bpc display fix
   - ioctl memory leak fix
   - gfxoff fix
   - smu warnings fix

  i915:
   - HDMI mode readout fix"

* tag 'drm-fixes-2019-08-23' of git://anongit.freedesktop.org/drm/drm:
  drm/amdgpu/powerplay: silence a warning in smu_v11_0_setup_pptable
  drm/amd/display: Calculate bpc based on max_requested_bpc
  drm/amdgpu: prevent memory leaks in AMDGPU_CS ioctl
  drm/amd/amdgpu: disable MMHUB PG for navi10
  drm/amd/powerplay: remove duplicate macro smu_get_uclk_dpm_states in amdgpu_smu.h
  drm/amd/powerplay: fix variable type errors in smu_v11_0_setup_pptable
  drm/amdgpu/gfx9: update pg_flags after determining if gfx off is possible
  drm/i915: Fix HW readout for crtc_clock in HDMI mode
  drm/mediatek: mtk_drm_drv.c: Add of_node_put() before goto
  drm: rcar_lvds: Fix dual link mode operations
  drm/mediatek: set DMA max segment size
  drm/mediatek: use correct device to import PRIME buffers
  drm/omap: ensure we have a valid dma_mask
  drm/komeda: Add support for 'memory-region' DT node property
  drm/komeda: Adds internal bpp computing for arm afbc only format YU08 YU10
  drm/komeda: Initialize and enable output polling on Komeda

19 months agodm table: fix invalid memory accesses with too high sector number
Mikulas Patocka [Fri, 23 Aug 2019 13:54:09 +0000 (09:54 -0400)]
dm table: fix invalid memory accesses with too high sector number

If the sector number is too high, dm_table_find_target() should return a
pointer to a zeroed dm_target structure (the caller should test it with
dm_target_is_valid).

However, for some table sizes, the code in dm_table_find_target() that
performs btree lookup will access out of bound memory structures.

Fix this bug by testing the sector number at the beginning of
dm_table_find_target(). Also, add an "inline" keyword to the function
dm_table_get_size() because this is a hot path.

Fixes: 512875bd9661 ("dm: table detect io beyond device")
Cc: stable@vger.kernel.org
Reported-by: Zhang Tao <kontais@zoho.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
19 months agoxfs: fix missing ILOCK unlock when xfs_setattr_nonsize fails due to EDQUOT
Darrick J. Wong [Fri, 23 Aug 2019 03:55:54 +0000 (20:55 -0700)]
xfs: fix missing ILOCK unlock when xfs_setattr_nonsize fails due to EDQUOT

Benjamin Moody reported to Debian that XFS partially wedges when a chgrp
fails on account of being out of disk quota.  I ran his reproducer
script:

# adduser dummy
# adduser dummy plugdev

# dd if=/dev/zero bs=1M count=100 of=test.img
# mkfs.xfs test.img
# mount -t xfs -o gquota test.img /mnt
# mkdir -p /mnt/dummy
# chown -c dummy /mnt/dummy
# xfs_quota -xc 'limit -g bsoft=100k bhard=100k plugdev' /mnt

(and then as user dummy)

$ dd if=/dev/urandom bs=1M count=50 of=/mnt/dummy/foo
$ chgrp plugdev /mnt/dummy/foo

and saw:

================================================
WARNING: lock held when returning to user space!
5.3.0-rc5 #rc5 Tainted: G        W
------------------------------------------------
chgrp/47006 is leaving the kernel with locks still held!
1 lock held by chgrp/47006:
 #0: 000000006664ea2d (&xfs_nondir_ilock_class){++++}, at: xfs_ilock+0xd2/0x290 [xfs]

...which is clearly caused by xfs_setattr_nonsize failing to unlock the
ILOCK after the xfs_qm_vop_chown_reserve call fails.  Add the missing
unlock.

Reported-by: benjamin.moody@gmail.com
Fixes: 253f4911f297 ("xfs: better xfs_trans_alloc interface")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Tested-by: Salvatore Bonaccorso <carnil@debian.org>
19 months agoMerge branch 'linux-5.3' of git://github.com/skeggsb/linux into drm-fixes
Dave Airlie [Fri, 23 Aug 2019 03:53:59 +0000 (13:53 +1000)]
Merge branch 'linux-5.3' of git://github.com/skeggsb/linux into drm-fixes

Fixes i2c on DP with some docks.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Ben Skeggs <skeggsb@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CACAvsv713t2_BQ44gVV7Lqic6Vwmhq0r4FB5v-t0kD1jzFrbmQ@mail.gmail.com
19 months agodrm/nouveau: Don't retry infinitely when receiving no data on i2c over AUX
Lyude Paul [Thu, 25 Jul 2019 19:40:01 +0000 (15:40 -0400)]
drm/nouveau: Don't retry infinitely when receiving no data on i2c over AUX

While I had thought I had fixed this issue in:

commit 342406e4fbba ("drm/nouveau/i2c: Disable i2c bus access after
->fini()")

It turns out that while I did fix the error messages I was seeing on my
P50 when trying to access i2c busses with the GPU in runtime suspend, I
accidentally had missed one important detail that was mentioned on the
bug report this commit was supposed to fix: that the CPU would only lock
up when trying to access i2c busses _on connected devices_ _while the
GPU is not in runtime suspend_. Whoops. That definitely explains why I
was not able to get my machine to hang with i2c bus interactions until
now, as plugging my P50 into it's dock with an HDMI monitor connected
allowed me to finally reproduce this locally.

Now that I have managed to reproduce this issue properly, it looks like
the problem is much simpler then it looks. It turns out that some
connected devices, such as MST laptop docks, will actually ACK i2c reads
even if no data was actually read:

[  275.063043] nouveau 0000:01:00.0: i2c: aux 000a: 1: 0000004c 1
[  275.063447] nouveau 0000:01:00.0: i2c: aux 000a: 00 01101000 10040000
[  275.063759] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000001
[  275.064024] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000
[  275.064285] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000
[  275.064594] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000

Because we don't handle the situation of i2c ack without any data, we
end up entering an infinite loop in nvkm_i2c_aux_i2c_xfer() since the
value of cnt always remains at 0. This finally properly explains how
this could result in a CPU hang like the ones observed in the
aforementioned commit.

So, fix this by retrying transactions if no data is written or received,
and give up and fail the transaction if we continue to not write or
receive any data after 32 retries.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
19 months agodrm/amdgpu/powerplay: silence a warning in smu_v11_0_setup_pptable
Alex Deucher [Thu, 22 Aug 2019 03:25:27 +0000 (22:25 -0500)]
drm/amdgpu/powerplay: silence a warning in smu_v11_0_setup_pptable

I think gcc is confused as I don't see how size could be used
unitialized, but go ahead and silence the warning.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190822032527.1376-1-alexander.deucher@amd.com
19 months agoMerge tag 'drm-misc-fixes-2019-08-22' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Fri, 23 Aug 2019 01:43:47 +0000 (11:43 +1000)]
Merge tag 'drm-misc-fixes-2019-08-22' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

Fixes for v5.3-rc6:
- dma fix for omap.
- Make output polling work on komeda.
- Fix bpp computing for AFBC formats in komeda.
- Support the memory-region property in komeda.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/5f1fdfe3-814e-fad1-663c-7279217fc085@linux.intel.com
19 months agoMerge tag 'drm-intel-fixes-2019-08-22' of git://anongit.freedesktop.org/drm/drm-intel...
Dave Airlie [Fri, 23 Aug 2019 01:41:58 +0000 (11:41 +1000)]
Merge tag 'drm-intel-fixes-2019-08-22' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

drm/i915 fixes for v5.3-rc6:
- fix hardware state readout for 10 bpc HDMI

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87sgptd114.fsf@intel.com
19 months agoio_uring: add need_resched() check in inner poll loop
Jens Axboe [Thu, 22 Aug 2019 04:19:11 +0000 (22:19 -0600)]
io_uring: add need_resched() check in inner poll loop

The outer poll loop checks for whether we need to reschedule, and
returns to userspace if we do. However, it's possible to get stuck
in the inner loop as well, if the CPU we are running on needs to
reschedule to finish the IO work.

Add the need_resched() check in the inner loop as well. This fixes
a potential hang if the kernel is configured with
CONFIG_PREEMPT_VOLUNTARY=y.

Reported-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Tested-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
19 months agoMerge tag 'pci-v5.3-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Linus Torvalds [Thu, 22 Aug 2019 21:04:47 +0000 (14:04 -0700)]
Merge tag 'pci-v5.3-fixes-1' of git://git./linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:

 - Reset both NVIDIA GPU and HDA in ThinkPad P50 quirk, which was broken
   by another quirk that enabled the HDA device (Lyude Paul)

 - Fix pciebus-howto.rst documentation filename typo (Bjorn Helgaas)

* tag 'pci-v5.3-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  Documentation PCI: Fix pciebus-howto.rst filename typo
  PCI: Reset both NVIDIA GPU and HDA in ThinkPad P50 workaround

19 months agodm space map metadata: fix missing store of apply_bops() return value
ZhangXiaoxu [Mon, 19 Aug 2019 03:31:21 +0000 (11:31 +0800)]
dm space map metadata: fix missing store of apply_bops() return value

In commit 6096d91af0b6 ("dm space map metadata: fix occasional leak
of a metadata block on resize"), we refactor the commit logic to a new
function 'apply_bops'.  But when that logic was replaced in out() the
return value was not stored.  This may lead out() returning a wrong
value to the caller.

Fixes: 6096d91af0b6 ("dm space map metadata: fix occasional leak of a metadata block on resize")
Cc: stable@vger.kernel.org
Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
19 months agodm btree: fix order of block initialization in btree_split_beneath
ZhangXiaoxu [Sat, 17 Aug 2019 05:32:40 +0000 (13:32 +0800)]
dm btree: fix order of block initialization in btree_split_beneath

When btree_split_beneath() splits a node to two new children, it will
allocate two blocks: left and right.  If right block's allocation
failed, the left block will be unlocked and marked dirty.  If this
happened, the left block'ss content is zero, because it wasn't
initialized with the btree struct before the attempot to allocate the
right block.  Upon return, when flushing the left block to disk, the
validator will fail when check this block.  Then a BUG_ON is raised.

Fix this by completely initializing the left block before allocating and
initializing the right block.

Fixes: 4dcb8b57df359 ("dm btree: fix leak of bufio-backed block in btree_split_beneath error path")
Cc: stable@vger.kernel.org
Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
19 months agoMerge tag 'Wimplicit-fallthrough-5.3-rc6' of git://git.kernel.org/pub/scm/linux/kerne...
Linus Torvalds [Thu, 22 Aug 2019 18:26:10 +0000 (11:26 -0700)]
Merge tag 'Wimplicit-fallthrough-5.3-rc6' of git://git./linux/kernel/git/gustavoars/linux

Pull more fallthrough fixes from Gustavo A. R. Silva:
 "Fix fall-through warnings on arm and mips for multiple configurations"

* tag 'Wimplicit-fallthrough-5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
  video: fbdev: acornfb: Mark expected switch fall-through
  scsi: libsas: sas_discover: Mark expected switch fall-through
  MIPS: Octeon: Mark expected switch fall-through
  power: supply: ab8500_charger: Mark expected switch fall-through
  watchdog: wdt285: Mark expected switch fall-through
  mtd: sa1100: Mark expected switch fall-through
  drm/sun4i: tcon: Mark expected switch fall-through
  drm/sun4i: sun6i_mipi_dsi: Mark expected switch fall-through
  ARM: riscpc: Mark expected switch fall-through
  dmaengine: fsldma: Mark expected switch fall-through

19 months agoMerge tag 'tag-chrome-platform-fixes-for-v5.3-rc6' of git://git.kernel.org/pub/scm...
Linus Torvalds [Thu, 22 Aug 2019 18:17:20 +0000 (11:17 -0700)]
Merge tag 'tag-chrome-platform-fixes-for-v5.3-rc6' of git://git./linux/kernel/git/chrome-platform/linux

Pull chrome platform fix from Benson Leung:
 "Fix a kernel crash during suspend/resume of cros_ec_ishtp"

* tag 'tag-chrome-platform-fixes-for-v5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
  platform/chrome: cros_ec_ishtp: fix crash during suspend

19 months agoMerge tag 'afs-fixes-20190822' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowe...
Linus Torvalds [Thu, 22 Aug 2019 18:12:33 +0000 (11:12 -0700)]
Merge tag 'afs-fixes-20190822' of git://git./linux/kernel/git/dhowells/linux-fs

Pull AFS fixes from David Howells:

 - Fix a cell record leak due to the default error not being cleared.

 - Fix an oops in tracepoint due to a pointer that may contain an error.

 - Fix the ACL storage op for YFS where the wrong op definition is being
   used. By luck, this only actually affects the information appearing
   in traces.

* tag 'afs-fixes-20190822' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  afs: use correct afs_call_type in yfs_fs_store_opaque_acl2
  afs: Fix possible oops in afs_lookup trace event
  afs: Fix leak in afs_lookup_cell_rcu()

19 months agoRDMA/siw: Fix SGL mapping issues
Bernard Metzler [Thu, 22 Aug 2019 15:07:41 +0000 (17:07 +0200)]
RDMA/siw: Fix SGL mapping issues

All user level and most in-kernel applications submit WQEs
where the SG list entries are all of a single type.
iSER in particular, however, will send us WQEs with mixed SG
types: sge[0] = kernel buffer, sge[1] = PBL region.
Check and set is_kva on each SG entry individually instead of
assuming the first SGE type carries through to the last.
This fixes iSER over siw.

Fixes: b9be6f18cf9e ("rdma/siw: transmit path")
Reported-by: Krishnamraju Eraparaju <krishna2@chelsio.com>
Tested-by: Krishnamraju Eraparaju <krishna2@chelsio.com>
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Link: https://lore.kernel.org/r/20190822150741.21871-1-bmt@zurich.ibm.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
19 months agoRDMA/bnxt_re: Fix stack-out-of-bounds in bnxt_qplib_rcfw_send_message
Selvin Xavier [Thu, 22 Aug 2019 10:02:50 +0000 (03:02 -0700)]
RDMA/bnxt_re: Fix stack-out-of-bounds in bnxt_qplib_rcfw_send_message

Driver copies FW commands to the HW queue as  units of 16 bytes. Some
of the command structures are not exact multiple of 16. So while copying
the data from those structures, the stack out of bounds messages are
reported by KASAN. The following error is reported.

[ 1337.530155] ==================================================================
[ 1337.530277] BUG: KASAN: stack-out-of-bounds in bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530413] Read of size 16 at addr ffff888725477a48 by task rmmod/2785

[ 1337.530540] CPU: 5 PID: 2785 Comm: rmmod Tainted: G           OE     5.2.0-rc6+ #75
[ 1337.530541] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.0.4 08/28/2014
[ 1337.530542] Call Trace:
[ 1337.530548]  dump_stack+0x5b/0x90
[ 1337.530556]  ? bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530560]  print_address_description+0x65/0x22e
[ 1337.530568]  ? bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530575]  ? bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530577]  __kasan_report.cold.3+0x37/0x77
[ 1337.530581]  ? _raw_write_trylock+0x10/0xe0
[ 1337.530588]  ? bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530590]  kasan_report+0xe/0x20
[ 1337.530592]  memcpy+0x1f/0x50
[ 1337.530600]  bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530608]  ? bnxt_qplib_creq_irq+0xa0/0xa0 [bnxt_re]
[ 1337.530611]  ? xas_create+0x3aa/0x5f0
[ 1337.530613]  ? xas_start+0x77/0x110
[ 1337.530615]  ? xas_clear_mark+0x34/0xd0
[ 1337.530623]  bnxt_qplib_free_mrw+0x104/0x1a0 [bnxt_re]
[ 1337.530631]  ? bnxt_qplib_destroy_ah+0x110/0x110 [bnxt_re]
[ 1337.530633]  ? bit_wait_io_timeout+0xc0/0xc0
[ 1337.530641]  bnxt_re_dealloc_mw+0x2c/0x60 [bnxt_re]
[ 1337.530648]  bnxt_re_destroy_fence_mr+0x77/0x1d0 [bnxt_re]
[ 1337.530655]  bnxt_re_dealloc_pd+0x25/0x60 [bnxt_re]
[ 1337.530677]  ib_dealloc_pd_user+0xbe/0xe0 [ib_core]
[ 1337.530683]  srpt_remove_one+0x5de/0x690 [ib_srpt]
[ 1337.530689]  ? __srpt_close_all_ch+0xc0/0xc0 [ib_srpt]
[ 1337.530692]  ? xa_load+0x87/0xe0
...
[ 1337.530840]  do_syscall_64+0x6d/0x1f0
[ 1337.530843]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1337.530845] RIP: 0033:0x7ff5b389035b
[ 1337.530848] Code: 73 01 c3 48 8b 0d 2d 0b 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d fd 0a 2c 00 f7 d8 64 89 01 48
[ 1337.530849] RSP: 002b:00007fff83425c28 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 1337.530852] RAX: ffffffffffffffda RBX: 00005596443e6750 RCX: 00007ff5b389035b
[ 1337.530853] RDX: 000000000000000a RSI: 0000000000000800 RDI: 00005596443e67b8
[ 1337.530854] RBP: 0000000000000000 R08: 00007fff83424ba1 R09: 0000000000000000
[ 1337.530856] R10: 00007ff5b3902960 R11: 0000000000000206 R12: 00007fff83425e50
[ 1337.530857] R13: 00007fff8342673c R14: 00005596443e6260 R15: 00005596443e6750

[ 1337.530885] The buggy address belongs to the page:
[ 1337.530962] page:ffffea001c951dc0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0
[ 1337.530964] flags: 0x57ffffc0000000()
[ 1337.530967] raw: 0057ffffc0000000 0000000000000000 ffffffff1c950101 0000000000000000
[ 1337.530970] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[ 1337.530970] page dumped because: kasan: bad access detected

[ 1337.530996] Memory state around the buggy address:
[ 1337.531072]  ffff888725477900: 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 f2 f2 f2
[ 1337.531180]  ffff888725477980: 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00
[ 1337.531288] >ffff888725477a00: 00 f2 f2 f2 f2 f2 f2 00 00 00 f2 00 00 00 00 00
[ 1337.531393]                                                  ^
[ 1337.531478]  ffff888725477a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1337.531585]  ffff888725477b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1337.531691] ==================================================================

Fix this by passing the exact size of each FW command to
bnxt_qplib_rcfw_send_message as req->cmd_size. Before sending
the command to HW, modify the req->cmd_size to number of 16 byte units.

Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1566468170-489-1-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
19 months agoafs: use correct afs_call_type in yfs_fs_store_opaque_acl2
YueHaibing [Mon, 19 Aug 2019 15:05:31 +0000 (16:05 +0100)]
afs: use correct afs_call_type in yfs_fs_store_opaque_acl2

It seems that 'yfs_RXYFSStoreOpaqueACL2' should be use in
yfs_fs_store_opaque_acl2().

Fixes: f5e4546347bc ("afs: Implement YFS ACL setting")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David Howells <dhowells@redhat.com>
19 months agoafs: Fix possible oops in afs_lookup trace event
Marc Dionne [Thu, 22 Aug 2019 12:28:43 +0000 (13:28 +0100)]
afs: Fix possible oops in afs_lookup trace event

The afs_lookup trace event can cause the following:

[  216.576777] BUG: kernel NULL pointer dereference, address: 000000000000023b
[  216.576803] #PF: supervisor read access in kernel mode
[  216.576813] #PF: error_code(0x0000) - not-present page
...
[  216.576913] RIP: 0010:trace_event_raw_event_afs_lookup+0x9e/0x1c0 [kafs]

If the inode from afs_do_lookup() is an error other than ENOENT, or if it
is ENOENT and afs_try_auto_mntpt() returns an error, the trace event will
try to dereference the error pointer as a valid pointer.

Use IS_ERR_OR_NULL to only pass a valid pointer for the trace, or NULL.

Ideally the trace would include the error value, but for now just avoid
the oops.

Fixes: 80548b03991f ("afs: Add more tracepoints")
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
19 months agoafs: Fix leak in afs_lookup_cell_rcu()
David Howells [Thu, 22 Aug 2019 12:28:43 +0000 (13:28 +0100)]
afs: Fix leak in afs_lookup_cell_rcu()

Fix a leak on the cell refcount in afs_lookup_cell_rcu() due to
non-clearance of the default error in the case a NULL cell name is passed
and the workstation default cell is used.

Also put a bit at the end to make sure we don't leak a cell ref if we're
going to be returning an error.

This leak results in an assertion like the following when the kafs module is
unloaded:

AFS: Assertion failed
2 == 1 is false
0x2 == 0x1 is false
------------[ cut here ]------------
kernel BUG at fs/afs/cell.c:770!
...
RIP: 0010:afs_manage_cells+0x220/0x42f [kafs]
...
 process_one_work+0x4c2/0x82c
 ? pool_mayday_timeout+0x1e1/0x1e1
 ? do_raw_spin_lock+0x134/0x175
 worker_thread+0x336/0x4a6
 ? rescuer_thread+0x4af/0x4af
 kthread+0x1de/0x1ee
 ? kthread_park+0xd4/0xd4
 ret_from_fork+0x24/0x30

Fixes: 989782dcdc91 ("afs: Overhaul cell database management")
Signed-off-by: David Howells <dhowells@redhat.com>
19 months agoKVM: arm/arm64: Only skip MMIO insn once
Andrew Jones [Thu, 22 Aug 2019 11:03:05 +0000 (13:03 +0200)]
KVM: arm/arm64: Only skip MMIO insn once

If after an MMIO exit to userspace a VCPU is immediately run with an
immediate_exit request, such as when a signal is delivered or an MMIO
emulation completion is needed, then the VCPU completes the MMIO
emulation and immediately returns to userspace. As the exit_reason
does not get changed from KVM_EXIT_MMIO in these cases we have to
be careful not to complete the MMIO emulation again, when the VCPU is
eventually run again, because the emulation does an instruction skip
(and doing too many skips would be a waste of guest code :-) We need
to use additional VCPU state to track if the emulation is complete.
As luck would have it, we already have 'mmio_needed', which even
appears to be used in this way by other architectures already.

Fixes: 0d640732dbeb ("arm64: KVM: Skip MMIO insn after emulation")
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
19 months agolibceph: fix PG split vs OSD (re)connect race
Ilya Dryomov [Tue, 20 Aug 2019 14:40:33 +0000 (16:40 +0200)]
libceph: fix PG split vs OSD (re)connect race

We can't rely on ->peer_features in calc_target() because it may be
called both when the OSD session is established and open and when it's
not.  ->peer_features is not valid unless the OSD session is open.  If
this happens on a PG split (pg_num increase), that could mean we don't
resend a request that should have been resent, hanging the client
indefinitely.

In userspace this was fixed by looking at require_osd_release and
get_xinfo[osd].features fields of the osdmap.  However these fields
belong to the OSD section of the osdmap, which the kernel doesn't
decode (only the client section is decoded).

Instead, let's drop this feature check.  It effectively checks for
luminous, so only pre-luminous OSDs would be affected in that on a PG
split the kernel might resend a request that should not have been
resent.  Duplicates can occur in other scenarios, so both sides should
already be prepared for them: see dup/replay logic on the OSD side and
retry_attempt check on the client side.

Cc: stable@vger.kernel.org
Fixes: 7de030d6b10a ("libceph: resend on PG splits if OSD has RESEND_ON_SPLIT")
Link: https://tracker.ceph.com/issues/41162
Reported-by: Jerry Lee <leisurelysw24@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Tested-by: Jerry Lee <leisurelysw24@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
19 months agoceph: don't try fill file_lock on unsuccessful GETFILELOCK reply
Jeff Layton [Thu, 15 Aug 2019 10:23:38 +0000 (06:23 -0400)]
ceph: don't try fill file_lock on unsuccessful GETFILELOCK reply

When ceph_mdsc_do_request returns an error, we can't assume that the
filelock_reply pointer will be set. Only try to fetch fields out of
the r_reply_info when it returns success.

Cc: stable@vger.kernel.org
Reported-by: Hector Martin <hector@marcansoft.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
19 months agoceph: clear page dirty before invalidate page
Erqi Chen [Wed, 24 Jul 2019 02:26:09 +0000 (10:26 +0800)]
ceph: clear page dirty before invalidate page

clear_page_dirty_for_io(page) before mapping->a_ops->invalidatepage().
invalidatepage() clears page's private flag, if dirty flag is not
cleared, the page may cause BUG_ON failure in ceph_set_page_dirty().

Cc: stable@vger.kernel.org
Link: https://tracker.ceph.com/issues/40862
Signed-off-by: Erqi Chen <chenerqi@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
19 months agoceph: fix buffer free while holding i_ceph_lock in fill_inode()
Luis Henriques [Fri, 19 Jul 2019 14:32:22 +0000 (15:32 +0100)]
ceph: fix buffer free while holding i_ceph_lock in fill_inode()

Calling ceph_buffer_put() in fill_inode() may result in freeing the
i_xattrs.blob buffer while holding the i_ceph_lock.  This can be fixed by
postponing the call until later, when the lock is released.

The following backtrace was triggered by fstests generic/070.

  BUG: sleeping function called from invalid context at mm/vmalloc.c:2283
  in_atomic(): 1, irqs_disabled(): 0, pid: 3852, name: kworker/0:4
  6 locks held by kworker/0:4/3852:
   #0: 000000004270f6bb ((wq_completion)ceph-msgr){+.+.}, at: process_one_work+0x1b8/0x5f0
   #1: 00000000eb420803 ((work_completion)(&(&con->work)->work)){+.+.}, at: process_one_work+0x1b8/0x5f0
   #2: 00000000be1c53a4 (&s->s_mutex){+.+.}, at: dispatch+0x288/0x1476
   #3: 00000000559cb958 (&mdsc->snap_rwsem){++++}, at: dispatch+0x2eb/0x1476
   #4: 000000000d5ebbae (&req->r_fill_mutex){+.+.}, at: dispatch+0x2fc/0x1476
   #5: 00000000a83d0514 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: fill_inode.isra.0+0xf8/0xf70
  CPU: 0 PID: 3852 Comm: kworker/0:4 Not tainted 5.2.0+ #441
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014
  Workqueue: ceph-msgr ceph_con_workfn
  Call Trace:
   dump_stack+0x67/0x90
   ___might_sleep.cold+0x9f/0xb1
   vfree+0x4b/0x60
   ceph_buffer_release+0x1b/0x60
   fill_inode.isra.0+0xa9b/0xf70
   ceph_fill_trace+0x13b/0xc70
   ? dispatch+0x2eb/0x1476
   dispatch+0x320/0x1476
   ? __mutex_unlock_slowpath+0x4d/0x2a0
   ceph_con_workfn+0xc97/0x2ec0
   ? process_one_work+0x1b8/0x5f0
   process_one_work+0x244/0x5f0
   worker_thread+0x4d/0x3e0
   kthread+0x105/0x140
   ? process_one_work+0x5f0/0x5f0
   ? kthread_park+0x90/0x90
   ret_from_fork+0x3a/0x50

Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
19 months agoceph: fix buffer free while holding i_ceph_lock in __ceph_build_xattrs_blob()
Luis Henriques [Fri, 19 Jul 2019 14:32:21 +0000 (15:32 +0100)]
ceph: fix buffer free while holding i_ceph_lock in __ceph_build_xattrs_blob()

Calling ceph_buffer_put() in __ceph_build_xattrs_blob() may result in
freeing the i_xattrs.blob buffer while holding the i_ceph_lock.  This can
be fixed by having this function returning the old blob buffer and have
the callers of this function freeing it when the lock is released.

The following backtrace was triggered by fstests generic/117.

  BUG: sleeping function called from invalid context at mm/vmalloc.c:2283
  in_atomic(): 1, irqs_disabled(): 0, pid: 649, name: fsstress
  4 locks held by fsstress/649:
   #0: 00000000a7478e7e (&type->s_umount_key#19){++++}, at: iterate_supers+0x77/0xf0
   #1: 00000000f8de1423 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: ceph_check_caps+0x7b/0xc60
   #2: 00000000562f2b27 (&s->s_mutex){+.+.}, at: ceph_check_caps+0x3bd/0xc60
   #3: 00000000f83ce16a (&mdsc->snap_rwsem){++++}, at: ceph_check_caps+0x3ed/0xc60
  CPU: 1 PID: 649 Comm: fsstress Not tainted 5.2.0+ #439
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014
  Call Trace:
   dump_stack+0x67/0x90
   ___might_sleep.cold+0x9f/0xb1
   vfree+0x4b/0x60
   ceph_buffer_release+0x1b/0x60
   __ceph_build_xattrs_blob+0x12b/0x170
   __send_cap+0x302/0x540
   ? __lock_acquire+0x23c/0x1e40
   ? __mark_caps_flushing+0x15c/0x280
   ? _raw_spin_unlock+0x24/0x30
   ceph_check_caps+0x5f0/0xc60
   ceph_flush_dirty_caps+0x7c/0x150
   ? __ia32_sys_fdatasync+0x20/0x20
   ceph_sync_fs+0x5a/0x130
   iterate_supers+0x8f/0xf0
   ksys_sync+0x4f/0xb0
   __ia32_sys_sync+0xa/0x10
   do_syscall_64+0x50/0x1c0
   entry_SYSCALL_64_after_hwframe+0x49/0xbe
  RIP: 0033:0x7fc6409ab617

Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
19 months agoceph: fix buffer free while holding i_ceph_lock in __ceph_setxattr()
Luis Henriques [Fri, 19 Jul 2019 14:32:20 +0000 (15:32 +0100)]
ceph: fix buffer free while holding i_ceph_lock in __ceph_setxattr()

Calling ceph_buffer_put() in __ceph_setxattr() may end up freeing the
i_xattrs.prealloc_blob buffer while holding the i_ceph_lock.  This can be
fixed by postponing the call until later, when the lock is released.

The following backtrace was triggered by fstests generic/117.

  BUG: sleeping function called from invalid context at mm/vmalloc.c:2283
  in_atomic(): 1, irqs_disabled(): 0, pid: 650, name: fsstress
  3 locks held by fsstress/650:
   #0: 00000000870a0fe8 (sb_writers#8){.+.+}, at: mnt_want_write+0x20/0x50
   #1: 00000000ba0c4c74 (&type->i_mutex_dir_key#6){++++}, at: vfs_setxattr+0x55/0xa0
   #2: 000000008dfbb3f2 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: __ceph_setxattr+0x297/0x810
  CPU: 1 PID: 650 Comm: fsstress Not tainted 5.2.0+ #437
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014
  Call Trace:
   dump_stack+0x67/0x90
   ___might_sleep.cold+0x9f/0xb1
   vfree+0x4b/0x60
   ceph_buffer_release+0x1b/0x60
   __ceph_setxattr+0x2b4/0x810
   __vfs_setxattr+0x66/0x80
   __vfs_setxattr_noperm+0x59/0xf0
   vfs_setxattr+0x81/0xa0
   setxattr+0x115/0x230
   ? filename_lookup+0xc9/0x140
   ? rcu_read_lock_sched_held+0x74/0x80
   ? rcu_sync_lockdep_assert+0x2e/0x60
   ? __sb_start_write+0x142/0x1a0
   ? mnt_want_write+0x20/0x50
   path_setxattr+0xba/0xd0
   __x64_sys_lsetxattr+0x24/0x30
   do_syscall_64+0x50/0x1c0
   entry_SYSCALL_64_after_hwframe+0x49/0xbe
  RIP: 0033:0x7ff23514359a

Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
19 months agolibceph: allow ceph_buffer_put() to receive a NULL ceph_buffer
Luis Henriques [Fri, 19 Jul 2019 14:32:19 +0000 (15:32 +0100)]
libceph: allow ceph_buffer_put() to receive a NULL ceph_buffer

Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
19 months agomd: update MAINTAINERS info
Song Liu [Wed, 21 Aug 2019 18:45:25 +0000 (11:45 -0700)]
md: update MAINTAINERS info

I have been reviewing patches for md in the past few months. Mark me
as the MD maintainer, as I have effectively been filling that role.

Cc: NeilBrown <neilb@suse.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
19 months agoMerge tag 'drm-fixes-5.3-2019-08-21' of git://people.freedesktop.org/~agd5f/linux...
Dave Airlie [Thu, 22 Aug 2019 02:59:10 +0000 (12:59 +1000)]
Merge tag 'drm-fixes-5.3-2019-08-21' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

drm-fixes-5.3-2019-08-21:

amdgpu:
- Fix gfxoff logic on RV
- Powerplay fixes
- Fix a possible memory leak in CS ioctl
- bpc fix for display

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190822021022.3356-1-alexander.deucher@amd.com
19 months agoMerge tag 'mediatek-drm-fixes-5.3' of https://github.com/ckhu-mediatek/linux.git...
Dave Airlie [Thu, 22 Aug 2019 02:56:32 +0000 (12:56 +1000)]
Merge tag 'mediatek-drm-fixes-5.3' of https://github.com/ckhu-mediatek/linux.git-tags into drm-fixes

Mediatek memory leak drm fix for Linux 5.3

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: CK Hu <ck.hu@mediatek.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1566264270.30493.4.camel@mtksdaap41
19 months agoMerge tag 'du-fixes-20190816' of git://linuxtv.org/pinchartl/media into drm-fixes
Dave Airlie [Thu, 22 Aug 2019 02:53:23 +0000 (12:53 +1000)]
Merge tag 'du-fixes-20190816' of git://linuxtv.org/pinchartl/media into drm-fixes

R-Car LVDS encoder fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190816130115.GH5020@pendragon.ideasonboard.com
19 months agodrm/amd/display: Calculate bpc based on max_requested_bpc
Nicholas Kazlauskas [Wed, 21 Aug 2019 15:27:13 +0000 (11:27 -0400)]
drm/amd/display: Calculate bpc based on max_requested_bpc

[Why]
The only place where state->max_bpc is updated on the connector is
at the start of atomic check during drm_atomic_connector_check. It
isn't updated when adding the connectors to the atomic state after
the fact. It also doesn't necessarily reflect the right value when
called in amdgpu during mode validation outside of atomic check.

This can cause the wrong bpc to be used even if the max_requested_bpc
is the correct value.

[How]
Don't rely on state->max_bpc reflecting the real bpc value and just
do the min(...) based on display info bpc and max_requested_bpc.

Fixes: 01933ba42d3d ("drm/amd/display: Use current connector state if NULL when checking bpc")
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
19 months agodrm/amdgpu: prevent memory leaks in AMDGPU_CS ioctl
Nicolai Hähnle [Tue, 20 Aug 2019 13:39:53 +0000 (15:39 +0200)]
drm/amdgpu: prevent memory leaks in AMDGPU_CS ioctl

Error out if the AMDGPU_CS ioctl is called with multiple SYNCOBJ_OUT and/or
TIMELINE_SIGNAL chunks, since otherwise the last chunk wins while the
allocated array as well as the reference counts of sync objects are leaked.

Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
19 months agodrm/amd/amdgpu: disable MMHUB PG for navi10
Kenneth Feng [Tue, 20 Aug 2019 07:11:37 +0000 (15:11 +0800)]
drm/amd/amdgpu: disable MMHUB PG for navi10

Disable MMHUB PG for navi10 according to the production requirement.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
19 months agodrm/amd/powerplay: remove duplicate macro smu_get_uclk_dpm_states in amdgpu_smu.h
Kevin Wang [Tue, 20 Aug 2019 05:28:51 +0000 (13:28 +0800)]
drm/amd/powerplay: remove duplicate macro smu_get_uclk_dpm_states in amdgpu_smu.h

remove duplicate macro smu_get_uclk_dpm_states in amdgpu_smu.h

"
 #define smu_get_uclk_dpm_states(smu, clocks_in_khz, num_states) \
         ((smu)->ppt_funcs->get_uclk_dpm_states ? (smu)->ppt_funcs->get_uclk_dpm_states((smu), (clocks_in_khz), (num_states)) : 0)
 #define smu_get_max_sustainable_clocks_by_dc(smu, max_clocks) \
         ((smu)->funcs->get_max_sustainable_clocks_by_dc ? (smu)->funcs->get_max_sustainable_clocks_by_dc((smu), (max_clocks)) : 0)
 #define smu_get_uclk_dpm_states(smu, clocks_in_khz, num_states) \
         ((smu)->ppt_funcs->get_uclk_dpm_states ? (smu)->ppt_funcs->get_uclk_dpm_states((smu), (clocks_in_khz), (num_states)) : 0)
"

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
19 months agodrm/amd/powerplay: fix variable type errors in smu_v11_0_setup_pptable
Kevin Wang [Mon, 19 Aug 2019 15:38:02 +0000 (23:38 +0800)]
drm/amd/powerplay: fix variable type errors in smu_v11_0_setup_pptable

fix size type errors, from uint32_t to uint16_t.
it will cause only initializes the highest 16 bits in
smu_get_atom_data_table function.

bug report:
This fixes the following static checker warning.
        drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c:390 smu_v11_0_setup_pptable()
        warn: passing casted pointer '&size' to 'smu_get_atom_data_table()' 32 vs 16.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
19 months agodrm/amdgpu/gfx9: update pg_flags after determining if gfx off is possible
Alex Deucher [Thu, 15 Aug 2019 13:27:09 +0000 (08:27 -0500)]
drm/amdgpu/gfx9: update pg_flags after determining if gfx off is possible

We need to set certain power gating flags after we determine
if the firmware version is sufficient to support gfxoff.
Previously we set the pg flags in early init, but we later
we might have disabled gfxoff if the firmware versions didn't
support it.  Move adding the additional pg flags after we
determine whether or not to support gfxoff.

Fixes: 005440066f92 ("drm/amdgpu: enable gfxoff again on raven series (v2)")
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Tom St Denis <tom.stdenis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: stable@vger.kernel.org
19 months agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Wed, 21 Aug 2019 18:48:38 +0000 (11:48 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "A couple bugfixes, and mostly selftests changes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  selftests/kvm: make platform_info_test pass on AMD
  Revert "KVM: x86/mmu: Zap only the relevant pages when removing a memslot"
  selftests: kvm: fix state save/load on processors without XSAVE
  selftests: kvm: fix vmx_set_nested_state_test
  selftests: kvm: provide common function to enable eVMCS
  selftests: kvm: do not try running the VM in vmx_set_nested_state_test
  KVM: x86: svm: remove redundant assignment of var new_entry
  MAINTAINERS: add KVM x86 reviewers
  MAINTAINERS: change list for KVM/s390
  kvm: x86: skip populating logical dest map if apic is not sw enabled

19 months agoselftests/kvm: make platform_info_test pass on AMD
Vitaly Kuznetsov [Mon, 10 Jun 2019 17:22:55 +0000 (19:22 +0200)]
selftests/kvm: make platform_info_test pass on AMD

test_msr_platform_info_disabled() generates EXIT_SHUTDOWN but VMCB state
is undefined after that so an attempt to launch this guest again from
test_msr_platform_info_enabled() fails. Reorder the tests to make test
pass.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
19 months agoMerge tag 'nfsd-5.3-1' of git://linux-nfs.org/~bfields/linux
Linus Torvalds [Wed, 21 Aug 2019 17:04:38 +0000 (10:04 -0700)]
Merge tag 'nfsd-5.3-1' of git://linux-nfs.org/~bfields/linux

Pull nfsd fixes from Bruce Fields:
 "Fix nfsd bugs: three in the new nfsd/clients/ code, one in the reply
  cache containerization"

* tag 'nfsd-5.3-1' of git://linux-nfs.org/~bfields/linux:
  nfsd4: Fix kernel crash when reading proc file reply_cache_stats
  nfsd: initialize i_private before d_add
  nfsd: use i_wrlock instead of rcu for nfsdfs i_private
  nfsd: fix dentry leak upon mkdir failure.

19 months agodm raid: add missing cleanup in raid_ctr()
Wenwen Wang [Mon, 19 Aug 2019 00:18:34 +0000 (19:18 -0500)]
dm raid: add missing cleanup in raid_ctr()

If rs_prepare_reshape() fails, no cleanup is executed, leading to
leak of the raid_set structure allocated at the beginning of
raid_ctr(). To fix this issue, go to the label 'bad' if the error
occurs.

Fixes: 11e4723206683 ("dm raid: stop keeping raid set frozen altogether")
Cc: stable@vger.kernel.org
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
19 months agodm zoned: fix potential NULL dereference in dmz_do_reclaim()
Dan Carpenter [Mon, 19 Aug 2019 09:58:14 +0000 (12:58 +0300)]
dm zoned: fix potential NULL dereference in dmz_do_reclaim()

This function is supposed to return error pointers so it matches the
dmz_get_rnd_zone_for_reclaim() function.  The current code could lead to
a NULL dereference in dmz_do_reclaim()

Fixes: b234c6d7a703 ("dm zoned: improve error handling in reclaim")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
19 months agodm dust: use dust block size for badblocklist index
Bryan Gurney [Fri, 16 Aug 2019 14:09:53 +0000 (10:09 -0400)]
dm dust: use dust block size for badblocklist index

Change the "frontend" dust_remove_block, dust_add_block, and
dust_query_block functions to store the "dust block number", instead
of the sector number corresponding to the "dust block number".

For the "backend" functions dust_map_read and dust_map_write,
right-shift by sect_per_block_shift.  This fixes the inability to
emulate failure beyond the first sector of each "dust block" (for
devices with a "dust block size" larger than 512 bytes).

Fixes: e4f3fabd67480bf ("dm: add dust target")
Cc: stable@vger.kernel.org
Signed-off-by: Bryan Gurney <bgurney@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
19 months agodrm/i915: Fix HW readout for crtc_clock in HDMI mode
Imre Deak [Thu, 8 Aug 2019 16:25:47 +0000 (19:25 +0300)]
drm/i915: Fix HW readout for crtc_clock in HDMI mode

The conversion during HDMI HW readout from port_clock to crtc_clock was
missed when HDMI 10bpc support was added, so fix that.

v2:
- Unscrew the non-HDMI case.

Fixes: cd9e11a8bf25 ("drm/i915/icl: Add 10-bit support for hdmi")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109593
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190808162547.7009-1-imre.deak@intel.com
(cherry picked from commit 2969a78aead38b49e80c821a5c683544ab16160d)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
19 months agomodules: page-align module section allocations only for arches supporting strict...
He Zhe [Tue, 20 Aug 2019 14:53:10 +0000 (22:53 +0800)]
modules: page-align module section allocations only for arches supporting strict module rwx

We should keep the case of "#define debug_align(X) (X)" for all arches
without CONFIG_HAS_STRICT_MODULE_RWX ability, which would save people, who
are sensitive to system size, a lot of memory when using modules,
especially for embedded systems. This is also the intention of the
original #ifdef... statement and still valid for now.

Note that this still keeps the effect of the fix of the following commit,
38f054d549a8 ("modules: always page-align module section allocations"),
since when CONFIG_ARCH_HAS_STRICT_MODULE_RWX is enabled, module pages are
aligned.

Signed-off-by: He Zhe <zhe.he@windriver.com>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
19 months agoRevert "KVM: x86/mmu: Zap only the relevant pages when removing a memslot"
Paolo Bonzini [Thu, 15 Aug 2019 07:43:32 +0000 (09:43 +0200)]
Revert "KVM: x86/mmu: Zap only the relevant pages when removing a memslot"

This reverts commit 4e103134b862314dc2f2f18f2fb0ab972adc3f5f.
Alex Williamson reported regressions with device assignment with
this patch.  Even though the bug is probably elsewhere and still
latent, this is needed to fix the regression.

Fixes: 4e103134b862 ("KVM: x86/mmu: Zap only the relevant pages when removing a memslot", 2019-02-05)
Reported-by: Alex Willamson <alex.williamson@redhat.com>
Cc: stable@vger.kernel.org
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
19 months agoselftests: kvm: fix state save/load on processors without XSAVE
Paolo Bonzini [Tue, 20 Aug 2019 15:35:52 +0000 (17:35 +0200)]
selftests: kvm: fix state save/load on processors without XSAVE

state_test and smm_test are failing on older processors that do not
have xcr0.  This is because on those processor KVM does provide
support for KVM_GET/SET_XSAVE (to avoid having to rely on the older
KVM_GET/SET_FPU) but not for KVM_GET/SET_XCRS.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
19 months agovideo: fbdev: acornfb: Mark expected switch fall-through
Gustavo A. R. Silva [Wed, 21 Aug 2019 00:07:46 +0000 (19:07 -0500)]
video: fbdev: acornfb: Mark expected switch fall-through

Mark switch cases where we are expecting to fall through.

Fix the following warning (Building: rpc_defconfig arm):

drivers/video/fbdev/acornfb.c: In function ‘acornfb_parse_dram’:
drivers/video/fbdev/acornfb.c:860:9: warning: this statement may fall through [-Wimplicit-fallthrough=]
    size *= 1024;
    ~~~~~^~~~~~~
drivers/video/fbdev/acornfb.c:861:3: note: here
   case 'K':
   ^~~~

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
19 months agoscsi: libsas: sas_discover: Mark expected switch fall-through
Gustavo A. R. Silva [Tue, 20 Aug 2019 21:20:05 +0000 (16:20 -0500)]
scsi: libsas: sas_discover: Mark expected switch fall-through

Mark switch cases where we are expecting to fall through.

Fix the following warning (Building: mtx1_defconfig mips):

drivers/scsi/libsas/sas_discover.c: In function ‘sas_discover_domain’:
./include/linux/printk.h:309:2: warning: this statement may fall through [-Wimplicit-fallthrough=]
  printk(KERN_NOTICE pr_fmt(fmt), ##__VA_ARGS__)
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/scsi/libsas/sas_discover.c:459:3: note: in expansion of macro ‘pr_notice’
   pr_notice("ATA device seen but CONFIG_SCSI_SAS_ATA=N so cannot attach\n");
   ^~~~~~~~~
drivers/scsi/libsas/sas_discover.c:462:2: note: here
  default:
  ^~~~~~~

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
19 months agoMIPS: Octeon: Mark expected switch fall-through
Gustavo A. R. Silva [Tue, 20 Aug 2019 21:03:09 +0000 (16:03 -0500)]
MIPS: Octeon: Mark expected switch fall-through

Mark switch cases where we are expecting to fall through.

Fix the following warning (Building: cavium_octeon_defconfig mips):

arch/mips/include/asm/octeon/cvmx-sli-defs.h:47:6: warning: this statement
may fall through [-Wimplicit-fallthrough=]

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
19 months agopower: supply: ab8500_charger: Mark expected switch fall-through
Gustavo A. R. Silva [Tue, 20 Aug 2019 20:55:26 +0000 (15:55 -0500)]
power: supply: ab8500_charger: Mark expected switch fall-through

Mark switch cases where we are expecting to fall through.

Fix the following warning (Building: allmodconfig arm):

drivers/power/supply/ab8500_charger.c: In function ‘ab8500_charger_max_usb_curr’:
drivers/power/supply/ab8500_charger.c:738:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   if (di->vbus_detected) {
      ^
drivers/power/supply/ab8500_charger.c:745:2: note: here
  case USB_STAT_HM_IDGND:
  ^~~~

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
19 months agowatchdog: wdt285: Mark expected switch fall-through
Gustavo A. R. Silva [Tue, 20 Aug 2019 18:07:46 +0000 (13:07 -0500)]
watchdog: wdt285: Mark expected switch fall-through

Mark switch cases where we are expecting to fall through.

Fix the following warning (Building: footbridge_defconfig arm):

drivers/watchdog/wdt285.c: In function ‘watchdog_ioctl’:
drivers/watchdog/wdt285.c:170:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
   watchdog_ping();
   ^~~~~~~~~~~~~~~
drivers/watchdog/wdt285.c:172:2: note: here
  case WDIOC_GETTIMEOUT:
  ^~~~

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
19 months agomtd: sa1100: Mark expected switch fall-through
Gustavo A. R. Silva [Tue, 20 Aug 2019 17:54:32 +0000 (12:54 -0500)]
mtd: sa1100: Mark expected switch fall-through

Mark switch cases where we are expecting to fall through.

Fix the following warning (Building: assabet_defconfig arm):

drivers/mtd/maps/sa1100-flash.c: In function ‘sa1100_probe_subdev’:
drivers/mtd/maps/sa1100-flash.c:82:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
   printk(KERN_WARNING "SA1100 flash: unknown base address "
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          "0x%08lx, assuming CS0\n", phys);
          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/mtd/maps/sa1100-flash.c:85:2: note: here
  case SA1100_CS0_PHYS:
  ^~~~

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
19 months agodrm/sun4i: tcon: Mark expected switch fall-through
Gustavo A. R. Silva [Tue, 20 Aug 2019 17:47:06 +0000 (12:47 -0500)]
drm/sun4i: tcon: Mark expected switch fall-through

Mark switch cases where we are expecting to fall through.

Fix the following warning (Building: sunxi_defconfig arm):

drivers/gpu/drm/sun4i/sun4i_tcon.c: In function ‘sun4i_tcon0_mode_set_dithering’:
drivers/gpu/drm/sun4i/sun4i_tcon.c:318:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
   val |= SUN4I_TCON0_FRM_CTL_MODE_B;
drivers/gpu/drm/sun4i/sun4i_tcon.c:319:2: note: here
  case MEDIA_BUS_FMT_RGB666_1X18:
  ^~~~

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
19 months agodrm/sun4i: sun6i_mipi_dsi: Mark expected switch fall-through
Gustavo A. R. Silva [Tue, 20 Aug 2019 18:01:03 +0000 (13:01 -0500)]
drm/sun4i: sun6i_mipi_dsi: Mark expected switch fall-through

Mark switch cases where we are expecting to fall through.

Fix the following warning (Building: multi_v7_defconfig arm):

drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c: In function ‘sun6i_dsi_transfer’:
drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c:993:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   if (msg->rx_len == 1) {
      ^
drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c:998:2: note: here
  default:
  ^~~~~~~

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
19 months agoARM: riscpc: Mark expected switch fall-through
Gustavo A. R. Silva [Wed, 21 Aug 2019 00:29:16 +0000 (19:29 -0500)]
ARM: riscpc: Mark expected switch fall-through

Mark switch cases where we are expecting to fall through.

Fix the following warning (Building: rpc_defconfig arm):

arch/arm/mach-rpc/riscpc.c: In function ‘parse_tag_acorn’:
arch/arm/mach-rpc/riscpc.c:48:13: warning: this statement may fall through [-Wimplicit-fallthrough=]
   vram_size += PAGE_SIZE * 256;
   ~~~~~~~~~~^~~~~~~~~~~~~~~~~~
arch/arm/mach-rpc/riscpc.c:49:2: note: here
  case 256:
  ^~~~

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
19 months agodmaengine: fsldma: Mark expected switch fall-through
Gustavo A. R. Silva [Mon, 12 Aug 2019 00:18:03 +0000 (19:18 -0500)]
dmaengine: fsldma: Mark expected switch fall-through

Mark switch cases where we are expecting to fall through.

Fix the following warnings (Building: powerpc-ppa8548_defconfig powerpc):

drivers/dma/fsldma.c: In function ‘fsl_dma_chan_probe’:
drivers/dma/fsldma.c:1165:26: warning: this statement may fall through [-Wimplicit-fallthrough=]
   chan->toggle_ext_pause = fsl_chan_toggle_ext_pause;
   ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/dma/fsldma.c:1166:2: note: here
  case FSL_DMA_IP_83XX:
  ^~~~

Reported-by: kbuild test robot <lkp@intel.com>
Acked-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
19 months agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Linus Torvalds [Tue, 20 Aug 2019 18:18:43 +0000 (11:18 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/hid/hid

Pull HID fixes from Jiri Kosina:

 - a few regression fixes for wacom driver (including fix for my earlier
   mismerge) from Aaron Armstrong Skomra and Jason Gerecke

 - revert of a few Logitech device ID additions which turn out to not
   work perfectly with the hidpp driver at the moment; proper support is
   now scheduled for 5.4. Fixes from Benjamin Tissoires

 - scheduling-in-atomic fix for cp2112 driver, from Benjamin Tissoires

 - new device ID to intel-ish, from Even Xu

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: wacom: correct misreported EKR ring values
  HID: cp2112: prevent sleeping function called from invalid context
  HID: intel-ish-hid: ipc: add EHL device id
  HID: wacom: Correct distance scale for 2nd-gen Intuos devices
  HID: logitech-hidpp: remove support for the G700 over USB
  Revert "HID: logitech-hidpp: add USB PID for a few more supported mice"
  HID: wacom: add back changes dropped in merge commit

19 months agoinfiniband: hfi1: fix memory leaks
Wenwen Wang [Sun, 18 Aug 2019 18:54:46 +0000 (13:54 -0500)]
infiniband: hfi1: fix memory leaks

In fault_opcodes_write(), 'data' is allocated through kcalloc(). However,
it is not deallocated in the following execution if an error occurs,
leading to memory leaks. To fix this issue, introduce the 'free_data' label
to free 'data' before returning the error.

Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/1566154486-3713-1-git-send-email-wenwen@cs.uga.edu
Signed-off-by: Doug Ledford <dledford@redhat.com>
19 months agoinfiniband: hfi1: fix a memory leak bug
Wenwen Wang [Sun, 18 Aug 2019 19:29:31 +0000 (14:29 -0500)]
infiniband: hfi1: fix a memory leak bug

In fault_opcodes_read(), 'data' is not deallocated if debugfs_file_get()
fails, leading to a memory leak. To fix this bug, introduce the 'free_data'
label to free 'data' before returning the error.

Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/1566156571-4335-1-git-send-email-wenwen@cs.uga.edu
Signed-off-by: Doug Ledford <dledford@redhat.com>
19 months agoIB/mlx4: Fix memory leaks
Wenwen Wang [Sun, 18 Aug 2019 20:23:01 +0000 (15:23 -0500)]
IB/mlx4: Fix memory leaks

In mlx4_ib_alloc_pv_bufs(), 'tun_qp->tx_ring' is allocated through
kcalloc(). However, it is not always deallocated in the following execution
if an error occurs, leading to memory leaks. To fix this issue, free
'tun_qp->tx_ring' whenever an error occurs.

Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/1566159781-4642-1-git-send-email-wenwen@cs.uga.edu
Signed-off-by: Doug Ledford <dledford@redhat.com>
19 months agoRDMA/cma: fix null-ptr-deref Read in cma_cleanup
zhengbin [Mon, 19 Aug 2019 04:27:39 +0000 (12:27 +0800)]
RDMA/cma: fix null-ptr-deref Read in cma_cleanup

In cma_init, if cma_configfs_init fails, need to free the
previously memory and return fail, otherwise will trigger
null-ptr-deref Read in cma_cleanup.

cma_cleanup
  cma_configfs_exit
    configfs_unregister_subsystem

Fixes: 045959db65c6 ("IB/cma: Add configfs for rdma_cm")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Link: https://lore.kernel.org/r/1566188859-103051-1-git-send-email-zhengbin13@huawei.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
19 months agoIB/mlx5: Block MR WR if UMR is not possible
Moni Shoua [Thu, 15 Aug 2019 08:38:34 +0000 (11:38 +0300)]
IB/mlx5: Block MR WR if UMR is not possible

Check conditions that are mandatory to post_send UMR WQEs.
1. Modifying page size.
2. Modifying remote atomic permissions if atomic access is required.

If either condition is not fulfilled then fail to post_send() flow.

Fixes: c8d75a980fab ("IB/mlx5: Respect new UMR capabilities")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-9-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
19 months agoIB/mlx5: Fix MR re-registration flow to use UMR properly
Moni Shoua [Thu, 15 Aug 2019 08:38:33 +0000 (11:38 +0300)]
IB/mlx5: Fix MR re-registration flow to use UMR properly

The UMR WQE in the MR re-registration flow requires that
modify_atomic and modify_entity_size capabilities are enabled.
Therefore, check that the these capabilities are present before going to
umr flow and go through slow path if not.

Fixes: c8d75a980fab ("IB/mlx5: Respect new UMR capabilities")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-8-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
19 months agoIB/mlx5: Report and handle ODP support properly
Moni Shoua [Thu, 15 Aug 2019 08:38:32 +0000 (11:38 +0300)]
IB/mlx5: Report and handle ODP support properly

ODP depends on the several device capabilities, among them is the ability
to send UMR WQEs with that modify atomic and entity size of the MR.
Therefore, only if all conditions to send such a UMR WQE are met then
driver can report that ODP is supported. Use this check of conditions
in all places where driver needs to know about ODP support.

Also, implicit ODP support depends on ability of driver to send UMR WQEs
for an indirect mkey. Therefore, verify that all conditions to do so are
met when reporting support.

Fixes: c8d75a980fab ("IB/mlx5: Respect new UMR capabilities")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-7-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
19 months agoIB/mlx5: Consolidate use_umr checks into single function
Moni Shoua [Thu, 15 Aug 2019 08:38:31 +0000 (11:38 +0300)]
IB/mlx5: Consolidate use_umr checks into single function

Introduce helper function to unify various use_umr checks.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-6-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
19 months agoRDMA/restrack: Rewrite PID namespace check to be reliable
Leon Romanovsky [Thu, 15 Aug 2019 08:38:29 +0000 (11:38 +0300)]
RDMA/restrack: Rewrite PID namespace check to be reliable

task_active_pid_ns() is wrong API to check PID namespace because it
posses some restrictions and return PID namespace where the process
was allocated. It created mismatches with current namespace, which
can be different.

Rewrite whole rdma_is_visible_in_pid_ns() logic to provide reliable
results without any relation to allocated PID namespace.

Fixes: 8be565e65fa9 ("RDMA/nldev: Factor out the PID namespace check")
Fixes: 6a6c306a09b5 ("RDMA/restrack: Make is_visible_in_pid_ns() as an API")
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-4-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
19 months agoRDMA/counters: Properly implement PID checks
Leon Romanovsky [Thu, 15 Aug 2019 08:38:28 +0000 (11:38 +0300)]
RDMA/counters: Properly implement PID checks

"Auto" configuration mode is called for visible in that PID
namespace and it ensures that all counters and QPs are coexist
in the same namespace and belong to same PID.

Fixes: 99fa331dc862 ("RDMA/counter: Add "auto" configuration mode support")
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-3-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
19 months agoIB/core: Fix NULL pointer dereference when bind QP to counter
Ido Kalir [Thu, 15 Aug 2019 08:38:27 +0000 (11:38 +0300)]
IB/core: Fix NULL pointer dereference when bind QP to counter

If QP is not visible to the pid, then we try to decrease its reference
count and return from the function before the QP pointer is
initialized. This lead to NULL pointer dereference.
Fix it by pass directly the res to the rdma_restract_put as arg instead of
&qp->res.

This fixes below call trace:
[ 5845.110329] BUG: kernel NULL pointer dereference, address:
00000000000000dc
[ 5845.120482] Oops: 0002 [#1] SMP PTI
[ 5845.129119] RIP: 0010:rdma_restrack_put+0x5/0x30 [ib_core]
[ 5845.169450] Call Trace:
[ 5845.170544]  rdma_counter_get_qp+0x5c/0x70 [ib_core]
[ 5845.172074]  rdma_counter_bind_qpn_alloc+0x6f/0x1a0 [ib_core]
[ 5845.173731]  nldev_stat_set_doit+0x314/0x330 [ib_core]
[ 5845.175279]  rdma_nl_rcv_msg+0xeb/0x1d0 [ib_core]
[ 5845.176772]  ? __kmalloc_node_track_caller+0x20b/0x2b0
[ 5845.178321]  rdma_nl_rcv+0xcb/0x120 [ib_core]
[ 5845.179753]  netlink_unicast+0x179/0x220
[ 5845.181066]  netlink_sendmsg+0x2d8/0x3d0
[ 5845.182338]  sock_sendmsg+0x30/0x40
[ 5845.183544]  __sys_sendto+0xdc/0x160
[ 5845.184832]  ? syscall_trace_enter+0x1f8/0x2e0
[ 5845.186209]  ? __audit_syscall_exit+0x1d9/0x280
[ 5845.187584]  __x64_sys_sendto+0x24/0x30
[ 5845.188867]  do_syscall_64+0x48/0x120
[ 5845.190097]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 1bd8e0a9d0fd1 ("RDMA/counter: Allow manual mode configuration support")
Signed-off-by: Ido Kalir <idok@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-2-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
19 months agoIB/hfi1: Drop stale TID RDMA packets that cause TIDErr
Kaike Wan [Thu, 15 Aug 2019 19:20:58 +0000 (15:20 -0400)]
IB/hfi1: Drop stale TID RDMA packets that cause TIDErr

In a congested fabric with adaptive routing enabled, traces show that
packets could be delivered out of order. A stale TID RDMA data packet
could lead to TidErr if the TID entries have been released by duplicate
data packets generated from retries, and subsequently erroneously force
the qp into error state in the current implementation.

Since the payload has already been dropped by hardware, the packet can
be simply dropped and it is no longer necessary to put the qp into
error state.

Fixes: 9905bf06e890 ("IB/hfi1: Add functions to receive TID RDMA READ response")
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/20190815192058.105923.72324.stgit@awfm-01.aw.intel.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
19 months agoIB/hfi1: Add additional checks when handling TID RDMA WRITE DATA packet
Kaike Wan [Thu, 15 Aug 2019 19:20:51 +0000 (15:20 -0400)]
IB/hfi1: Add additional checks when handling TID RDMA WRITE DATA packet

In a congested fabric with adaptive routing enabled, traces show that
packets could be delivered out of order, which could cause incorrect
processing of stale packets. For stale TID RDMA WRITE DATA packets that
cause KDETH EFLAGS errors, this patch adds additional checks before
processing the packets.

Fixes: d72fe7d5008b ("IB/hfi1: Add a function to receive TID RDMA WRITE DATA packet")
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/20190815192051.105923.69979.stgit@awfm-01.aw.intel.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
19 months agoIB/hfi1: Add additional checks when handling TID RDMA READ RESP packet
Kaike Wan [Thu, 15 Aug 2019 19:20:45 +0000 (15:20 -0400)]
IB/hfi1: Add additional checks when handling TID RDMA READ RESP packet

In a congested fabric with adaptive routing enabled, traces show that
packets could be delivered out of order, which could cause incorrect
processing of stale packets. For stale TID RDMA READ RESP packets that
cause KDETH EFLAGS errors, this patch adds additional checks before
processing the packets.

Fixes: 9905bf06e890 ("IB/hfi1: Add functions to receive TID RDMA READ response")
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/20190815192045.105923.59813.stgit@awfm-01.aw.intel.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
19 months agoIB/hfi1: Unsafe PSN checking for TID RDMA READ Resp packet
Kaike Wan [Thu, 15 Aug 2019 19:20:39 +0000 (15:20 -0400)]
IB/hfi1: Unsafe PSN checking for TID RDMA READ Resp packet

When processing a TID RDMA READ RESP packet that causes KDETH EFLAGS
errors, the packet's IB PSN is checked against qp->s_last_psn and
qp->s_psn without the protection of qp->s_lock, which is not safe.

This patch fixes the issue by acquiring qp->s_lock first.

Fixes: 9905bf06e890 ("IB/hfi1: Add functions to receive TID RDMA READ response")
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/20190815192039.105923.7852.stgit@awfm-01.aw.intel.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
19 months agoIB/hfi1: Drop stale TID RDMA packets
Kaike Wan [Thu, 15 Aug 2019 19:20:33 +0000 (15:20 -0400)]
IB/hfi1: Drop stale TID RDMA packets

In a congested fabric with adaptive routing enabled, traces show that
the sender could receive stale TID RDMA NAK packets that contain newer
KDETH PSNs and older Verbs PSNs. If not dropped, these packets could
cause the incorrect rewinding of the software flows and the incorrect
completion of TID RDMA WRITE requests, and eventually leading to memory
corruption and kernel crash.

The current code drops stale TID RDMA ACK/NAK packets solely based
on KDETH PSNs, which may lead to erroneous processing. This patch
fixes the issue by also checking the Verbs PSN. Addition checks are
added before rewinding the TID RDMA WRITE DATA packets.

Fixes: 9e93e967f7b4 ("IB/hfi1: Add a function to receive TID RDMA ACK packet")
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/20190815192033.105923.44192.stgit@awfm-01.aw.intel.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
19 months agoRDMA/siw: Fix potential NULL de-ref
Bernard Metzler [Mon, 19 Aug 2019 14:02:57 +0000 (16:02 +0200)]
RDMA/siw: Fix potential NULL de-ref

In siw_connect() we have an error flow where there is no valid qp
pointer.  Make sure we don't try to de-ref in that situation.

Fixes: 6c52fdc244b5 ("rdma/siw: connection management")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Link: https://lore.kernel.org/r/20190819140257.19319-1-bmt@zurich.ibm.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
19 months agoRDMA/mlx5: Fix MR npages calculation for IB_ACCESS_HUGETLB
Jason Gunthorpe [Thu, 15 Aug 2019 08:38:30 +0000 (11:38 +0300)]
RDMA/mlx5: Fix MR npages calculation for IB_ACCESS_HUGETLB

When ODP is enabled with IB_ACCESS_HUGETLB then the required pages
should be calculated based on the extent of the MR, which is rounded
to the nearest huge page alignment.

Fixes: d2183c6f1958 ("RDMA/umem: Move page_shift from ib_umem to ib_odp_umem")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-5-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
19 months agoio_uring: don't enter poll loop if we have CQEs pending
Jens Axboe [Tue, 20 Aug 2019 17:03:11 +0000 (11:03 -0600)]
io_uring: don't enter poll loop if we have CQEs pending

We need to check if we have CQEs pending before starting a poll loop,
as those could be the events we will be spinning for (and hence we'll
find none). This can happen if a CQE triggers an error, or if it is
found by eg an IRQ before we get a chance to find it through polling.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
19 months agonvme: Add quirk for LiteON CL1 devices running FW 22301111
Mario Limonciello [Fri, 16 Aug 2019 20:16:19 +0000 (15:16 -0500)]
nvme: Add quirk for LiteON CL1 devices running FW 22301111

One of the components in LiteON CL1 device has limitations that
can be encountered based upon boundary race conditions using the
nvme bus specific suspend to idle flow.

When this situation occurs the drive doesn't resume properly from
suspend-to-idle.

LiteON has confirmed this problem and fixed in the next firmware
version.  As this firmware is already in the field, avoid running
nvme specific suspend to idle flow.

Fixes: d916b1be94b6 ("nvme-pci: use host managed power state for suspend")
Link: http://lists.infradead.org/pipermail/linux-nvme/2019-July/thread.html
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: Charles Hyde <charles.hyde@dellteam.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
19 months agonvme: Fix cntlid validation when not using NVMEoF
Guilherme G. Piccoli [Wed, 14 Aug 2019 14:26:10 +0000 (11:26 -0300)]
nvme: Fix cntlid validation when not using NVMEoF

Commit 1b1031ca63b2 ("nvme: validate cntlid during controller initialisation")
introduced a validation for controllers with duplicate cntlid that runs
on nvme_init_subsystem(). The problem is that the validation relies on
ctrl->cntlid, and this value is assigned (from id_ctrl value) after the
call for nvme_init_subsystem() in nvme_init_identify() for non-fabrics
scenario. That leads to ctrl->cntlid always being 0 in case we have a
physical set of controllers in the same subsystem.

This patch fixes that by loading the discovered cntlid id_ctrl value into
ctrl->cntlid before the subsystem initialization, only for the non-fabrics
case. The patch was tested with emulated nvme devices (qemu) having two
controllers in a single subsystem. Without the patch, we couldn't make
it work failing in the duplicate check; when running with the patch, we
could see the subsystem holding both controllers.

For the fabrics case we see ctrl->cntlid has a more intricate relation
with the admin connect, so we didn't change that.

Fixes: 1b1031ca63b2 ("nvme: validate cntlid during controller initialisation")
Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
19 months agonvme-multipath: fix possible I/O hang when paths are updated
Anton Eidelman [Mon, 12 Aug 2019 20:00:36 +0000 (23:00 +0300)]
nvme-multipath: fix possible I/O hang when paths are updated

nvme_state_set_live() making a path available triggers requeue_work
in order to resubmit requests that ended up on requeue_list when no
paths were available.

This requeue_work may race with concurrent nvme_ns_head_make_request()
that do not observe the live path yet.
Such concurrent requests may by made by either:
- New IO submission.
- Requeue_work triggered by nvme_failover_req() or another ana_work.

A race may cause requeue_work capture the state of requeue_list before
more requests get onto the list. These requests will stay on the list
forever unless requeue_work is triggered again.

In order to prevent such race, nvme_state_set_live() should
synchronize_srcu(&head->srcu) before triggering the requeue_work and
prevent nvme_ns_head_make_request referencing an old snapshot of the
path list.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
19 months agoio_uring: fix potential hang with polled IO
Jens Axboe [Mon, 19 Aug 2019 18:15:59 +0000 (12:15 -0600)]
io_uring: fix potential hang with polled IO

If a request issue ends up being punted to async context to avoid
blocking, we can get into a situation where the original application
enters the poll loop for that very request before it has been issued.
This should not be an issue, except that the polling will hold the
io_uring uring_ctx mutex for the duration of the poll. When the async
worker has actually issued the request, it needs to acquire this mutex
to add the request to the poll issued list. Since the application
polling is already holding this mutex, the workqueue sleeps on the
mutex forever, and the application thus never gets a chance to poll for
the very request it was interested in.

Fix this by ensuring that the polling drops the uring_ctx occasionally
if it's not making any progress.

Reported-by: Jeffrey M. Birnbaum <jmbnyc@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>