sfrench/cifs-2.6.git
5 years ago  Merge tag 'staging-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Sat, 9 Jun 2018 17:32:39 +0000 (10:32 -0700)]
Merge tag 'staging-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging/IIO updates from Greg KH:
 "Here is the big staging and IIO driver update for 4.18-rc1.

  It was delayed as I wanted to make sure the final driver deletions did
  not cause any major merge issues, and all now looks good.

  There are a lot of patches here, just over 1000. The diffstat summary
  shows the major changes here:

1007 files changed, 16828 insertions(+), 227770 deletions(-)

  Because of this, we might be close to shrinking the overall kernel
  source code size for two releases in a row.

  There was loads of work in this release cycle, primarily:

   - tons of ks7010 driver cleanups

   - lots of mt7621 driver fixes and cleanups

   - most driver cleanups

   - wilc1000 fixes and cleanups

   - lots and lots of IIO driver cleanups and new additions

   - debugfs cleanups for all staging drivers

   - lots of other staging driver cleanups and fixes, the shortlog has
     the full details.

  but the big user-visible things here are the removal of 3 chunks of
  code:

   - ncpfs and ipx were removed on schedule, no one has cared about this
     code since it moved to staging last year, and if it needs to come
     back, it can be reverted.

   - lustre file system is removed.

     I've ranted at the lustre developers about once a year for the past
     5 years, with no real forward progress at all to clean things up
     and get the code into the "real" part of the kernel.

     Given that the lustre developers continue to work on an external
     tree and try to port those changes to the in-kernel tree every once
     in a while, this whole thing really really is not working out at
     all. So I'm deleting it so that the developers can spend the time
     working in their out-of-tree location and get things cleaned up
     properly to get merged into the tree correctly at a later date.

  Because of these file removals, you will have merge issues on some of
  these files (2 in the ipx code, 1 in the ncpfs code, and 1 in the
  atomisp driver). Just delete those files, it's a simple merge :)

  All of this has been in linux-next for a while with no reported
  problems"

* tag 'staging-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (1011 commits)
  staging: ipx: delete it from the tree
  ncpfs: remove uapi .h files
  ncpfs: remove Documentation
  ncpfs: remove compat functionality
  staging: ncpfs: delete it
  staging: lustre: delete the filesystem from the tree.
  staging: vc04_services: no need to save the log debufs dentries
  staging: vc04_services: vchiq_debugfs_log_entry can be a void *
  staging: vc04_services: remove struct vchiq_debugfs_info
  staging: vc04_services: move client dbg directory into static variable
  staging: vc04_services: remove odd vchiq_debugfs_top() wrapper
  staging: vc04_services: no need to check debugfs return values
  staging: mt7621-gpio: reorder includes alphabetically
  staging: mt7621-gpio: change gc_map to don't use pointers
  staging: mt7621-gpio: use GPIOF_DIR_OUT and GPIOF_DIR_IN macros instead of custom values
  staging: mt7621-gpio: change 'to_mediatek_gpio' to make just a one line return
  staging: mt7621-gpio: dt-bindings: update documentation for #interrupt-cells property
  staging: mt7621-gpio: update #interrupt-cells for the gpio node
  staging: mt7621-gpio: dt-bindings: complete documentation for the gpio
  staging: mt7621-dts: add missing properties to gpio node
  ...

5 years ago  Merge tag 'libnvdimm-for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdim...
Linus Torvalds [Sat, 9 Jun 2018 00:21:52 +0000 (17:21 -0700)]
Merge tag 'libnvdimm-for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Dan Williams:
 "This adds a user for the new 'bytes-remaining' updates to
  memcpy_mcsafe() that you already received through Ingo via the
  x86-dax-for-linus pull.

  Not included here, but still targeting this cycle, is support for
  handling memory media errors (poison) consumed via userspace dax
  mappings.

  Summary:

   - DAX broke a fundamental assumption of truncate of file mapped
     pages. The truncate path assumed that it is safe to disconnect a
     pinned page from a file and let the filesystem reclaim the physical
     block. With DAX the page is equivalent to the filesystem block.
     Introduce dax_layout_busy_page() to enable filesystems to wait for
     pinned DAX pages to be released. Without this wait a filesystem
     could allocate blocks under active device-DMA to a new file.

   - DAX arranges for the block layer to be bypassed and uses
     dax_direct_access() + copy_to_iter() to satisfy read(2) calls.
     However, the memcpy_mcsafe() facility is available through the pmem
     block driver. In order to safely handle media errors, via the DAX
     block-layer bypass, introduce copy_to_iter_mcsafe().

   - Fix cache management policy relative to the ACPI NFIT Platform
     Capabilities Structure to properly elide cache flushes when they
     are not necessary. The table indicates whether CPU caches are
     power-fail protected. Clarify that a deep flush is always performed
     on REQ_{FUA,PREFLUSH} requests"

* tag 'libnvdimm-for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (21 commits)
  dax: Use dax_write_cache* helpers
  libnvdimm, pmem: Do not flush power-fail protected CPU caches
  libnvdimm, pmem: Unconditionally deep flush on *sync
  libnvdimm, pmem: Complete REQ_FLUSH => REQ_PREFLUSH
  acpi, nfit: Remove ecc_unit_size
  dax: dax_insert_mapping_entry always succeeds
  libnvdimm, e820: Register all pmem resources
  libnvdimm: Debug probe times
  linvdimm, pmem: Preserve read-only setting for pmem devices
  x86, nfit_test: Add unit test for memcpy_mcsafe()
  pmem: Switch to copy_to_iter_mcsafe()
  dax: Report bytes remaining in dax_iomap_actor()
  dax: Introduce a ->copy_to_iter dax operation
  uio, lib: Fix CONFIG_ARCH_HAS_UACCESS_MCSAFE compilation
  xfs, dax: introduce xfs_break_dax_layouts()
  xfs: prepare xfs_break_layouts() for another layout type
  xfs: prepare xfs_break_layouts() to be called with XFS_MMAPLOCK_EXCL
  mm, fs, dax: handle layout changes to pinned dax mappings
  mm: fix __gup_device_huge vs unmap
  mm: introduce MEMORY_DEVICE_FS_DAX and CONFIG_DEV_PAGEMAP_OPS
  ...

5 years ago  Merge branch 'for-4.18/mcsafe' into libnvdimm-for-next
Dan Williams [Fri, 8 Jun 2018 22:16:44 +0000 (15:16 -0700)]
Merge branch 'for-4.18/mcsafe' into libnvdimm-for-next

5 years ago  Merge branch 'for-4.18/dax' into libnvdimm-for-next
Dan Williams [Fri, 8 Jun 2018 22:16:40 +0000 (15:16 -0700)]
Merge branch 'for-4.18/dax' into libnvdimm-for-next

5 years ago  Merge tag 'for-linus-20180608' of git://git.kernel.dk/linux-block
Linus Torvalds [Fri, 8 Jun 2018 20:36:19 +0000 (13:36 -0700)]
Merge tag 'for-linus-20180608' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "A few fixes for this merge window, where some of them should go in
  sooner rather than later, hence a new pull this week. This pull
  request contains:

   - Set of NVMe fixes, mostly follow up cleanups/fixes to the queue
     changes, but also teardown/removal and misc changes (Christoph/Dan/
     Johannes/Sagi/Steve).

   - Two lightnvm fixes for issues that showed up in this window
     (Colin/Wei).

   - Failfast/driver flags inheritance for flush requests (Hannes).

   - The md device put sanitization and fix (Kent).

   - dm bio_set inheritance fix (me).

   - nbd discard granularity fix (Josef).

   - nbd consistency in command printing (Kevin).

   - Loop recursion validation fix (Ted).

   - Partition overlap check (Wang)"

[ .. and now my build is warning-free again thanks to the md fix  - Linus ]

* tag 'for-linus-20180608' of git://git.kernel.dk/linux-block: (22 commits)
  nvme: cleanup double shift issue
  nvme-pci: make CMB SQ mod-param read-only
  nvme-pci: unquiesce dead controller queues
  nvme-pci: remove HMB teardown on reset
  nvme-pci: queue creation fixes
  nvme-pci: remove unnecessary completion doorbell check
  nvme-pci: remove unnecessary nested locking
  nvmet: filter newlines from user input
  nvme-rdma: correctly check for target keyed sgl support
  nvme: don't hold nvmf_transports_rwsem for more than transport lookups
  nvmet: return all zeroed buffer when we can't find an active namespace
  md: Unify mddev destruction paths
  dm: use bioset_init_from_src() to copy bio_set
  block: add bioset_init_from_src() helper
  block: always set partition number to '0' in blk_partition_remap()
  block: pass failfast and driver-specific flags to flush requests
  nbd: set discard_alignment to the granularity
  nbd: Consistently use request pointer in debug messages.
  block: add verifier for cmdline partition
  lightnvm: pblk: fix resource leak of invalid_bitmap
  ...

5 years ago  Merge tag 'regulator-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie...
Linus Torvalds [Fri, 8 Jun 2018 20:08:57 +0000 (13:08 -0700)]
Merge tag 'regulator-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator updates from Mark Brown:
 "Quite a lot of core work this time around, though not 100% successful.

  We gained support for runtime mode changes thanks to David Collins and
  improved support for write only regulators (ones where we can't read
  back the configuration) from Douglas Anderson.

  There's been quite a bit of work from Linus Walleij on converting from
  specifying GPIOs by numbers to descriptors. Sadly the testing turned
  out to be less good than we had hoped and so a lot of this had to be
  reverted.

  We also have the start of updates to use coupled regulators from
  Maciej Purski, unfortunately there are further problems there so the
  last couple of patches have been reverted.

  We also have new drivers for BD71837 and SY8106A devices, SAW
  regulators on Qualcomm SPMI and dropped support for some preproduction
  chips that never made it to market from the AB8500 driver"

* tag 'regulator-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (57 commits)
  regulator: gpio: Revert
  ARM: pxa, regulator: fix building ezx e680
  regulator: Revert coupled regulator support again
  regulator: wm8994: Fix shared GPIOs
  regulator: max77686: Fix shared GPIOs
  regulator: bd71837: BD71837 PMIC regulator driver
  regulator: bd71837: Devicetree bindings for BD71837 regulators
  regulator: gpio: Get enable GPIO using GPIO descriptor
  regulator: fixed: Convert to use GPIO descriptor only
  regulator: s2mps11: Fix boot on Odroid XU3
  dt-bindings: qcom_spmi: Document SAW support
  regulator: qcom_spmi: Add support for SAW
  regulator: tps65090: Pass descriptor instead of GPIO number
  regulator: s5m8767: Pass descriptor instead of GPIO number
  regulator: pfuze100: Delete reference to ena_gpio
  regulator: max8952: Pass descriptor instead of GPIO number
  regulator: lp8788-ldo: Pass descriptor instead of GPIO number
  regulator: lm363x: Pass descriptor instead of GPIO number
  regulator: max8973: Pass descriptor instead of GPIO number
  regulator: mc13xxx-core: Switch to SPDX identifier
  ...

5 years ago  nvme: cleanup double shift issue
Dan Carpenter [Thu, 7 Jun 2018 08:27:41 +0000 (11:27 +0300)]
nvme: cleanup double shift issue

The problem here is that set_bit() and test_bit() take a bit number so
we should be passing 0 but instead we're passing (1 << 0) which leads to
a double shift.  It doesn't cause a runtime bug in the current code
because it's done consistently and we only set that one bit.

I decided to just re-use NVME_AER_NOTICE_NS_CHANGED instead of
introducing a new define for this.
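
To illustrate the double shift described above, here is a minimal user-space sketch. The helpers sketch_set_bit() and sketch_test_bit() are hypothetical, non-atomic stand-ins for the kernel's set_bit() and test_bit(), and EVENT_BIT_NR / EVENT_BIT_MASK are made-up names, not the driver's identifiers.

```c
#include <stdbool.h>

/*
 * Hypothetical, non-atomic stand-ins for the kernel's set_bit() and
 * test_bit(). Like the real helpers, they take a bit NUMBER, not a mask.
 */
static void sketch_set_bit(unsigned int nr, unsigned long *word)
{
	*word |= 1UL << nr;
}

static bool sketch_test_bit(unsigned int nr, const unsigned long *word)
{
	return (*word >> nr) & 1UL;
}

/* Illustrative names only: a bit number and the mask derived from it. */
#define EVENT_BIT_NR   0u
#define EVENT_BIT_MASK (1u << EVENT_BIT_NR)	/* evaluates to 1 */
```

Passing EVENT_BIT_MASK where a bit number is expected shifts twice and touches bit 1 instead of bit 0. As the message notes, this stays self-consistent only as long as every caller makes the same mistake.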

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5 years ago  nvme-pci: make CMB SQ mod-param read-only
Keith Busch [Wed, 6 Jun 2018 14:13:09 +0000 (08:13 -0600)]
nvme-pci: make CMB SQ mod-param read-only

A controller reset after a run time change of the CMB module parameter
breaks the driver. An 'on -> off' will have the driver use NULL for the
host memory queue, and 'off -> on' will use mismatched queue depth between
the device and the host.

We could fix both, but there isn't really a good reason to change this
at run time anyway, compared to at module load time, so this patch makes
the parameter read-only after modprobe.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5 years ago  nvme-pci: unquiesce dead controller queues
Keith Busch [Wed, 6 Jun 2018 14:13:08 +0000 (08:13 -0600)]
nvme-pci: unquiesce dead controller queues

This patch ensures the nvme namespace request queues are not quiesced
on a surprise removal. It's possible the queues were previously killed
in a failed reset, so the queues need to be unquiesced to ensure all
requests are flushed to completion.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5 years ago  nvme-pci: remove HMB teardown on reset
Keith Busch [Wed, 6 Jun 2018 14:13:07 +0000 (08:13 -0600)]
nvme-pci: remove HMB teardown on reset

The controller is required to disable its host memory buffer use on
controller reset. We don't need to submit an admin command to delete it,
so this patch skips sending that command so we don't need to worry about
handling a timeout.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5 years ago  nvme-pci: queue creation fixes
Keith Busch [Wed, 6 Jun 2018 14:13:06 +0000 (08:13 -0600)]
nvme-pci: queue creation fixes

We've been ignoring NVMe error status on queue creations. Fortunately they
are uncommon, but we should handle these anyway. This patch adds checks
for a positive error return value that indicates an NVMe status.

If we do see a negative return, the controller isn't usable, so this
patch returns immediately since we can't unwind that failure.
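
The return-value convention described above (zero for success, positive for an NVMe status, negative for an unusable controller) can be sketched as follows. The enum and function names are hypothetical and do not come from the driver; only the three-way split is taken from the message.

```c
/* Hypothetical outcome labels for a queue-creation call. */
enum create_outcome {
	CREATE_OK,	/* ret == 0: queue was created */
	CREATE_UNWIND,	/* ret > 0: NVMe status, failure can be unwound */
	CREATE_ABORT,	/* ret < 0: controller unusable, return immediately */
};

/* Classify a return value that is 0, a positive NVMe status, or negative. */
static enum create_outcome classify_create_ret(int ret)
{
	if (ret < 0)
		return CREATE_ABORT;
	if (ret > 0)
		return CREATE_UNWIND;
	return CREATE_OK;
}
```

Keeping the sign check in one place makes it harder to treat a positive status as success, which is the oversight the patch addresses.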

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5 years ago  nvme-pci: remove unnecessary completion doorbell check
Keith Busch [Wed, 6 Jun 2018 14:13:05 +0000 (08:13 -0600)]
nvme-pci: remove unnecessary completion doorbell check

The nvme pci driver never unmaps the doorbell registers while the requests
are active, so we can always safely update the completion queue head.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5 years ago  nvme-pci: remove unnecessary nested locking
Keith Busch [Wed, 6 Jun 2018 14:13:04 +0000 (08:13 -0600)]
nvme-pci: remove unnecessary nested locking

The nvme pci driver no longer handles completions under the cq lock,
so the nested locking is not necessary.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5 years ago  nvmet: filter newlines from user input
Sagi Grimberg [Wed, 6 Jun 2018 12:27:48 +0000 (15:27 +0300)]
nvmet: filter newlines from user input

We should avoid consuming the newlines in traddr, trsvcid and
device_path. Add minimal processing to make sure they are gone.
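
One way to do this kind of minimal processing, shown as a user-space sketch rather than the patch's actual code: cut the string at the first newline before storing it. The helper name strip_newline() is an assumption for illustration.

```c
#include <string.h>

/*
 * Truncate buf at its first newline, if it has one, and return the
 * resulting length. A string without a newline is left unchanged.
 */
static size_t strip_newline(char *buf)
{
	size_t len = strcspn(buf, "\n");

	buf[len] = '\0';
	return len;
}
```

Values written with a plain `echo` usually end in a newline, so without a step like this the stored address or path would carry it along.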

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5 years ago  nvme-rdma: correctly check for target keyed sgl support
Steve Wise [Tue, 5 Jun 2018 17:16:41 +0000 (10:16 -0700)]
nvme-rdma: correctly check for target keyed sgl support

The code was checking bit 20 instead of bit 2.  Also fixed the log entry.
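
The difference between the two checks can be shown with a small sketch. The function and macro names are hypothetical; only the bit numbers 2 and 20 come from the message above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit number named in the commit message as the correct one. */
#define KEYED_SGL_BIT 2u

/* Corrected check: test bit 2 of the SGL support field. */
static bool keyed_sgl_supported(uint32_t sgls)
{
	return (sgls & (UINT32_C(1) << KEYED_SGL_BIT)) != 0;
}

/* The earlier check, kept only for comparison: it tests bit 20. */
static bool keyed_sgl_supported_old(uint32_t sgls)
{
	return (sgls & (UINT32_C(1) << 20)) != 0;
}
```

With a field value of 0x4 (only bit 2 set) the corrected check reports support while the old one does not.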

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5 years ago  nvme: don't hold nvmf_transports_rwsem for more than transport lookups
Johannes Thumshirn [Fri, 1 Jun 2018 07:11:20 +0000 (09:11 +0200)]
nvme: don't hold nvmf_transports_rwsem for more than transport lookups

Only take nvmf_transports_rwsem when doing a lookup of registered
transports, so that a blocking ->create_ctrl doesn't prevent other
actions on /dev/nvme-fabrics.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
[hch: increased lock hold time a bit to be safe, added a comment
 and updated the changelog]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5 years ago  nvmet: return all zeroed buffer when we can't find an active namespace
Christoph Hellwig [Thu, 31 May 2018 16:23:48 +0000 (18:23 +0200)]
nvmet: return all zeroed buffer when we can't find an active namespace

Quote from Figure 106 in NVMe 1.3a:

  The Identify Namespace data structure is returned to the host for the
  namespace specified in the Namespace Identifier (CDW1.NSID) field if it
  is an active NSID. If the specified namespace is not an active NSID,
  then the controller returns a zero filled data structure.
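
The quoted behaviour can be sketched in user space as below. The struct and function names are hypothetical and this is not the target driver's code; the 4096-byte size is the Identify data structure size, stated here as an assumption of the sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ID_NS_SIZE 4096	/* assumed size of the Identify data structure */

/* Hypothetical namespace record used only for this sketch. */
struct sketch_ns {
	bool active;
	unsigned char id_data[ID_NS_SIZE];
};

/*
 * Fill out with the namespace's Identify data, or with zeroes when the
 * namespace is missing or not active.
 */
static void fill_identify_ns(const struct sketch_ns *ns, unsigned char *out)
{
	memset(out, 0, ID_NS_SIZE);
	if (ns && ns->active)
		memcpy(out, ns->id_data, ID_NS_SIZE);
}
```

Zeroing first and copying only for an active namespace means the inactive case needs no separate error path.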

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5 years ago  Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64...
Linus Torvalds [Fri, 8 Jun 2018 18:10:58 +0000 (11:10 -0700)]
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 "Apart from the core arm64 and perf changes, the Spectre v4 mitigation
  touches the arm KVM code and the ACPI PPTT support touches drivers/
  (acpi and cacheinfo). I should have the maintainers' acks in place.

  Summary:

   - Spectre v4 mitigation (Speculative Store Bypass Disable) support
     for arm64 using SMC firmware call to set a hardware chicken bit

   - ACPI PPTT (Processor Properties Topology Table) parsing support and
     enable the feature for arm64

   - Report signal frame size to user via auxv (AT_MINSIGSTKSZ). The
     primary motivation is Scalable Vector Extensions which requires
     more space on the signal frame than the currently defined
     MINSIGSTKSZ

   - ARM perf patches: allow building arm-cci as module, demote
     dev_warn() to dev_dbg() in arm-ccn event_init(), miscellaneous
     cleanups

   - cmpwait() WFE optimisation to avoid some spurious wakeups

   - L1_CACHE_BYTES reverted back to 64 (for performance reasons that
     have to do with some network allocations) while keeping
     ARCH_DMA_MINALIGN to 128. cache_line_size() returns the actual
     hardware Cache Writeback Granule

   - Turn LSE atomics on by default in Kconfig

   - Kernel fault reporting tidying

   - Some #include and miscellaneous cleanups"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (53 commits)
  arm64: Fix syscall restarting around signal suppressed by tracer
  arm64: topology: Avoid checking numa mask for scheduler MC selection
  ACPI / PPTT: fix build when CONFIG_ACPI_PPTT is not enabled
  arm64: cpu_errata: include required headers
  arm64: KVM: Move VCPU_WORKAROUND_2_FLAG macros to the top of the file
  arm64: signal: Report signal frame size to userspace via auxv
  arm64/sve: Thin out initialisation sanity-checks for sve_max_vl
  arm64: KVM: Add ARCH_WORKAROUND_2 discovery through ARCH_FEATURES_FUNC_ID
  arm64: KVM: Handle guest's ARCH_WORKAROUND_2 requests
  arm64: KVM: Add ARCH_WORKAROUND_2 support for guests
  arm64: KVM: Add HYP per-cpu accessors
  arm64: ssbd: Add prctl interface for per-thread mitigation
  arm64: ssbd: Introduce thread flag to control userspace mitigation
  arm64: ssbd: Restore mitigation status on CPU resume
  arm64: ssbd: Skip apply_ssbd if not using dynamic mitigation
  arm64: ssbd: Add global mitigation state accessor
  arm64: Add 'ssbd' command-line option
  arm64: Add ARCH_WORKAROUND_2 probing
  arm64: Add per-cpu infrastructure to call ARCH_WORKAROUND_2
  arm64: Call ARCH_WORKAROUND_2 on transitions between EL0 and EL1
  ...

5 years ago  Merge tag 'dmaengine-4.18-rc1' of git://git.infradead.org/users/vkoul/slave-dma
Linus Torvalds [Fri, 8 Jun 2018 18:02:21 +0000 (11:02 -0700)]
Merge tag 'dmaengine-4.18-rc1' of git://git.infradead.org/users/vkoul/slave-dma

Pull dmaengine updates from Vinod Koul:

 - updates to sprd, bam_dma, stm drivers

 - remove VLAs in dmatest

 - move TI drivers to their own subdir

 - switch to SPDX tags for imx/mxs dma drivers

 - simplify getting .drvdata on bunch of drivers by Wolfram Sang

* tag 'dmaengine-4.18-rc1' of git://git.infradead.org/users/vkoul/slave-dma: (32 commits)
  dmaengine: sprd: Add Spreadtrum DMA configuration
  dmaengine: sprd: Optimize the sprd_dma_prep_dma_memcpy()
  dmaengine: imx-dma: Switch to SPDX identifier
  dmaengine: mxs-dma: Switch to SPDX identifier
  dmaengine: imx-sdma: Switch to SPDX identifier
  dmaengine: usb-dmac: Document R8A7799{0,5} bindings
  dmaengine: qcom: bam_dma: fix some doc warnings.
  dmaengine: qcom: bam_dma: fix invalid assignment warning
  dmaengine: sprd: fix an NULL vs IS_ERR() bug
  dmaengine: sprd: Use devm_ioremap_resource() to map memory
  dmaengine: sprd: Fix potential NULL dereference in sprd_dma_probe()
  dmaengine: pl330: flush before wait, and add dev burst support.
  dmaengine: axi-dmac: Request IRQ with IRQF_SHARED
  dmaengine: stm32-mdma: fix spelling mistake: "avalaible" -> "available"
  dmaengine: rcar-dmac: Document R-Car D3 bindings
  dmaengine: sprd: Move DMA request mode and interrupt type into head file
  dmaengine: sprd: Define the DMA data width type
  dmaengine: sprd: Define the DMA transfer step type
  dmaengine: ti: New directory for Texas Instruments DMA drivers
  dmaengine: shdmac: Change platform check to CONFIG_ARCH_RENESAS
  ...

5 years ago  Merge tag 'iommu-updates-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 8 Jun 2018 17:44:33 +0000 (10:44 -0700)]
Merge tag 'iommu-updates-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU updates from Joerg Roedel:
 "Nothing big this time. In particular:

   - Debugging code for Tegra-GART

   - Improvement in Intel VT-d fault printing to prevent soft-lockups
     when on fault storms

   - Improvements in AMD IOMMU event reporting

   - NUMA aware allocation in io-pgtable code for ARM

   - Various other small fixes and cleanups all over the place"

* tag 'iommu-updates-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/io-pgtable-arm: Make allocations NUMA-aware
  iommu/amd: Prevent possible null pointer dereference and infinite loop
  iommu/amd: Fix grammar of comments
  iommu: Clean up the comments for iommu_group_alloc
  iommu/vt-d: Remove unnecessary parentheses
  iommu/vt-d: Clean up pasid quirk for pre-production devices
  iommu/vt-d: Clean up unused variable in find_or_alloc_domain
  iommu/vt-d: Fix iotlb psi missing for mappings
  iommu/vt-d: Introduce __mapping_notify_one()
  iommu: Remove extra NULL check when call strtobool()
  iommu/amd: Update logging information for new event type
  iommu/amd: Update the PASID information printed to the system log
  iommu/tegra: gart: Fix gart_iommu_unmap()
  iommu/tegra: gart: Add debugging facility
  iommu/io-pgtable-arm: Use for_each_set_bit to simplify code
  iommu/qcom: Simplify getting .drvdata
  iommu: Remove depends on HAS_DMA in case of platform dependency
  iommu/vt-d: Ratelimit each dmar fault printing

5 years ago  Merge tag 'mtd/for-4.18' of git://git.infradead.org/linux-mtd
Linus Torvalds [Fri, 8 Jun 2018 17:39:20 +0000 (10:39 -0700)]
Merge tag 'mtd/for-4.18' of git://git.infradead.org/linux-mtd

Pull MTD updates from Boris Brezillon:
 "Core changes:
   - Add a sysfs attribute to expose available OOB size

  Driver changes:
   - Remove HAS_DMA dependency on various drivers
   - Use dev_get_drvdata() instead of platform_get_drvdata() in docg3
   - Replace msleep by usleep_range() in the dataflash driver
   - Avoid VLA usage in nftl layers
   - Remove useless .owner assignment in pismo
   - Fix various issues in the CFI driver
   - Improve TRX partition handling and expose a DT compat for this part
     parser
   - Clarify OFFSET_CONTINUOUS meaning

  NAND core changes:
   - Add Miquel as a NAND maintainer
   - Add access mode to the nand_page_io_req struct
   - Fix kernel-doc in rawnand.h
   - Support bit-wise majority to recover from corrupted ONFI parameter
     pages
   - Stop checking FAIL bit after a SET_FEATURES, as documented in the
     ONFI spec

  Raw NAND Driver changes:
   - Fix and cleanup the error path of many NAND controller drivers
   - GPMI:
      + Cleanup/simplification of a few aspects in the driver
      + Take ECC setup specified in the DT into account
   - sunxi: remove support for GPIO-based R/B polling
   - MTK:
      + Use of_device_get_match_data() instead of of_match_device()
      + Add an entry in MAINTAINERS for this driver
      + Fix nand-ecc-step-size and nand-ecc-strength description in the
        DT bindings doc
   - fsl_ifc: fix ->cmdfunc() to read more than one ONFI parameter page

  OneNAND driver changes:
   - samsung: use dev_get_drvdata() instead of platform_get_drvdata()

  SPI NOR core changes:
   - Add support for a bunch of SPI NOR chips
   - Clear EAR reg when switching to 3-byte addressing mode on Winbond
     chips

  SPI NOR controller driver changes:
   - cadence: Add DMA support for direct mode reads
   - hisi: Prefix a few functions with hisi_
   - intel:
      + Mark the driver as "dangerous" in Kconfig
      + Fix atomic sequence handling
      + Pass a 40us delay (instead of 0us) to readl_poll_timeout()
   - fsl:
      + fix a typo in a function name
      + add support for IP variants embedded in the ls2080a and ls1080a
        SoCs
   - stm32: request exclusive control of the reset line"
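
The "bit-wise majority" recovery mentioned under the NAND core changes can be illustrated with a generic sketch: given several possibly corrupted copies of the same data, each output bit takes the value held by more than half of the copies. The function name and signature are hypothetical and this is not the rawnand implementation.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Rebuild len bytes into dst. For every bit position, the output bit is
 * set when it is set in more than half of the ncopies source buffers.
 */
static void bitwise_majority(const uint8_t *const *copies, size_t ncopies,
			     uint8_t *dst, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		uint8_t out = 0;

		for (unsigned int bit = 0; bit < 8; bit++) {
			size_t ones = 0;

			/* Count the copies that have this bit set. */
			for (size_t c = 0; c < ncopies; c++)
				ones += (copies[c][i] >> bit) & 1u;
			if (2 * ones > ncopies)
				out |= (uint8_t)(1u << bit);
		}
		dst[i] = out;
	}
}
```

With three copies, a bit flipped in any single copy is outvoted by the other two, so the data can be recovered even when no individual copy is fully intact.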

* tag 'mtd/for-4.18' of git://git.infradead.org/linux-mtd: (66 commits)
  mtd: nand: Pass mode information to nand_page_io_req
  mtd: cfi_cmdset_0002: Change erase one block to enable XIP once
  mtd: cfi_cmdset_0002: Change erase functions to check chip good only
  mtd: cfi_cmdset_0002: Change erase functions to retry for error
  mtd: cfi_cmdset_0002: Change definition naming to retry write operation
  mtd: cfi_cmdset_0002: Change write buffer to check correct value
  mtd: cmdlinepart: Update comment for introduction of OFFSET_CONTINUOUS
  mtd: bcm47xxpart: add of_match_table with a new DT binding
  dt-bindings: mtd: document Broadcom's BCM47xx partitions
  mtd: spi-nor: Add support for EN25QH32
  mtd: spi-nor: Add support for is25wp series chips
  mtd: spi-nor: Add Winbond w25q32jv support
  mtd: spi-nor: fsl-quadspi: add support for ls2080a/ls1080a
  mtd: spi-nor: stm32-quadspi: explicitly request exclusive reset control
  mtd: spi-nor: intel: provide a range for poll_timout
  mtd: spi-nor: fsl-quadspi: fix api naming typo _init_ahb_read
  mtd: spi-nor: intel-spi: Explicitly mark the driver as dangerous in Kconfig
  mtd: spi-nor: intel-spi: Fix atomic sequence handling
  mtd: rawnand: Do not check FAIL bit when executing a SET_FEATURES op
  mtd: rawnand: use bit-wise majority to recover the ONFI param page
  ...

5 years ago  Merge tag 'gpio-v4.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw...
Linus Torvalds [Fri, 8 Jun 2018 17:31:52 +0000 (10:31 -0700)]
Merge tag 'gpio-v4.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO updates from Linus Walleij:
 "This is the bulk of GPIO changes for the v4.18 development cycle.

  Core changes:

   - We have killed off VLA from the core library and all drivers.

     The background should be clear for everyone at this point:

        https://lwn.net/Articles/749064/

     Also I just don't like VLA's, kernel developers hate it when
     compilers do things behind their back. It's as simple as that.

     I'm sorry that they even slipped in to begin with. Kudos to Laura
     Abbott for exorcising them.

   - Support GPIO hogs in machines/board files.

  New drivers and chip support:

   - R-Car r8a77470 (RZ/G1C)

   - R-Car r8a77965 (M3-N)

   - R-Car r8a77990 (E3)

   - PCA953x driver improvements to accommodate more variants.

  Improvements and new features:

   - Support one interrupt per line on port A in the DesignWare dwapb
     driver.

  Misc:

   - Random cleanups, right header files in the drivers, some size
     optimizations etc"

* tag 'gpio-v4.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (73 commits)
  gpio: davinci: fix build warning when !CONFIG_OF
  gpio: dwapb: Fix rework support for 1 interrupt per port A GPIO
  gpio: pxa: Include the right header
  gpio: pl061: Include the right header
  gpio: pch: Include the right header
  gpio: pcf857x: Include the right header
  gpio: pca953x: Include the right header
  gpio: palmas: Include the right header
  gpio: omap: Include the right header
  gpio: octeon: Include the right header
  gpio: mxs: Switch to SPDX identifier
  gpio: Remove VLA from stmpe driver
  gpio: mxc: Switch to SPDX identifier
  gpio: mxc: add clock operation
  gpio: Remove VLA from gpiolib
  gpio: aspeed: Use a cache of output data registers
  gpio: aspeed: Set output latch before changing direction
  gpio: pca953x: fix address calculation for pcal6524
  gpio: pca953x: define masks for addressing common and extended registers
  gpio: pca953x: set the PCA_PCAL flag also when matching by DT
  ...

5 years ago  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Linus Torvalds [Fri, 8 Jun 2018 17:29:26 +0000 (10:29 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid

Pull HID updates from Jiri Kosina:

 - Valve Steam Controller support from Rodrigo Rivas Costa

 - Redragon Asura support from Robert Munteanu

 - improvement of duplicate usage handling in generic hid-input from
   Benjamin Tissoires

 - Win 8.1 precision touchpad spec implementation from Benjamin
   Tissoires

 - Support for "In Range" flag for Wacom Intuos/Bamboo devices from
   Jason Gerecke

 - other various assorted smaller fixes and improvements

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (27 commits)
  HID: rmi: use HID_QUIRK_NO_INPUT_SYNC
  HID: multitouch: fix calculation of last slot field in multi-touch reports
  HID: quirks: remove Delcom Visual Signal Indicator from hid_have_special_driver[]
  HID: steam: select CONFIG_POWER_SUPPLY
  HID: i2c-hid: remove i2c_hid_open_mut
  HID: wacom: Support "in range" for Intuos/Bamboo tablets where possible
  HID: core: fix hid_hw_open() comment
  HID: hid-plantronics: Re-resend Update to map button for PTT products
  HID: multitouch: fix types returned from mt_need_to_apply_feature()
  HID: i2c-hid: check if device is there before really probing
  HID: steam: add missing fields in client initialization
  HID: steam: add battery device.
  HID: add driver for Valve Steam Controller
  HID: alps: Fix some style in 't4_read_write_register()'
  HID: alps: Check errors returned by 't4_read_write_register()'
  HID: alps: Save a memory allocation in 't4_read_write_register()' when writing data
  HID: alps: Report an error if we receive invalid data in 't4_read_write_register()'
  HID: multitouch: implement precision touchpad latency and switches
  HID: multitouch: simplify the settings of the various features
  HID: multitouch: make use of HID_QUIRK_INPUT_PER_APP
  ...

5 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livep...
Linus Torvalds [Fri, 8 Jun 2018 17:27:41 +0000 (10:27 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jikos/livepatching

Pull livepatching fixlet from Jiri Kosina:
 "livepatching documentation fix from Petr Mladek"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
  livepatch: Remove not longer valid limitations from the documentation

5 years agoMerge branch 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Fri, 8 Jun 2018 17:00:20 +0000 (10:00 -0700)]
Merge branch 'work.aio' of git://git./linux/kernel/git/viro/vfs

Pull aio iopriority support from Al Viro:
 "The rest of aio stuff for this cycle - Adam's aio ioprio series"

* 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs: aio ioprio use ioprio_check_cap ret val
  fs: aio ioprio add explicit block layer dependence
  fs: iomap dio set bio prio from kiocb prio
  fs: blkdev set bio prio from kiocb prio
  fs: Add aio iopriority support
  fs: Convert kiocb rw_hint from enum to u16
  block: add ioprio_check_cap function

5 years agoMerge branch 'work.lookup' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Fri, 8 Jun 2018 16:56:38 +0000 (09:56 -0700)]
Merge branch 'work.lookup' of git://git./linux/kernel/git/viro/vfs

Pull proc_fill_cache regression fix from Al Viro:
 "Regression fix for proc_fill_cache() braino introduced when switching
  instantiate() callback to d_splice_alias()"

* 'work.lookup' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fix proc_fill_cache() in case of d_alloc_parallel() failure

5 years agoMerge tag 'for-linus-4.18-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 8 Jun 2018 16:24:54 +0000 (09:24 -0700)]
Merge tag 'for-linus-4.18-rc1-tag' of git://git./linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:
 "This contains some minor code cleanups (fixing return types of
  functions), some fixes for Linux running as Xen PVH guest, and adding
  of a new guest resource mapping feature for Xen tools"

* tag 'for-linus-4.18-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/PVH: Make GDT selectors PVH-specific
  xen/PVH: Set up GS segment for stack canary
  xen/store: do not store local values in xen_start_info
  xen-netfront: fix xennet_start_xmit()'s return type
  xen/privcmd: add IOCTL_PRIVCMD_MMAP_RESOURCE
  xen: Change return type to vm_fault_t

5 years agoMerge branch 'regulator-4.17' into regulator-4.18 merge window
Mark Brown [Fri, 8 Jun 2018 15:27:56 +0000 (16:27 +0100)]
Merge branch 'regulator-4.17' into regulator-4.18 merge window

5 years agomd: Unify mddev destruction paths
Kent Overstreet [Fri, 8 Jun 2018 00:52:54 +0000 (20:52 -0400)]
md: Unify mddev destruction paths

Previously, mddev_put() had a couple different paths for freeing a
mddev, due to the fact that the kobject wasn't initialized when the
mddev was first allocated. If we move the kobject_init() to when it's
first allocated and just use kobject_add() later, we can clean all this
up.

This also removes a hack in mddev_put() to avoid freeing biosets under a
spinlock, which involved copying biosets on the stack after the reset
bioset_init() changes (the "reset" above is presumably meant as "recent").

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years agodm: use bioset_init_from_src() to copy bio_set
Jens Axboe [Thu, 7 Jun 2018 20:42:06 +0000 (14:42 -0600)]
dm: use bioset_init_from_src() to copy bio_set

We can't just copy and clear a bio_set; use the bio helper to
set up a new bio_set with the settings from another one.

Fixes: 6f1c819c219f ("dm: convert to bioset_init()/mempool_init()")
Reported-by: Venkat R.B <vrbagal1@linux.vnet.ibm.com>
Tested-by: Venkat R.B <vrbagal1@linux.vnet.ibm.com>
Tested-by: Li Wang <liwang@redhat.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years agoblock: add bioset_init_from_src() helper
Jens Axboe [Thu, 7 Jun 2018 20:42:05 +0000 (14:42 -0600)]
block: add bioset_init_from_src() helper

Add a helper that allows a caller to initialize a new bio_set,
using the settings from an existing bio_set.

Reported-by: Venkat R.B <vrbagal1@linux.vnet.ibm.com>
Tested-by: Venkat R.B <vrbagal1@linux.vnet.ibm.com>
Tested-by: Li Wang <liwang@redhat.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years agoarm64: Fix syscall restarting around signal suppressed by tracer
Dave Martin [Thu, 7 Jun 2018 11:32:05 +0000 (12:32 +0100)]
arm64: Fix syscall restarting around signal suppressed by tracer

Commit 17c2895 ("arm64: Abstract syscallno manipulation") abstracts
out the pt_regs.syscallno value for a syscall cancelled by a tracer
as NO_SYSCALL, and provides helpers to set and check for this
condition.  However, the way this was implemented has the
unintended side-effect of disabling part of the syscall restart
logic.

This comes about because the second in_syscall() check in
do_signal() re-evaluates the "in a syscall" condition based on the
updated pt_regs instead of the original pt_regs.  forget_syscall()
is explicitly called prior to the second check in order to prevent
restart logic in the ret_to_user path being spuriously triggered,
which means that the second in_syscall() check always yields false.

This triggers a failure in
tools/testing/selftests/seccomp/seccomp_bpf.c, when using ptrace to
suppress a signal that interrupts a nanosleep() syscall.

Misbehaviour of this type is only expected in the case where a
tracer suppresses a signal and the target process is either being
single-stepped or the interrupted syscall attempts to restart via
-ERESTARTBLOCK.

This patch restores the old behaviour by performing the
in_syscall() check only once at the start of the function.

Fixes: 17c289586009 ("arm64: Abstract syscallno manipulation")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reported-by: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org> # 4.14.x-
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
5 years agoMerge branch 'for-4.18/wacom' into for-linus
Jiri Kosina [Fri, 8 Jun 2018 08:28:24 +0000 (10:28 +0200)]
Merge branch 'for-4.18/wacom' into for-linus

Support for "In Range" flag for Wacom Intuos/Bamboo devices from Jason Gerecke

5 years agoMerge branch 'for-4.18/upstream' into for-linus
Jiri Kosina [Fri, 8 Jun 2018 08:27:40 +0000 (10:27 +0200)]
Merge branch 'for-4.18/upstream' into for-linus

5 years agoMerge branch 'for-4.18/rmi' into for-linus
Jiri Kosina [Fri, 8 Jun 2018 08:27:02 +0000 (10:27 +0200)]
Merge branch 'for-4.18/rmi' into for-linus

RMI4 correct split report handling from Benjamin Tissoires

5 years agoMerge branch 'for-4.18/plantronics' into for-linus
Jiri Kosina [Fri, 8 Jun 2018 08:26:18 +0000 (10:26 +0200)]
Merge branch 'for-4.18/plantronics' into for-linus

5 years agoMerge branch 'for-4.18/multitouch' into for-linus
Jiri Kosina [Fri, 8 Jun 2018 08:25:50 +0000 (10:25 +0200)]
Merge branch 'for-4.18/multitouch' into for-linus

- improvement of duplicate usage handling in hid-input from Benjamin Tissoires
- Win 8.1 precision touchpad spec implementation from Benjamin Tissoires

5 years agoMerge branch 'for-4.18/i2c-hid' into for-linus
Jiri Kosina [Fri, 8 Jun 2018 08:23:34 +0000 (10:23 +0200)]
Merge branch 'for-4.18/i2c-hid' into for-linus

Assorted smaller fixes to i2c-hid driver

5 years agoMerge branch 'for-4.18/hid-steam' into for-linus
Jiri Kosina [Fri, 8 Jun 2018 08:22:26 +0000 (10:22 +0200)]
Merge branch 'for-4.18/hid-steam' into for-linus

Valve Steam Controller support from Rodrigo Rivas Costa

5 years agoMerge branch 'for-4.18/hid-redragon' into for-linus
Jiri Kosina [Fri, 8 Jun 2018 08:21:47 +0000 (10:21 +0200)]
Merge branch 'for-4.18/hid-redragon' into for-linus

Redragon Asura support from Robert Munteanu

5 years agoMerge branch 'for-4.18/alps' into for-linus
Jiri Kosina [Fri, 8 Jun 2018 08:20:42 +0000 (10:20 +0200)]
Merge branch 'for-4.18/alps' into for-linus

hid-alps driver cleanups wrt. t4_read_write_register() handling
from Christophe Jaillet

5 years agofix proc_fill_cache() in case of d_alloc_parallel() failure
Al Viro [Fri, 8 Jun 2018 05:17:11 +0000 (01:17 -0400)]
fix proc_fill_cache() in case of d_alloc_parallel() failure

If d_alloc_parallel() returns ERR_PTR(...), we don't want to dput()
that.  A small reorganization lets all error-in-lookup cases rejoin
the main codepath after dput(child), avoiding the entire problem.
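
The hazard can be sketched in user space.  This is not the procfs code:
err_ptr(), is_err() and ptr_err() only imitate the kernel's ERR_PTR(),
IS_ERR() and PTR_ERR(), while struct obj, obj_alloc(), obj_put() and
lookup_sketch() are hypothetical stand-ins for the dentry, the
allocation, dput() and the caller.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* User-space imitation of the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(). */
static void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static int is_err(const void *p)
{
	/* Error codes live in the top MAX_ERRNO values of the address space. */
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

struct obj {
	int refcount;
};

/* Hypothetical stand-in for dput(): only valid on a real object. */
static void obj_put(struct obj *o)
{
	o->refcount--;
}

/* Hypothetical allocation that returns either an object or an encoded error. */
static struct obj *obj_alloc(struct obj *storage, int fail)
{
	if (fail)
		return err_ptr(-ENOMEM);
	storage->refcount = 1;
	return storage;
}

/* Returns 0 or a negative errno; never drops a reference on an error pointer. */
static long lookup_sketch(struct obj *storage, int fail)
{
	struct obj *child = obj_alloc(storage, fail);

	if (is_err(child))
		return ptr_err(child);	/* bail out before any obj_put() */

	/* ... use child here ... */
	obj_put(child);
	return 0;
}
```

The point of the sketch is only the ordering: the error check happens
before the reference drop, so the drop is never reached with an encoded
error.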

Spotted-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Fixes: 0168b9e38c42 "procfs: switch instantiate_t to d_splice_alias()"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
5 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Fri, 8 Jun 2018 01:39:37 +0000 (18:39 -0700)]
Merge branch 'akpm' (patches from Andrew)

Merge updates from Andrew Morton:

 - a few misc things

 - ocfs2 updates

 - v9fs updates

 - MM

 - procfs updates

 - lib/ updates

 - autofs updates

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (118 commits)
  autofs: small cleanup in autofs_getpath()
  autofs: clean up includes
  autofs: comment on selinux changes needed for module autoload
  autofs: update MAINTAINERS entry for autofs
  autofs: use autofs instead of autofs4 in documentation
  autofs: rename autofs documentation files
  autofs: create autofs Kconfig and Makefile
  autofs: delete fs/autofs4 source files
  autofs: update fs/autofs4/Makefile
  autofs: update fs/autofs4/Kconfig
  autofs: copy autofs4 to autofs
  autofs4: use autofs instead of autofs4 everywhere
  autofs4: merge auto_fs.h and auto_fs4.h
  fs/binfmt_misc.c: do not allow offset overflow
  checkpatch: improve patch recognition
  lib/ucs2_string.c: add MODULE_LICENSE()
  lib/mpi: headers cleanup
  lib/percpu_ida.c: use _irqsave() instead of local_irq_save() + spin_lock
  lib/idr.c: remove simple_ida_lock
  lib/bitmap.c: micro-optimization for __bitmap_complement()
  ...

5 years agoautofs: small cleanup in autofs_getpath()
Dan Carpenter [Fri, 8 Jun 2018 00:11:52 +0000 (17:11 -0700)]
autofs: small cleanup in autofs_getpath()

We don't set "*name" so it's slightly nicer to just pass "name" instead
of "&name".

Link: http://lkml.kernel.org/r/20180531064736.lnisb55eajwjynvk@kili.mountain
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoautofs: clean up includes
Ian Kent [Fri, 8 Jun 2018 00:11:48 +0000 (17:11 -0700)]
autofs: clean up includes

Remove includes that aren't needed from autofs (and fs/compat_ioctl.c).

Link: http://lkml.kernel.org/r/152635085258.5968.9743527195522188148.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoautofs: comment on selinux changes needed for module autoload
Ian Kent [Fri, 8 Jun 2018 00:11:45 +0000 (17:11 -0700)]
autofs: comment on selinux changes needed for module autoload

Because the autofs4 module uses a file system type name of autofs,
which differs from the name of the directory containing the module,
autoload did not function properly.  To work around this, kernel
configurations have often elected to build the module into the kernel.

This can result in selinux policies that prohibit autoloading of the
autofs module which need to be changed.

Add a comment about this to "possible changes" section of the autofs4
module help.

Link: http://lkml.kernel.org/r/152686474171.6155.1239659539983577463.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoautofs: update MAINTAINERS entry for autofs
Ian Kent [Fri, 8 Jun 2018 00:11:42 +0000 (17:11 -0700)]
autofs: update MAINTAINERS entry for autofs

Update the autofs entry in MAINTAINERS to reflect the rename of autofs4
to autofs.

Link: http://lkml.kernel.org/r/152626709611.28589.456596640024354223.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoautofs: use autofs instead of autofs4 in documentation
Ian Kent [Fri, 8 Jun 2018 00:11:38 +0000 (17:11 -0700)]
autofs: use autofs instead of autofs4 in documentation

Finally remove autofs4 references in the filesystems documentation.

Link: http://lkml.kernel.org/r/152626709055.28589.416082809460051475.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoautofs: rename autofs documentation files
Ian Kent [Fri, 8 Jun 2018 00:11:35 +0000 (17:11 -0700)]
autofs: rename autofs documentation files

There are two files in Documentation/filesystems that should now use
autofs rather than autofs4 in their names.

Link: http://lkml.kernel.org/r/152626707957.28589.3325300375892913999.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoautofs: create autofs Kconfig and Makefile
Ian Kent [Fri, 8 Jun 2018 00:11:31 +0000 (17:11 -0700)]
autofs: create autofs Kconfig and Makefile

Create Makefile and Kconfig for autofs module.

[raven@themaw.net: make autofs4 Kconfig depend on AUTOFS_FS]
Link: http://lkml.kernel.org/r/152687649097.8263.7046086367407522029.stgit@pluto.themaw.net
Link: http://lkml.kernel.org/r/152626705591.28589.356365986974038383.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoautofs: delete fs/autofs4 source files
Ian Kent [Fri, 8 Jun 2018 00:11:26 +0000 (17:11 -0700)]
autofs: delete fs/autofs4 source files

Delete the now unused autofs4 module files.

Link: http://lkml.kernel.org/r/152626707391.28589.3553309771262313504.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoautofs: update fs/autofs4/Makefile
Ian Kent [Fri, 8 Jun 2018 00:11:22 +0000 (17:11 -0700)]
autofs: update fs/autofs4/Makefile

Update Makefile to build from source in fs/autofs instead of fs/autofs4.

Link: http://lkml.kernel.org/r/152626706824.28589.1915028175544560855.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoautofs: update fs/autofs4/Kconfig
Ian Kent [Fri, 8 Jun 2018 00:11:17 +0000 (17:11 -0700)]
autofs: update fs/autofs4/Kconfig

Update Kconfig and add a deprecated warning.

[raven@themaw.net: make autofs4 Kconfig depend on AUTOFS_FS]
Link: http://lkml.kernel.org/r/152687649097.8263.7046086367407522029.stgit@pluto.themaw.net
Link: http://lkml.kernel.org/r/152626706133.28589.11994171621899212952.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoautofs: copy autofs4 to autofs
Ian Kent [Fri, 8 Jun 2018 00:11:13 +0000 (17:11 -0700)]
autofs: copy autofs4 to autofs

Copy source files from the autofs4 directory to the autofs directory.

Link: http://lkml.kernel.org/r/152626705013.28589.931913083997578251.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoautofs4: use autofs instead of autofs4 everywhere
Ian Kent [Fri, 8 Jun 2018 00:11:09 +0000 (17:11 -0700)]
autofs4: use autofs instead of autofs4 everywhere

Update naming within autofs source to be consistent by changing
occurrences of autofs4 to autofs.

Link: http://lkml.kernel.org/r/152626703688.28589.8315406711135226803.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoautofs4: merge auto_fs.h and auto_fs4.h
Ian Kent [Fri, 8 Jun 2018 00:11:05 +0000 (17:11 -0700)]
autofs4: merge auto_fs.h and auto_fs4.h

The autofs module has long since been removed so there's no need to have
two separate include files for autofs.

Link: http://lkml.kernel.org/r/152626703024.28589.9571964661718767929.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agofs/binfmt_misc.c: do not allow offset overflow
Thadeu Lima de Souza Cascardo [Fri, 8 Jun 2018 00:11:01 +0000 (17:11 -0700)]
fs/binfmt_misc.c: do not allow offset overflow

When registering a new binfmt_misc handler, it is possible to overflow
the offset to get a negative value, which might crash the system, or
possibly leak kernel data.

Here is a crash log when 2500000000 was used as an offset:

  BUG: unable to handle kernel paging request at ffff989cfd6edca0
  IP: load_misc_binary+0x22b/0x470 [binfmt_misc]
  PGD 1ef3e067 P4D 1ef3e067 PUD 0
  Oops: 0000 [#1] SMP NOPTI
  Modules linked in: binfmt_misc kvm_intel ppdev kvm irqbypass joydev input_leds serio_raw mac_hid parport_pc qemu_fw_cfg parpy
  CPU: 0 PID: 2499 Comm: bash Not tainted 4.15.0-22-generic #24-Ubuntu
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014
  RIP: 0010:load_misc_binary+0x22b/0x470 [binfmt_misc]
  Call Trace:
    search_binary_handler+0x97/0x1d0
    do_execveat_common.isra.34+0x667/0x810
    SyS_execve+0x31/0x40
    do_syscall_64+0x73/0x130
    entry_SYSCALL_64_after_hwframe+0x3d/0xa2

Use kstrtoint instead of simple_strtoul.  This works because the code
already sets the delimiter byte to '\0', and we only do it when the field
is not empty.

Tested with offsets -1, 2500000000, UINT_MAX and INT_MAX.  Also tested
with examples documented at Documentation/admin-guide/binfmt-misc.rst
and other registrations from packages on Ubuntu.
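
The difference between the two parsing styles can be sketched in user
space.  This is not the kernel code: parse_offset_lenient() and
parse_offset_strict() are hypothetical names, and strtoul()/strtol()
only stand in for simple_strtoul() and kstrtoint() (for instance,
strtol() skips leading whitespace, which the kernel helper is not
assumed to do).

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Lenient parse, similar in spirit to storing an unsigned long result
 * in an int: values above INT_MAX are converted in an
 * implementation-defined way and come out negative on common targets.
 */
static int parse_offset_lenient(const char *s)
{
	return (int)strtoul(s, NULL, 10);
}

/*
 * Strict parse: reject empty input, trailing garbage and anything
 * outside the int range.  Returns 0 on success, -EINVAL or -ERANGE.
 */
static int parse_offset_strict(const char *s, int *out)
{
	char *end;
	long v;

	assert(s != NULL && out != NULL);

	errno = 0;
	v = strtol(s, &end, 10);
	if (end == s || *end != '\0')
		return -EINVAL;
	if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
		return -ERANGE;

	*out = (int)v;
	return 0;
}
```

With the offset from the crash log, the lenient version yields a
negative int while the strict version reports a range error.  A strict
parse still accepts "-1", so a caller that needs a non-negative offset
has to check for that separately.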

Link: http://lkml.kernel.org/r/20180529135648.14254-1-cascardo@canonical.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agocheckpatch: improve patch recognition
Joe Perches [Fri, 8 Jun 2018 00:10:58 +0000 (17:10 -0700)]
checkpatch: improve patch recognition

There are mode change and rename only patches that are unrecognized by
checkpatch.

Recognize them.

[joe@perches.com: fix missing close parenthesis]
Link: http://lkml.kernel.org/r/af44c893f6973393f2a5b11f1a8e5cd4c8bbbba5.camel@perches.com
Link: http://lkml.kernel.org/r/974a407e6fa18abd5a965da39cc68986a4c4f091.1526949367.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Reported-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agolib/ucs2_string.c: add MODULE_LICENSE()
Randy Dunlap [Fri, 8 Jun 2018 00:10:55 +0000 (17:10 -0700)]
lib/ucs2_string.c: add MODULE_LICENSE()

Fix missing MODULE_LICENSE() warning in lib/ucs2_string.c:

  WARNING: modpost: missing MODULE_LICENSE() in lib/ucs2_string.o
  see include/linux/module.h for more information

Link: http://lkml.kernel.org/r/b2505bb4-dcf5-fc46-443d-e47db1cb2f59@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agolib/mpi: headers cleanup
Vasily Averin [Fri, 8 Jun 2018 00:10:51 +0000 (17:10 -0700)]
lib/mpi: headers cleanup

MPI headers contain definitions for a huge number of non-existent
functions.

Most of these functions were removed in 2012 by Dmitry Kasatkin
 - 7cf4206a99d1 ("Remove unused code from MPI library")
 - 9e235dcaf4f6 ("Revert "crypto: GnuPG based MPI lib - additional ...")
 - bc95eeadf5c6 ("lib/mpi: removed unused functions")
however, the headers were not updated properly.

I also deleted some unused macros.

Link: http://lkml.kernel.org/r/fb2fc1ef-1185-f0a3-d8d0-173d2f97bbaf@virtuozzo.com
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agolib/percpu_ida.c: use _irqsave() instead of local_irq_save() + spin_lock
Sebastian Andrzej Siewior [Fri, 8 Jun 2018 00:10:48 +0000 (17:10 -0700)]
lib/percpu_ida.c: use _irqsave() instead of local_irq_save() + spin_lock

percpu_ida() decouples disabling interrupts from the locking operations.
This breaks some assumptions if the locking operations are replaced like
they are under -RT.

The same locking can be achieved by avoiding local_irq_save() and using
spin_lock_irqsave() instead.  percpu_ida_alloc() gains one more preemption
point because after unlocking the fastpath and before the pool lock is
acquired, the interrupts are briefly enabled.

Link: http://lkml.kernel.org/r/20180504153218.7301-1-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Shaohua Li <shli@fb.com>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agolib/idr.c: remove simple_ida_lock
Matthew Wilcox [Fri, 8 Jun 2018 00:10:45 +0000 (17:10 -0700)]
lib/idr.c: remove simple_ida_lock

Improve the scalability of the IDA by using the per-IDA xa_lock rather
than the global simple_ida_lock.  IDAs are not typically used in
performance-sensitive locations, but since we have this lock anyway, we
can use it.  It is also a step towards converting the IDA from the radix
tree to the XArray.

[akpm@linux-foundation.org: idr.c needs xarray.h]
Link: http://lkml.kernel.org/r/20180331125332.GF13332@bombadil.infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agolib/bitmap.c: micro-optimization for __bitmap_complement()
Yury Norov [Fri, 8 Jun 2018 00:10:41 +0000 (17:10 -0700)]
lib/bitmap.c: micro-optimization for __bitmap_complement()

Use the BITS_TO_LONGS() macro to avoid calculating the remainder (bits %
BITS_PER_LONG).  On ARM64 it saves 5 instructions for the function: 16
before and 11 after.
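
A minimal user-space sketch of the idea follows.  The macros are
redefined locally to mirror the kernel's, and
bitmap_complement_sketch() is a hypothetical name, not the patched
function itself.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
/* Number of longs needed to hold nbits bits, rounded up. */
#define BITS_TO_LONGS(nbits) (((nbits) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/*
 * Complement src into dst one whole word at a time.  There is no
 * separate tail case for (nbits % BITS_PER_LONG): the last word is
 * flipped in full, so bits past nbits must be ignored by the caller.
 */
static void bitmap_complement_sketch(unsigned long *dst,
				     const unsigned long *src,
				     unsigned int nbits)
{
	size_t k, words = BITS_TO_LONGS(nbits);

	for (k = 0; k < words; k++)
		dst[k] = ~src[k];
}
```

Rounding the word count up replaces the "full words, then partial
word" structure with a single loop, which is where the saved
instructions are assumed to come from.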

Link: http://lkml.kernel.org/r/20180411145914.6011-1-ynorov@caviumnetworks.com
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoget_maintainer: improve patch recognition
Joe Perches [Fri, 8 Jun 2018 00:10:38 +0000 (17:10 -0700)]
get_maintainer: improve patch recognition

There are mode change and rename only patches that are unrecognized
by the get_maintainer.pl script.

Recognize them.

Link: http://lkml.kernel.org/r/bf63101a908d0ff51948164aa60e672368066186.1526949367.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Reported-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agokernel/hung_task.c: show all hung tasks before panic
Tetsuo Handa [Fri, 8 Jun 2018 00:10:34 +0000 (17:10 -0700)]
kernel/hung_task.c: show all hung tasks before panic

When we get a hung task it can often be valuable to see _all_ the hung
tasks on the system before calling panic().

Quoting from https://syzkaller.appspot.com/text?tag=CrashReport&id=5316056503549952
----------------------------------------
INFO: task syz-executor0:6540 blocked for more than 120 seconds.
      Not tainted 4.16.0+ #13
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor0   D23560  6540   4521 0x80000004
Call Trace:
 context_switch kernel/sched/core.c:2848 [inline]
 __schedule+0x8fb/0x1ef0 kernel/sched/core.c:3490
 schedule+0xf5/0x430 kernel/sched/core.c:3549
 schedule_preempt_disabled+0x10/0x20 kernel/sched/core.c:3607
 __mutex_lock_common kernel/locking/mutex.c:833 [inline]
 __mutex_lock+0xb7f/0x1810 kernel/locking/mutex.c:893
 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
 lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355
 __blkdev_driver_ioctl block/ioctl.c:303 [inline]
 blkdev_ioctl+0x1759/0x1e00 block/ioctl.c:601
 ioctl_by_bdev+0xa5/0x110 fs/block_dev.c:2060
 isofs_get_last_session fs/isofs/inode.c:567 [inline]
 isofs_fill_super+0x2ba9/0x3bc0 fs/isofs/inode.c:660
 mount_bdev+0x2b7/0x370 fs/super.c:1119
 isofs_mount+0x34/0x40 fs/isofs/inode.c:1560
 mount_fs+0x66/0x2d0 fs/super.c:1222
 vfs_kern_mount.part.26+0xc6/0x4a0 fs/namespace.c:1037
 vfs_kern_mount fs/namespace.c:2514 [inline]
 do_new_mount fs/namespace.c:2517 [inline]
 do_mount+0xea4/0x2b90 fs/namespace.c:2847
 ksys_mount+0xab/0x120 fs/namespace.c:3063
 SYSC_mount fs/namespace.c:3077 [inline]
 SyS_mount+0x39/0x50 fs/namespace.c:3074
 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x42/0xb7
(...snipped...)
Showing all locks held in the system:
(...snipped...)
2 locks held by syz-executor0/6540:
 #0: 00000000566d4c39 (&type->s_umount_key#49/1){+.+.}, at: alloc_super fs/super.c:211 [inline]
 #0: 00000000566d4c39 (&type->s_umount_key#49/1){+.+.}, at: sget_userns+0x3b2/0xe60 fs/super.c:502 /* down_write_nested(&s->s_umount, SINGLE_DEPTH_NESTING); */
 #1: 0000000043ca8836 (&lo->lo_ctl_mutex/1){+.+.}, at: lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 /* mutex_lock_nested(&lo->lo_ctl_mutex, 1); */
(...snipped...)
3 locks held by syz-executor7/6541:
 #0: 0000000043ca8836 (&lo->lo_ctl_mutex/1){+.+.}, at: lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 /* mutex_lock_nested(&lo->lo_ctl_mutex, 1); */
 #1: 000000007bf3d3f9 (&bdev->bd_mutex){+.+.}, at: blkdev_reread_part+0x1e/0x40 block/ioctl.c:192
 #2: 00000000566d4c39 (&type->s_umount_key#50){.+.+}, at: __get_super.part.10+0x1d3/0x280 fs/super.c:663 /* down_read(&sb->s_umount); */
----------------------------------------

When reporting an AB-BA deadlock like the one shown above, it would be
nice if the trace of PID=6541 were printed as well as the trace of
PID=6540 before calling panic().

Showing hung tasks up to /proc/sys/kernel/hung_task_warnings could delay
calling panic() but normally there should not be so many hung tasks.

Link: http://lkml.kernel.org/r/201804050705.BHE57833.HVFOFtSOMQJFOL@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoinclude/linux/types.h: use fixed width types without double-underscore prefix
Masahiro Yamada [Fri, 8 Jun 2018 00:10:30 +0000 (17:10 -0700)]
include/linux/types.h: use fixed width types without double-underscore prefix

This header file is not exported.  It is safe to reference types without
double-underscore prefix.

Link: http://lkml.kernel.org/r/1526350925-14922-3-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Lihao Liang <lianglihao@huawei.com>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoinclude/linux/types.h: define aligned_ types based on uapi header
Masahiro Yamada [Fri, 8 Jun 2018 00:10:27 +0000 (17:10 -0700)]
include/linux/types.h: define aligned_ types based on uapi header

<uapi/linux/types.h> has the same typedefs except that it prefixes them
with double-underscore for user space.  Use them for the kernel space
typedefs.

Link: http://lkml.kernel.org/r/1526350925-14922-2-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Lihao Liang <lianglihao@huawei.com>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoint-ll64.h: define u{8,16,32,64} and s{8,16,32,64} based on uapi header
Masahiro Yamada [Fri, 8 Jun 2018 00:10:24 +0000 (17:10 -0700)]
int-ll64.h: define u{8,16,32,64} and s{8,16,32,64} based on uapi header

<uapi/asm-generic/int-ll64.h> has the same typedefs except that it
prefixes them with double-underscore for user space.  Use them for
the kernel space typedefs.

Link: http://lkml.kernel.org/r/1526350925-14922-1-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Lihao Liang <lianglihao@huawei.com>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agotools/testing/selftests/proc: test /proc/*/fd a bit (+ PF_KTHREAD is ABI!)
Alexey Dobriyan [Fri, 8 Jun 2018 00:10:20 +0000 (17:10 -0700)]
tools/testing/selftests/proc: test /proc/*/fd a bit (+ PF_KTHREAD is ABI!)

* Test lookup in /proc/self/fd.
  "map_files" lookup story showed that lookup is not that simple.

* Test that all those symlinks open the same file.
  Check with (st_dev, st_ino).

* Test that kernel threads do not have anything in their /proc/*/fd/
  directory.

Now this is where things get interesting.

First, kernel threads aren't pinned by /proc/self or equivalent,
thus some "atomicity" is required.

Second, ->comm can contain whitespace and ')'.
No, they are not escaped.

Third, the only reliable way to check whether a process is a kernel thread
appears to be field #9 in /proc/*/stat.

This field is struct task_struct::flags in decimal!
The check is done by testing the PF_KTHREAD flag, as is done in the kernel.

The PF_KTHREAD value is part of the userspace ABI!!!

Other methods for determining kernel threadness are not reliable:
* RSS can be 0 if everything is swapped, even while reading
  from /proc/self.

* ->total_vm CAN BE ZERO if the process is finishing:

      munmap(NULL, whole address space);

* /proc/*/maps and similar files can be empty because unmapping
  everything works. Read returning 0 can't distinguish between
  kernel thread and such suicide process.
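
The parsing concerns above can be sketched as follows.  This is a
minimal illustration and not the selftest's code: the helper names are
hypothetical, and the PF_KTHREAD value is copied from the kernel's
include/linux/sched.h as an assumption about the running kernel.

```c
#include <stdio.h>
#include <string.h>

#define PF_KTHREAD 0x00200000	/* assumed value, see include/linux/sched.h */

/*
 * Extract field 9 (task flags, printed in decimal) from one line of
 * /proc/<pid>/stat.  Returns 0 on success, -1 on a malformed line.
 */
static int stat_line_flags(const char *line, unsigned long *flags)
{
	/* comm may contain whitespace and ')', so anchor on the LAST ')'. */
	const char *p = strrchr(line, ')');

	if (!p)
		return -1;
	/* After comm: state ppid pgrp session tty_nr tpgid flags ... */
	if (sscanf(p + 1, " %*c %*d %*d %*d %*d %*d %lu", flags) != 1)
		return -1;
	return 0;
}

/* Returns 1 for a kernel thread, 0 for a user process, -1 on error. */
static int stat_line_is_kthread(const char *line)
{
	unsigned long flags;

	if (stat_line_flags(line, &flags) != 0)
		return -1;
	return (flags & PF_KTHREAD) != 0;
}
```

Searching from the end of the line works because every field after
comm is a plain number or a single state character, so the last ')' is
always the one that closes comm.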

Link: http://lkml.kernel.org/r/20180505000414.GA15090@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoproc: use "unsigned int" for /proc/*/stack
Alexey Dobriyan [Fri, 8 Jun 2018 00:10:17 +0000 (17:10 -0700)]
proc: use "unsigned int" for /proc/*/stack

struct stack_trace::nr_entries is defined as "unsigned int" (YAY!) so
the iterator should be unsigned as well.

It saves 1 byte of code or something like that.

Link: http://lkml.kernel.org/r/20180423215248.GG9043@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoproc: use "unsigned int" for sigqueue length
Alexey Dobriyan [Fri, 8 Jun 2018 00:10:13 +0000 (17:10 -0700)]
proc: use "unsigned int" for sigqueue length

It's defined as atomic_t and really long signal queues are unheard of.

Link: http://lkml.kernel.org/r/20180423215119.GF9043@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoproc: use "unsigned int" in proc_fill_cache()
Alexey Dobriyan [Fri, 8 Jun 2018 00:10:10 +0000 (17:10 -0700)]
proc: use "unsigned int" in proc_fill_cache()

All those lengths are unsigned as they should be.

Link: http://lkml.kernel.org/r/20180423213751.GC9043@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoproc: smaller RCU section in ->getattr()
Alexey Dobriyan [Fri, 8 Jun 2018 00:10:07 +0000 (17:10 -0700)]
proc: smaller RCU section in ->getattr()

struct kstat is thread local.

Link: http://lkml.kernel.org/r/20180423213626.GB9043@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoproc: deduplicate /proc/*/cmdline implementation
Alexey Dobriyan [Fri, 8 Jun 2018 00:10:02 +0000 (17:10 -0700)]
proc: deduplicate /proc/*/cmdline implementation

Code can be consolidated if a dummy region of 0 length is used in the
normal case of a \0-separated command line:

1) [arg_start, arg_end) + [dummy len=0]
2) [arg_start, arg_end) + [env_start, env_end)
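
The two cases above can share one loop, roughly as sketched here.  This
is a user-space illustration under stated assumptions: struct region
and read_regions() are hypothetical names, and plain memcpy() stands in
for the kernel's access to another process's memory.

```c
#include <stddef.h>
#include <string.h>

struct region {
	const char *start;
	size_t len;		/* 0 marks the dummy region */
};

/*
 * Copy up to count bytes, starting at logical offset pos, from two
 * back-to-back regions into buf.  Returns the number of bytes copied.
 */
static size_t read_regions(const struct region r[2], size_t pos,
			   char *buf, size_t count)
{
	size_t copied = 0;

	for (int i = 0; i < 2 && count > 0; i++) {
		size_t n;

		if (pos >= r[i].len) {
			/* Offset lies beyond this region: skip it. */
			pos -= r[i].len;
			continue;
		}
		n = r[i].len - pos;
		if (n > count)
			n = count;
		memcpy(buf + copied, r[i].start + pos, n);
		copied += n;
		count -= n;
		pos = 0;
	}
	return copied;
}
```

With a zero-length second region the loop skips it without touching
its start pointer, so the normal case needs no separate code path.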

Link: http://lkml.kernel.org/r/20180221193335.GB28678@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoproc: simpler iterations for /proc/*/cmdline
Alexey Dobriyan [Fri, 8 Jun 2018 00:09:59 +0000 (17:09 -0700)]
proc: simpler iterations for /proc/*/cmdline

"rv" variable is used both as a counter of bytes transferred and an
error value holder, but it can be reduced solely to error values if the
original start of the userspace buffer is stashed and used at the very end.

[akpm@linux-foundation.org: simplify cleanup code]
Link: http://lkml.kernel.org/r/20180221193009.GA28678@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoproc: somewhat simpler code for /proc/*/cmdline
Alexey Dobriyan [Fri, 8 Jun 2018 00:09:55 +0000 (17:09 -0700)]
proc: somewhat simpler code for /proc/*/cmdline

"final" variable is OK but we can get away with less lines.

Link: http://lkml.kernel.org/r/20180221192751.GC28548@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoproc: more "unsigned int" in /proc/*/cmdline
Alexey Dobriyan [Fri, 8 Jun 2018 00:09:52 +0000 (17:09 -0700)]
proc: more "unsigned int" in /proc/*/cmdline

access_remote_vm() doesn't return negative errors; it returns the number
of bytes read/written (0 if an error occurs).  This allows deleting some
comparisons which never trigger.

Reuse "nr_read" variable while I'm at it.

Link: http://lkml.kernel.org/r/20180221192605.GB28548@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm: remove page_is_poisoned() from linux/mm.h
Sahara [Fri, 8 Jun 2018 00:09:48 +0000 (17:09 -0700)]
mm: remove page_is_poisoned() from linux/mm.h

When commit bd33ef368135 ("mm: enable page poisoning early at boot") got
rid of PAGE_EXT_DEBUG_POISON, page_is_poisoned() in the header was left
behind.  This patch cleans up the leftover.

Link: http://lkml.kernel.org/r/1528101069-21637-1-git-send-email-kpark3469@gmail.com
Signed-off-by: Sahara <keun-o.park@darkmatter.ae>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomem_cgroup: make sure moving_account, move_lock_task and stat_cpu in the same cacheline
Aaron Lu [Fri, 8 Jun 2018 00:09:44 +0000 (17:09 -0700)]
mem_cgroup: make sure moving_account, move_lock_task and stat_cpu in the same cacheline

The LKP robot found a 27% will-it-scale/page_fault3 performance
regression caused by commit e27be240df53 ("mm: memcg: make sure
memory.events is uptodate when waking pollers").

What the test does is:
 1 mkstemp() a 128M file on a tmpfs;
 2 start $nr_cpu processes, each to loop the following:
   2.1 mmap() this file in shared write mode;
   2.2 write 0 to this file in a PAGE_SIZE step till the end of the file;
   2.3 munmap() this file and repeat this process.
 3 After 5 minutes, check how many loops they managed to complete, the
   higher the better.

The commit itself looks innocent enough, as it merely changed some event
counting mechanism and this test didn't trigger those events at all.
Perf shows increased cycles spent on accessing root_mem_cgroup->stat_cpu
in count_memcg_event_mm() (called by handle_mm_fault()) and in
__mod_memcg_state() (called by page_add_file_rmap()).  So it's likely due
to the changed layout of 'struct mem_cgroup', which either makes stat_cpu
fall into a constantly modified cacheline or stops some hot fields from
sharing a cacheline.

I verified this by moving memory_events[] back to where it was:

: --- a/include/linux/memcontrol.h
: +++ b/include/linux/memcontrol.h
: @@ -205,7 +205,6 @@ struct mem_cgroup {
:   int oom_kill_disable;
:
:   /* memory.events */
: - atomic_long_t memory_events[MEMCG_NR_MEMORY_EVENTS];
:   struct cgroup_file events_file;
:
:   /* protect arrays of thresholds */
: @@ -238,6 +237,7 @@ struct mem_cgroup {
:   struct mem_cgroup_stat_cpu __percpu *stat_cpu;
:   atomic_long_t stat[MEMCG_NR_STAT];
:   atomic_long_t events[NR_VM_EVENT_ITEMS];
: + atomic_long_t memory_events[MEMCG_NR_MEMORY_EVENTS];
:
:   unsigned long socket_pressure;

And performance restored.

Later investigation found that as long as the following 3 fields
moving_account, move_lock_task and stat_cpu are in the same cacheline,
performance will be good.  To avoid future performance surprise by other
commits changing the layout of 'struct mem_cgroup', this patch makes
sure the 3 fields stay in the same cacheline.
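
One way to pin such a grouping down is sketched below.  This is a
hedged user-space illustration, not the patch: struct memcg_hot is a
hypothetical stand-in, a 64-byte line size is assumed, and the real
change may rely on the kernel's own padding or alignment helpers.

```c
#include <assert.h>
#include <stddef.h>

#define CACHELINE 64	/* assumed line size for this sketch */

/* Hypothetical stand-in for the relevant part of struct mem_cgroup. */
struct memcg_hot {
	char other_fields[100];		/* arbitrary preceding members */

	/* Start a fresh cacheline so the three fields below share it. */
	_Alignas(CACHELINE) int moving_account;
	void *move_lock_task;
	void *stat_cpu;

	char more_fields[40];		/* arbitrary following members */
};

/* Index of the cacheline that holds a given member. */
#define LINE_OF(type, member) (offsetof(type, member) / CACHELINE)

/* Fail the build if a later layout change splits the group. */
static_assert(LINE_OF(struct memcg_hot, moving_account) ==
	      LINE_OF(struct memcg_hot, stat_cpu),
	      "hot fields must share one cacheline");
```

A build-time check of this kind turns a silent layout regression into
a compile error, which is the sort of surprise the patch tries to
avoid.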

One concern with this approach is that moving_account and move_lock_task
can be modified when a process changes memory cgroup, while stat_cpu is
an always-read field, so it might hurt to place them in the same
cacheline.  I assume it is rare for a process to change memory cgroup, so
this should be OK.

Link: https://lkml.kernel.org/r/20180528114019.GF9904@yexl-desktop
Link: http://lkml.kernel.org/r/20180601071115.GA27302@intel.com
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Reported-by: kernel test robot <xiaolong.ye@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm: kvmalloc does not fallback to vmalloc for incompatible gfp flags
Michal Hocko [Fri, 8 Jun 2018 00:09:40 +0000 (17:09 -0700)]
mm: kvmalloc does not fallback to vmalloc for incompatible gfp flags

kvmalloc warned about incompatible gfp_mask to catch abusers (mostly
GFP_NOFS) with an intention that this will motivate authors of the code
to fix those.  Linus argues that this just motivates people to do even
more hacks like

	if (gfp == GFP_KERNEL)
		kvmalloc
	else
		kmalloc

I haven't seen this happening much (Linus pointed to bucket_lock, which
special-cases an atomic allocation, but my git-fu hasn't found much more),
but it is true that we can grow those in the future.  Therefore Linus
suggested to simply not fall back to vmalloc for incompatible gfp flags
and rather stick with the kmalloc path.
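
The resulting decision can be sketched as a single mask test.  This is
an illustration under stated assumptions: the X_GFP_* bit values are
made up for the example (only their composition mirrors GFP_KERNEL and
GFP_NOFS), and may_fall_back_to_vmalloc() is a hypothetical helper.

```c
#include <stdbool.h>

/* Illustrative flag bits only; the kernel's real values differ. */
#define X_GFP_RECLAIM	0x1u
#define X_GFP_IO	0x2u
#define X_GFP_FS	0x4u
#define X_GFP_KERNEL	(X_GFP_RECLAIM | X_GFP_IO | X_GFP_FS)
#define X_GFP_NOFS	(X_GFP_RECLAIM | X_GFP_IO)

/*
 * The vmalloc fallback is only considered when every GFP_KERNEL bit
 * is present; anything weaker stays on the kmalloc path.
 */
static bool may_fall_back_to_vmalloc(unsigned int flags)
{
	return (flags & X_GFP_KERNEL) == X_GFP_KERNEL;
}
```

Callers then use one allocation entry point for every context, and the
if/else hack shown above becomes unnecessary.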

Link: http://lkml.kernel.org/r/20180601115329.27807-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tom Herbert <tom@quantonium.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoinclude/linux/gfp.h: fix the annotation of GFP_ZONE_TABLE
Huaisheng Ye [Fri, 8 Jun 2018 00:09:36 +0000 (17:09 -0700)]
include/linux/gfp.h: fix the annotation of GFP_ZONE_TABLE

When the bit pattern is equal to 0x4, OPT_ZONE_DMA32 should be obtained
from GFP_ZONE_TABLE.  OPT_ZONE_DMA32 is equal to ZONE_DMA32 or
ZONE_NORMAL, depending on whether CONFIG_ZONE_DMA32 is set.

Similarly, when the bit pattern is equal to 0xc, OPT_ZONE_DMA32 should be
obtained with the allocation policy GFP_MOVABLE.  So ZONE_DMA32 or
ZONE_NORMAL is the possible result value.

Link: http://lkml.kernel.org/r/20180601163403.1032-1-yehs2007@zoho.com
Signed-off-by: Huaisheng Ye <yehs1@lenovo.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: "Levin, Alexander (Sasha Levin)" <alexander.levin@verizon.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/shmem.c: zero out unused vma fields in shmem_pseudo_vma_init()
Kirill A. Shutemov [Fri, 8 Jun 2018 00:09:32 +0000 (17:09 -0700)]
mm/shmem.c: zero out unused vma fields in shmem_pseudo_vma_init()

shmem/tmpfs uses a pseudo vma to allocate pages with the correct NUMA policy.

The pseudo vma doesn't have vm_page_prot set.  We are going to encode
encryption KeyID in vm_page_prot.  Having garbage there causes problems.

Zero out all unused fields in the pseudo vma.
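
The idea is sketched below in user space.  The struct is a hypothetical,
heavily trimmed stand-in for struct vm_area_struct, and
pseudo_vma_init() is an illustrative name rather than the kernel
function.

```c
#include <string.h>

/* Hypothetical, trimmed-down stand-in for struct vm_area_struct. */
struct pseudo_vma {
	unsigned long vm_pgoff;
	unsigned long vm_page_prot;	/* must not be left as stack garbage */
	void *vm_policy;
};

static void pseudo_vma_init(struct pseudo_vma *vma, unsigned long pgoff,
			    void *policy)
{
	/* Clear everything first, then set only the fields that are used. */
	memset(vma, 0, sizeof(*vma));
	vma->vm_pgoff = pgoff;
	vma->vm_policy = policy;
}
```

Zeroing the whole structure up front means any field consulted later,
such as vm_page_prot, reads as zero even if this function never learns
about it.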

Link: http://lkml.kernel.org/r/20180531135602.20321-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm, page_alloc: do not break __GFP_THISNODE by zonelist reset
Vlastimil Babka [Fri, 8 Jun 2018 00:09:29 +0000 (17:09 -0700)]
mm, page_alloc: do not break __GFP_THISNODE by zonelist reset

In __alloc_pages_slowpath() we reset zonelist and preferred_zoneref for
allocations that can ignore memory policies.  The zonelist is obtained
from current CPU's node.  This is a problem for __GFP_THISNODE
allocations that want to allocate on a different node, e.g.  because the
allocating thread has been migrated to a different CPU.

This has been observed to break SLAB in our 4.4-based kernel, because
there it relies on __GFP_THISNODE working as intended.  If a slab page
is put on wrong node's list, then further list manipulations may corrupt
the list because page_to_nid() is used to determine which node's
list_lock should be locked and thus we may take a wrong lock and race.

Current SLAB implementation seems to be immune by luck thanks to commit
511e3a058812 ("mm/slab: make cache_grow() handle the page allocated on
arbitrary node") but there may be others assuming that __GFP_THISNODE
works as promised.

We can fix it by simply removing the zonelist reset completely.  There
is actually no reason to reset it, because memory policies and cpusets
don't affect the zonelist choice in the first place.  This was different
when commit 183f6371aac2 ("mm: ignore mempolicies when using
ALLOC_NO_WATERMARK") introduced the code, as mempolicies provided their
own restricted zonelists.

We might consider this for 4.17 although I don't know if there's
anything currently broken.

SLAB is currently not affected, but in kernels older than 4.7 that don't
yet have 511e3a058812 ("mm/slab: make cache_grow() handle the page
allocated on arbitrary node") it is.  That's at least 4.4 LTS.  Older
ones I'll have to check.

So stable backports should be more important, but will have to be
reviewed carefully, as the code went through many changes.  BTW I think
that also the ac->preferred_zoneref reset is currently useless if we
don't also reset ac->nodemask from a mempolicy to NULL first (which we
probably should for the OOM victims etc?), but I would leave that for a
separate patch.

Link: http://lkml.kernel.org/r/20180525130853.13915-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Fixes: 183f6371aac2 ("mm: ignore mempolicies when using ALLOC_NO_WATERMARK")
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agouserfaultfd: prevent non-cooperative events vs mcopy_atomic races
Mike Rapoport [Fri, 8 Jun 2018 00:09:25 +0000 (17:09 -0700)]
userfaultfd: prevent non-cooperative events vs mcopy_atomic races

If a process monitored with userfaultfd changes its memory mappings or
forks() at the same time as the uffd monitor fills the process memory with
UFFDIO_COPY, the actual creation of page table entries and copying of
the data in mcopy_atomic may happen either before or after the memory
mapping modifications, and there is no way for the uffd monitor to
maintain a consistent view of the process memory layout.

For instance, let's consider fork() running in parallel with
userfaultfd_copy():

process                          | uffd monitor
---------------------------------+------------------------------
fork()                           | userfaultfd_copy()
...                              | ...
    dup_mmap()                   |     down_read(mmap_sem)
    down_write(mmap_sem)         |     /* create PTEs, copy data */
        dup_uffd()               |     up_read(mmap_sem)
        copy_page_range()        |
        up_write(mmap_sem)       |
        dup_uffd_complete()      |
            /* notify monitor */ |

If the userfaultfd_copy() takes the mmap_sem first, the new page(s) will
be present by the time copy_page_range() is called and they will appear
in the child's memory mappings.  However, if the fork() is the first to
take the mmap_sem, the new pages won't be mapped in the child's address
space.

If the pages are not present and child tries to access them, the monitor
will get page fault notification and everything is fine.  However, if
the pages *are present*, the child can access them without uffd
noticing.  And if we copy them into child it'll see the wrong data.
Since we are talking about background copy, we'd need to decide whether
the pages should be copied or not regardless of #PF notifications.

Since the userfaultfd monitor has no way to determine what the order was,
let's disallow userfaultfd_copy in parallel with the non-cooperative
events.  In such a case we return -EAGAIN and the uffd monitor can
understand that userfaultfd_copy() clashed with a non-cooperative event
and take an appropriate action.
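
On the monitor side, one possible reaction to -EAGAIN is sketched here.
Everything in it is hypothetical: try_copy() stands in for issuing
UFFDIO_COPY, handle_events() stands in for reading and processing the
pending non-cooperative events, and the demo stubs only simulate two
clashes followed by success.

```c
#include <errno.h>

/* Hypothetical callbacks supplied by the monitor. */
struct copy_ops {
	int (*try_copy)(void *ctx);		/* 0 or a negative errno */
	void (*handle_events)(void *ctx);	/* catch up on fork/remap/... */
};

/* Retry the copy after each -EAGAIN, making at most max_tries attempts. */
static int copy_with_retry(const struct copy_ops *ops, void *ctx,
			   int max_tries)
{
	int ret = -EAGAIN;

	for (int i = 0; i < max_tries && ret == -EAGAIN; i++) {
		ret = ops->try_copy(ctx);
		if (ret == -EAGAIN)
			ops->handle_events(ctx);
	}
	return ret;
}

/* Demo stubs: clash with an event twice, then succeed. */
struct demo_state {
	int attempts;
	int events_handled;
};

static int demo_try_copy(void *ctx)
{
	struct demo_state *s = ctx;

	return ++s->attempts <= 2 ? -EAGAIN : 0;
}

static void demo_handle_events(void *ctx)
{
	struct demo_state *s = ctx;

	s->events_handled++;
}
```

Processing the events before retrying lets the monitor update its view
of the layout first, so the retried copy targets mappings it actually
knows about.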

Link: http://lkml.kernel.org/r/1527061324-19949-1-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm: memcg: allow lowering memory.swap.max below the current usage
Tejun Heo [Fri, 8 Jun 2018 00:09:21 +0000 (17:09 -0700)]
mm: memcg: allow lowering memory.swap.max below the current usage

Currently an attempt to set swap.max to a value lower than the actual
swap usage fails, which causes configuration problems as there's no way
of lowering the configuration below the current usage short of turning
off swap entirely.  This makes swap.max difficult to use and allows
delegatees to lock the delegator out of reducing swap allocation.

This patch updates swap_max_write() so that the limit can be lowered
below the current usage.  It doesn't implement active reclaiming of swap
entries for the following reasons.

* mem_cgroup_swap_full() already tells the swap machinery to
  aggressively reclaim swap entries if the usage is above 50% of
  limit, so simply lowering the limit automatically triggers gradual
  reclaim.

* Forcing back swapped out pages is likely to heavily impact the
  workload and mess up the working set.  Given that swap usually is a
  lot less valuable and less scarce, letting the existing usage
  dissipate over time through the above gradual reclaim and as they're
  faulted back in is likely the better behavior.
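
The first point can be illustrated with a small sketch.  The helper
name and the exact comparison are assumptions based on the description
above (usage at or beyond half of the limit counts as full); the kernel
code may differ in detail.

```c
#include <stdbool.h>

/* Treat swap as "full" once usage reaches half of the limit. */
static bool swap_looks_full(unsigned long usage_pages,
			    unsigned long max_pages)
{
	if (usage_pages >= max_pages)
		return true;
	/* Same as usage * 2 >= max, written so it cannot overflow. */
	return usage_pages >= max_pages - usage_pages;
}
```

With a usage of 100 pages, a limit of 1000 does not count as full,
while a limit lowered to 150 or to 50 does, so writing a smaller limit
is enough to switch on the more aggressive reclaim.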

Link: http://lkml.kernel.org/r/20180523185041.GR1718769@devbig577.frc2.facebook.com
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Rik van Riel <riel@surriel.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Shaohua Li <shli@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/shmem.c: use new return type vm_fault_t
Souptick Joarder [Fri, 8 Jun 2018 00:09:17 +0000 (17:09 -0700)]
mm/shmem.c: use new return type vm_fault_t

Use new return type vm_fault_t for fault handler.  For now, this is just
documenting that the function returns a VM_FAULT value rather than an
errno.  Once all instances are converted, vm_fault_t will become a
distinct type.

See commit 1c8f422059ae ("mm: change return type to vm_fault_t")

vmf_error() is the newly introduced inline function in 4.17-rc6.
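
For readers unfamiliar with the helper, a user-space sketch of the
errno-to-fault-code mapping follows.  The typedef and the VM_FAULT_*
values are assumptions modelled on the kernel headers of that period,
and vmf_error_sketch() is an illustrative name, not the kernel symbol.

```c
#include <errno.h>

typedef int vm_fault_t;	/* for now only an annotated int */

/* Assumed values, modelled on include/linux/mm.h. */
#define VM_FAULT_OOM	0x0001
#define VM_FAULT_SIGBUS	0x0002

/* Translate a negative errno into a VM_FAULT code. */
static vm_fault_t vmf_error_sketch(int err)
{
	if (err == -ENOMEM)
		return VM_FAULT_OOM;
	return VM_FAULT_SIGBUS;
}
```

A fault handler that obtains an errno from a helper can then return
vmf_error_sketch(err) instead of leaking the errno through a
vm_fault_t return value.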

Link: http://lkml.kernel.org/r/20180521202410.GA17912@jordon-HP-15-Notebook-PC
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoslub: remove 'reserved' file from sysfs
Matthew Wilcox [Fri, 8 Jun 2018 00:09:14 +0000 (17:09 -0700)]
slub: remove 'reserved' file from sysfs

Christoph doubts anyone was using the 'reserved' file in sysfs, so remove
it.

Link: http://lkml.kernel.org/r/20180518194519.3820-17-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoslub: remove kmem_cache->reserved
Matthew Wilcox [Fri, 8 Jun 2018 00:09:10 +0000 (17:09 -0700)]
slub: remove kmem_cache->reserved

The reserved field was only used for embedding an rcu_head in the data
structure.  With the previous commit, we no longer need it.  That lets us
remove the 'reserved' argument to a lot of functions.

Link: http://lkml.kernel.org/r/20180518194519.3820-16-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoslab,slub: remove rcu_head size checks
Matthew Wilcox [Fri, 8 Jun 2018 00:09:05 +0000 (17:09 -0700)]
slab,slub: remove rcu_head size checks

rcu_head may now grow larger than list_head without affecting slab or
slub.

Link: http://lkml.kernel.org/r/20180518194519.3820-15-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm: add hmm_data to struct page
Matthew Wilcox [Fri, 8 Jun 2018 00:09:01 +0000 (17:09 -0700)]
mm: add hmm_data to struct page

Make hmm_data an explicit member of the struct page union.

Link: http://lkml.kernel.org/r/20180518194519.3820-14-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm: add pt_mm to struct page
Matthew Wilcox [Fri, 8 Jun 2018 00:08:57 +0000 (17:08 -0700)]
mm: add pt_mm to struct page

For pgd page table pages, x86 overloads the page->index field to store a
pointer to the mm_struct.  Rename this to pt_mm so it's visible to other
users.

Link: http://lkml.kernel.org/r/20180518194519.3820-13-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm: improve struct page documentation
Matthew Wilcox [Fri, 8 Jun 2018 00:08:53 +0000 (17:08 -0700)]
mm: improve struct page documentation

Rewrite the documentation to describe what you can use in struct page
rather than what you can't.

Link: http://lkml.kernel.org/r/20180518194519.3820-12-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm: combine LRU and main union in struct page
Matthew Wilcox [Fri, 8 Jun 2018 00:08:50 +0000 (17:08 -0700)]
mm: combine LRU and main union in struct page

This gives us five words of space in a single union in struct page.  The
compound_mapcount moves position (from offset 24 to offset 20) on 64-bit
systems, but that does not seem likely to cause any trouble.

Link: http://lkml.kernel.org/r/20180518194519.3820-11-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm: move lru union within struct page
Matthew Wilcox [Fri, 8 Jun 2018 00:08:46 +0000 (17:08 -0700)]
mm: move lru union within struct page

Since the LRU is two words, this does not affect the double-word alignment
of SLUB's freelist.

Link: http://lkml.kernel.org/r/20180518194519.3820-10-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm: use page->deferred_list
Matthew Wilcox [Fri, 8 Jun 2018 00:08:42 +0000 (17:08 -0700)]
mm: use page->deferred_list

Now that we can represent the location of 'deferred_list' in C instead of
comments, make use of that ability.

Link: http://lkml.kernel.org/r/20180518194519.3820-9-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm: combine first three unions in struct page
Matthew Wilcox [Fri, 8 Jun 2018 00:08:39 +0000 (17:08 -0700)]
mm: combine first three unions in struct page

By combining these three one-word unions into one three-word union, we
make it easier for users to add their own multi-word fields to struct
page, as well as making it obvious that SLUB needs to keep its double-word
alignment for its freelist & counters.
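
The alignment requirement can be made explicit with a build-time check,
as in this sketch.  struct mini_page is a hypothetical miniature with
made-up member grouping; it is not the kernel's struct page and only
shows how one multi-word union keeps the constraint visible.

```c
#include <assert.h>
#include <stddef.h>

#define DWORD (2 * sizeof(unsigned long))

/* Hypothetical miniature of the idea; not the kernel's struct page. */
struct mini_page {
	_Alignas(2 * sizeof(unsigned long)) unsigned long flags;
	union {
		struct {		/* page cache style user */
			void *mapping;
			unsigned long index;
			unsigned long private_data;
		} cache;
		struct {		/* SLUB style user */
			void *slab_cache;
			void *freelist;		/* must start a double word */
			unsigned long counters;
		} slub;
	};
};

/* freelist and counters must form one aligned double word. */
static_assert(offsetof(struct mini_page, slub.freelist) % DWORD == 0,
	      "freelist must be double-word aligned");
static_assert(offsetof(struct mini_page, slub.counters) ==
	      offsetof(struct mini_page, slub.freelist) + sizeof(void *),
	      "counters must directly follow freelist");
```

Because each user is a struct inside a single union, adding a
multi-word field for a new user cannot silently shift freelist, and the
assertions fail the build if the layout ever changes.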

No field moves position; verified with pahole.

Link: http://lkml.kernel.org/r/20180518194519.3820-8-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm: move _refcount out of struct page union
Matthew Wilcox [Fri, 8 Jun 2018 00:08:35 +0000 (17:08 -0700)]
mm: move _refcount out of struct page union

Keeping the refcount in the union only encourages people to put something
else in the union which will overlap with _refcount and eventually explode
messily.  pahole reports no fields change location.
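
The hazard can be sketched with two hypothetical layouts (the struct and
function names below are illustrative assumptions, not kernel code). In the
first, a wide field added to the union silently covers the refcount; in the
second, the refcount sits after the union at the same offset, so no union
member can reach it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" layout: the refcount lives inside the union. */
struct demo_refcount_inside {
	union {
		uint64_t counters;	/* wide field added by another user */
		struct {
			int32_t mapcount;
			int32_t refcount;
		};
	};
};

/* Hypothetical "after" layout: the refcount follows the union. */
struct demo_refcount_outside {
	union {
		uint32_t counters;
		int32_t mapcount;
	};
	int32_t refcount;
};

/* Moving the field out of the union does not change where it lives. */
static_assert(offsetof(struct demo_refcount_inside, refcount) ==
	      offsetof(struct demo_refcount_outside, refcount),
	      "refcount keeps its offset");

/* Writing the overlapping field wipes out the refcount. */
int32_t demo_refcount_after_inside_write(void)
{
	struct demo_refcount_inside p = { .counters = 0 };

	p.refcount = 1;
	p.counters = 0;		/* covers mapcount and refcount */
	return p.refcount;
}

/* With the refcount outside the union, the same write leaves it alone. */
int32_t demo_refcount_after_outside_write(void)
{
	struct demo_refcount_outside p = { .counters = 0 };

	p.refcount = 1;
	p.counters = 0;		/* covers mapcount only */
	return p.refcount;
}
```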

Link: http://lkml.kernel.org/r/20180518194519.3820-7-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years ago  mm: move 'private' union within struct page
Matthew Wilcox [Fri, 8 Jun 2018 00:08:31 +0000 (17:08 -0700)]
mm: move 'private' union within struct page

By moving page->private to the fourth word of struct page, we can put the
SLUB counters in the same word as SLAB's s_mem and still do the
cmpxchg_double trick.  Now the SLUB counters no longer overlap with the
mapcount or refcount so we can drop the call to page_mapcount_reset() and
simplify set_page_slub_counters() to a single line.
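
A minimal sketch of why the setter can shrink to one assignment, assuming a
simplified layout in which the counters word is shared only with s_mem and
is followed by separate mapcount and refcount fields. The struct
demo_slab_page and the function demo_set_counters are hypothetical names
for illustration and do not reproduce the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified layout; not the kernel's struct page. */
struct demo_slab_page {
	unsigned long flags;
	void *slab_cache;
	void *freelist;
	union {
		void *s_mem;		/* SLAB-style user of this word */
		unsigned long counters;	/* SLUB-style user of this word */
	};
	int mapcount;
	int refcount;
};

/* counters shares its word with s_mem ... */
static_assert(offsetof(struct demo_slab_page, counters) ==
	      offsetof(struct demo_slab_page, s_mem),
	      "counters and s_mem share one word");

/* ... and ends before mapcount begins, so the two cannot overlap. */
static_assert(offsetof(struct demo_slab_page, mapcount) >=
	      offsetof(struct demo_slab_page, counters) + sizeof(unsigned long),
	      "counters must not reach mapcount");

/* With no overlap, a plain store is enough to update the counters. */
void demo_set_counters(struct demo_slab_page *page, unsigned long counters_new)
{
	page->counters = counters_new;
}
```

In this sketch a caller can update the counters without saving and
restoring mapcount or refcount, since neither shares storage with them.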

Link: http://lkml.kernel.org/r/20180518194519.3820-6-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years ago  mm: switch s_mem and slab_cache in struct page
Matthew Wilcox [Fri, 8 Jun 2018 00:08:26 +0000 (17:08 -0700)]
mm: switch s_mem and slab_cache in struct page

This will allow us to store slub's counters in the same bits as slab's
s_mem.  slub now needs to set page->mapping to NULL as it frees the page,
just like slab does.
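
The need to reset page->mapping can be pictured with the sketch below. It
assumes, for illustration only, that the allocator's cache pointer ends up
sharing a word with mapping; the types demo_cache and demo_mapped_page and
both functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache descriptor, standing in for the allocator's cache. */
struct demo_cache {
	const char *name;
};

/* Hypothetical page whose mapping word doubles as the cache pointer. */
struct demo_mapped_page {
	union {
		void *mapping;
		struct demo_cache *slab_cache;
	};
};

static_assert(offsetof(struct demo_mapped_page, mapping) ==
	      offsetof(struct demo_mapped_page, slab_cache),
	      "mapping and slab_cache share one word");

/* Allocator takes the page: the shared word now holds the cache pointer. */
void demo_slab_attach(struct demo_mapped_page *page, struct demo_cache *cache)
{
	page->slab_cache = cache;
}

/*
 * Allocator releases the page: clear the shared word so the next owner
 * does not see a stale, non-NULL mapping.
 */
void demo_slab_free(struct demo_mapped_page *page)
{
	page->mapping = NULL;
}
```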

Link: http://lkml.kernel.org/r/20180518194519.3820-5-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years ago  mm: mark pages in use for page tables
Matthew Wilcox [Fri, 8 Jun 2018 00:08:23 +0000 (17:08 -0700)]
mm: mark pages in use for page tables

Define a new PageTable bit in the page_type and use it to mark pages in
use as page tables.  This can be helpful when debugging crashdumps or
analysing memory fragmentation.  Add a KPF flag to report these pages to
userspace and update page-types.c to interpret that flag.

Note that only pages currently accounted as NR_PAGETABLES are tracked as
PageTable; this does not include pgd/p4d/pud/pmd pages.  Those will be the
subject of a later patch.
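
One way such a type bit could work is sketched below. The sketch assumes an
inverted encoding in which the type word starts out as all ones and a type
is marked by clearing its bit; the constant values and every demo_ name are
illustrative assumptions rather than the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values and names; assumptions, not kernel definitions. */
#define DEMO_PAGE_TYPE_BASE	0xf0000000u
#define DEMO_PG_TABLE		0x00000400u

/* The type bit must not collide with the base pattern. */
static_assert((DEMO_PAGE_TYPE_BASE & DEMO_PG_TABLE) == 0,
	      "type bit must lie outside the base mask");

struct demo_typed_page {
	uint32_t page_type;	/* all ones while no type is set */
};

/* A fresh page carries no type: every bit is set. */
static void demo_page_init(struct demo_typed_page *p)
{
	p->page_type = UINT32_MAX;
}

/* Typed as a page table when the base bits are set and the type bit is clear. */
static bool demo_page_is_table(const struct demo_typed_page *p)
{
	return (p->page_type & (DEMO_PAGE_TYPE_BASE | DEMO_PG_TABLE)) ==
	       DEMO_PAGE_TYPE_BASE;
}

/* Mark the page as in use for a page table by clearing its type bit. */
static void demo_set_page_table(struct demo_typed_page *p)
{
	p->page_type &= ~DEMO_PG_TABLE;
}

/* Remove the mark by setting the bit again. */
static void demo_clear_page_table(struct demo_typed_page *p)
{
	p->page_type |= DEMO_PG_TABLE;
}
```

A caller would set the mark when a page is handed out as a page table and
clear it before the page is freed, so that a debugger or a flag reporter
reading the type word can tell such pages apart.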

Link: http://lkml.kernel.org/r/20180518194519.3820-4-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>