sfrench/cifs-2.6.git
4 years ago audit: Abstract hash key handling
Jan Kara [Fri, 16 Dec 2016 09:13:37 +0000 (10:13 +0100)]
audit: Abstract hash key handling

Audit tree currently uses inode pointer as a key into the hash table.
Getting that from notification mark will be somewhat more difficult with
coming fsnotify changes. So abstract getting of hash key from the audit
chunk and inode so that we can change the method to obtain a key easily.
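As a rough userspace illustration of the abstraction described above, the
sketch below funnels every key computation through one helper. The names
inode_to_key(), chunk_to_key() and key_to_bucket(), the struct layouts and
HASH_SIZE are assumptions made for this example, not the kernel's
definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HASH_SIZE 128   /* illustrative bucket count */

/* Hypothetical stand-ins for the kernel structures. */
struct inode { int dummy; };
struct chunk { const struct inode *inode; };

/* The only place that decides how an inode becomes a hash key. */
static uintptr_t inode_to_key(const struct inode *inode)
{
    return (uintptr_t)inode;
}

/* A chunk derives its key through the same helper. */
static uintptr_t chunk_to_key(const struct chunk *chunk)
{
    assert(chunk != NULL);
    return inode_to_key(chunk->inode);
}

/* The hash table only ever sees the opaque key. */
static size_t key_to_bucket(uintptr_t key)
{
    return (size_t)((key / sizeof(void *)) % HASH_SIZE);
}
```

With this shape, switching to a different key source only requires
changing inode_to_key() and chunk_to_key(); the table code is untouched.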

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
CC: Paul Moore <paul@paul-moore.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Jan Kara <jack@suse.cz>
4 years ago fanotify: Move recalculation of inode / vfsmount mask under mark_mutex
Jan Kara [Wed, 14 Dec 2016 12:53:46 +0000 (13:53 +0100)]
fanotify: Move recalculation of inode / vfsmount mask under mark_mutex

Move recalculation of inode / vfsmount notification mask under
group->mark_mutex of the mark which was modified. These are the only
places where mask recalculation happens without mark being protected
from detaching from inode / vfsmount which will cause issues with the
following patches.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
4 years ago inotify: Remove inode pointers from debug messages
Jan Kara [Fri, 9 Dec 2016 08:38:55 +0000 (09:38 +0100)]
inotify: Remove inode pointers from debug messages

Printing inode pointers in warnings has dubious value and with future
changes we won't be able to easily get them without either locking or
risking an oops along the way. So just remove inode pointers from the
warning messages.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
4 years ago fsnotify: Remove unnecessary tests when showing fdinfo
Jan Kara [Fri, 9 Dec 2016 08:18:21 +0000 (09:18 +0100)]
fsnotify: Remove unnecessary tests when showing fdinfo

show_fdinfo() iterates group's list of marks. All marks found there are
guaranteed to be alive and they stay so until we release
group->mark_mutex. So remove unnecessary tests whether mark is alive.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
4 years ago Linux 4.11-rc2 v4.11-rc2
Linus Torvalds [Sun, 12 Mar 2017 21:47:08 +0000 (14:47 -0700)]
Linux 4.11-rc2

4 years ago Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Sun, 12 Mar 2017 21:22:25 +0000 (14:22 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux

Pull s390 fixes from Martin Schwidefsky:

 - four patches to get the new cputime code in shape for s390

 - add the new statx system call

 - a few bug fixes

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390: wire up statx system call
  KVM: s390: Fix guest migration for huge guests resulting in panic
  s390/ipl: always use load normal for CCW-type re-IPL
  s390/timex: micro optimization for tod_to_ns
  s390/cputime: provide archicture specific cputime_to_nsecs
  s390/cputime: reset all accounting fields on fork
  s390/cputime: remove last traces of cputime_t
  s390: fix in-kernel program checks
  s390/crypt: fix missing unlock in ctr_paes_crypt on error path

4 years ago Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 12 Mar 2017 21:18:49 +0000 (14:18 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:

 - a fix for the kexec/purgatory regression which was introduced in the
   merge window via an innocent sparse fix. We could have reverted that
   commit, but on deeper inspection it turned out that the whole
   machinery is neither documented nor robust. So a proper cleanup was
   done instead

 - the fix for the TLB flush issue which was discovered recently

 - a simple typo fix for a reboot quirk

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/tlb: Fix tlb flushing when lguest clears PGE
  kexec, x86/purgatory: Unbreak it and clean it up
  x86/reboot/quirks: Fix typo in ASUS EeeBook X205TA reboot quirk

4 years ago Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 12 Mar 2017 21:11:38 +0000 (14:11 -0700)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:

 - a workaround for a GIC erratum

 - a missing stub function for CONFIG_IRQDOMAIN=n

 - fixes for a couple of type inconsistencies

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/crossbar: Fix incorrect type of register size
  irqchip/gicv3-its: Add workaround for QDF2400 ITS erratum 0065
  irqdomain: Add empty irq_domain_check_msi_remap
  irqchip/crossbar: Fix incorrect type of local variables

4 years ago x86/tlb: Fix tlb flushing when lguest clears PGE
Daniel Borkmann [Sat, 11 Mar 2017 00:31:19 +0000 (01:31 +0100)]
x86/tlb: Fix tlb flushing when lguest clears PGE

Fengguang reported random corruptions from various locations on x86-32
after commits d2852a224050 ("arch: add ARCH_HAS_SET_MEMORY config") and
9d876e79df6a ("bpf: fix unlocking of jited image when module ronx not set")
that uses the former. While x86-32 doesn't have a JIT like x86_64, the
bpf_prog_lock_ro() and bpf_prog_unlock_ro() got enabled due to
ARCH_HAS_SET_MEMORY, whereas Fengguang's test kernel doesn't have module
support built in and therefore never had the DEBUG_SET_MODULE_RONX setting
enabled.

After investigating the crashes further, it turned out that using
set_memory_ro() and set_memory_rw() didn't have the desired effect, for
example, setting the pages as read-only on x86-32 would still let
probe_kernel_write() succeed without error. This behavior would manifest
itself in situations where the vmalloc'ed buffer was accessed prior to
set_memory_*() such as in case of bpf_prog_alloc(). In cases where it
wasn't, the page attribute changes seemed to have taken effect, leading to
the conclusion that a TLB invalidate didn't happen. Moreover, it turned out
that this issue reproduced with qemu in "-cpu kvm64" mode, but not for
"-cpu host". When the issue occurs, change_page_attr_set_clr() did trigger
a TLB flush as expected via __flush_tlb_all() through cpa_flush_range(),
though.

There are 3 variants for issuing a TLB flush: invpcid_flush_all() (depends
on CPU feature bits X86_FEATURE_INVPCID, X86_FEATURE_PGE), cr4 based flush
(depends on X86_FEATURE_PGE), and cr3 based flush.  For "-cpu host" case in
my setup, the flush used invpcid_flush_all() variant, whereas for "-cpu
kvm64", the flush was cr4 based. Switching the kvm64 case to cr3 manually
worked fine, and further investigation of the cr4 variant showed that the
X86_CR4_PGE bit was not set in the cr4 register, meaning that
__native_flush_tlb_global_irq_disabled() wrote cr4 twice with the same
value instead of clearing X86_CR4_PGE in the first write to trigger the
flush.

It turned out that X86_CR4_PGE was cleared from cr4 during init from
lguest_arch_host_init() via adjust_pge(). The X86_FEATURE_PGE bit is also
cleared from there due to concerns of using PGE in guest kernel that can
lead to hard to trace bugs (see bff672e630a0 ("lguest: documentation V:
Host") in init()). The CPU feature bits are cleared in dynamic
boot_cpu_data, but they never propagated to __flush_tlb_all() as it uses
static_cpu_has() instead of boot_cpu_has() for testing which variant of TLB
flushing to use, meaning they still used the old setting of the host
kernel.

Clearing via setup_clear_cpu_cap(X86_FEATURE_PGE) so this would propagate
to static_cpu_has() checks is too late at this point as sections have been
patched already, so for now, it seems reasonable to switch back to
boot_cpu_has(X86_FEATURE_PGE) as it was prior to commit c109bf95992b
("x86/cpufeature: Remove cpu_has_pge"). This lets the TLB flush trigger via
cr3 as originally intended, properly makes the new page attributes visible
and thus fixes the crashes seen by Fengguang.
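The difference between the two kinds of check can be sketched as follows.
This is only an analogy in plain C: boot_pge, patched_pge and the four
helpers are hypothetical, and the real static_cpu_has() works by patching
instructions rather than by caching a variable.

```c
#include <assert.h>
#include <stdbool.h>

/* Dynamic feature state, standing in for boot_cpu_data. */
static bool boot_pge = true;

/* Value baked in when alternatives are applied; never re-read afterwards. */
static bool patched_pge;

static void apply_alternatives(void)
{
    patched_pge = boot_pge;
}

/* Models a feature bit being cleared after patching has already happened. */
static void clear_pge_late(void)
{
    boot_pge = false;
}

/* Analogue of a patched check: returns the stale snapshot. */
static bool static_has_pge(void)
{
    return patched_pge;
}

/* Analogue of a dynamic check: always reads the current state. */
static bool boot_has_pge(void)
{
    return boot_pge;
}
```

After apply_alternatives() followed by clear_pge_late(), static_has_pge()
still reports the feature while boot_has_pge() does not, which is the
mismatch the paragraph above describes.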

Fixes: c109bf95992b ("x86/cpufeature: Remove cpu_has_pge")
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: bp@suse.de
Cc: Kees Cook <keescook@chromium.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: lkp@01.org
Cc: Laura Abbott <labbott@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernrl.org/r/20170301125426.l4nf65rx4wahohyl@wfg-t540p.sh.intel.com
Link: http://lkml.kernel.org/r/25c41ad9eca164be4db9ad84f768965b7eb19d9e.1489191673.git.daniel@iogearbox.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
4 years ago Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sat, 11 Mar 2017 22:24:58 +0000 (14:24 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull KVM fixes from Radim Krčmář:
 "ARM updates from Marc Zyngier:
   - vgic updates:
     - Honour disabling the ITS
     - Don't deadlock when deactivating own interrupts via MMIO
     - Correctly expose the lack of IRQ/FIQ bypass on GICv3

   - I/O virtualization:
     - Make KVM_CAP_NR_MEMSLOTS big enough for large guests with many
       PCIe devices

   - General bug fixes:
     - Gracefully handle exception generated with syndromes that the host
       doesn't understand
     - Properly invalidate TLBs on VHE systems

  x86:
   - improvements in emulation of VMCLEAR, VMX MSR bitmaps, and VCPU
     reset"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: nVMX: do not warn when MSR bitmap address is not backed
  KVM: arm64: Increase number of user memslots to 512
  KVM: arm/arm64: Remove KVM_PRIVATE_MEM_SLOTS definition that are unused
  KVM: arm/arm64: Enable KVM_CAP_NR_MEMSLOTS on arm/arm64
  KVM: Add documentation for KVM_CAP_NR_MEMSLOTS
  KVM: arm/arm64: VGIC: Fix command handling while ITS being disabled
  arm64: KVM: Survive unknown traps from guests
  arm: KVM: Survive unknown traps from guests
  KVM: arm/arm64: Let vcpu thread modify its own active state
  KVM: nVMX: reset nested_run_pending if the vCPU is going to be reset
  kvm: nVMX: VMCLEAR should not cause the vCPU to shut down
  KVM: arm/arm64: vgic-v3: Don't pretend to support IRQ/FIQ bypass
  arm64: KVM: VHE: Clear HCR_TGE when invalidating guest TLBs

4 years ago Merge tag 'extable-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
Linus Torvalds [Sat, 11 Mar 2017 22:16:50 +0000 (14:16 -0800)]
Merge tag 'extable-fix' of git://git./linux/kernel/git/paulg/linux

Pull extable.h fix from Paul Gortmaker:
 "Fixup for arch/score after extable.h introduction.

  It seems that Guenter is the only one on the planet doing builds for
  arch/score -- we don't have compile coverage for it in linux-next or
  in the kbuild-bot either. Guenter couldn't even recall where he got
  his toolchain, but was kind enough to share it with me so I could
  validate this change and also add arch/score to my build coverage.

  I sat on this a bit in case there was any other fallout in other arch
  dirs, but since this still seems to be the only one, I might as well
  send it on its way"

* tag 'extable-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
  score: Fix implicit includes now failing build after extable change

4 years ago Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso...
Linus Torvalds [Sat, 11 Mar 2017 17:08:47 +0000 (09:08 -0800)]
Merge tag 'random_for_linus' of git://git./linux/kernel/git/tytso/random

Pull random updates from Ted Ts'o:
 "Change get_random_{int,long} to use the CRNG used by /dev/urandom and
  getrandom(2). It's faster and arguably more secure than the cut-down
  MD5 that we had been using.

  Also do some code cleanup"

* tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
  random: move random_min_urandom_seed into CONFIG_SYSCTL ifdef block
  random: convert get_random_int/long into get_random_u32/u64
  random: use chacha20 for get_random_int/long
  random: fix comment for unused random_min_urandom_seed
  random: remove variable limit
  random: remove stale urandom_init_wait
  random: remove stale maybe_reseed_primary_crng

4 years ago score: Fix implicit includes now failing build after extable change
Guenter Roeck [Wed, 22 Feb 2017 19:07:57 +0000 (11:07 -0800)]
score: Fix implicit includes now failing build after extable change

After changing from module.h to extable.h, score builds fail with:

  arch/score/kernel/traps.c: In function 'do_ri':
  arch/score/kernel/traps.c:248:4: error: implicit declaration of function 'user_disable_single_step'
  arch/score/mm/extable.c: In function 'fixup_exception':
  arch/score/mm/extable.c:32:38: error: dereferencing pointer to incomplete type
  arch/score/mm/extable.c:34:24: error: dereferencing pointer to incomplete type

because extable.h doesn't drag in the same amount of headers as the
module.h did.  Add in the headers which were implicitly expected.

Fixes: 90858794c960 ("module.h: remove extable.h include now users have migrated")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
[PG: tweak commit log; refresh for sched header refactoring.]
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
4 years ago Merge tag 'tty-4.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Sat, 11 Mar 2017 08:20:12 +0000 (00:20 -0800)]
Merge tag 'tty-4.11-rc2' of git://git./linux/kernel/git/gregkh/tty

Pull tty/serial fixes from Greg KH:
 "Here are two bugfixes for tty stuff for 4.11-rc2.

  One of them resolves the pretty bad bug in the n_hdlc code that
  Alexander Popov found and fixed and has been reported everywhere. The
  other just fixes a samsung serial driver issue when DMA fails on some
  systems.

  Both have been in linux-next with no reported issues"

* tag 'tty-4.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serial: samsung: Continue to work if DMA request fails
  tty: n_hdlc: get rid of racy n_hdlc.tbuf

4 years ago Merge tag 'staging-4.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Sat, 11 Mar 2017 08:13:28 +0000 (00:13 -0800)]
Merge tag 'staging-4.11-rc2' of git://git./linux/kernel/git/gregkh/staging

Pull staging driver fixes from Greg KH:
 "Here are two small build warning fixes for some staging drivers that
  Arnd has found on his valiant quest to get the kernel to build
  properly with no warnings.

  Both of these have been in linux-next this week and resolve the
  reported issues"

* tag 'staging-4.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: octeon: remove unused variable
  staging/vc04_services: add CONFIG_OF dependency

4 years ago Merge tag 'usb-4.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Sat, 11 Mar 2017 08:08:39 +0000 (00:08 -0800)]
Merge tag 'usb-4.11-rc2' of git://git./linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here is a number of different USB fixes for 4.11-rc2.

  Seems like there were a lot of unresolved issues that people have been
  finding for this subsystem, and a bunch of good security auditing
  happening as well from Johan Hovold. There's the usual batch of gadget
  driver fixes and xhci issues resolved as well.

  All of these have been in linux-next with no reported issues"

* tag 'usb-4.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (35 commits)
  usb: host: xhci-plat: Fix timeout on removal of hot pluggable xhci controllers
  usb: host: xhci-dbg: HCIVERSION should be a binary number
  usb: xhci: remove dummy extra_priv_size for size of xhci_hcd struct
  usb: xhci-mtk: check hcc_params after adding primary hcd
  USB: serial: digi_acceleport: fix OOB-event processing
  MAINTAINERS: usb251xb: remove reference inexistent file
  doc: dt-bindings: usb251xb: mark reg as required
  usb: usb251xb: dt: add unit suffix to oc-delay and power-on-time
  usb: usb251xb: remove max_{power,current}_{sp,bp} properties
  usb-storage: Add ignore-residue quirk for Initio INIC-3619
  USB: iowarrior: fix NULL-deref in write
  USB: iowarrior: fix NULL-deref at probe
  usb: phy: isp1301: Add OF device ID table
  usb: ohci-at91: Do not drop unhandled USB suspend control requests
  USB: serial: safe_serial: fix information leak in completion handler
  USB: serial: io_ti: fix information leak in completion handler
  USB: serial: omninet: drop open callback
  USB: serial: omninet: fix reference leaks at open
  USB: serial: io_ti: fix NULL-deref in interrupt callback
  usb: dwc3: gadget: make to increment req->remaining in all cases
  ...

4 years ago Merge tag 'pinctrl-v4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw...
Linus Torvalds [Sat, 11 Mar 2017 08:06:18 +0000 (00:06 -0800)]
Merge tag 'pinctrl-v4.11-2' of git://git./linux/kernel/git/linusw/linux-pinctrl

Pull pinctrl fixes from Linus Walleij:
 "Two smaller pin control fixes for the v4.11 series:

   - Add a get_direction() function to the qcom driver

   - Fix two pin names in the uniphier driver"

* tag 'pinctrl-v4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: uniphier: change pin names of aio/xirq for LD11
  pinctrl: qcom: add get_direction function

4 years ago kexec, x86/purgatory: Unbreak it and clean it up
Thomas Gleixner [Fri, 10 Mar 2017 12:17:18 +0000 (13:17 +0100)]
kexec, x86/purgatory: Unbreak it and clean it up

The purgatory code defines global variables which are referenced via a
symbol lookup in the kexec code (core and arch).

A recent commit addressing sparse warnings made these static and thereby
broke kexec_file.

Why did this happen? Simply because the whole machinery is undocumented and
lacks any form of forward declarations. The variable names are unspecific
and lack a prefix, so adding forward declarations creates shadow variables
in the core code. Aside from that the code relies on magic constants and
duplicate struct definitions with no way to ensure that these things stay
in sync. The section placement of the purgatory variables happened by
chance and not by design.

Unbreak kexec and clean up the mess:

 - Add proper forward declarations and document the usage
 - Use common struct definition
 - Use the proper common defines instead of magic constants
 - Add a purgatory_ prefix to have a proper name space
 - Use ARRAY_SIZE() instead of a homebrewed reimplementation
 - Add proper sections to the purgatory variables [ From Mike ]
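Two of the cleanup items above (a common struct definition and
ARRAY_SIZE()) can be illustrated in a few lines. This is a userspace
sketch: the ARRAY_SIZE() here is a simplified version of the kernel macro,
and struct sha_region, SEGMENT_MAX and purgatory_regions are hypothetical
names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace version; the kernel macro also rejects non-arrays. */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

#define SEGMENT_MAX 16   /* illustrative limit shared by both sides */

/* One shared definition instead of duplicated struct layouts. */
struct sha_region {
    unsigned long start;
    unsigned long len;
};

/* Hypothetical global carrying the purgatory_ prefix. */
struct sha_region purgatory_regions[SEGMENT_MAX];

/* The element count follows the declaration, with no magic constant. */
static size_t purgatory_region_count(void)
{
    return ARRAY_SIZE(purgatory_regions);
}
```

Because the count is derived from the array itself, changing SEGMENT_MAX
cannot leave a stale hard-coded size behind.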

Fixes: 72042a8c7b01 ("x86/purgatory: Make functions and variables static")
Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Nicholas Mc Guire <der.herr@hofr.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: "Tobin C. Harding" <me@tobin.cc>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1703101315140.3681@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
4 years ago Merge tag 'ceph-for-4.11-rc2' of git://github.com/ceph/ceph-client
Linus Torvalds [Fri, 10 Mar 2017 19:05:47 +0000 (11:05 -0800)]
Merge tag 'ceph-for-4.11-rc2' of git://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:

 - a fix for the recently discovered misdirected requests bug present in
   jewel and later on the server side and all stable kernels

 - a fixup for -rc1 CRUSH changes

 - two usability enhancements: osd_request_timeout option and
   supported_features bus attribute.

* tag 'ceph-for-4.11-rc2' of git://github.com/ceph/ceph-client:
  libceph: osd_request_timeout option
  rbd: supported_features bus attribute
  libceph: don't set weight to IN when OSD is destroyed
  libceph: fix crush_decode() for older maps

4 years ago Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Fri, 10 Mar 2017 17:56:16 +0000 (09:56 -0800)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
 "Here are some driver bugfixes from I2C.

  Unusual this time are the two reverts. One because I accidentally picked
  a patch from the list which I should have pulled from my co-maintainer
  instead ("missing of_node_put"). And one which I wrongly assumed to be
  an easy fix but it turned out already that it needs more iterations
  ("copy device properties")"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  Revert "i2c: copy device properties when using i2c_register_board_info()"
  Revert "i2c: add missing of_node_put in i2c_mux_del_adapters"
  i2c: exynos5: Avoid transaction timeouts due TRANSFER_DONE_AUTO not set
  i2c: designware: add reset interface
  i2c: meson: fix wrong variable usage in meson_i2c_put_data
  i2c: copy device properties when using i2c_register_board_info()
  i2c: m65xx: drop superfluous quirk structure
  i2c: brcmstb: Fix START and STOP conditions
  i2c: add missing of_node_put in i2c_mux_del_adapters
  i2c: riic: fix restart condition
  i2c: add missing of_node_put in i2c_mux_del_adapters

4 years ago Merge tag 'drm-fixes-for-4.11-rc2' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Fri, 10 Mar 2017 17:53:00 +0000 (09:53 -0800)]
Merge tag 'drm-fixes-for-4.11-rc2' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "Intel, amd and mxsfb fixes.

  These are the drm fixes I've collected for rc2. Mostly i915 GVT only
  fixes, along with a single EDID fix, some mxsfb fixes and a few minor
  amd fixes"

* tag 'drm-fixes-for-4.11-rc2' of git://people.freedesktop.org/~airlied/linux: (38 commits)
  drm: mxsfb: Implement drm_panel handling
  drm: mxsfb_crtc: Fix the framebuffer misplacement
  drm: mxsfb: Fix crash when provided invalid DT bindings
  drm: mxsfb: fix pixel clock polarity
  drm: mxsfb: use bus_format to determine LCD bus width
  drm/amdgpu: bump driver version for some new features
  drm/amdgpu: validate paramaters in the gem ioctl
  drm/amd/amdgpu: fix console deadlock if late init failed
  drm/i915/gvt: change some gvt_err to gvt_dbg_cmd
  drm/i915/gvt: protect RO and Rsvd bits of virtual vgpu configuration space
  drm/i915/gvt: handle workload lifecycle properly
  drm/edid: Add EDID_QUIRK_FORCE_8BPC quirk for Rotel RSX-1058
  drm/i915/gvt: fix an error for F_RO flag
  drm/i915/gvt: use pfn_valid for better checking
  drm/i915/gvt: set SFUSE_STRAP properly for vitual monitor detection
  drm/i915/gvt: fix an error for one register
  drm/i915/gvt: add more registers into handlers list
  drm/i915/gvt: have more registers with F_CMD_ACCESS flags set
  drm/i915/gvt: add some new MMIOs to cmd_access white list
  drm/i915/gvt: fix pcode mailbox write emulation of BDW
  ...

4 years ago Merge branch 'prep-for-5level'
Linus Torvalds [Fri, 10 Mar 2017 16:59:07 +0000 (08:59 -0800)]
Merge branch 'prep-for-5level'

Merge 5-level page table prep from Kirill Shutemov:
 "Here's relatively low-risk part of 5-level paging patchset. Merging it
  now will make x86 5-level paging enabling in v4.12 easier.

  The first patch is actually x86-specific: detect 5-level paging
  support. It boils down to a single define.

  The rest of the patchset converts the Linux MMU abstraction from 4- to
  5-level paging.

  Enabling the new abstraction in most cases requires adding a single line
  of code in arch-specific code. The rest is taken care of by asm-generic/.

  Changes to mm/ code are mostly mechanical: add support for new page
  table level -- p4d_t -- where we deal with pud_t now.

  v2:
   - fix build on microblaze (Michal);
   - comment for __ARCH_HAS_5LEVEL_HACK in kasan_populate_zero_shadow();
   - acks from Michal"

* emailed patches from Kirill A Shutemov <kirill.shutemov@linux.intel.com>:
  mm: introduce __p4d_alloc()
  mm: convert generic code to 5-level paging
  asm-generic: introduce <asm-generic/pgtable-nop4d.h>
  arch, mm: convert all architectures to use 5level-fixup.h
  asm-generic: introduce __ARCH_USE_5LEVEL_HACK
  asm-generic: introduce 5level-fixup.h
  x86/cpufeature: Add 5-level paging detection

4 years ago Merge branch 'akpm' (patches from Andrew)
Linus Torvalds [Fri, 10 Mar 2017 16:34:42 +0000 (08:34 -0800)]
Merge branch 'akpm' (patches from Andrew)

Merge fixes from Andrew Morton:
 "26 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (26 commits)
  userfaultfd: remove wrong comment from userfaultfd_ctx_get()
  fat: fix using uninitialized fields of fat_inode/fsinfo_inode
  sh: cayman: IDE support fix
  kasan: fix races in quarantine_remove_cache()
  kasan: resched in quarantine_remove_cache()
  mm: do not call mem_cgroup_free() from within mem_cgroup_alloc()
  thp: fix another corner case of munlock() vs. THPs
  rmap: fix NULL-pointer dereference on THP munlocking
  mm/memblock.c: fix memblock_next_valid_pfn()
  userfaultfd: selftest: vm: allow to build in vm/ directory
  userfaultfd: non-cooperative: userfaultfd_remove revalidate vma in MADV_DONTNEED
  userfaultfd: non-cooperative: fix fork fctx->new memleak
  mm/cgroup: avoid panic when init with low memory
  drivers/md/bcache/util.h: remove duplicate inclusion of blkdev.h
  mm/vmstats: add thp_split_pud event for clarity
  include/linux/fs.h: fix unsigned enum warning with gcc-4.2
  userfaultfd: non-cooperative: release all ctx in dup_userfaultfd_complete
  userfaultfd: non-cooperative: robustness check
  userfaultfd: non-cooperative: rollback userfaultfd_exit
  x86, mm: unify exit paths in gup_pte_range()
  ...

4 years ago x86/reboot/quirks: Fix typo in ASUS EeeBook X205TA reboot quirk
Matjaz Hegedic [Thu, 9 Mar 2017 13:00:17 +0000 (14:00 +0100)]
x86/reboot/quirks: Fix typo in ASUS EeeBook X205TA reboot quirk

The reboot quirk for ASUS EeeBook X205TA contains a typo in
DMI_PRODUCT_NAME, improperly referring to X205TAW instead of
X205TA, which prevents the quirk from being triggered. The
model X205TAW already has a reboot quirk of its own.

This fix simply removes the inappropriate final letter W.
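Why one extra letter disables the quirk can be shown with a small sketch.
It assumes a substring-style comparison between the quirk's pattern and
the product name reported by firmware; product_matches() is a hypothetical
helper, not the kernel's DMI code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Assumed rule: the quirk pattern must occur inside the reported name. */
static bool product_matches(const char *reported, const char *pattern)
{
    assert(reported != NULL && pattern != NULL);
    return strstr(reported, pattern) != NULL;
}
```

Under that assumption, product_matches("X205TA", "X205TAW") is false, so
a machine reporting "X205TA" never triggers the mistyped quirk, while
product_matches("X205TA", "X205TA") is true once the W is removed.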

Fixes: 90b28ded88dd ("x86/reboot/quirks: Add ASUS EeeBook X205TA reboot quirk")
Signed-off-by: Matjaz Hegedic <matjaz.hegedic@gmail.com>
Link: http://lkml.kernel.org/r/1489064417-7445-1-git-send-email-matjaz.hegedic@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
4 years ago Merge tag 'xfs-4.11-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Linus Torvalds [Fri, 10 Mar 2017 02:11:28 +0000 (18:11 -0800)]
Merge tag 'xfs-4.11-fixes-1' of git://git./fs/xfs/xfs-linux

Pull xfs fixes from Darrick Wong:
 "Here are some bug fixes for -rc2 to clean up the copy on write
  handling and to remove a cause of hangs.

   - Fix various iomap bugs

   - Fix overly aggressive CoW preallocation garbage collection

   - Fixes to CoW endio error handling

   - Fix some incorrect geometry calculations

   - Remove a potential system hang in bulkstat

   - Try to allocate blocks more aggressively to reduce ENOSPC errors"

* tag 'xfs-4.11-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: try any AG when allocating the first btree block when reflinking
  xfs: use iomap new flag for newly allocated delalloc blocks
  xfs: remove kmem_zalloc_greedy
  xfs: Use xfs_icluster_size_fsb() to calculate inode alignment mask
  xfs: fix and streamline error handling in xfs_end_io
  xfs: only reclaim unwritten COW extents periodically
  iomap: invalidate page caches should be after iomap_dio_complete() in direct write

4 years ago Merge tag 'gcc-plugins-v4.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 10 Mar 2017 02:05:41 +0000 (18:05 -0800)]
Merge tag 'gcc-plugins-v4.11-rc2' of git://git./linux/kernel/git/kees/linux

Pull gcc-plugins fix from Kees Cook:
 "Fixes a typo in sancov plugin, exposed in earlier compiler versions"

* tag 'gcc-plugins-v4.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  gcc-plugins: fix sancov_plugin for gcc-5

4 years ago drm: mxsfb: Implement drm_panel handling
Fabio Estevam [Wed, 1 Feb 2017 17:19:47 +0000 (15:19 -0200)]
drm: mxsfb: Implement drm_panel handling

Currently when the 'power-supply' regulator is passed via device tree
it does not actually work since drm_panel_prepare()/drm_panel_enable()
are never called.

Quoting Thierry Reding: "It should really call drm_panel_prepare() and
drm_panel_enable() while switching on the display pipeline and
drm_panel_disable(), followed by drm_panel_unprepare() while switching
off the display pipeline."

So do as suggested, so that the 'power-supply' regulator can be functional.
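The call ordering quoted above can be sketched with a tiny state machine.
Everything here is a stand-in: struct fake_panel and the fake_*() helpers
only model the order of the drm_panel_*() calls and are not the DRM API.

```c
#include <assert.h>
#include <stddef.h>

enum panel_state { PANEL_OFF, PANEL_PREPARED, PANEL_ENABLED };

/* Stand-in for a panel; a real driver would use the drm_panel_*() helpers. */
struct fake_panel {
    enum panel_state state;
};

static void fake_prepare(struct fake_panel *p)   /* e.g. power the regulator */
{
    if (p->state == PANEL_OFF)
        p->state = PANEL_PREPARED;
}

static void fake_enable(struct fake_panel *p)
{
    if (p->state == PANEL_PREPARED)
        p->state = PANEL_ENABLED;
}

static void fake_disable(struct fake_panel *p)
{
    if (p->state == PANEL_ENABLED)
        p->state = PANEL_PREPARED;
}

static void fake_unprepare(struct fake_panel *p)
{
    if (p->state == PANEL_PREPARED)
        p->state = PANEL_OFF;
}

/* Switching the display pipeline on: prepare first, then enable. */
static void pipeline_on(struct fake_panel *p)
{
    assert(p != NULL);
    fake_prepare(p);
    fake_enable(p);
}

/* Switching it off: disable first, then unprepare. */
static void pipeline_off(struct fake_panel *p)
{
    assert(p != NULL);
    fake_disable(p);
    fake_unprepare(p);
}
```

In this model the panel never reaches PANEL_ENABLED unless the prepare
step ran first, mirroring why the regulator stays off when those calls
are missing.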

Reported-by: Breno Lima <breno.lima@nxp.com>
Suggested-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Tested-by: Marek Vasut <marex@denx.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
4 years ago drm: mxsfb_crtc: Fix the framebuffer misplacement
Fabio Estevam [Thu, 2 Feb 2017 21:26:38 +0000 (19:26 -0200)]
drm: mxsfb_crtc: Fix the framebuffer misplacement

Currently the framebuffer content is displayed with incorrect offsets
in both the vertical and horizontal directions.

The fbdev version of the driver does not show this problem. Breno Lima
dumped the eLCDIF controller registers on both the drm and fbdev drivers
and noticed that the VDCTRL3 register is configured incorrectly in the
drm driver.

The fbdev driver calculates the vertical and horizontal wait counts
of the VDCTRL3 register by doing: back porch + sync length.

Looking at the horizontal and vertical timing diagram from
include/drm/drm_modes.h this value corresponds to:

crtc_[hv]total - crtc_[hv]sync_start

So fix the VDCTRL3 register setting accordingly so that the eLCDIF
controller can properly show the framebuffer content in the correct
position.

Reported-by: Breno Lima <breno.lima@nxp.com>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Tested-by: Breno Lima <breno.lima@nxp.com>
Tested-by: Marek Vasut <marex@denx.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
4 years ago drm: mxsfb: Fix crash when provided invalid DT bindings
Marek Vasut [Sat, 28 Jan 2017 17:01:57 +0000 (18:01 +0100)]
drm: mxsfb: Fix crash when provided invalid DT bindings

The mxsfb driver will crash if the mxsfb DT node has a subnode,
but the content of the subnode is not an of-graph binding with an
endpoint linking to a panel. The crash was triggered by providing
old-style panel bindings to the mxsfb driver instead of the new
of-graph ones.

The problem happens in mxsfb_create_output(), which is invoked
from mxsfb_load(). The mxsfb_create_output() iterates over all
mxsfb DT subnode endpoints and tries to bind a panel on each
endpoint. If there is any problem binding the panel, that is,
mxsfb->panel == NULL, this function will return an error code,
otherwise success 0 is returned.

If the subnodes do not specify of-graph binding with an endpoint,
the iteration over endpoints in mxsfb_create_output() will have
zero cycles and the function will immediately return 0, but the
mxsfb->panel will remain NULL. This is propagated back into the
mxsfb_load(), which does not detect any problem and expects that
the mxsfb->panel is valid, thus calls mxsfb_panel_attach(). But
since mxsfb->panel == NULL, mxsfb_panel_attach() is called with
first argument NULL and this crashes the kernel.

This patch fixes the problem by explicitly checking for valid
mxsfb->panel at the end of the iteration in mxsfb_create_output().
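The bug pattern and the fix can be reduced to a small userspace sketch.
struct fake_mxsfb, create_output() and the -ENODEV return value are
assumptions for illustration; the real function walks DT endpoints rather
than an array.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct panel { int id; };

struct fake_mxsfb {
    struct panel *panel;
};

/* endpoints[i] is the panel found on endpoint i, or NULL if binding fails. */
static int create_output(struct fake_mxsfb *m, struct panel **endpoints,
                         size_t n)
{
    size_t i;

    assert(m != NULL);
    for (i = 0; i < n; i++) {
        if (endpoints[i] == NULL)
            return -ENODEV;      /* binding a panel failed */
        m->panel = endpoints[i];
    }

    /* The fix: a loop with zero iterations must not report success. */
    if (m->panel == NULL)
        return -ENODEV;

    return 0;
}
```

Without the final check, calling create_output() with n == 0 would return
0 while m->panel is still NULL, which is the state that later crashes the
caller.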

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Stefan Agner <stefan@agner.ch>
Cc: Breno Matheus Lima <brenomatheus@gmail.com>
Tested-by: Breno Lima <breno.lima@nxp.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
4 years ago drm: mxsfb: fix pixel clock polarity
Stefan Agner [Wed, 14 Dec 2016 20:48:09 +0000 (12:48 -0800)]
drm: mxsfb: fix pixel clock polarity

The DRM subsystem specifies the pixel clock polarity from the
controller's perspective: DRM_BUS_FLAG_PIXDATA_NEGEDGE means
the controller drives the data on the pixel clock's falling edge.
That is the controller's DOTCLK_POL=0 (default is data launched
at negative edge).

Also change the data enable logic to be high active by default
and only change it if explicitly requested via bus_flags. With
that, the defaults are:
- Data enable: high active
- Pixel clock polarity: controller drives data on negative edge
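
The defaults listed above could be modelled roughly as follows. All
MODEL_* names and bit positions are hypothetical; the real bus flag and
register definitions are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit values; the real flag and register layouts differ. */
#define MODEL_BUS_FLAG_DE_LOW		(1u << 0)
#define MODEL_BUS_FLAG_PIXDATA_POSEDGE	(1u << 1)

#define MODEL_REG_ENABLE_POL_HIGH	(1u << 0)	/* data enable is high active */
#define MODEL_REG_DOTCLK_POL		(1u << 1)	/* 0: data driven on negative edge */

/* Translate controller-centric bus flags into polarity register bits. */
uint32_t polarity_bits(uint32_t bus_flags)
{
	uint32_t reg = 0;

	/* Data enable stays high active unless low active is requested. */
	if (!(bus_flags & MODEL_BUS_FLAG_DE_LOW))
		reg |= MODEL_REG_ENABLE_POL_HIGH;

	/* DOTCLK_POL stays 0 (negative edge) unless positive edge is requested. */
	if (bus_flags & MODEL_BUS_FLAG_PIXDATA_POSEDGE)
		reg |= MODEL_REG_DOTCLK_POL;

	assert((reg & ~(MODEL_REG_ENABLE_POL_HIGH | MODEL_REG_DOTCLK_POL)) == 0);
	return reg;
}
```

With no flags set, the result has only the data enable bit set, which
matches the two defaults in the list.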

Signed-off-by: Stefan Agner <stefan@agner.ch>
Acked-by: Marek Vasut <marex@denx.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
4 years ago drm: mxsfb: use bus_format to determine LCD bus width
Stefan Agner [Thu, 15 Dec 2016 01:28:41 +0000 (17:28 -0800)]
drm: mxsfb: use bus_format to determine LCD bus width

The LCD bus width does not need to align with the pixel format. The
LCDIF controller automatically converts between pixel formats and
bus width by padding or dropping LSBs.

The DRM subsystem has the notion of bus_format, which allows
determining which bus formats the display supports. Choose the
first available one, or fall back to 24 bit if none are available.
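
A small sketch of that selection rule, assuming made-up format codes
(the MODEL_FMT_* values are placeholders, not the real bus format
constants) and a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical format codes standing in for the real bus format constants. */
enum model_bus_format {
	MODEL_FMT_RGB565_1X16 = 1,
	MODEL_FMT_RGB666_1X18,
	MODEL_FMT_RGB888_1X24,
};

/*
 * Return the LCD bus width in bits: taken from the first reported bus
 * format, or 24 when the display reports none. Returns 0 for a format
 * this sketch does not know.
 */
unsigned int lcd_bus_width(const uint32_t *bus_formats, size_t num_formats)
{
	uint32_t fmt = MODEL_FMT_RGB888_1X24;	/* fallback: 24 bit */

	if (num_formats > 0) {
		assert(bus_formats != NULL);
		fmt = bus_formats[0];
	}

	switch (fmt) {
	case MODEL_FMT_RGB565_1X16:
		return 16;
	case MODEL_FMT_RGB666_1X18:
		return 18;
	case MODEL_FMT_RGB888_1X24:
		return 24;
	default:
		return 0;
	}
}
```

The bus width is chosen independently of the framebuffer pixel format,
which is the point the message makes about padding or dropping LSBs.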

Signed-off-by: Stefan Agner <stefan@agner.ch>
Acked-by: Marek Vasut <marex@denx.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
4 years ago Merge branch 'drm-fixes-4.11' of git://people.freedesktop.org/~agd5f/linux into drm...
Dave Airlie [Fri, 10 Mar 2017 01:07:34 +0000 (11:07 +1000)]
Merge branch 'drm-fixes-4.11' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

* 'drm-fixes-4.11' of git://people.freedesktop.org/~agd5f/linux:
  drm/amdgpu: bump driver version for some new features
  drm/amdgpu: validate paramaters in the gem ioctl
  drm/amd/amdgpu: fix console deadlock if late init failed

4 years ago Merge tag 'drm-intel-fixes-2017-03-09' of git://anongit.freedesktop.org/git/drm-intel...
Dave Airlie [Fri, 10 Mar 2017 01:07:13 +0000 (11:07 +1000)]
Merge tag 'drm-intel-fixes-2017-03-09' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes

flushing out gvt-g fixes

* tag 'drm-intel-fixes-2017-03-09' of git://anongit.freedesktop.org/git/drm-intel: (29 commits)
  drm/i915/gvt: change some gvt_err to gvt_dbg_cmd
  drm/i915/gvt: protect RO and Rsvd bits of virtual vgpu configuration space
  drm/i915/gvt: handle workload lifecycle properly
  drm/i915/gvt: fix an error for F_RO flag
  drm/i915/gvt: use pfn_valid for better checking
  drm/i915/gvt: set SFUSE_STRAP properly for vitual monitor detection
  drm/i915/gvt: fix an error for one register
  drm/i915/gvt: add more registers into handlers list
  drm/i915/gvt: have more registers with F_CMD_ACCESS flags set
  drm/i915/gvt: add some new MMIOs to cmd_access white list
  drm/i915/gvt: fix pcode mailbox write emulation of BDW
  drm/i915/gvt: add resolution definition for vGPU type
  drm/i915/gvt: Add more edid definition support
  drm/i915/gvt: adjust to fixed vGPU types
  drm/i915/gvt: remove unnecessary error msg from gtt write
  drm/i915/gvt: refine pcode write emulation
  drm/i915/gvt: clear the vGPU reset logic
  drm/i915/gvt: decrease priority of output msg for untracked mmio
  drm/i915/gvt: set default value to 0 for unhandled mmio regs
  drm/i915/gvt: add cmd_access to GEN7_HALF_SLICE_CHICKEN1
  ...

4 years ago Merge tag 'drm-misc-fixes-2017-03-06' of git://anongit.freedesktop.org/git/drm-misc...
Dave Airlie [Fri, 10 Mar 2017 01:06:46 +0000 (11:06 +1000)]
Merge tag 'drm-misc-fixes-2017-03-06' of git://anongit.freedesktop.org/git/drm-misc into drm-fixes

Just 1 8bpc quirk from Ville, cc: stable

* tag 'drm-misc-fixes-2017-03-06' of git://anongit.freedesktop.org/git/drm-misc:
  drm/edid: Add EDID_QUIRK_FORCE_8BPC quirk for Rotel RSX-1058

4 years ago userfaultfd: remove wrong comment from userfaultfd_ctx_get()
David Hildenbrand [Fri, 10 Mar 2017 00:17:40 +0000 (16:17 -0800)]
userfaultfd: remove wrong comment from userfaultfd_ctx_get()

It's a void function, so there is no return value.

Link: http://lkml.kernel.org/r/20170309150817.7510-1-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years ago fat: fix using uninitialized fields of fat_inode/fsinfo_inode
OGAWA Hirofumi [Fri, 10 Mar 2017 00:17:37 +0000 (16:17 -0800)]
fat: fix using uninitialized fields of fat_inode/fsinfo_inode

The recently merged fallocate patch uses
MSDOS_I(inode)->mmu_private in fat_evict_inode().  However, the
fat_inode/fsinfo_inode code that was introduced earlier didn't
initialize MSDOS_I(inode) properly.

In combination, this caused accesses to random entries in the FAT
area.

Link: http://lkml.kernel.org/r/87pohrj4i8.fsf@mail.parknet.co.jp
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Reported-by: Moreno Bartalucci <moreno.bartalucci@tecnorama.it>
Tested-by: Moreno Bartalucci <moreno.bartalucci@tecnorama.it>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years ago sh: cayman: IDE support fix
Bartlomiej Zolnierkiewicz [Fri, 10 Mar 2017 00:17:34 +0000 (16:17 -0800)]
sh: cayman: IDE support fix

Remove incorrect CONFIG_IDE ifdef (CONFIG_IDE config option is for
internal drivers/ide/ use) and make IDE hardware interface always
initialized (not only when IDE subsystem is built-in).

This patch allows Cayman board to work with modular IDE subsystem
support and removes the requirement of having the whole core IDE
subsystem built-in when using libata PATA support.

Link: http://lkml.kernel.org/r/1990884.yFoE6lSB9G@amdc3058
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years ago kasan: fix races in quarantine_remove_cache()
Dmitry Vyukov [Fri, 10 Mar 2017 00:17:32 +0000 (16:17 -0800)]
kasan: fix races in quarantine_remove_cache()

quarantine_remove_cache() frees all pending objects that belong to the
cache, before we destroy the cache itself.  However there are currently
two ways it can fail to do so.

First, another thread can hold some of the objects from the cache in
the temp list in quarantine_put().  quarantine_put() has a window of
enabled interrupts, and on_each_cpu() in quarantine_remove_cache() can
finish right in that window.  These objects will be later freed into the
destroyed cache.

Then, quarantine_reduce() has the same problem.  It grabs a batch of
objects from the global quarantine, then unlocks quarantine_lock and
then frees the batch.  quarantine_remove_cache() can finish while some
objects from the cache are still in the local to_free list in
quarantine_reduce().

Fix the race with quarantine_put() by disabling interrupts for the whole
duration of quarantine_put().  In combination with on_each_cpu() in
quarantine_remove_cache() it ensures that quarantine_remove_cache()
either sees the objects in the per-cpu list or in the global list.

Fix the race with quarantine_reduce() by protecting quarantine_reduce()
with srcu critical section and then doing synchronize_srcu() at the end
of quarantine_remove_cache().

I've done some assessment of how well synchronize_srcu() works in this
case.  On a 4 CPU VM I see that it blocks waiting for pending read
critical sections in about 2-3% of cases, which looks good to me.

I suspect that these races are the root cause of some GPFs that I
episodically hit.  Previously I did not have any explanation for them.

  BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8
  IP: qlist_free_all+0x2e/0xc0 mm/kasan/quarantine.c:155
  PGD 6aeea067
  PUD 60ed7067
  PMD 0
  Oops: 0000 [#1] SMP KASAN
  Dumping ftrace buffer:
     (ftrace buffer empty)
  Modules linked in:
  CPU: 0 PID: 13667 Comm: syz-executor2 Not tainted 4.10.0+ #60
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  task: ffff88005f948040 task.stack: ffff880069818000
  RIP: 0010:qlist_free_all+0x2e/0xc0 mm/kasan/quarantine.c:155
  RSP: 0018:ffff88006981f298 EFLAGS: 00010246
  RAX: ffffea0000ffff00 RBX: 0000000000000000 RCX: ffffea0000ffff1f
  RDX: 0000000000000000 RSI: ffff88003fffc3e0 RDI: 0000000000000000
  RBP: ffff88006981f2c0 R08: ffff88002fed7bd8 R09: 00000001001f000d
  R10: 00000000001f000d R11: ffff88006981f000 R12: ffff88003fffc3e0
  R13: ffff88006981f2d0 R14: ffffffff81877fae R15: 0000000080000000
  FS:  00007fb911a2d700(0000) GS:ffff88003ec00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000000000c8 CR3: 0000000060ed6000 CR4: 00000000000006f0
  Call Trace:
   quarantine_reduce+0x10e/0x120 mm/kasan/quarantine.c:239
   kasan_kmalloc+0xca/0xe0 mm/kasan/kasan.c:590
   kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:544
   slab_post_alloc_hook mm/slab.h:456 [inline]
   slab_alloc_node mm/slub.c:2718 [inline]
   kmem_cache_alloc_node+0x1d3/0x280 mm/slub.c:2754
   __alloc_skb+0x10f/0x770 net/core/skbuff.c:219
   alloc_skb include/linux/skbuff.h:932 [inline]
   _sctp_make_chunk+0x3b/0x260 net/sctp/sm_make_chunk.c:1388
   sctp_make_data net/sctp/sm_make_chunk.c:1420 [inline]
   sctp_make_datafrag_empty+0x208/0x360 net/sctp/sm_make_chunk.c:746
   sctp_datamsg_from_user+0x7e8/0x11d0 net/sctp/chunk.c:266
   sctp_sendmsg+0x2611/0x3970 net/sctp/socket.c:1962
   inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:761
   sock_sendmsg_nosec net/socket.c:633 [inline]
   sock_sendmsg+0xca/0x110 net/socket.c:643
   SYSC_sendto+0x660/0x810 net/socket.c:1685
   SyS_sendto+0x40/0x50 net/socket.c:1653

I am not sure about backporting.  The bug is quite hard to trigger; I've
seen it a few times during our massive continuous testing (however, it
could be the cause of some other episodic stray crashes as it leads to
memory corruption...).  If it is triggered, the consequences are very
bad: almost definite memory corruption.  The fix is non-trivial
and has a chance of introducing new bugs.  I am also not sure how
actively people use KASAN on older releases.

[dvyukov@google.com: sorted includes]
Link: http://lkml.kernel.org/r/20170309094028.51088-1-dvyukov@google.com
Link: http://lkml.kernel.org/r/20170308151532.5070-1-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years ago kasan: resched in quarantine_remove_cache()
Dmitry Vyukov [Fri, 10 Mar 2017 00:17:28 +0000 (16:17 -0800)]
kasan: resched in quarantine_remove_cache()

We see reported stalls/lockups in quarantine_remove_cache() on machines
with large amounts of RAM.  quarantine_remove_cache() needs to scan the
whole quarantine in order to take out all objects belonging to the
cache.  Quarantine is currently 1/32-th of RAM, e.g. on a machine with
256GB of memory that will be 8GB.  Moreover, quarantine scanning is a
walk over an uncached linked list, which is slow.

Add cond_resched() after scanning of each non-empty batch of objects.
Batches are specifically kept of reasonable size for quarantine_put().
On a machine with 256GB of RAM we should have ~512 non-empty batches,
each with 16MB of objects.
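
The figures above can be checked with a few lines of arithmetic. The
constants come from the numbers in this message (1/32 of RAM, roughly
16MB per batch); the MODEL_* names and the helper functions are
illustrative only.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_QUARANTINE_FRACTION	32		/* quarantine is 1/32 of RAM */
#define MODEL_BATCH_BYTES		(16ULL << 20)	/* about 16MB per batch */

/* Total quarantine size for a machine with ram_bytes of memory. */
uint64_t quarantine_bytes(uint64_t ram_bytes)
{
	return ram_bytes / MODEL_QUARANTINE_FRACTION;
}

/* Number of full-size batches needed to hold the whole quarantine. */
uint64_t quarantine_batches(uint64_t ram_bytes)
{
	uint64_t q = quarantine_bytes(ram_bytes);

	/* Guard the round-up addition against overflow. */
	assert(q <= UINT64_MAX - (MODEL_BATCH_BYTES - 1));
	return (q + MODEL_BATCH_BYTES - 1) / MODEL_BATCH_BYTES;
}
```

For 256GB of RAM this gives 8GB of quarantine and 512 batches, so one
cond_resched() per batch means at most about 512 reschedule points per
scan.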

Link: http://lkml.kernel.org/r/20170308154239.25440-1-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years ago mm: do not call mem_cgroup_free() from within mem_cgroup_alloc()
Tahsin Erdogan [Fri, 10 Mar 2017 00:17:26 +0000 (16:17 -0800)]
mm: do not call mem_cgroup_free() from within mem_cgroup_alloc()

mem_cgroup_free() indirectly calls wb_domain_exit() which is not
prepared to deal with a struct wb_domain object that hasn't executed
wb_domain_init().  For instance, the following warning message is
printed by lockdep if alloc_percpu() fails in mem_cgroup_alloc():

  INFO: trying to register non-static key.
  the code is fine but needs lockdep annotation.
  turning off the locking correctness validator.
  CPU: 1 PID: 1950 Comm: mkdir Not tainted 4.10.0+ #151
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  Call Trace:
   dump_stack+0x67/0x99
   register_lock_class+0x36d/0x540
   __lock_acquire+0x7f/0x1a30
   lock_acquire+0xcc/0x200
   del_timer_sync+0x3c/0xc0
   wb_domain_exit+0x14/0x20
   mem_cgroup_free+0x14/0x40
   mem_cgroup_css_alloc+0x3f9/0x620
   cgroup_apply_control_enable+0x190/0x390
   cgroup_mkdir+0x290/0x3d0
   kernfs_iop_mkdir+0x58/0x80
   vfs_mkdir+0x10e/0x1a0
   SyS_mkdirat+0xa8/0xd0
   SyS_mkdir+0x14/0x20
   entry_SYSCALL_64_fastpath+0x18/0xad

Add __mem_cgroup_free() which skips wb_domain_exit().  This is used by
both mem_cgroup_free() and mem_cgroup_alloc() clean up.
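
A userspace sketch of the same split, under the assumption that the
teardown has one step that is only valid after initialisation. All
names are hypothetical; memcg_free_raw() plays the role of
__mem_cgroup_free().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model types; not the kernel's structures. */
struct wb_domain_model { bool initialized; };

struct memcg_model {
	int *stat;			/* stands in for the percpu allocation */
	struct wb_domain_model wb;
};

/* Only valid on an initialised domain, like the real exit function. */
void wb_domain_exit_model(struct wb_domain_model *wb)
{
	assert(wb->initialized);
	wb->initialized = false;
}

/* Frees memory only; safe on a partially constructed object. */
void memcg_free_raw(struct memcg_model *m)
{
	free(m->stat);
	free(m);
}

/* Full teardown for a fully constructed object. */
void memcg_free(struct memcg_model *m)
{
	wb_domain_exit_model(&m->wb);
	memcg_free_raw(m);
}

/* fail_percpu simulates the allocation failure described above. */
struct memcg_model *memcg_alloc(bool fail_percpu)
{
	struct memcg_model *m = calloc(1, sizeof(*m));

	if (!m)
		return NULL;

	m->stat = fail_percpu ? NULL : calloc(4, sizeof(*m->stat));
	if (!m->stat)
		goto fail;

	m->wb.initialized = true;	/* models wb_domain_init() */
	return m;

fail:
	/* The domain was never initialised, so skip its exit step. */
	memcg_free_raw(m);
	return NULL;
}
```

Calling memcg_free() on the failure path instead would trip the
assertion in wb_domain_exit_model(), which is the userspace analogue of
the lockdep warning quoted above.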

Fixes: 0b8f73e104285 ("mm: memcontrol: clean up alloc, online, offline, free functions")
Link: http://lkml.kernel.org/r/20170306192122.24262-1-tahsin@google.com
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years ago thp: fix another corner case of munlock() vs. THPs
Kirill A. Shutemov [Fri, 10 Mar 2017 00:17:23 +0000 (16:17 -0800)]
thp: fix another corner case of munlock() vs. THPs

The following test case triggers BUG() in munlock_vma_pages_range():

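#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

int main(int argc, char *argv[])
{
	int fd;

	system("mount -t tmpfs -o huge=always none /mnt");
	fd = open("/mnt/test", O_CREAT | O_RDWR);
	ftruncate(fd, 4UL << 20);
	/* First mapping: 4MB, locked. */
	mmap(NULL, 4UL << 20, PROT_READ | PROT_WRITE,
	     MAP_SHARED | MAP_FIXED | MAP_LOCKED, fd, 0);
	/* Second mapping: a single 4096-byte page of the same file. */
	mmap(NULL, 4096, PROT_READ | PROT_WRITE,
	     MAP_SHARED | MAP_LOCKED, fd, 0);
	munlockall();
	return 0;
}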

The second mmap() creates a PTE mapping of the first huge page in the
file.  This makes the kernel munlock the page, as we never keep a
PTE-mapped page mlocked.

On munlockall(), when we handle the vma created by the first mmap(),
munlock_vma_page() returns page_mask == 0, as the page is not mlocked
anymore.  On the next iteration follow_page_mask() returns a tail page,
but page_mask is HPAGE_NR_PAGES - 1.  This makes us skip to the first
tail page of the next huge page and step on
VM_BUG_ON_PAGE(PageMlocked(page)).

The fix is to not use the page_mask from follow_page_mask() at all.  It
has no use for us.

Link: http://lkml.kernel.org/r/20170302150252.34120-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org> [4.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years ago rmap: fix NULL-pointer dereference on THP munlocking
Kirill A. Shutemov [Fri, 10 Mar 2017 00:17:20 +0000 (16:17 -0800)]
rmap: fix NULL-pointer dereference on THP munlocking

The following test case triggers a NULL-pointer dereference in
try_to_unmap_one():

#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

int main(int argc, char *argv[])
{
	int fd;

	system("mount -t tmpfs -o huge=always none /mnt");
	fd = open("/mnt/test", O_CREAT | O_RDWR);
	ftruncate(fd, 2UL << 20);
	mmap(NULL, 2UL << 20, PROT_READ | PROT_WRITE,
	     MAP_SHARED | MAP_FIXED | MAP_LOCKED, fd, 0);
	mmap(NULL, 2UL << 20, PROT_READ | PROT_WRITE,
	     MAP_SHARED | MAP_LOCKED, fd, 0);
	munlockall();
	return 0;
}

Apparently, there's a case when we call try_to_unmap() on huge PMDs:
it's TTU_MUNLOCK.

Let's handle this case correctly.

Fixes: c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
Link: http://lkml.kernel.org/r/20170302151159.30592-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years ago mm/memblock.c: fix memblock_next_valid_pfn()
AKASHI Takahiro [Fri, 10 Mar 2017 00:17:17 +0000 (16:17 -0800)]
mm/memblock.c: fix memblock_next_valid_pfn()

Obviously, we should not access memblock.memory.regions[right] if
'right' is outside of the half-open range [0, memblock.memory.cnt).
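
A simplified userspace sketch of a binary search with that bounds check.
It works on plain addresses rather than pfns, and the struct and
function names are assumptions, not the memblock API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region descriptor: [base, base + size). */
struct region {
	uint64_t base;
	uint64_t size;
};

/*
 * Return the smallest address >= addr that lies inside a region, capped
 * at limit. Regions must be sorted by base and must not overlap.
 */
uint64_t next_valid_addr(const struct region *regions, size_t cnt,
			 uint64_t addr, uint64_t limit)
{
	size_t left = 0, right = cnt;

	while (left < right) {
		size_t mid = left + (right - left) / 2;

		if (addr < regions[mid].base)
			right = mid;
		else if (addr >= regions[mid].base + regions[mid].size)
			left = mid + 1;
		else
			return addr < limit ? addr : limit;	/* inside a region */
	}

	/* right == cnt: addr is past every region, regions[right] does not exist. */
	if (right == cnt)
		return limit;

	assert(right < cnt);
	return regions[right].base < limit ? regions[right].base : limit;
}
```

When the search ends without a hit, right indexes the first region that
starts above addr, or equals cnt when there is no such region; only the
former may be dereferenced.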

Fixes: b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible")
Link: http://lkml.kernel.org/r/20170303023745.9104-1-takahiro.akashi@linaro.org
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years ago userfaultfd: selftest: vm: allow to build in vm/ directory
Andrea Arcangeli [Fri, 10 Mar 2017 00:17:14 +0000 (16:17 -0800)]
userfaultfd: selftest: vm: allow to build in vm/ directory

linux/tools/testing/selftests/vm $ make

  gcc -Wall -I ../../../../usr/include     compaction_test.c -lrt -o /compaction_test
  /usr/lib/gcc/x86_64-pc-linux-gnu/4.9.4/../../../../x86_64-pc-linux-gnu/bin/ld: cannot open output file /compaction_test: Permission denied
  collect2: error: ld returned 1 exit status
  make: *** [../lib.mk:54: /compaction_test] Error 1

Since commit a8ba798bc8ec ("selftests: enable O and KBUILD_OUTPUT")
selftests/vm build fails if run from the "selftests/vm" directory, but
it works in the selftests/ directory.  It's quicker to be able to do a
local vm-only build after a tree wipe and this patch allows for it
again.

Link: http://lkml.kernel.org/r/20170302173738.18994-4-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years ago userfaultfd: non-cooperative: userfaultfd_remove revalidate vma in MADV_DONTNEED
Andrea Arcangeli [Fri, 10 Mar 2017 00:17:11 +0000 (16:17 -0800)]
userfaultfd: non-cooperative: userfaultfd_remove revalidate vma in MADV_DONTNEED

userfaultfd_remove() has to be executed before zapping the pagetables or
UFFDIO_COPY could keep filling pages after zap_page_range returned,
which would result in non zero data after a MADV_DONTNEED.

However userfaultfd_remove() may have to release the mmap_sem.  This was
handled correctly in MADV_REMOVE, but MADV_DONTNEED accessed a
potentially stale vma (the very vma passed to zap_page_range(vma, ...)).

The fix consists in revalidating the vma in case userfaultfd_remove()
had to release the mmap_sem.

This also optimizes away an unnecessary down_read/up_read in the
MADV_REMOVE case if UFFD_EVENT_FORK had to be delivered.

It all remains zero runtime cost in case CONFIG_USERFAULTFD=n as
userfaultfd_remove() will be defined as "true" at build time.

Link: http://lkml.kernel.org/r/20170302173738.18994-3-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years ago userfaultfd: non-cooperative: fix fork fctx->new memleak
Mike Rapoport [Fri, 10 Mar 2017 00:17:09 +0000 (16:17 -0800)]
userfaultfd: non-cooperative: fix fork fctx->new memleak

We have a memleak in the ->new ctx if the uffd of the parent is closed
before the fork event is read: nothing frees the new context.

Link: http://lkml.kernel.org/r/20170302173738.18994-2-aarcange@redhat.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years ago mm/cgroup: avoid panic when init with low memory
Laurent Dufour [Fri, 10 Mar 2017 00:17:06 +0000 (16:17 -0800)]
mm/cgroup: avoid panic when init with low memory

The system may panic when initialisation is done when almost all the
memory is assigned to the huge pages using the kernel command line
parameter hugepage=xxxx.  Panic may occur like this:

  Unable to handle kernel paging request for data at address 0x00000000
  Faulting instruction address: 0xc000000000302b88
  Oops: Kernel access of bad area, sig: 11 [#1]
  SMP NR_CPUS=2048 [    0.082424] NUMA
  pSeries
  Modules linked in:
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.0-15-generic #16-Ubuntu
  task: c00000021ed01600 task.stack: c00000010d108000
  NIP: c000000000302b88 LR: c000000000270e04 CTR: c00000000016cfd0
  REGS: c00000010d10b2c0 TRAP: 0300   Not tainted (4.9.0-15-generic)
  MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>[ 0.082770]   CR: 28424422  XER: 00000000
  CFAR: c0000000003d28b8 DAR: 0000000000000000 DSISR: 40000000 SOFTE: 1
  GPR00: c000000000270e04 c00000010d10b540 c00000000141a300 c00000010fff6300
  GPR04: 0000000000000000 00000000026012c0 c00000010d10b630 0000000487ab0000
  GPR08: 000000010ee90000 c000000001454fd8 0000000000000000 0000000000000000
  GPR12: 0000000000004400 c00000000fb80000 00000000026012c0 00000000026012c0
  GPR16: 00000000026012c0 0000000000000000 0000000000000000 0000000000000002
  GPR20: 000000000000000c 0000000000000000 0000000000000000 00000000024200c0
  GPR24: c0000000016eef48 0000000000000000 c00000010fff7d00 00000000026012c0
  GPR28: 0000000000000000 c00000010fff7d00 c00000010fff6300 c00000010d10b6d0
  NIP mem_cgroup_soft_limit_reclaim+0xf8/0x4f0
  LR do_try_to_free_pages+0x1b4/0x450
  Call Trace:
    do_try_to_free_pages+0x1b4/0x450
    try_to_free_pages+0xf8/0x270
    __alloc_pages_nodemask+0x7a8/0xff0
    new_slab+0x104/0x8e0
    ___slab_alloc+0x620/0x700
    __slab_alloc+0x34/0x60
    kmem_cache_alloc_node_trace+0xdc/0x310
    mem_cgroup_init+0x158/0x1c8
    do_one_initcall+0x68/0x1d0
    kernel_init_freeable+0x278/0x360
    kernel_init+0x24/0x170
    ret_from_kernel_thread+0x5c/0x74
  Instruction dump:
  eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8 4e800020 3d230001 e9499a42 3d220004
  3929acd8 794a1f24 7d295214 eac90100 <e9360000> 2fa90000 419eff74 3b200000
  ---[ end trace 342f5208b00d01b6 ]---

This is a chicken and egg issue where the kernel tries to get free memory
when allocating per node data in mem_cgroup_init(), but in that path
mem_cgroup_soft_limit_reclaim() is called, which assumes that these data
are allocated.

As mem_cgroup_soft_limit_reclaim() is best effort, it should return when
these data are not yet allocated.

This patch also fixes potential null pointer access in
mem_cgroup_remove_from_trees() and mem_cgroup_update_tree().
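
The early-return idea can be sketched in userspace as follows. The
per-node table, its size and the function name are hypothetical; the
sketch only shows a best-effort routine tolerating data that has not
been allocated yet.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_NODES 4

/* Hypothetical per-node data; NULL until initialisation has run. */
struct node_tree {
	unsigned long reclaimable;
};

static struct node_tree *node_trees[MODEL_MAX_NODES];

/* Best effort: reclaim what the node offers, or nothing if it is not set up. */
unsigned long soft_limit_reclaim(int nid)
{
	struct node_tree *tree;
	unsigned long reclaimed;

	assert(nid >= 0 && nid < MODEL_MAX_NODES);

	tree = node_trees[nid];
	if (!tree)
		return 0;	/* called before init finished: nothing to do */

	reclaimed = tree->reclaimable;
	tree->reclaimable = 0;
	return reclaimed;
}
```

Returning 0 is acceptable here because the caller treats soft limit
reclaim as optional and continues with its normal reclaim path.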

Link: http://lkml.kernel.org/r/1487856999-16581-2-git-send-email-ldufour@linux.vnet.ibm.com
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years ago drivers/md/bcache/util.h: remove duplicate inclusion of blkdev.h
Masanari Iida [Fri, 10 Mar 2017 00:17:03 +0000 (16:17 -0800)]
drivers/md/bcache/util.h: remove duplicate inclusion of blkdev.h

Link: http://lkml.kernel.org/r/20170226060230.11555-1-standby24x7@gmail.com
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Coly Li <colyli@suse.de>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years ago mm/vmstats: add thp_split_pud event for clarity
Yisheng Xie [Fri, 10 Mar 2017 00:17:00 +0000 (16:17 -0800)]
mm/vmstats: add thp_split_pud event for clarity

We added support for PUD-sized transparent hugepages, however we count
the event "thp split pud" into thp_split_pmd event.

To separate the event count of thp split pud from pmd, add a new event
named thp_split_pud.

Link: http://lkml.kernel.org/r/1488282380-5076-1-git-send-email-xieyisheng1@huawei.com
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Ebru Akagunduz <ebru.akagunduz@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years ago include/linux/fs.h: fix unsigned enum warning with gcc-4.2
Arnd Bergmann [Fri, 10 Mar 2017 00:16:57 +0000 (16:16 -0800)]
include/linux/fs.h: fix unsigned enum warning with gcc-4.2

With arm-linux-gcc-4.2, almost every file we build in the kernel ends up
with this warning:

  include/linux/fs.h:2648: warning: comparison of unsigned expression < 0 is always false

Later versions don't have this problem, but it's easy enough to work
around.
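
One common way to avoid this class of warning is to fold the two range
checks into a single unsigned comparison. The sketch below assumes a
made-up enum and lookup table; it is not the code from fs.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical enum; compilers may give it an unsigned underlying type. */
enum reading_id {
	READING_FIRMWARE,
	READING_MODULE,
	READING_MAX_ID
};

static const char * const reading_names[] = { "firmware", "module" };

const char *reading_id_str(enum reading_id id)
{
	assert(sizeof(reading_names) / sizeof(reading_names[0]) == READING_MAX_ID);

	/*
	 * A check written as "id < 0 || id >= READING_MAX_ID" warns when the
	 * enum is unsigned. The cast makes negative values wrap to large
	 * ones, so a single comparison rejects both ends of the range.
	 */
	if ((unsigned int)id >= READING_MAX_ID)
		return "unknown";

	return reading_names[id];
}
```

The cast keeps the behaviour identical for compilers that pick a signed
type for the enum, since negative values still fail the comparison.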

Link: http://lkml.kernel.org/r/20161216105634.235457-12-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years ago userfaultfd: non-cooperative: release all ctx in dup_userfaultfd_complete
Andrea Arcangeli [Fri, 10 Mar 2017 00:16:54 +0000 (16:16 -0800)]
userfaultfd: non-cooperative: release all ctx in dup_userfaultfd_complete

Don't stop running dup_fctx() even if userfaultfd_event_wait_completion
fails as it has to run userfaultfd_ctx_put on all ctx to pair against
the userfaultfd_ctx_get that was run on all fctx->orig in
dup_userfaultfd.
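
The pairing argument can be illustrated with a small model: every queued
entry holds one reference, so the completion loop has to visit every
entry even after a failure. The types, names and return convention are
assumptions for this sketch, not the userfaultfd code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical context with a plain reference count. */
struct ctx_model {
	int refcount;
};

struct fork_ctx_model {
	struct ctx_model *orig;	/* one reference was taken when queued */
	bool event_fails;	/* simulates a failed event delivery */
};

/* Deliver one event; always drops the reference taken at queue time. */
int complete_one(struct fork_ctx_model *f)
{
	int ret = f->event_fails ? -1 : 0;

	assert(f->orig->refcount > 0);
	f->orig->refcount--;
	return ret;
}

/* Process the whole list; remember the first error but never stop early. */
int complete_all(struct fork_ctx_model *list, size_t n)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		int err = complete_one(&list[i]);

		if (err && !ret)
			ret = err;
	}
	return ret;
}
```

Breaking out of the loop on the first error would leave the remaining
entries with an unpaired reference, which is the leak this change
avoids.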

Link: http://lkml.kernel.org/r/20170224181957.19736-4-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years ago userfaultfd: non-cooperative: robustness check
Andrea Arcangeli [Fri, 10 Mar 2017 00:16:52 +0000 (16:16 -0800)]
userfaultfd: non-cooperative: robustness check

Similar to the handle_userfault() case, also make sure to never attempt
to send any event past the PF_EXITING point of no return.

This is purely a robustness check.

Link: http://lkml.kernel.org/r/20170224181957.19736-3-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years ago userfaultfd: non-cooperative: rollback userfaultfd_exit
Andrea Arcangeli [Fri, 10 Mar 2017 00:16:49 +0000 (16:16 -0800)]
userfaultfd: non-cooperative: rollback userfaultfd_exit

Patch series "userfaultfd non-cooperative further update for 4.11 merge
window".

Unfortunately I noticed one relevant bug in userfaultfd_exit while doing
more testing.  I've been doing testing before and this was also tested
by kbuild bot and exercised by the selftest, but this bug never
reproduced before.

I dropped userfaultfd_exit as a result.  I dropped it because of the
implementation difficulty in receiving signals in __mmput and because I
think -ENOSPC as the result of the background UFFDIO_COPY should be
enough already.

Before I decided to remove userfaultfd_exit, I noticed userfaultfd_exit
wasn't exercised by the selftest and when I tried to exercise it, after
moving it to a more correct place in __mmput where it would make more
sense and where the vma list is stable, it resulted in the
event_wait_completion in D state.  So then I added the second patch to
be sure that even if we call userfaultfd_event_wait_completion too late
during task exit(), we won't risk generating tasks in D state.  The
same check exists in handle_userfault() for the same reason, except it
makes a difference there, while here it is just a robustness check and
it's run under WARN_ON_ONCE.

While looking at the userfaultfd_event_wait_completion() function I
looked back at its callers too while at it and I think it's not ok to
stop executing dup_fctx on the fcs list because we rely on
userfaultfd_event_wait_completion to execute
userfaultfd_ctx_put(fctx->orig) which is paired against
userfaultfd_ctx_get(fctx->orig) in dup_userfault just before
list_add(fcs).  This change only takes care of fctx->orig but this area
also needs further review looking for similar problems in fctx->new.

The only patch that is urgent is the first because it's a
use-after-free during an SMP race condition that affects all processes
if CONFIG_USERFAULTFD=y.  Very hard to reproduce though and probably
impossible without SLUB poisoning enabled.

This patch (of 3):

I once reproduced this oops with the userfaultfd selftest.  It's not
easily reproducible and it requires SLUB poisoning to reproduce.

    general protection fault: 0000 [#1] SMP
    Modules linked in:
    CPU: 2 PID: 18421 Comm: userfaultfd Tainted: G               ------------ T 3.10.0+ #15
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.1-0-g8891697-prebuilt.qemu-project.org 04/01/2014
    task: ffff8801f83b9440 ti: ffff8801f833c000 task.ti: ffff8801f833c000
    RIP: 0010:[<ffffffff81451299>]  [<ffffffff81451299>] userfaultfd_exit+0x29/0xa0
    RSP: 0018:ffff8801f833fe80  EFLAGS: 00010202
    RAX: ffff8801f833ffd8 RBX: 6b6b6b6b6b6b6b6b RCX: ffff8801f83b9440
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800baf18600
    RBP: ffff8801f833fee8 R08: 0000000000000000 R09: 0000000000000001
    R10: 0000000000000000 R11: ffffffff8127ceb3 R12: 0000000000000000
    R13: ffff8800baf186b0 R14: ffff8801f83b99f8 R15: 00007faed746c700
    FS:  0000000000000000(0000) GS:ffff88023fc80000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 00007faf0966f028 CR3: 0000000001bc6000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Call Trace:
      do_exit+0x297/0xd10
      SyS_exit+0x17/0x20
      tracesys+0xdd/0xe2
    Code: 00 00 66 66 66 66 90 55 48 89 e5 41 54 53 48 83 ec 58 48 8b 1f 48 85 db 75 11 eb 73 66 0f 1f 44 00 00 48 8b 5b 10 48 85 db 74 64 <4c> 8b a3 b8 00 00 00 4d 85 e4 74 eb 41 f6 84 24 2c 01 00 00 80
    RIP  [<ffffffff81451299>] userfaultfd_exit+0x29/0xa0
     RSP <ffff8801f833fe80>
    ---[ end trace 9fecd6dcb442846a ]---

In the debugger I located the "mm" pointer in the stack and walking
mm->mmap->vm_next through the end shows the vma->vm_next list is fully
consistent and is a null-terminated list as expected.  So this has to
be an SMP race condition where userfaultfd_exit was running while the
vma list was being modified by another CPU.

When userfaultfd_exit() ran, one of the ->vm_next pointers pointed to
SLAB_POISON (RBX is the vma pointer and is 0x6b6b..).

The reason is that it's not running in __mmput but while there are still
other threads running and it's not holding the mmap_sem (it can't as it
has to wait for the event to be received by the manager).  So this is a
use after free that was happening for all processes.

One more implementation problem aside from the race condition:
userfaultfd_exit really has to check a flag in mm->flags before walking
the vma list or it's going to slow down the exit() path for regular tasks.

One more implementation problem: at that point signals can't be
delivered so it would also create a task in D state if the manager
doesn't read the event.

The major design issue: it overall looks superfluous as the manager can
check for -ENOSPC in the background transfer:

    if (mmget_not_zero(ctx->mm)) {
        [..]
    } else {
        return -ENOSPC;
    }

It's safer to roll it back and re-introduce it later if at all.

[rppt@linux.vnet.ibm.com: documentation fixup after removal of UFFD_EVENT_EXIT]
Link: http://lkml.kernel.org/r/1488345437-4364-1-git-send-email-rppt@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/20170224181957.19736-2-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agox86, mm: unify exit paths in gup_pte_range()
Dan Williams [Fri, 10 Mar 2017 00:16:45 +0000 (16:16 -0800)]
x86, mm: unify exit paths in gup_pte_range()

All exit paths from gup_pte_range() require pte_unmap() of the original
pte page before returning.  Refactor the code to have a single exit
point to do the unmap.

This mirrors the flow of the generic gup_pte_range() in mm/gup.c.
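
As a rough user-space illustration of the single-exit-point pattern
described above (not the actual gup_pte_range() code), the sketch below
uses hypothetical map_table()/unmap_table() helpers as stand-ins for
mapping and pte_unmap() of the pte page.  All names and the "negative
entry means failure" rule are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for map/unmap; the counter tracks balance. */
static int mapped;

static const int *map_table(const int *table)
{
	mapped++;
	return table;
}

static void unmap_table(const int *table)
{
	(void)table;
	mapped--;
}

/*
 * Walk n entries and return 1 if all of them are acceptable
 * (non-negative in this toy model), 0 otherwise.  Every path leaves
 * through the single "out" label, so the unmap runs exactly once.
 */
static int walk_entries(const int *table, size_t n)
{
	const int *base = map_table(table);
	int ret = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (base[i] < 0)
			goto out;	/* early failure still unmaps */
	}
	ret = 1;
out:
	unmap_table(base);
	return ret;
}
```

With one exit label, adding a new failure condition inside the loop
cannot forget the unmap, which is the property the refactoring is after.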

Link: http://lkml.kernel.org/r/148804251828.36605.14910389618497006945.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agox86, mm: fix gup_pte_range() vs DAX mappings
Dan Williams [Fri, 10 Mar 2017 00:16:42 +0000 (16:16 -0800)]
x86, mm: fix gup_pte_range() vs DAX mappings

gup_pte_range() fails to check pte_allows_gup() before translating a DAX
pte entry, pte_devmap(), to a page.  This allows writes to read-only
mappings, and bypasses the DAX cacheline dirty tracking due to missed
'mkwrite' faults.  The gup_huge_pmd() path and the gup_huge_pud() path
correctly check pte_allows_gup() before checking for _devmap() entries.

Fixes: 3565fce3a659 ("mm, x86: get_user_pages() for dax mappings")
Link: http://lkml.kernel.org/r/148804251312.36605.12665024794196605053.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Dave Hansen <dave.hansen@linux.intel.com>
Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Xiong Zhou <xzhou@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agopower/mm: update pte_write and pte_wrprotect to handle savedwrite
Aneesh Kumar K.V [Fri, 10 Mar 2017 00:16:39 +0000 (16:16 -0800)]
power/mm: update pte_write and pte_wrprotect to handle savedwrite

We use pte_write() to check whether the pte entry is writable.  This is
mostly used to later mark the pte read only if it is writable.  The other
use of pte_write() is to check whether the pte_entry is writable so that
hardware page table entry can be marked accordingly.  This is used in kvm
where we look at qemu page table entry and update hardware hash page table
for the guest with correct write enable bit.

With the above, for the first usage we should also check the savedwrite
bit so that we can correctly clear the savedwrite bit.  For the latter, we
add a new variant __pte_write().
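
The split between the two helpers can be pictured with a toy model.
This is only a sketch: the bit values, the toy_ prefix and the
toy_pte_write_hw() name (standing in for __pte_write()) are assumptions
for illustration and do not reflect the real ppc64 pte encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define TOY_PAGE_WRITE      0x1u
#define TOY_PAGE_SAVEDWRITE 0x2u

typedef uint32_t toy_pte_t;

/* Models __pte_write(): only the real write-enable bit counts. */
static int toy_pte_write_hw(toy_pte_t pte)
{
	return (pte & TOY_PAGE_WRITE) != 0;
}

static int toy_pte_savedwrite(toy_pte_t pte)
{
	return (pte & TOY_PAGE_SAVEDWRITE) != 0;
}

/* Models pte_write(): writable now, or writable once the saved bit is restored. */
static int toy_pte_write(toy_pte_t pte)
{
	return toy_pte_write_hw(pte) || toy_pte_savedwrite(pte);
}

/* Models pte_wrprotect(): clears both the write and the savedwrite bit. */
static toy_pte_t toy_pte_wrprotect(toy_pte_t pte)
{
	return pte & ~(TOY_PAGE_WRITE | TOY_PAGE_SAVEDWRITE);
}
```

In this model a caller that wants to write-protect a pte uses
toy_pte_write() so saved-write entries are not skipped, while a caller
programming a hardware table uses toy_pte_write_hw().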

With this we can revert write_protect_page part of 595cd8f256d2 ("mm/ksm:
handle protnone saved writes when making page write protect").  But I left
it as it is as an example code for savedwrite check.

Fixes: c137a2757b886 ("powerpc/mm/autonuma: switch ppc64 to its own implementation of saved write")
Link: http://lkml.kernel.org/r/1488203787-17849-2-git-send-email-aneesh.kumar@linux.vnet.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agopowerpc/mm: handle protnone ptes on fork
Aneesh Kumar K.V [Fri, 10 Mar 2017 00:16:36 +0000 (16:16 -0800)]
powerpc/mm: handle protnone ptes on fork

We need to mark pages of the parent process read only on fork.  NUMA fault
ptes use a protnone pte variant with the saved write flag set.  On fork we
need to make sure we remove the saved write bit.  Instead of adding the
protnone check in the caller, update the ptep_set_wrprotect variants to
clear the savedwrite bit.

Without this we see random segfaults in applications on fork.

Fixes: c137a2757b886 ("powerpc/mm/autonuma: switch ppc64 to its own implementation of saved write")
Link: http://lkml.kernel.org/r/1488203787-17849-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoscripts/spelling.txt: add "overide" pattern and fix typo instances
Masahiro Yamada [Fri, 10 Mar 2017 00:16:33 +0000 (16:16 -0800)]
scripts/spelling.txt: add "overide" pattern and fix typo instances

Fix typos and add the following to the scripts/spelling.txt:

  overide||override

While we are here, fix the doubled "address" in the touched line
Documentation/devicetree/bindings/regulator/ti-abb-regulator.txt.

Also, fix the comment block style in the touched hunks in
drivers/media/dvb-frontends/drx39xyj/drx_driver.h.

Link: http://lkml.kernel.org/r/1481573103-11329-21-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoscripts/spelling.txt: add "disble(d)" pattern and fix typo instances
Masahiro Yamada [Fri, 10 Mar 2017 00:16:31 +0000 (16:16 -0800)]
scripts/spelling.txt: add "disble(d)" pattern and fix typo instances

Fix typos and add the following to the scripts/spelling.txt:

  disble||disable
  disbled||disabled

I kept the TSL2563_INT_DISBLED in /drivers/iio/light/tsl2563.c
untouched.  The macro is not referenced at all, but this commit is
touching only comment blocks just in case.

Link: http://lkml.kernel.org/r/1481573103-11329-20-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agouserfaultfd: shmem: __do_fault requires VM_FAULT_NOPAGE
Andrea Arcangeli [Fri, 10 Mar 2017 00:16:28 +0000 (16:16 -0800)]
userfaultfd: shmem: __do_fault requires VM_FAULT_NOPAGE

__do_fault assumes vmf->page has been initialized and is valid if
VM_FAULT_NOPAGE is not returned by vma->vm_ops->fault(vma, vmf).

handle_userfault() in turn should return VM_FAULT_NOPAGE if it doesn't
return VM_FAULT_SIGBUS or VM_FAULT_RETRY (the other two possibilities).

This VM_FAULT_NOPAGE case is only invoked when signals are pending and it
didn't matter for anonymous memory before.  It only started to matter
since shmem was introduced.  hugetlbfs also takes a different path and
doesn't exercise __do_fault.

Link: http://lkml.kernel.org/r/20170228154201.GH5816@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoMerge tag 'pm-4.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Fri, 10 Mar 2017 00:30:37 +0000 (16:30 -0800)]
Merge tag 'pm-4.11-rc2' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix several issues in the intel_pstate driver and one issue in
  the schedutil cpufreq governor, clean up that governor a bit and hook
  up existing code for disabling cpufreq to a new kernel command line
  option.

  Specifics:

   - Three fixes for intel_pstate problems related to the passive mode
     (in which it acts as a regular cpufreq scaling driver), two for the
     handling of global P-state limits and one for the handling of the
     cpu_frequency tracepoint in that mode (Rafael Wysocki).

   - Three fixes for the handling of P-state limits in intel_pstate in
     the active mode (Rafael Wysocki).

   - Introduction of a new cpufreq.off=1 kernel command line argument
     that will disable cpufreq entirely if passed to the kernel and is
     simply hooked up to the existing code used by Xen (Len Brown).

   - Fix for the schedutil cpufreq governor to prevent it from using
     stale raw frequency values in configurations with multiple CPUs
     sharing one policy object and a cleanup for it reducing its
     overhead slightly (Viresh Kumar)"

* tag 'pm-4.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy
  cpufreq: intel_pstate: Fix intel_pstate_verify_policy()
  cpufreq: intel_pstate: Fix global settings in active mode
  cpufreq: Add the "cpufreq.off=1" cmdline option
  cpufreq: schedutil: Pass sg_policy to get_next_freq()
  cpufreq: schedutil: move cached_raw_freq to struct sugov_policy
  cpufreq: intel_pstate: Avoid triggering cpu_frequency tracepoint unnecessarily
  cpufreq: intel_pstate: Fix intel_cpufreq_verify_policy()
  cpufreq: intel_pstate: Do not use performance_limits in passive mode

4 years agoMerge tag 'pci-v4.11-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaa...
Linus Torvalds [Fri, 10 Mar 2017 00:02:04 +0000 (16:02 -0800)]
Merge tag 'pci-v4.11-fixes-2' of git://git./linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:
 "PCI fixes:

   - fix NULL pointer dereference in Exynos driver

   - fix NULL pointer dereference in ASPM with pre-1.1 PCIe devices

   - blacklist QLogic ISP2722 to prevent panics while reading VPD"

* tag 'pci-v4.11-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI/ASPM: Always set link->downstream to avoid NULL dereference on remove
  PCI: Prevent VPD access for QLogic ISP2722
  PCI: exynos: Initialize elbi_base even when using PHY framework

4 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-block
Linus Torvalds [Thu, 9 Mar 2017 23:53:25 +0000 (15:53 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "Sending this a bit sooner than I otherwise would have, as a fix in the
  merge window had some unfortunate issues and side effects for some
  folks.

  This contains:

   - Fixes from Jan for the bdi registration/unregistration. These have
     been tested by the various parties reporting issues, and should be
     solid at this point.

   - Also from Jan, fix for axonram gendisk registration.

   - A stable fix for zram from Johannes.

   - A small series from Ming, fixing up some long standing issues with
     blk-mq hardware queue kobject initialization and registration.

   - A fix for sed opal from Jon, fixing a nonsensical range check and
     some set-but-not-used variables.

   - A fix from Neil for a long standing deadlock issue for stacking
     device drivers. With this in place, dm/md don't have to work around
     the issue anymore, and can be properly fixed up"

* 'for-linus' of git://git.kernel.dk/linux-block:
  axonram: Fix gendisk handling
  blk: improve order of bio handling in generic_make_request()
  Revert "scsi, block: fix duplicate bdi name registration crashes"
  block: Make del_gendisk() safer for disks without queues
  bdi: Fix use-after-free in wb_congested_put()
  block: Allow bdi re-registration
  block/sed: Fix opal user range check and unused variables
  zram: set physical queue limits to avoid array out of bounds accesses
  blk-mq: free hctx->cpumask in release handler of hctx's kobject
  blk-mq: make lifetime consistent between hctx and its kobject
  blk-mq: make lifetime consitent between q/ctx and its kobject
  blk-mq: initialize mq kobjects in blk_mq_init_allocated_queue()

4 years agoMerge tag 'media/v4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Linus Torvalds [Thu, 9 Mar 2017 23:50:56 +0000 (15:50 -0800)]
Merge tag 'media/v4.11-2' of git://git./linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:
 "Media regression fixes:

   - serial_ir: fix a Kernel crash during boot on Kernel 4.11-rc1, due
     to an IRQ code called too early

   - other IR regression fixes at lirc and at the raw IR decoding

   - a deadlock fix at the RC nuvoton driver

   - fix another issue with DMA on stack at dw2102 driver

  There's an extra patch there that changes a driver interface for the
  SoC VSP1 driver, which is shared between the DRM and V4L2 driver. The
  patch itself is trivial, and was acked by David Arlie"

* tag 'media/v4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] v4l: vsp1: Adapt vsp1_du_setup_lif() interface to use a structure
  [media] dw2102: don't do DMA on stack
  [media] rc: protocol is not set on register for raw IR devices
  [media] rc: raw decoder for keymap protocol is not loaded on register
  [media] rc: nuvoton: fix deadlock in nvt_write_wakeup_codes
  [media] lirc: fix dead lock between open and wakeup_filter
  [media] serial_ir: ensure we're ready to receive interrupts

4 years agos390: wire up statx system call
Heiko Carstens [Mon, 6 Mar 2017 08:57:17 +0000 (09:57 +0100)]
s390: wire up statx system call

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
4 years agodrm/amdgpu: bump driver version for some new features
Alex Deucher [Wed, 8 Mar 2017 22:23:21 +0000 (17:23 -0500)]
drm/amdgpu: bump driver version for some new features

We added new gem ioctl flags and the new fences ioctl, but forgot
to bump the version.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: validate paramaters in the gem ioctl
Alex Deucher [Wed, 8 Mar 2017 22:40:17 +0000 (17:40 -0500)]
drm/amdgpu: validate paramaters in the gem ioctl

Reject it if there are any invalid flags or domains.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agoMerge tag 'for-linus-4.11-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 9 Mar 2017 20:23:30 +0000 (12:23 -0800)]
Merge tag 'for-linus-4.11-rc1-tag' of git://git./linux/kernel/git/xen/tip

Pull xen fix and cleanup from Juergen Gross:
 "This contains one fix for MSIX handling under Xen and a trivial
  cleanup patch"

* tag 'for-linus-4.11-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xenbus: Remove duplicate inclusion of linux/init.h
  xen: do not re-use pirq number cached in pci device msi msg data

4 years agomm: introduce __p4d_alloc()
Kirill A. Shutemov [Thu, 9 Mar 2017 14:24:08 +0000 (17:24 +0300)]
mm: introduce __p4d_alloc()

For full 5-level paging we need a helper to allocate p4d page table.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agomm: convert generic code to 5-level paging
Kirill A. Shutemov [Thu, 9 Mar 2017 14:24:07 +0000 (17:24 +0300)]
mm: convert generic code to 5-level paging

Convert all non-architecture-specific code to 5-level paging.

It's mostly a mechanical change: adding handling of one more page table
level in places where we deal with pud_t.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoasm-generic: introduce <asm-generic/pgtable-nop4d.h>
Kirill A. Shutemov [Thu, 9 Mar 2017 14:24:06 +0000 (17:24 +0300)]
asm-generic: introduce <asm-generic/pgtable-nop4d.h>

Like with pgtable-nopud.h for 4-level paging, this new header is the base
for converting an architecture to a properly folded p4d_t level.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoarch, mm: convert all architectures to use 5level-fixup.h
Kirill A. Shutemov [Thu, 9 Mar 2017 14:24:05 +0000 (17:24 +0300)]
arch, mm: convert all architectures to use 5level-fixup.h

If an architecture uses 4level-fixup.h we don't need to do anything as
it includes 5level-fixup.h.

If an architecture uses pgtable-nop*d.h, define __ARCH_USE_5LEVEL_HACK
before inclusion of the header. This makes asm-generic code use
5level-fixup.h.

If an architecture has 4-level paging or folds levels on its own,
include 5level-fixup.h directly.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoasm-generic: introduce __ARCH_USE_5LEVEL_HACK
Kirill A. Shutemov [Thu, 9 Mar 2017 14:24:04 +0000 (17:24 +0300)]
asm-generic: introduce __ARCH_USE_5LEVEL_HACK

We are going to introduce <asm-generic/pgtable-nop4d.h> to provide
abstraction for a properly (as opposed to the 5level-fixup.h hack) folded
p4d level. The new header will be included from pgtable-nopud.h.

If an architecture uses <asm-generic/nop*d.h>, we cannot use
5level-fixup.h directly to quickly convert the architecture to 5-level
paging as it would conflict with pgtable-nop4d.h.

With this patch an architecture can define __ARCH_USE_5LEVEL_HACK before
including <asm-generic/nop*d.h> to use 5level-fixup.h.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoasm-generic: introduce 5level-fixup.h
Kirill A. Shutemov [Thu, 9 Mar 2017 14:24:03 +0000 (17:24 +0300)]
asm-generic: introduce 5level-fixup.h

We are going to switch core MM to 5-level paging abstraction.

This is a preparation step which adds <asm-generic/5level-fixup.h>.
As with 4level-fixup.h, the new header allows us to quickly make all
architectures compatible with 5-level paging in core MM.

In the long run we would like to switch architectures to properly folded p4d
level by using <asm-generic/pgtable-nop4d.h>, but it requires more
changes to arch-specific code.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agox86/cpufeature: Add 5-level paging detection
Kirill A. Shutemov [Thu, 9 Mar 2017 14:24:02 +0000 (17:24 +0300)]
x86/cpufeature: Add 5-level paging detection

Look for 'la57' in /proc/cpuinfo to see if your machine supports 5-level
paging.
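
A small user-space sketch of that check follows.  The helper names
line_has_flag() and cpuinfo_has_flag() are made up for this example;
the only thing taken from the text above is that the flag appears as
'la57' in /proc/cpuinfo.  The code assumes flags are listed as
space-separated words on lines starting with "flags".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if `flag` appears as a whole space-separated word in `line`. */
static int line_has_flag(const char *line, const char *flag)
{
	size_t len = strlen(flag);
	const char *p = line;

	if (len == 0)
		return 0;
	while ((p = strstr(p, flag)) != NULL) {
		int starts = (p == line) || p[-1] == ' ' || p[-1] == '\t';
		int ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';

		if (starts && ends)
			return 1;
		p += len;
	}
	return 0;
}

/*
 * Scan the "flags" lines of a cpuinfo-style file.
 * Returns 1 if the flag is present, 0 if not, -1 if the file can't be opened.
 */
static int cpuinfo_has_flag(const char *path, const char *flag)
{
	char buf[8192];
	FILE *f = fopen(path, "r");
	int found = 0;

	if (!f)
		return -1;
	while (!found && fgets(buf, sizeof(buf), f)) {
		if (strncmp(buf, "flags", 5) == 0)
			found = line_has_flag(buf, flag);
	}
	fclose(f);
	return found;
}
```

Calling cpuinfo_has_flag("/proc/cpuinfo", "la57") would then report
whether the running kernel advertises the feature on this machine.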

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agousb: host: xhci-plat: Fix timeout on removal of hot pluggable xhci controllers
Guenter Roeck [Thu, 9 Mar 2017 13:39:37 +0000 (15:39 +0200)]
usb: host: xhci-plat: Fix timeout on removal of hot pluggable xhci controllers

Upstream commit 98d74f9ceaef ("xhci: fix 10 second timeout on removal of
PCI hotpluggable xhci controllers") fixes a problem with hot pluggable PCI
xhci controllers which can result in excessive timeouts, to the point where
the system reports a deadlock.

The same problem is seen with hot pluggable xhci controllers using the
xhci-plat driver, such as the driver used for Type-C ports on rk3399.
Similar to hot-pluggable PCI controllers, the driver for this chip
removes the xhci controller from the system when the Type-C cable is
disconnected.

The solution for PCI devices works just as well for non-PCI devices
and avoids the problem.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agousb: host: xhci-dbg: HCIVERSION should be a binary number
Peter Chen [Thu, 9 Mar 2017 13:39:36 +0000 (15:39 +0200)]
usb: host: xhci-dbg: HCIVERSION should be a binary number

According to the xHCI spec, HCIVERSION contains a BCD encoding
of the xHCI specification revision number; 0100h corresponds
to xHCI version 1.0. Change "100" to "0x100".
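
To see why the constant matters, here is a minimal sketch that decodes
a 16-bit BCD value two digits at a time.  The function name
hciversion_decode() and the major/minor split are assumptions for the
example, not code from the driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Decode a 16-bit BCD value: the upper byte holds two decimal digits
 * of the major number, the lower byte two digits of the minor part.
 */
static void hciversion_decode(uint16_t bcd, unsigned *major, unsigned *minor)
{
	*major = ((bcd >> 12) & 0xf) * 10 + ((bcd >> 8) & 0xf);
	*minor = ((bcd >> 4) & 0xf) * 10 + (bcd & 0xf);
}
```

Under this decoding 0x100 yields major 1, minor 0, whereas decimal 100
is 0x0064 and would decode as major 0, minor 64, so a comparison
against plain 100 does not test for version 1.0.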

Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Fixes: 04abb6de2825 ("xhci: Read and parse new xhci 1.1 capability register")
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agousb: xhci: remove dummy extra_priv_size for size of xhci_hcd struct
Chunfeng Yun [Thu, 9 Mar 2017 13:39:35 +0000 (15:39 +0200)]
usb: xhci: remove dummy extra_priv_size for size of xhci_hcd struct

Because hcd_priv_size is already the size of the xhci_hcd struct,
extra_priv_size is not needed anymore for the MTK and tegra drivers.

Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Tested-by: Thierry Reding <treding@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agousb: xhci-mtk: check hcc_params after adding primary hcd
Chunfeng Yun [Thu, 9 Mar 2017 13:39:34 +0000 (15:39 +0200)]
usb: xhci-mtk: check hcc_params after adding primary hcd

hcc_params is set in xhci_gen_setup() called from usb_add_hcd(),
so check the Maximum Primary Stream Array Size in the hcc_params
register after adding the primary hcd.

Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agoRevert "i2c: copy device properties when using i2c_register_board_info()"
Wolfram Sang [Thu, 9 Mar 2017 15:41:48 +0000 (16:41 +0100)]
Revert "i2c: copy device properties when using i2c_register_board_info()"

This reverts commit b0c1e95ab44feaad8831f2c06a3473c974003b49. It
contains a flaw and the next version has more features added which makes
me want to move it to the next cycle.

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
4 years agoMerge branch 'i2c-mux/for-current' of https://github.com/peda-r/i2c-mux into i2c...
Wolfram Sang [Thu, 9 Mar 2017 15:34:41 +0000 (16:34 +0100)]
Merge branch 'i2c-mux/for-current' of https://github.com/peda-r/i2c-mux into i2c/for-current

4 years agoRevert "i2c: add missing of_node_put in i2c_mux_del_adapters"
Wolfram Sang [Thu, 9 Mar 2017 15:32:17 +0000 (16:32 +0100)]
Revert "i2c: add missing of_node_put in i2c_mux_del_adapters"

This reverts commit 02dbfa5e5583523035f05636c614a0eca77f1aab. I grabbed
the wrong version from the list and will pull the proper one from Peter
Rosin's mux tree.

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
4 years agoi2c: exynos5: Avoid transaction timeouts due TRANSFER_DONE_AUTO not set
Javier Martinez Canillas [Thu, 9 Mar 2017 14:05:33 +0000 (11:05 -0300)]
i2c: exynos5: Avoid transaction timeouts due TRANSFER_DONE_AUTO not set

After commit 7999eecb7e56 ("i2c: exynos5: fix arbitration lost handling"),
some I2C transactions are failing because the TRANSFER_DONE_AUTO field is
not set in the I2C_TRANS_STATUS register, so the i2c->status value is left
at -EINVAL, causing the i2c->msg_complete completion to never be signaled.

For example, when reading the time of an I2C rtc on an Exynos5800 machine:

$ cat /sys/class/rtc/rtc0/time
[   25.924594] exynos5-hsi2c 12e10000.i2c: rx timeout
[   65.028365] max77686-rtc max77802-rtc: Fail to read time reg(-22)
cat: /sys/class/rtc/rtc0/time: Invalid argument

The Exynos5422 manual states clearly that most I2C_TRANS_STATUS reg bits
(including TRANSFER_DONE_AUTO) are cleared after the register is read. So
reading has side effects and should only be done if HSI2C_INT_I2C was set.
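
The clear-on-read hazard can be modelled in a few lines of user-space
C.  This is a sketch under stated assumptions: the struct, the toy_
names and the bit values are invented for illustration and are not the
driver's register definitions.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_INT_I2C        0x1u	/* hypothetical interrupt-status bit */
#define TOY_TRANSFER_DONE  0x2u	/* hypothetical trans-status bit */

struct toy_hsi2c {
	uint32_t int_status;
	uint32_t trans_status;	/* modelled as clear-on-read */
};

/* Reading the status register returns its bits and clears them. */
static uint32_t toy_read_trans_status(struct toy_hsi2c *hw)
{
	uint32_t v = hw->trans_status;

	hw->trans_status = 0;
	return v;
}

/* Returns 1 if the transfer completed, 0 if there is nothing to report. */
static int toy_irq(struct toy_hsi2c *hw)
{
	if (!(hw->int_status & TOY_INT_I2C))
		return 0;	/* leave the clear-on-read register alone */
	hw->int_status &= ~TOY_INT_I2C;
	return (toy_read_trans_status(hw) & TOY_TRANSFER_DONE) != 0;
}
```

If toy_irq() read the status register unconditionally, a done bit that
was set before the interrupt flag would be consumed and lost, which is
the kind of timeout described above.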

Fixes: 7999eecb7e56 ("i2c: exynos5: fix arbitration lost handling")
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
4 years agoMerge tag 'kvm-arm-for-4.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Radim Krčmář [Thu, 9 Mar 2017 14:48:42 +0000 (15:48 +0100)]
Merge tag 'kvm-arm-for-4.11-rc2' of git://git./linux/kernel/git/kvmarm/kvmarm

KVM/ARM updates for v4.11-rc2

vgic updates:
- Honour disabling the ITS
- Don't deadlock when deactivating own interrupts via MMIO
- Correctly expose the lack of IRQ/FIQ bypass on GICv3

I/O virtualization:
- Make KVM_CAP_NR_MEMSLOTS big enough for large guests with
  many PCIe devices

General bug fixes:
- Gracefully handle exceptions generated with syndromes that
  the host doesn't understand
- Properly invalidate TLBs on VHE systems

4 years agoKVM: nVMX: do not warn when MSR bitmap address is not backed
Radim Krčmář [Tue, 7 Mar 2017 16:51:49 +0000 (17:51 +0100)]
KVM: nVMX: do not warn when MSR bitmap address is not backed

Before trying to do nested_get_page() in nested_vmx_merge_msr_bitmap(),
we have already checked that the MSR bitmap address is valid (4k aligned
and within physical limits).  The SDM doesn't specify what happens if
there is no memory mapped at the valid address, but Intel CPUs treat the
situation as if the bitmap was configured to trap all MSRs.

KVM already does that by returning false, and correct handling doesn't
need the guest-triggerable warning that was reported by syzkaller:
(The warning was originally there to catch some possible bugs in nVMX.)

  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 7832 at arch/x86/kvm/vmx.c:9709
  nested_vmx_merge_msr_bitmap arch/x86/kvm/vmx.c:9709 [inline]
  WARNING: CPU: 0 PID: 7832 at arch/x86/kvm/vmx.c:9709
  nested_get_vmcs12_pages+0xfb6/0x15c0 arch/x86/kvm/vmx.c:9640
  Kernel panic - not syncing: panic_on_warn set ...
  CPU: 0 PID: 7832 Comm: syz-executor1 Not tainted 4.10.0+ #229
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  Call Trace:
   __dump_stack lib/dump_stack.c:15 [inline]
   dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
   panic+0x1fb/0x412 kernel/panic.c:179
   __warn+0x1c4/0x1e0 kernel/panic.c:540
   warn_slowpath_null+0x2c/0x40 kernel/panic.c:583
   nested_vmx_merge_msr_bitmap arch/x86/kvm/vmx.c:9709 [inline]
   nested_get_vmcs12_pages+0xfb6/0x15c0 arch/x86/kvm/vmx.c:9640
   enter_vmx_non_root_mode arch/x86/kvm/vmx.c:10471 [inline]
   nested_vmx_run+0x6186/0xaab0 arch/x86/kvm/vmx.c:10561
   handle_vmlaunch+0x1a/0x20 arch/x86/kvm/vmx.c:7312
   vmx_handle_exit+0xfc0/0x3f00 arch/x86/kvm/vmx.c:8526
   vcpu_enter_guest arch/x86/kvm/x86.c:6982 [inline]
   vcpu_run arch/x86/kvm/x86.c:7044 [inline]
   kvm_arch_vcpu_ioctl_run+0x1418/0x4840 arch/x86/kvm/x86.c:7205
   kvm_vcpu_ioctl+0x673/0x1120 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2570

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
[Jim Mattson explained the bare metal behavior: "I believe this behavior
 would be documented in the chipset data sheet rather than the SDM,
 since the chipset returns all 1s for an unclaimed read."]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
4 years agoMerge branch 'pm-cpufreq-sched'
Rafael J. Wysocki [Thu, 9 Mar 2017 14:12:55 +0000 (15:12 +0100)]
Merge branch 'pm-cpufreq-sched'

* pm-cpufreq-sched:
  cpufreq: schedutil: Pass sg_policy to get_next_freq()
  cpufreq: schedutil: move cached_raw_freq to struct sugov_policy

4 years agoMerge branch 'pm-cpufreq'
Rafael J. Wysocki [Thu, 9 Mar 2017 14:12:27 +0000 (15:12 +0100)]
Merge branch 'pm-cpufreq'

* pm-cpufreq:
  cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy
  cpufreq: intel_pstate: Fix intel_pstate_verify_policy()
  cpufreq: intel_pstate: Fix global settings in active mode
  cpufreq: Add the "cpufreq.off=1" cmdline option
  cpufreq: intel_pstate: Avoid triggering cpu_frequency tracepoint unnecessarily
  cpufreq: intel_pstate: Fix intel_cpufreq_verify_policy()
  cpufreq: intel_pstate: Do not use performance_limits in passive mode

4 years agoMerge tag 'irq-fixes-4.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/maz...
Thomas Gleixner [Thu, 9 Mar 2017 11:06:41 +0000 (12:06 +0100)]
Merge tag 'irq-fixes-4.11-rc2' of git://git./linux/kernel/git/maz/arm-platforms into irq/urgent

Pull irqchip/irqdomain updates for 4.11-rc2 from Marc Zyngier

 - irqchip/crossbar: Some type tidying up
 - irqchip/gicv3-its: Workaround for a Qualcomm erratum
 - irqdomain: Compile fix for systems that don't use CONFIG_IRQ_DOMAIN

Fixed up minor conflict in the crossbar driver.

4 years agoMerge tag 'usb-serial-4.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Greg Kroah-Hartman [Thu, 9 Mar 2017 10:14:06 +0000 (11:14 +0100)]
Merge tag 'usb-serial-4.11-rc2' of git://git./linux/kernel/git/johan/usb-serial into usb-linus

Johan writes:

USB-serial fixes for v4.11-rc2

Here's a fix for a digi_acceleport regression in -rc1, and some fixes
for long-standing issues in three other drivers, including a
NULL-pointer dereference and a couple of information leaks that could be
triggered by a malicious device.

Signed-off-by: Johan Hovold <johan@kernel.org>
4 years agoUSB: serial: digi_acceleport: fix OOB-event processing
Johan Hovold [Fri, 24 Feb 2017 18:11:28 +0000 (19:11 +0100)]
USB: serial: digi_acceleport: fix OOB-event processing

A recent change claimed to fix an off-by-one error in the OOB-port
completion handler, but instead introduced such an error. This could
specifically led to modem-status changes going unnoticed, effectively
breaking TIOCMGET.

Note that the offending commit fixes a loop-condition underflow and is
marked for stable, but should not be backported without this fix.

Reported-by: Ben Hutchings <ben@decadent.org.uk>
Fixes: 2d380889215f ("USB: serial: digi_acceleport: fix OOB data sanity check")
Cc: stable <stable@vger.kernel.org> # v2.6.30
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
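
Editor's illustration (not the driver code): a minimal sketch of how a loop
bound over fixed-size records can avoid an unsigned underflow and still be
off by one. The record size of 4 bytes and both function names are
assumptions made for this sketch; the commit message above does not state
them.

```c
#include <stddef.h>

/* Hypothetical fixed record size, assumed only for this sketch. */
#define RECORD_SIZE 4

/* Visits every complete record: the bound "i + RECORD_SIZE <= len"
 * cannot underflow and includes the final record. */
static size_t count_records_correct(size_t len)
{
    size_t n = 0;

    for (size_t i = 0; i + RECORD_SIZE <= len; i += RECORD_SIZE)
        n++;
    return n;
}

/* The length guard prevents "len - RECORD_SIZE" from underflowing, but
 * the strict "<" comparison makes the loop skip the final record. */
static size_t count_records_off_by_one(size_t len)
{
    size_t n = 0;

    if (len < RECORD_SIZE)
        return 0;
    for (size_t i = 0; i < len - RECORD_SIZE; i += RECORD_SIZE)
        n++;
    return n;
}
```

With a buffer holding exactly one record, the second variant processes
nothing, which is the kind of silently dropped event the message describes.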
4 years agoMAINTAINERS: usb251xb: remove reference inexistent file
Richard Leitner [Mon, 6 Mar 2017 08:24:23 +0000 (09:24 +0100)]
MAINTAINERS: usb251xb: remove reference inexistent file

The platform_data header file was dropped in the merged version of the
USB251xB driver. Therefore remove its reference from the MAINTAINERS file.

Signed-off-by: Richard Leitner <richard.leitner@skidata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agodoc: dt-bindings: usb251xb: mark reg as required
Richard Leitner [Mon, 6 Mar 2017 08:24:22 +0000 (09:24 +0100)]
doc: dt-bindings: usb251xb: mark reg as required

Mark the reg property as required and furthermore fix some typos and
spellings in the documentation.

Signed-off-by: Richard Leitner <richard.leitner@skidata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agousb: usb251xb: dt: add unit suffix to oc-delay and power-on-time
Richard Leitner [Mon, 6 Mar 2017 08:24:21 +0000 (09:24 +0100)]
usb: usb251xb: dt: add unit suffix to oc-delay and power-on-time

Rename oc-delay-* to oc-delay-us and make it expect a time value.
Furthermore add -ms suffix to power-on-time. These changes were
suggested by Rob Herring in https://lkml.org/lkml/2017/2/15/1283.

Signed-off-by: Richard Leitner <richard.leitner@skidata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agousb: usb251xb: remove max_{power,current}_{sp,bp} properties
Richard Leitner [Mon, 6 Mar 2017 08:24:20 +0000 (09:24 +0100)]
usb: usb251xb: remove max_{power,current}_{sp,bp} properties

Remove the max_{power,current}_{sp,bp} properties of the usb251xb driver
from devicetree. This is done to simplify the dt bindings as requested
by Rob Herring in https://lkml.org/lkml/2017/2/15/1283. If those
properties are ever needed by somebody they can be enabled again easily.

Signed-off-by: Richard Leitner <richard.leitner@skidata.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agousb-storage: Add ignore-residue quirk for Initio INIC-3619
Tobias Jakobi [Mon, 27 Feb 2017 23:46:58 +0000 (00:46 +0100)]
usb-storage: Add ignore-residue quirk for Initio INIC-3619

This USB-SATA bridge chip is used in a StarTech enclosure for
optical drives.

Without the quirk MakeMKV fails during the key exchange with an
installed BluRay drive:
> Error 'Scsi error - ILLEGAL REQUEST:COPY PROTECTION KEY EXCHANGE FAILURE - KEY NOT ESTABLISHED'
> occurred while issuing SCSI command AD010..080002400 to device 'SG:dev_11:2'

Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agoUSB: iowarrior: fix NULL-deref in write
Johan Hovold [Tue, 7 Mar 2017 15:11:04 +0000 (16:11 +0100)]
USB: iowarrior: fix NULL-deref in write

Make sure to verify that we have the required interrupt-out endpoint for
IOWarrior56 devices to avoid dereferencing a NULL-pointer in write
should a malicious device lack such an endpoint.

Fixes: 946b960d13c1 ("USB: add driver for iowarrior devices.")
Cc: stable <stable@vger.kernel.org> # 2.6.21
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agoUSB: iowarrior: fix NULL-deref at probe
Johan Hovold [Tue, 7 Mar 2017 15:11:03 +0000 (16:11 +0100)]
USB: iowarrior: fix NULL-deref at probe

Make sure to check for the required interrupt-in endpoint to avoid
dereferencing a NULL-pointer should a malicious device lack such an
endpoint.

Note that a fairly recent change purported to fix this issue, but added
an insufficient test on the number of endpoints only, a test which can
now be removed.

Fixes: 4ec0ef3a8212 ("USB: iowarrior: fix oops with malicious USB descriptors")
Fixes: 946b960d13c1 ("USB: add driver for iowarrior devices.")
Cc: stable <stable@vger.kernel.org> # 2.6.21
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
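
Editor's illustration (a userspace sketch, not the iowarrior driver): why
counting endpoints is not enough and the endpoint type and direction have
to be checked. The struct, macro and function names are hypothetical; only
the bit layout (bit 7 of bEndpointAddress for direction, bits 1..0 of
bmAttributes for transfer type, 3 meaning interrupt) comes from the USB
specification.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a USB endpoint descriptor (hypothetical type). */
struct ep_desc {
    uint8_t bEndpointAddress; /* bit 7: 1 = IN, 0 = OUT */
    uint8_t bmAttributes;     /* bits 1..0: transfer type, 3 = interrupt */
};

#define EP_DIR_IN        0x80
#define EP_XFERTYPE_MASK 0x03
#define EP_XFER_INT      0x03

/* Returns the first interrupt endpoint with the requested direction,
 * or NULL if the device does not provide one. */
static const struct ep_desc *find_int_ep(const struct ep_desc *eps,
                                         size_t n, int want_in)
{
    for (size_t i = 0; i < n; i++) {
        int is_int = (eps[i].bmAttributes & EP_XFERTYPE_MASK) == EP_XFER_INT;
        int is_in = (eps[i].bEndpointAddress & EP_DIR_IN) != 0;

        if (is_int && is_in == (want_in != 0))
            return &eps[i];
    }
    return NULL;
}

/* Probe-time check: 0 if the required interrupt-in endpoint exists,
 * -1 otherwise (a real driver would return an errno value instead). */
static int probe_check(const struct ep_desc *eps, size_t n)
{
    return find_int_ep(eps, n, 1) ? 0 : -1;
}
```

A device exposing two bulk endpoints passes a test on the number of
endpoints but fails probe_check(), so the later dereference is never
reached.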
4 years agousb: phy: isp1301: Add OF device ID table
Javier Martinez Canillas [Wed, 22 Feb 2017 18:23:22 +0000 (15:23 -0300)]
usb: phy: isp1301: Add OF device ID table

The driver doesn't have a struct of_device_id table but supported devices
are registered via Device Trees. This is working on the assumption that an
I2C device registered via OF will always match a legacy I2C device ID and
that the MODALIAS reported will always be of the form i2c:<device>.

But this could change in the future so the correct approach is to have an
OF device ID table if the devices are registered via OF.

Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agousb: ohci-at91: Do not drop unhandled USB suspend control requests
Jelle Martijn Kok [Tue, 21 Feb 2017 11:48:18 +0000 (12:48 +0100)]
usb: ohci-at91: Do not drop unhandled USB suspend control requests

In patch 2e2aa1bc7eff90ec, USB suspend and wakeup control requests are
passed to the SFR_OHCIICR register. If a processor does not have such a
register, this hub control request will be dropped.

If no such SFR register is available, all USB suspend control requests
will now be processed using ohci_hub_control()
(as before patch 2e2aa1bc7eff90ec).

Tested on an Atmel AT91SAM9G20 with an on-board TI TUSB2046B hub chip.
If the last USB device is unplugged from the USB hub, the hub goes to
sleep and will not wake up when a USB device is inserted.

Fixes: 2e2aa1bc7eff90ec ("usb: ohci-at91: Forcibly suspend ports while USB suspend")
Signed-off-by: Jelle Martijn Kok <jmkok@youcom.nl>
Tested-by: Wenyou Yang <wenyou.yang@atmel.com>
Cc: Wenyou Yang <wenyou.yang@atmel.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@vger.kernel.org>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agoKVM: arm64: Increase number of user memslots to 512
Linu Cherian [Wed, 8 Mar 2017 06:08:35 +0000 (11:38 +0530)]
KVM: arm64: Increase number of user memslots to 512

Having only 32 memslots is a real constraint for the maximum
number of PCI devices that can be assigned to a single guest.
Assuming each PCI device/virtual function has two memory BAR
regions, we could assign only 15 devices/virtual functions to a
guest.

Hence increase KVM_USER_MEM_SLOTS to 512 as done in other archs like
powerpc.

Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Linu Cherian <linu.cherian@cavium.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
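
Editor's illustration: the arithmetic behind the figure of 15 devices. The
commit message does not say how many slots are taken by the guest itself;
this sketch assumes one slot is reserved for guest RAM, which is an
assumption and may not match a particular VMM. The function name is
hypothetical.

```c
/* Number of devices that fit when each device needs bars_per_device
 * memslots and reserved_slots are already used for other purposes. */
static unsigned int max_assignable_devices(unsigned int total_slots,
                                           unsigned int reserved_slots,
                                           unsigned int bars_per_device)
{
    if (bars_per_device == 0 || reserved_slots >= total_slots)
        return 0;
    return (total_slots - reserved_slots) / bars_per_device;
}
```

Under that assumption, max_assignable_devices(32, 1, 2) gives 15, matching
the message, while max_assignable_devices(512, 1, 2) gives 255.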