sfrench/cifs-2.6.git
4 years ago  Merge tag 'nfs-rdma-4.6-1' of git://git.linux-nfs.org/projects/anna/nfs-rdma
Trond Myklebust [Wed, 16 Mar 2016 20:24:36 +0000 (16:24 -0400)]
Merge tag 'nfs-rdma-4.6-1' of git://git.linux-nfs.org/projects/anna/nfs-rdma

NFS: NFSoRDMA Client Side Changes

These patches include several bugfixes and cleanups for the NFSoRDMA client.
This includes bugfixes for NFS v4.1, proper RDMA_ERROR handling, and fixes
from the recent workqueue switchover.  These patches also switch xprtrdma to
use the new CQ API.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* tag 'nfs-rdma-4.6-1' of git://git.linux-nfs.org/projects/anna/nfs-rdma: (787 commits)
  xprtrdma: Use new CQ API for RPC-over-RDMA client send CQs
  xprtrdma: Use an anonymous union in struct rpcrdma_mw
  xprtrdma: Use new CQ API for RPC-over-RDMA client receive CQs
  xprtrdma: Serialize credit accounting again
  xprtrdma: Properly handle RDMA_ERROR replies
  rpcrdma: Add RPCRDMA_HDRLEN_ERR
  xprtrdma: Do not wait if ib_post_send() fails
  xprtrdma: Segment head and tail XDR buffers on page boundaries
  xprtrdma: Clean up dprintk format string containing a newline
  xprtrdma: Clean up physical_op_map()
  xprtrdma: Clean up unused RPCRDMA_INLINE_PAD_THRESH macro

4 years ago  nfs4: nfs4_ff_layout_prepare_ds should return NULL if connection failed
Jeff Layton [Wed, 24 Feb 2016 20:28:29 +0000 (15:28 -0500)]
nfs4: nfs4_ff_layout_prepare_ds should return NULL if connection failed

I hit the following oops out of the blue while testing with flexfiles:

BUG: unable to handle kernel NULL pointer dereference at 00000000000000e8
IP: [<ffffffffa048f6b8>] nfs4_ff_find_or_create_ds_client+0x48/0x50 [nfs_layout_flexfiles]
PGD 44031067 PUD 5062d067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in: nfsv3 nfs_layout_flexfiles tun rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache dcdbas nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw bonding ipmi_devintf ipmi_msghandler snd_hda_codec_generic virtio_balloon ppdev snd_hda_intel snd_hda_controller snd_hda_codec iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_core parport_pc snd_hwdep parport snd_seq snd_seq_device snd_pcm snd_timer acpi_cpufreq
 snd soundcore i2c_piix4 xfs libcrc32c joydev virtio_net virtio_console qxl drm_kms_helper ttm crc32c_intel drm virtio_pci serio_raw ata_generic virtio_ring virtio pata_acpi
CPU: 0 PID: 19138 Comm: test5 Not tainted 4.1.9-100.pd.90.el7.x86_64 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150714_191134- 04/01/2014
task: ffff88007b70cf00 ti: ffff88004cc44000 task.ti: ffff88004cc44000
RIP: 0010:[<ffffffffa048f6b8>]  [<ffffffffa048f6b8>] nfs4_ff_find_or_create_ds_client+0x48/0x50 [nfs_layout_flexfiles]
RSP: 0018:ffff88004cc47890  EFLAGS: 00010246
RAX: 0000000000000003 RBX: ffff880050932300 RCX: ffff88006978f488
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88003e0e8540
RBP: ffff88004cc47908 R08: 0000000000000000 R09: 0000000000000000
R10: ffff88007ff8c758 R11: 0000000000000005 R12: ffff88003e0e8540
R13: 0000000000000000 R14: ffff88006978f488 R15: ffff88004431cc80
FS:  00007fea40c7c740(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000e8 CR3: 0000000044318000 CR4: 00000000000406f0
Stack:
 ffffffffa048c934 ffff880050932310 0000000100000001 ffff88006978f510
 ffff88006978f3c8 ffff88003e56cd90 ffff88004cc479d0 00000020a052aff0
 000000000004b000 ffff88004cc47908 ffff880050932300 ffff88004cc479d0
Call Trace:
 [<ffffffffa048c934>] ? ff_layout_write_pagelist+0x64/0x220 [nfs_layout_flexfiles]
 [<ffffffffa057a3bf>] pnfs_generic_pg_writepages+0xaf/0x1b0 [nfsv4]
 [<ffffffffa051ab57>] nfs_pageio_doio+0x27/0x60 [nfs]
 [<ffffffffa051bfe4>] nfs_pageio_complete_mirror+0x54/0xa0 [nfs]
 [<ffffffffa051c7ad>] nfs_pageio_complete+0x2d/0x90 [nfs]
 [<ffffffffa052032d>] nfs_writepage_locked+0x8d/0xe0 [nfs]
 [<ffffffff811e4630>] ? page_referenced_one+0x1a0/0x1a0
 [<ffffffffa05210e7>] nfs_wb_single_page+0xf7/0x190 [nfs]
 [<ffffffffa05108d1>] nfs_launder_page+0x41/0x90 [nfs]
 [<ffffffff811b8930>] invalidate_inode_pages2_range+0x340/0x3a0
 [<ffffffff811b89a7>] invalidate_inode_pages2+0x17/0x20
 [<ffffffffa0513e1e>] nfs_release+0x9e/0xb0 [nfs]
 [<ffffffffa050fa1d>] nfs_file_release+0x3d/0x60 [nfs]
 [<ffffffff8122481c>] __fput+0xdc/0x1e0
 [<ffffffff8122496e>] ____fput+0xe/0x10
 [<ffffffff810bde67>] task_work_run+0xa7/0xe0
 [<ffffffff810af735>] get_signal+0x565/0x600
 [<ffffffff811a9815>] ? __filemap_fdatawrite_range+0x65/0x90
 [<ffffffff810144a7>] do_signal+0x37/0x730
 [<ffffffffa0569921>] ? nfs4_file_fsync+0x81/0x150 [nfsv4]
 [<ffffffff81254dbb>] ? vfs_fsync_range+0x3b/0xb0
 [<ffffffff811446a6>] ? __audit_syscall_exit+0x1e6/0x280
 [<ffffffff81014bff>] do_notify_resume+0x5f/0xa0
 [<ffffffff8178ec3c>] int_signal+0x12/0x17
Code: 48 8b 40 70 8b 00 83 f8 03 74 20 83 f8 04 75 13 55 48 89 ce 48 89 d7 48 89 e5 e8 14 0f 0e 00 5d c3 66 90 0f 0b 66 0f 1f 44 00 00 <48> 8b 82 e8 00 00 00 c3 66 66 66 66 90 55 48 89 e5 41 57 41 56
RIP  [<ffffffffa048f6b8>] nfs4_ff_find_or_create_ds_client+0x48/0x50 [nfs_layout_flexfiles]
 RSP <ffff88004cc47890>
CR2: 00000000000000e8

When the DS connection attempt fails, nfs4_ff_layout_prepare_ds marks it
for the error but then just returns the ds as if it were usable. The
comments though say:

  /* Upon return, either ds is connected, or ds is NULL */

Ensure that we set the return pointer to NULL in the event that the
connection attempt fails.
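
The contract quoted above can be sketched in plain userspace C. This is only a model of the idea, not the flexfiles code: `demo_ds`, `demo_connect` and `demo_prepare_ds` are hypothetical names, and the `reachable` flag stands in for whatever decides whether the connection attempt succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a data server (DS); not the kernel's struct. */
struct demo_ds {
	bool connected;     /* set once a connection attempt succeeds */
	bool marked_error;  /* set when a connection attempt has failed */
};

/* Hypothetical connect step: succeeds only if 'reachable' is true. */
static void demo_connect(struct demo_ds *ds, bool reachable)
{
	ds->connected = reachable;
}

/*
 * Upon return, either the DS is connected, or the result is NULL.
 * The failure path records the error and also clears the return value,
 * so callers can never dereference an unusable DS.
 */
struct demo_ds *demo_prepare_ds(struct demo_ds *ds, bool reachable)
{
	if (ds == NULL)
		return NULL;

	if (!ds->connected)
		demo_connect(ds, reachable);

	if (!ds->connected) {
		ds->marked_error = true;
		return NULL;
	}

	assert(ds->connected);  /* invariant promised by the comment above */
	return ds;
}
```

A caller then only needs a single NULL check before using the result.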

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
4 years ago  nfs: remove nfs_inode_dio_wait
Christoph Hellwig [Wed, 2 Mar 2016 16:35:55 +0000 (17:35 +0100)]
nfs: remove nfs_inode_dio_wait

Just call inode_dio_wait directly instead of through a pointless wrapper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
4 years ago  nfs: remove nfs4_file_fsync
Christoph Hellwig [Wed, 2 Mar 2016 16:35:54 +0000 (17:35 +0100)]
nfs: remove nfs4_file_fsync

The only difference to nfs_file_fsync is the call to pnfs_sync_inode.  But
pnfs_sync_inode is just an inline that calls a pNFS layout driver method
if CONFIG_PNFS is defined, and thus can be called just fine from the core
NFS module.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
4 years ago  xprtrdma: Use new CQ API for RPC-over-RDMA client send CQs
Chuck Lever [Fri, 4 Mar 2016 16:28:53 +0000 (11:28 -0500)]
xprtrdma: Use new CQ API for RPC-over-RDMA client send CQs

Calling ib_poll_cq() to sort through WCs during a completion is a
common pattern amongst RDMA consumers. Since commit 14d3a3b2498e
("IB: add a proper completion queue abstraction"), WC sorting can
be handled by the IB core.

By converting to this new API, xprtrdma is made a better neighbor to
other RDMA consumers, as it allows the core to schedule the delivery
of completions more fairly amongst all active consumers.

Because each ib_cqe carries a pointer to a completion method, the
core can now post its own operations on a consumer's QP, and handle
the completions itself, without changes to the consumer.

Send completions were previously handled entirely in the completion
upcall handler (ie, deferring to a process context is unneeded).
Thus IB_POLL_SOFTIRQ is a direct replacement for the current
xprtrdma send code path.
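
The per-request completion method described above can be modeled in userspace C. This is a sketch under assumptions, not the IB core API: every `demo_*` name is hypothetical, and the "core" here is just a loop that dispatches each completion through the handler pointer stored with the request.

```c
#include <assert.h>
#include <stddef.h>

struct demo_wc;

/* Carries the completion method for one posted request. */
struct demo_cqe {
	void (*done)(struct demo_wc *wc);
};

/* One work completion, pointing back at the request's cqe. */
struct demo_wc {
	struct demo_cqe *wr_cqe;
	int status;                /* 0 means success */
};

/* A consumer's send context embeds the cqe so the handler can find it. */
struct demo_send_ctx {
	struct demo_cqe cqe;
	int completed;
	int last_status;
};

#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Consumer-side handler: recovers its context from the embedded cqe. */
void demo_send_done(struct demo_wc *wc)
{
	struct demo_send_ctx *ctx =
		demo_container_of(wc->wr_cqe, struct demo_send_ctx, cqe);

	ctx->completed++;
	ctx->last_status = wc->status;
}

/* The "core": walks completions and calls each one's own handler. */
size_t demo_core_poll(struct demo_wc *wcs, size_t n)
{
	size_t handled = 0;

	for (size_t i = 0; i < n; i++) {
		assert(wcs[i].wr_cqe != NULL && wcs[i].wr_cqe->done != NULL);
		wcs[i].wr_cqe->done(&wcs[i]);
		handled++;
	}
	return handled;
}
```

Because dispatch goes through the pointer carried by each completion, the polling loop needs no knowledge of which consumer posted a given request.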

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Devesh Sharma <devesh.sharma@broadcom.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years ago  xprtrdma: Use an anonymous union in struct rpcrdma_mw
Chuck Lever [Fri, 4 Mar 2016 16:28:45 +0000 (11:28 -0500)]
xprtrdma: Use an anonymous union in struct rpcrdma_mw

Clean up: Make code more readable.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Devesh Sharma <devesh.sharma@broadcom.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years ago  xprtrdma: Use new CQ API for RPC-over-RDMA client receive CQs
Chuck Lever [Fri, 4 Mar 2016 16:28:36 +0000 (11:28 -0500)]
xprtrdma: Use new CQ API for RPC-over-RDMA client receive CQs

Calling ib_poll_cq() to sort through WCs during a completion is a
common pattern amongst RDMA consumers. Since commit 14d3a3b2498e
("IB: add a proper completion queue abstraction"), WC sorting can
be handled by the IB core.

By converting to this new API, xprtrdma is made a better neighbor to
other RDMA consumers, as it allows the core to schedule the delivery
of completions more fairly amongst all active consumers.

Because each ib_cqe carries a pointer to a completion method, the
core can now post its own operations on a consumer's QP, and handle
the completions itself, without changes to the consumer.

xprtrdma's reply processing is already handled in a work queue, but
there is some initial order-dependent processing that is done in the
soft IRQ context before a work item is scheduled.

IB_POLL_SOFTIRQ is a direct replacement for the current xprtrdma
receive code path.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Devesh Sharma <devesh.sharma@broadcom.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years ago  xprtrdma: Serialize credit accounting again
Chuck Lever [Fri, 4 Mar 2016 16:28:27 +0000 (11:28 -0500)]
xprtrdma: Serialize credit accounting again

Commit fe97b47cd623 ("xprtrdma: Use workqueue to process RPC/RDMA
replies") replaced the reply tasklet with a workqueue that allows
RPC replies to be processed in parallel. Thus the credit values in
RPC-over-RDMA replies can be applied in a different order than in
which the server sent them.

To fix this, revert commit eba8ff660b2d ("xprtrdma: Move credit
update to RPC reply handler"). Reverting is done by hand to
accommodate code changes that have occurred since then.
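
Why ordering matters can be shown with a small userspace model. It is a sketch only: `demo_xprt`, `demo_update_credits` and `demo_apply` are hypothetical names, and the `order` array simply simulates the sequence in which replies happen to be processed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a transport's current credit limit. */
struct demo_xprt {
	unsigned int credits;
};

/* Apply the credit grant carried by one reply. */
void demo_update_credits(struct demo_xprt *xprt, unsigned int grant)
{
	assert(grant > 0);
	xprt->credits = grant;
}

/*
 * Apply grants in the sequence given by 'order' (indices into 'grants').
 * Index order 0, 1, 2, ... models the serialized receive path; any other
 * order models parallel workers finishing at different times.
 */
unsigned int demo_apply(const unsigned int *grants, const size_t *order,
			size_t n)
{
	struct demo_xprt xprt = { 1 };

	for (size_t i = 0; i < n; i++)
		demo_update_credits(&xprt, grants[order[i]]);
	return xprt.credits;
}
```

With grants of 32, 16 and 8 sent in that order, serialized processing ends at 8, while an interleaving that handles the last reply first ends at the stale value 16.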

Fixes: fe97b47cd623 ("xprtrdma: Use workqueue to process . . .")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years ago  xprtrdma: Properly handle RDMA_ERROR replies
Chuck Lever [Fri, 4 Mar 2016 16:28:18 +0000 (11:28 -0500)]
xprtrdma: Properly handle RDMA_ERROR replies

These are shorter than RPCRDMA_HDRLEN_MIN, and they need to
complete the waiting RPC.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years ago  rpcrdma: Add RPCRDMA_HDRLEN_ERR
Chuck Lever [Fri, 4 Mar 2016 16:28:09 +0000 (11:28 -0500)]
rpcrdma: Add RPCRDMA_HDRLEN_ERR

Error headers are shorter than either RDMA_MSG or RDMA_NOMSG.

Since HDRLEN_MIN is already used in several other places that would
be annoying to change, add RPCRDMA_HDRLEN_ERR for the one or two
spots where the shorter length is needed.
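
A length check that tolerates the shorter error header might look like the sketch below. The sizes are assumptions counted in 4-byte XDR words (seven for the minimal normal header, five for the shortest error header) and should be checked against the real definitions; all `DEMO_*` and `demo_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes: xid, version, credits, type, then either three empty
 * chunk lists (normal header) or a single error code (error header). */
#define DEMO_HDRLEN_MIN (sizeof(uint32_t) * 7)
#define DEMO_HDRLEN_ERR (sizeof(uint32_t) * 5)

/* Message types used by this sketch; values are assumptions. */
enum demo_proc { DEMO_MSG = 0, DEMO_NOMSG = 1, DEMO_ERROR = 4 };

/* Return true if 'len' bytes are enough to hold a header of this type. */
bool demo_reply_len_ok(enum demo_proc type, size_t len)
{
	size_t need;

	assert(DEMO_HDRLEN_ERR < DEMO_HDRLEN_MIN);
	need = (type == DEMO_ERROR) ? DEMO_HDRLEN_ERR : DEMO_HDRLEN_MIN;
	return len >= need;
}
```

Keeping the shorter limit in its own constant leaves every existing user of the minimum length untouched.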

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Devesh Sharma <devesh.sharma@broadcom.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years ago  xprtrdma: Do not wait if ib_post_send() fails
Chuck Lever [Fri, 4 Mar 2016 16:28:01 +0000 (11:28 -0500)]
xprtrdma: Do not wait if ib_post_send() fails

If ib_post_send() in ro_unmap_sync() fails, the WRs have not been
posted, no completions will fire, and wait_for_completion() will
wait forever. Skip the wait in that case.

To ensure the MRs are invalid, disconnect.
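
The control flow can be sketched in userspace C. Everything here is a hypothetical model (`demo_qp`, `demo_post_send`, `demo_unmap_sync`), with a `post_fails` knob standing in for a failing post; it only illustrates skipping the wait and disconnecting instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical queue pair state, just enough to observe the control flow. */
struct demo_qp {
	bool post_fails;     /* knob: make the post step fail */
	int posted;          /* work requests actually posted */
	bool waited;         /* whether the caller blocked for completions */
	bool disconnected;   /* whether the connection was torn down */
};

/* Post the invalidation requests; returns 0 on success, nonzero on error. */
int demo_post_send(struct demo_qp *qp)
{
	if (qp->post_fails)
		return -1;
	qp->posted++;
	return 0;
}

/* Waiting only makes sense if something was posted. */
void demo_wait_for_completion(struct demo_qp *qp)
{
	assert(qp->posted > 0);
	qp->waited = true;
}

int demo_unmap_sync(struct demo_qp *qp)
{
	int rc = demo_post_send(qp);

	if (rc) {
		/* Nothing was posted, so no completion will ever fire.
		 * Disconnect so the memory regions end up invalid anyway. */
		qp->disconnected = true;
		return rc;
	}
	demo_wait_for_completion(qp);
	return 0;
}
```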

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years ago  xprtrdma: Segment head and tail XDR buffers on page boundaries
Chuck Lever [Fri, 4 Mar 2016 16:27:52 +0000 (11:27 -0500)]
xprtrdma: Segment head and tail XDR buffers on page boundaries

A single memory allocation is used for the pair of buffers wherein
the RPC client builds an RPC call message and decodes its matching
reply. These buffers are sized based on the maximum possible size
of the RPC call and reply messages for the operation in progress.

This means that as the call buffer increases in size, the start of
the reply buffer is pushed farther into the memory allocation.

RPC requests are growing in size. It used to be that both the call
and reply buffers fit inside a single page.

But these days, thanks to NFSv4 (and especially security labels in
NFSv4.2) the maximum call and reply sizes are large. NFSv4.0 OPEN,
for example, now requires a 6KB allocation for a pair of call and
reply buffers, and NFSv4 LOOKUP is not far behind.

As the maximum size of a call increases, the reply buffer is pushed
far enough into the buffer's memory allocation that a page boundary
can appear in the middle of it.

When the maximum possible reply size is larger than the client's
RDMA receive buffers (currently 1KB), the client has to register a
Reply chunk for the server to RDMA Write the reply into.

The logic in rpcrdma_convert_iovs() assumes that xdr_buf head and
tail buffers would always be contained on a single page. It supplies
just one segment for the head and one for the tail.

FMR, for example, registers up to a page boundary (only a portion of
the reply buffer in the OPEN case above). But without additional
segments, it doesn't register the rest of the buffer.

When the server tries to write the OPEN reply, the RDMA Write fails
with a remote access error since the client registered only part of
the Reply chunk.

rpcrdma_convert_iovs() must split the XDR buffer into multiple
segments, each of which is guaranteed not to contain a page
boundary. That way fmr_op_map is given the proper number of segments
to register the whole reply buffer.
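
The splitting step can be sketched as a standalone helper. This is not rpcrdma_convert_iovs() itself: `demo_seg` and `demo_split_on_pages` are hypothetical names, a 4096-byte page is assumed, and addresses are treated as plain integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u   /* assumed page size for this sketch */

struct demo_seg {
	uintptr_t base;   /* start address of the segment */
	size_t len;       /* length in bytes, never crossing a page */
};

/*
 * Split [base, base + len) into segments that each stay within one page.
 * Returns the number of segments written, or 0 if len is 0 or the
 * caller's array is too small.
 */
size_t demo_split_on_pages(uintptr_t base, size_t len,
			   struct demo_seg *segs, size_t max_segs)
{
	size_t n = 0;

	while (len > 0) {
		/* Bytes left before the next page boundary. */
		size_t room = DEMO_PAGE_SIZE - (size_t)(base % DEMO_PAGE_SIZE);
		size_t take = len < room ? len : room;

		if (n == max_segs)
			return 0;
		assert(take > 0);
		segs[n].base = base;
		segs[n].len = take;
		n++;
		base += take;
		len -= take;
	}
	return n;
}
```

For a 6144-byte buffer starting 3072 bytes into a page, this yields three segments of 1024, 4096 and 1024 bytes, where a single-segment approach would have covered only the first 1024.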

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Devesh Sharma <devesh.sharma@broadcom.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years ago  xprtrdma: Clean up dprintk format string containing a newline
Chuck Lever [Fri, 4 Mar 2016 16:27:43 +0000 (11:27 -0500)]
xprtrdma: Clean up dprintk format string containing a newline

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years ago  xprtrdma: Clean up physical_op_map()
Chuck Lever [Fri, 4 Mar 2016 16:27:35 +0000 (11:27 -0500)]
xprtrdma: Clean up physical_op_map()

physical_op_unmap{_sync} don't use mr_nsegs, so don't bother to set
it in physical_op_map.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years ago  xprtrdma: Clean up unused RPCRDMA_INLINE_PAD_THRESH macro
Chuck Lever [Fri, 4 Mar 2016 16:27:26 +0000 (11:27 -0500)]
xprtrdma: Clean up unused RPCRDMA_INLINE_PAD_THRESH macro

Fixes: b3221d6a53c4 ('xprtrdma: Remove logic that constructs...')
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
4 years ago  Linux 4.5-rc6 v4.5-rc6
Linus Torvalds [Sun, 28 Feb 2016 16:41:20 +0000 (08:41 -0800)]
Linux 4.5-rc6

4 years ago  Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 28 Feb 2016 15:52:00 +0000 (07:52 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Thomas Gleixner:
 "A rather largish series of 12 patches addressing a maze of race
  conditions in the perf core code from Peter Zijlstra"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Robustify task_function_call()
  perf: Fix scaling vs. perf_install_in_context()
  perf: Fix scaling vs. perf_event_enable()
  perf: Fix scaling vs. perf_event_enable_on_exec()
  perf: Fix ctx time tracking by introducing EVENT_TIME
  perf: Cure event->pending_disable race
  perf: Fix race between event install and jump_labels
  perf: Fix cloning
  perf: Only update context time when active
  perf: Allow perf_release() with !event->ctx
  perf: Do not double free
  perf: Close install vs. exit race

4 years ago  Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 28 Feb 2016 15:49:23 +0000 (07:49 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "This update contains:

   - Hopefully the last ASM CLAC fixups

   - A fix for the Quark family related to the IMR lock which makes
     kexec work again

   - An off-by-one fix in the MPX code.  Ironic, isn't it?

   - A fix for X86_PAE which addresses once more an unsigned long vs
     phys_addr_t hiccup"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mpx: Fix off-by-one comparison with nr_registers
  x86/mm: Fix slow_virt_to_phys() for X86_PAE again
  x86/entry/compat: Add missing CLAC to entry_INT80_32
  x86/entry/32: Add an ASM_CLAC to entry_SYSENTER_32
  x86/platform/intel/quark: Change the kernel's IMR lock bit to false

4 years ago  Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 28 Feb 2016 15:48:01 +0000 (07:48 -0800)]
Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixlet from Thomas Gleixner:
 "A trivial printk typo fix"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/deadline: Fix trivial typo in printk() message

4 years ago  Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 28 Feb 2016 15:45:58 +0000 (07:45 -0800)]
Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
 "Four small fixes for irqchip drivers:

   - Add missing low level irq handler initialization on mxs, so
     interrupts can actually be delivered

   - Add a missing barrier to the GIC driver

   - Two fixes for the GIC-V3-ITS driver, addressing a double EOI write
     and a cache flush beyond the actual region"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic-v3: Add missing barrier to 32bit version of gic_read_iar()
  irqchip/mxs: Add missing set_handle_irq()
  irqchip/gicv3-its: Avoid cache flush beyond ITS_BASERn memory size
  irqchip/gic-v3-its: Fix double ICC_EOIR write for LPI in EOImode==1

4 years ago  Merge tag 'staging-4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Sun, 28 Feb 2016 15:39:15 +0000 (07:39 -0800)]
Merge tag 'staging-4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging/android fix from Greg KH:
 "Here is one patch, for the android binder driver, to resolve a
  reported problem.  Turns out it has been around for a while (since
  3.15), so it is good to finally get it resolved.

  It has been in linux-next for a while with no reported issues"

* tag 'staging-4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  drivers: android: correct the size of struct binder_uintptr_t for BC_DEAD_BINDER_DONE

4 years ago  Merge tag 'usb-4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Sun, 28 Feb 2016 15:37:30 +0000 (07:37 -0800)]
Merge tag 'usb-4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are a few USB fixes for 4.5-rc6

  They fix a reported bug for some USB 3 devices by reverting the recent
  patch, a MAINTAINERS change for some drivers, some new device ids, and
  of course, the usual bunch of USB gadget driver fixes.

  All have been in linux-next for a while with no reported issues"

* tag 'usb-4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  MAINTAINERS: drop OMAP USB and MUSB maintainership
  usb: musb: fix DMA for host mode
  usb: phy: msm: Trigger USB state detection work in DRD mode
  usb: gadget: net2280: fix endpoint max packet for super speed connections
  usb: gadget: gadgetfs: unregister gadget only if it got successfully registered
  usb: gadget: remove driver from pending list on probe error
  Revert "usb: hub: do not clear BOS field during reset device"
  usb: chipidea: fix return value check in ci_hdrc_pci_probe()
  usb: chipidea: error on overflow for port_test_write
  USB: option: add "4G LTE usb-modem U901"
  USB: cp210x: add IDs for GE B650V3 and B850V3 boards
  USB: option: add support for SIM7100E
  usb: musb: Fix DMA desired mode for Mentor DMA engine
  usb: gadget: fsl_qe_udc: fix IS_ERR_VALUE usage
  usb: dwc2: USB_DWC2 should depend on HAS_DMA
  usb: dwc2: host: fix the data toggle error in full speed descriptor dma
  usb: dwc2: host: fix logical omissions in dwc2_process_non_isoc_desc
  usb: dwc3: Fix assignment of EP transfer resources
  usb: dwc2: Add extra delay when forcing dr_mode

4 years ago  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Sun, 28 Feb 2016 01:10:32 +0000 (17:10 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs fixes from Al Viro.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  do_last(): ELOOP failure exit should be done after leaving RCU mode
  should_follow_link(): validate ->d_seq after having decided to follow
  namei: ->d_inode of a pinned dentry is stable only for positives
  do_last(): don't let a bogus return value from ->open() et.al. to confuse us
  fs: return -EOPNOTSUPP if clone is not supported
  hpfs: don't truncate the file when delete fails

4 years ago  Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Linus Torvalds [Sun, 28 Feb 2016 00:58:32 +0000 (16:58 -0800)]
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
 "We didn't have a batch last week, so this one is slightly larger.

  None of them are scary though, a handful of fixes for small DT pieces,
  replacing properties with newer conventions.

  Highlights:
   - N900 fix for setting system revision
   - onenand init fix to avoid filesystem corruption
   - Clock fix for audio on Beaglebone-x15
   - Fixes on shmobile to deal with CONFIG_DEBUG_RODATA (default y in 4.6)

  + misc smaller stuff"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  MAINTAINERS: Extend info, add wiki and ml for meson arch
  MAINTAINERS: alpine: add a new maintainer and update the entry
  ARM: at91/dt: fix typo in sama5d2 pinmux descriptions
  ARM: OMAP2+: Fix onenand initialization to avoid filesystem corruption
  Revert "regulator: tps65217: remove tps65217.dtsi file"
  ARM: shmobile: Remove shmobile_boot_arg
  ARM: shmobile: Move shmobile_smp_{mpidr, fn, arg}[] from .text to .bss
  ARM: shmobile: r8a7779: Remove remainings of removed SCU boot setup code
  ARM: shmobile: Move shmobile_scu_base from .text to .bss
  ARM: OMAP2+: Fix omap_device for module reload on PM runtime forbid
  ARM: OMAP2+: Improve omap_device error for driver writers
  ARM: DTS: am57xx-beagle-x15: Select SYS_CLK2 for audio clocks
  ARM: dts: am335x/am57xx: replace gpio-key,wakeup with wakeup-source property
  ARM: OMAP2+: Set system_rev from ATAGS for n900
  ARM: dts: orion5x: fix the missing mtd flash on linkstation lswtgl
  ARM: dts: kirkwood: use unique machine name for ds112
  ARM: dts: imx6: remove bogus interrupt-parent from CAAM node

4 years ago  do_last(): ELOOP failure exit should be done after leaving RCU mode
Al Viro [Sun, 28 Feb 2016 00:37:37 +0000 (19:37 -0500)]
do_last(): ELOOP failure exit should be done after leaving RCU mode

... or we risk seeing a bogus value of d_is_symlink() there.

Cc: stable@vger.kernel.org # v4.2+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
4 years ago  should_follow_link(): validate ->d_seq after having decided to follow
Al Viro [Sun, 28 Feb 2016 00:31:01 +0000 (19:31 -0500)]
should_follow_link(): validate ->d_seq after having decided to follow

... otherwise d_is_symlink() above might have nothing to do with
the inode value we've got.

Cc: stable@vger.kernel.org # v4.2+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
4 years ago  namei: ->d_inode of a pinned dentry is stable only for positives
Al Viro [Sun, 28 Feb 2016 00:23:16 +0000 (19:23 -0500)]
namei: ->d_inode of a pinned dentry is stable only for positives

both do_last() and walk_component() risk picking a NULL inode out
of dentry about to become positive, *then* checking its flags and
seeing that it's not negative anymore and using (already stale by
then) value they'd fetched earlier.  Usually ends up oopsing soon
after that...

Cc: stable@vger.kernel.org # v3.13+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
4 years ago  do_last(): don't let a bogus return value from ->open() et.al. to confuse us
Al Viro [Sun, 28 Feb 2016 00:17:33 +0000 (19:17 -0500)]
do_last(): don't let a bogus return value from ->open() et.al. to confuse us

... into returning a positive to path_openat(), which would interpret that
as "symlink had been encountered" and proceed to corrupt memory, etc.
It can only happen due to a bug in some ->open() instance or in some LSM
hook, etc., so we report any such event *and* make sure it doesn't trick
us into further unpleasantness.
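
The defensive check can be sketched in userspace C. This is an illustration, not the namei code: `demo_sanitize_open_result` is a hypothetical helper, the message text is invented, and mapping the bogus value to -EINVAL is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Callers treat 0 as success and negative values as errors. A positive
 * value from a buggy callback is reported and converted to an error so
 * it cannot be mistaken for any other in-band signal further up.
 */
int demo_sanitize_open_result(int error)
{
	if (error > 0) {
		fprintf(stderr, "bogus positive return %d from open method\n",
			error);
		error = -EINVAL;
	}
	assert(error <= 0);
	return error;
}
```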

Cc: stable@vger.kernel.org # v3.6+, at least
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
4 years ago  fs: return -EOPNOTSUPP if clone is not supported
Christoph Hellwig [Fri, 26 Feb 2016 17:53:12 +0000 (18:53 +0100)]
fs: return -EOPNOTSUPP if clone is not supported

-EBADF is a rather confusing error if an operation is not supported,
and nfsd gets rather upset about it.
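
The distinction between the two error codes can be sketched as follows. This is a userspace model with hypothetical names (`demo_file_ops`, `demo_clone`, `demo_clone_ok`); it only shows a bad descriptor and a missing method being reported differently.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical operations table; a NULL method means "not implemented". */
struct demo_file_ops {
	int (*clone_range)(int src_fd, int dst_fd);
};

/* Example method that always succeeds. */
int demo_clone_ok(int src_fd, int dst_fd)
{
	(void)src_fd;
	(void)dst_fd;
	return 0;
}

int demo_clone(const struct demo_file_ops *ops, int src_fd, int dst_fd)
{
	assert(ops != NULL);

	/* A bad descriptor is the caller's mistake. */
	if (src_fd < 0 || dst_fd < 0)
		return -EBADF;

	/* A missing method means the operation is unsupported here. */
	if (ops->clone_range == NULL)
		return -EOPNOTSUPP;

	return ops->clone_range(src_fd, dst_fd);
}
```

A caller can then fall back to an ordinary copy on -EOPNOTSUPP without mistaking it for a descriptor problem.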

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
4 years ago  hpfs: don't truncate the file when delete fails
Mikulas Patocka [Thu, 25 Feb 2016 17:17:38 +0000 (18:17 +0100)]
hpfs: don't truncate the file when delete fails

The delete operation can allocate additional space on the HPFS filesystem
due to btree split. The HPFS driver checks in advance if there is
available space, so that it won't corrupt the btree if we run out of space
during splitting.

If there is not enough available space, the HPFS driver attempted to
truncate the file, but this results in a deadlock since the commit
7dd29d8d865efdb00c0542a5d2c87af8c52ea6c7 ("HPFS: Introduce a global mutex
and lock it on every callback from VFS").

This patch removes the code that tries to truncate the file and -ENOSPC is
returned instead. If the user hits -ENOSPC on delete, he should try to
delete other files (that are stored in a leaf btree node), so that the
delete operation will make some space for deleting the file stored in
non-leaf btree node.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Cc: stable@vger.kernel.org # 2.6.39+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
4 years ago  Merge branch 'akpm' (patches from Andrew)
Linus Torvalds [Sat, 27 Feb 2016 20:46:16 +0000 (12:46 -0800)]
Merge branch 'akpm' (patches from Andrew)

Merge fixes from Andrew Morton:
 "10 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  dax: move writeback calls into the filesystems
  dax: give DAX clearing code correct bdev
  ext4: online defrag not supported with DAX
  ext2, ext4: only set S_DAX for regular inodes
  block: disable block device DAX by default
  ocfs2: unlock inode if deleting inode from orphan fails
  mm: ASLR: use get_random_long()
  drivers: char: random: add get_random_long()
  mm: numa: quickly fail allocations for NUMA balancing on full nodes
  mm: thp: fix SMP race condition between THP page fault and MADV_DONTNEED

4 years ago  Merge tag 'tags/ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 27 Feb 2016 20:40:49 +0000 (12:40 -0800)]
Merge tag 'tags/ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext2/4 DAX fix from Ted Ts'o:
 "This fixes a file system corruption bug with DAX"

* tag 'tags/ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext2, ext4: fix issue with missing journal entry in ext4_dax_mkwrite()

4 years ago  Merge tag 'pci-v4.5-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Linus Torvalds [Sat, 27 Feb 2016 20:33:42 +0000 (12:33 -0800)]
Merge tag 'pci-v4.5-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:
 "Enumeration:
    Revert x86 pcibios_alloc_irq() to fix regression (Bjorn Helgaas)

  Marvell MVEBU host bridge driver:
    Restrict build to 32-bit ARM (Thierry Reding)"

* tag 'pci-v4.5-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: mvebu: Restrict build to 32-bit ARM
  Revert "PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()"
  Revert "PCI: Add helpers to manage pci_dev->irq and pci_dev->irq_managed"
  Revert "x86/PCI: Don't alloc pcibios-irq when MSI is enabled"

4 years ago  ext2, ext4: fix issue with missing journal entry in ext4_dax_mkwrite()
Ross Zwisler [Sat, 27 Feb 2016 19:01:13 +0000 (14:01 -0500)]
ext2, ext4: fix issue with missing journal entry in ext4_dax_mkwrite()

As it is currently written ext4_dax_mkwrite() assumes that the call into
__dax_mkwrite() will not have to do a block allocation so it doesn't create
a journal entry.  For a read that creates a zero page to cover a hole
followed by a write that actually allocates storage this is incorrect.  The
ext4_dax_mkwrite() -> __dax_mkwrite() -> __dax_fault() path calls
get_blocks() to allocate storage.

Fix this by having the ->page_mkwrite fault handler call ext4_dax_fault()
as this function already has all the logic needed to allocate a journal
entry and call __dax_fault().

Also update the ext2 fault handlers in this same way to remove duplicate
code and keep the logic between ext2 and ext4 the same.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
4 years ago  Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 27 Feb 2016 18:30:14 +0000 (10:30 -0800)]
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fix from Stephen Boyd:
 "One small fix to keep OMAP platforms working across a suspend/resume
  cycle"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: ti: omap3+: dpll: use non-locking version of clk_get_rate

4 years ago  dax: move writeback calls into the filesystems
Ross Zwisler [Fri, 26 Feb 2016 23:19:55 +0000 (15:19 -0800)]
dax: move writeback calls into the filesystems

Previously calls to dax_writeback_mapping_range() for all DAX filesystems
(ext2, ext4 & xfs) were centralized in filemap_write_and_wait_range().

dax_writeback_mapping_range() needs a struct block_device, and it used
to get that from inode->i_sb->s_bdev.  This is correct for normal inodes
mounted on ext2, ext4 and XFS filesystems, but is incorrect for DAX raw
block devices and for XFS real-time files.

Instead, call dax_writeback_mapping_range() directly from the filesystem
->writepages function so that it can supply us with a valid block
device.  This also fixes DAX code to properly flush caches in response
to sync(2).

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agodax: give DAX clearing code correct bdev
Ross Zwisler [Fri, 26 Feb 2016 23:19:52 +0000 (15:19 -0800)]
dax: give DAX clearing code correct bdev

dax_clear_blocks() needs a valid struct block_device and previously it
was using inode->i_sb->s_bdev in all cases.  This is correct for normal
inodes on mounted ext2, ext4 and XFS filesystems, but is incorrect for
DAX raw block devices and for XFS real-time devices.

Instead, rename dax_clear_blocks() to dax_clear_sectors(), and change
its arguments to take a bdev and a sector instead of an inode and a
block.  This better reflects what the function does, and it allows the
filesystem and raw block device code to pass in an appropriate struct
block_device.
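
As a rough illustration of the inode/block to bdev/sector change, a caller
that still thinks in filesystem blocks would have to translate a block number
into a 512-byte sector number before calling the renamed helper.  The sketch
below only shows that arithmetic; block_to_sector() and SKETCH_SECTOR_SHIFT
are hypothetical names for illustration and are not taken from the patch.

```c
#include <stdint.h>

#define SKETCH_SECTOR_SHIFT 9   /* assumption: 512-byte sectors */

/*
 * Hypothetical helper: convert a filesystem block number into a 512-byte
 * sector number.  blkbits is log2 of the filesystem block size, e.g. 12
 * for 4096-byte blocks, and must be at least SKETCH_SECTOR_SHIFT.
 */
static uint64_t block_to_sector(uint64_t block, unsigned int blkbits)
{
	return block << (blkbits - SKETCH_SECTOR_SHIFT);
}
```

With 4096-byte blocks each block covers eight sectors, so block 1 starts at
sector 8.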

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoext4: online defrag not supported with DAX
Ross Zwisler [Fri, 26 Feb 2016 23:19:49 +0000 (15:19 -0800)]
ext4: online defrag not supported with DAX

Online defrag operations for ext4 are hard coded to use the page cache.
See ext4_ioctl() -> ext4_move_extents() -> move_extent_per_page()

When combined with DAX I/O, which circumvents the page cache, this can
result in data corruption.  This was observed with xfstests ext4/307 and
ext4/308.

Fix this by only allowing online defrag for non-DAX files.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoext2, ext4: only set S_DAX for regular inodes
Ross Zwisler [Fri, 26 Feb 2016 23:19:46 +0000 (15:19 -0800)]
ext2, ext4: only set S_DAX for regular inodes

When S_DAX is set on an inode we assume that if there are pages attached
to the mapping (mapping->nrpages != 0), those pages are clean zero pages
that were used to service reads from holes.  Any dirty data associated
with the inode should be in the form of DAX exceptional entries
(mapping->nrexceptional) that are written back via
dax_writeback_mapping_range().

With the current code, though, this isn't always true.  For example,
ext2 and ext4 directory inodes can have S_DAX set, but have their dirty
data stored as dirty page cache entries.  For these types of inodes,
having S_DAX set doesn't really make sense since their I/O doesn't
actually happen through the DAX code path.

Instead, only allow S_DAX to be set for regular inodes for ext2 and
ext4.  This allows us to have strict DAX vs non-DAX paths in the
writeback code.
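
The rule described above can be sketched as a small predicate.  This is only
an illustration under assumptions: enum inode_kind and want_dax() are
hypothetical stand-ins, not the ext2/ext4 code, which works on the inode mode
and the mount options.

```c
#include <stdbool.h>

/* Hypothetical inode types, standing in for the S_IFMT bits of i_mode. */
enum inode_kind {
	KIND_REGULAR,
	KIND_DIRECTORY,
	KIND_SYMLINK,
};

/*
 * Hypothetical predicate: the DAX flag is set only when the mount asked
 * for DAX and the inode is a regular file.  Directories and other inode
 * types keep using the page cache.
 */
static bool want_dax(bool mount_has_dax, enum inode_kind kind)
{
	return mount_has_dax && kind == KIND_REGULAR;
}
```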

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoblock: disable block device DAX by default
Dan Williams [Fri, 26 Feb 2016 23:19:43 +0000 (15:19 -0800)]
block: disable block device DAX by default

The recent *sync enabling discovered that we are inserting into the
block_device pagecache counter to the expectations of the dirty data
tracking for dax mappings.  This can lead to data corruption.

We want to support DAX for block devices eventually, but it requires
wider changes to properly manage the pagecache.

   dump_stack+0x85/0xc2
   dax_writeback_mapping_range+0x60/0xe0
   blkdev_writepages+0x3f/0x50
   do_writepages+0x21/0x30
   __filemap_fdatawrite_range+0xc6/0x100
   filemap_write_and_wait+0x4a/0xa0
   set_blocksize+0x70/0xd0
   sb_set_blocksize+0x1d/0x50
   ext4_fill_super+0x75b/0x3360
   mount_bdev+0x180/0x1b0
   ext4_mount+0x15/0x20
   mount_fs+0x38/0x170

Mark the support broken so it's disabled by default, but otherwise still
available for testing.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Suggested-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@fb.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoocfs2: unlock inode if deleting inode from orphan fails
Guozhonghua [Fri, 26 Feb 2016 23:19:40 +0000 (15:19 -0800)]
ocfs2: unlock inode if deleting inode from orphan fails

During append direct I/O cleanup, if deleting the inode from the orphan
directory fails, the code bails out without unlocking the inode, which
causes a deadlock on that inode.

This issue was introduced by commit cf1776a9e834 ("ocfs2: fix a tiny
race when truncate dio orohaned entry").
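
The general shape of such a fix is a single exit label that drops the lock on
both the success and the failure path.  The sketch below is a user-space
illustration only; struct fake_inode and all *_sketch() functions are
hypothetical and do not correspond to the ocfs2 functions.

```c
/* Hypothetical stand-in for an inode with a lock. */
struct fake_inode {
	int locked;	/* 1 while the lock is held */
};

static void inode_lock_sketch(struct fake_inode *in)   { in->locked = 1; }
static void inode_unlock_sketch(struct fake_inode *in) { in->locked = 0; }

/* Returns its argument unchanged so that a caller can force a failure. */
static int delete_from_orphan_sketch(int result)
{
	return result;
}

static int dio_cleanup_sketch(struct fake_inode *in, int delete_result)
{
	int ret;

	inode_lock_sketch(in);
	ret = delete_from_orphan_sketch(delete_result);
	if (ret < 0)
		goto out_unlock;	/* the failure path must still unlock */

	/* ... further cleanup would go here on success ... */

out_unlock:
	inode_unlock_sketch(in);
	return ret;
}
```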

Signed-off-by: Guozhonghua <guozhonghua@h3c.com>
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Gang He <ghe@suse.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: <stable@vger.kernel.org> [4.2+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agomm: ASLR: use get_random_long()
Daniel Cashman [Fri, 26 Feb 2016 23:19:37 +0000 (15:19 -0800)]
mm: ASLR: use get_random_long()

Replace calls to get_random_int() followed by a cast to (unsigned long)
with calls to get_random_long().  Also address a shifting bug which, in
the case of x86, removed the entropy mask for mmap_rnd_bits values > 31
bits.

Signed-off-by: Daniel Cashman <dcashman@android.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: David S. Miller <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nick Kralevich <nnk@google.com>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Mark Salyzyn <salyzyn@android.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agodrivers: char: random: add get_random_long()
Daniel Cashman [Fri, 26 Feb 2016 23:19:34 +0000 (15:19 -0800)]
drivers: char: random: add get_random_long()

Commit d07e22597d1d ("mm: mmap: add new /proc tunable for mmap_base
ASLR") added the ability to choose from a range of values to use for
entropy count in generating the random offset to the mmap_base address.

The maximum value on this range was set to 32 bits for 64-bit x86
systems, but this value could be increased further, requiring more than
the 32 bits of randomness provided by get_random_int(), as is already
possible for arm64.  Add a new function: get_random_long() which more
naturally fits with the mmap usage of get_random_int() but operates
exactly the same as get_random_int().

Also, fix the shifting constant in mmap_rnd() to be an unsigned long so
that values greater than 31 bits generate an appropriate mask without
overflow.  This is especially important on x86, as its shift instruction
uses a 5-bit mask for the shift operand, which meant that any value for
mmap_rnd_bits over 31 acts as a no-op and effectively disables mmap_base
randomization.
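
A minimal user-space sketch of the mask computation, assuming a 64-bit mask
type.  rnd_mask() and apply_rnd_mask() are hypothetical helpers written for
this illustration, not the kernel's mmap_rnd().  The point is that the shifted
constant must be as wide as the mask: in C, shifting a plain int 1 by 32 or
more is undefined behavior, which is why the constant has to be widened.

```c
#include <stdint.h>

/*
 * Hypothetical helper: build a mask with the low `bits` bits set.
 * The constant is 64 bits wide, so shift counts from 32 to 63 are
 * well defined.
 */
static uint64_t rnd_mask(unsigned int bits)
{
	if (bits >= 64)
		return UINT64_MAX;
	return (UINT64_C(1) << bits) - 1;
}

/* Keep only the low `bits` bits of a random value. */
static uint64_t apply_rnd_mask(uint64_t rnd, unsigned int bits)
{
	return rnd & rnd_mask(bits);
}
```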

Finally, replace calls to get_random_int() with get_random_long() where
appropriate.

This patch (of 2):

Add get_random_long().

Signed-off-by: Daniel Cashman <dcashman@android.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: David S. Miller <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nick Kralevich <nnk@google.com>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Mark Salyzyn <salyzyn@android.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agomm: numa: quickly fail allocations for NUMA balancing on full nodes
Mel Gorman [Fri, 26 Feb 2016 23:19:31 +0000 (15:19 -0800)]
mm: numa: quickly fail allocations for NUMA balancing on full nodes

Commit 4167e9b2cf10 ("mm: remove GFP_THISNODE") removed the GFP_THISNODE
flag combination due to confusing semantics.  It noted that
alloc_misplaced_dst_page() was one such user after changes made by
commit e97ca8e5b864 ("mm: fix GFP_THISNODE callers and clarify").

Unfortunately when GFP_THISNODE was removed, users of
alloc_misplaced_dst_page() started waking kswapd and entering direct
reclaim because the wrong GFP flags are cleared.  The consequence is
that workloads that used to fit into memory now get reclaimed which is
addressed by this patch.

The problem can be demonstrated with "mutilate" that exercises memcached
which is software dedicated to memory object caching.  The configuration
uses 80% of memory and is run 3 times for varying numbers of clients.
The results on a 4-socket NUMA box are

mutilate
                            4.4.0                 4.4.0
                          vanilla           numaswap-v1
Hmean    1      8394.71 (  0.00%)     8395.32 (  0.01%)
Hmean    4     30024.62 (  0.00%)    34513.54 ( 14.95%)
Hmean    7     32821.08 (  0.00%)    70542.96 (114.93%)
Hmean    12    55229.67 (  0.00%)    93866.34 ( 69.96%)
Hmean    21    39438.96 (  0.00%)    85749.21 (117.42%)
Hmean    30    37796.10 (  0.00%)    50231.49 ( 32.90%)
Hmean    47    18070.91 (  0.00%)    38530.13 (113.22%)

The metric is queries/second; higher is better.  The results are well
outside the noise, and the reason for the improvement is obvious from
some of the vmstats:

                                 4.4.0       4.4.0
                               vanilla numaswap-v1r1
Minor Faults                1929399272  2146148218
Major Faults                  19746529        3567
Swap Ins                      57307366        9913
Swap Outs                     50623229       17094
Allocation stalls                35909         443
DMA allocs                           0           0
DMA32 allocs                  72976349   170567396
Normal allocs               5306640898  5310651252
Movable allocs                       0           0
Direct pages scanned         404130893      799577
Kswapd pages scanned         160230174           0
Kswapd pages reclaimed        55928786           0
Direct pages reclaimed         1843936       41921
Page writes file                  2391           0
Page writes anon              50623229       17094

The vanilla kernel is swapping like crazy with large amounts of direct
reclaim and kswapd activity.  The figures are aggregate but it's known
that the bad activity is throughout the entire test.

Note that simple streaming anon/file memory consumers also see this
problem but it's not as obvious.  In those cases, kswapd is awake when
it should not be.

As there are at least two reclaim-related bugs out there, it's worth
spelling out the user-visible impact.  This patch only addresses bugs
related to excessive reclaim on NUMA hardware when the working set is
larger than a NUMA node.  There is also a bug related to high kswapd CPU
usage, but those reports are against laptops and other UMA hardware, and
it is not addressed by this patch.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Rientjes <rientjes@google.com>
Cc: <stable@vger.kernel.org> [4.1+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agomm: thp: fix SMP race condition between THP page fault and MADV_DONTNEED
Andrea Arcangeli [Fri, 26 Feb 2016 23:19:28 +0000 (15:19 -0800)]
mm: thp: fix SMP race condition between THP page fault and MADV_DONTNEED

pmd_trans_unstable()/pmd_none_or_trans_huge_or_clear_bad() were
introduced to locklessly (but atomically) detect when a pmd is a regular
(stable) pmd or when the pmd is unstable and can repeatedly transition
between pmd_none() and pmd_trans_huge() from under us, while only
holding the mmap_sem for reading (not for writing).

While holding the mmap_sem only for reading, MADV_DONTNEED can run from
under us and so before we can assume the pmd to be a regular stable pmd
we need to compare it against pmd_none() and pmd_trans_huge() in an
atomic way, with pmd_trans_unstable().  The old pmd_trans_huge() left a
tiny window for a race.
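
The idea of checking both conditions "in an atomic way" can be illustrated in
user space: read the entry once into a local snapshot and test both conditions
on that snapshot, instead of testing the live entry twice.  This is a sketch
under assumptions; the entry encoding (ENTRY_NONE, ENTRY_HUGE_BIT) and
entry_unstable() are hypothetical and do not reflect the real pmd layout.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define ENTRY_NONE	UINT64_C(0)	/* hypothetical "empty" encoding */
#define ENTRY_HUGE_BIT	UINT64_C(1)	/* hypothetical "huge" marker bit */

/*
 * Hypothetical check: take one atomic snapshot of the entry and evaluate
 * both conditions on it.  A concurrent writer can change the entry
 * between two separate reads, but cannot change the local snapshot.
 */
static bool entry_unstable(_Atomic uint64_t *entry)
{
	uint64_t snap = atomic_load(entry);

	return snap == ENTRY_NONE || (snap & ENTRY_HUGE_BIT) != 0;
}
```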

Useful applications are unlikely to notice the difference as doing
MADV_DONTNEED concurrently with a page fault would lead to undefined
behavior.

[akpm@linux-foundation.org: tidy up comment grammar/layout]
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoPCI: mvebu: Restrict build to 32-bit ARM
Thierry Reding [Thu, 18 Feb 2016 13:32:10 +0000 (14:32 +0100)]
PCI: mvebu: Restrict build to 32-bit ARM

This driver uses PCI glue that is only available on 32-bit ARM.  This used
to work fine as long as ARCH_MVEBU and ARCH_DOVE were exclusively 32-bit,
but there's a patch in the pipe to make ARCH_MVEBU also available on 64-bit
ARM.

[bhelgaas: changelog; patch is coming but not merged yet]
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
4 years agoRevert "PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()"
Bjorn Helgaas [Wed, 17 Feb 2016 18:26:42 +0000 (12:26 -0600)]
Revert "PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()"

991de2e59090 ("PCI, x86: Implement pcibios_alloc_irq() and
pcibios_free_irq()") appeared in v4.3 and helps support IOAPIC hotplug.

Олег reported that the Elcus-1553 TA1-PCI driver worked in v4.2 but not
v4.3 and bisected it to 991de2e59090.  Sunjin reported that the RocketRAID
272x driver worked in v4.2 but not v4.3.  In both cases booting with
"pci=routirq" is a workaround.

I think the problem is that after 991de2e59090, we no longer call
pcibios_enable_irq() for upstream bridges.  Prior to 991de2e59090, when a
driver called pci_enable_device(), we recursively called
pcibios_enable_irq() for upstream bridges via pci_enable_bridge().

After 991de2e59090, we call pcibios_enable_irq() from pci_device_probe()
instead of the pci_enable_device() path, which does *not* call
pcibios_enable_irq() for upstream bridges.

Revert 991de2e59090 to fix these driver regressions.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=111211
Fixes: 991de2e59090 ("PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()")
Reported-and-tested-by: Олег Мороз <oleg.moroz@mcc.vniiem.ru>
Reported-by: Sunjin Yang <fan4326@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
CC: Jiang Liu <jiang.liu@linux.intel.com>
4 years agox86/mpx: Fix off-by-one comparison with nr_registers
Colin Ian King [Fri, 26 Feb 2016 18:55:31 +0000 (18:55 +0000)]
x86/mpx: Fix off-by-one comparison with nr_registers

In the unlikely event that regno == nr_registers then we get an array
overrun on regoff because the invalid register check is currently
off-by-one. Fix this with a check that regno is >= nr_registers instead.
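
A small sketch of the bounds check, assuming a four-entry table.
NR_REGISTERS_SKETCH, regoff_sketch and reg_offset() are hypothetical names for
illustration and are not the MPX decoder code.  Valid indices run from 0 to
the table size minus one, so an index equal to the size must be rejected.

```c
#define NR_REGISTERS_SKETCH 4	/* hypothetical table size */

static const int regoff_sketch[NR_REGISTERS_SKETCH] = { 0, 8, 16, 24 };

/*
 * Return the offset for regno, or -1 if regno is out of range.
 * Using ">" instead of ">=" here would let regno == NR_REGISTERS_SKETCH
 * through and read one element past the end of the array.
 */
static int reg_offset(int regno)
{
	if (regno < 0 || regno >= NR_REGISTERS_SKETCH)
		return -1;
	return regoff_sketch[regno];
}
```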

Detected with static analysis using CoverityScan.

Fixes: fcc7ffd67991 "x86, mpx: Decode MPX instruction to get bound violation information"
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1456512931-3388-1-git-send-email-colin.king@canonical.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
4 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph...
Linus Torvalds [Fri, 26 Feb 2016 17:35:03 +0000 (09:35 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client

Pull Ceph fixes from Sage Weil:
 "There are two small messenger bug fixes and a log spam regression fix"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  libceph: don't spam dmesg with stray reply warnings
  libceph: use the right footer size when skipping a message
  libceph: don't bail early from try_read() when skipping a message

4 years agoMerge tag 'sound-4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Fri, 26 Feb 2016 17:27:21 +0000 (09:27 -0800)]
Merge tag 'sound-4.5-rc6' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Things got calmed down for rc6, as it seems, and we have only a few
  HD-audio fixes at this time: a fix for Skylake codec probe errors, a
  fix for missing interrupt handling, and a few Dell and HP quirks"

* tag 'sound-4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - Loop interrupt handling until really cleared
  ALSA: hda - Fix headset support and noise on HP EliteBook 755 G2
  ALSA: hda - Fixup speaker pass-through control for nid 0x14 on ALC225
  ALSA: hda - Fixing background noise on Dell Inspiron 3162
  ALSA: hda - Apply clock gate workaround to Skylake, too

4 years agoMerge tag 'pm+acpi-4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Fri, 26 Feb 2016 17:21:48 +0000 (09:21 -0800)]
Merge tag 'pm+acpi-4.5-rc6' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management and ACPI fixes from Rafael Wysocki:
 "These are two reverts of recent PCI-related ACPI core changes (one of
  which caused some systems to crash on boot and the other was a cleanup
  on top of it) and a devfreq fix for Tegra.

  Specifics:

   - Revert an ACPI core change related to IRQ management in PCI that
     introduced code relying on the use of kmalloc() which turned out to
     also run during early init when that's not available yet and caused
     some systems to crash on boot for this reason along with a cleanup
     on top of it (Rafael Wysocki).

   - Prevent devfreq from flooding the kernel log with useless messages
     on Tegra (which started to happen after some recent changes in the
     devfreq core) by fixing the driver to follow the documentation and
     the core's expectations in its ->target callback (Tomeu Vizoso)"

* tag 'pm+acpi-4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  Revert "ACPI, PCI, irq: remove interrupt count restriction"
  Revert "ACPI / PCI: Simplify acpi_penalize_isa_irq()"
  PM / devfreq: tegra: Set freq in rate callback

4 years agoMerge branches 'pm-devfreq' and 'acpi-pci'
Rafael J. Wysocki [Fri, 26 Feb 2016 12:50:55 +0000 (13:50 +0100)]
Merge branches 'pm-devfreq' and 'acpi-pci'

* pm-devfreq:
  PM / devfreq: tegra: Set freq in rate callback

* acpi-pci:
  Revert "ACPI, PCI, irq: remove interrupt count restriction"
  Revert "ACPI / PCI: Simplify acpi_penalize_isa_irq()"

4 years agoMerge branch 'stable-4.5' of git://git.infradead.org/users/pcmoore/selinux into for...
James Morris [Fri, 26 Feb 2016 08:32:16 +0000 (19:32 +1100)]
Merge branch 'stable-4.5' of git://git.infradead.org/users/pcmoore/selinux into for-linus

4 years agoALSA: hda - Loop interrupt handling until really cleared
Takashi Iwai [Tue, 23 Feb 2016 14:54:47 +0000 (15:54 +0100)]
ALSA: hda - Loop interrupt handling until really cleared

Currently the interrupt handler of HD-audio driver assumes that no irq
update is needed while processing the irq.  But in reality, it has
been confirmed that the HW irq is issued even during the irq
handling.  Since we clear the irq status at the beginning, process the
interrupt, and then exit from the handler, an interrupt issued in the
meantime is left untouched without being properly processed.

This patch changes the interrupt handler code to loop over the
check-and-process.  The handler retries as long as any IRQ status bits
are set and either a stream or CORB/RIRB is handled.

For checking the stream handling, snd_hdac_bus_handle_stream_irq()
returns a value indicating the stream indices bits.  Other than that,
the change is only in the irq handler itself.
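
The check-and-process loop can be illustrated with a simulated controller.
This is a user-space sketch only: struct fake_hw, read_and_clear(), process()
and irq_handler_loop() are hypothetical and do not model the HD-audio
registers.  The `late` field stands for interrupts the hardware raises while
the handler is still running.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simulated controller. */
struct fake_hw {
	uint32_t status;	/* pending IRQ bits */
	uint32_t late;		/* bits raised while the handler runs */
	unsigned int handled;	/* number of bits processed so far */
};

/* Read and clear the status register, as done at handler entry. */
static uint32_t read_and_clear(struct fake_hw *hw)
{
	uint32_t s = hw->status;

	hw->status = 0;
	return s;
}

/* Process each set bit; the simulated hardware raises its late bits. */
static void process(struct fake_hw *hw, uint32_t bits)
{
	for (; bits; bits &= bits - 1)
		hw->handled++;
	hw->status |= hw->late;
	hw->late = 0;
}

/*
 * Loop over check-and-process until no status bit is left set.
 * Returns true if anything was handled.
 */
static bool irq_handler_loop(struct fake_hw *hw)
{
	bool any = false;
	uint32_t s;

	while ((s = read_and_clear(hw)) != 0) {
		process(hw, s);
		any = true;
	}
	return any;
}
```

A handler that reads the status only once would return after the first pass
and leave the late bit pending; the loop picks it up on the second pass.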

Reported-by: Libin Yang <libin.yang@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
4 years agoMerge tag 'trace-fixes-v4.5-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 26 Feb 2016 04:12:09 +0000 (20:12 -0800)]
Merge tag 'trace-fixes-v4.5-rc5-2' of git://git./linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
 "Another small bug reported to me by Chunyu Hu.

  When perf added a "reg" function to the function tracing event (not a
  tracepoint), it caused that event to be displayed as a tracepoint and
  could cause errors in tracepoint handling.  That was solved by adding
  a flag to ignore ftrace non-tracepoint events.  But that flag was
  missed when displaying events in available_events, which should only
  contain tracepoint events.

  This broke a documented way to enable all events with:

      cat available_events > set_event

  As the function non-tracepoint event would cause that to error out.
  The commit here fixes that by having the available_events file not
  list events that have the ignore flag set"

* tag 'trace-fixes-v4.5-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix showing function event in available_events

4 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Fri, 26 Feb 2016 03:53:54 +0000 (19:53 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "KVM/ARM fixes:
   - Fix per-vcpu vgic bitmap allocation
   - Do not copy random memory on MMIO read
   - Fix GICv3 APR register restore order

  KVM/x86 fixes:
   - Fix ubsan warning
   - Fix hardware breakpoints in a guest vs. preempt notifiers
   - Fix Hurd

  Generic:
   - use __GFP_NOWARN together with GFP_NOWAIT"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: MMU: fix ubsan index-out-of-range warning
  arm64: KVM: vgic-v3: Restore ICH_APR0Rn_EL2 before ICH_APR1Rn_EL2
  KVM: async_pf: do not warn on page allocation failures
  KVM: x86: fix conversion of addresses to linear in 32-bit protected mode
  KVM: x86: fix missed hardware breakpoints
  arm/arm64: KVM: Feed initialized memory to MMIO accesses
  KVM: arm/arm64: vgic: Ensure bitmaps are long enough

4 years agoMerge tag 'renesas-sh-drivers-fixes-for-v4.5' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Fri, 26 Feb 2016 03:47:01 +0000 (19:47 -0800)]
Merge tag 'renesas-sh-drivers-fixes-for-v4.5' of git://git./linux/kernel/git/horms/renesas

Pull SuperH driver fix from Simon Horman:
 "Restore legacy clock domain on SuperH platforms"

* tag 'renesas-sh-drivers-fixes-for-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
  drivers: sh: Restore legacy clock domain on SuperH platforms

4 years agoMerge tag 'powerpc-4.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Fri, 26 Feb 2016 03:41:53 +0000 (19:41 -0800)]
Merge tag 'powerpc-4.5-4' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 - eeh: Fix partial hotplug criterion from Gavin Shan
 - mm: Clear the invalid slot information correctly from Aneesh Kumar K.V

* tag 'powerpc-4.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/mm/hash: Clear the invalid slot information correctly
  powerpc/eeh: Fix partial hotplug criterion

4 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Fri, 26 Feb 2016 03:36:33 +0000 (19:36 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux

Pull s390 bugfixes from Martin Schwidefsky:
 "Two critical bug fixes for the signal handling"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/fpu: signals vs. floating point control register
  s390/compat: correct restore of high gprs on signal return

4 years agoMerge tag 'nfsd-4.5-1' of git://linux-nfs.org/~bfields/linux
Linus Torvalds [Fri, 26 Feb 2016 03:31:01 +0000 (19:31 -0800)]
Merge tag 'nfsd-4.5-1' of git://linux-nfs.org/~bfields/linux

Pull nfsd bugfix from Bruce Fields:
 "One fix for a bug that could cause a NULL write past the end of a
  buffer in case of unusually long writes to some system interfaces used
  by mountd and other nfs support utilities"

* tag 'nfsd-4.5-1' of git://linux-nfs.org/~bfields/linux:
  sunrpc/cache: fix off-by-one in qword_get()

4 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Fri, 26 Feb 2016 03:01:42 +0000 (19:01 -0800)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "This is a bit larger than I'd like, but I asked the Intel guys to pull
  in some Skylake fixes in the possibly vain hope that Skylake might be
  more functional now that I'm seeing production hardware shipping.

  For i915, it's mostly the same patch in a few places, making sure the
  hw doesn't turn off when we are programming it.

  Apart from that are two nouveau fixes, one for a module defer bug, and
  one for using nouveau on new Lenovo P50 models.

  Then there are a bunch of AMDGPU fixes, one is a fix for v4.4 vblank
  regressions, and some PM fixes"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (26 commits)
  drm/nouveau/disp/dp: ensure sink is powered up before attempting link training
  drm/nouveau: platform: Fix deferred probe
  drm/amdgpu: disable direct VM updates when vm_debug is set
  amdgpu: fix NULL pointer dereference at tonga_check_states_equal
  drm/i915/gen9: Verify and enforce dc6 state writes
  drm/i915/gen9: Check for DC state mismatch
  drm/radeon/pm: adjust display configuration after powerstate
  drm/amdgpu/pm: adjust display configuration after powerstate
  drm/amdgpu/pm: add some checks for PX
  drm/amdgpu: fix locking in force performance level
  drm/amdgpu/gfx8: fix priv reg interrupt enable
  drm/i915/skl: Ensure HW is powered during DDB HW state readout
  drm/i915/lvds: Ensure the HW is powered during HW state readout
  drm/i915/hdmi: Ensure the HW is powered during HW state readout
  drm/i915/dsi: Ensure the HW is powered during HW state readout
  drm/i915/dp: Ensure the HW is powered during HW state readout
  drm/i915: Ensure the HW is powered when accessing the CRC HW block
  drm/i915/ddi: Ensure the HW is powered during HW state readout
  drm/i915/crt: Ensure the HW is powered during HW state readout
  drm/i915: Ensure the HW is powered during HW access in assert_pipe
  ...

4 years agoMerge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdim...
Linus Torvalds [Fri, 26 Feb 2016 02:54:53 +0000 (18:54 -0800)]
Merge branch 'libnvdimm-fixes' of git://git./linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm fixes from Dan Williams:

 - Two fixes for compatibility with the ACPI 6.1 specification.

   Without these fixes multi-interface DIMMs will fail to be probed, and
   address range scrub commands to find memory errors will give results
   that the kernel will mis-interpret.  For multi-interface DIMMs Linux
   will accept either the original 6.0 implementation or 6.1.

   For address range scrub we'll only support 6.1 since ACPI formalized
   this DSM differently than the original example [1] implemented in
   v4.2.  The expectation is that production systems will only ever ship
   the ACPI 6.1 address range scrub command definition.

 - The wider async address range scrub work targeting 4.6 discovered
   that the original synchronous implementation in 4.5 is not sizing its
   return buffer correctly.

 - Arnd caught that my recent fix to the size of the pfn_t flags missed
   updating the flags variable used in the pmem driver.

 - Toshi found that we mishandle the memremap() return value in
   devm_memremap().

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  nvdimm: use 'u64' for pfn flags
  devm_memremap: Fix error value when memremap failed
  nfit: update address range scrub commands to the acpi 6.1 format
  libnvdimm, tools/testing/nvdimm: fix 'ars_status' output buffer sizing
  nfit: fix multi-interface dimm handling, acpi6.1 compatibility

4 years agoMerge tag 'for-v4.5-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux...
Linus Torvalds [Fri, 26 Feb 2016 02:42:08 +0000 (18:42 -0800)]
Merge tag 'for-v4.5-rc' of git://git./linux/kernel/git/sre/linux-power-supply

Pull power supply fixes from Sebastian Reichel:
 "Add a regression fix for changed sysfs path of bq27xxx_battery and
  update MAINTAINERS file"

* tag 'for-v4.5-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
  power: bq27xxx_battery: Restore device name
  MAINTAINERS: update bq27xxx driver

4 years agox86/mm: Fix slow_virt_to_phys() for X86_PAE again
Dexuan Cui [Thu, 25 Feb 2016 09:58:12 +0000 (01:58 -0800)]
x86/mm: Fix slow_virt_to_phys() for X86_PAE again

"d1cd12108346: x86, pageattr: Prevent overflow in slow_virt_to_phys() for
X86_PAE" was unintentionally removed by the recent "34437e67a672: x86/mm: Fix
slow_virt_to_phys() to handle large PAT bit".

And, the variable 'phys_addr' was defined as "unsigned long" by mistake -- it should
be "phys_addr_t".

As a result, the Hyper-V network driver in a 32-bit PAE Linux guest is broken again.
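
The type problem can be shown in user space with fixed-width integers.  This
is a sketch under assumptions: uint32_t stands in for "unsigned long" on
32-bit x86 and uint64_t for phys_addr_t under PAE; the function names and
SKETCH_PAGE_SHIFT are hypothetical and this is not slow_virt_to_phys().

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * Truncating variant: the result is only 32 bits wide, so a physical
 * address at or above 4 GiB loses its high bits.
 */
static uint32_t pfn_to_phys_32(uint32_t pfn)
{
	return pfn << SKETCH_PAGE_SHIFT;
}

/*
 * Widened variant: convert to a 64-bit type before shifting, so no
 * high bits are lost.
 */
static uint64_t pfn_to_phys_64(uint32_t pfn)
{
	return (uint64_t)pfn << SKETCH_PAGE_SHIFT;
}
```

For page frame 0x100000, the first page at 4 GiB, the 32-bit variant wraps to
0 while the 64-bit variant yields 0x100000000.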

Fixes: commit 34437e67a672: "x86/mm: Fix slow_virt_to_phys() to handle large PAT bit"
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
Cc: olaf@aepfle.de
Cc: gregkh@linuxfoundation.org
Cc: jasowang@redhat.com
Cc: driverdev-devel@linuxdriverproject.org
Cc: linux-mm@kvack.org
Cc: apw@canonical.com
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Link: http://lkml.kernel.org/r/1456394292-9030-1-git-send-email-decui@microsoft.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
4 years agoALSA: hda - Fix headset support and noise on HP EliteBook 755 G2
Takashi Iwai [Thu, 25 Feb 2016 13:31:59 +0000 (14:31 +0100)]
ALSA: hda - Fix headset support and noise on HP EliteBook 755 G2

HP EliteBook 755 G2 with ALC3228 (ALC280) codec [103c:221c] requires
the known fixup (ALC269_FIXUP_HEADSET_MIC) to make the headset mic
work.  It also suffers from the loopback noise problem, so we
should disable the aamix path as well.

Reported-by: Derick Eddington <derick.eddington@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
4 years agoALSA: hda - Fixup speaker pass-through control for nid 0x14 on ALC225
David Henningsson [Thu, 25 Feb 2016 08:37:05 +0000 (09:37 +0100)]
ALSA: hda - Fixup speaker pass-through control for nid 0x14 on ALC225

On one of the machines we enable, we found that the actual speaker volume
did not always correspond to the volume set in alsamixer. This patch
fixes that problem.

This patch was originally written by Kailang @ Realtek; I've rebased it
to fit sound git master.

Cc: stable@vger.kernel.org
BugLink: https://bugs.launchpad.net/bugs/1549660
Co-Authored-By: Kailang <kailang@realtek.com>
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
4 years agoMerge tag 'kvm-arm-for-4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git...
Paolo Bonzini [Thu, 25 Feb 2016 08:53:55 +0000 (09:53 +0100)]
Merge tag 'kvm-arm-for-4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master

KVM/ARM fixes for 4.5-rc6

- Fix per-vcpu vgic bitmap allocation
- Do not copy random memory on MMIO read
- Fix GICv3 APR register restore order

4 years agoKVM: x86: MMU: fix ubsan index-out-of-range warning
Mike Krinkin [Wed, 24 Feb 2016 18:02:31 +0000 (21:02 +0300)]
KVM: x86: MMU: fix ubsan index-out-of-range warning

Ubsan reports the following warning, caused by a typo in the
update_accessed_dirty_bits template. This patch fixes
the typo:

[  168.791851] ================================================================================
[  168.791862] UBSAN: Undefined behaviour in arch/x86/kvm/paging_tmpl.h:252:15
[  168.791866] index 4 is out of range for type 'u64 [4]'
[  168.791871] CPU: 0 PID: 2950 Comm: qemu-system-x86 Tainted: G           O L  4.5.0-rc5-next-20160222 #7
[  168.791873] Hardware name: LENOVO 23205NG/23205NG, BIOS G2ET95WW (2.55 ) 07/09/2013
[  168.791876]  0000000000000000 ffff8801cfcaf208 ffffffff81c9f780 0000000041b58ab3
[  168.791882]  ffffffff82eb2cc1 ffffffff81c9f6b4 ffff8801cfcaf230 ffff8801cfcaf1e0
[  168.791886]  0000000000000004 0000000000000001 0000000000000000 ffffffffa1981600
[  168.791891] Call Trace:
[  168.791899]  [<ffffffff81c9f780>] dump_stack+0xcc/0x12c
[  168.791904]  [<ffffffff81c9f6b4>] ? _atomic_dec_and_lock+0xc4/0xc4
[  168.791910]  [<ffffffff81da9e81>] ubsan_epilogue+0xd/0x8a
[  168.791914]  [<ffffffff81daafa2>] __ubsan_handle_out_of_bounds+0x15c/0x1a3
[  168.791918]  [<ffffffff81daae46>] ? __ubsan_handle_shift_out_of_bounds+0x2bd/0x2bd
[  168.791922]  [<ffffffff811287ef>] ? get_user_pages_fast+0x2bf/0x360
[  168.791954]  [<ffffffffa1794050>] ? kvm_largepages_enabled+0x30/0x30 [kvm]
[  168.791958]  [<ffffffff81128530>] ? __get_user_pages_fast+0x360/0x360
[  168.791987]  [<ffffffffa181b818>] paging64_walk_addr_generic+0x1b28/0x2600 [kvm]
[  168.792014]  [<ffffffffa1819cf0>] ? init_kvm_mmu+0x1100/0x1100 [kvm]
[  168.792019]  [<ffffffff8129e350>] ? debug_check_no_locks_freed+0x350/0x350
[  168.792044]  [<ffffffffa1819cf0>] ? init_kvm_mmu+0x1100/0x1100 [kvm]
[  168.792076]  [<ffffffffa181c36d>] paging64_gva_to_gpa+0x7d/0x110 [kvm]
[  168.792121]  [<ffffffffa181c2f0>] ? paging64_walk_addr_generic+0x2600/0x2600 [kvm]
[  168.792130]  [<ffffffff812e848b>] ? debug_lockdep_rcu_enabled+0x7b/0x90
[  168.792178]  [<ffffffffa17d9a4a>] emulator_read_write_onepage+0x27a/0x1150 [kvm]
[  168.792208]  [<ffffffffa1794d44>] ? __kvm_read_guest_page+0x54/0x70 [kvm]
[  168.792234]  [<ffffffffa17d97d0>] ? kvm_task_switch+0x160/0x160 [kvm]
[  168.792238]  [<ffffffff812e848b>] ? debug_lockdep_rcu_enabled+0x7b/0x90
[  168.792263]  [<ffffffffa17daa07>] emulator_read_write+0xe7/0x6d0 [kvm]
[  168.792290]  [<ffffffffa183b620>] ? em_cr_write+0x230/0x230 [kvm]
[  168.792314]  [<ffffffffa17db005>] emulator_write_emulated+0x15/0x20 [kvm]
[  168.792340]  [<ffffffffa18465f8>] segmented_write+0xf8/0x130 [kvm]
[  168.792367]  [<ffffffffa1846500>] ? em_lgdt+0x20/0x20 [kvm]
[  168.792374]  [<ffffffffa14db512>] ? vmx_read_guest_seg_ar+0x42/0x1e0 [kvm_intel]
[  168.792400]  [<ffffffffa1846d82>] writeback+0x3f2/0x700 [kvm]
[  168.792424]  [<ffffffffa1846990>] ? em_sidt+0xa0/0xa0 [kvm]
[  168.792449]  [<ffffffffa185554d>] ? x86_decode_insn+0x1b3d/0x4f70 [kvm]
[  168.792474]  [<ffffffffa1859032>] x86_emulate_insn+0x572/0x3010 [kvm]
[  168.792499]  [<ffffffffa17e71dd>] x86_emulate_instruction+0x3bd/0x2110 [kvm]
[  168.792524]  [<ffffffffa17e6e20>] ? reexecute_instruction.part.110+0x2e0/0x2e0 [kvm]
[  168.792532]  [<ffffffffa14e9a81>] handle_ept_misconfig+0x61/0x460 [kvm_intel]
[  168.792539]  [<ffffffffa14e9a20>] ? handle_pause+0x450/0x450 [kvm_intel]
[  168.792546]  [<ffffffffa15130ea>] vmx_handle_exit+0xd6a/0x1ad0 [kvm_intel]
[  168.792572]  [<ffffffffa17f6a6c>] ? kvm_arch_vcpu_ioctl_run+0xbdc/0x6090 [kvm]
[  168.792597]  [<ffffffffa17f6bcd>] kvm_arch_vcpu_ioctl_run+0xd3d/0x6090 [kvm]
[  168.792621]  [<ffffffffa17f6a6c>] ? kvm_arch_vcpu_ioctl_run+0xbdc/0x6090 [kvm]
[  168.792627]  [<ffffffff8293b530>] ? __ww_mutex_lock_interruptible+0x1630/0x1630
[  168.792651]  [<ffffffffa17f5e90>] ? kvm_arch_vcpu_runnable+0x4f0/0x4f0 [kvm]
[  168.792656]  [<ffffffff811eeb30>] ? preempt_notifier_unregister+0x190/0x190
[  168.792681]  [<ffffffffa17e0447>] ? kvm_arch_vcpu_load+0x127/0x650 [kvm]
[  168.792704]  [<ffffffffa178e9a3>] kvm_vcpu_ioctl+0x553/0xda0 [kvm]
[  168.792727]  [<ffffffffa178e450>] ? vcpu_put+0x40/0x40 [kvm]
[  168.792732]  [<ffffffff8129e350>] ? debug_check_no_locks_freed+0x350/0x350
[  168.792735]  [<ffffffff82946087>] ? _raw_spin_unlock+0x27/0x40
[  168.792740]  [<ffffffff8163a943>] ? handle_mm_fault+0x1673/0x2e40
[  168.792744]  [<ffffffff8129daa8>] ? trace_hardirqs_on_caller+0x478/0x6c0
[  168.792747]  [<ffffffff8129dcfd>] ? trace_hardirqs_on+0xd/0x10
[  168.792751]  [<ffffffff812e848b>] ? debug_lockdep_rcu_enabled+0x7b/0x90
[  168.792756]  [<ffffffff81725a80>] do_vfs_ioctl+0x1b0/0x12b0
[  168.792759]  [<ffffffff817258d0>] ? ioctl_preallocate+0x210/0x210
[  168.792763]  [<ffffffff8174aef3>] ? __fget+0x273/0x4a0
[  168.792766]  [<ffffffff8174acd0>] ? __fget+0x50/0x4a0
[  168.792770]  [<ffffffff8174b1f6>] ? __fget_light+0x96/0x2b0
[  168.792773]  [<ffffffff81726bf9>] SyS_ioctl+0x79/0x90
[  168.792777]  [<ffffffff82946880>] entry_SYSCALL_64_fastpath+0x23/0xc1
[  168.792780] ================================================================================
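
The warning "index 4 is out of range for type 'u64 [4]'" is typical of an
off-by-one between 1-based levels and 0-based array slots. The sketch below
shows that class of bug under stated assumptions; the struct, the function
and the constant are simplified, hypothetical stand-ins rather than the code
in paging_tmpl.h.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_LEVELS_SKETCH 4  /* hypothetical: four paging levels */

struct walker_sketch {
	uint64_t ptes[MAX_LEVELS_SKETCH];  /* valid indices: 0..3 */
};

/* Levels are numbered 1..4, so level N belongs in slot N - 1.
 * Returns 0 on success, -1 if the level is out of range. */
int store_pte(struct walker_sketch *w, int level, uint64_t pte)
{
	if (level < 1 || level > MAX_LEVELS_SKETCH)
		return -1;
	/* Writing ptes[level] here would overrun the array at level == 4. */
	w->ptes[level - 1] = pte;
	return 0;
}
```

Storing at level 4 lands in ptes[3], the last valid slot; indexing with the
raw level would touch ptes[4], which is what UBSAN flags.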

Signed-off-by: Mike Krinkin <krinkin.m.u@gmail.com>
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agoALSA: hda - Fixing background noise on Dell Inspiron 3162
Kai-Heng Feng [Thu, 25 Feb 2016 07:19:38 +0000 (15:19 +0800)]
ALSA: hda - Fixing background noise on Dell Inspiron 3162

After logging in to the desktop on Dell Inspiron 3162,
a very loud background noise comes from the built-in speaker.
The noise does not go away even if the speaker is muted.

The noise disappears after using the aamix fixup.

Codec: Realtek ALC3234
Address: 0
AFG Function Id: 0x1 (unsol 1)
    Vendor Id: 0x10ec0255
    Subsystem Id: 0x10280725
    Revision Id: 0x100002
    No Modem Function Group found

BugLink: http://bugs.launchpad.net/bugs/1549620
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
4 years agoperf: Robustify task_function_call()
Peter Zijlstra [Wed, 24 Feb 2016 17:45:51 +0000 (18:45 +0100)]
perf: Robustify task_function_call()

Since there is no serialization between task_function_call() doing
task_curr() and the other CPU doing context switches, we could end
up not sending an IPI even if we had to.

And I'm not sure I still buy my own argument we're OK.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dvyukov@google.com
Cc: eranian@google.com
Cc: oleg@redhat.com
Cc: panand@redhat.com
Cc: sasha.levin@oracle.com
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160224174948.340031200@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
4 years agoperf: Fix scaling vs. perf_install_in_context()
Peter Zijlstra [Wed, 24 Feb 2016 17:45:50 +0000 (18:45 +0100)]
perf: Fix scaling vs. perf_install_in_context()

Completely reworks perf_install_in_context() (again!) in order to
ensure that there will be no ctx time hole between add_event_to_ctx()
and any potential ctx_sched_in().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dvyukov@google.com
Cc: eranian@google.com
Cc: oleg@redhat.com
Cc: panand@redhat.com
Cc: sasha.levin@oracle.com
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160224174948.279399438@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
4 years agoperf: Fix scaling vs. perf_event_enable()
Peter Zijlstra [Wed, 24 Feb 2016 17:45:49 +0000 (18:45 +0100)]
perf: Fix scaling vs. perf_event_enable()

Similar to perf_enable_on_exec(), ensure that event timings are
consistent across perf_event_enable().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dvyukov@google.com
Cc: eranian@google.com
Cc: oleg@redhat.com
Cc: panand@redhat.com
Cc: sasha.levin@oracle.com
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160224174948.218288698@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
4 years agoperf: Fix scaling vs. perf_event_enable_on_exec()
Peter Zijlstra [Wed, 24 Feb 2016 17:45:48 +0000 (18:45 +0100)]
perf: Fix scaling vs. perf_event_enable_on_exec()

The recent commit 3e349507d12d ("perf: Fix perf_enable_on_exec() event
scheduling") caused this by moving task_ctx_sched_out() from before
__perf_event_mask_enable() to after it.

The overlooked consequence of that change is that task_ctx_sched_out()
would update the ctx time fields, and now __perf_event_mask_enable()
uses stale time.

In order to fix this, explicitly stop our context's time before
enabling the event(s).

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dvyukov@google.com
Cc: eranian@google.com
Cc: panand@redhat.com
Cc: sasha.levin@oracle.com
Cc: vince@deater.net
Fixes: 3e349507d12d ("perf: Fix perf_enable_on_exec() event scheduling")
Link: http://lkml.kernel.org/r/20160224174948.159242158@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
4 years agoperf: Fix ctx time tracking by introducing EVENT_TIME
Peter Zijlstra [Wed, 24 Feb 2016 17:45:47 +0000 (18:45 +0100)]
perf: Fix ctx time tracking by introducing EVENT_TIME

Currently any ctx_sched_in() call will restart the ctx time tracking.
This means that calls like:

ctx_sched_in(.event_type = EVENT_PINNED);
ctx_sched_in(.event_type = EVENT_FLEXIBLE);

will have a hole in their ctx time tracking. This is likely harmless
but can confuse things a little. By adding EVENT_TIME, we can have the
first ctx_sched_in() (is_active: 0 -> !0) start the time and any
further ctx_sched_in() will leave the timestamps alone.

Secondly, this allows for an early disable like:

ctx_sched_out(.event_type = EVENT_TIME);

which would update the ctx time (if the ctx is active) and any further
calls to ctx_sched_out() would not further modify the ctx time.

For ctx_sched_in() any 0 -> !0 transition will automatically include
EVENT_TIME.

For ctx_sched_out(), any transition that clears EVENT_ALL will
automatically clear EVENT_TIME.

These two rules ensure that under normal circumstances we need not
bother with EVENT_TIME and get natural ctx time behaviour.
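
The two rules above can be modelled in a few lines. This is a minimal
sketch, not the kernel implementation: the flag values, the struct and the
two counters are assumptions made for illustration, and the functions take
the event type as a plain argument.

```c
#include <assert.h>

/* Flag values are assumed for this sketch. */
enum event_type_sketch {
	EVENT_FLEXIBLE = 0x1,
	EVENT_PINNED   = 0x2,
	EVENT_TIME     = 0x4,
	EVENT_ALL      = EVENT_FLEXIBLE | EVENT_PINNED,
};

struct ctx_sketch {
	int is_active;     /* bitmask of the flags above */
	int time_starts;   /* how often ctx time tracking was started */
	int time_updates;  /* how often ctx time was updated on stop */
};

void ctx_sched_in(struct ctx_sketch *ctx, int event_type)
{
	int was_active = ctx->is_active;

	/* Rule 1: any 0 -> !0 transition automatically includes EVENT_TIME. */
	if (!was_active)
		event_type |= EVENT_TIME;

	ctx->is_active |= event_type;

	/* Only the first call starts the clock; later calls leave it alone. */
	if ((event_type & EVENT_TIME) && !(was_active & EVENT_TIME))
		ctx->time_starts++;
}

void ctx_sched_out(struct ctx_sketch *ctx, int event_type)
{
	int was_active = ctx->is_active;

	ctx->is_active &= ~event_type;

	/* Rule 2: clearing all of EVENT_ALL also clears EVENT_TIME. */
	if (!(ctx->is_active & EVENT_ALL))
		ctx->is_active &= ~EVENT_TIME;

	/* The ctx time is updated once, when EVENT_TIME goes away. */
	if ((was_active & EVENT_TIME) && !(ctx->is_active & EVENT_TIME))
		ctx->time_updates++;
}
```

In this model, scheduling in EVENT_PINNED and then EVENT_FLEXIBLE starts the
clock once, and an early ctx_sched_out() with EVENT_TIME updates the time
once, so a later ctx_sched_out() with EVENT_ALL does not touch it again.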

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dvyukov@google.com
Cc: eranian@google.com
Cc: oleg@redhat.com
Cc: panand@redhat.com
Cc: sasha.levin@oracle.com
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160224174948.100446561@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
4 years agoperf: Cure event->pending_disable race
Peter Zijlstra [Wed, 24 Feb 2016 17:45:46 +0000 (18:45 +0100)]
perf: Cure event->pending_disable race

Because event_sched_out() checks event->pending_disable _before_
actually disabling the event, it can happen that the event fires after
it checks but before it gets disabled.

This would leave event->pending_disable set and the queued irq_work
will try and process it.

However, if the event triggered during schedule(), the event might
have been de-scheduled by the time the irq_work runs, and
perf_event_disable_local() will fail.

Fix this by checking event->pending_disable _after_ we call
event->pmu->del(). This depends on the latter being a compiler
barrier, such that the compiler does not lift the load and re-create
the problem.

Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dvyukov@google.com
Cc: eranian@google.com
Cc: oleg@redhat.com
Cc: panand@redhat.com
Cc: sasha.levin@oracle.com
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160224174948.040469884@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
4 years agoperf: Fix race between event install and jump_labels
Peter Zijlstra [Wed, 24 Feb 2016 17:45:45 +0000 (18:45 +0100)]
perf: Fix race between event install and jump_labels

perf_install_in_context() relies upon the context switch hooks to have
scheduled in events when the IPI misses its target -- after all, if
the task has moved from the CPU (or wasn't running at all), it will
have to context switch to run elsewhere.

This however doesn't appear to be happening.

It is possible for the IPI to not happen (task wasn't running) only to
later observe the task running with an inactive context.

The only possible explanation is that the context switch hooks are not
called. Therefore put in a sync_sched() after toggling the jump_label
to guarantee all CPUs will have them enabled before we install an
event.

A simple if (0->1) sync_sched() will not in fact work, because any
further increment can race and complete before the sync_sched().
Therefore we must jump through some hoops.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dvyukov@google.com
Cc: eranian@google.com
Cc: oleg@redhat.com
Cc: panand@redhat.com
Cc: sasha.levin@oracle.com
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160224174947.980211985@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
4 years agoperf: Fix cloning
Peter Zijlstra [Wed, 24 Feb 2016 17:45:44 +0000 (18:45 +0100)]
perf: Fix cloning

Alexander reported that when the 'original' context gets destroyed, no
new clones happen.

This can happen irrespective of the ctx switch optimization: any task
can die, even the parent, and we want to continue monitoring the task
hierarchy until we either close the event or no tasks are left in the
hierarchy.

perf_event_init_context() will attempt to pin the 'parent' context
during clone(). At that point current is the parent, and since current
cannot have exited while executing clone(), its context cannot have
passed through perf_event_exit_task_context(). Therefore
perf_pin_task_context() cannot observe ctx->task == TASK_TOMBSTONE.

However, since inherit_event() does:

if (parent_event->parent)
	parent_event = parent_event->parent;

it looks at the 'original' event when it does: is_orphaned_event().
This can return true if the context that contains this event has
passed through perf_event_exit_task_context(). And thus we'll fail to
clone the perf context.

Fix this by adding a new state: STATE_DEAD, which is set by
perf_release() to indicate that the filedesc (or kernel reference) is
dead and there are no observers for our data left.

Only for STATE_DEAD will is_orphaned_event() be true and inhibit
cloning.

STATE_EXIT is otherwise preserved such that is_event_hup() remains
functional and will report when the observed task hierarchy becomes
empty.
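
The split between the two states can be sketched as follows. This is a
simplified model under stated assumptions, not the kernel code: the enum
values, the struct layout, the nr_children field and the can_inherit()
helper are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum event_state_sketch {
	STATE_ACTIVE,
	STATE_EXIT,  /* the task that owned the event's context has exited */
	STATE_DEAD,  /* perf_release() ran: no observer is left */
};

struct event_sketch {
	enum event_state_sketch state;
	struct event_sketch *parent;  /* the 'original' event, or NULL */
	int nr_children;              /* hypothetical: live inherited events */
};

/* Only a released event counts as orphaned and inhibits cloning. */
bool is_orphaned_event(const struct event_sketch *e)
{
	return e->state == STATE_DEAD;
}

/* Hang-up is still reported once the observed hierarchy is empty. */
bool is_event_hup(const struct event_sketch *e)
{
	return e->state == STATE_EXIT && e->nr_children == 0;
}

/* Mirrors the inherit_event() snippet: look at the original event. */
bool can_inherit(const struct event_sketch *parent_event)
{
	if (parent_event->parent)
		parent_event = parent_event->parent;
	return !is_orphaned_event(parent_event);
}
```

An original event in STATE_EXIT still allows cloning in this model; only
moving it to STATE_DEAD stops inheritance.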

Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dvyukov@google.com
Cc: eranian@google.com
Cc: oleg@redhat.com
Cc: panand@redhat.com
Cc: sasha.levin@oracle.com
Cc: vince@deater.net
Fixes: c6e5b73242d2 ("perf: Synchronously clean up child events")
Link: http://lkml.kernel.org/r/20160224174947.919845295@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
4 years agoperf: Only update context time when active
Peter Zijlstra [Wed, 24 Feb 2016 17:45:43 +0000 (18:45 +0100)]
perf: Only update context time when active

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dvyukov@google.com
Cc: eranian@google.com
Cc: oleg@redhat.com
Cc: panand@redhat.com
Cc: sasha.levin@oracle.com
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160224174947.860690919@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
4 years agoperf: Allow perf_release() with !event->ctx
Peter Zijlstra [Wed, 24 Feb 2016 17:45:42 +0000 (18:45 +0100)]
perf: Allow perf_release() with !event->ctx

In the err_file: fput(event_file) case, the event will not yet have
been attached to a context. However perf_release() does assume it has
been. Cure this.

Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dvyukov@google.com
Cc: eranian@google.com
Cc: oleg@redhat.com
Cc: panand@redhat.com
Cc: sasha.levin@oracle.com
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160224174947.793996260@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
4 years agoperf: Do not double free
Peter Zijlstra [Wed, 24 Feb 2016 17:45:41 +0000 (18:45 +0100)]
perf: Do not double free

In case of: err_file: fput(event_file), we'll end up calling
perf_release() which in turn will free the event.

Do not then free the event _again_.

Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dvyukov@google.com
Cc: eranian@google.com
Cc: oleg@redhat.com
Cc: panand@redhat.com
Cc: sasha.levin@oracle.com
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160224174947.697350349@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
4 years agoperf: Close install vs. exit race
Peter Zijlstra [Wed, 24 Feb 2016 17:45:40 +0000 (18:45 +0100)]
perf: Close install vs. exit race

Consider the following scenario:

  CPU0                                  CPU1

  ctx = find_get_ctx();
                                        perf_event_exit_task_context()
  mutex_lock(&ctx->mutex);
  perf_install_in_context(ctx, ...);
    /* NO-OP */
  mutex_unlock(&ctx->mutex);

  ...

  perf_release()
    WARN_ON_ONCE(event->state != STATE_EXIT);

Since the event doesn't pass through perf_remove_from_context()
(perf_install_in_context() NO-OPs because the ctx is dead), and
perf_event_exit_task_context() will not observe the event because it's
not attached yet, the event->state will not be set.

Solve this by revalidating ctx->task after we acquire ctx->mutex and
failing the event creation as a whole.
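
The revalidation step can be sketched in a single-threaded model. All names
below are hypothetical, the "locked" flag merely marks where
mutex_lock(&ctx->mutex) and mutex_unlock(&ctx->mutex) would sit, and the
-ESRCH return value is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct task_sketch { int pid; };

/* Sentinel marking a context whose task has exited. */
static struct task_sketch tombstone_sketch;
#define TASK_TOMBSTONE_SKETCH (&tombstone_sketch)

struct install_ctx_sketch {
	struct task_sketch *task;
	int nr_events;
	int locked;  /* stands in for ctx->mutex */
};

/* What the exiting task does on the other CPU. */
void exit_task_context(struct install_ctx_sketch *ctx)
{
	ctx->task = TASK_TOMBSTONE_SKETCH;
}

/* Re-check ctx->task after taking the lock; fail creation if it died. */
int install_event(struct install_ctx_sketch *ctx)
{
	int err = 0;

	ctx->locked = 1;                      /* mutex_lock(&ctx->mutex) */
	if (ctx->task == TASK_TOMBSTONE_SKETCH)
		err = -ESRCH;                 /* ctx is dead: do not attach */
	else
		ctx->nr_events++;             /* safe to attach the event */
	ctx->locked = 0;                      /* mutex_unlock(&ctx->mutex) */

	return err;
}
```

If exit_task_context() runs between looking up the context and calling
install_event(), the install fails instead of silently doing nothing, so no
half-created event reaches perf_release().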

Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dvyukov@google.com
Cc: eranian@google.com
Cc: oleg@redhat.com
Cc: panand@redhat.com
Cc: sasha.levin@oracle.com
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160224174947.626853419@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
4 years agox86/entry/compat: Add missing CLAC to entry_INT80_32
Andy Lutomirski [Wed, 24 Feb 2016 20:18:49 +0000 (12:18 -0800)]
x86/entry/compat: Add missing CLAC to entry_INT80_32

This doesn't seem to fix a regression -- I don't think the CLAC was
ever there.

I double-checked in a debugger: entries through the int80 gate do
not automatically clear AC.

Stable maintainers: I can provide a backport to 4.3 and earlier if
needed.  This needs to be backported all the way to 3.10.

Reported-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org> # v3.10 and later
Fixes: 63bcff2a307b ("x86, smap: Add STAC and CLAC instructions to control user space access")
Link: http://lkml.kernel.org/r/b02b7e71ae54074be01fc171cbd4b72517055c0e.1456345086.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
4 years agoMerge branch 'linux-4.5' of git://github.com/skeggsb/linux into drm-fixes
Dave Airlie [Thu, 25 Feb 2016 03:17:50 +0000 (13:17 +1000)]
Merge branch 'linux-4.5' of git://github.com/skeggsb/linux into drm-fixes

Single fix for eDP panel issues on Lenovo P50
* 'linux-4.5' of git://github.com/skeggsb/linux:
  drm/nouveau/disp/dp: ensure sink is powered up before attempting link training

4 years agodrm/nouveau/disp/dp: ensure sink is powered up before attempting link training
Ben Skeggs [Wed, 17 Feb 2016 22:14:19 +0000 (08:14 +1000)]
drm/nouveau/disp/dp: ensure sink is powered up before attempting link training

This can happen under some annoying circumstances, and is a quick fix
until more substantial changes can be made.

Fixes eDP mode changes on (at least) the Lenovo P50.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org
4 years agodrm/nouveau: platform: Fix deferred probe
Thierry Reding [Wed, 24 Feb 2016 17:34:43 +0000 (18:34 +0100)]
drm/nouveau: platform: Fix deferred probe

The error cleanup paths aren't quite correct and will crash upon
deferred probe.

Cc: stable@vger.kernel.org # v4.3+
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
4 years agodrivers: sh: Restore legacy clock domain on SuperH platforms
Geert Uytterhoeven [Wed, 24 Feb 2016 08:43:23 +0000 (09:43 +0100)]
drivers: sh: Restore legacy clock domain on SuperH platforms

CONFIG_ARCH_SHMOBILE is not only enabled for Renesas ARM platforms
(which are DT based and multi-platform), but also on a select set of
Renesas SuperH platforms (SH7722/SH7723/SH7724/SH7343/SH7366). Hence
since commit 0ba58de231066e47 ("drivers: sh: Get rid of
CONFIG_ARCH_SHMOBILE_MULTI"), the legacy clock domain is no longer
installed on these SuperH platforms, and module clocks may not be
enabled when needed, leading to driver failures.

To fix this, add an additional check for CONFIG_OF.

Fixes: 0ba58de231066e47 ("drivers: sh: Get rid of CONFIG_ARCH_SHMOBILE_MULTI").
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
4 years agoMerge tag 'drm-intel-fixes-2016-02-22' of git://anongit.freedesktop.org/drm-intel...
Dave Airlie [Wed, 24 Feb 2016 22:22:43 +0000 (08:22 +1000)]
Merge tag 'drm-intel-fixes-2016-02-22' of git://anongit.freedesktop.org/drm-intel into drm-fixes

This is a bit large, but it really helps with the Skylake bugs we are
seeing on a number of laptops.

Most of the commits are quite similar, ensuring the display power
doesn't vanish under us during hardware access. Also do note that it's
not just Skylake that's affected.

* tag 'drm-intel-fixes-2016-02-22' of git://anongit.freedesktop.org/drm-intel:
  drm/i915/gen9: Verify and enforce dc6 state writes
  drm/i915/gen9: Check for DC state mismatch
  drm/i915/skl: Ensure HW is powered during DDB HW state readout
  drm/i915/lvds: Ensure the HW is powered during HW state readout
  drm/i915/hdmi: Ensure the HW is powered during HW state readout
  drm/i915/dsi: Ensure the HW is powered during HW state readout
  drm/i915/dp: Ensure the HW is powered during HW state readout
  drm/i915: Ensure the HW is powered when accessing the CRC HW block
  drm/i915/ddi: Ensure the HW is powered during HW state readout
  drm/i915/crt: Ensure the HW is powered during HW state readout
  drm/i915: Ensure the HW is powered during HW access in assert_pipe
  drm/i915: Ensure the HW is powered when disabling VGA
  drm/i915/ibx: Ensure the HW is powered during PLL HW readout
  drm/i915: Ensure the HW is powered during display pipe HW readout
  drm/i915: Add helper to get a display power ref if it was already enabled

4 years agoMerge branch 'drm-fixes-4.5' of git://people.freedesktop.org/~agd5f/linux into drm...
Dave Airlie [Wed, 24 Feb 2016 22:21:33 +0000 (08:21 +1000)]
Merge branch 'drm-fixes-4.5' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

A few radeon and amdgpu fixes for 4.5.  A few further fixes for the vblank
regressions in 4.4 and a couple of other minor fixes.

* 'drm-fixes-4.5' of git://people.freedesktop.org/~agd5f/linux:
  drm/amdgpu: disable direct VM updates when vm_debug is set
  amdgpu: fix NULL pointer dereference at tonga_check_states_equal
  drm/radeon/pm: adjust display configuration after powerstate
  drm/amdgpu/pm: adjust display configuration after powerstate
  drm/amdgpu/pm: add some checks for PX
  drm/amdgpu: fix locking in force performance level
  drm/amdgpu/gfx8: fix priv reg interrupt enable
  drm/amdgpu: Don't hang in amdgpu_flip_work_func on disabled crtc.
  drm/radeon: Don't hang in radeon_flip_work_func on disabled crtc. (v2)

4 years agoMerge tag 'arc-4.5-rc6-fixes-upd' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 24 Feb 2016 22:06:17 +0000 (14:06 -0800)]
Merge tag 'arc-4.5-rc6-fixes-upd' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:
 - Fix for csd deadlock due to missing self IPI
 - Accompanying IPI cleanups / optimization
 - Brown paper bag bug in one of the cleanups above
 - Boot reporting updates for new hardware features
 - Don't force DEVTMPFS if INITRAMFS

* tag 'arc-4.5-rc6-fixes-upd' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  arc: SMP: CONFIG_ARC_IPI_DBG cleanup
  ARC: SMP: No need for CONFIG_ARC_IPI_DBG
  ARCv2: Elide sending new cross core intr if receiver didn't ack prev
  ARCv2: SMP: Push IPI_IRQ into IPI provider
  ARC: [intc-compact] Remove IPI setup from ARCompact port
  ARCv2: SMP: Emulate IPI to self using software triggered interrupt
  arc: get rid of DEVTMPFS dependency on INITRAMFS_SOURCE
  ARCv2: boot report CCMs (Closely Coupled Memories)
  ARCv2: boot print Low Latency Memory
  ARC: Assume multiplier is always present

4 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Wed, 24 Feb 2016 22:00:26 +0000 (14:00 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs fixes from Al Viro:
 "Assorted fixes - xattr one from this cycle, the rest - stable fodder"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs/pnode.c: treat zero mnt_group_id-s as unequal
  affs_do_readpage_ofs(): just use kmap_atomic() around memcpy()
  xattr handlers: plug a lock leak in simple_xattr_list
  fs: allow no_seek_end_llseek to actually seek

4 years agolibceph: don't spam dmesg with stray reply warnings
Ilya Dryomov [Fri, 19 Feb 2016 10:38:57 +0000 (11:38 +0100)]
libceph: don't spam dmesg with stray reply warnings

Commit d15f9d694b77 ("libceph: check data_len in ->alloc_msg()")
mistakenly bumped the log level on the "tid %llu unknown, skipping"
message.  Turn it back into a dout() - stray replies are perfectly
normal when OSDs flap, crash, get killed for testing purposes, etc.

Cc: stable@vger.kernel.org # 4.3+
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
4 years agolibceph: use the right footer size when skipping a message
Ilya Dryomov [Fri, 19 Feb 2016 10:38:57 +0000 (11:38 +0100)]
libceph: use the right footer size when skipping a message

ceph_msg_footer is 21 bytes long, while ceph_msg_footer_old is only 13.
Don't skip too much when CEPH_FEATURE_MSG_AUTH isn't negotiated.
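
The size difference can be checked with a small sketch. Only the sizes (21
and 13 bytes) come from the text above; the field layout, the struct names
and the footer_size() helper are assumptions for illustration, and the
packed attribute is a GCC/Clang extension.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of the old footer: three CRCs plus a flags byte. */
struct footer_old_sketch {
	uint32_t front_crc;
	uint32_t middle_crc;
	uint32_t data_crc;
	uint8_t  flags;
} __attribute__((packed));

/* Assumed layout of the new footer: the same plus a 64-bit signature. */
struct footer_sketch {
	uint32_t front_crc;
	uint32_t middle_crc;
	uint32_t data_crc;
	uint64_t sig;
	uint8_t  flags;
} __attribute__((packed));

/* Number of footer bytes to skip, depending on the negotiated feature. */
size_t footer_size(bool msg_auth_negotiated)
{
	return msg_auth_negotiated ? sizeof(struct footer_sketch)
				   : sizeof(struct footer_old_sketch);
}
```

Skipping sizeof(struct footer_sketch) unconditionally would consume 8 bytes
of the next message whenever the feature is not negotiated.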

Cc: stable@vger.kernel.org # 3.19+
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
4 years agolibceph: don't bail early from try_read() when skipping a message
Ilya Dryomov [Wed, 17 Feb 2016 19:04:08 +0000 (20:04 +0100)]
libceph: don't bail early from try_read() when skipping a message

The contract between try_read() and try_write() is that when called
each processes as much data as possible.  When instructed by osd_client
to skip a message, try_read() is violating this contract by returning
after receiving and discarding a single message instead of checking for
more.  try_write() then gets a chance to write out more requests,
generating more replies/skips for try_read() to handle, forcing the
messenger into a starvation loop.
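
The difference in control flow can be sketched in userspace as follows.
This is a simplified model, not the messenger code: msg_sketch and both
functions are hypothetical, and an array of pending messages stands in
for the socket.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pending message; 'skip' models osd_client asking to discard it. */
struct msg_sketch {
	bool skip;
};

/* Problematic shape: returns right after discarding a single message. */
static size_t read_bailing_early(const struct msg_sketch *msgs, size_t n,
				 size_t *delivered)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (msgs[i].skip)
			return i + 1;	/* gives up with input still pending */
		(*delivered)++;
	}
	return i;
}

/* Intended shape: discard the message, then keep checking for more input. */
static size_t read_until_drained(const struct msg_sketch *msgs, size_t n,
				 size_t *delivered)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (msgs[i].skip)
			continue;
		(*delivered)++;
	}
	return i;	/* number of messages consumed */
}
```

With a skipped message at the head of the queue, the first variant
consumes one message and returns, while the second drains everything
that is available before the writer gets another turn.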

Cc: stable@vger.kernel.org # 3.10+
Reported-by: Varada Kari <Varada.Kari@sandisk.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Tested-by: Varada Kari <Varada.Kari@sandisk.com>
Reviewed-by: Alex Elder <elder@linaro.org>
4 years agothp: call pmdp_invalidate() with correct virtual address
Kirill A. Shutemov [Wed, 24 Feb 2016 15:58:03 +0000 (18:58 +0300)]
thp: call pmdp_invalidate() with correct virtual address

Sebastian Ott and Gerald Schaefer reported random crashes on s390.
It was bisected to my THP refcounting patchset.

The problem is that pmdp_invalidate() is called with the wrong virtual
address: the loop over the ptes has already advanced it by HPAGE_PMD_SIZE.

The solution is to introduce a new variable for use in the loop and to
leave 'haddr' untouched.
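
A minimal userspace sketch of the pattern, assuming 4 KiB pages and 512
ptes per huge page (the constants and function names are hypothetical and
only model the arithmetic, not the mm code):

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed base page size */
#define SKETCH_NR_PTES   512UL	/* assumed ptes per huge page (2 MiB total) */

/* Buggy shape: the loop advances 'haddr' itself, so the value used
 * afterwards points one huge page past the intended address. */
static unsigned long addr_after_loop_buggy(unsigned long haddr)
{
	size_t i;

	for (i = 0; i < SKETCH_NR_PTES; i++, haddr += SKETCH_PAGE_SIZE)
		;	/* per-pte work would go here */
	return haddr;
}

/* Fixed shape: a separate cursor walks the ptes; 'haddr' stays intact. */
static unsigned long addr_after_loop_fixed(unsigned long haddr)
{
	unsigned long addr = haddr;
	size_t i;

	for (i = 0; i < SKETCH_NR_PTES; i++, addr += SKETCH_PAGE_SIZE)
		;	/* per-pte work would use 'addr' */
	return haddr;
}
```

Whatever runs after the loop in the fixed variant still sees the start
of the huge page rather than the start of the next one.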

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-and-tested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Reported-and-tested-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agodrm/amdgpu: disable direct VM updates when vm_debug is set
Christian König [Fri, 19 Feb 2016 09:03:03 +0000 (10:03 +0100)]
drm/amdgpu: disable direct VM updates when vm_debug is set

That should make user space bugs more obvious.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
4 years agoamdgpu: fix NULL pointer dereference at tonga_check_states_equal
Bradley Pankow [Tue, 23 Feb 2016 01:11:47 +0000 (20:11 -0500)]
amdgpu: fix NULL pointer dereference at tonga_check_states_equal

The event_data passed from pem_fini was not cleared upon initialization.
This caused NULL checks to pass and cast_const_phw_tonga_power_state to
attempt to dereference an invalid pointer. Clear the event_data in
pem_init and pem_fini before calling pem_handle_event.
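
The effect of clearing the structure can be sketched as below. The
struct layout and function names are hypothetical, not the powerplay
definitions; the point is only that a zeroed structure makes the
handler's NULL check behave as intended.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical event data; the real structure has different fields. */
struct event_data_sketch {
	const void *requested_state;
	const void *current_state;
};

/* Returns 1 if the handler would go on to dereference requested_state. */
static int handle_event_sketch(const struct event_data_sketch *data)
{
	assert(data != NULL);
	return data->requested_state != NULL;
}

/* Clear the on-stack structure before handing it to the handler, so that
 * stale stack contents cannot masquerade as a valid pointer. */
static int fini_sketch(void)
{
	struct event_data_sketch data;

	memset(&data, 0, sizeof(data));
	return handle_event_sketch(&data);
}
```

Without the memset(), the pointer fields hold whatever was left on the
stack, so the NULL check offers no protection.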

Reviewed-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Bradley Pankow <btpankow@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agoarm64: KVM: vgic-v3: Restore ICH_APR0Rn_EL2 before ICH_APR1Rn_EL2
Marc Zyngier [Wed, 17 Feb 2016 10:25:05 +0000 (10:25 +0000)]
arm64: KVM: vgic-v3: Restore ICH_APR0Rn_EL2 before ICH_APR1Rn_EL2

The GICv3 architecture spec says:

Writing to the active priority registers in any order other than
the following order will result in UNPREDICTABLE behavior:
- ICH_AP0R<n>_EL2.
- ICH_AP1R<n>_EL2.

So let's not pointlessly go against the rule...

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
4 years agoMerge tag 'fixes-for-v4.5-rc6' of http://git.kernel.org/pub/scm/linux/kernel/git...
Greg Kroah-Hartman [Wed, 24 Feb 2016 17:04:21 +0000 (09:04 -0800)]
Merge tag 'fixes-for-v4.5-rc6' of http://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-linus

Felipe writes:

usb: fixes for v4.5-rc6

The most important fixes here are:

a) yet another fix to dwc3's EP transfer resource
assignment logic. This time around we will be
pre-allocating transfer resources to avoid any
future issues;

b) two DMA fixes for the old MUSB driver.

c) dwc2's data toggle fix for FS

Other than these, we have a few other minor fixes
elsewhere.

4 years agoMerge tag 'renesas-soc-fixes-for-v4.5' of git://git.kernel.org/pub/scm/linux/kernel...
Olof Johansson [Wed, 24 Feb 2016 16:48:22 +0000 (08:48 -0800)]
Merge tag 'renesas-soc-fixes-for-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes

Renesas ARM Based SoC Fixes for v4.5

* Avoid writing to .text

* tag 'renesas-soc-fixes-for-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
  ARM: shmobile: Remove shmobile_boot_arg
  ARM: shmobile: Move shmobile_smp_{mpidr, fn, arg}[] from .text to .bss
  ARM: shmobile: r8a7779: Remove remainings of removed SCU boot setup code
  ARM: shmobile: Move shmobile_scu_base from .text to .bss

Signed-off-by: Olof Johansson <olof@lixom.net>
4 years agotracing: Fix showing function event in available_events
Steven Rostedt (Red Hat) [Wed, 24 Feb 2016 14:04:24 +0000 (09:04 -0500)]
tracing: Fix showing function event in available_events

The ftrace:function event is only displayed for parsing the function tracer
data. It is not used to enable function tracing, and does not include an
"enable" file in its event directory.

Originally, this event was kept separate from other events because it did
not have a ->reg parameter. But perf added a "reg" parameter for its use
which caused issues, because it made the event available to functions
with which it was not compatible.

Commit 9b63776fa3ca9 "tracing: Do not enable function event with enable"
added a TRACE_EVENT_FL_IGNORE_ENABLE flag that prevented the function event
from being enabled by normal trace events. But this commit missed keeping
the function event from being displayed by the "available_events" file,
which is used to show what events can be enabled by set_event.

One documented way to enable all events is to:

 cat available_events > set_event

But because the function event is displayed in available_events, this
now causes an "Invalid argument" error:

 cat: write error: Invalid argument
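
The listing-side filter can be sketched in userspace as follows. This
is an assumption about the general shape of such a fix, not the actual
patch: event_sketch, SKETCH_FL_IGNORE_ENABLE and list_available() are
hypothetical names modelled on the TRACE_EVENT_FL_IGNORE_ENABLE flag
mentioned above.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_FL_IGNORE_ENABLE 0x1u	/* stand-in for the kernel flag */

/* Hypothetical event record: a name plus flag bits. */
struct event_sketch {
	const char *name;
	unsigned int flags;
};

/* Collect only the events that set_event could actually enable.
 * 'out' must have room for n pointers; returns how many were listed. */
static size_t list_available(const struct event_sketch *ev, size_t n,
			     const struct event_sketch **out)
{
	size_t i, k = 0;

	assert(ev != NULL && out != NULL);
	for (i = 0; i < n; i++) {
		if (ev[i].flags & SKETCH_FL_IGNORE_ENABLE)
			continue;	/* parse-only event: do not advertise it */
		out[k++] = &ev[i];
	}
	return k;
}
```

With the flagged event left out of the listing, feeding the listing
back into set_event no longer includes a name that set_event rejects.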

Reported-by: Chunyu Hu <chuhu@redhat.com>
Fixes: 9b63776fa3ca9 "tracing: Do not enable function event with enable"
Cc: stable@vger.kernel.org # 3.4+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>