git.samba.org - sfrench/cifs-2.6.git/log

Merge branch 'pnfs_generic'

* pnfs_generic:
  NFSv4.1/pNFS: Cleanup constify struct pnfs_layout_range arguments
  NFSv4.1/pnfs: Cleanup copying of pnfs_layout_range structures
  NFSv4.1/pNFS: Cleanup pnfs_mark_matching_lsegs_invalid()
  NFSv4.1/pNFS: Fix a race in initiate_file_draining()
  NFSv4.1/pNFS: pnfs_error_mark_layout_for_return() must always return layout
  NFSv4.1/pNFS: pnfs_mark_matching_lsegs_return() should set the iomode
  NFSv4.1/pNFS: Use nfs4_stateid_copy for copying stateids
  NFSv4.1/pNFS: Don't pass stateids by value to pnfs_send_layoutreturn()
  NFS: Relax requirements in nfs_flush_incompatible
  NFSv4.1/pNFS: Don't queue up a new commit if the layout segment is invalid
  NFS: Allow multiple commit requests in flight per file
  NFS/pNFS: Fix up pNFS write reschedule layering violations and bugs
  NFSv4: List stateid information in the callback tracepoints
  NFSv4.1/pNFS: Don't return NFS4ERR_DELAY unnecessarily in CB_LAYOUTRECALL
  NFSv4.1/pNFS: Ensure we enforce RFC5661 Section 12.5.5.2.1
  pNFS: If we have to delay the layout callback, mark the layout for return
  NFSv4.1/pNFS: Add a helper to mark the layout as returned
  pNFS: Ensure nfs4_layoutget_prepare returns the correct error

NFSv4.1/pNFS: Cleanup constify struct pnfs_layout_range arguments

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFSv4.1/pnfs: Cleanup copying of pnfs_layout_range structures

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFSv4.1/pNFS: Cleanup pnfs_mark_matching_lsegs_invalid()

Make it more obvious what we're returning...

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFSv4.1/pNFS: Fix a race in initiate_file_draining()

Peng Tao points out that the call to pnfs_mark_matching_lsegs_return()
could race with pnfs_put_lseg(), in which case the layout segment is
cleared, but no layoutreturn will be sent.
Fix is to replace the call to pnfs_mark_matching_lsegs_invalid().

Reported-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFSv4.1/pNFS: pnfs_error_mark_layout_for_return() must always return layout

Fix a bug whereby if all the layout segments could be immediately freed,
the call to pnfs_error_mark_layout_for_return() would never result in
a layoutreturn.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFSv4.1/pNFS: pnfs_mark_matching_lsegs_return() should set the iomode

If pnfs_mark_matching_lsegs_return() needs to mark a layout segment for
return, then it must also set the return iomode.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFSv4.1/pNFS: Use nfs4_stateid_copy for copying stateids

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFSv4.1/pNFS: Don't pass stateids by value to pnfs_send_layoutreturn()

A stateid is a structure, pass it as a pointer.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Relax requirements in nfs_flush_incompatible

If two processes share the same credentials and NFSv4 open stateid, then
allow them both to dirty the same page, even if their nfs_open_context
differs.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFSv4.1/pNFS: Don't queue up a new commit if the layout segment is invalid

If the layout segment is invalid, then we should not be adding more
write requests to the commit list. Instead, those writes should be
replayed after requesting a new layout.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS: Allow multiple commit requests in flight per file

Allow synchronous RPC calls to wait for pending RPC calls to finish,
but also allow asynchronous ones to just fire off another commit.

With this patch, the xfstests generic/074 test completes in 226s
instead of 242s

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS/pNFS: Fix up pNFS write reschedule layering violations and bugs

The flexfiles layout in particular, seems to want to poke around in the
O_DIRECT flags when retransmitting.
This patch sets up an interface to allow it to call back into O_DIRECT
to handle retransmission correctly. It also fixes a potential bug whereby
we could change the behaviour of O_DIRECT if an error is already pending.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

Merge branch 'flexfiles'

* flexfiles:
  pNFS/flexfiles: Ensure we record layoutstats even if RPC is terminated early
  pNFS: Add flag to track if we've called nfs4_ff_layout_stat_io_start_read/write
  pNFS/flexfiles: Fix a statistics gathering imbalance
  pNFS/flexfiles: Don't mark the entire layout as failed, when returning it
  pNFS/flexfiles: Don't prevent flexfiles client from retrying LAYOUTGET
  pnfs/flexfiles: count io stat in rpc_count_stats callback
  pnfs/flexfiles: do not mark delay-like status as DS failure
  NFS41: map NFS4ERR_LAYOUTUNAVAILABLE to ENODATA
  nfs: only remove page from mapping if launder_page fails
  nfs: handle request add failure properly
  nfs: centralize pgio error cleanup
  nfs: clean up rest of reqs when failing to add one
  NFS41: pop some layoutget errors to application
  pNFS/flexfiles: Support server-supplied layoutstats sampling period

Merge tag 'nfs-rdma-4.5' of git://git.linux-nfs.org/projects/anna/nfs-rdma

NFS: NFSoRDMA Client Side Changes

These patches mostly fix send queue ordering issues inside the NFSoRDMA
client, but there are also two patches from Dan Carpenter fixing up smatch
warnings.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* tag 'nfs-rdma-4.5' of git://git.linux-nfs.org/projects/anna/nfs-rdma:
  xprtrdma: Revert commit e7104a2a9606 ('xprtrdma: Cap req_cqinit').
  xprtrdma: Invalidate in the RPC reply handler
  xprtrdma: Add ro_unmap_sync method for all-physical registration
  xprtrdma: Add ro_unmap_sync method for FMR
  xprtrdma: Add ro_unmap_sync method for FRWR
  xprtrdma: Introduce ro_unmap_sync method
  xprtrdma: Move struct ib_send_wr off the stack
  xprtrdma: Disable RPC/RDMA backchannel debugging messages
  xprtrdma: xprt_rdma_free() must not release backchannel reqs
  xprtrdma: Fix additional uses of spin_lock_irqsave(rb_lock)
  xprtrdma: checking for NULL instead of IS_ERR()
  xprtrdma: clean up some curly braces

NFSv4: List stateid information in the callback tracepoints

The stateid is extremely valuable when debugging.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFSv4.1/pNFS: Don't return NFS4ERR_DELAY unnecessarily in CB_LAYOUTRECALL

If the client is promising to return the layout ASAP, then there is no
need to return DELAY and have the server retry. Instead default to the
normal procedure described in RFC5661.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFSv4.1/pNFS: Ensure we enforce RFC5661 Section 12.5.5.2.1

The RFC requires us to check if the server is recalling a stateid that we
haven't yet received. If so, tell it to wait.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

pNFS: If we have to delay the layout callback, mark the layout for return

If the client needs to delay the layout callback, then speed up the recall
process by marking the remaining layout segments to be actively returned
by the client.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFSv4.1/pNFS: Add a helper to mark the layout as returned

This ensures that we don't reuse the stateid if a layout return or
implied layout return means that we've returned all layout segments

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

pNFS: Ensure nfs4_layoutget_prepare returns the correct error

If we're unable to perform the layoutget due to an invalid open stateid
or a bulk recall, ensure that we return the error so that the caller
can decide on an appropriate action.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

pNFS/flexfiles: Ensure we record layoutstats even if RPC is terminated early

Currently, we will only record the layoutstats correctly if the
RPC call successfully obtains a slot. If we exit before that
happens, then we may find ourselves starting the busy timer through
the call in ff_layout_(read|write)_prepare_layoutstats, but never stopping it.

The same thing happens if we're doing DA-DS.

The fix is to ensure that we catch these cases in the rpc_release()
callback.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

pNFS: Add flag to track if we've called nfs4_ff_layout_stat_io_start_read/write

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

pNFS/flexfiles: Fix a statistics gathering imbalance

When we replay a failed read, write or commit to the dataserver, we
need to ensure that we call ff_layout_read_prepare_v3(),
ff_layout_write_prepare_v3 or ff_layout_commit_prepare_v3() so that we
reset the statistics.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

pNFS/flexfiles: Don't mark the entire layout as failed, when returning it

In pNFS/flexfiles, we want to return the layout without necessarily marking
it as having completely failed. We therefore move the call to
pnfs_layout_io_set_failed() out of pnfs_error_mark_layout_for_return(),
and then ensura that pNFS/files layout calls it separately.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

pNFS/flexfiles: Don't prevent flexfiles client from retrying LAYOUTGET

Fix a bug in which flexfiles clients are falling back to I/O through the
MDS even when the FF_FLAGS_NO_IO_THRU_MDS flag is set.

The flexfiles client will always report errors through the LAYOUTRETURN
and/or LAYOUTERROR mechanisms, so it should normally be safe for it
to retry the LAYOUTGET until it fails or succeeds.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

pnfs/flexfiles: count io stat in rpc_count_stats callback

If client ever restarts IO due to some errors, we'll endup
mis-counting IO stats if we do the counting in .rpc_done
callback. Move it to .rpc_count_stats callback that is only
called when releasing RPC.

Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

pnfs/flexfiles: do not mark delay-like status as DS failure

We just need to delay and retry in these cases.

Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS41: map NFS4ERR_LAYOUTUNAVAILABLE to ENODATA

Instead of mapping it to EIO that is a fatal error and
fails application. We'll go inband after getting
NFS4ERR_LAYOUTUNAVAILABLE.

Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

nfs: only remove page from mapping if launder_page fails

Instead of dropping pages when write fails, only do it when
we get fatal failure in launder_page write back.

Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

nfs: handle request add failure properly

When we fail to queue a read page to IO descriptor,
we need to clean it up otherwise it is hanging around
preventing nfs module from being removed.

When we fail to queue a write page to IO descriptor,
we need to clean it up and also save the failure status
to open context. Then at file close, we can try to write
pages back again and drop the page if it fails to writeback
in .launder_page, which will be done in the next patch.

Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

nfs: centralize pgio error cleanup

In case we fail during setting things up for read/write IO, set
pg_error in IO descriptor and do the cleanup in nfs_pageio_add_request,
where we clean up all pages that are still hanging around on the IO
descriptor.

Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

nfs: clean up rest of reqs when failing to add one

If we fail to set up things before sending anything over wire,
we need to clean up the reqs that are still attached to the
IO descriptor.

Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFS41: pop some layoutget errors to application

For ERESTARTSYS/EIO/EROFS/ENOSPC/E2BIG in layoutget, we
should just bail out instead of hiding the error and
retrying inband IO.

Change all the call sites to pop the error all the way up.

Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

pNFS/flexfiles: Support server-supplied layoutstats sampling period

Some servers want to be able to control the frequency with which clients
report layoutstats, for instance, in order to monitor QoS for a particular
file or set of file. In order to support this, the flexfiles layout allows
the server to pass this info as a hint in the layout payload.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

SUNRPC: drop unused xs_reclassify_socketX() helpers

xs_reclassify_socket4() and friends used to be called directly.
xs_reclassify_socket() is called instead nowadays.

The xs_reclassify_socketX() helper functions are empty when
CONFIG_DEBUG_LOCK_ALLOC is not defined. Drop them since they have no
callers.

Note that AF_LOCAL still calls xs_reclassify_socketu() directly but is
easily converted to generic xs_reclassify_socket().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

nfs: machine credential support for additional operations

Allow LAYOUTRETURN and DELEGRETURN to use machine credentials if the
server supports it. Add request for OPEN_DOWNGRADE as the close path
also uses that.

Signed-off-by: Andrew Elble <aweits@rit.edu>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

nfs: do not initialise statics to 0

This patch fixes the checkpatch.pl error to nfs4sysctl.c:

ERROR: do not initialise statics to 0

Signed-off-by: Wei Tang <tangwei@cmss.chinamobile.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

pNFS: Modify pnfs_update_layout tracepoints to use layout stateid

Instead of displaying a layout segment pointer in these tracepoints,
let's use the layout stateid, now that Olga gave us a set of tools for
displaying them.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFSv4: Fix unused variable warnings in nfs4_init_*_client_string()

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

nfs: add new tracepoint for pnfs_update_layout

pnfs_update_layout is really the "nexus" of layout handling. If it
returns NULL then we end up going through the MDS. This patch adds
some tracepoints to that function that allow us to determine the
cause when we end up going through the MDS unexpectedly.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

Adding tracepoint to cached open

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

Adding stateid information to tracepoints

Operations to which stateid information is added:
close, delegreturn, open, read, setattr, layoutget, layoutcommit, test_stateid,
write, lock, locku, lockt

Format is "stateid=<seqid>:<crc32 hash stateid.other>", also "openstateid=",
"layoutstateid=", and "lockstateid=" for open_file, layoutget, set_lock
tracepoints.

New function is added to internal.h, nfs_stateid_hash(), to compute the hash

trace_nfs4_setattr() is moved from nfs4_do_setattr() to _nfs4_do_setattr()
to get access to stateid.

trace_nfs4_setattr and trace_nfs4_delegreturn are changed from INODE_EVENT
to new event type, INODE_STATEID_EVENT which is same as INODE_EVENT but adds
stateid information

for locking tracepoints, moved trace_nfs4_set_lock() into _nfs4_do_setlk()
to get access to stateid information, and removed trace_nfs4_lock_reclaim(),
trace_nfs4_lock_expired() as they call into _nfs4_do_setlk() and both were
previously same LOCK_EVENT type.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

Linux 4.4-rc7

Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus

Pull MIPS fixes from Ralf Baechle:

- Fix bitrot in __get_user_unaligned()
- EVA userspace accessor bug fixes.
- Fix for build issues with certain toolchains.
- Fix build error for VDSO with particular toolchain versions.
- Fix build error due to a variable that should have been removed by an
   earlier patch

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: Fix bitrot in __get_user_unaligned()
  MIPS: Fix build error due to unused variables.
  MIPS: VDSO: Fix build error
  MIPS: CPS: drop .set mips64r2 directives
  MIPS: uaccess: Take EVA into account in [__]clear_user
  MIPS: uaccess: Take EVA into account in __copy_from_user()
  MIPS: uaccess: Fix strlen_user with EVA

Merge tag 'armsoc-fixes' of git://git./linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
"A smallish set of fixes that we've been sitting on for a while now,
  flushing the queue here so they go in.  Summary:

  A handful of fixes for OMAP, i.MX, Allwinner and Tegra:

   - A clock rate and a PHY setup fix for i.MX6Q/DL
   - A couple of fixes for the reduced serial bus (sunxi-rsb) on
     Allwinner
   - UART wakeirq fix for an OMAP4 board, timer config fixes for AM43XX.
   - Suspend fix for Tegra124 Chromebooks
   - Fix for missing implicit include that's different between
     ARM/ARM64"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: tegra: Fix suspend hang on Tegra124 Chromebooks
  bus: sunxi-rsb: Fix peripheral IC mapping runtime address
  bus: sunxi-rsb: Fix primary PMIC mapping hardware address
  ARM: dts: Fix UART wakeirq for omap4 duovero parlor
  ARM: OMAP2+: AM43xx: select ARM TWD timer
  ARM: OMAP2+: am43xx: enable GENERIC_CLOCKEVENTS_BROADCAST
  fsl-ifc: add missing include on ARM64
  ARM: dts: imx6: Fix Ethernet PHY mode on Ventana boards
  ARM: dts: imx: Fix the assigned-clock mismatch issue on imx6q/dl
  bus: sunxi-rsb: unlock on error in sunxi_rsb_read()
  ARM: dts: sunxi: sun6i-a31s-primo81.dts: add touchscreen axis swapping property

MIPS: Fix bitrot in __get_user_unaligned()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

Merge tag 'pm+acpi-4.4-rc7' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management and ACPI fixes from Rafael Wysocki:
"These fix an ACPI processor driver regression introduced during the
  4.3 cycle and a mistake in the recently added SCPI support in the
  arm_big_little cpufreq driver.

  Specifics:

   - Fix a thermal management issue introduced by an ACPI processor
     driver change made during the 4.3 development cycle that failed to
     return 0 from a function on success which triggered an error
     cleanup path every time it had been called that deleted useful data
     structures created previously (Srinivas Pandruvada).

   - Fix a variable data type issue in the arm_big_little cpufreq
     driver's SCPI support code added recently that prevents error
     handling in there from working correctly (Dan Carpenter)"

* tag 'pm+acpi-4.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: scpi-cpufreq: signedness bug in scpi_get_dvfs_info()
  ACPI / processor: Fix thermal cooling device regression

Merge tag 'md/4.4-rc6-fix' of git://neil.brown.name/md

Pull md bugfix from Neil Brown:
"One more md fix for 4.4-rc

Fix a regression which causes reshape to not start properly sometimes"

* tag 'md/4.4-rc6-fix' of git://neil.brown.name/md:
md: remove check for MD_RECOVERY_NEEDED in action_store.

Merge tag 'upstream-4.4-rc7' of git://git.infradead.org/linux-ubifs

Pull UBI bug fixes from Richard Weinberger:
"This contains four bug fixes for UBI"

* tag 'upstream-4.4-rc7' of git://git.infradead.org/linux-ubifs:
  mtd: ubi: don't leak e if schedule_erase() fails
  mtd: ubi: fixup error correction in do_sync_erase()
  UBI: fix use of "VID" vs. "EC" in header self-check
  UBI: fix return error code

Merge tag 'trace-v4.4-rc4-2' of git://git./linux/kernel/git/rostedt/linux-trace

Pull ftrace/recordmcount fix from Steven Rostedt:
"Russell King was reporting lots of warnings when he compiled his
  kernel with ftrace enabled.  With some investigation it was discovered
  that it was his compile setup.  He was using ccache with hard links,
  which allowed recordmcount to process the same .o twice.  When this
  happens, recordmcount will detect that it was already done and give a
  warning about it.

  Russell fixed this by having recordmcount detect that the object file
  has more than one hard link, and if it does, it unlinks the object
  file after it maps it and processes then.  This appears to fix the
  issue.

  As you did not like the fact that recordmcount modified the file in
  place and thought that it should do the modifications in memory and
  then write it out to disk and move it over the old file to prevent
  other more subtle issues like the one above, a second patch is added
  on top of Russell's to do just that.  Luckily the original code had
  write and lseek wrappers that I was able to modify to not do inplace
  writes, but simply keep track of the changes made in memory.  When a
  write is made, a "update" flag is set, and at the end of processing,
  if the update is set, then it writes the file with changes out to a
  new file, and then renames it over the original one.

  The file descriptor is still passed to the write and lseek wrappers
  because removing that would cause the change to be more intrusive.
  That can be removed in a follow up cleanup patch that can wait till
  the next merge window"

* tag 'trace-v4.4-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace/scripts: Have recordmcount copy the object file
  scripts: recordmcount: break hardlinks

Merge tag 'arc-4.4-rc7-fixes' of git://git./linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:
"Sorry for this late pull request, but these are all important fixes
  for code introduced/updated in this release which we will otherwise
  end up back porting.

   - Unwinder rework (A revert followed by better fix)
   - Build errors: MMUv2, modules with -Os
   - highmem section mismatch build splat"

* tag 'arc-4.4-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: dw2 unwind: Catch Dwarf SNAFUs early
  ARC: dw2 unwind: Don't bail for CIE.version != 1
  Revert "ARC: dw2 unwind: Ignore CIE version !=1 gracefully instead of bailing"
  ARC: Fix linking errors with CONFIG_MODULE + CONFIG_CC_OPTIMIZE_FOR_SIZE
  ARC: mm: fix building for MMU v2
  ARC: mm: HIGHMEM: Fix section mismatch splat

Merge branches 'acpi-processor' and 'pm-cpufreq'

* acpi-processor:
ACPI / processor: Fix thermal cooling device regression

* pm-cpufreq:
cpufreq: scpi-cpufreq: signedness bug in scpi_get_dvfs_info()

Merge branch 'parisc-4.4-4' of git://git./linux/kernel/git/deller/parisc-linux

Pull parisc system call restart fix from Helge Deller:
"The architectural design of parisc always uses two instructions to
  call kernel syscalls (delayed branch feature).  This means that the
  instruction following the branch (located in the delay slot of the
  branch instruction) is executed before control passes to the branch
  destination.

  Depending on which assembler instruction and how it is used in
  usersapce in the delay slot, this sometimes made restarted syscalls
  like futex() and poll() failing with -ENOSYS"

* 'parisc-4.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Fix syscall restarts

Merge git://git./linux/kernel/git/davem/sparc

Pull sparc fixes from David Miller:

1) Finally make perf stack backtraces stable on sparc, several problems
    (mostly due to the context in which the user copies from the stack
    are done) contributed to this.

    From Rob Gardner.

2) Export ADI capability if the cpu supports it.

3) Hook up userfaultfd system call.

4) When faults happen during user copies we really have to clean up and
    restore the FPU state fully.  Also from Rob Gardner

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  tty/serial: Skip 'NULL' char after console break when sysrq enabled
  sparc64: fix FP corruption in user copy functions
  sparc64: Perf should save/restore fault info
  sparc64: Ensure perf can access user stacks
  sparc64: Don't set %pil in rtrap_nmi too early
  sparc64: Add ADI capability to cpu capabilities
  tty: serial: constify sunhv_ops structs
  sparc: Hook up userfaultfd system call

tty/serial: Skip 'NULL' char after console break when sysrq enabled

When sysrq is triggered from console, serial driver for SUN hypervisor
console receives a console break and enables the sysrq. It expects a valid
sysrq char following with break. Meanwhile if driver receives 'NULL'
ASCII char then it disables sysrq and sysrq handler will never be invoked.

This fix skips calling uart sysrq handler when 'NULL' is received while
sysrq is enabled.

Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
Acked-by: Karl Volz <karl.volz@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sparc64: fix FP corruption in user copy functions

Short story: Exception handlers used by some copy_to_user() and
copy_from_user() functions do not diligently clean up floating point
register usage, and this can result in a user process seeing invalid
values in floating point registers. This sometimes makes the process
fail.

Long story: Several cpu-specific (NG4, NG2, U1, U3) memcpy functions
use floating point registers and VIS alignaddr/faligndata to
accelerate data copying when source and dest addresses don't align
well. Linux uses a lazy scheme for saving floating point registers; It
is not done upon entering the kernel since it's a very expensive
operation. Rather, it is done only when needed. If the kernel ends up
not using FP regs during the course of some trap or system call, then
it can return to user space without saving or restoring them.

The various memcpy functions begin their FP code with VISEntry (or a
variation thereof), which saves the FP regs. They conclude their FP
code with VISExit (or a variation) which essentially marks the FP regs
"clean", ie, they contain no unsaved values. fprs.FPRS_FEF is turned
off so that a lazy restore will be triggered when/if the user process
accesses floating point regs again.

The bug is that the user copy variants of memcpy, copy_from_user() and
copy_to_user(), employ an exception handling mechanism to detect faults
when accessing user space addresses, and when this handler is invoked,
an immediate return from the function is forced, and VISExit is not
executed, thus leaving the fprs register in an indeterminate state,
but often with fprs.FPRS_FEF set and one or more dirty bits. This
results in a return to user space with invalid values in the FP regs,
and since fprs.FPRS_FEF is on, no lazy restore occurs.

This bug affects copy_to_user() and copy_from_user() for NG4, NG2,
U3, and U1. All are fixed by using a new exception handler for those
loads and stores that are done during the time between VISEnter and
VISExit.

n.b. In NG4memcpy, the problematic code can be triggered by a copy
size greater than 128 bytes and an unaligned source address. This bug
is known to be the cause of random user process memory corruptions
while perf is running with the callgraph option (ie, perf record -g).
This occurs because perf uses copy_from_user() to read user stacks,
and may fault when it follows a stack frame pointer off to an
invalid page. Validation checks on the stack address just obscure
the underlying problem.

Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sparc64: Perf should save/restore fault info

There have been several reports of random processes being killed with
a bus error or segfault during userspace stack walking in perf. One
of the root causes of this problem is an asynchronous modification to
thread_info fault_address and fault_code, which stems from a perf
counter interrupt arriving during kernel processing of a "benign"
fault, such as a TSB miss. Since perf_callchain_user() invokes
copy_from_user() to read user stacks, a fault is not only possible,
but probable. Validity checks on the stack address merely cover up the
problem and reduce its frequency.

The solution here is to save and restore fault_address and fault_code
in perf_callchain_user() so that the benign fault handler is not
disturbed by a perf interrupt.

Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sparc64: Ensure perf can access user stacks

When an interrupt (such as a perf counter interrupt) is delivered
while executing in user space, the trap entry code puts ASI_AIUS in
%asi so that copy_from_user() and copy_to_user() will access the
correct memory. But if a perf counter interrupt is delivered while the
cpu is already executing in kernel space, then the trap entry code
will put ASI_P in %asi, and this will prevent copy_from_user() from
reading any useful stack data in either of the perf_callchain_user_X
functions, and thus no user callgraph data will be collected for this
sample period. An additional problem is that a fault is guaranteed
to occur, and though it will be silently covered up, it wastes time
and could perturb state.

In perf_callchain_user(), we ensure that %asi contains ASI_AIUS
because we know for a fact that the subsequent calls to
copy_from_user() are intended to read the user's stack.

[ Use get_fs()/set_fs() -DaveM ]

Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sparc64: Don't set %pil in rtrap_nmi too early

Commit 28a1f53 delays setting %pil to avoid potential
hardirq stack overflow in the common rtrap_irq path.
Setting %pil also needs to be delayed in the rtrap_nmi
path for the same reason.

Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sparc64: Add ADI capability to cpu capabilities

Add ADI (Application Data Integrity) capability to cpu capabilities list.
ADI capability allows virtual addresses to be encoded with a tag in
bits 63-60. This tag serves as an access control key for the regions
of virtual address with ADI enabled and a key set on them. Hypervisor
encodes this capability as "adp" in "hwcap-list" property in machine
description.

Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tty: serial: constify sunhv_ops structs

Constifies sunhv_ops structures in tty's serial
driver since they are not modified after their
initialization.

Detected and found using Coccinelle.

Suggested-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Aya Mahfouz <mahfouz.saif.elyazal@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cpufreq: scpi-cpufreq: signedness bug in scpi_get_dvfs_info()

The "domain" variable needs to be signed for the error handling to work.

Fixes: 8def31034d03 (cpufreq: arm_big_little: add SCPI interface driver)
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

sparc: Hook up userfaultfd system call

After hooking up system call, userfaultfd selftest was successful for
both 32 and 64 bit version of test.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'sound-4.4-rc7' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"This shouldn't be a nightmare before Christmas: just a handful small
  device-specific fixes for various ASoC and HD-audio drivers.  Most of
  them are stable fixes"

* tag 'sound-4.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/realtek - Fix silent headphone output on MacPro 4,1 (v2)
  ASoC: fsl_sai: fix no frame clk in master mode
  ALSA: hda - Set SKL+ hda controller power at freeze() and thaw()
  ASoC: sgtl5000: fix VAG power up timing
  ASoC: rockchip: spdif: Set transmit data level to 16 samples
  ASoC: wm8974: set cache type for regmap
  ASoC: es8328: Fix shifts for mixer switches
  ASoC: davinci-mcasp: Fix XDATA check in mcasp_start_tx
  ASoC: es8328: Fix deemphasis values

Merge tag 'drm-intel-fixes-2015-12-23' of git://anongit.freedesktop.org/drm-intel

Pull i915 drm fixes from Jani Nikula:
"Here's a batch of i915 fixes all around.  It may be slightly bigger
  than one would hope for at this stage, but they've all been through
  testing in our -next before being picked up for v4.4.  Also, I missed
  Dave's fixes pull earlier today just because I wanted an extra testing
  round on this.  So I'm fairly confident.

  Wishing you all the things it is customary to wish this time of the
  year"

* tag 'drm-intel-fixes-2015-12-23' of git://anongit.freedesktop.org/drm-intel:
  drm/i915: Correct max delay for HDMI hotplug live status checking
  drm/i915: mdelay(10) considered harmful
  drm/i915: Kill intel_crtc->cursor_bo
  drm/i915: Workaround CHV pipe C cursor fail
  drm/i915: Only spin whilst waiting on the current request
  drm/i915: Limit the busy wait on requests to 5us not 10ms!
  drm/i915: Break busywaiting for requests on pending signals
  drm/i915: Disable primary plane if we fail to reconstruct BIOS fb (v2)
  drm/i915: Set the map-and-fenceable flag for preallocated objects
  drm/i915: Drop the broken cursor base==0 special casing

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
"Not much happening, should have dequeued this lot earlier.

  One amdgpu, one nouveau and one exynos fix"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/exynos: atomic check only enabled crtc states
  drm/nouveau/bios/fan: hardcode the fan mode to linear
  drm/amdgpu: fix user fence handling

Merge tag 'asoc-fix-v4.4-rc6' of git://git./linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v4.4

A collection of small driver specific fixes here, nothing that'll affect
users who don't have the devices concerned. At least the wm8974 bug
indicates that there's not too many users of some of these devices.

Merge remote-tracking branches 'asoc/fix/davinci', 'asoc/fix/es8328', 'asoc/fix/fsl-sai', 'asoc/fix/rockchip', 'asoc/fix/sgtl5000' and 'asoc/fix/wm8974' into asoc-linus

Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block layer fixes from Jens Axboe:
"Three small fixes for 4.4 final. Specifically:

   - The segment issue fix from Junichi, where the old IO path does a
     bio limit split before potentially bouncing the pages.  We need to
     do that in the right order, to ensure that limitations are met.

   - A NVMe surprise removal IO hang fix from Keith.

   - A use-after-free in null_blk, introduced by a previous patch in
     this series.  From Mike Krinkin"

* 'for-linus' of git://git.kernel.dk/linux-block:
  null_blk: fix use-after-free error
  block: ensure to split after potentially bouncing a bio
  NVMe: IO ending fixes on surprise removal

Merge tag 'nfsd-4.4-1' of git://linux-nfs.org/~bfields/linux

Pull nfsd fix from Bruce Fields:
"Just one fix for a NFSv4 callback bug introduced in 4.4"

* tag 'nfsd-4.4-1' of git://linux-nfs.org/~bfields/linux:
nfsd: don't hold ls_mutex across a layout recall

Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:

- A series of fixes to the MTRR emulation, tested in the BZ by several
   users so they should be safe this late

- A fix for a division by zero

- Two very simple ARM and PPC fixes

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Reload pit counters for all channels when restoring state
  KVM: MTRR: treat memory as writeback if MTRR is disabled in guest CPUID
  KVM: MTRR: observe maxphyaddr from guest CPUID, not host
  KVM: MTRR: fix fixed MTRR segment look up
  KVM: VMX: Fix host initiated access to guest MSR_TSC_AUX
  KVM: arm/arm64: vgic: Fix kvm_vgic_map_is_active's dist check
  kvm: x86: move tracepoints outside extended quiescent state
  KVM: PPC: Book3S HV: Prohibit setting illegal transaction state in MSR

Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux

Pull s390 fixes from Martin Schwidefsky:
"Two late bug fixes for kernel 4.4.

  Merry Christmas"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/dis: Fix handling of format specifiers
  s390/zcrypt: Fix AP queue handling if queue is full

ARM: tegra: Fix suspend hang on Tegra124 Chromebooks

Enabling CPUFreq support for Tegra124 Chromebooks is causing the Tegra124
to hang when resuming from suspend.

When CPUFreq is enabled, the CPU clock is changed from the PLLX clock to
the DFLL clock during kernel boot. When resuming from suspend the CPU
clock is temporarily changed back to the PLLX clock before switching back
to the DFLL. If the DFLL is operating at a much lower frequency than the
PLLX when we enter suspend, and so the CPU voltage rail is at a voltage
too low for the CPUs to operate at the PLLX frequency, then the device
will hang.

Please note that the PLLX is used in the resume sequence to switch the CPU
clock from the very slow 32K clock to a faster clock during early resume
to speed up the resume sequence before the DFLL is resumed.

Ideally, we should fix this by setting the suspend frequency so that it
matches the PLLX frequency, however, that would be a bigger change. For
now simply disable CPUFreq support for Tegra124 Chromebooks to avoid the
hang when resuming from suspend.

Fixes: 9a0baee960a7 ("ARM: tegra: Enable CPUFreq support for Tegra124
Chromebooks")

Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Olof Johansson <olof@lixom.net>

Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost

Pull virtio fix from Michael Tsirkin:
"This includes a single fix for virtio ccw error handling"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio/s390: handle error values in irb

um: Fix pointer cast

Fix a pointer cast typo introduced in v4.4-rc5 especially visible for
the i386 subarchitecture where it results in a kernel crash.

[ Also removed pointless cast as per Al Viro - Linus ]

Fixes: 8090bfd2bb9a ("um: Fix fpstate handling")
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Jeff Dike <jdike@addtoit.com>
Acked-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'omap-for-v4.4/fixes-rc6' of git://git./linux/kernel/git/tmlind/linux-omap into fixes

Few fixes for omaps to allow am437x only builds to boot properly with
CPU_IDLE and ARM TWD timer. This is probably a common configuration setup
for people making products with these SoCs so let's make sure it works.

Also a wakeirq fix for duovero parlor making my life a bit easier as that
allows me to run basic PM regression tests on it.

It would be nice to have these in v4.4, but if it gets too late for that
because of the holidays, it is not super critical if these get merged for
v4.5.

* tag 'omap-for-v4.4/fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: dts: Fix UART wakeirq for omap4 duovero parlor
  ARM: OMAP2+: AM43xx: select ARM TWD timer
  ARM: OMAP2+: am43xx: enable GENERIC_CLOCKEVENTS_BROADCAST

Signed-off-by: Olof Johansson <olof@lixom.net>

Merge tag 'imx-fixes-4.4-3' of git://git./linux/kernel/git/shawnguo/linux into fixes

The i.MX fixes for 4.4, 3rd round:
- Fix Ethernet PHY mode on i.MX6 Ventana boards, which can result in
  a non-functional Ethernet when Marvell phy driver rather than generic
  phy driver is selected.
- Fix an assigned-clock configuration bug on imx6qdl-sabreauto board
  which was introduced by commit ed339363de1b ("ARM: dts:
  imx6qdl-sabreauto: Allow HDMI and LVDS to work simultaneously").

* tag 'imx-fixes-4.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  ARM: dts: imx6: Fix Ethernet PHY mode on Ventana boards
  ARM: dts: imx: Fix the assigned-clock mismatch issue on imx6q/dl

bus: sunxi-rsb: Fix peripheral IC mapping runtime address

0x4e is the runtime address normally associated with perihperal ICs.
0x45 is not a valid runtime address.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Olof Johansson <olof@lixom.net>

bus: sunxi-rsb: Fix primary PMIC mapping hardware address

The primary PMICs use 0x3a3 as their hardware address, not 0x3e3.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Olof Johansson <olof@lixom.net>

null_blk: fix use-after-free error

blk_end_request_all may free request, so we need to save
request_queue pointer before blk_end_request_all call.

The problem was introduced in commit cf8ecc5a8455266f8d51
("null_blk: guarantee device restart in all irq modes")
and causes general protection fault with slab poisoning
enabled.

Fixes: cf8ecc5a8455266f8d51 ("null_blk: guarantee device
restart in all irq modes")

Signed-off-by: Mike Krinkin <krinkin.m.u@gmail.com>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

block: ensure to split after potentially bouncing a bio

blk_queue_bio() does split then bounce, which makes the segment
counting based on pages before bouncing and could go wrong. Move
the split to after bouncing, like we do for blk-mq, and the we
fix the issue of having the bio count for segments be wrong.

Fixes: 54efd50bfd87 ("block: make generic_make_request handle arbitrarily sized bios")
Cc: stable@vger.kernel.org
Tested-by: Artem S. Tashkinov <t.artem@lycos.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

NVMe: IO ending fixes on surprise removal

This patch fixes a lost request discovered during IO + hot removal.

The driver's pci removal deletes gendisks prior to shutting down the
controller to allow dirty data to sync. Dirty data can not be synced on
a surprise removal, though, and would potentially block indefinitely.

The driver previously had marked the queue as dying in this scenario
to prevent new requests from attempting, however it will still block
for requests that already entered the queue. This patch fixes this by
quiescing IO first, then aborting the requeued requests before deleting
disks.

Reported-by: Sujith Pandel <sujith_pandel@dell.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Tested-by: Sujith Pandel <sujith_pandel@dell.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

KVM: x86: Reload pit counters for all channels when restoring state

Currently if userspace restores the pit counters with a count of 0
on channels 1 or 2 and the guest attempts to read the count on those
channels, then KVM will perform a mod of 0 and crash. This will ensure
that 0 values are converted to 65536 as per the spec.

This is CVE-2015-7513.

Signed-off-by: Andy Honig <ahonig@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: MTRR: treat memory as writeback if MTRR is disabled in guest CPUID

Virtual machines can be run with CPUID such that there are no MTRRs.
In that case, the firmware will never enable MTRRs and it is obviously
undesirable to run the guest entirely with UC memory. Check out guest
CPUID, and use WB memory if MTRR do not exist.

Cc: qemu-stable@nongnu.org
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=107561
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: MTRR: observe maxphyaddr from guest CPUID, not host

Conversion of MTRRs to ranges used the maxphyaddr from the boot CPU.
This is wrong, because var_mtrr_range's mask variable then is discontiguous
(like FF00FFFF000, where the first run of 0s corresponds to the bits
between host and guest maxphyaddr). Instead always set up the masks
to be full 64-bit values---we know that the reserved bits at the top
are zero, and we can restore them when reading the MSR. This way
var_mtrr_range gets a mask that just works.

Fixes: a13842dc668b40daef4327294a6d3bdc8bd30276
Cc: qemu-stable@nongnu.org
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=107561
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: MTRR: fix fixed MTRR segment look up

This fixes the slow-down of VM running with pci-passthrough, since some MTRR
range changed from MTRR_TYPE_WRBACK to MTRR_TYPE_UNCACHABLE. Memory in the
0K-640K range was incorrectly treated as uncacheable.

Fixes: f7bfb57b3e89ff89c0da9f93dedab89f68d6ca27
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=107561
Cc: qemu-stable@nongnu.org
Signed-off-by: Alexis Dambricourt <alexis.dambricourt@gmail.com>
[Use correct BZ for "Fixes" annotation. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

MIPS: Fix build error due to unused variables.

c861519fcf95b2d46cb4275903423b43ae150a40 ("MIPS: Fix delay loops which may
be removed by GCC.") which made it upstream was an outdated version of the
patch and is lacking some the removal of two variables that became unused
thus resulting in further warnings and build breakage. The commit
from ae878615d7cee5d7346946cf1ae1b60e427013c2 was correct however.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

MIPS: VDSO: Fix build error

Commit ebb5e78cc634 ("MIPS: Initial implementation of a VDSO") introduced a
build error.

For MIPS VDSO to be compiled it requires binutils version 2.25 or above but
the check in the Makefile had inverted logic causing it to be compiled in if
binutils is below 2.25.

This fixes the following compilation error:

CC arch/mips/vdso/gettimeofday.o
/tmp/ccsExcUd.s: Assembler messages:
/tmp/ccsExcUd.s:62: Error: can't resolve `_start' {*UND* section} - `L0' {.text section}
/tmp/ccsExcUd.s:467: Error: can't resolve `_start' {*UND* section} - `L0' {.text section}
make[2]: *** [arch/mips/vdso/gettimeofday.o] Error 1
make[1]: *** [arch/mips/vdso] Error 2
make: *** [arch/mips] Error 2

[ralf@linux-mips: Fixed Sergei's complaint on the formatting of the
cited commit and generally reformatted the log message.]

Signed-off-by: Qais Yousef <qais.yousef@imgtec.com>
Cc: alex@alex-smith.me.uk
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/11745/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

MIPS: CPS: drop .set mips64r2 directives

Commit 977e043d5ea1 ("MIPS: kernel: cps-vec: Replace mips32r2 ISA level
with mips64r2") leads to .set mips64r2 directives being present in 32
bit (ie. CONFIG_32BIT=y) kernels. This is incorrect & leads to MIPS64
instructions being emitted by the assembler when expanding
pseudo-instructions. For example the "move" instruction can legitimately
be expanded to a "daddu". This causes problems when the kernel is run on
a MIPS32 CPU, as CONFIG_32BIT kernels of course often are...

Fix this by dropping the .set <ISA> directives entirely now that Kconfig
should be ensuring that kernels including this code are built with a
suitable -march= compiler flag.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: <stable@vger.kernel.org> # 3.16+
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/10869/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

drm/i915: Correct max delay for HDMI hotplug live status checking

The total delay of HDMI hotplug detecting with 30ms have already
been split into a resolution of 3 retries of 10ms each, for the worst
cases. But it still suffered from only waiting 10ms at most in
intel_hdmi_detect(). This patch corrects it by reading hotplug status
with 4 times at most for 30ms delay.

v2:
- straight up to loop execution for more clear in code readability
- mdelay will replace with msleep by Daniel's new patch

drm/i915: mdelay(10) considered harmful

- suggest to re-evaluate try times for being compatible to old HDMI monitor

Reviewed-by: Cooper Chiou <cooper.chiou@intel.com>
Tested-by: Gary Wang <gary.c.wang@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Gavin Hindman <gavin.hindman@intel.com>
Cc: Sonika Jindal <sonika.jindal@intel.com>
Cc: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Gary Wang <gary.c.wang@intel.com>
[danvet: fixup conflict with s/mdelay/msleep/ patch.]
Cc: drm-intel-fixes@lists.freedesktop.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
(cherry picked from commit 61fb3980dd396880ffba48523b1e27579868b82b)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915: mdelay(10) considered harmful

I missed this myself when reviewing

commit 237ed86c693d8a8e4db476976aeb30df4deac74b
Author: Sonika Jindal <sonika.jindal@intel.com>
Date: Tue Sep 15 09:44:20 2015 +0530

drm/i915: Check live status before reading edid

Long sleeps like this really shouldn't waste cpu cycles spinning.

Cc: Sonika Jindal <sonika.jindal@intel.com>
Cc: "Wang, Gary C" <gary.c.wang@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1449859455-32609-1-git-send-email-daniel.vetter@ffwll.ch
Reviewed-by: Sonika Jindal <sonika.jindal@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
(cherry picked from commit 71a199bacb398ee54eeac001699257dda083a455)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915: Kill intel_crtc->cursor_bo

The vma may have been rebound between the last time the cursor was
enabled and now, so skipping the cursor gtt offset deduction is not
safe unless we would also reset cursor_bo to NULL when disabling the
cursor. Just thow cursor_bo to the bin instead since it's lost all
other uses thanks to universal plane support.

Chris pointed out that cursor updates are currently too slow
via universal planes that micro optimizations like these wouldn't
even help.

v2: Add a note about futility of micro optimizations (Chris)

Cc: drm-intel-fixes@lists.freedesktop.org
References: http://lists.freedesktop.org/archives/intel-gfx/2015-December/082976.html
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1450107302-17171-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 1264859d648c4bdc9f0a098efbff90cbf462a075)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

MIPS: uaccess: Take EVA into account in [__]clear_user

__clear_user() (and clear_user() which uses it), always access the user
mode address space, which results in EVA store instructions when EVA is
enabled even if the current user address limit is KERNEL_DS.

Fix this by adding a new symbol __bzero_kernel for the normal kernel
address space bzero in EVA mode, and call that from __clear_user() if
eva_kernel_access().

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10844/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

drm/i915: Workaround CHV pipe C cursor fail

Turns out CHV pipe C was glued on somewhat poorly, and there's something
wrong with the cursor. If the cursor straddles the left screen edge,
and is then moved away from the edge or disabled, the pipe will often
underrun. If enough underruns are triggered quickly enough the pipe
will fall over and die (it just scans out a solid color and reports
a constant underrun). We need to turn the disp2d power well off and
on again to recover the pipe.

None of that is very nice for the user, so let's just refuse to place
the cursor in the compromised position. The ddx appears to fall back
to swcursor when the ioctl returns an error, so theoretically there's
no loss of functionality for the user (discounting swcursor bugs).
I suppose most cursors images actually have the hotspot not exactly
at 0,0 so under typical conditions the fallback will in fact kick in
as soon as the cursor touches the left edge of the screen.

Any atomic compositor should anyway be prepared to fall back to
GPU composition when things don't work out, so there should be no
problem with those.

Other things that I tried to solve this include flipping all
display related clock gating knobs I could find, increasing the
minimum gtt alignment all the way up to 512k. I also tried to see
if there are more specific screen coordinates that hit the bug, but
the findings were somewhat inconclusive. Sometimes the failures
happen almost across the whole left edge, sometimes more at the very
top and around the bottom half. I wasn't able to find any real pattern
to these variations, so it seems our only choice is to just refuse
to straddle the left screen edge at all.

Cc: stable@vger.kernel.org
Cc: Jason Plum <max@warheads.net>
Testcase: igt/kms_chv_cursor_fail
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92826
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1450459479-16286-1-git-send-email-ville.syrjala@linux.intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
(cherry picked from commit b29ec92c4f5e6d45d8bae8194e664427a01c6687)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915: Only spin whilst waiting on the current request

Limit busywaiting only to the request currently being processed by the
GPU. If the request is not currently being processed by the GPU, there
is a very low likelihood of it being completed within the 2 microsecond
spin timeout and so we will just be wasting CPU cycles.

v2: Check for logical inversion when rebasing - we were incorrectly
checking for this request being active, and instead busywaiting for
when the GPU was not yet processing the request of interest.

v3: Try another colour for the seqno names.
v4: Another colour for the function names.

v5: Remove the forced coherency when checking for the active request. On
reflection and plenty of recent experimentation, the issue is not a
cache coherency problem - but an irq/seqno ordering problem (timing issue).
Here, we do not need the w/a to force ordering of the read with an
interrupt.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: "Rantala, Valtteri" <valtteri.rantala@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1449833608-22125-4-git-send-email-chris@chris-wilson.co.uk
(cherry picked from commit 821485dc2ad665f136c57ee589bf7a8210160fe2)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915: Limit the busy wait on requests to 5us not 10ms!

When waiting for high frequency requests, the finite amount of time
required to set up the irq and wait upon it limits the response rate. By
busywaiting on the request completion for a short while we can service
the high frequency waits as quick as possible. However, if it is a slow
request, we want to sleep as quickly as possible. The tradeoff between
waiting and sleeping is roughly the time it takes to sleep on a request,
on the order of a microsecond. Based on measurements of synchronous
workloads from across big core and little atom, I have set the limit for
busywaiting as 10 microseconds. In most of the synchronous cases, we can
reduce the limit down to as little as 2 miscroseconds, but that leaves
quite a few test cases regressing by factors of 3 and more.

The code currently uses the jiffie clock, but that is far too coarse (on
the order of 10 milliseconds) and results in poor interactivity as the
CPU ends up being hogged by slow requests. To get microsecond resolution
we need to use a high resolution timer. The cheapest of which is polling
local_clock(), but that is only valid on the same CPU. If we switch CPUs
because the task was preempted, we can also use that as an indicator that
the system is too busy to waste cycles on spinning and we should sleep
instead.

__i915_spin_request was introduced in
commit 2def4ad99befa25775dd2f714fdd4d92faec6e34 [v4.2]
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Apr 7 16:20:41 2015 +0100

drm/i915: Optimistically spin for the request completion

v2: Drop full u64 for unsigned long - the timer is 32bit wraparound safe,
so we can use native register sizes on smaller architectures. Mention
the approximate microseconds units for elapsed time and add some extra
comments describing the reason for busywaiting.

v3: Raise the limit to 10us
v4: Now 5us.

Reported-by: Jens Axboe <axboe@kernel.dk>
Link: https://lkml.org/lkml/2015/11/12/621
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: "Rantala, Valtteri" <valtteri.rantala@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1449833608-22125-3-git-send-email-chris@chris-wilson.co.uk
(cherry picked from commit ca5b721e238226af1d767103ac852aeb8e4c0764)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

MIPS: uaccess: Take EVA into account in __copy_from_user()

When EVA is in use, __copy_from_user() was unconditionally using the EVA
instructions to read the user address space, however this can also be
used for kernel access. If the address isn't a valid user address it
will cause an address error or TLB exception, and if it is then user
memory may be read instead of kernel memory.

For example in the following stack trace from Linux v3.10 (changes since
then will prevent this particular one still happening) kernel_sendmsg()
set the user address limit to KERNEL_DS, and tcp_sendmsg() goes on to
use __copy_from_user() with a kernel address in KSeg0.

[<8002d434>] __copy_fromuser_common+0x10c/0x254
[<805710e0>] tcp_sendmsg+0x5f4/0xf00
[<804e8e3c>] sock_sendmsg+0x78/0xa0
[<804e8f28>] kernel_sendmsg+0x24/0x38
[<804ee0f8>] sock_no_sendpage+0x70/0x7c
[<8017c820>] pipe_to_sendpage+0x80/0x98
[<8017c6b0>] splice_from_pipe_feed+0xa8/0x198
[<8017cc54>] __splice_from_pipe+0x4c/0x8c
[<8017e844>] splice_from_pipe+0x58/0x78
[<8017e884>] generic_splice_sendpage+0x20/0x2c
[<8017d690>] do_splice_from+0xb4/0x110
[<8017d710>] direct_splice_actor+0x24/0x30
[<8017d394>] splice_direct_to_actor+0xd8/0x208
[<8017d51c>] do_splice_direct+0x58/0x7c
[<8014eaf4>] do_sendfile+0x1dc/0x39c
[<8014f82c>] SyS_sendfile+0x90/0xf8

Add the eva_kernel_access() check in __copy_from_user() like the one in
copy_from_user().

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10843/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

drm/i915: Break busywaiting for requests on pending signals

The busywait in __i915_spin_request() does not respect pending signals
and so may consume the entire timeslice for the task instead of
returning to userspace to handle the signal.

In the worst case this could cause a delay in signal processing of 20ms,
which would be a noticeable jitter in cursor tracking. If a higher
resolution signal was being used, for example to provide fairness of a
server timeslices between clients, we could expect to detect some
unfairness between clients (i.e. some windows not updating as fast as
others). This issue was noticed when inspecting a report of poor
interactivity resulting from excessively high __i915_spin_request usage.

Fixes regression from
commit 2def4ad99befa25775dd2f714fdd4d92faec6e34 [v4.2]
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Apr 7 16:20:41 2015 +0100

drm/i915: Optimistically spin for the request completion

v2: Try to assess the impact of the bug

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc; "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: "Rantala, Valtteri" <valtteri.rantala@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1449833608-22125-2-git-send-email-chris@chris-wilson.co.uk
(cherry picked from commit 91b0c352ace9afec1fb51590c7b8bd60e0eb9fbd)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

MIPS: uaccess: Fix strlen_user with EVA

The strlen_user() function calls __strlen_kernel_asm in both branches of
the eva_kernel_access() conditional. For EVA it should be calling
__strlen_user_eva for user accesses, otherwise it will load from the
kernel address space instead of the user address space, and the access
checking will likely be ineffective at preventing it due to EVA's
overlapping user and kernel address spaces.

This was found after extending the test_user_copy module to cover user
string access functions, which gave the following error with EVA:

test_user_copy: illegal strlen_user passed

Fortunately the use of strlen_user() has been all but eradicated from
the mainline kernel, so only out of tree modules could be affected.

Fixes: e3a9b07a9caf ("MIPS: asm: uaccess: Add EVA support for str*_user operations")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.15.x-
Patchwork: https://patchwork.linux-mips.org/patch/10842/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>