git.samba.org - sfrench/cifs-2.6.git/log

Makefile.headersinst: cleanup input files

After the last three patches, all exported headers are under uapi/, thus
input-files2 is not needed anymore.
The side effect is that input-files1-name is exactly header-y.

Note also that input-files3-name is genhdr-y.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

x86: stop exporting msr-index.h to userland

Even though this file was not in a uapi directory, it was exported because
it was listed in the Kbuild file.

Fixes: b72e7464e4cf ("x86/uapi: Do not export <asm/msr-index.h> as part of the user API headers")
Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

nios2: put setup.h in uapi

This header file is exported, but from a userland point of view, it's just
a wrapper around asm-generic/setup.h.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

h8300: put bitsperlong.h in uapi

This header file is exported, thus move it to uapi.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

blk-mq: NVMe 512B/4K+T10 DIF/DIX format returns I/O error on dd with split op

When formatting NVMe to 512B/4K + T10 DIF/DIX, dd with split op returns
"Input/output error". It looks like the block layer splits the bio after
calling bio_integrity_prep(bio). This patch fixes the issue.

Below is how we debugged this issue:
(1) Format the NVMe device to a 4K block size with type 2 DIF.
(2) Run dd with oflag=direct and a block size bigger than 1024k:
dd: error writing '/dev/nvme0n1': Input/output error

We added some debug code to the NVMe device driver. It showed that the
first op and the second op have the same bi and pi address, which is not
correct.

1st op: nvme0n1 Op:Wr slba 0x505 length 0x100, PI ctrl=0x1400,
dsmgmt=0x0, AT=0x0 & RT=0x505
Guard 0x00b1, AT 0x0000, RT physical 0x00000505 RT virtual 0x00002828

2nd op: nvme0n1 Op:Wr slba 0x605 length 0x1, PI ctrl=0x1400, dsmgmt=0x0,
AT=0x0 & RT=0x605 ==> This op fails, as do the 5 subsequent retries.
Guard 0x00b1, AT 0x0000, RT physical 0x00000605 RT virtual 0x00002828

With the fix, the debug output shows that the first op and the second op
each have the correct bi and pi address.

1st op: nvme2n1 Op:Wr slba 0x505 length 0x100, PI ctrl=0x1400,
dsmgmt=0x0, AT=0x0 & RT=0x505
Guard 0x5ccb, AT 0x0000, RT physical 0x00000505 RT virtual 0x00002828

2nd op: nvme2n1 Op:Wr slba 0x605 length 0x1, PI ctrl=0x1400, dsmgmt=0x0,
AT=0x0 & RT=0x605
Guard 0xab4c, AT 0x0000, RT physical 0x00000605 RT virtual 0x00003028

Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

blk-stat: don't use this_cpu_ptr() in a preemptable section

If PREEMPT_RCU is enabled, rcu_read_lock() isn't strong enough
for us to use this_cpu_ptr() in that section. Use the safer
get/put_cpu_ptr() variants instead.

Reported-by: Mike Galbraith <efault@gmx.de>
Fixes: 34dbad5d26e2 ("blk-stat: convert to callback-based statistics reporting")
Signed-off-by: Jens Axboe <axboe@fb.com>

elevator: remove redundant warnings on IO scheduler switch

We warn twice when a switch to a scheduler fails.
As we also report the failure in the return value to the
sysfs write, remove the redundant dmesg warnings.

Keep the failure message for the switch to the Kconfig-selected
IO scheduler, as we can't report errors for that in
any other way.

Signed-off-by: Jens Axboe <axboe@fb.com>

block, bfq: stress that low_latency must be off to get max throughput

The introduction of the BFQ and Kyber I/O schedulers has triggered a
new wave of I/O benchmarks. Unfortunately, comments and discussions on
these benchmarks confirm that there is still little awareness that it
is very hard to achieve, at the same time, a low latency and a high
throughput. In particular, virtually all benchmarks measure
throughput, or throughput-related figures of merit, but, for BFQ, they
use the scheduler in its default configuration. This configuration is
geared, instead, toward a low latency. This is evidently a sign that
BFQ documentation is still too unclear on this important aspect. This
commit addresses this issue by stressing how BFQ configuration must be
(easily) changed if the only goal is maximum throughput.

Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@fb.com>

block, bfq: use pointer entity->sched_data only if set

In the function __bfq_deactivate_entity, the pointer
entity->sched_data could happen to be used before being properly
initialized. This led to a NULL pointer dereference. This commit fixes
this bug by just using this pointer only where it is safe to do so.
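The pattern described above can be sketched in userspace as follows. This is only an illustration: the names `entity`, `sched_data` fields, and `deactivate_entity` are hypothetical stand-ins, not the actual BFQ definitions or the actual patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real BFQ structures are far richer. */
struct sched_data {
	int active_entities;
};

struct entity {
	struct sched_data *sched_data;	/* may still be NULL */
	bool on_tree;
};

/*
 * Returns true if the entity was deactivated. sched_data is only
 * dereferenced after the NULL check, so an entity that was never
 * fully set up cannot trigger a NULL pointer dereference here.
 */
static bool deactivate_entity(struct entity *e)
{
	if (!e->on_tree)
		return false;

	e->on_tree = false;
	if (e->sched_data)
		e->sched_data->active_entities--;

	return true;
}
```

Calling `deactivate_entity()` on an entity whose `sched_data` is NULL simply skips the bookkeeping instead of crashing.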

Reported-by: Tom Harrison <l12436.tw@gmail.com>
Tested-by: Tom Harrison <l12436.tw@gmail.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@fb.com>

nvme: lightnvm: fix memory leak

Free up kmalloc allocated memory if failure happens while handling L2P
table transfer in nvme_nvm_get_l2p_tbl.
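A fix of this kind usually follows the single-exit cleanup idiom. Below is a minimal userspace sketch of that idiom, assuming `malloc`/`free` in place of `kmalloc`/`kfree`; the function `get_l2p_tbl_sketch` and the callbacks `fetch_ok`/`fetch_fail` are hypothetical and do not mirror the driver code.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical transfer step: returns 0 on success, negative on error. */
typedef int (*fetch_chunk_fn)(uint64_t *buf, size_t nr_entries);

/* Hypothetical transfer that succeeds and fills the buffer. */
static int fetch_ok(uint64_t *buf, size_t nr_entries)
{
	size_t i;

	for (i = 0; i < nr_entries; i++)
		buf[i] = i;
	return 0;
}

/* Hypothetical transfer that always fails. */
static int fetch_fail(uint64_t *buf, size_t nr_entries)
{
	(void)buf;
	(void)nr_entries;
	return -5;
}

/*
 * Allocates a bounce buffer, runs the transfer, and frees the buffer
 * on every path, including the error path.
 */
static int get_l2p_tbl_sketch(size_t nr_entries, fetch_chunk_fn fetch)
{
	uint64_t *entries;
	int ret;

	entries = malloc(nr_entries * sizeof(*entries));
	if (!entries)
		return -1;

	ret = fetch(entries, nr_entries);
	if (ret)
		goto out;	/* failure: jump to the shared free */

	/* ... consume entries here ... */

out:
	free(entries);
	return ret;
}
```

Routing both the success and the failure path through the `out` label means a later early return cannot reintroduce the leak without being noticed.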

Fixes: 8e79b5cb ("lightnvm: move block provisioning to targets")
Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>
Reviewed-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

ALSA: hda: Fix cpu lockup when stopping the cmd dmas

Using jiffies in hdac_wait_for_cmd_dmas() to decide when to time out
while interrupts are off (snd_hdac_bus_stop_cmd_io() holds spin_lock_irq())
causes a hard lockup, so drop the lock while waiting on jiffies.

---<-snip->---
<0>[ 1211.603046] NMI watchdog: Watchdog detected hard LOCKUP on cpu 3
<4>[ 1211.603047] Modules linked in: snd_hda_intel i915 vgem
<4>[ 1211.603053] irq event stamp: 13366
<4>[ 1211.603053] hardirqs last  enabled at (13365):
...
<4>[ 1211.603059] Call Trace:
<4>[ 1211.603059]  ? delay_tsc+0x3d/0xc0
<4>[ 1211.603059]  __delay+0xa/0x10
<4>[ 1211.603060]  __const_udelay+0x31/0x40
<4>[ 1211.603060]  snd_hdac_bus_stop_cmd_io+0x96/0xe0 [snd_hda_core]
<4>[ 1211.603060]  ? azx_dev_disconnect+0x20/0x20 [snd_hda_intel]
<4>[ 1211.603061]  snd_hdac_bus_stop_chip+0xb1/0x100 [snd_hda_core]
<4>[ 1211.603061]  azx_stop_chip+0x9/0x10 [snd_hda_codec]
<4>[ 1211.603061]  azx_suspend+0x72/0x220 [snd_hda_intel]
<4>[ 1211.603061]  pci_pm_suspend+0x71/0x140
<4>[ 1211.603062]  dpm_run_callback+0x6f/0x330
<4>[ 1211.603062]  ? pci_pm_freeze+0xe0/0xe0
<4>[ 1211.603062]  __device_suspend+0xf9/0x370
<4>[ 1211.603062]  ? dpm_watchdog_set+0x60/0x60
<4>[ 1211.603063]  async_suspend+0x1a/0x90
<4>[ 1211.603063]  async_run_entry_fn+0x34/0x160
<4>[ 1211.603063]  process_one_work+0x1f4/0x6d0
<4>[ 1211.603063]  ? process_one_work+0x16e/0x6d0
<4>[ 1211.603064]  worker_thread+0x49/0x4a0
<4>[ 1211.603064]  kthread+0x107/0x140
<4>[ 1211.603064]  ? process_one_work+0x6d0/0x6d0
<4>[ 1211.603065]  ? kthread_create_on_node+0x40/0x40
<4>[ 1211.603065]  ret_from_fork+0x2e/0x40

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100419
Fixes: 38b19ed7f81ec ("ALSA: hda: fix to wait for RIRB & CORB DMA to set")
Reported-by: Marta Lofstedt <marta.lofstedt@intel.com>
Suggested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jeeja KP <jeeja.kp@intel.com>
Acked-by: Vinod Koul <vinod.koul@intel.com>
CC: stable <stable@vger.kernel.org> # 4.7
Signed-off-by: Takashi Iwai <tiwai@suse.de>

perf/callchain: Force USER_DS when invoking perf_callchain_user()

Perf can generate and record a user callchain in response to a synchronous
request, such as a tracepoint firing. If this happens under set_fs(KERNEL_DS),
then we can end up walking the user stack (and dereferencing/saving whatever we
find there) without the protections usually afforded by checks such as
access_ok.

Rather than play whack-a-mole with each architecture's stack unwinding
implementation, fix the root of the problem by ensuring that we force USER_DS
when invoking perf_callchain_user from the perf core.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide

Pull IDE updates from David Miller:
"Two small cleanups in the IDE layer"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
  ide: don't call memcpy with the same source and destination
  ide: use setup_timer

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc

Pull sparc updates from David Miller:
"sparc changes, including a bug fix for handling exceptions during
  bzero on some sparc64 cpus"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc64: fix fault handling in NGbzero.S and GENbzero.S
  sparc: use memdup_user_nul in sun4m LED driver
  sparc: Remove redundant tests in boot_flags_init().

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Fix multiqueue in stmmac driver on PCI, from Andy Shevchenko.

2) cdc_ncm doesn't actually fully zero out the padding area it
    allocates on TX, from Jim Baxter.

3) Don't leak map addresses in BPF verifier, from Daniel Borkmann.

4) If we randomize TCP timestamps, we have to do it everywhere
    including SYN cookies. From Eric Dumazet.

5) Fix "ethtool -S" crash in aquantia driver, from Pavel Belous.

6) Fix allocation size for ntp filter bitmap in bnxt_en driver, from
    Dan Carpenter.

7) Add missing memory allocation return value check to DSA loop driver,
    from Christophe Jaillet.

8) Fix XDP leak on driver unload in qed driver, from Suddarsana Reddy
    Kalluru.

9) Don't inherit MC list from parent inet connection sockets, another
    syzkaller spotted gem. Fix from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits)
  dccp/tcp: do not inherit mc_list from parent
  qede: Split PF/VF ndos.
  qed: Correct doorbell configuration for !4Kb pages
  qed: Tell QM the number of tasks
  qed: Fix VF removal sequence
  qede: Fix XDP memory leak on unload
  net/mlx4_core: Reduce harmless SRIOV error message to debug level
  net/mlx4_en: Avoid adding steering rules with invalid ring
  net/mlx4_en: Change the error print to debug print
  drivers: net: wimax: i2400m: i2400m-usb: Use time_after for time comparison
  DECnet: Use container_of() for embedded struct
  Revert "ipv4: restore rt->fi for reference counting"
  net: mdio-mux: bcm-iproc: call mdiobus_free() in error path
  net: ethernet: ti: cpsw: adjust cpsw fifos depth for fullduplex flow control
  ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf
  net: cdc_ncm: Fix TX zero padding
  stmmac: pci: split out common_default_data() helper
  stmmac: pci: RX queue routing configuration
  stmmac: pci: TX and RX queue priority configuration
  stmmac: pci: set default number of rx and tx queues
  ...

Merge tag 'dmaengine-4.12-rc1' of git://git.infradead.org/users/vkoul/slave-dma

Pull dmaengine updates from Vinod Koul:
"This time again a smaller update consisting of:

   - support for TI DA8xx dma controller and updates to the cppi driver

   - updates to a bunch of drivers like xilinx, pl08x, stm32-dma, mv_xor,
     ioat, dmatest"

* tag 'dmaengine-4.12-rc1' of git://git.infradead.org/users/vkoul/slave-dma: (35 commits)
  dmaengine: pl08x: remove lock documentation
  dmaengine: pl08x: fix pl08x_dma_chan_state documentation
  dmaengine: pl08x: Use the BIT() macro consistently
  dmaengine: pl080: Fix some missing kerneldoc
  dmaengine: pl080: Cut some unused defines
  dmaengine: dmatest: Add check for supported buffer count (sg_buffers)
  dmaengine: dmatest: Select DMA_ENGINE_RAID as its needed for the slave_sg test
  dmaengine: virt-dma: Convert to use list_for_each_entry_safe()
  dma-debug: use offset_in_page() macro
  dmaengine: mv_xor: use offset_in_page() macro
  dmaengine: dmatest: use offset_in_page() macro
  dmaengine: sun4i: fix invalid argument
  dmaengine: ioat: use setup_timer
  dmaengine: cppi41: Fix an Oops happening in cppi41_dma_probe()
  dmaengine: pl330: remove pdata based initialization
  dmaengine: cppi: fix build error due to bad variable
  dmaengine: imx-sdma: add 1ms delay to ensure SDMA channel is stopped
  dmaengine: cppi41: use managed functions devm_*()
  dmaengine: cppi41: fix cppi41_dma_tx_status() logic
  dmaengine: qcom_hidma: pause the channel on shutdown
  ...

Merge tag 'pwm/for-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm

Pull pwm updates from Thierry Reding:
"Adds a new driver for the PWM controller found on MediaTek SoCs and
  extends support for the Atmel PWM controller to include the SAMA5D2.

  Some existing drivers have been migrated to the atomic API and a few
  others see miscellaneous improvements"

* tag 'pwm/for-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
  pwm: tegra: Read PWM clock source rate in driver init
  pwm: pca9685: Fix GPIO-only operation
  pwm: mediatek: Don't explicitly set .owner
  pwm: tegra: Avoid potential overflow for short periods
  pwm: tegra: Add support to configure pin state in suspends/resume
  pwm: tegra: Add DT binding details to configure pin in suspends/resume
  pwm: tegra: Increase precision in PWM rate calculation
  pwm: tegra: Use DIV_ROUND_CLOSEST_ULL() instead of local implementation
  pwm: Add MediaTek PWM support
  dt-bindings: pwm: Add MediaTek PWM bindings
  pwm: atmel: Enable PWM on sama5d2
  pwm: atmel: Switch to atomic PWM
  pwm: atmel-hlcdc: Implement the suspend/resume hooks
  pwm: atmel-hlcdc: Convert to the atomic PWM API

Merge tag 'iommu-updates-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU updates from Joerg Roedel:

- code optimizations for the Intel VT-d driver

- ability to switch off a previously enabled Intel IOMMU

- support for 'struct iommu_device' for OMAP, Rockchip and Mediatek
   IOMMUs

- header optimizations for IOMMU core code headers and a few fixes that
   became necessary in other parts of the kernel because of that

- ACPI/IORT updates and fixes

- Exynos IOMMU optimizations

- updates for the IOMMU dma-api code to bring it closer to using per-cpu
   iova caches

- new command-line option to set default domain type allocated by the
   iommu core code

- another command-line option to allow the Intel IOMMU to be switched off
   in a tboot environment

- ARM/SMMU: TLB sync optimisations for SMMUv2, Support for using an
   IDENTITY domain in conjunction with DMA ops, Support for SMR masking,
   Support for 16-bit ASIDs (was previously broken)

- various other small fixes and improvements

* tag 'iommu-updates-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (63 commits)
  soc/qbman: Move dma-mapping.h include to qman_priv.h
  soc/qbman: Fix implicit header dependency now causing build fails
  iommu: Remove trace-events include from iommu.h
  iommu: Remove pci.h include from trace/events/iommu.h
  arm: dma-mapping: Don't override dma_ops in arch_setup_dma_ops()
  ACPI/IORT: Fix CONFIG_IOMMU_API dependency
  iommu/vt-d: Don't print the failure message when booting non-kdump kernel
  iommu: Move report_iommu_fault() to iommu.c
  iommu: Include device.h in iommu.h
  x86, iommu/vt-d: Add an option to disable Intel IOMMU force on
  iommu/arm-smmu: Return IOVA in iova_to_phys when SMMU is bypassed
  iommu/arm-smmu: Correct sid to mask
  iommu/amd: Fix incorrect error handling in amd_iommu_bind_pasid()
  iommu: Make iommu_bus_notifier return NOTIFY_DONE rather than error code
  omap3isp: Remove iommu_group related code
  iommu/omap: Add iommu-group support
  iommu/omap: Make use of 'struct iommu_device'
  iommu/omap: Store iommu_dev pointer in arch_data
  iommu/omap: Move data structures to omap-iommu.h
  iommu/omap: Drop legacy-style device support
  ...

Merge branches 'acpi-soc', 'acpi-bus', 'acpi-pmic' and 'acpi-power'

* acpi-soc:
  ACPI / LPSS: Call pwm_add_table() for Bay Trail PWM device
  i2c: designware: Add ACPI HID for Hisilicon Hip07/08 I2C controller
  ACPI / APD: Add clock frequency for Hisilicon Hip07/08 I2C controller

* acpi-bus:
  ACPI / bus: Add INT0002 to list of always-present devices
  ACPI / bus: Introduce a list of ids for "always present" devices

* acpi-pmic:
  ACPI / PMIC: xpower: Fix power_table addresses

* acpi-power:
  ACPI / power: Delay turning off unused power resources after suspend

Merge branch 'acpica'

* acpica:
  ACPICA: Update version to 20170303
  ACPICA: iasl: add ASL conversion tool
  ACPICA: Local cache support: Allow small cache objects
  ACPICA: Disassembler: Do not unconditionally remove temporary names
  ACPICA: iasl: Fix IORT SMMU GSI disassembling
  ACPICA: Cleanup AML opcode definitions, no functional change
  ACPICA: Debugger: Add interpreter blocking mark for single-step mode
  ACPICA: debugger: fix memory leak on Pathname
  ACPICA: Update for automatic repair code for objects returned by evaluate_object
  ACPICA: Namespace: fix operand cache leak
  ACPICA: Fix several incorrect invocations of ACPICA return macro
  ACPICA: Fix a module for excessive debug output
  ACPICA: Update some function headers, no funtional change
  ACPICA: Disassembler: Enhance resource descriptor detection
  ACPICA: Add non-linux host build support

Merge branches 'pm-domains', 'pm-cpuidle', 'pm-sleep' and 'powercap'

* pm-domains:
  PM / Domains: Add DT file to MAINTAINERS
  PM / Domains: Fix DT example

* pm-cpuidle:
  x86/intel_idle: add Gemini Lake support
  cpuidle: check dev before usage in cpuidle_use_deepest_state()

* pm-sleep:
  ACPI / sleep: Ignore spurious SCI wakeups from suspend-to-idle
  PM / wakeup: Integrate mechanism to abort transitions in progress

* powercap:
  powercap: intel_rapl: Add support for Gemini Lake

nfsd: fix undefined behavior in nfsd4_layout_verify

UBSAN: Undefined behaviour in fs/nfsd/nfs4proc.c:1262:34
shift exponent 128 is too large for 32-bit type 'int'

Depending on compiler+architecture, this may cause the check for
layout_type to succeed for overly large values (which seems to be the
case with amd64). The large value will be later used in de-referencing
nfsd4_layout_ops for function pointers.
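A minimal userspace sketch of the check ordering that avoids this: range-check the value before it is used as a shift count. The function name `layout_type_supported` and the numeric value given to `LAYOUT_TYPE_MAX` here are assumptions for illustration, not the nfsd code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative value; the kernel defines its own LAYOUT_TYPE_MAX. */
#define LAYOUT_TYPE_MAX 6u

/*
 * Range-check layout_type before using it as a shift count. Shifting a
 * 32-bit value by 32 or more is undefined behaviour in C, so the bounds
 * test must come first and short-circuit the shift.
 */
static bool layout_type_supported(uint32_t supported_mask, uint32_t layout_type)
{
	if (layout_type >= LAYOUT_TYPE_MAX)
		return false;

	return (supported_mask & (UINT32_C(1) << layout_type)) != 0;
}
```

With this ordering, a client-supplied value such as 128 is rejected before any shift is evaluated, so it can never be used to index a table of layout operations.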

Reported-by: Jani Tuovila <tuovila@synopsys.com>
Signed-off-by: Ari Kauppi <ari@synopsys.com>
[colin.king@canonical.com: use LAYOUT_TYPE_MAX instead of 32]
Cc: stable@vger.kernel.org
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

pNFS/flexfiles: Always attempt to call layoutstats when flexfiles is enabled

Layoutstats is always desirable when using the flexfiles driver, so
we should enable it if that driver is being loaded. It is safe to do
so, because even when the mount specifies NFSv4.1, we will turn it
off if the server tells us it is unsupported.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

NFSv4.1: Work around a Linux server bug...

It turns out the Linux server has a bug in its implementation of
supattr_exclcreat; it returns the set of all attributes, whether
or not they are supported by minor version 1.
In order to avoid a regression, we therefore apply the supported_attrs
as a mask on top of whatever the server sent us.
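Applying one attribute bitmap as a mask on another amounts to a word-wise AND. A minimal sketch, assuming a three-word bitmap; `ATTR_BITMAP_WORDS` and `mask_exclcreat` are hypothetical names, not the NFS client's own:

```c
#include <stddef.h>
#include <stdint.h>

#define ATTR_BITMAP_WORDS 3	/* assumed bitmap width for this sketch */

/*
 * Clears every bit in exclcreat[] that is not also set in supported[],
 * so attributes the server over-reported are dropped.
 */
static void mask_exclcreat(uint32_t exclcreat[ATTR_BITMAP_WORDS],
			   const uint32_t supported[ATTR_BITMAP_WORDS])
{
	size_t i;

	for (i = 0; i < ATTR_BITMAP_WORDS; i++)
		exclcreat[i] &= supported[i];
}
```

A server that answers with all bits set then yields exactly the supported set, while a correct server's answer passes through unchanged as long as it is a subset of the supported attributes.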

Reported-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

docs: update references to the device io book

While converting the deviceiobook from DocBook to RST, dangling
references were left behind. This commit updates all remaining
references to the new location. SeongJae Park improved the ko_KR
translation.

Fixes: 8a8a602fdb83 ("docs: Convert the deviceio template to RST")
Signed-off-by: Helmut Grohne <h.grohne@intenta.de>
Signed-off-by: SeongJae Park <sj38.park@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

Documentation: earlycon: fix Marvell Armada 3700 UART name

The Marvell Armada 3700 UART uses "ar3700_uart" for its earlycon name.
Adjust documentation to match the code.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

docs-rst: add input docs at main index and use kernel-figure

The input subsystem documentation got converted into ReST.

Add it to the main documentation index and use kernel-figure
for the two svg images there.

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

dccp/tcp: do not inherit mc_list from parent

syzkaller found a way to trigger double frees from ip_mc_drop_socket()

It turns out that we leave a copy of the parent mc_list at accept() time,
which is very bad.

Very similar to commit 8b485ce69876 ("tcp: do not inherit
fastopen_req from parent")
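Why an inherited pointer leads to a double free can be seen in a heavily simplified userspace sketch. The types `mc_entry` and `sock_sketch` and the function `clone_for_accept` are hypothetical; the real socket cloning path is considerably more involved.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the socket structures. */
struct mc_entry {
	struct mc_entry *next;
};

struct sock_sketch {
	int state;
	struct mc_entry *mc_list;	/* owned: freed when the socket closes */
};

/*
 * Clones a listening socket for accept(). The struct copy duplicates the
 * mc_list pointer, so the child must drop it; otherwise parent and child
 * would both free the same list when they are closed.
 */
static void clone_for_accept(struct sock_sketch *child,
			     const struct sock_sketch *parent)
{
	*child = *parent;
	child->mc_list = NULL;	/* do not inherit the parent's list */
}
```

Clearing the pointer in the child leaves the parent as the only owner of the list.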

Initial report from Pray3r, completed by a report from Andrey.
Thanks a lot to them!

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Pray3r <pray3r.z@gmail.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sparc64: fix fault handling in NGbzero.S and GENbzero.S

When any of the functions contained in NGbzero.S and GENbzero.S
vector through *bzero_from_clear_user, we may end up taking a
fault when executing one of the store alternate address space
instructions. If this happens, the exception handler does not
restore the %asi register.

This commit fixes the issue by introducing a new exception
handler that ensures the %asi register is restored when
a fault is handled.

Orabug: 25577560

Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
Reviewed-by: Rob Gardner <rob.gardner@oracle.com>
Reviewed-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sparc: use memdup_user_nul in sun4m LED driver

Use memdup_user_nul() helper instead of open-coding to simplify the code.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'arc-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC updates from Vineet Gupta:

- AXS10x platform clk updates for I2S, PGU

- add region based cache flush operation for ARCv2 cores

- enforce PAE40 dependency on HIGHMEM

- ptrace support for additional regs in ARCv2 cores

- fix build failure in linux-next due to a header include ordering
   change

* tag 'arc-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  Revert "ARCv2: Allow enabling PAE40 w/o HIGHMEM"
  ARC: mm: fix build failure in linux-next for UP builds
  ARCv2: ptrace: provide regset for accumulator/r30 regs
  elf: Add ARCv2 specific core note section
  ARCv2: mm: micro-optimize region flush generated code
  ARCv2: mm: Merge 2 updates to DC_CTRL for region flush
  ARCv2: mm: Implement cache region flush operations
  ARC: mm: Move full_page computation into cache version agnostic wrapper
  arc: axs10x: Fix ARC PGU default clock frequency
  arc: axs10x: Add DT bindings for I2S audio playback

Merge tag 'armsoc-dt64' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM 64-bit DT updates from Olof Johansson:
"Device-tree updates for arm64 platforms. Just as with 32-bit, a bunch
  of smaller changes, but also some new platforms that are worth
  mentioning:

   - Rockchip RK3399 platforms for Chromebooks, including Samsung
     Chromebook Plus (Kevin)

   - Orange Pi PC2 (Allwinner H5)

   - Freescale LS2088A and LS1088A SoCs

   - Expanded support for Nvidia Tegra186 (and Jetson TX2)"

* tag 'armsoc-dt64' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (180 commits)
  arm64: dts: Add basic DT to support Spreadtrum's SP9860G
  arm64: dts: exynos: Use - instead of @ for DT OPP entries
  arm64: dts: exynos: Add support for s6e3hf2 panel device on TM2e board
  arm64: dts: juno: add information about L1 and L2 caches
  arm64: dts: juno: fix few unit address format warnings
  arm64: marvell: dts: enable the crypto engine on the Armada 8040 DB
  arm64: marvell: dts: enable the crypto engine on the Armada 7040 DB
  arm64: marvell: dts: add crypto engine description for 7k/8k
  arm64: dts: marvell: add sdhci support for Armada 7K/8K
  arm64: dts: marvell: add eMMC support for Armada 37xx
  arm64: dts: hisi: add pinctrl dtsi file for HiKey960 development board
  arm64: dts: hisi: add drive strength levels of the pins for Hi3660 SoC
  arm64: dts: hisi: enable the NIC and SAS for the hip07-d05 board
  arm64: dts: hisi: add SAS nodes for the hip07 SoC
  arm64: dts: hisi: add RoCE nodes for the hip07 SoC
  arm64: dts: hisi: add network related nodes for the hip07 SoC
  arm64: dts: hisi: add mbigen nodes for the hip07 SoC
  arm64: dts: rockchip: fix the memory size of PX5 Evaluation board
  arm64: dts: hisilicon: add dts files for hi3798cv200-poplar board
  dt-bindings: arm: hisilicon: add bindings for hi3798cv200 SoC and Poplar board
  ...

Merge tag 'armsoc-arm64' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC 64-bit changes from Olof Johansson:
"Changes to platform code for 64-bit ARM platforms.

  Most of these are small changes to the one defconfig we use on arm64
  (no per-platform configs there), to enable new drivers.

  There are also a few other changes. Broadcom sold off their 'Vulcan'
  design to Cavium, where it is now called ThunderX2. While we normally
  don't rename stuff based on marketing's whims, it seemed appropriate
  to bring in renames on a few things such as MAINTAINERS, etc"

* tag 'armsoc-arm64' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  arm64: sunxi: always enable reset controller
  arm64: defconfig: enable the Safexcel crypto engine as a module
  arm64: configs: enable SDHCI driver for Xenon
  MAINTAINERS: Broadcom Vulcan is now Cavium ThunderX2
  arm64: defconfig: add Allwinner USB PHY
  arm64: defconfig: enable MVPP2
  arm64: defconfig: Enable video, DRM and LPASS drivers for Exynos5433 and Exynos7
  arm64: exynos: Enable Exynos PMU and PM domains drivers
  arm64: only select PINCTRL for Allwinner platforms
  arm64: set CONFIG_MMC_BCM2835=y in defconfig
  arm64: defconfig: enable I2C_PXA
  arm64: defconfig: enable MVNETA
  ARM64: defconfig: enable the leds-pwm driver and default-on trigger
  arm64: defconfig: Enable SH Mobile I2C controller

Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC driver updates from Olof Johansson:
"Driver updates for ARM SoCs:

  Reset subsystem, merged through arm-soc by tradition:
   - Make bool drivers explicitly non-modular
   - New support for i.MX7 and Arria10 reset controllers

  PATA driver for Palmchip BK371 (acked by Tejun)

  Power domain drivers for i.MX (GPC, GPCv2)
   - Moved out of mach-imx for GPC
   - Bunch of tweaks, fixes, etc

  PMC support for Tegra186

  SoC detection support for Renesas RZ/G1H and RZ/G1N

  Move Tegra flow controller driver from mach directory to drivers/soc
   - (Power management / CPU power driver)

  Misc smaller tweaks for other platforms"

* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (60 commits)
  soc: pm-domain: Fix the mangled urls
  soc: renesas: rcar-sysc: Add support for R-Car H3 ES2.0
  soc: renesas: rcar-sysc: Add support for fixing up power area tables
  soc: renesas: Register SoC device early
  soc: imx: gpc: add workaround for i.MX6QP to the GPC PD driver
  dt-bindings: imx-gpc: add i.MX6 QuadPlus compatible
  soc: imx: gpc: add defines for domain index
  soc: imx: Add GPCv2 power gating driver
  dt-bindings: Add GPCv2 power gating driver
  ARM/clk: move the ICST library to drivers/clk
  ARM: plat-versatile: remove stale clock header
  ARM: keystone: Drop PM domain support for k2g
  soc: ti: Add ti_sci_pm_domains driver
  dt-bindings: Add TI SCI PM Domains
  PM / Domains: Do not check if simple providers have phandle cells
  PM / Domains: Add generic data pointer to genpd data struct
  soc/tegra: Add initial flowctrl support for Tegra132/210
  soc/tegra: flowctrl: Add basic platform driver
  soc/tegra: Move Tegra flowctrl driver
  ARM: tegra: Remove unnecessary inclusion of flowctrl header
  ...

Merge tag 'armsoc-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM: SoC defconfig updates from Olof Johansson:
"We've traditionally kept defconfig updates in a separate branch, often
  to encourage submaintainers to handle those patches separately to
  avoid conflicts on the shared files. The number of changes seems to be
  decreasing though, so we might rethink how we handle this going
  forward.

  There really isn't much to write about here. The bulk of changes here
  are enabling drivers for whatever platforms the hardware is found on
  (and multi-configs)"

* tag 'armsoc-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (30 commits)
  multi_v7_defconfig: make Rockchip usb2-phy built-in
  ARM: omap2plus_defconfig: Enable droid 4 devices
  ARM: omap2plus_defconfig: Add QMI, ACM and PPP as loadable modules
  ARM: configs: aspeed: Add new drivers
  ARM: configs: aspeed: Update configs for BMC systems
  ARM: omap2plus_defconfig: Enable TI Ethernet PHY
  ARM: configs: Add new config fragment to change RAM start point
  ARM: configs: stm32: Add I2C support
  multi_v7_defconfig: make Rockchip DRM drivers built-in
  ARM: configs: stm32: Set CPU_V7M_NUM_IRQ to max value
  ARM: imx_v6_v7_defconfig: Select SMSC_PHY
  ARM: davinci_all_defconfig: convert to use libata PATA
  ARM: qcom_defconfig: Enable Qualcomm remoteproc and related drivers
  ARM: omap2plus_defconfig: enable ahci-dm816 module
  arm: set CONFIG_MMC_BCM2835=y in bcm2835_defconfig and multi_v7_defconfig
  ARM: bcm2835: Enable missing CMA settings for VC4 driver
  ARM: socfpga: updates for socfpga_defconfig
  ARM: imx_v6_v7_defconfig: Select hid-multitouchdriver
  ARM: imx_v6_v7_defconfig: Select max11801_ts touchscreen driver
  ARM: exynos_defconfig: Increase CONFIG_CMA_SIZE_MBYTES to 96
  ...

Merge tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM Device-tree updates from Olof Johansson:
"Device-tree continues to see lots of updates. The majority of patches
  here are smaller changes for new hardware on existing platforms, and
  there are a few larger changes worth pointing out.

  Major new platforms:

   - Gemini has been ported to DT, so a handful of "new" platforms moved
     over from board files

   - Rockchip RK3288 support for Tinkerboard and Phytec phyCORE-RK3288
     SoM and RDK

   - A bunch of embedded platforms, several Linksys platforms, Synology
     DS116,

   - Motorola Droid4 (really old OMAP-based phone) support is added.

  Some refactorings, i.e. Allwinner H3/H5 support is commonalized.

  And lots of smaller changes, cleanups, etc. See shortlog for more
  description

  We're adding ability to cross-include DT files between arm and arm64,
  by creating appropriate links in the dt-include directory, and using
  arm/ and arm64/ as include prefixes. This will avoid other local hacks
  such as per-file links between the two arch trees (this broke for
  external mirroring of DT contents). Now they can just provide their
  own appropriate dt-include hierarchy per platform"

* tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (349 commits)
  ARM: dts: exynos: Use - instead of @ for DT OPP entries
  arm: spear6xx: add DT description of the ADC on SPEAr600
  arm: spear6xx: remove unneeded pinctrl properties in spear600-evb
  arm: spear6xx: switch spear600-evb to the new flash partition DT binding
  arm: spear6xx: fix spaces in spear600-evb.dts
  arm: spear6xx: use node labels in spear600-evb.dts
  arm: spear6xx: add labels to various nodes in spear600.dtsi
  ARM: dts: vexpress: fix few unit address format warnings
  ARM: dts: at91: sama5d3_xplained: not all ADC channels are available
  ARM: dts: at91: sama5d3_xplained: fix ADC vref
  ARM: dts: at91: add envelope detector mux to the Axentia TSE-850
  ARM: dts: armada-38x: label USB and SATA nodes
  ARM: dts: imx6q-utilite-pro: add hpd gpio
  ARM: dts: imx6qp-sabresd: Set reg_arm regulator supply
  ARM: dts: imx6qdl-sabresd: Set LDO regulator supply
  ARM: dts: imx: add Gateworks Ventana GW5903 support
  ARM: dts: i.MX25: add AIPS control registers
  ARM: dts: imx7-colibri: add Carrier Board 3.3V/5V regulators
  ARM: dts: imx7-colibri: remove 1.8V fixed regulator
  ARM: dts: imx7-colibri: allow to disable Ethernet rail
  ...

Merge tag 'armsoc-soc' of git://git./linux/kernel/git/arm/arm-soc

Pull ARM SoC platform updates from Olof Johansson:
"SoC platform changes (arch/arm/mach-*). This merge window, the bulk is
  for a few platforms:

  Gemini:
   - Legacy platform that Linus Walleij has converted to multiplatform
     and DT, so a handful of various tweaks there, removal of some old
     stale support, etc.

  Atmel AT91:
   - Fixup of various power management related pieces
   - Move of SoC detection to a drivers/soc driver instead

  ST Micro STM32:
   - New SoC support: STM32H743

  TI platforms:
   - More driver support for Davinci (SATA in particular)
   - Removal of some old stale hwmod files (linkspace platform)

  Misc:
   - A couple of smaller patches for i.MX, sunxi, hisi"

* tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (57 commits)
  ARM: davinci: Add clock for CPPI 4.1 DMA engine
  ARM: mxs: add support for I2SE Duckbill 2 boards
  MAINTAINERS: Update the Allwinner sunXi entry
  ARM: i.MX25: globally disable supervisor protect
  ARM: at91: move SoC detection to its own driver
  ARM: at91: pm: correct typo
  ARM: at91: pm: Remove at91_pm_set_standby
  ARM: at91: pm: Merge all at91sam9*_pm_init
  ARM: at91: pm: Tie the USB clock mask to the pmc
  ARM: at91: pm: Tie the memory controller type to the ramc id
  ARM: at91: pm: Workaround DDRSDRC self-refresh bug with LPDDR1 memories.
  ARM: at91: pm: Simplify at91rm9200_standby
  ARM: at91: pm: Use struct at91_pm_data in pm_suspend.S
  ARM: at91: pm: Move global variables into at91_pm_data
  ARM: at91: pm: Move at91_ramc_read/write to pm.c
  ARM: at91: pm: Cleanup headers
  MAINTAINERS: Add memory drivers to AT91 entry
  MAINTAINERS: Update AT91 entry
  ARM: davinci: add pata_bk3710 libata driver support
  ARM: OMAP2+: mark omap_init_rng as __init
  ...

arm64: uaccess: suppress spurious clang warning

Clang tries to warn when there's a mismatch between an operand's size,
and the size of the register it is held in, as this may indicate a bug.
Specifically, clang warns when the operand's type is less than 64 bits
wide, and the register is used unqualified (i.e. %N rather than %xN or
%wN).

Unfortunately clang can generate these warnings for unreachable code.
For example, for code like:

do {                                            \
        typeof(*(ptr)) __v = (v);               \
        switch(sizeof(*(ptr))) {                \
        case 1:                                 \
                // assume __v is 1 byte wide    \
                asm ("{op}b %w0" : : "r" (v));  \
                break;                          \
        case 8:                                 \
                // assume __v is 8 bytes wide   \
                asm ("{op} %0" : : "r" (v));    \
                break;                          \
        }                                       \
} while (0)

... if op() were passed a char value and pointer to char, clang may
produce a warning for the unreachable case where sizeof(*(ptr)) is 8.

For the same reasons, clang produces warnings when __put_user_err() is
used for types that are less than 64 bits wide.

We could avoid this with a cast to a fixed-width type in each of the
cases. However, GCC will then warn that pointer types are being cast to
mismatched integer sizes (in unreachable paths).

Another option would be to use the same union trickery as we do for
__smp_store_release() and __smp_load_acquire(), but this is fairly
invasive.

Instead, this patch suppresses the clang warning by using an x modifier
in the assembly for the 8 byte case of __put_user_err(). No additional
work is necessary as the value has been cast to typeof(*(ptr)), so the
compiler will have performed any necessary extension for the reachable
case.

For consistency, __get_user_err() is also updated to use the x modifier
for its 8 byte case.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

arm64: atomic_lse: match asm register sizes

The LSE atomic code uses asm register variables to ensure that
parameters are allocated in specific registers. In the majority of cases
we specifically ask for an x register when using 64-bit values, but in a
couple of cases we use a w register for a 64-bit value.

For asm register variables, the compiler only cares about the register
index, with wN and xN having the same meaning. The compiler determines
the register size to use based on the type of the variable. Thus, this
inconsistency is merely confusing, and not harmful to code generation.

For consistency, this patch updates those cases to use the x register
alias. There should be no functional change as a result of this patch.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

arm64: armv8_deprecated: ensure extension of addr

Our compat swp emulation holds the compat user address in an unsigned
int, which it passes to __user_swpX_asm(). When a 32-bit value is passed
in a register, the upper 32 bits of the register are unknown, and we
must extend the value to 64 bits before we can use it as a base address.

This patch casts the address to unsigned long to ensure it has been
suitably extended, avoiding the potential issue, and silencing a related
warning from clang.

Fixes: bd35a4adc413 ("arm64: Port SWP/SWPB emulation support from arm")
Cc: <stable@vger.kernel.org> # 3.19.x-
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

arm64: uaccess: ensure extension of access_ok() addr

Our access_ok() simply hands its arguments over to __range_ok(), which
implicitly assumes that the addr parameter is 64 bits wide. This isn't
necessarily true for compat code, which might pass down a 32-bit address
parameter.

In these cases, we don't have a guarantee that the address has been zero
extended to 64 bits, and the upper bits of the register may contain
unknown values, potentially resulting in a spurious failure.

Avoid this by explicitly casting the addr parameter to an unsigned long
(as is done on other architectures), ensuring that the parameter is
widened appropriately.
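
As a rough user-space illustration of the widening (not the kernel macro
itself; widen_compat_addr() is a hypothetical name used only here), an
explicit cast to unsigned long makes the compiler extend a 32-bit address
before it takes part in any wider arithmetic:

```c
#include <stdint.h>

/* Hypothetical helper: the explicit cast makes the compiler zero-extend
 * the 32-bit value, so any bits above bit 31 in the result are 0. */
static unsigned long widen_compat_addr(uint32_t addr)
{
        return (unsigned long)addr;
}
```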

Fixes: 0aea86a2176c ("arm64: User access library functions")
Cc: <stable@vger.kernel.org> # 3.7.x-
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

arm64: ensure extension of smp_store_release value

When an inline assembly operand's type is narrower than the register it
is allocated to, the least significant bits of the register (up to the
operand type's width) are valid, and any other bits are permitted to
contain any arbitrary value. This aligns with the AAPCS64 parameter
passing rules.

Our __smp_store_release() implementation does not account for this, and
implicitly assumes that operands have been zero-extended to the width of
the type being stored to. Thus, we may store unknown values to memory
when the value type is narrower than the pointer type (e.g. when storing
a char to a long).

This patch fixes the issue by casting the value operand to the same
width as the pointer operand in all cases, which ensures that the value
is zero-extended as we expect. We use the same union trickery as
__smp_load_acquire and {READ,WRITE}_ONCE() to avoid GCC complaining that
pointers are potentially cast to narrower width integers in unreachable
paths.
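
A simplified user-space sketch of the union approach follows. The macro
name STORE_WIDENED is hypothetical, memcpy() merely stands in for the
store-release instruction, and __typeof__ is a GNU C extension; the real
macro differs in detail.

```c
#include <string.h>

/*
 * Hypothetical macro: convert v to the pointee type of p through a union
 * member, then store the full-width result. Going through the union means
 * no integer cast of a pointer value is ever written out explicitly.
 */
#define STORE_WIDENED(p, v)                                             \
do {                                                                    \
        union { __typeof__(*(p)) val; char c[sizeof(*(p))]; } u =       \
                { .val = (__typeof__(*(p)))(v) };                       \
        memcpy((p), u.c, sizeof(*(p)));                                 \
} while (0)
```

Storing an unsigned char through a long pointer this way writes the
converted value to every byte of the long, rather than leaving the upper
bytes to chance.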

A whitespace issue at the top of __smp_store_release() is also
corrected.

No changes are necessary for __smp_load_acquire(). Load instructions
implicitly clear any upper bits of the register, and the compiler will
only consider the least significant bits of the register as valid
regardless.

Fixes: 47933ad41a86 ("arch: Introduce smp_load_acquire(), smp_store_release()")
Fixes: 878a84d5a8a1 ("arm64: add missing data types in smp_load_acquire/smp_store_release")
Cc: <stable@vger.kernel.org> # 3.14.x-
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

arm64: xchg: hazard against entire exchange variable

The inline assembly in __XCHG_CASE() uses a +Q constraint to hazard
against other accesses to the memory location being exchanged. However,
the pointer passed to the constraint is a u8 pointer, and thus the
hazard only applies to the first byte of the location.

GCC can take advantage of this, assuming that other portions of the
location are unchanged, as demonstrated with the following test case:

union u {
unsigned long l;
unsigned int i[2];
};

unsigned long update_char_hazard(union u *u)
{
unsigned int a, b;

a = u->i[1];
asm ("str %1, %0" : "+Q" (*(char *)&u->l) : "r" (0UL));
b = u->i[1];

return a ^ b;
}

unsigned long update_long_hazard(union u *u)
{
unsigned int a, b;

a = u->i[1];
asm ("str %1, %0" : "+Q" (*(long *)&u->l) : "r" (0UL));
b = u->i[1];

return a ^ b;
}

The Linaro 15.08 GCC 5.1.1 toolchain compiles the above as follows when
using -O2 or above:

0000000000000000 <update_char_hazard>:
   0: d2800001 mov x1, #0x0                    // #0
   4: f9000001 str x1, [x0]
   8: d2800000 mov x0, #0x0                    // #0
   c: d65f03c0 ret

0000000000000010 <update_long_hazard>:
  10: b9400401 ldr w1, [x0,#4]
  14: d2800002 mov x2, #0x0                    // #0
  18: f9000002 str x2, [x0]
  1c: b9400400 ldr w0, [x0,#4]
  20: 4a000020 eor w0, w1, w0
  24: d65f03c0 ret

This patch fixes the issue by passing an unsigned long pointer into the
+Q constraint, as we do for our cmpxchg code. This may hazard against
more than is necessary, but this is better than missing a necessary
hazard.

Fixes: 305d454aaa29 ("arm64: atomics: implement native {relaxed, acquire, release} atomics")
Cc: <stable@vger.kernel.org> # 4.4.x-
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

arm64: documentation: document tagged pointer stack constraints

Some kernel features don't currently work if a task puts a non-zero
address tag in its stack pointer, frame pointer, or frame record entries
(FP, LR).

For example, with a tagged stack pointer, the kernel can't deliver
signals to the process, and the task is killed instead. As another
example, with a tagged frame pointer or frame records, perf fails to
generate call graphs or resolve symbols.

For now, just document these limitations, instead of finding and fixing
everything that doesn't work, as it's not known if anyone needs to use
tags in these places anyway.

In addition, as requested by Dave Martin, generalize the limitations
into a general kernel address tag policy, and refactor
tagged-pointers.txt to include it.

Fixes: d50240a5f6ce ("arm64: mm: permit use of tagged pointers at EL0")
Cc: <stable@vger.kernel.org> # 3.12.x-
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

arm64: entry: improve data abort handling of tagged pointers

When handling a data abort from EL0, we currently zero the top byte of
the faulting address, as we assume the address is a TTBR0 address, which
may contain a non-zero address tag. However, the address may be a TTBR1
address, in which case we should not zero the top byte. This patch fixes
that. The effect is that the full TTBR1 address is passed to the task's
signal handler (or printed out in the kernel log).

When handling a data abort from EL1, we leave the faulting address
intact, as we assume it's either a TTBR1 address or a TTBR0 address with
tag 0x00. This is true as far as I'm aware; we don't seem to access a
tagged TTBR0 address anywhere in the kernel. Regardless, it's easy to
forget about address tags, and code added in the future may not always
remember to remove tags from addresses before accessing them. So add tag
handling to the EL1 data abort handler as well. This also makes it
consistent with the EL0 data abort handler.
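
The rule described above can be sketched in C as follows. This is an
illustration only: strip_addr_tag() is a hypothetical name, and the sketch
relies on bit 55 selecting between the TTBR0 and TTBR1 halves of the
address space.

```c
#include <stdint.h>

/* Hypothetical helper: clear the tag byte (bits 63:56) of a TTBR0
 * address, and leave a TTBR1 address untouched. */
static uint64_t strip_addr_tag(uint64_t addr)
{
        const uint64_t tag_mask = 0xffULL << 56;

        if (addr & (1ULL << 55))
                return addr;            /* TTBR1 half: keep the top byte */
        return addr & ~tag_mask;        /* TTBR0 half: drop the tag */
}
```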

Fixes: d50240a5f6ce ("arm64: mm: permit use of tagged pointers at EL0")
Cc: <stable@vger.kernel.org> # 3.12.x-
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

arm64: hw_breakpoint: fix watchpoint matching for tagged pointers

When we take a watchpoint exception, the address that triggered the
watchpoint is found in FAR_EL1. We compare it to the address of each
configured watchpoint to see which one was hit.

The configured watchpoint addresses are untagged, while the address in
FAR_EL1 will have an address tag if the data access was done using a
tagged address. The tag needs to be removed to compare the address to
the watchpoints.

Currently we don't remove it, and as a result can report the wrong
watchpoint as being hit (specifically, always either the highest TTBR0
watchpoint or lowest TTBR1 watchpoint). This patch removes the tag.

Fixes: d50240a5f6ce ("arm64: mm: permit use of tagged pointers at EL0")
Cc: <stable@vger.kernel.org> # 3.12.x-
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

arm64: traps: fix userspace cache maintenance emulation on a tagged pointer

When we emulate userspace cache maintenance in the kernel, we can
currently send the task a SIGSEGV even though the maintenance was done
on a valid address. This happens if the address has a non-zero address
tag, and happens to not be mapped in.

When we get the address from a user register, we don't currently remove
the address tag before performing cache maintenance on it. If the
maintenance faults, we end up in either __do_page_fault, where find_vma
can't find the VMA if the address has a tag, or in do_translation_fault,
where the tagged address will appear to be above TASK_SIZE. In both
cases, the address is not mapped in, and the task is sent a SIGSEGV.

This patch removes the tag from the address before using it. With this
patch, the fault is handled correctly, the address gets mapped in, and
the cache maintenance succeeds.

As a second bug, if cache maintenance (correctly) fails on an invalid
tagged address, the address gets passed into arm64_notify_segfault,
where find_vma fails to find the VMA due to the tag, and the wrong
si_code may be sent as part of the siginfo_t of the segfault. With this
patch, the correct si_code is sent.

Fixes: 7dd01aef0557 ("arm64: trap userspace "dc cvau" cache operation on errata-affected core")
Cc: <stable@vger.kernel.org> # 4.8.x-
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Merge tag 'armsoc-fixes-nc' of git://git./linux/kernel/git/arm/arm-soc

Pull misc ARM SoC fixes from Olof Johansson:
"ARM SoC non-urgent fixes for merge window

  Smaller patches that didn't seem to find a home in other branches, and
  low-priority fixes from late in the merge window. A number of these
  are MAINTAINERS updates, it seems.

  Highlights:

   * Maintainers:
     - Remove Alexandre Courbot and Stephen Warren from Tegra
       maintainership, add Jon Hunter
     - Remove Stephen Warren and add Stefan Wahren to bcm2835
     - Tweaks for file flagging for Marvell Dove

   * Fixes:
     - For two non-common-clk platforms, handle clk_disable with NULL arg
     - Remove redundant Kconfig select for Oxnas"

* tag 'armsoc-fixes-nc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: mmp: let clk_disable() return immediately if clk is NULL
  ARM: w90x900: let clk_disable() return immediately if clk is NULL
  MAINTAINERS: Add file patterns for dove device tree bindings
  ARM: oxnas: remove redundant select CPU_V6K
  MAINTAINERS: tegra: Remove self as maintainer
  MAINTAINERS: tegra: Replace Stephen with Jon
  MAINTAINERS: Add Stefan Wahren to bcm2835.
  MAINTAINERS: remove swarren from bcm2835
  MAINTAINERS: Add Jon Mason to BCM5301X maintainers

Merge branch 'work.misc' of git://git./linux/kernel/git/viro/vfs

Pull misc vfs updates from Al Viro:
"Assorted bits and pieces from various people. No common topic in this
  pile, sorry"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs/affs: add rename exchange
  fs/affs: add rename2 to prepare multiple methods
  Make stat/lstat/fstatat pass AT_NO_AUTOMOUNT to vfs_statx()
  fs: don't set *REFERENCED on single use objects
  fs: compat: Remove warning from COMPATIBLE_IOCTL
  remove pointless extern of atime_need_update_rcu()
  fs: completely ignore unknown open flags
  fs: add a VALID_OPEN_FLAGS
  fs: remove _submit_bh()
  fs: constify tree_descr arrays passed to simple_fill_super()
  fs: drop duplicate header percpu-rwsem.h
  fs/affs: bugfix: Write files greater than page size on OFS
  fs/affs: bugfix: enable writes on OFS disks
  fs/affs: remove node generation check
  fs/affs: import amigaffs.h
  fs/affs: bugfix: make symbolic links work again

Merge branch 'work.iov_iter' of git://git./linux/kernel/git/viro/vfs

Pull vfs fix from Al Viro:
"Braino fix for iov_iter_revert() misuse"

* 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fix braino in generic_file_read_iter()

proc: try to remove use of FOLL_FORCE entirely

We fixed the bugs in it, but it's still an ugly interface, so let's see
if anybody actually depends on it. It's entirely possible that nothing
actually requires the whole "punch through read-only mappings"
semantics.

For example, gdb definitely uses the /proc/<pid>/mem interface, but it
looks like it mainly does it for regular reads of the target (that don't
need FOLL_FORCE), and looking at the gdb source code seems to fall back
on the traditional ptrace(PTRACE_POKEDATA) interface if it needs to.

If this breaks something, I do have a (more complex) version that only
enables FOLL_FORCE when somebody has PTRACE_ATTACH'ed to the target,
like the comment here used to say ("Maybe we should limit FOLL_FORCE to
actual ptrace users?").

Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'qed-general-fixes'

Yuval Mintz says:

====================
qed*: General fixes

This series contains several fixes for qed and qede.

- #1 [and ~#5] relate to XDP cleanups
- #2 and #5 correct VF behavior
- #3 and #4 fix and add missing configurations needed for RoCE & storage
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

qede: Split PF/VF ndos.

PFs and VFs share the same structure of NDOs today,
and the VFs explicitly fail the ndo_xdp() callback, stating
that they don't support XDP.

This results in lots of:

  [qede_xdp:1032(enp131s2)]VFs don't support XDP
  ------------[ cut here ]------------
  WARNING: CPU: 4 PID: 1426 at net/core/rtnetlink.c:1637 rtnl_dump_ifinfo+0x354/0x3c0
  ...
  Call Trace:
    ? __alloc_skb+0x9b/0x1d0
    netlink_dump+0x122/0x290
    netlink_recvmsg+0x27d/0x430
    sock_recvmsg+0x3d/0x50
  ...

As every dump request for the VF interface info would fail due to
rtnl_xdp_fill() returning an error code.

To resolve this, introduce a subset of the NDOs meant for the VF
in a separate structure and register that one instead for VFs,
and omit the ndo_xdp initialization.

Fixes: 40b8c45492ef ("qede: Prevent VFs from using XDP")
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Correct doorbell configuration for !4Kb pages

When configuring the doorbell DPI address, driver aligns the start
address to 4KB [HW-pages] instead of host PAGE_SIZE.
As a result, RoCE applications might receive addresses which are
unaligned to pages [when PAGE_SIZE > 4KB], which is a security risk.
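
To make the arithmetic concrete, here is a small sketch with a
hypothetical align_up() helper; the addresses are made up and this is not
the driver code. With a 64KB PAGE_SIZE, an address that is 4KB-aligned can
still sit in the middle of a host page:

```c
/* Hypothetical helper: round addr up to a multiple of align, which must
 * be a power of two. */
static unsigned long align_up(unsigned long addr, unsigned long align)
{
        return (addr + align - 1) & ~(align - 1);
}
```

For example, align_up(0x11800, 0x1000) gives 0x12000, which is not a
multiple of 0x10000, whereas align_up(0x11800, 0x10000) gives 0x20000.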

Fixes: 51ff17251c9c ("qed: Add support for RoCE hw init")
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Tell QM the number of tasks

Driver doesn't pass the number of tasks to the QM init logic
which would cause back-pressure in scenarios requiring many tasks
[E.g., using max MRs] and thus reduced performance.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Fix VF removal sequence

After previous changes in HW-stop scheme, VFs stopped sending CLOSE
messages to their PFs when they unload.

Fixes: 1226337ad98f ("qed: Correct HW stop flow")
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qede: Fix XDP memory leak on unload

When (re|un)loading, Tx-queues belonging to XDP would not get freed.

Fixes: cb6aeb079294 ("qede: Add support for XDP_TX")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mlx4-misc-fixes'

Tariq Toukan says:

====================
mlx4 misc fixes

This patchset contains misc bug fixes from the team
to the mlx4 Core and Eth drivers.

Series generated against net commit:
32f1bc0f3d26 Revert "ipv4: restore rt->fi for reference counting"
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_core: Reduce harmless SRIOV error message to debug level

Under SRIOV resource management, extra counters are allocated to VFs
from a free pool. If that pool is empty, the ALLOC_RES command for
a counter resource fails -- and this generates a misleading error
message in the message log.

Under SRIOV, each VF is allocated (i.e., guaranteed) 2 counters --
one counter per port. For ETH ports, the RoCE driver requests an
additional counter (above the guaranteed counters). If that request
fails, the VF RoCE driver simply uses the default (i.e., guaranteed)
counter for that port.

Thus, failing to allocate an additional counter does not constitute
a problem, and the error message on the PF when this occurs should
be reduced to debug level.

Finally, to identify the situation that the reason for the failure is
that no resources are available to grant to the VF, we modified the
error returned by mlx4_grant_resource to -EDQUOT (Quota exceeded),
which more accurately describes the error.

Fixes: c3abb51bdb0e ("IB/mlx4: Add RoCE/IB dedicated counters")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Avoid adding steering rules with invalid ring

Inserting steering rules with illegal ring is an invalid operation,
block it.

Fixes: 820672812f82 ('net/mlx4_en: Manage flow steering rules with ethtool')
Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Change the error print to debug print

The error print within mlx4_en_calc_rx_buf() should be a debug print.

Fixes: 51151a16a60f ('mlx4: allow order-0 memory allocations in RX path')
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

s390/virtio: change maintainership

Halil is doing a lot more work in the virtio area on s390 than I
do. Let's reflect the reality in the maintainers file.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

tools/virtio: fix spelling mistake: "wakeus" -> "wakeups"

Trivial fix to a spelling mistake in an error message.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>

virtio_net: tidy a couple debug statements

We are printing a decimal value for truesize so we shouldn't use a "0x"
prefix.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

ptr_ring: support testing different batching sizes

Use the param flag for that.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

ringtest: support test specific parameters

Add a new flag for passing test-specific parameters.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

ptr_ring: batch ring zeroing

A known weakness in the ptr_ring design is that it does not handle the
situation well when the ring is almost full: as entries are consumed they are
immediately used again by the producer, so consumer and producer are
writing to a shared cache line.

To fix this, add batching to consume calls: as entries are
consumed do not write NULL into the ring until we get
a multiple (in current implementation 2x) of cache lines
away from the producer. At that point, write them all out.

We do the write out in the reverse order to keep
producer from sharing cache with consumer for as long
as possible.

Writeout also triggers when ring wraps around - there's
no special reason to do this but it helps keep the code
a bit simpler.

What should we do if getting away from producer by 2 cache lines
would mean we are keeping the ring more than half empty?
Maybe we should reduce the batching in this case;
the current patch simply reduces the batching.

Notes:
- it is no longer true that a call to consume guarantees
  that the following call to produce will succeed.
  No users seem to assume that.
- batching can also in theory reduce the signalling rate:
  users that would previously send interrupts to the producer
  to wake it up after consuming each entry would now only
  need to do this once in a batch.
  Doing this would be easy by returning a flag to the caller.
  No users seem to do signalling on consume yet so this was not
  implemented yet.
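
The batching scheme can be sketched as a single-threaded toy ring. This is
a simplified model under assumptions, not the ptr_ring code: the struct,
the function names, RING_SIZE and the fixed BATCH value are all made up
for illustration, and there are no locks or memory barriers.

```c
#include <stddef.h>

#define RING_SIZE 8
#define BATCH     4     /* assumed batch size; the patch derives it from cache lines */

struct toy_ring {
        void *queue[RING_SIZE];
        int producer;           /* next slot to fill */
        int consumer_head;      /* next slot to consume */
        int consumer_tail;      /* oldest consumed slot not yet zeroed */
};

/* A non-NULL slot means "occupied", so a full ring is detected by value. */
static int toy_produce(struct toy_ring *r, void *ptr)
{
        if (r->queue[r->producer])
                return -1;
        r->queue[r->producer] = ptr;
        r->producer = (r->producer + 1) % RING_SIZE;
        return 0;
}

static void *toy_consume(struct toy_ring *r)
{
        void *ptr = r->queue[r->consumer_head];
        int i;

        if (!ptr)
                return NULL;
        r->consumer_head++;

        /* Defer zeroing until a full batch is consumed or the ring wraps. */
        if (r->consumer_head - r->consumer_tail >= BATCH ||
            r->consumer_head >= RING_SIZE) {
                /* Zero in reverse order, newest consumed slot first. */
                for (i = r->consumer_head - 1; i >= r->consumer_tail; i--)
                        r->queue[i] = NULL;
                r->consumer_tail = r->consumer_head;
        }
        if (r->consumer_head >= RING_SIZE) {
                r->consumer_head = 0;
                r->consumer_tail = 0;
        }
        return ptr;
}
```

In this model a consumed slot stays non-NULL until its batch is written
out, which is why a consume no longer guarantees that the next produce
succeeds.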

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>

virtio: virtio_driver doc

Add comments for the virtio_driver members that were not documented.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

virtio_net: don't reset twice on XDP on/off

We already do a reset once in remove_vq_common -
there appears to be no point in doing another one
when we add/remove XDP.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

virtio_net: fix support for small rings

When ring size is small (<32 entries) making buffers smaller means a
full ring might not be able to hold enough buffers to fit a single large
packet.

Make sure a ring full of buffers is large enough to allow at least one
packet of max size.

Fixes: 2613af0ed18a ("virtio_net: migrate mergeable rx buffers to page frag allocators")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

virtio_net: reduce alignment for buffers

We don't need to align length to any particular
value anymore. Aligning to L1 cache size probably
still makes sense to reduce false sharing.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

virtio_net: rework mergeable buffer handling

Use the new _ctx virtio API to maintain true length for each buffer.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

virtio_net: allow specifying context for rx

With mergeable buffers we never use s/g for rx,
so allow specifying context in that case.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

powerpc/64s: Support new device tree binding for discovering CPU features

The ibm,powerpc-cpu-features device tree binding describes CPU features with
ASCII names and extensible compatibility, privilege, and enablement metadata
that allows improved flexibility and compatibility with new hardware.

The interface is described in detail in ibm,powerpc-cpu-features.txt in this
patch.

Currently this code is not enabled by default, and there are no released
firmwares that provide the binding.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

drivers: net: wimax: i2400m: i2400m-usb: Use time_after for time comparison

Use time_after() for time comparison with the new fix.
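
For readers unfamiliar with the helper, the idea behind time_after() is a
signed comparison of the difference, which stays correct when the jiffies
counter wraps. A user-space sketch (time_after_sketch is a hypothetical
name; the kernel macro additionally type-checks its arguments):

```c
/* True if a is after b, even when the unsigned counter has wrapped
 * between the two samples. Both arguments are unsigned long. */
#define time_after_sketch(a, b) ((long)((b) - (a)) < 0)
```

A plain "a > b" gives the wrong answer once the counter has wrapped, while
the signed difference does not.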

Signed-off-by: Karim Eshapa <karim.eshapa@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

DECnet: Use container_of() for embedded struct

Instead of a direct cross-type cast, use container_of() to locate
the embedded structure, even in the face of future struct layout
randomization.
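
A minimal user-space sketch of why this matters; the struct names and
container_of_sketch are hypothetical and this is not the DECnet code. A
direct cast only works while the embedded struct is the first member,
whereas subtracting the member offset works for any layout:

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's container_of(): step back from a
 * member pointer to the start of the enclosing struct. */
#define container_of_sketch(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof(type, member)))

struct inner { int id; };
struct outer { long flags; struct inner in; };  /* 'in' is not first */

static struct outer *outer_from_inner(struct inner *in)
{
        return container_of_sketch(in, struct outer, in);
}
```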

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

powerpc: Don't print cpu_spec->cpu_name if it's NULL

Currently we assume that if the cpu_spec has a pvr_mask then it must also have a
cpu_name. But that will change in a subsequent commit when we do CPU feature
discovery via the device tree, so check explicitly if cpu_name is NULL.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

of/fdt: introduce of_scan_flat_dt_subnodes and of_get_flat_dt_phandle

Introduce primitives for FDT parsing. These will be used for powerpc
cpufeatures node scanning, which has quite complex structure but should
be processed early.

Cc: devicetree@vger.kernel.org
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

Merge branch 'next' of git://git./linux/kernel/git/scottwood/linux into next

Freescale updates from Scott:

"Includes a fix for a powerpc/next mm regression on 64e, a fix for a
kernel hang on 64e when using a debugger inside a relocated kernel, a
qman fix, and misc qe improvements."

Merge tag 'kvm-arm-for-v4.12-round2' of git://git./linux/kernel/git/kvmarm/kvmarm into HEAD

Second round of KVM/ARM Changes for v4.12.

Changes include:
- A fix related to the 32-bit idmap stub
- A fix to the bitmask used to decode the operands of an AArch32 CP
   instruction
- We have moved the files shared between arch/arm/kvm and
   arch/arm64/kvm to virt/kvm/arm
- We add support for saving/restoring the virtual ITS state to
   userspace

KVM: arm/arm64: vgic-its: Cleanup after failed ITT restore

When failing to restore the ITT for a DTE, we should remove the failed
device entry from the list and free the object.

We slightly refactor vgic_its_destroy to be able to reuse the now
separate vgic_its_free_dte() function.

Signed-off-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

KVM: arm/arm64: Don't call map_resources when restoring ITS tables

The only reason we called kvm_vgic_map_resources() when restoring the
ITS tables was because we wanted to have the KVM iodevs registered in
the KVM IO bus framework at the time when the ITS was restored such that
a restored and active device can inject MSIs prior to otherwise calling
kvm_vgic_map_resources() from the first run of a VCPU.

Since we now register the KVM iodevs for the redistributors and ITS as
soon as possible (when setting the base addresses), we no longer need
this call and kvm_vgic_map_resources() is again called only when first
running a VCPU.

Signed-off-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

KVM: arm/arm64: Register ITS iodev when setting base address

We have to register the ITS iodevice before running the VM, because in
migration scenarios, we may be restoring a live device that wishes to
inject MSIs before the VCPUs have started.

All we need to register the ITS io device is the base address of the
ITS, so we can simply register that when the base address of the ITS is
set.

[ Code to fix concurrency issues when setting the ITS base address and
to fix the undef base address check written by Marc Zyngier ]

Signed-off-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

KVM: arm/arm64: Get rid of its->initialized field

The its->initialized field doesn't bring much to the table, and creates
unnecessary ordering between setting the address and initializing it
(which amounts to exactly nothing).

Let's kill it altogether, making KVM_DEV_ARM_VGIC_CTRL_INIT the no-op
it deserves to be.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

KVM: arm/arm64: Register iodevs when setting redist base and creating VCPUs

Instead of waiting until the first VCPU is run to register the KVM
iodevs, we can create the iodevs as soon as the redist base address is
set. The only downside is that we must now also check if we need to do
this for VCPUs which are created after creating the VGIC, because there
is no enforced ordering between creating the VGIC (and setting its base
addresses) and creating the VCPUs.
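
As a rough illustration of the ordering problem described above, the
following toy model registers one redistributor iodev per VCPU no matter
which of the two events happens first. It is a sketch only: VgicModel,
set_redist_base and create_vcpu are hypothetical names, not the kernel's
functions, and the base address in the usage note is an arbitrary value.

```python
class VgicModel:
    """Toy model: register one redistributor iodev per VCPU, exactly once."""

    def __init__(self):
        self.redist_base = None   # None models "base address not set yet"
        self.vcpus = []           # VCPU ids in creation order
        self.registered = set()   # VCPUs whose iodev has been registered

    def _register_if_ready(self, vcpu_id):
        # Registration needs the base address, and must not happen twice.
        if self.redist_base is None or vcpu_id in self.registered:
            return
        self.registered.add(vcpu_id)

    def set_redist_base(self, base):
        self.redist_base = base
        # Covers VCPUs that were created before the base address was set.
        for vcpu_id in self.vcpus:
            self._register_if_ready(vcpu_id)

    def create_vcpu(self, vcpu_id):
        self.vcpus.append(vcpu_id)
        # Covers VCPUs that are created after the base address was set.
        self._register_if_ready(vcpu_id)
```

Creating VCPU 0, then setting the base, then creating VCPU 1 leaves both
VCPUs registered, which is the property the check in the VCPU creation
path is meant to guarantee.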

Signed-off-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

KVM: arm/arm64: Slightly rework kvm_vgic_addr

As we are about to handle setting the address for the redistributor base
region separately from some of the other base addresses, let's rework
this function to leave a little more room for being flexible in what
each type of base address does.

Signed-off-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

KVM: arm/arm64: Make vgic_v3_check_base more broadly usable

As we are about to fiddle with the IO device registration mechanism,
let's be a little more careful when setting base addresses as early as
possible. When setting a base address, we can check that there's
enough address space for its scope and when the last of the two
base addresses (dist and redist) is set, we can also check if the
regions overlap at that time.

This allows us to provide error messages to the user at the time when
trying to set the base address, as opposed to later when trying to run
the VM.

To do this, we make vgic_v3_check_base available in the core vgic-v3
code as well as in the other parts of the GICv3 code, namely the MMIO
config code.

We also return true for undefined base addresses so that the function
can be used before all base addresses are set; all callers already check
for uninitialized addresses before calling this function.
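
The checks described above can be sketched as follows. This is a model
under stated assumptions, not the kernel code: check_base is a
hypothetical stand-in for vgic_v3_check_base, and the sentinel value and
the region sizes (64K distributor, 128K per VCPU redistributor) are
assumed for illustration.

```python
ADDR_UNDEF = (1 << 64) - 1   # assumed sentinel meaning "base address not set"
DIST_SIZE = 0x10000          # assumed distributor region size (64K)
REDIST_SIZE = 0x20000        # assumed redistributor size per VCPU (128K)
ADDR_SPACE = 1 << 64         # 64-bit guest physical address space


def check_base(dist_base, redist_base, nr_vcpus):
    """Return True if the base addresses set so far are acceptable."""
    redist_size = nr_vcpus * REDIST_SIZE

    # Each region that is set must fit without wrapping the address space.
    if dist_base != ADDR_UNDEF and dist_base + DIST_SIZE > ADDR_SPACE:
        return False
    if redist_base != ADDR_UNDEF and redist_base + redist_size > ADDR_SPACE:
        return False

    # Overlap can only be checked once both base addresses are set.
    if dist_base == ADDR_UNDEF or redist_base == ADDR_UNDEF:
        return True

    # The regions are disjoint if one ends before the other begins.
    if dist_base + DIST_SIZE <= redist_base:
        return True
    if redist_base + redist_size <= dist_base:
        return True
    return False
```

Returning True while an address is still undefined is what lets the same
function run each time a single base address is set.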

Signed-off-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

KVM: arm/arm64: Refactor vgic_register_redist_iodevs

Split out the function to register all the redistributor iodevs into a
function that handles a single redistributor at a time in preparation
for being able to call this per VCPU as these get created.

Signed-off-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

KVM: Add kvm_vcpu_get_idx to get vcpu index in kvm->vcpus

There are occasional needs to use the index of vcpu in the kvm->vcpus
array to map something related to a VCPU. For example, unlike the
vcpu->vcpu_id, the vcpu index is guaranteed to not be sparse across all
vcpus which is useful when allocating a memory area for each vcpu.
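
A small model of the difference between the two numbers, assuming a
VCPU list that mirrors kvm->vcpus. The Vcpu class and vcpu_index are
hypothetical names used only for this sketch.

```python
class Vcpu:
    def __init__(self, vcpu_id):
        self.vcpu_id = vcpu_id   # chosen by userspace, may be sparse


def vcpu_index(vcpus, vcpu):
    """Return the position of vcpu in vcpus, which is dense: 0..n-1."""
    for idx, candidate in enumerate(vcpus):
        if candidate is vcpu:
            return idx
    raise ValueError("vcpu is not in the list")


# Sparse ids, dense indices: a per-VCPU area needs only len(vcpus) slots.
vcpus = [Vcpu(0), Vcpu(4), Vcpu(8)]
per_vcpu_area = [None] * len(vcpus)
per_vcpu_area[vcpu_index(vcpus, vcpus[2])] = "state for vcpu_id 8"
```

Indexing the area by vcpu_id instead would require nine slots for these
three VCPUs.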

Signed-off-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

nVMX: Advertise PML to L1 hypervisor

Advertise the PML bit in vmcs12 but don't try to enable
it in hardware when running L2 since L0 is emulating it. Also,
preserve L0's settings for PML since it may still
want to log writes.

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

nVMX: Implement emulated Page Modification Logging

With EPT A/D enabled, processor access to L2 guest
paging structures will result in a write violation.
When this happens, write the GUEST_PHYSICAL_ADDRESS
to the PML buffer provided by L1 if the access is a
write and the dirty bit is being set.

This patch also adds necessary checks during VMEntry if L1
has enabled PML. If the PML index overflows, we change the
exit reason and run L1 to simulate a PML full event.
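
The logging step can be modelled roughly as below. This is a sketch, not
the patch itself: PmlState and log_dirty_gpa are hypothetical names, and
the model assumes the architectural PML layout of a 512-entry buffer
that is filled from index 511 downwards with 4K-aligned addresses.

```python
PML_ENTRIES = 512       # one 4K page of 8-byte guest-physical addresses
PAGE_MASK = ~0xFFF      # logged addresses have bits 11:0 cleared


class PmlState:
    def __init__(self):
        self.buffer = [0] * PML_ENTRIES
        self.index = PML_ENTRIES - 1   # the log is filled from the top down


def log_dirty_gpa(state, gpa):
    """Log gpa; return True if the buffer is full and nothing was logged."""
    if not 0 <= state.index < PML_ENTRIES:
        # Index out of range: the caller would report a PML full event.
        return True
    state.buffer[state.index] = gpa & PAGE_MASK
    state.index -= 1
    return False
```

After 512 successful calls the index leaves the valid range, and the
next call reports the full condition instead of writing an entry.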

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm: x86: Add a hook for arch specific dirty logging emulation

When KVM updates accessed/dirty bits, this hook can be used
to invoke an arch specific function that implements/emulates
dirty logging such as PML.

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm: nVMX: Validate CR3 target count on nested VM-entry

According to the SDM, the CR3-target count must not be greater than
4. Future processors may support a different number of CR3-target
values. Software should read the VMX capability MSR IA32_VMX_MISC to
determine the number of values supported.
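
A minimal sketch of such a check, assuming the raw IA32_VMX_MISC value
is already available. The function names are hypothetical; the field
position (bits 24:16) is the one documented in the SDM.

```python
def cr3_target_limit(vmx_misc):
    """Number of CR3-target values reported in IA32_VMX_MISC bits 24:16."""
    return (vmx_misc >> 16) & 0x1FF


def cr3_target_count_valid(count, vmx_misc):
    """Return True if the CR3-target count is within the reported limit."""
    return count <= cr3_target_limit(vmx_misc)
```

With an MSR value that reports four supported values, a count of 4
passes and a count of 5 is rejected.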

Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Merge branch 'kvm-ppc-next' of git://git./linux/kernel/git/paulus/powerpc into HEAD

The main thing here is a new implementation of the in-kernel
XICS interrupt controller emulation for POWER9 machines, from Ben
Herrenschmidt.

POWER9 has a new interrupt controller called XIVE (eXternal Interrupt
Virtualization Engine) which is able to deliver interrupts directly
to guest virtual CPUs in hardware without hypervisor intervention.
With this new code, the guest still sees the old XICS interface but
performance is better because the XICS emulation in the host uses the
XIVE directly rather than going through a XICS emulation in firmware.

Conflicts:
arch/powerpc/kernel/cpu_setup_power.S [cherry-picked fix]
arch/powerpc/kvm/book3s_xive.c [include asm/debugfs.h]

KVM: set no_llseek in stat_fops_per_vm

In vm_stat_get_per_vm_fops and vcpu_stat_get_per_vm_fops, since we
use nonseekable_open() to open, we should use no_llseek() to seek,
not generic_file_llseek().

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

powerpc/64s: Fix unnecessary machine check handler relocation branch

Similarly to commit 2563a70c3b ("powerpc/64s: Remove unnecessary relocation
branch from idle handler"), the machine check handler has a BRANCH_TO from
relocated to relocated code, which is unnecessary.

It has also caused build errors with some toolchains:

  arch/powerpc/kernel/exceptions-64s.S: Assembler messages:
  arch/powerpc/kernel/exceptions-64s.S:395: Error: operand out of range
  (0xffffffffffff8280 is not between 0x0000000000000000 and
  0x000000000000ffff)

Fixes: 1945bc4549e5 ("powerpc/64s: Fix POWER9 machine check handler from stop state")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reported-and-tested-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/mm/book3s/64: Rework page table geometry for lower memory usage

Recently in commit f6eedbba7a26 ("powerpc/mm/hash: Increase VA range to 128TB")
we increased the virtual address space for user processes to 128TB by default,
and up to 512TB if user space opts in.

This obviously required expanding the range of the Linux page tables. For Book3s
64-bit using hash and with PAGE_SIZE=64K, we increased the PGD to 2^15 entries.
This meant we could cover the full address range, while still being able to
insert a 16G hugepage at the PGD level and a 16M hugepage in the PMD.

The downside of that geometry is that it uses a lot of memory for the PGD, and
in particular makes the PGD a 4-page allocation, which means it's much more
likely to fail under memory pressure.

Instead we can make the PMD larger, so that a single PUD entry maps 16G,
allowing the 16G hugepages to sit at that level in the tree. We're then able to
split the remaining bits between the PUD and PGD. We make the PGD slightly
larger as that results in lower memory usage for typical programs.

When THP is enabled the PMD actually doubles in size, to 2^11 entries, or 2^14
bytes, which is large but still < PAGE_SIZE.
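
The arithmetic behind these numbers can be checked with a short script.
It is a sketch under assumptions: the per-level index sizes below are
chosen to be consistent with the figures in this message (64K pages,
512TB of address space, a 2^15 entry PGD before the change, 16M and 16G
hugepages), and 8-byte table entries are assumed. The exact PUD/PGD
split of 7 and 8 bits is an assumption, not something stated here.

```python
PAGE_SHIFT = 16     # PAGE_SIZE = 64K
ENTRY_BYTES = 8     # assumed size of one page table entry
VA_BITS = 49        # 512TB of virtual address space


def geometry(pte, pmd, pud, pgd):
    """Return what one entry maps at each level, and table sizes in bytes."""
    # The index sizes plus the page offset must cover the whole range.
    assert PAGE_SHIFT + pte + pmd + pud + pgd == VA_BITS
    return {
        "pmd_entry_maps": 1 << (PAGE_SHIFT + pte),
        "pud_entry_maps": 1 << (PAGE_SHIFT + pte + pmd),
        "pgd_entry_maps": 1 << (PAGE_SHIFT + pte + pmd + pud),
        "pmd_bytes": (1 << pmd) * ENTRY_BYTES,
        "pmd_bytes_thp": (1 << (pmd + 1)) * ENTRY_BYTES,  # doubled for THP
        "pgd_bytes": (1 << pgd) * ENTRY_BYTES,
    }


# Assumed index sizes before and after the rework.
old = geometry(pte=8, pmd=5, pud=5, pgd=15)
new = geometry(pte=8, pmd=10, pud=7, pgd=8)

print(old["pgd_bytes"] // (1 << PAGE_SHIFT), "pages for the old PGD")
print(new["pgd_bytes"], "bytes for the new PGD")
```

Under these assumptions the old PGD occupies four 64K pages, while the
new PGD fits in 2K and a PUD entry maps exactly 16G.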

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

powerpc: Fix distclean with Makefile.postlink

Makefile.postlink always includes include/config/auto.conf, however
this file is not present in a clean kernel tree, causing make to fail:

  $ git clone linuxppc.git
  $ cd linuxppc.git
  $ make distclean
  arch/powerpc/Makefile.postlink:10: include/config/auto.conf: No such file or directory
  make[1]: *** No rule to make target `include/config/auto.conf'.  Stop.
  make: *** [vmlinuxclean] Error 2

Equally running 'make distclean; make distclean' will trip the error case.

Change the inclusion so that a missing file does not trigger an error.

Fixes: f188d0524d7e ("powerpc: Use the new post-link pass to check relocations")
Reported-by: Mircea Pop <mircea.pop@nxp.com>
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Tested-by: Justin M. Forbes <jforbes@fedoraproject.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

KVM: arm/arm64: vgic: Rename kvm_vgic_vcpu_init to kvm_vgic_vcpu_enable

This function really doesn't init anything, it enables the CPU
interface, so name it as such, which gives us the name to use for actual
init work later on.

Signed-off-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

KVM: arm/arm64: Clarification and relaxation to ITS save/restore ABI

Clarify what is meant by the save/restore ABI only supporting virtual
physical interrupts.

Relax the requirement of the order that the collection entries are
written in and be clear that there is no particular ordering enforced.

Also make some cosmetic changes to the capitalization of ID names to
align with the GICv3 manual, and remove the empty line at the bottom of
the patch.

Signed-off-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>